Discussion:
[Linaro-validation] Further steps with Zephyr tests
Paul Sokolovsky
2017-06-19 12:47:31 UTC
Hello,

The LITE team appreciates the bootstrapping of Zephyr-related LAVA
testing done by the LAVA, LAVA Lab, B&B and QA teams. Getting more
involved with LAVA testing has long been a backlogged task for us, and
hopefully the time has come ;-).

I've reviewed the current status of on-device testing for Zephyr CI
jobs and see the following picture (feel free to correct me if
something is wrong or missing): "zephyr-upstream" and
"zephyr-upstream-arm" (https://ci.linaro.org/view/lite-iot-ci/) CI jobs
submit a number of tests to LAVA (via https://qa-reports.linaro.org/)
for the following boards: arduino_101, frdm_k64f, frdm_kw41z,
qemu_cortex_m3. Here's an example of cumulative test report for these
platforms: https://qa-reports.linaro.org/lite/zephyr-upstream/tests/

That's really great! (Though the list of tests to run in LAVA seems to
be hardcoded:
https://git.linaro.org/ci/job/configs.git/tree/zephyr-upstream/submit_for_testing.py#n13)

But we'd like to test things beyond the Zephyr testsuite, for example,
application frameworks (JerryScript, Zephyr.js, MicroPython) and
the mcuboot bootloader. For starters, we'd like to perform just a boot
test to make sure that each application can boot and start up, and then
later hopefully extend that to functional testing.

The most basic testing would be to just check that after boot there's
the expected prompt from each of the apps, i.e. to test in a "passive"
manner, similar to the Zephyr unit tests discussed above. I tried this
with Zephyr.js and was able to make it work (with manual submission so
far): https://validation.linaro.org/scheduler/job/1534097 . A peculiarity
in this case is that the default test app of Zephyr.js outputs just a
single line, "Hello, ZJS world!", whereas LAVA's test/monitors test
job config specifies a testsuite begin pattern, an end pattern, and
testcase patterns, and I had a suspicion that each of them needs to be
on a separate line. But I was able to make it pass with the following config:

- test:
    monitors:
    - name: foo
      start: ""
      end: Hello, ZJS world!
      pattern: (?P<result>(PASS|FAIL))\s-\s(?P<test_case_id>\w+)\.

So, the "start" substring is empty, and perhaps matches a line output by
a USB multiplexer or board bootloader. "End" substring is actually the
expected single-line output. And "pattern" is unused (dunno if it can
be dropped without def file syntax error). Is there a better way to
handle single-line test output?
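
For reference, here's a tiny standalone sketch (plain Python, not LAVA
code) of what that pattern does and doesn't match; the unit-test line is
a made-up example of Zephyr-style output:

import re

pattern = re.compile(r"(?P<result>(PASS|FAIL))\s-\s(?P<test_case_id>\w+)\.")

for line in [
    "PASS - test_thread_init.",   # hypothetical Zephyr unit-test result line
    "Hello, ZJS world!",          # the Zephyr.js greeting: no match, hence the start/end trick
]:
    m = pattern.search(line)
    print(line, "->", m.groupdict() if m else "no match")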

Well, beyond simple output matching, it would be nice even for the
initial "smoke testing" to actually feed some input to the application
and check for the expected output (e.g., input: "2+2", expected output:
"4"). Is this already supported for LAVA "v2" pipeline tests? I imagine
that would be the same kind of functionality required to test
bootloaders like U-Boot on Linux boards.


Thanks,
Paul

Linaro.org | Open source software for ARM SoCs
Follow Linaro: http://www.facebook.com/pages/Linaro
http://twitter.com/#!/linaroorg - http://www.linaro.org/linaro-blog
Milosz Wasilewski
2017-07-03 12:34:49 UTC
Post by Paul Sokolovsky
Hello,
The LITE team appreciates bootstrapping of Zephyr-related LAVA testing
done by LAVA, LAVA Lab, B&B and QA teams. It was quite a backlogged
task for ourselves to be more involved with LAVA testing, and
hopefully, the time has come ;-).
I've reviewed the current status of on-device testing for Zephyr CI
jobs and see the following picture (feel free to correct me if
something is wrong are missing): "zephyr-upstream" and
"zephyr-upstream-arm" (https://ci.linaro.org/view/lite-iot-ci/) CI jobs
submit a number of tests to LAVA (via https://qa-reports.linaro.org/)
for the following boards: arduino_101, frdm_k64f, frdm_kw41z,
qemu_cortex_m3. Here's an example of cumulative test report for these
platforms: https://qa-reports.linaro.org/lite/zephyr-upstream/tests/
That's really great! (Though the list of tests to run in LAVA seems to
https://git.linaro.org/ci/job/configs.git/tree/zephyr-upstream/submit_for_testing.py#n13)
It is, as I wasn't really sure what to test. The build job needs to
prepare the test templates to be submitted to LAVA. In the case of
Zephyr, each test is a separate binary, so we end up with a number of
file paths to substitute into the template. Hardcoding was the easiest
way to get things running, but I see no reason why it couldn't be
replaced with some smarter code to discover the binaries. The problem
with that approach is that some of these tests are build-time only:
they have no meaning when run on the board and need to be filtered out
somehow.
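
A rough sketch of what such discovery could look like (the directory
layout, binary name and build-only list here are assumptions for
illustration, not how submit_for_testing.py works today):

import os

BUILD_ONLY = {"tests/kernel/some_build_only_test"}  # hypothetical; would come from sanitycheck metadata

def discover_test_binaries(out_dir):
    """Yield (test_name, binary_path) for every built test under out_dir."""
    for root, _dirs, files in os.walk(out_dir):
        if "zephyr.bin" not in files:
            continue
        test_name = os.path.relpath(root, out_dir)
        if test_name in BUILD_ONLY:
            continue  # skip tests that are only meant to be compiled
        yield test_name, os.path.join(root, "zephyr.bin")

for name, path in discover_test_binaries("out/frdm_k64f"):
    print(name, path)
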
Post by Paul Sokolovsky
But we'd like to test things beyond Zephyr testsuite, for example,
application frameworks (JerryScript, Zephyr.js, MicroPython) and
the mcuboot bootloader. For starters, we'd like to perform just a boot
test to make sure that each application can boot and start up, then
later hopefully to extend that to functional testing.
The most basic testing would be just check that after boot there's an
expected prompt from each of the apps, i.e. test it in "passive" manner,
similar to Zephyr unittests discussed above. I tried this with
https://validation.linaro.org/scheduler/job/1534097 . A peculiarity in
this case is that the default test app of Zephyr.js outputs just a
single line "Hello, ZJS world!", whereas LAVA's test/monitors test
job config specifies testsuite begin pattern, end pattern, and testcase
patterns, and I had a suspicion that each of them need to be on a
- name: foo
start: ""
end: Hello, ZJS world!
pattern: (?P<result>(PASS|FAIL))\s-\s(?P<test_case_id>\w+)\.
So, the "start" substring is empty, and perhaps matches a line output by
a USB multiplexer or board bootloader. "End" substring is actually the
expected single-line output. And "pattern" is unused (dunno if it can
be dropped without def file syntax error). Is there a better way to
handle single-line test output?
You're making a silent assumption that if there is a matching line,
the test has passed. In the case of other tests (Zephyr unit tests),
that's not the case. The 'start' matches a line which is displayed when
Zephyr is booting, 'end' matches the line which is displayed after all
testing is done, and the pattern follows the unit-test output format.
Post by Paul Sokolovsky
Well, beyond a simple output matching, it would be nice even for the
initial "smoke testing" to actually make some input into the application
"4"). Is this already supported for LAVA "v2" pipeline tests? I may
imagine that would be the same kind of functionality required to test
bootloaders like U-boot for Linux boards.
I haven't used anything like this in v2 so far, but you're probably best
off having the test itself print something like

test 2+2=4 PASS.

Then you can easily create a pattern that will filter the output. In the
case of Zephyr, the pattern is the only way to filter things out, as there
is no shell (?) on the board.
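
For example, a hypothetical "built-in" test along these lines (sketched
here as a MicroPython-style script, purely for illustration): the inputs
are coded in the test itself and each result is printed as a line the
existing monitor pattern could match.

TESTS = [
    ("two_plus_two", "2+2", 4),
    ("len_abc", "len('abc')", 3),
]

for name, expr, expected in TESTS:
    result = "PASS" if eval(expr) == expected else "FAIL"
    print("%s - %s." % (result, name))  # e.g. "PASS - two_plus_two."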

milosz
Paul Sokolovsky
2017-07-03 20:50:25 UTC
Hello Milosz,

I appreciate getting at least some response ;-). Some questions, however,
could use a reply from the LAVA team, I guess.

On Mon, 3 Jul 2017 13:34:49 +0100
Milosz Wasilewski <***@linaro.org> wrote:

[]
Post by Milosz Wasilewski
jobs submit a number of tests to LAVA (via
arduino_101, frdm_k64f, frdm_kw41z, qemu_cortex_m3. Here's an
https://qa-reports.linaro.org/lite/zephyr-upstream/tests/
That's really great! (Though the list of tests to run in LAVA seems
https://git.linaro.org/ci/job/configs.git/tree/zephyr-upstream/submit_for_testing.py#n13)
It is, as I wasn't really sure what to test. The build job needs to
prepare the test templates to be submitted to LAVA. In case of zephyr
each tests is a separate binary. So we end up with the number of file
paths to substitute in the template. Hardcoding was the easiest thing
to get things running. But I see no reason why it wouldn't be changed
with some smarter code to discover the binaries. The problem with this
approach is that some of these tests are just build time. They have no
meaning when running on the board and need to be filter out somehow.
I see, that makes some sense. But thinking further, I'm not entirely
sure about "build only" tests. Zephyr's sanitycheck tool has such a
concept, but I'd imagine it exists for the following reasons: a)
sanitycheck runs tests on QEMU, which has very bare hardware support,
so many tests are not runnable; b) sanitycheck can operate on "samples",
not just "tests", and as a sample can be interactive, etc., it makes
sense to only build them, not run them.

So, I'm not exactly sure about build-only tests on real HW boards. The
"default" idea would be that they should run, but I imagine in reality
some may need to be filtered out. But then blacklisting would be a better
approach than whitelisting. And I'm not sure if Zephyr has a concept of
"skipped" tests, which may be useful to handle hardware variations.
(Well, I actually don't know if LAVA supports skipped tests!)

Anyway, these are rough ideas for the future. I've spent a couple of
weeks munging with the LITE CI setup; there are definitely some
improvements, but also a Pandora's box of other ideas and improvements to
make. I'm wrapping up for now, but hope to look again in some time
(definitely before Connect, so we can discuss further steps there). In
the meantime, I hope that more boards will be installed in the Lab and
their stability improves (so far they seem to be pretty flaky).

[]
Post by Milosz Wasilewski
- name: foo
start: ""
end: Hello, ZJS world!
pattern: (?P<result>(PASS|FAIL))\s-\s(?P<test_case_id>\w+)\.
So, the "start" substring is empty, and perhaps matches a line
output by a USB multiplexer or board bootloader. "End" substring is
actually the expected single-line output. And "pattern" is unused
(dunno if it can be dropped without def file syntax error). Is
there a better way to handle single-line test output?
You're making a silent assumption that if there is a matching line,
the test is passed. In case of other tests (zephyr unit tests), it's
not the case. The 'start' matches some line which is displayed when
zephyr is booting. End matches the line which is displayed after all
testing is done. The pattern follows the unit test pattern.
Thanks, but I'm not sure I understand this response. I don't challenge
that Zephyr unit tests need this support, or the way they're handled.
LITE, however, needs to test more things than "batch" Zephyr unit tests.
I present another use case which, albeit simple, is barely supported by
LAVA. (That's definitely a question for the LAVA team.)
Post by Milosz Wasilewski
Well, beyond a simple output matching, it would be nice even for the
initial "smoke testing" to actually make some input into the
application and check the expected output (e.g., input: "2+2",
expected output: "4"). Is this already supported for LAVA "v2"
pipeline tests? I may imagine that would be the same kind of
functionality required to test bootloaders like U-boot for Linux
boards.
I didn't use anything like this in v2 so far, but you're probably best
off doing sth like
test 2+2=4 PASS.
than you can easily create pattern that will filter the output. In
case of zephyr pattern is the only way to filter things out as there
is no shell (?) on the board.
So, the problem, for starters, is how to make LAVA *feed* the
input, as specified in the test definition (like "2+2") into a board.

As there was no reply from the LAVA team (I imagine they're busy with
other things), I decided to create a user story in Jira for them; as I
couldn't create a LAVA-* ticket, I created it as
https://projects.linaro.org/browse/LITE-175 . Hopefully it won't go
unnoticed and the LAVA team will get to it eventually.
Post by Milosz Wasilewski
milosz
Thanks!
--
Best Regards,
Paul

Linaro.org | Open source software for ARM SoCs
Follow Linaro: http://www.facebook.com/pages/Linaro
http://twitter.com/#!/linaroorg - http://www.linaro.org/linaro-blog
Milosz Wasilewski
2017-07-03 21:25:31 UTC
Post by Paul Sokolovsky
Hello Milosz,
I appreciate getting at least some response ;-). Some questions however
could use a reply from LAVA team, I guess.
On Mon, 3 Jul 2017 13:34:49 +0100
[]
Post by Milosz Wasilewski
jobs submit a number of tests to LAVA (via
arduino_101, frdm_k64f, frdm_kw41z, qemu_cortex_m3. Here's an
https://qa-reports.linaro.org/lite/zephyr-upstream/tests/
That's really great! (Though the list of tests to run in LAVA seems
https://git.linaro.org/ci/job/configs.git/tree/zephyr-upstream/submit_for_testing.py#n13)
It is, as I wasn't really sure what to test. The build job needs to
prepare the test templates to be submitted to LAVA. In case of zephyr
each tests is a separate binary. So we end up with the number of file
paths to substitute in the template. Hardcoding was the easiest thing
to get things running. But I see no reason why it wouldn't be changed
with some smarter code to discover the binaries. The problem with this
approach is that some of these tests are just build time. They have no
meaning when running on the board and need to be filter out somehow.
I see, that makes some sense. But thinking further, I'm not entirely
sure about "build only" tests. Zephyr's sanitycheck test has such
concept, but I'd imagine it comes from the following reasons: a)
sanitycheck runs tests on QEMU, which has very bare hardware support,
so many tests are not runnable; b) sanitycheck can operate on "samples",
not just "tests", as sample can be interactive, etc. it makes sense to
only build them, not run.
So, I'm not exactly sure about build-only tests on real HW boards. The
"default" idea would be that they should run, but I imagine in reality,
some may need to be filtered out. But then blacklisting would be better
approach than whitelisting. And I'm not sure if Zephyr has concept of
"skipped" tests which may be useful to handle hardware variations.
(Well, I actually dunno if LAVA supports skipped tests!)
As far as I can tell they actually run on the board, but usually output
just 'Hello World!' or something similar. As we discussed with Kumar, this
is still OK. What Kumar requested (and I still haven't delivered) is that
whenever the LAVA test job completes, the test should be considered
'passed', so we wouldn't have to do any parsing of patterns. I'm not
sure if that will work, but it's worth a try.
Post by Paul Sokolovsky
Anyway, these are rough ideas for the future. I've spent couple of
weeks of munging with LITE CI setup, there're definitely some
improvements, but also a Pandora box of other ideas and improvements to
make. I'm wrapping up for now, but hope to look again in some time
(definitely hope to look before the Connect, so we can discuss further
steps there). In the meantime, I hope that more boards will be
installed in the Lab and stability of them improves (so far they seem
to be pretty flaky).
You're absolutely right. This is a pretty big task to work on and IMHO
requires someone to work on it full time for at least a couple of weeks.
The second part is also true: the boards don't behave as they should. I
guess Dave can elaborate more on that. I can only see the result -
boards (frdm-kw41z) don't run the tests they're asked to.
Post by Paul Sokolovsky
[]
Post by Milosz Wasilewski
- name: foo
start: ""
end: Hello, ZJS world!
pattern: (?P<result>(PASS|FAIL))\s-\s(?P<test_case_id>\w+)\.
So, the "start" substring is empty, and perhaps matches a line
output by a USB multiplexer or board bootloader. "End" substring is
actually the expected single-line output. And "pattern" is unused
(dunno if it can be dropped without def file syntax error). Is
there a better way to handle single-line test output?
You're making a silent assumption that if there is a matching line,
the test is passed. In case of other tests (zephyr unit tests), it's
not the case. The 'start' matches some line which is displayed when
zephyr is booting. End matches the line which is displayed after all
testing is done. The pattern follows the unit test pattern.
Thanks, but I'm not sure I understand this response. I don't challenge
that Zephyr unittests need this support, or the way they're handled.
LITE however needs to test more things than "batch" Zephyr unittests. I
present another usercase which albeit simple, barely supported by LAVA.
(That's a question to LAVA team definitely.)
I probably misunderstood the question as well. So let's wait for a
response from someone else.
Post by Paul Sokolovsky
Post by Milosz Wasilewski
Well, beyond a simple output matching, it would be nice even for the
initial "smoke testing" to actually make some input into the
application and check the expected output (e.g., input: "2+2",
expected output: "4"). Is this already supported for LAVA "v2"
pipeline tests? I may imagine that would be the same kind of
functionality required to test bootloaders like U-boot for Linux
boards.
I didn't use anything like this in v2 so far, but you're probably best
off doing sth like
test 2+2=4 PASS.
than you can easily create pattern that will filter the output. In
case of zephyr pattern is the only way to filter things out as there
is no shell (?) on the board.
So, the problem, for starters, is how to make LAVA *feed* the
input, as specified in the test definition (like "2+2") into a board.
Right. What I proposed was coding all the inputs in the test itself.
Post by Paul Sokolovsky
As there were no reply from LAVA team (I may imagine they're busy with
other things), I decided to create a user story in Jira for them, as I
couldn't create a LAVA-* ticket, I created it as
https://projects.linaro.org/browse/LITE-175 . Hopefully that won't go
unnoticed and LAVA team would get to it eventually.
It's probably best to create a CTT ticket here:
https://projects.linaro.org/servicedesk/customer/portal/1
Those tickets won't go unnoticed.

milosz
Paul Sokolovsky
2017-07-04 18:43:09 UTC
Hello Milosz,

Thanks for routing this thread to lava-users - when I made the initial
post to linaro-validation, I checked my archive and saw that e.g. Neil
posts there frequently, but I missed that it's not the official LAVA list.

On Mon, 3 Jul 2017 22:25:31 +0100
Milosz Wasilewski <***@linaro.org> wrote:

[]
Post by Milosz Wasilewski
Post by Paul Sokolovsky
So, I'm not exactly sure about build-only tests on real HW boards.
The "default" idea would be that they should run, but I imagine in
reality, some may need to be filtered out. But then blacklisting
would be better approach than whitelisting. And I'm not sure if
Zephyr has concept of "skipped" tests which may be useful to handle
hardware variations. (Well, I actually dunno if LAVA supports
skipped tests!)
As far as I can tell they acutely run on the board, but usually output
just 'Hello world!' or sth similar. As we discussed with Kumar, this
is still OK. What Kumar requested (and I still didn't deliver) is that
whenever the LAVA test job completes, the test should be considered
'passed'. So we wouldn't have to do any parsing of patterns. I'm not
sure if that will work, but it's worth to try.
Hmm, I wonder what the criteria for being "failed" would be for such
tests... Anyway, thanks for sharing - I'm not familiar with all the
Zephyr tests/samples myself, and will keep such issues in mind when
looking into them.

[]
Post by Milosz Wasilewski
Post by Paul Sokolovsky
more boards will be installed in the Lab and stability of them
improves (so far they seem to be pretty flaky).
You're absolutely right. This is a pretty big task to work on and IMHO
requires someone to work full time at least for couple of weeks. The
second part is also true, the boards don't behave as they should. I
guess Dave can elaborate more on that. I can only see the result -
boards (frdm-kw41z) don't run the tests they're requested.
Matt Hart actually showed me a ticket on that, so at least it's a
confirmed/known issue being worked on. But even with arduino_101 and
frdm_k64f, I hit cases more than once where boards were stuck for an
extended time but still had jobs routed to them (which either failed or
timed out). So, there may be a problem with health checks, which either
don't run frequently enough or aren't robust enough. arduino_101 is all
alone, so if something happens to it, there's no backup. Etc., etc.

[]
Post by Milosz Wasilewski
Post by Paul Sokolovsky
So, the problem, for starters, is how to make LAVA *feed* the
input, as specified in the test definition (like "2+2") into a board.
Right. What I proposed was coding all the inputs in the test itself.
Well, that would require a bunch of legwork, but the biggest problem is
that it wouldn't test what's actually required. E.g., both the JerryScript
and MicroPython Zephyr ports are interactive apps working over a serial
connection, and functional testing of them would mean feeding something
over this serial connection and checking that the results are as
expected. I'll keep the idea of "builtin" tests in mind though.
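
(For illustration, outside of LAVA this kind of interactive check is
simple over a direct serial connection - a rough pyserial sketch, where
the port name, REPL prompt and timings are assumptions:)

import serial  # pyserial

with serial.Serial("/dev/ttyACM0", 115200, timeout=5) as port:
    port.write(b"2+2\r\n")            # feed input to the REPL over the serial connection
    reply = port.read_until(b">>> ")  # read up to the next REPL prompt
    assert b"4" in reply, "unexpected REPL output: %r" % reply
    print("PASS - repl_two_plus_two.")  # a line a monitor-style pattern could match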

Thanks!
--
Best Regards,
Paul

Linaro.org | Open source software for ARM SoCs
Follow Linaro: http://www.facebook.com/pages/Linaro
http://twitter.com/#!/linaroorg - http://www.linaro.org/linaro-blog
Milosz Wasilewski
2017-07-03 21:26:54 UTC
I was too quick to hit reply. CCing lava-users for comments from LAVA team.

milosz
Post by Paul Sokolovsky
Hello Milosz,
I appreciate getting at least some response ;-). Some questions however
could use a reply from LAVA team, I guess.
On Mon, 3 Jul 2017 13:34:49 +0100
[]
Post by Milosz Wasilewski
jobs submit a number of tests to LAVA (via
arduino_101, frdm_k64f, frdm_kw41z, qemu_cortex_m3. Here's an
https://qa-reports.linaro.org/lite/zephyr-upstream/tests/
That's really great! (Though the list of tests to run in LAVA seems
https://git.linaro.org/ci/job/configs.git/tree/zephyr-upstream/submit_for_testing.py#n13)
It is, as I wasn't really sure what to test. The build job needs to
prepare the test templates to be submitted to LAVA. In case of zephyr
each tests is a separate binary. So we end up with the number of file
paths to substitute in the template. Hardcoding was the easiest thing
to get things running. But I see no reason why it wouldn't be changed
with some smarter code to discover the binaries. The problem with this
approach is that some of these tests are just build time. They have no
meaning when running on the board and need to be filter out somehow.
I see, that makes some sense. But thinking further, I'm not entirely
sure about "build only" tests. Zephyr's sanitycheck test has such
concept, but I'd imagine it comes from the following reasons: a)
sanitycheck runs tests on QEMU, which has very bare hardware support,
so many tests are not runnable; b) sanitycheck can operate on "samples",
not just "tests", as sample can be interactive, etc. it makes sense to
only build them, not run.
So, I'm not exactly sure about build-only tests on real HW boards. The
"default" idea would be that they should run, but I imagine in reality,
some may need to be filtered out. But then blacklisting would be better
approach than whitelisting. And I'm not sure if Zephyr has concept of
"skipped" tests which may be useful to handle hardware variations.
(Well, I actually dunno if LAVA supports skipped tests!)
Anyway, these are rough ideas for the future. I've spent couple of
weeks of munging with LITE CI setup, there're definitely some
improvements, but also a Pandora box of other ideas and improvements to
make. I'm wrapping up for now, but hope to look again in some time
(definitely hope to look before the Connect, so we can discuss further
steps there). In the meantime, I hope that more boards will be
installed in the Lab and stability of them improves (so far they seem
to be pretty flaky).
[]
Post by Milosz Wasilewski
- name: foo
start: ""
end: Hello, ZJS world!
pattern: (?P<result>(PASS|FAIL))\s-\s(?P<test_case_id>\w+)\.
So, the "start" substring is empty, and perhaps matches a line
output by a USB multiplexer or board bootloader. "End" substring is
actually the expected single-line output. And "pattern" is unused
(dunno if it can be dropped without def file syntax error). Is
there a better way to handle single-line test output?
You're making a silent assumption that if there is a matching line,
the test is passed. In case of other tests (zephyr unit tests), it's
not the case. The 'start' matches some line which is displayed when
zephyr is booting. End matches the line which is displayed after all
testing is done. The pattern follows the unit test pattern.
Thanks, but I'm not sure I understand this response. I don't challenge
that Zephyr unittests need this support, or the way they're handled.
LITE however needs to test more things than "batch" Zephyr unittests. I
present another usercase which albeit simple, barely supported by LAVA.
(That's a question to LAVA team definitely.)
Post by Milosz Wasilewski
Well, beyond a simple output matching, it would be nice even for the
initial "smoke testing" to actually make some input into the
application and check the expected output (e.g., input: "2+2",
expected output: "4"). Is this already supported for LAVA "v2"
pipeline tests? I may imagine that would be the same kind of
functionality required to test bootloaders like U-boot for Linux
boards.
I didn't use anything like this in v2 so far, but you're probably best
off doing sth like
test 2+2=4 PASS.
than you can easily create pattern that will filter the output. In
case of zephyr pattern is the only way to filter things out as there
is no shell (?) on the board.
So, the problem, for starters, is how to make LAVA *feed* the
input, as specified in the test definition (like "2+2") into a board.
As there were no reply from LAVA team (I may imagine they're busy with
other things), I decided to create a user story in Jira for them, as I
couldn't create a LAVA-* ticket, I created it as
https://projects.linaro.org/browse/LITE-175 . Hopefully that won't go
unnoticed and LAVA team would get to it eventually.
Post by Milosz Wasilewski
milosz
Thanks!
--
Best Regards,
Paul
Linaro.org | Open source software for ARM SoCs
Follow Linaro: http://www.facebook.com/pages/Linaro
http://twitter.com/#!/linaroorg - http://www.linaro.org/linaro-blog
Neil Williams
2017-07-04 06:54:30 UTC
On Mon, 3 Jul 2017 23:50:25 +0300
Post by Paul Sokolovsky
Hello Milosz,
I appreciate getting at least some response ;-). Some questions
however could use a reply from LAVA team, I guess.
On Mon, 3 Jul 2017 13:34:49 +0100
[]
Post by Milosz Wasilewski
jobs submit a number of tests to LAVA (via
arduino_101, frdm_k64f, frdm_kw41z, qemu_cortex_m3. Here's an
https://qa-reports.linaro.org/lite/zephyr-upstream/tests/
That's really great! (Though the list of tests to run in LAVA
https://git.linaro.org/ci/job/configs.git/tree/zephyr-upstream/submit_for_testing.py#n13)
It is, as I wasn't really sure what to test. The build job needs to
prepare the test templates to be submitted to LAVA. In case of
zephyr each tests is a separate binary. So we end up with the
number of file paths to substitute in the template. Hardcoding was
the easiest thing to get things running. But I see no reason why it
wouldn't be changed with some smarter code to discover the
binaries. The problem with this approach is that some of these
tests are just build time. They have no meaning when running on the
board and need to be filter out somehow.
Running the build tests within the Jenkins build makes a lot of sense.
Typically, the build tests will have a different command syntax to the
runtime tests (otherwise Jenkins would attempt to run both), so
filtering should be possible. If the build tests are just a different
set of binary blobs from the runtime tests, that may need a fix
upstream in Zephyr to distinguish between the two modes.
Post by Paul Sokolovsky
I see, that makes some sense. But thinking further, I'm not entirely
sure about "build only" tests. Zephyr's sanitycheck test has such
concept, but I'd imagine it comes from the following reasons: a)
sanitycheck runs tests on QEMU, which has very bare hardware support,
so many tests are not runnable; b) sanitycheck can operate on
"samples", not just "tests", as sample can be interactive, etc. it
makes sense to only build them, not run.
So, I'm not exactly sure about build-only tests on real HW boards. The
"default" idea would be that they should run, but I imagine in
reality, some may need to be filtered out. But then blacklisting
would be better approach than whitelisting. And I'm not sure if
Zephyr has concept of "skipped" tests which may be useful to handle
hardware variations. (Well, I actually dunno if LAVA supports skipped
tests!)
Yes, LAVA has support for pass, fail, skip, unknown.

For POSIX shell tests, the test writer just calls
'lava-test-case name --result skip'.

For monitor tests, like Zephyr, it's down to the pattern but skip is as
valid as pass and fail (as is unknown) for the result of the matches
within the pattern.
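
(A small illustration, not taken from any existing job definition, of a
monitor-style pattern whose result group also captures skipped tests;
per the note above, skip is as valid a result as pass or fail:)

import re

pattern = re.compile(r"(?P<result>(PASS|FAIL|SKIP))\s-\s(?P<test_case_id>\w+)\.")

m = pattern.search("SKIP - test_needs_ethernet.")  # hypothetical skipped-test output line
print(m.group("result"), m.group("test_case_id"))  # -> SKIP test_needs_ethernet
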
Post by Paul Sokolovsky
Anyway, these are rough ideas for the future. I've spent couple of
weeks of munging with LITE CI setup, there're definitely some
improvements, but also a Pandora box of other ideas and improvements
to make. I'm wrapping up for now, but hope to look again in some time
(definitely hope to look before the Connect, so we can discuss further
steps there). In the meantime, I hope that more boards will be
installed in the Lab and stability of them improves (so far they seem
to be pretty flaky).
There are known limitations with the USB subsystem and associated
hardware across all architectures, affecting test devices and the
workers which run the automation. LAVA has to drive that subsystem very
hard for both fastboot devices and IoT devices. There are also problems
due to the design of methods like fastboot and some of the IoT support
which result from a single-developer model, leading to buggy
performance when used at scale and added complexity in deploying
workarounds to isolate such protocols in order to prevent interference
between tests. The protocols themselves often lack robust error
handling or retry support.

Other deployment methods which rely on TFTP/network deployments are
massively more reliable at scale, so comparing reliability across
different device types is problematic.
Post by Paul Sokolovsky
[]
Post by Milosz Wasilewski
- name: foo
start: ""
end: Hello, ZJS world!
pattern: (?P<result>(PASS|FAIL))\s-\s(?P<test_case_id>\w+)\.
So, the "start" substring is empty, and perhaps matches a line
output by a USB multiplexer or board bootloader. "End" substring
is actually the expected single-line output. And "pattern" is
unused (dunno if it can be dropped without def file syntax
error). Is there a better way to handle single-line test
output?
You're making a silent assumption that if there is a matching line,
the test is passed. In case of other tests (zephyr unit tests), it's
not the case. The 'start' matches some line which is displayed when
zephyr is booting. End matches the line which is displayed after all
testing is done. The pattern follows the unit test pattern.
Thanks, but I'm not sure I understand this response. I don't challenge
that Zephyr unittests need this support, or the way they're handled.
LITE however needs to test more things than "batch" Zephyr unittests.
I present another usercase which albeit simple, barely supported by
LAVA. (That's a question to LAVA team definitely.)
LAVA result handling is ultimately a pattern matching system. Patterns
must have a unique and reliable start string and a unique and reliable
end string. An empty start string is just going to cause misleading
results and bad pattern matches as the reality is that most boards emit
some level of random junk immediately upon connection which needs to be
ignored. So there needs to be a reliable, unique start string emitted
by the test device. It is not enough to *assume* a start at line zero;
doing so increases the problems with reliability.
Post by Paul Sokolovsky
Post by Milosz Wasilewski
Well, beyond a simple output matching, it would be nice even for
the initial "smoke testing" to actually make some input into the
application and check the expected output (e.g., input: "2+2",
expected output: "4"). Is this already supported for LAVA "v2"
pipeline tests? I may imagine that would be the same kind of
functionality required to test bootloaders like U-boot for Linux
boards.
I didn't use anything like this in v2 so far, but you're probably
best off doing sth like
test 2+2=4 PASS.
than you can easily create pattern that will filter the output. In
case of zephyr pattern is the only way to filter things out as there
is no shell (?) on the board.
So, the problem, for starters, is how to make LAVA *feed* the
input, as specified in the test definition (like "2+2") into a board.
That will need code changes, so please make a formal request for this
support at CTT
https://projects.linaro.org/servicedesk/customer/portal/1 so that we
can track exactly what is required.
Post by Paul Sokolovsky
As there were no reply from LAVA team (I may imagine they're busy with
other things), I decided to create a user story in Jira for them, as I
couldn't create a LAVA-* ticket, I created it as
https://projects.linaro.org/browse/LITE-175 . Hopefully that won't go
unnoticed and LAVA team would get to it eventually.
That JIRA story is in the LITE project. Nobody in the LAVA team can
manage those stories. It needs a CTT issue which can then be linked to
the LITE story and from which a LAVA story can also be linked.

Sadly, any story in the LITE project would go completely unnoticed by
the LAVA software team until it is linked to CTT so that the work can
be prioritised and the relevant LAVA story created. That's just how
JIRA works.
Post by Paul Sokolovsky
Post by Milosz Wasilewski
milosz
Thanks!
--
Best Regards,
Paul
Linaro.org | Open source software for ARM SoCs
Follow Linaro: http://www.facebook.com/pages/Linaro
http://twitter.com/#!/linaroorg - http://www.linaro.org/linaro-blog
_______________________________________________
linaro-validation mailing list
https://lists.linaro.org/mailman/listinfo/linaro-validation
--
Neil Williams
=============
http://www.linux.codehelp.co.uk/
Paul Sokolovsky
2017-07-04 18:56:24 UTC
Hello Neil,

On Tue, 4 Jul 2017 07:54:30 +0100
Neil Williams <***@debian.org> wrote:

[]
Post by Neil Williams
Yes, LAVA has support for pass, fail, skip, unknown.
For POSIX shell tests, the test writer just calls lava-test-case name
--result skip
For monitor tests, like Zephyr, it's down to the pattern but skip is
as valid as pass and fail (as is unknown) for the result of the
matches within the pattern.
Thanks, now that you said it, I remembered that some years ago I saw
them all ;-). Thanks for confirming!

[]
Post by Neil Williams
There are known limitations with the USB subsystem and associated
hardware across all architectures, affecting test devices and the
workers which run the automation. LAVA has to drive that subsystem
very hard for both fastboot devices and IoT devices. There are also
problems due to the design of methods like fastboot and some of the
IoT support which result from a single-developer model, leading to
buggy performance when used at scale and added complexity in deploying
workarounds to isolate such protocols in order to prevent interference
between tests. The protocols themselves often lack robust error
handling or retry support.
Yes, I understand all the complexity of it. But I'd hope that the
ability to power off both a device and the USB hub connecting it would
be the ultimate solution to such issues. Anyway, I guess this time the
question is more for the Lab team than the LAVA team.

[]
Post by Neil Williams
LAVA result handling is ultimately a pattern matching system. Patterns
must have a unique and reliable start string and a unique and reliable
end string. An empty start string is just going to cause misleading
results and bad pattern matches as the reality is that most boards
emit some level of random junk immediately upon connection which
needs to be ignored. So there needs to be a reliable, unique, start
string emitted by the test device. It is not enough to *assume* a
start at line zero, doing so increases the problems with reliability.
Good to know, and I generally agree. But with a real use case of testing
something which outputs just a single line, the only option is then to
modify the "device under test", which isn't always desirable and shows
LAVA's inflexibility. Well, at least there's a workaround which works
well enough, and hopefully will keep working ;-).

[]
Post by Neil Williams
That will need code changes, so please make a formal request for this
support at CTT
https://projects.linaro.org/servicedesk/customer/portal/1 so that we
can track exactly what is required.
Thanks, now done as
https://projects.linaro.org/servicedesk/customer/portal/1/CTT-413

[]
--
Best Regards,
Paul

Linaro.org | Open source software for ARM SoCs
Follow Linaro: http://www.facebook.com/pages/Linaro
http://twitter.com/#!/linaroorg - http://www.linaro.org/linaro-blog