Details
-
Bug
-
Resolution: Duplicate
-
Minor
-
None
-
Lustre 2.12.0
-
3
Description
When replay-single test_90 is skipped and all other replay-single tests pass, the test suite is marked as FAIL. Here are two examples:
https://testing.whamcloud.com/test_sets/033880de-3a37-11e8-b45c-52540065bddc
https://testing.whamcloud.com/test_sets/ec2a2478-8ba9-11e8-b0aa-52540065bddc
From the test log, we can see that the test is skipped:
== replay-single test 90: lfs find identifies the missing striped file segments ====================== 23:05:40 (1532041540)
CMD: trevis-17vm6 /usr/sbin/lctl dl
CMD: trevis-17vm6 /usr/sbin/lctl dl
CMD: trevis-17vm6 /usr/sbin/lctl dl
CMD: trevis-17vm6 /usr/sbin/lctl dl
CMD: trevis-17vm6 /usr/sbin/lctl dl
CMD: trevis-17vm6 /usr/sbin/lctl dl
CMD: trevis-17vm6 /usr/sbin/lctl dl
SKIP: replay-single test_90 not functional with FAILURE_MODE=HARD, affected: ost1,ost2,ost3,ost4,ost5,ost6,ost7
SKIP 90 (4s)
How do we know that skipping this test is causing the test suite failure? All tests that fail are listed at the end of the test script output. For the replay-single suite failures, at the end of the test_log, we see the skipped test_90 listed:
== replay-single test complete, duration 10784 sec =================================================== 17:23:17 (1532712197)
replay-single: SKIP: test_90 not functional with FAILURE_MODE=HARD, affected: ost1,ost2,ost3,ost4,ost5,ost6,ost7
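The check described above can be sketched as a small shell helper for triaging a downloaded test_log. This is only an illustration, not part of the Lustre test framework: the function names `summary_entries` and `only_skips_reported` are hypothetical, and the sketch assumes the "test complete" banner and each summary entry sit on their own lines in the log.

```shell
# Hypothetical triage helpers (not part of the Lustre test framework).

# Print the non-empty lines that follow the "test complete" banner.
# Assumption: the banner and each summary entry are on separate lines.
summary_entries() {
    awk '/^== .* test complete, duration/ { found = 1; next } found && NF' "$1"
}

# Succeed only if there is at least one summary entry and every entry
# is a SKIP, i.e. nothing listed at the end of the log actually failed.
only_skips_reported() {
    entries=$(summary_entries "$1")
    [ -n "$entries" ] || return 1
    # grep -v selects non-SKIP lines; -q exits 0 if any exist, so negate it.
    ! printf '%s\n' "$entries" | grep -qv ': SKIP: '
}
```

If `only_skips_reported test_log` succeeds for a suite that was marked FAIL, the log matches the pattern reported here: the only entry listed after the completion banner is the skipped test.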
We see this in failover testing when FAILURE_MODE=HARD. The cause is not clear, but there are examples of this as far back as January 2018, and probably earlier.
This is a test script/framework issue.