[LU-7673] conf-sanity test failure cause multiple tests to fail Created: 15/Jan/16 Updated: 06/Oct/20 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Minor |
| Reporter: | James Nunez (Inactive) | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | tests | ||
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
There are several examples in conf-sanity and other test suites where one test fails and that failure causes several of the following tests to fail. This ticket is to harden tests against the failure of a previous test.

For example, when conf-sanity test 52 fails, test 53a also fails because the MDS is already mounted. Test 52 calls cleanup() at the end of the test, and cleanup() calls stop_ost and stop_mds and unloads modules. Test 53a runs right after test 52 and calls setup(), which calls start_mds, start_ost, etc. and returns an error if any of these fail. Thus, when test 52 fails, it does not call cleanup(), and all servers remain mounted when test 53a starts. Test 53a calls setup() and returns an error. Then test 53b starts, calls setup() and fails. Then test 54a fails because the OST is still mounted, and test 54b fails because the MDT is still mounted. One example of this cascade of errors is at

Another example of one test failure causing several others to fail is at https://testing.hpdd.intel.com/test_sets/df65fd10-bad8-11e5-b3d5-5254006e85c2. In this case, test 44 failed and this caused test 45 to fail for the same reason as above: test 44 did not call cleanup() because it failed, and the next test calls setup().

There are a few ways to stop these failures. A couple of possible solutions are: |
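As an illustration of the cascade described above, and of one possible way to harden a test against it, here is a minimal bash sketch. It is not taken from conf-sanity.sh: the functions below are simplified stand-ins that only track mount state in variables, and `setup_clean` and `simulated_check` are hypothetical names introduced for this example.

```shell
#!/bin/bash
# Sketch only: these functions are simplified stand-ins that track mount
# state in variables. They are NOT the real conf-sanity.sh helpers.

MDS_MOUNTED=0
OST_MOUNTED=0

start_mds() {
    # Mimic the reported failure: starting an already mounted MDS fails.
    if [ "$MDS_MOUNTED" -eq 1 ]; then
        echo "start_mds: MDS is already mounted" >&2
        return 1
    fi
    MDS_MOUNTED=1
}

start_ost() {
    if [ "$OST_MOUNTED" -eq 1 ]; then
        echo "start_ost: OST is already mounted" >&2
        return 1
    fi
    OST_MOUNTED=1
}

stop_mds() { MDS_MOUNTED=0; }
stop_ost() { OST_MOUNTED=0; }

setup() { start_mds && start_ost; }
cleanup() { stop_ost; stop_mds; }

# Hypothetical helper: clear leftover state from a failed test, then set up.
setup_clean() {
    if [ "$MDS_MOUNTED" -eq 1 ] || [ "$OST_MOUNTED" -eq 1 ]; then
        echo "setup_clean: leftover mounts found, running cleanup first" >&2
        cleanup
    fi
    setup
}

# Hypothetical check that always fails, to simulate a failing test body.
simulated_check() { return 1; }

# Fails before reaching its own cleanup() call, leaving servers mounted.
test_52() {
    setup || return 1
    simulated_check || return 1
    cleanup
}

# Using plain setup() here would fail because the servers are still mounted.
test_53a() {
    setup_clean || return 1
    cleanup
}

test_52 || echo "test_52 failed"
test_53a && echo "test_53a passed"
```

In this sketch the second test tolerates whatever state the first one left behind. Whether an unconditional cleanup before setup() is acceptable in the real suite (for example, in terms of run time) is a separate question that the sketch does not answer.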
| Comments |
| Comment by James Nunez (Inactive) [ 27/Jan/16 ] |
|
This looks like the same issue: when conf-sanity test 22 fails, test 23a fails because the MDT is already mounted. Logs at |
| Comment by Saurabh Tandan (Inactive) [ 10/Feb/16 ] |
|
Another instance found for interop tag 2.7.66 - EL7 Server/2.7.1 Client, build# 3316
Another instance found for interop tag 2.7.66 - EL6.7 Server/2.7.1 Client, build# 3316
Another instance found for interop tag 2.7.66 - EL6.7 Server/2.5.5 Client, build# 3316
Another instance found for interop tag 2.7.66 - EL7 Server/2.5.5 Client, build# 3316 |
| Comment by Saurabh Tandan (Inactive) [ 24/Feb/16 ] |
|
Another instance found for interop - EL7 Server/2.7.1 Client, tag 2.7.90. |
| Comment by James Nunez (Inactive) [ 14/Sep/16 ] |
|
When conf-sanity test 20 fails, tests 21*, 22 and 23a typically also fail. These tests need to be more resilient to previous failures. See https://testing.hpdd.intel.com/test_sets/afcbd7ba-79d6-11e6-b058-5254006e85c2 for one example of this cascade of failures. |
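One way to make later tests less sensitive to an earlier failure is to clean up at the point of failure rather than at the next setup. The bash sketch below shows that idea under stated assumptions: `run_one`, `failing_step` and `SERVERS_MOUNTED` are hypothetical names, and the `setup`, `cleanup`, `test_20` and `test_21a` functions are simplified stand-ins, not the real conf-sanity.sh code or the test framework's own runner.

```shell
#!/bin/bash
# Sketch only: SERVERS_MOUNTED and the functions below are simplified
# stand-ins, NOT the real conf-sanity.sh helpers.

SERVERS_MOUNTED=0
FAILED=""

setup() {
    # Mimic the reported failure: setup fails if servers are still mounted.
    if [ "$SERVERS_MOUNTED" -eq 1 ]; then
        echo "setup: servers are already mounted" >&2
        return 1
    fi
    SERVERS_MOUNTED=1
}

cleanup() { SERVERS_MOUNTED=0; }

# Hypothetical wrapper: run one test and, if it fails, run cleanup() so
# that the next test starts from an unmounted state.
run_one() {
    local name=$1
    if "test_$name"; then
        echo "PASS $name"
    else
        echo "FAIL $name"
        cleanup
        return 1
    fi
}

# Hypothetical step that always fails, to simulate a failing test body.
failing_step() { return 1; }

# Fails after setup() and never reaches its own cleanup() call.
test_20() {
    setup || return 1
    failing_step || return 1
    cleanup
}

test_21a() {
    setup || return 1
    cleanup
}

# Run the tests in order, recording the names of the ones that fail.
for t in 20 21a; do
    run_one "$t" || FAILED="$FAILED $t"
done
echo "failed tests:$FAILED"
```

With the wrapper in place, only the simulated test 20 is recorded as failed. Without the cleanup() call in `run_one`, the simulated test 21a would also fail at setup(), which is the cascade this ticket describes.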