[LU-12247] sanityn test 34 fails with 'test_34 returned 1' Created: 30/Apr/19 Updated: 24/Jun/19 Resolved: 24/Jun/19 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.13.0, Lustre 2.12.1 |
| Fix Version/s: | Lustre 2.13.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | James Nunez (Inactive) | Assignee: | WC Triage |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | ppc | ||
| Environment: |
ppc clients |
||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||
| Description |
|
sanityn test_34 fails with 'test_34 returned 1'. We see this test fail for PPC only.

Looking at the suite_log for a recent failure, with logs at https://testing.whamcloud.com/test_sets/7d68fa34-668f-11e9-8bb1-52540065bddc, we see that wait_osc_import_ready() does not return in time:

CMD: trevis-55vm11 lctl set_param -n fail_loc=0 2>/dev/null || true
CMD: trevis-77vm1.trevis.whamcloud.com lctl get_param -n at_min
can't get osc.lustre-OST0000-osc-ffff*.ost_server_uuid in 40 secs
CMD: trevis-77vm1.trevis.whamcloud.com lctl get_param -n at_min
can't get osc.lustre-OST0001-osc-ffff*.ost_server_uuid in 40 secs
CMD: trevis-77vm1.trevis.whamcloud.com lctl get_param -n at_min
can't get osc.lustre-OST0002-osc-ffff*.ost_server_uuid in 40 secs
CMD: trevis-77vm1.trevis.whamcloud.com lctl get_param -n at_min
can't get osc.lustre-OST0003-osc-ffff*.ost_server_uuid in 40 secs
CMD: trevis-77vm1.trevis.whamcloud.com lctl get_param -n at_min
can't get osc.lustre-OST0004-osc-ffff*.ost_server_uuid in 40 secs
CMD: trevis-77vm1.trevis.whamcloud.com lctl get_param -n at_min
can't get osc.lustre-OST0005-osc-ffff*.ost_server_uuid in 40 secs
CMD: trevis-77vm1.trevis.whamcloud.com lctl get_param -n at_min
can't get osc.lustre-OST0006-osc-ffff*.ost_server_uuid in 40 secs
CMD: trevis-55vm11 lctl set_param -n fail_loc=0 2>/dev/null || true
CMD: trevis-77vm1.trevis.whamcloud.com lctl get_param -n at_min
can't get osc.lustre-OST0000-osc-ffff*.ost_server_uuid in 40 secs
test_34 returned 1
FAIL 34 (353s)

When sanityn test 34 succeeds, we should see client 2 checking that each OST is in the IDLE state. When this test fails, we don’t see any activity on the console of client 2 (vm2):

[10987.532565] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == sanityn test 34: no lock timeout under IO ========================================================= 04:24:20 \(1555993460\)
[10987.714805] Lustre: DEBUG MARKER: == sanityn test 34: no lock timeout under IO ========================================================= 04:24:20

The OSS console log shows no errors beyond those seen when sanityn test 34 succeeds.

Logs for other failures are at |
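The "can't get ... in 40 secs" lines are what a poll-until-timeout loop would print when the parameter it queries never shows the expected state. The sketch below only illustrates that pattern; wait_for_value and its arguments are hypothetical names, and this is not the actual wait_osc_import_ready() code from the test framework.

```shell
# Hypothetical helper (not from the Lustre test framework):
# run a command once per second until it prints the expected value,
# or give up after max_secs seconds.
wait_for_value() {
    get_cmd=$1      # command (or function name) that prints the current value
    expected=$2     # value we are waiting for, e.g. IDLE
    max_secs=$3     # timeout in seconds
    elapsed=0
    while [ "$elapsed" -lt "$max_secs" ]; do
        value=$($get_cmd 2>/dev/null)
        [ "$value" = "$expected" ] && return 0
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "can't get '$expected' from '$get_cmd' in $max_secs secs" >&2
    return 1
}
```

If the queried name never resolves to a real parameter, a loop of this shape runs to its timeout every time, regardless of the actual import state.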
| Comments |
| Comment by Andreas Dilger [ 30/Apr/19 ] |
|
This looks like a test script problem: it checks for -ffff in the OSC name, which is x86-specific. |
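To illustrate the point about the -ffff pattern, here is a minimal sketch using plain shell glob matching. The two device names are made-up examples: they assume the hex suffix of an OSC name is a kernel address, which typically begins with ffff on x86_64 and with c000 on 64-bit PowerPC. The broader pattern in the last call is only one hypothetical alternative, not necessarily what the eventual fix used.

```shell
# Made-up OSC instance names for illustration only.
x86_name="lustre-OST0000-osc-ffff8800372f1000"
ppc_name="lustre-OST0000-osc-c0000001e5a3c800"

# Print "yes" if name $1 matches glob pattern $2, otherwise "no".
matches() {
    case "$1" in
        $2) echo yes ;;   # $2 is left unquoted so it is treated as a pattern
        *)  echo no ;;
    esac
}

matches "$x86_name" "lustre-OST0000-osc-ffff*"        # prints yes
matches "$ppc_name" "lustre-OST0000-osc-ffff*"        # prints no
matches "$ppc_name" "lustre-OST0000-osc-[-0-9a-f]*"   # prints yes
```

With a pattern that does not match any parameter name on the client, the lookup has nothing to read, so the wait can only time out.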
| Comment by James A Simmons [ 22/Jun/19 ] |
|
Is this still true now that https://review.whamcloud.com/33894 landed? |
| Comment by James Nunez (Inactive) [ 24/Jun/19 ] |
|
sanityn test 34 hasn't failed for ppc clients since May 1 (2019). As James pointed out, it looks like https://review.whamcloud.com/#/c/33894/ fixed this issue. |