Details
Bug
Resolution: Fixed
Minor
Lustre 2.13.0, Lustre 2.12.1
ppc clients
3
9223372036854775807
Description
sanityn test_34 fails with 'test_34 returned 1'. We see this test fail for PPC only.
Looking at the suite_log for a recent failure, with logs at https://testing.whamcloud.com/test_sets/7d68fa34-668f-11e9-8bb1-52540065bddc, we see that wait_osc_import_ready() does not return in time:
CMD: trevis-55vm11 lctl set_param -n fail_loc=0 2>/dev/null || true
CMD: trevis-77vm1.trevis.whamcloud.com lctl get_param -n at_min
can't get osc.lustre-OST0000-osc-ffff*.ost_server_uuid in 40 secs
CMD: trevis-77vm1.trevis.whamcloud.com lctl get_param -n at_min
can't get osc.lustre-OST0001-osc-ffff*.ost_server_uuid in 40 secs
CMD: trevis-77vm1.trevis.whamcloud.com lctl get_param -n at_min
can't get osc.lustre-OST0002-osc-ffff*.ost_server_uuid in 40 secs
CMD: trevis-77vm1.trevis.whamcloud.com lctl get_param -n at_min
can't get osc.lustre-OST0003-osc-ffff*.ost_server_uuid in 40 secs
CMD: trevis-77vm1.trevis.whamcloud.com lctl get_param -n at_min
can't get osc.lustre-OST0004-osc-ffff*.ost_server_uuid in 40 secs
CMD: trevis-77vm1.trevis.whamcloud.com lctl get_param -n at_min
can't get osc.lustre-OST0005-osc-ffff*.ost_server_uuid in 40 secs
CMD: trevis-77vm1.trevis.whamcloud.com lctl get_param -n at_min
can't get osc.lustre-OST0006-osc-ffff*.ost_server_uuid in 40 secs
CMD: trevis-55vm11 lctl set_param -n fail_loc=0 2>/dev/null || true
CMD: trevis-77vm1.trevis.whamcloud.com lctl get_param -n at_min
can't get osc.lustre-OST0000-osc-ffff*.ost_server_uuid in 40 secs
test_34 returned 1
FAIL 34 (353s)
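When triaging several failed runs, it can help to pull the timed-out targets out of the suite_log mechanically. The following is a minimal sketch, assuming the "can't get ... in N secs" line format shown in the log above; the function name timed_out_targets and the regular expression are illustrative and are not part of the Lustre test framework.

```python
import re

# Matches log fragments such as:
#   can't get osc.lustre-OST0000-osc-ffff*.ost_server_uuid in 40 secs
# The pattern is an assumption based only on the log excerpt in this ticket.
TIMEOUT_RE = re.compile(
    r"can't get (?:osc\.\S+?)-(OST[0-9a-fA-F]{4})-\S+ in (\d+) secs"
)


def timed_out_targets(log_text):
    """Return (ost_name, seconds) pairs in order of appearance.

    Duplicates are kept, so a target that times out twice appears twice.
    """
    return [(m.group(1), int(m.group(2))) for m in TIMEOUT_RE.finditer(log_text)]
```

Applied to the excerpt above, this would list OST0000 through OST0006 followed by a second OST0000 entry, which makes it easy to confirm that every OST import timed out rather than a single one.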
When sanityn test 34 succeeds, we should see client 2 checking that each OST is in the IDLE state. When the test fails, we don't see any activity on the console of client 2 (vm2):
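The check described above is a poll-until-ready loop with a deadline. The sketch below shows that general pattern only; it is not the wait_osc_import_ready() code from the test framework, and wait_for_state, read_state, and the default 40-second timeout (taken from the "in 40 secs" messages) are assumptions made for illustration.

```python
import time


def wait_for_state(read_state, expected, timeout=40.0, interval=1.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll read_state() until it returns a value in `expected`.

    read_state is a hypothetical callable that returns the current import
    state as a string. Returns True once the state is one of `expected`,
    or False if `timeout` seconds pass first.
    """
    deadline = clock() + timeout
    while True:
        if read_state() in expected:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A caller would pass something like {"IDLE"} as expected. If the state reader never runs on the client, which is what the silent client 2 console suggests, a loop of this shape can only end by hitting its deadline.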
[10987.532565] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == sanityn test 34: no lock timeout under IO ========================================================= 04:24:20 \(1555993460\)
[10987.714805] Lustre: DEBUG MARKER: == sanityn test 34: no lock timeout under IO ========================================================= 04:24:20
The OSS console log shows no errors beyond those seen when sanityn test 34 succeeds.
Logs for other failures are at:
https://testing.whamcloud.com/test_sets/07ef1416-2970-11e9-b901-52540065bddc
https://testing.whamcloud.com/test_sets/052e8140-2555-11e9-830a-52540065bddc