Details
Bug
Resolution: Cannot Reproduce
Minor
None
Lustre 2.8.0
None
review-dne-part-2 in autotest
3
9223372036854775807
Description
replay-single test 70b fails; in fact, it hangs. There are several problems here:
1. The test fails complaining that "import is not in FULL state":
07:07:46:shadow-9vm9: rpc : @@@@@@ FAIL: can't put import for mdc.lustre-MDT0002-mdc-*.mds_server_uuid into FULL state after 1475 sec, have REPLAY
07:07:46:shadow-9vm6: 1 cleanup 2741 sec
07:07:46:shadow-9vm9: 1 cleanup 2741 sec
07:07:47:shadow-9vm9: Trace dump:
07:07:47:shadow-9vm9: = /usr/lib64/lustre/tests/test-framework.sh:4727:error_noexit()
07:07:47:shadow-9vm9: = /usr/lib64/lustre/tests/test-framework.sh:4758:error()
07:07:47:shadow-9vm9: = /usr/lib64/lustre/tests/test-framework.sh:5830:_wait_import_state()
07:07:47:shadow-9vm9: = /usr/lib64/lustre/tests/test-framework.sh:5849:wait_import_state()
07:07:47:shadow-9vm9: = /usr/lib64/lustre/tests/test-framework.sh:5858:wait_import_state_mount()
07:07:47:shadow-9vm9: = rpc.sh:20:main()
07:07:47:shadow-9vm9: CMD: shadow-9vm4,shadow-9vm6,shadow-9vm7,shadow-9vm8,shadow-9vm9.shadow.whamcloud.com PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:./../utils:/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/openmpi/bin:/usr/bin:/bin:/usr/sbin:/sbin::/sbin:/bin:/usr/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh check_logdir /tmp/test_logs/1435992181
07:07:47:shadow-9vm6: rpc : @@@@@@ FAIL: can't put import for mdc.lustre-MDT0002-mdc-*.mds_server_uuid into FULL state after 1475 sec, have REPLAY
07:07:47:shadow-9vm6: 1 cleanup 2742 sec
07:07:47:shadow-9vm9: 1 cleanup 2742 sec
07:07:47:shadow-9vm6: Trace dump:
07:07:47:shadow-9vm6: = /usr/lib64/lustre/tests/test-framework.sh:4727:error_noexit()
07:07:47:shadow-9vm6: = /usr/lib64/lustre/tests/test-framework.sh:4758:error()
07:07:47:shadow-9vm6: = /usr/lib64/lustre/tests/test-framework.sh:5830:_wait_import_state()
07:07:47:shadow-9vm6: = /usr/lib64/lustre/tests/test-framework.sh:5849:wait_import_state()
07:07:47:shadow-9vm6: = /usr/lib64/lustre/tests/test-framework.sh:5858:wait_import_state_mount()
07:07:47:shadow-9vm6: = rpc.sh:20:main()
07:07:47:shadow-9vm9: CMD: shadow-9vm4 uname -n
07:07:47:shadow-9vm6: CMD: shadow-9vm4,shadow-9vm6,shadow-9vm6.shadow.whamcloud.com,shadow-9vm7,shadow-9vm8 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:./../utils:/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/openmpi/bin:/usr/bin:/bin:/usr/sbin:/sbin::/sbin:/bin:/usr/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh check_logdir /tmp/test_logs/1435992181
07:07:47:shadow-9vm9: Dumping lctl log to /tmp/test_logs/1435992181/rpc..*.1435993666.log
07:07:47:shadow-9vm9: CMD: shadow-9vm4,shadow-9vm6,shadow-9vm7,shadow-9vm8,shadow-9vm9.shadow.whamcloud.com /usr/sbin/lctl dk > /tmp/test_logs/1435992181/rpc..debug_log.\$(hostname -s).1435993666.log;
07:07:47:shadow-9vm9: dmesg > /tmp/test_logs/1435992181/rpc..dmesg.\$(hostname -s).1435993666.log
07:07:48:shadow-9vm6: CMD: shadow-9vm4 uname -n
07:07:48:shadow-9vm6: Dumping lctl log to /tmp/test_logs/1435992181/rpc..*.1435993667.log
07:07:48:shadow-9vm6: CMD: shadow-9vm4,shadow-9vm6,shadow-9vm6.shadow.whamcloud.com,shadow-9vm7,shadow-9vm8 /usr/sbin/lctl dk > /tmp/test_logs/1435992181/rpc..debug_log.\$(hostname -s).1435993667.log;
07:07:48:shadow-9vm6: dmesg > /tmp/test_logs/1435992181/rpc..dmesg.\$(hostname -s).1435993667.log
07:07:48:shadow-9vm6: 1 cleanup 2743 sec
07:07:48:shadow-9vm9: 1 cleanup 2743 sec
07:07:48:shadow-9vm9: CMD: shadow-9vm4,shadow-9vm6,shadow-9vm7,shadow-9vm8,shadow-9vm9.shadow.whamcloud.com rsync -az /tmp/test_logs/1435992181/rpc..*.1435993666.log shadow-9vm9.shadow.whamcloud.com:/tmp/test_logs/1435992181
07:07:48:shadow-9vm6: CMD: shadow-9vm4,shadow-9vm6,shadow-9vm6.shadow.whamcloud.com,shadow-9vm7,shadow-9vm8 rsync -az /tmp/test_logs/1435992181/rpc..*.1435993667.log shadow-9vm6.shadow.whamcloud.com:/tmp/test_logs/1435992181
07:07:49:shadow-9vm6: 1 cleanup 2744 sec
07:07:49:shadow-9vm9: 1 cleanup 2744 sec
07:07:49: replay-single test_70b: @@@@@@ FAIL: import is not in FULL state
07:07:49: Trace dump:
07:07:49: = /usr/lib64/lustre/tests/test-framework.sh:4727:error_noexit()
07:07:49: = /usr/lib64/lustre/tests/test-framework.sh:4758:error()
07:07:49: = /usr/lib64/lustre/tests/test-framework.sh:6004:wait_clients_import_state()
07:07:49: = /usr/lib64/lustre/tests/test-framework.sh:2574:fail()
07:07:49: = /usr/lib64/lustre/tests/replay-single.sh:2091:test_70b()
07:07:49: = /usr/lib64/lustre/tests/test-framework.sh:5020:run_one()
07:07:49: = /usr/lib64/lustre/tests/test-framework.sh:5057:run_one_logged()
07:07:49: = /usr/lib64/lustre/tests/test-framework.sh:4907:run_test()
07:07:49: = /usr/lib64/lustre/tests/replay-single.sh:2102:main()
2. The test does not fail in a way that Maloo can recognize, so autotest times the test out. In the test reports below, it looks as though test 70b never ran and only the test suite failed. The Maloo report shows 93/93 tests passing, but clearly not all of the replay-single tests were run, and the suite_stdout contains the error message above.
3. No logs are collected to analyze this failure.
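For anyone trying to reproduce this by hand, the import-state check that times out above can be approximated with a short polling loop. This is only a sketch, not the test-framework.sh implementation: the function names import_state and wait_import_full are hypothetical, and it assumes that the mds_server_uuid parameter prints the target UUID followed by the connection state (for example REPLAY or FULL) on one line.

```shell
#!/bin/bash
# Hypothetical helpers, not part of test-framework.sh.

# Read "UUID STATE" lines on stdin and print the distinct state values.
# Assumption: the state is the second whitespace-separated field.
import_state() {
    awk '{ print $2 }' | sort -u
}

# Poll an import parameter until it reports FULL or the timeout expires.
#   $1 = parameter pattern, e.g. 'mdc.lustre-MDT0002-mdc-*.mds_server_uuid'
#   $2 = timeout in seconds (default 60)
wait_import_full() {
    local param=$1 timeout=${2:-60} elapsed=0 state=""
    while [ "$elapsed" -lt "$timeout" ]; do
        state=$(lctl get_param -n "$param" 2>/dev/null | import_state)
        [ "$state" = "FULL" ] && return 0
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "import $param not FULL after ${timeout} sec, have ${state:-unknown}" >&2
    return 1
}
```

If wait_import_full returns non-zero on a client, the same commands that appear in the log above (lctl dk and dmesg, redirected to files) could be run on that node to capture debug logs before the suite is timed out.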
This test has failed in this way five times this month:
2015-07-10 08:11:47 - https://testing.hpdd.intel.com/test_sets/a941c760-2725-11e5-bc86-5254006e85c2
2015-07-11 21:00:38 - https://testing.hpdd.intel.com/test_sets/ee8a4f52-2858-11e5-ba19-5254006e85c2
2015-07-20 14:23:34 - https://testing.hpdd.intel.com/test_sets/cff07ed6-2f33-11e5-92dd-5254006e85c2
2015-07-21 16:12:58 - https://testing.hpdd.intel.com/test_sets/3fadf97a-300f-11e5-97d6-5254006e85c2
2015-07-31 06:19:18 - https://testing.hpdd.intel.com/test_sets/9635fe7a-3797-11e5-9d53-5254006e85c2