Description
This issue was created by maloo for Li Wei <liwei@whamcloud.com>
This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/31e6229a-a18c-11e2-8fc0-52540035b04c.
The sub-test test_24v failed with the following error:
test failed to respond and timed out
Info required for matching: sanity 24v
From the test log:
== sanity test 24v: list directory with large files (handle hash collision, bug: 17560) == 05:48:16 (1365511696)
- created 10000 (time 1365511718.82 total 21.86 last 21.86)
- created 20000 (time 1365511741.85 total 44.90 last 23.03)
- created 30000 (time 1365511763.98 total 67.02 last 22.13)
- created 40000 (time 1365511803.12 total 106.17 last 39.14)
- created 50000 (time 1365511905.09 total 208.14 last 101.97)
- created 60000 (time 1365511983.21 total 286.26 last 78.12)
- created 70000 (time 1365512040.82 total 343.87 last 57.61)
- created 80000 (time 1365512097.71 total 400.75 last 56.88)
- created 90000 (time 1365512214.20 total 517.25 last 116.50)
total: 100000 creates in 1401.14 seconds: 71.37 creates/second
mdc.lustre-MDT0000-mdc-ffff88007abcd000.stats=clear
ls: reading directory /mnt/lustre/d0.sanity/d24: Input/output error
sanity test_24v: @@@@@@ FAIL: error in listing large dir
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:4024:error_noexit()
= /usr/lib64/lustre/tests/test-framework.sh:4047:error()
= /usr/lib64/lustre/tests/sanity.sh:1018:test_24v()
= /usr/lib64/lustre/tests/test-framework.sh:4301:run_one()
= /usr/lib64/lustre/tests/test-framework.sh:4334:run_one_logged()
= /usr/lib64/lustre/tests/test-framework.sh:4189:run_test()
= /usr/lib64/lustre/tests/sanity.sh:1036:main()
Dumping lctl log to /logdir/test_logs/2013-04-09/lustre-reviews-el6-x86_64-review-1_1_1_14707_-70245651123900-051101/sanity.test_24v.*.1365513300.log
CMD: wtm-27vm3,wtm-27vm4,wtm-27vm5,wtm-27vm6.rosso.whamcloud.com /usr/sbin/lctl dk > /logdir/test_logs/2013-04-09/lustre-reviews-el6-x86_64-review-1_1_1_14707_-70245651123900-051101/sanity.test_24v.debug_log.\$(hostname -s).1365513300.log;
dmesg > /logdir/test_logs/2013-04-09/lustre-reviews-el6-x86_64-review-1_1_1_14707_-70245651123900-051101/sanity.test_24v.dmesg.\$(hostname -s).1365513300.log
From the client console log:
06:15:00:Lustre: DEBUG MARKER: == sanity test 24v: list directory with large files (handle hash collision, bug: 17560) == 05:48:16 (1365511696)
06:15:00:Lustre: DEBUG MARKER: cancel_lru_locks mdc start
06:15:01:Lustre: DEBUG MARKER: cancel_lru_locks mdc stop
06:15:01:Lustre: 18546:0:(dir.c:463:ll_get_dir_page()) Page-wide hash collision: 6491135612813312
06:15:01:LustreError: 18546:0:(dir.c:594:ll_dir_read()) error reading dir [0x200000400:0xa3:0x0] at 6491135612813312: rc -5
06:15:01:Lustre: 18546:0:(dir.c:463:ll_get_dir_page()) Page-wide hash collision: 6491135612813312
06:15:01:LustreError: 18546:0:(dir.c:594:ll_dir_read()) error reading dir [0x200000400:0xa3:0x0] at 6491135612813312: rc -5
06:15:01:Lustre: DEBUG MARKER: /usr/sbin/lctl mark sanity test_24v: @@@@@@ FAIL: error in listing large dir
06:15:01:Lustre: DEBUG MARKER: sanity test_24v: @@@@@@ FAIL: error in listing large dir
Note that:
- The failure happened on ZFS; recent ldiskfs sessions are clean.
- The build being tested includes the LU-2990 fix.
- This test, when run by Autotest, was previously skipped on ZFS due to "not enough free inodes 15833 required 100000", but is enabled by the patch under test (http://review.whamcloud.com/5806).
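For context on why test 24v creates 100000 files: directory readdir cookies are derived from name hashes, so a large enough directory is needed to provoke collisions. As an illustrative sketch only (the exact hash width and function used by the backend are not shown in these logs), a birthday-bound estimate shows how collision likelihood depends on hash size at this file count:

```python
import math

def collision_probability(entries, hash_bits):
    """Birthday-bound estimate of P(at least one collision) when
    `entries` names are hashed uniformly into a 2**hash_bits space."""
    space = 2.0 ** hash_bits
    return 1.0 - math.exp(-entries * (entries - 1) / (2.0 * space))

# sanity test 24v creates 100000 files in a single directory
n = 100000
print(f"32-bit hash: P(collision) ~ {collision_probability(n, 32):.3f}")
print(f"64-bit hash: P(collision) ~ {collision_probability(n, 64):.2e}")
```

With a 32-bit hash, a collision among 100000 entries is more likely than not, which is why the test exists; the "Page-wide hash collision" message above indicates the client hit colliding hash values while paging through the directory.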