Details
-
Bug
-
Resolution: Cannot Reproduce
-
Major
-
None
-
Lustre 2.7.0
-
None
-
Versions of 2.6.54 on clients & servers.
Cray SLES11SP3 clients, CentOS servers (2.6.32-431.5.1.el6.x86_64).
Most recent commit on clients:
Ie7a2a98be8cc97db9af7a64476c06fc7321544eb
http://review.whamcloud.com/12142
Most recent commit on servers:
If24443955290b091fd22905dfb74b0d6a6d1b4e8
http://review.whamcloud.com/12490
-
3
-
16451
Description
While running a general-purpose test of master with DNE II (2 MDSes with 3 MDTs each, 6 MDTs total), we ran ls to check the status of something, and the client hit an LBUG. We raised the debug level on a different client, ran ls again, and that client crashed as well:
2014-11-06T20:08:43.974554-06:00 c1-0c0s0n3 LustreError: 3797:0:(statahead.c:262:sa_kill()) ASSERTION( !list_empty(&entry->se_list) ) failed:
2014-11-06T20:08:43.974591-06:00 c1-0c0s0n3 LustreError: 3797:0:(statahead.c:262:sa_kill()) LBUG
2014-11-06T20:08:43.974597-06:00 c1-0c0s0n3 Pid: 3797, comm: ls
2014-11-06T20:08:43.974604-06:00 c1-0c0s0n3 Call Trace:
2014-11-06T20:08:43.974617-06:00 c1-0c0s0n3 [<ffffffff81006591>] try_stack_unwind+0x161/0x1a0
2014-11-06T20:08:43.974622-06:00 c1-0c0s0n3 [<ffffffff81004de9>] dump_trace+0x89/0x440
2014-11-06T20:08:43.974632-06:00 c1-0c0s0n3 [<ffffffffa0176897>] libcfs_debug_dumpstack+0x57/0x80 [libcfs]
2014-11-06T20:08:43.974639-06:00 c1-0c0s0n3 [<ffffffffa0176de7>] lbug_with_loc+0x47/0xc0 [libcfs]
2014-11-06T20:08:44.015207-06:00 c1-0c0s0n3 [<ffffffffa0750512>] sa_put+0x332/0x370 [lustre]
2014-11-06T20:08:44.015232-06:00 c1-0c0s0n3 [<ffffffffa0752b6c>] do_statahead_enter+0xfdc/0x1cb0 [lustre]
2014-11-06T20:08:44.015238-06:00 c1-0c0s0n3 [<ffffffffa073a695>] ll_lookup_it+0x7b5/0x1b70 [lustre]
2014-11-06T20:08:44.015245-06:00 c1-0c0s0n3 [<ffffffffa073bad5>] ll_lookup_nd+0x85/0x570 [lustre]
2014-11-06T20:08:44.015250-06:00 c1-0c0s0n3 [<ffffffff81148edc>] d_alloc_and_lookup+0x4c/0x80
2014-11-06T20:08:44.015261-06:00 c1-0c0s0n3 [<ffffffff8114a57e>] do_lookup+0x29e/0x3a0
2014-11-06T20:08:44.015273-06:00 c1-0c0s0n3 [<ffffffff8114cdb3>] path_lookupat+0xc3/0x5e0
2014-11-06T20:08:44.015281-06:00 c1-0c0s0n3 [<ffffffff8114d305>] do_path_lookup+0x35/0xd0
2014-11-06T20:08:44.015287-06:00 c1-0c0s0n3 [<ffffffff8114e043>] user_path_at_empty+0x83/0xb0
2014-11-06T20:08:44.015303-06:00 c1-0c0s0n3 [<ffffffff8114e081>] user_path_at+0x11/0x20
2014-11-06T20:08:44.015311-06:00 c1-0c0s0n3 [<ffffffff81142d85>] vfs_fstatat+0x55/0x90
2014-11-06T20:08:44.054257-06:00 c1-0c0s0n3 [<ffffffff81142e2e>] vfs_lstat+0x1e/0x20
2014-11-06T20:08:44.054275-06:00 c1-0c0s0n3 [<ffffffff81142e54>] sys_newlstat+0x24/0x50
2014-11-06T20:08:44.054281-06:00 c1-0c0s0n3 [<ffffffff8133776b>] system_call_fastpath+0x16/0x1b
2014-11-06T20:08:44.054293-06:00 c1-0c0s0n3 [<00007ffff78b9515>] 0x7ffff78b9515
I'll make the client dump available in a few minutes.
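For anyone reading the trace without the source at hand: the failed check is `!list_empty(&entry->se_list)`, i.e. sa_kill() expected the statahead entry to still be linked on some list and found it unlinked. The sketch below is a userspace illustration of that list_head convention only. It is not the Lustre code; `struct demo_entry` and the local helper implementations are hypothetical stand-ins, and only the `se_list` field name is taken from the log.

```c
#include <assert.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's circular doubly linked list head. */
struct list_head {
    struct list_head *next, *prev;
};

/* An initialized, unlinked head points at itself. */
static void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Empty means "points at itself", i.e. not linked to anything else. */
static int list_empty(const struct list_head *h)
{
    return h->next == h;
}

/* Link item immediately after head. */
static void list_add(struct list_head *item, struct list_head *head)
{
    item->next = head->next;
    item->prev = head;
    head->next->prev = item;
    head->next = item;
}

/* Unlink item and reset it to the self-pointing (empty) state. */
static void list_del_init(struct list_head *item)
{
    item->prev->next = item->next;
    item->next->prev = item->prev;
    init_list_head(item);
}

/* Hypothetical entry type; only the se_list name comes from the log. */
struct demo_entry {
    struct list_head se_list;
};

/* Returns 1 if an assertion of the form !list_empty(&e->se_list) would fail. */
static int would_trip_assertion(const struct demo_entry *e)
{
    return list_empty(&e->se_list);
}
```

Under these assumptions, an entry that has been added to a queue passes the check, and an entry that was already removed with a delete-and-reinit helper trips it. That matches the state the assertion reports, though the log alone does not show which code path unlinked the entry.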