Details
- Bug
- Resolution: Fixed
- Blocker
- Lustre 2.6.0
- 3
- 13533
Description
This LBUG occurs on remount after a crash (or reset) while running racer on 2.5.57-72-g69ddb2e or on a checkout of http://review.whamcloud.com/#/c/9699/. Disabling migration in racer has no effect.
# export MDSCOUNT=4
# export MOUNT_2=y
# llmount.sh
...
# sh lustre/tests/racer.sh
...
Wait for panic or reset while running.
...
Restart the node.
# export MDSCOUNT=4
# export PTLDEBUG=+trace
# export NOFORMAT=1
# llmount.sh
Call Trace:
[<ffffffffa02a9895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[<ffffffffa02a9e97>] lbug_with_loc+0x47/0xb0 [libcfs]
[<ffffffffa0ced4db>] lod_declare_object_destroy+0x55b/0x780 [lod]
[<ffffffffa0bb3869>] __mdd_orphan_cleanup+0x7d9/0xca0 [mdd]
[<ffffffffa0bc7cbd>] mdd_recovery_complete+0xed/0x170 [mdd]
[<ffffffffa0bfa9c5>] mdt_postrecov+0x35/0xd0 [mdt]
[<ffffffffa0bfbf08>] mdt_obd_postrecov+0x78/0x90 [mdt]
[<ffffffffa0632cf4>] ? ldlm_reprocess_all_ns+0xa4/0x110 [ptlrpc]
[<ffffffffa0648505>] target_recovery_thread+0xd25/0x19c0 [ptlrpc]
[<ffffffffa06477e0>] ? target_recovery_thread+0x0/0x19c0 [ptlrpc]
[<ffffffff81096a36>] kthread+0x96/0xa0
[<ffffffff8100c0ca>] child_rip+0xa/0x20
[<ffffffff810969a0>] ? kthread+0x0/0xa0
[<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Kernel panic - not syncing: LBUG
00010000:00080000:0.0:1397318109.970223:0:3046:0:(ldlm_lib.c:1991:target_recovery_thread()) lustre-MDT0002: started recovery thread pid 3046
00010000:02000400:3.0:1397318182.861911:0:3046:0:(ldlm_lib.c:1803:target_recovery_overseer()) lustre-MDT0002: recovery is timed out, evict stale exports
...
00000100:00100000:2.0:1397318182.977710:0:3046:0:(client.c:1849:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc tgt_recov:lustre-MDT0002-mdtlov_UUID:3046:1465194230319564:0@lo:1000
00000004:00080000:2.0:1397318182.977722:0:3046:0:(mdd_orphans.c:395:orph_key_test_and_del()) Found orphan [0x380000bd0:0x170f:0x0], delete it
00000004:00040000:2.0:1397318182.977731:0:3046:0:(dt_object.h:1483:dt_declare_delete()) ASSERTION( dt->do_index_ops ) failed:
00000004:00040000:2.0:1397318182.977735:0:3046:0:(dt_object.h:1483:dt_declare_delete()) LBUG
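When a debug log like the one above arrives as one flattened run of text, it can help to split it back into entries and pull out the failing ones. The following is a minimal Python sketch of that, not part of Lustre or its tooling. The header layout is inferred only from the lines quoted in this ticket, and the field names (`subsys`, `mask`, `cpu`, `stack`, `extpid`) and helper names (`split_entries`, `find_failures`) are hypothetical labels chosen for illustration.

```python
import re

# Assumed header layout, inferred from the log lines in this ticket:
#   xxxxxxxx:xxxxxxxx:cpu:seconds.usec:n:pid:n:(file:line:function()) message
# The group names below are illustrative labels, not official field names.
ENTRY_RE = re.compile(
    r"(?P<subsys>[0-9a-f]{8}):(?P<mask>[0-9a-f]{8}):(?P<cpu>\d+\.\d+):"
    r"(?P<time>\d+\.\d+):(?P<stack>\d+):(?P<pid>\d+):(?P<extpid>\d+):"
    r"\((?P<file>[^:()]+):(?P<line>\d+):(?P<func>\w+)\(\)\) "
)

def split_entries(text):
    """Split flattened log text into dicts, one per entry header found."""
    matches = list(ENTRY_RE.finditer(text))
    entries = []
    for i, m in enumerate(matches):
        # The message runs from the end of this header to the next header.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        entry = m.groupdict()
        entry["message"] = text[m.end():end].strip()
        entries.append(entry)
    return entries

def find_failures(entries):
    """Return entries whose message is an assertion failure or an LBUG."""
    return [
        e for e in entries
        if "ASSERTION" in e["message"] or e["message"] == "LBUG"
    ]

# Three entries copied from the log in this ticket, joined on one line.
SAMPLE = (
    "00000004:00080000:2.0:1397318182.977722:0:3046:0:"
    "(mdd_orphans.c:395:orph_key_test_and_del()) "
    "Found orphan [0x380000bd0:0x170f:0x0], delete it "
    "00000004:00040000:2.0:1397318182.977731:0:3046:0:"
    "(dt_object.h:1483:dt_declare_delete()) "
    "ASSERTION( dt->do_index_ops ) failed: "
    "00000004:00040000:2.0:1397318182.977735:0:3046:0:"
    "(dt_object.h:1483:dt_declare_delete()) LBUG"
)

if __name__ == "__main__":
    for e in find_failures(split_entries(SAMPLE)):
        print(e["file"], e["line"], e["func"], e["message"])
```

Run on the sample, this reports the two `dt_declare_delete()` entries at `dt_object.h:1483`, immediately after the orphan `[0x380000bd0:0x170f:0x0]` is found for deletion.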
Issue Links
- is related to LU-3531 DNE2: striped directory (Resolved)