[LU-4897] dt_declare_delete()) ASSERTION( dt->do_index_ops ) failed (in orphan cleanup) Created: 12/Apr/14 Updated: 08/May/14 Resolved: 08/May/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.6.0 |
| Fix Version/s: | Lustre 2.6.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | John Hammond | Assignee: | Di Wang |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | dne2, lod, mdd |
| Issue Links: | |
| Severity: | 3 |
| Rank (Obsolete): | 13533 |
| Description |
|
This occurs on remount after crash (or reset) while running racer on 2.5.57-72-g69ddb2e or on checkout of http://review.whamcloud.com/#/c/9699/. Disabling migration in racer has no effect here.

# export MDSCOUNT=4
# export MOUNT_2=y
# llmount.sh
...
# sh lustre/tests/racer.sh
...

Wait for panic or reset while running.
...
Restart the node.

# export MDSCOUNT=4
# export PTLDEBUG=+trace
# export NOFORMAT=1
# llmount.sh

Call Trace:
[<ffffffffa02a9895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[<ffffffffa02a9e97>] lbug_with_loc+0x47/0xb0 [libcfs]
[<ffffffffa0ced4db>] lod_declare_object_destroy+0x55b/0x780 [lod]
[<ffffffffa0bb3869>] __mdd_orphan_cleanup+0x7d9/0xca0 [mdd]
[<ffffffffa0bc7cbd>] mdd_recovery_complete+0xed/0x170 [mdd]
[<ffffffffa0bfa9c5>] mdt_postrecov+0x35/0xd0 [mdt]
[<ffffffffa0bfbf08>] mdt_obd_postrecov+0x78/0x90 [mdt]
[<ffffffffa0632cf4>] ? ldlm_reprocess_all_ns+0xa4/0x110 [ptlrpc]
[<ffffffffa0648505>] target_recovery_thread+0xd25/0x19c0 [ptlrpc]
[<ffffffffa06477e0>] ? target_recovery_thread+0x0/0x19c0 [ptlrpc]
[<ffffffff81096a36>] kthread+0x96/0xa0
[<ffffffff8100c0ca>] child_rip+0xa/0x20
[<ffffffff810969a0>] ? kthread+0x0/0xa0
[<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Kernel panic - not syncing: LBUG

00010000:00080000:0.0:1397318109.970223:0:3046:0:(ldlm_lib.c:1991:target_recovery_thread()) lustre-MDT0002: started recovery thread pid 3046
00010000:02000400:3.0:1397318182.861911:0:3046:0:(ldlm_lib.c:1803:target_recovery_overseer()) lustre-MDT0002: recovery is timed out, evict stale exports
...
00000100:00100000:2.0:1397318182.977710:0:3046:0:(client.c:1849:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc tgt_recov:lustre-MDT0002-mdtlov_UUID:3046:1465194230319564:0@lo:1000
00000004:00080000:2.0:1397318182.977722:0:3046:0:(mdd_orphans.c:395:orph_key_test_and_del()) Found orphan [0x380000bd0:0x170f:0x0], delete it
00000004:00040000:2.0:1397318182.977731:0:3046:0:(dt_object.h:1483:dt_declare_delete()) ASSERTION( dt->do_index_ops ) failed:
00000004:00040000:2.0:1397318182.977735:0:3046:0:(dt_object.h:1483:dt_declare_delete()) LBUG |
| Comments |
| Comment by Jodi Levi (Inactive) [ 15/Apr/14 ] |
|
Di, |
| Comment by Di Wang [ 18/Apr/14 ] |
|
Hmm, it seems the index_try call is missing before the orphans are deleted. I will cook a patch. |
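
The failure mode described in this comment can be illustrated with a toy model. This is a minimal sketch and not Lustre code: every `mock_*` name below is hypothetical, and the structs are simplified stand-ins for the real `dt_object` types. It assumes only what the ticket states, namely that the delete declaration asserts on `do_index_ops`, and that an "index_try" step is what populates it. Where the real code hits an LBUG, the mock returns `-EINVAL` so the behavior can be observed without crashing.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an object's index operations table. */
struct mock_index_ops {
    int (*declare_delete)(const char *key);
};

/* Hypothetical stand-in for a dt_object: index_ops stays NULL
 * until mock_index_try() has succeeded on the object. */
struct mock_object {
    const struct mock_index_ops *index_ops;
    int is_dir;
};

static int mock_declare_delete_impl(const char *key)
{
    (void)key;          /* a real implementation would reserve transaction credits */
    return 0;
}

static const struct mock_index_ops mock_dir_index_ops = {
    .declare_delete = mock_declare_delete_impl,
};

/* Models the "index_try" step: attach index operations if the
 * object can be used as an index (here: if it is a directory). */
static int mock_index_try(struct mock_object *obj)
{
    if (!obj->is_dir)
        return -ENOTDIR;
    obj->index_ops = &mock_dir_index_ops;
    return 0;
}

/* Models the delete declaration. The ticket's assertion fires when
 * the index ops are missing; the mock reports -EINVAL instead. */
static int mock_declare_delete(struct mock_object *obj, const char *key)
{
    if (obj->index_ops == NULL)
        return -EINVAL;
    return obj->index_ops->declare_delete(key);
}

/* Orphan cleanup with the missing step restored: try the object
 * as an index first, then declare the delete. */
static int mock_orphan_declare_delete(struct mock_object *obj, const char *key)
{
    int rc = mock_index_try(obj);

    if (rc != 0)
        return rc;
    return mock_declare_delete(obj, key);
}
```

Calling `mock_declare_delete()` directly on a freshly initialized object fails, which corresponds to the assertion in the trace; going through `mock_orphan_declare_delete()` succeeds because the index operations are attached first. Whether the actual patch takes this exact shape is not stated in the ticket.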
| Comment by Di Wang [ 19/Apr/14 ] |
| Comment by Di Wang [ 08/May/14 ] |
|
The patch has been merged into 9511 and landed on master. |