Details
-
Bug
-
Resolution: Cannot Reproduce
-
Minor
-
None
-
Lustre 2.7.0, Lustre 2.8.0
-
None
-
3
-
17660
Description
When metadata is migrated, the OST-object's PFID still references the old MDT-object, which has already been removed. The layout LFSCK then tries to handle this as an unmatched pair: the MDT-object (new) and the OST-object (old).
Unfortunately, the layout LFSCK assistant thread hung on an LDLM lock while checking the OST-object's parent (lfsck_layout_check_parent in the trace):
lfsck_layout  S 0000000000000000     0 10745      2 0x00000080
 ffff8800580158c0 0000000000000046 0000000000000000 ffff880058015890
 ffff880058015820 ffff88006bf360f8 00000aaa4cb7f0b6 0000000000000000
 ffff880058015840 0000000100ae5af3 ffff880037d7dab8 ffff880058015fd8
Call Trace:
 [<ffffffffa07de290>] ? ldlm_expired_completion_wait+0x0/0x370 [ptlrpc]
 [<ffffffffa07e2e7d>] ldlm_completion_ast+0x66d/0x9b0 [ptlrpc]
 [<ffffffff81064b90>] ? default_wake_function+0x0/0x20
 [<ffffffffa07dcf06>] ldlm_cli_enqueue_fini+0x936/0xe30 [ptlrpc]
 [<ffffffffa07fbd7b>] ? ptlrpc_set_destroy+0x26b/0x450 [ptlrpc]
 [<ffffffffa07dd7c1>] ldlm_cli_enqueue+0x3c1/0x870 [ptlrpc]
 [<ffffffffa07e2810>] ? ldlm_completion_ast+0x0/0x9b0 [ptlrpc]
 [<ffffffffa07e0f70>] ? ldlm_blocking_ast+0x0/0x180 [ptlrpc]
 [<ffffffffa10980e8>] osp_md_object_lock+0x188/0x210 [osp]
 [<ffffffffa0dfa2a4>] lfsck_ibits_lock+0x1e4/0x2e0 [lfsck]
 [<ffffffffa0e378d8>] lfsck_layout_check_parent+0x698/0xa40 [lfsck]
 [<ffffffffa0e33b97>] ? dt_xattr_get+0x97/0x130 [lfsck]
 [<ffffffffa0e49fc3>] lfsck_layout_assistant_handler_p1+0x683/0x19f0 [lfsck]
 [<ffffffff8152a27e>] ? thread_return+0x4e/0x7d0
 [<ffffffff81064ba2>] ? default_wake_function+0x12/0x20
 [<ffffffffa0e106e6>] lfsck_assistant_engine+0x496/0x1de0 [lfsck]
 [<ffffffff8105e0d0>] ? __dequeue_entity+0x30/0x50
 [<ffffffff81064b90>] ? default_wake_function+0x0/0x20
 [<ffffffffa0e10250>] ? lfsck_assistant_engine+0x0/0x1de0 [lfsck]
 [<ffffffff8109e66e>] kthread+0x9e/0xc0
 [<ffffffff8100c20a>] child_rip+0xa/0x20
 [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
 [<ffffffff8100c200>] ? child_rip+0x0/0x20
This issue was created by maloo for nasf <fan.yong@intel.com>
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/e809aea6-be91-11e4-ac06-5254006e85c2.