Details
Bug
Resolution: Fixed
Blocker
Lustre 2.7.0
None
3
17440
Description
When a client opens a file, the client-side thread handling the open may hit an error after the MDT has already granted the open. In that case, the client should send a close RPC to the MDT as cleanup; otherwise, the open handle is leaked on the MDT until the client unmounts or is evicted.
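The cleanup rule above can be illustrated with a toy model. This is only a sketch, not Lustre code: `ToyMDT`, `client_open`, and `finish_open` are hypothetical names invented for this example, and the real client and MDT code paths are far more involved.

```python
class ToyMDT:
    """Hypothetical stand-in for the MDT; it only tracks granted open handles."""

    def __init__(self):
        self.open_handles = set()
        self._next_id = 1

    def open(self, fid):
        # Grant the open and remember the handle on the "server" side.
        handle = (fid, self._next_id)
        self._next_id += 1
        self.open_handles.add(handle)
        return handle

    def close(self, handle):
        # Release the server-side handle (stands in for the close RPC).
        self.open_handles.discard(handle)


def client_open(mdt, fid, finish_open):
    """Open `fid`, run client-side post-processing, and close on failure."""
    handle = mdt.open(fid)       # the MDT has granted the open
    try:
        finish_open(handle)      # client-side handling may still fail here
    except Exception:
        mdt.close(handle)        # cleanup so the handle is not leaked
        raise
    return handle
```

Without the `mdt.close(handle)` call in the error path, the handle would remain in `open_handles` even though the caller never received it, which is the leak described above.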
If LFSCK sets LU_OBJECT_HEARD_BANSHEE on an MDT-object that others have open, in order to repair an inconsistency such as a multiply-referenced OST-object, the leaked open handle still holds a reference to that MDT-object. Subsequent threads that try to locate the object via its FID then block:
23:07:57:INFO: task mdt00_000:6380 blocked for more than 120 seconds.
23:07:57: Not tainted 2.6.32-504.8.1.el6_lustre.g0ef66b1.x86_64 #1
23:07:57:mdt00_000 D 0000000000000001 0 6380 2 0x00000080
23:07:57:Call Trace:
23:07:57: [<ffffffffa05f62af>] ? lu_object_find_try+0x9f/0x260 [obdclass]
23:07:57: [<ffffffffa05f64ad>] lu_object_find_at+0x3d/0xe0 [obdclass]
23:07:57: [<ffffffffa05f6566>] lu_object_find+0x16/0x20 [obdclass]
23:07:57: [<ffffffffa0ebe056>] mdt_object_find+0x56/0x170 [mdt]
23:07:57: [<ffffffffa0ef5407>] mdt_reint_open+0x1527/0x2c70 [mdt]
23:07:57: [<ffffffffa0edd0cd>] mdt_reint_rec+0x5d/0x200 [mdt]
23:07:57: [<ffffffffa0ec123b>] mdt_reint_internal+0x4cb/0x7a0 [mdt]
23:07:57: [<ffffffffa0ec1706>] mdt_intent_reint+0x1f6/0x430 [mdt]
23:07:57: [<ffffffffa0ebfcf4>] mdt_intent_policy+0x494/0xce0 [mdt]
23:07:57: [<ffffffffa07c24f9>] ldlm_lock_enqueue+0x129/0x9d0 [ptlrpc]
23:07:57: [<ffffffffa07ee48b>] ldlm_handle_enqueue0+0x51b/0x13f0 [ptlrpc]
23:07:57: [<ffffffffa086e951>] tgt_enqueue+0x61/0x230 [ptlrpc]
23:07:57: [<ffffffffa086f59e>] tgt_request_handle+0x8be/0x1000 [ptlrpc]
23:07:57: [<ffffffffa081f5c1>] ptlrpc_main+0xe41/0x1960 [ptlrpc]
23:07:57: [<ffffffff8109e66e>] kthread+0x9e/0xc0
23:07:57: [<ffffffff8100c20a>] child_rip+0xa/0x20
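The interaction between a leaked reference and an object marked as dying can be sketched with a small toy cache. This is an illustrative model only: `ToyObject`, `ToyCache`, and the `dying` flag (standing in for LU_OBJECT_HEARD_BANSHEE) are hypothetical, and `find` returns `None` where the trace above shows the real thread blocked in `lu_object_find_at`.

```python
class ToyObject:
    """Hypothetical cached object identified by a FID."""

    def __init__(self, fid):
        self.fid = fid
        self.refs = 0
        self.dying = False  # stands in for LU_OBJECT_HEARD_BANSHEE


class ToyCache:
    """Toy FID-to-object cache; not the real obdclass implementation."""

    def __init__(self):
        self.objects = {}

    def find(self, fid):
        obj = self.objects.get(fid)
        if obj is not None and obj.dying:
            # A dying object cannot be reused; the caller has to wait
            # until every existing reference has been dropped.
            return None
        if obj is None:
            obj = self.objects[fid] = ToyObject(fid)
        obj.refs += 1
        return obj

    def put(self, obj):
        obj.refs -= 1
        if obj.refs == 0 and obj.dying:
            # Last reference gone: the dying object leaves the cache.
            del self.objects[obj.fid]
```

In this model, a reference that is never released with `put` (the leaked open handle) keeps the dying object in the cache indefinitely, so every later `find` for the same FID keeps failing. Once the reference is dropped, the next `find` creates a fresh object.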