[LU-7535] mdt_intent_layout does not care about lock handles
Created: 10/Dec/15  Updated: 01/Jun/16  Resolved: 06/Jan/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: Lustre 2.8.0

Type: Bug
Priority: Major
Reporter: Vitaly Fertman
Assignee: John Hammond
Resolution: Fixed
Votes: 0
Labels: None

Issue Links:
Duplicate
is duplicated by LU-7599 recovery-small test_130b: mds panic: ... Closed
Related
is related to LU-7173 ldlm_lock_destroy_internal() LBUG enc... Resolved
is related to LU-8043 MDS running lustre 2.5.5+ OOM when ru... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

Description

Lock handles must be zeroed by the time the intent handler returns. mdt_intent_layout() does not do this, so the check in mdt_lock_handle_fini() fails:

05:20:43:LustreError: 31441:0:(mdt_handler.c:2712:mdt_lock_handle_fini()) ASSERTION( !lustre_handle_is_used(&lh->mlh_reg_lh) ) failed: 
 05:20:43:LustreError: 31441:0:(mdt_handler.c:2712:mdt_lock_handle_fini()) LBUG
 05:20:43:Pid: 31441, comm: mdt00_002
 05:20:43:
 05:20:43:Call Trace:
 05:20:43: [<ffffffffa05d2875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 05:20:43: [<ffffffffa05d2e77>] lbug_with_loc+0x47/0xb0 [libcfs]
 05:20:43: [<ffffffffa0e2f47b>] mdt_lock_handle_fini+0x4b/0x80 [mdt]
 05:20:43: [<ffffffffa0e37390>] mdt_thread_info_fini+0xe0/0x190 [mdt]
 05:20:43: [<ffffffffa0e3bee2>] mdt_intent_policy+0xe2/0xc70 [mdt]
 05:20:43: [<ffffffffa06f18c7>] ldlm_lock_enqueue+0x127/0x970 [ptlrpc]
 05:20:43: [<ffffffffa071f3d7>] ldlm_handle_enqueue0+0x807/0x15a0 [ptlrpc]
 05:20:43: [<ffffffffa05de6c1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
 05:20:43: [<ffffffffa07aa321>] tgt_enqueue+0x61/0x230 [ptlrpc]
 05:20:43: [<ffffffffa07aadac>] tgt_request_handle+0x8bc/0x12e0 [ptlrpc]
 05:20:43: [<ffffffffa0752691>] ptlrpc_main+0xe41/0x1910 [ptlrpc]
 05:20:43: [<ffffffffa0751850>] ? ptlrpc_main+0x0/0x1910 [ptlrpc]
 05:20:43: [<ffffffff810a0fce>] kthread+0x9e/0xc0
 05:20:43: [<ffffffff8100c28a>] child_rip+0xa/0x20
 05:20:43: [<ffffffff810a0f30>] ? kthread+0x0/0xc0
 05:20:43: [<ffffffff8100c280>] ? child_rip+0x0/0x20
 05:20:43:
 05:20:43:Kernel panic - not syncing: LBUG
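
To make the failed assertion easier to follow, here is a minimal standalone sketch of the invariant it checks. This is not the Lustre source and not the landed patch: `struct handle`, `struct lock_handle`, `intent_handler()` and `lock_handle_fini_ok()` are hypothetical stand-ins, and the sketch assumes a handle counts as "used" when its cookie is nonzero. Only the field name `mlh_reg_lh` is taken from the assertion text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a lock handle: assumed "used" when cookie != 0. */
struct handle {
	uint64_t cookie;
};

/* Hypothetical stand-in for the per-request lock handle; only the
 * field named in the assertion is modelled. */
struct lock_handle {
	struct handle mlh_reg_lh;
};

static bool handle_is_used(const struct handle *h)
{
	return h->cookie != 0;
}

/* Hypothetical intent handler. It records a lock in the handle while it
 * works, and is expected to clear the handle before returning.
 * clear_on_return == false models the buggy behaviour. */
static void intent_handler(struct lock_handle *lh, bool clear_on_return)
{
	lh->mlh_reg_lh.cookie = 0x1234;	/* pretend a lock was taken */

	/* ... the handler finishes its work with the lock here ... */

	if (clear_on_return)
		lh->mlh_reg_lh.cookie = 0;	/* leave no stale handle behind */
}

/* Models the check done at teardown. It returns false where the code in
 * the trace above would hit the assertion and LBUG. */
static bool lock_handle_fini_ok(const struct lock_handle *lh)
{
	return !handle_is_used(&lh->mlh_reg_lh);
}
```

Calling `intent_handler(&lh, false)` followed by `lock_handle_fini_ok(&lh)` returns false in this model, which corresponds to the ASSERTION line in the log. With `clear_on_return` set to true the check passes.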


Comments
Comment by Andreas Dilger [ 10/Dec/15 ]

Vitaly, could you provide some information about how you hit this problem?

Comment by Jian Yu [ 10/Dec/15 ]

The failure occurred while running recovery-small test 130b to verify patch set 1 of http://review.whamcloud.com/17501 for LU-7173 under DNE configuration:
https://testing.hpdd.intel.com/test_sessions/9246ab4e-9dc1-11e5-91b0-5254006e85c2

Comment by Gerrit Updater [ 29/Dec/15 ]

John L. Hammond (john.hammond@intel.com) uploaded a new patch: http://review.whamcloud.com/17735
Subject: LU-7535 test: avoid layout intent in recovery-small 130[ab]
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 3f27751fa6851fdbbde31291bde4b284d7292194

Comment by James Nunez (Inactive) [ 05/Jan/16 ]

Another failure on master:
2016-01-04 10:59:15 - https://testing.hpdd.intel.com/test_sets/000486a0-b2ea-11e5-9134-5254006e85c2

Comment by Gerrit Updater [ 06/Jan/16 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/17735/
Subject: LU-7535 mdt: clear the lock handle in mdt_intent_layout()
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 86c0e7d98d52ebb5b6e7c4b50a40f60fe0769a03

Comment by Joseph Gmitter (Inactive) [ 06/Jan/16 ]

Landed for 2.8.0
