[LU-5332] hsm: mdt_lock_handle_fini()) ASSERTION( !lustre_handle_is_used(&lh->mlh_reg_lh) ) failed
Created: 11/Jul/14  Updated: 18/Feb/15  Resolved: 14/Jul/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.6.0, Lustre 2.5.1
Fix Version/s: Lustre 2.6.0, Lustre 2.7.0, Lustre 2.5.3

Type: Bug
Priority: Blocker
Reporter: Frank Zago (Inactive)
Assignee: Jinshan Xiong (Inactive)
Resolution: Fixed
Votes: 0
Labels: HSM
Environment: CentOS 6.5, Lustre head of tree and 2.5.1


Severity: 3
Rank (Obsolete): 14878

 Description   

Running the following commands will crash an MDS:

echo hello > test
lfs hsm_set --dirty test1

Reproduced on several systems running Lustre 2.5.1 and head of tree.
It also crashes a fresh filesystem created with llmount.sh.

<4>Lustre: Mounted lustre-client
<4>Lustre: DEBUG MARKER: Using TIMEOUT=20
<6>Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000200000400-0x0000000240000400):0:mdt
<0>LustreError: 20508:0:(mdt_handler.c:2781:mdt_lock_handle_fini()) ASSERTION( !lustre_handle_is_used(&lh->mlh_reg_lh) ) failed: 
<0>LustreError: 20508:0:(mdt_handler.c:2781:mdt_lock_handle_fini()) LBUG
<4>Pid: 20508, comm: mdt00_001
<4>
<4>Call Trace:
<4> [<ffffffffa039d895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
<4> [<ffffffffa039de97>] lbug_with_loc+0x47/0xb0 [libcfs]
<4> [<ffffffffa0ca255b>] mdt_lock_handle_fini+0x4b/0x80 [mdt]
<4> [<ffffffffa0ca8180>] mdt_thread_info_fini+0xe0/0x190 [mdt]
<4> [<ffffffffa0ce893e>] mdt_hsm_state_set+0x18e/0x6b0 [mdt]
<4> [<ffffffffa0785b6c>] tgt_request_handle+0x23c/0xac0 [ptlrpc]
<4> [<ffffffffa073526a>] ptlrpc_main+0xd1a/0x1980 [ptlrpc]
<4> [<ffffffffa0734550>] ? ptlrpc_main+0x0/0x1980 [ptlrpc]
<4> [<ffffffff8109aee6>] kthread+0x96/0xa0
<4> [<ffffffff8100c20a>] child_rip+0xa/0x20
<4> [<ffffffff8109ae50>] ? kthread+0x0/0xa0
<4> [<ffffffff8100c200>] ? child_rip+0x0/0x20
<4>
<0>Kernel panic - not syncing: LBUG
<4>Pid: 20508, comm: mdt00_001 Not tainted 2.6.32.431.5.1.el6_lustre #3
<4>Call Trace:
<4> [<ffffffff81527983>] ? panic+0xa7/0x16f
<4> [<ffffffffa039deeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
<4> [<ffffffffa0ca255b>] ? mdt_lock_handle_fini+0x4b/0x80 [mdt]
<4> [<ffffffffa0ca8180>] ? mdt_thread_info_fini+0xe0/0x190 [mdt]
<4> [<ffffffffa0ce893e>] ? mdt_hsm_state_set+0x18e/0x6b0 [mdt]
<4> [<ffffffffa0785b6c>] ? tgt_request_handle+0x23c/0xac0 [ptlrpc]
<4> [<ffffffffa073526a>] ? ptlrpc_main+0xd1a/0x1980 [ptlrpc]
<4> [<ffffffffa0734550>] ? ptlrpc_main+0x0/0x1980 [ptlrpc]
<4> [<ffffffff8109aee6>] ? kthread+0x96/0xa0
<4> [<ffffffff8100c20a>] ? child_rip+0xa/0x20
<4> [<ffffffff8109ae50>] ? kthread+0x0/0xa0
<4> [<ffffffff8100c200>] ? child_rip+0x0/0x20
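
For readers unfamiliar with the assertion, here is a minimal user-space sketch of the invariant it enforces: a lock handle must no longer be in use when the per-request state is torn down. This is not the MDT code and not the fix. The struct layout, the helper names (lock_take, lock_release, lock_handle_fini_check, handle_request) and the "nonzero cookie means in use" convention are simplifying assumptions for illustration, and the ticket does not say which path in mdt_hsm_state_set leaves the handle set.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for a lock handle: a nonzero cookie means "in use"
 * (an assumption made for this sketch). */
struct handle {
	uint64_t cookie;
};

/* Models only the mlh_reg_lh member named in the assertion. */
struct lock_handle {
	struct handle reg_lh;
};

static int handle_is_used(const struct handle *h)
{
	return h->cookie != 0;
}

/* Hypothetical helpers: take and drop the regular lock. */
static void lock_take(struct lock_handle *lh, uint64_t cookie)
{
	lh->reg_lh.cookie = cookie;
}

static void lock_release(struct lock_handle *lh)
{
	lh->reg_lh.cookie = 0;
}

/* Returns 0 when teardown is safe, -1 where the real code would LBUG. */
static int lock_handle_fini_check(const struct lock_handle *lh)
{
	return handle_is_used(&lh->reg_lh) ? -1 : 0;
}

/*
 * Hypothetical request handler. release_on_error = 0 imitates an error
 * path that returns without dropping the lock it took.
 */
static int handle_request(int op_fails, int release_on_error)
{
	struct lock_handle lh = { { 0 } };

	lock_take(&lh, 0x1234);
	if (op_fails) {
		if (release_on_error)
			lock_release(&lh);
		return lock_handle_fini_check(&lh);
	}
	lock_release(&lh);
	return lock_handle_fini_check(&lh);
}
```

In this model, handle_request(1, 0) is the only call that trips the check. That is the general shape of bug this kind of assertion exists to catch, whatever the actual cause was here.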



 Comments   
Comment by Jodi Levi (Inactive) [ 11/Jul/14 ]

Jinshan
Can you please take a look at this as it is a blocker for 2.6?
Thank you!

Comment by Jinshan Xiong (Inactive) [ 12/Jul/14 ]

The patch is at http://review.whamcloud.com/11083

Comment by Jodi Levi (Inactive) [ 14/Jul/14 ]

Patch landed to Master and backported to b2_6.

Comment by James Nunez (Inactive) [ 24/Jul/14 ]

Patch for b2_5 at http://review.whamcloud.com/#/c/11215/
