[LU-5685] (cl_lock.c:1128:cl_use_try()) ASSERTION( result != -38 ) Created: 30/Sep/14  Updated: 08/Feb/18  Resolved: 08/Feb/18

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.3
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Bruno Travouillon (Inactive)
Assignee: Jinshan Xiong (Inactive)
Resolution: Cannot Reproduce
Votes: 0
Labels: p4b
Environment:

RHEL6 w/ patched kernel 2.6.32-431.11.2.el6
Lustre 2.4.3 + bullpatches


Issue Links:
Related
is related to LU-5062 LBUG: osc_req_attr_set Resolved
Severity: 3
Rank (Obsolete): 15924

 Description   

We hit the following LBUG twice on two Lustre clients:

LustreError: 116840:0:(lcommon_cl.c:1201:cl_file_inode_init()) Failure to initialize cl object [0x22b09fa54:0x1550:0x0]: -16
LustreError: 116850:0:(cl_lock.c:1128:cl_use_try()) ASSERTION( result != -38 ) failed:
LustreError: 116850:0:(cl_lock.c:1128:cl_use_try()) LBUG
Pid: 116850, comm: XXXXXXXXXXXXXXX

Call Trace:
 [<ffffffffa042c895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 [<ffffffffa042ce97>] lbug_with_loc+0x47/0xb0 [libcfs]
 [<ffffffffa05a7fa6>] cl_use_try+0x2a6/0x2e0 [obdclass]
 [<ffffffffa05a813d>] cl_enqueue_try+0x15d/0x300 [obdclass]
 [<ffffffffa05a8fff>] cl_enqueue_locked+0x6f/0x1f0 [obdclass]
 [<ffffffffa05a9c6e>] cl_lock_request+0x7e/0x270 [obdclass]
 [<ffffffffa0b61f00>] cl_glimpse_lock+0x180/0x490 [lustre]
 [<ffffffffa0b62775>] cl_glimpse_size0+0x1a5/0x1d0 [lustre]
 [<ffffffffa0b15528>] ll_inode_revalidate_it+0x198/0x1c0 [lustre]
 [<ffffffff81197036>] ? final_putname+0x26/0x50
 [<ffffffffa0b15599>] ll_getattr_it+0x49/0x170 [lustre]
 [<ffffffffa0b156f7>] ll_getattr+0x37/0x40 [lustre]
 [<ffffffff81227b23>] ? security_inode_getattr+0x23/0x30
 [<ffffffff8118f001>] vfs_getattr+0x51/0x80
 [<ffffffff8118f094>] vfs_fstatat+0x64/0xa0
 [<ffffffff811bd788>] ? user_statfs+0x38/0xb0
 [<ffffffff8118f13e>] vfs_lstat+0x1e/0x20
 [<ffffffff8118f164>] sys_newlstat+0x24/0x50
 [<ffffffff810686d5>] ? sys_sched_yield+0x55/0x60
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b

Both times, it was the same user running two versions of a code. We have not been able to reproduce it since.
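
For readers unfamiliar with the numeric codes in the log above: kernel code conventionally returns negated errno values, so on Linux the -16 in the first message corresponds to EBUSY and the -38 in the assertion corresponds to ENOSYS. The following is only a minimal sketch of that mapping; the function name `decode_kernel_rc` and the lookup table are illustrative and not part of Lustre.

```python
# Linux errno values for the two codes seen in the log. They are hard-coded
# (an assumption of this sketch) so the result does not depend on the
# platform running the script.
LINUX_ERRNO_NAMES = {
    16: "EBUSY",   # device or resource busy
    38: "ENOSYS",  # function not implemented
}


def decode_kernel_rc(rc: int) -> str:
    """Map a negative kernel-style return code to its Linux errno name."""
    if rc >= 0:
        return "OK"
    # Codes outside the small table are reported numerically.
    return LINUX_ERRNO_NAMES.get(-rc, f"errno {-rc}")


if __name__ == "__main__":
    for rc in (-16, -38):
        print(rc, decode_kernel_rc(rc))
```

Only the two codes that appear in this ticket are listed; any other value is returned in numeric form rather than guessed.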



 Comments   
Comment by Peter Jones [ 30/Sep/14 ]

Jinshan is looking into this

Comment by Jinshan Xiong (Inactive) [ 30/Sep/14 ]

This should be a recurrence of LU-5062. Since it occurred on b2_4, I will leave this ticket open and backport the patch.

Comment by Jinshan Xiong (Inactive) [ 30/Sep/14 ]

patch is here: http://review.whamcloud.com/12137

Comment by Bruno Travouillon (Inactive) [ 30/Sep/14 ]

Thanks Jinshan.

Are you aware of a backport to b2_5? We should upgrade soon to this maintenance release.

Comment by Peter Jones [ 02/Oct/14 ]

b2_5 port http://review.whamcloud.com/#/c/12139/

Comment by Bruno Travouillon (Inactive) [ 24/Oct/14 ]

We hit this bug in 2.5.3 as well.

Comment by Peter Jones [ 24/Oct/14 ]

Bruno

Were you carrying the LU-5062 patch against 2.5.3 when you hit this issue?

Peter

Comment by Bruno Travouillon (Inactive) [ 24/Oct/14 ]

If you mean the b2_5 port http://review.whamcloud.com/#/c/12139/, no, we don't have this patch in our build. It seems to need some more reviewers.

Comment by Bruno Travouillon (Inactive) [ 20/Nov/14 ]

Peter,

Can we safely use the b2_5 port on top of 2.5.3?

Comment by Peter Jones [ 20/Nov/14 ]

Yes that is fine

Comment by Jinshan Xiong (Inactive) [ 08/Feb/18 ]

This is about the old cl_lock implementation, which no longer exists.
