[LU-4874] (osc_lock.c:240:osc_lock_fini()) ASSERTION( ols->ols_lock == ((void *)0) ) Created: 09/Apr/14  Updated: 05/May/14  Resolved: 28/Apr/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.1
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Daire Byrne (Inactive) Assignee: Zhenyu Xu
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Duplicate
duplicates LU-4558 Crash in cl_lock_put on racer Resolved
Severity: 3
Rank (Obsolete): 13469

 Description   

One of our Lustre clients died with this LBUG. The client exports the filesystem over NFS and was also doing some large file copies and deletes at the time. I have not seen this LBUG before and couldn't find anything similar in JIRA. A kdump vmcore is available on request.

<0>LustreError: 2778:0:(osc_lock.c:240:osc_lock_fini()) ASSERTION( ols->ols_lock == ((void *)0) ) failed:
<0>LustreError: 2778:0:(osc_lock.c:240:osc_lock_fini()) LBUG
<4>Pid: 2778, comm: ptlrpcd_8
<4>
<4>Call Trace:
<4> [<ffffffffa03ab895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
<4> [<ffffffffa03abe97>] lbug_with_loc+0x47/0xb0 [libcfs]
<4> [<ffffffffa090cfce>] osc_lock_fini+0x18e/0x190 [osc]
<4> [<ffffffffa0532553>] cl_lock_put+0x223/0x430 [obdclass]
<4> [<ffffffffa090fefc>] osc_lock_upcall+0x19c/0x5e0 [osc]
<4> [<ffffffffa090fd60>] ? osc_lock_upcall+0x0/0x5e0 [osc]
<4> [<ffffffffa08f0876>] osc_enqueue_fini+0x106/0x240 [osc]
<4> [<ffffffffa08f52d2>] osc_enqueue_interpret+0xe2/0x1e0 [osc]
<4> [<ffffffffa069cedc>] ptlrpc_check_set+0x2ac/0x1b20 [ptlrpc]
<4> [<ffffffffa06ca69b>] ptlrpcd_check+0x53b/0x560 [ptlrpc]
<4> [<ffffffffa06cabc3>] ptlrpcd+0x233/0x390 [ptlrpc]
<4> [<ffffffff81063410>] ? default_wake_function+0x0/0x20
<4> [<ffffffffa06ca990>] ? ptlrpcd+0x0/0x390 [ptlrpc]
<4> [<ffffffff8100c0ca>] child_rip+0xa/0x20
<4> [<ffffffffa06ca990>] ? ptlrpcd+0x0/0x390 [ptlrpc]
<4> [<ffffffffa06ca990>] ? ptlrpcd+0x0/0x390 [ptlrpc]
<4> [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
<4>
<0>Kernel panic - not syncing: LBUG
<4>Pid: 2778, comm: ptlrpcd_8 Not tainted 2.6.32-358.18.1.el6_lustre.x86_64 #1
<4>Call Trace:
<4> [<ffffffff8150de58>] ? panic+0xa7/0x16f
<4> [<ffffffffa03abeeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
<4> [<ffffffffa090cfce>] ? osc_lock_fini+0x18e/0x190 [osc]
<4> [<ffffffffa0532553>] ? cl_lock_put+0x223/0x430 [obdclass]
<4> [<ffffffffa090fefc>] ? osc_lock_upcall+0x19c/0x5e0 [osc]
<4> [<ffffffffa090fd60>] ? osc_lock_upcall+0x0/0x5e0 [osc]
<4> [<ffffffffa08f0876>] ? osc_enqueue_fini+0x106/0x240 [osc]
<4> [<ffffffffa08f52d2>] ? osc_enqueue_interpret+0xe2/0x1e0 [osc]
<4> [<ffffffffa069cedc>] ? ptlrpc_check_set+0x2ac/0x1b20 [ptlrpc]
<4> [<ffffffffa06ca69b>] ? ptlrpcd_check+0x53b/0x560 [ptlrpc]
<4> [<ffffffffa06cabc3>] ? ptlrpcd+0x233/0x390 [ptlrpc]
<4> [<ffffffff81063410>] ? default_wake_function+0x0/0x20
<4> [<ffffffffa06ca990>] ? ptlrpcd+0x0/0x390 [ptlrpc]
<4> [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
<4> [<ffffffffa06ca990>] ? ptlrpcd+0x0/0x390 [ptlrpc]
<4> [<ffffffffa06ca990>] ? ptlrpcd+0x0/0x390 [ptlrpc]
<4> [<ffffffff8100c0c0>] ? child_rip+0x0/0x20


 Comments   
Comment by Peter Jones [ 09/Apr/14 ]

Bobijam

Could you please look into this issue?

Thanks

Peter

Comment by Zhenyu Xu [ 09/Apr/14 ]

I think http://review.whamcloud.com/#/c/9876/ can fix this issue.

Comment by Peter Jones [ 28/Apr/14 ]

Daire

Are you ok to close out this issue and reopen it if the same issue occurs with the fix for LU-4558 in place?

Peter

Comment by Daire Byrne (Inactive) [ 29/Apr/14 ]

Yes, it's okay to close. We have only ever seen this once (in many months), and I was mainly reporting it to make sure you guys knew about it. It is clear now that you did, and that you have fixed it! We will wait and pick up the fix when we next do a point release update.

Comment by Peter Jones [ 29/Apr/14 ]

Thanks Daire!
