[LU-5632] ldlm_lock_addref()) ASSERTION( lock != ((void *)0) ) Created: 16/Sep/14  Updated: 19/Sep/14  Resolved: 19/Sep/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.2
Fix Version/s: Lustre 2.5.0

Type: Bug Priority: Major
Reporter: Christopher Morrone Assignee: Niu Yawei (Inactive)
Resolution: Duplicate Votes: 0
Labels: None
Environment:

Lustre 2.4.2-14chaos (see github.com/chaos/lustre)
ZFS OSD


Issue Links:
Related
is related to LU-4584 Lock revocation process fails consist... Resolved
is related to LU-5525 ASSERTION( new_lock->l_readers + new_... Resolved
Severity: 3
Rank (Obsolete): 15756

 Description   

We had an MDS crash with the following assertion:

ldlm_lock.c:770:ldlm_lock_addref()) ASSERTION( lock != ((void *)0) )

The previous few minutes contained many lock timeouts and some node reconnections.

The backtrace for the mdt_02_049 that hit the assertion was:

ldlm_lock_addref
mdt_reint_open
mdt_reconstruct_open
mdt_reconstruct
mdt_reint_internal
mdt_intent_reint
mdt_intent_policy
ldlm_lock_enqueue
ldlm_handle_enqueue0
mdt_enqueue
mdt_handle_common
mds_regular_handle
ptlrpc_server_handle_request
ptlrpc_main

We were running Lustre version 2.4.2-14chaos (see github.com/chaos/lustre).

We cannot provide logs or crash dumps from this system.
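
For readers unfamiliar with this class of failure, the following is a minimal, self-contained sketch of how an assertion of the form `lock != NULL` can fire in an addref path. It is not the Lustre code and not the fix referenced in the comments. All names (`demo_table`, `demo_handle2lock`, `demo_lock_addref`, `demo_lock_addref_checked`) are hypothetical, and the sketch assumes the lock is resolved from a handle that can go stale once the lock has been cancelled, which would be consistent with the lock timeouts and reconnections seen before the crash.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; NOT the Lustre structures. */
struct demo_lock {
    uint64_t cookie;   /* handle value presented by the caller */
    int      readers;  /* references taken through addref */
    int      in_use;   /* 0 once the lock has been cancelled */
};

#define DEMO_TABLE_SIZE 8

struct demo_table {
    struct demo_lock slots[DEMO_TABLE_SIZE];
};

/* Resolve a handle to a lock; returns NULL for a stale or unknown handle. */
struct demo_lock *demo_handle2lock(struct demo_table *t, uint64_t cookie)
{
    for (size_t i = 0; i < DEMO_TABLE_SIZE; i++) {
        if (t->slots[i].in_use && t->slots[i].cookie == cookie)
            return &t->slots[i];
    }
    return NULL;
}

/* Register a lock under a handle; returns 0, or -ENOSPC if the table is full. */
int demo_lock_create(struct demo_table *t, uint64_t cookie)
{
    for (size_t i = 0; i < DEMO_TABLE_SIZE; i++) {
        if (!t->slots[i].in_use) {
            t->slots[i].cookie = cookie;
            t->slots[i].readers = 0;
            t->slots[i].in_use = 1;
            return 0;
        }
    }
    return -ENOSPC;
}

/* Cancel a lock: any handle still held by a caller becomes stale. */
void demo_lock_cancel(struct demo_table *t, uint64_t cookie)
{
    struct demo_lock *lock = demo_handle2lock(t, cookie);

    if (lock != NULL)
        lock->in_use = 0;
}

/* Asserting variant: same shape as the failed check, aborts on a stale handle. */
void demo_lock_addref(struct demo_table *t, uint64_t cookie)
{
    struct demo_lock *lock = demo_handle2lock(t, cookie);

    assert(lock != NULL);
    lock->readers++;
}

/* Checked variant: reports a stale handle to the caller instead of aborting. */
int demo_lock_addref_checked(struct demo_table *t, uint64_t cookie)
{
    struct demo_lock *lock = demo_handle2lock(t, cookie);

    if (lock == NULL)
        return -ESTALE;
    lock->readers++;
    return 0;
}
```

In this sketch, calling `demo_lock_addref()` with a handle whose lock was already cancelled aborts the process, whereas `demo_lock_addref_checked()` returns `-ESTALE` and leaves the decision to the caller. Whether a replayed open in the real code path should tolerate a missing lock is a separate design question that this sketch does not answer.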



 Comments   
Comment by Peter Jones [ 16/Sep/14 ]

Niu

Could you please comment?

Thanks

Peter

Comment by Niu Yawei (Inactive) [ 17/Sep/14 ]

This is caused by the flawed patch from LU-4584; Oleg's correct fix from LU-5525 (http://review.whamcloud.com/#/c/9488/) should fix the problem.

Comment by Andreas Dilger [ 19/Sep/14 ]

Should be fixed for 2.4 with Oleg's patch from LU-5525.
