[LU-14522] missing ldlm lock processing causes timeouts in racer Created: 13/Mar/21  Updated: 11/May/23  Resolved: 06/Apr/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.15.0

Type: Bug Priority: Minor
Reporter: Alex Zhuravlev Assignee: Alex Zhuravlev
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Duplicate
duplicates LU-14551 MDT hangs on ldlm_expired_completion_... Closed
Related
Severity: 3

 Description   

A post-mortem analysis of the locks may look like the following:

R: 240000402:35e1:0 in mdt-lustre-MDT0001_UUID
  W: 480963a72f1c4f5e in #09443 --/EX IB:22 40210400000020 in mdt-lustre-MDT0001_UUID
  W: 480963a72f1c4f9d in #08398 --/PR IB:13 40210000000000 in mdt-lustre-MDT0001_UUID
  W: 480963a72f1c54a5 in #11284 --/PR IB:13 40210000000000 in mdt-lustre-MDT0001_UUID
    #11284 hold 240000402:1:0 PR/PR IB:2 1/0 in mdt-lustre-MDT0001_UUID
    #11284 hold 240000402:1:0 CR/CR IB:2 1/0 in mdt-lustre-MDT0001_UUID
  W: 480963a72f1c89f7 in #11331 --/PR IB:13 40210000000000 in mdt-lustre-MDT0001_UUID
    #11331 hold 240000402:1:0 PR/PR IB:2 1/0 in mdt-lustre-MDT0001_UUID
    #11331 hold 240000402:1:0 CR/CR IB:2 1/0 in mdt-lustre-MDT0001_UUID

That is, there are a number of locks in the waiting queue, but none is granted.
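
The same check can be automated over a dump like the one above. The following is only a minimal sketch, not part of the patch: the function name `stuck_resources` is hypothetical, and it assumes that granted locks would appear as indented `G:` lines next to the `W:` (waiting) lines, which the excerpt above does not show.

```python
from collections import defaultdict
from typing import Dict, List

# Excerpt of the dump from the description, used here as sample input.
SAMPLE = """\
R: 240000402:35e1:0 in mdt-lustre-MDT0001_UUID
  W: 480963a72f1c4f5e in #09443 --/EX IB:22 40210400000020 in mdt-lustre-MDT0001_UUID
  W: 480963a72f1c4f9d in #08398 --/PR IB:13 40210000000000 in mdt-lustre-MDT0001_UUID
  W: 480963a72f1c54a5 in #11284 --/PR IB:13 40210000000000 in mdt-lustre-MDT0001_UUID
    #11284 hold 240000402:1:0 PR/PR IB:2 1/0 in mdt-lustre-MDT0001_UUID
    #11284 hold 240000402:1:0 CR/CR IB:2 1/0 in mdt-lustre-MDT0001_UUID
  W: 480963a72f1c89f7 in #11331 --/PR IB:13 40210000000000 in mdt-lustre-MDT0001_UUID
    #11331 hold 240000402:1:0 PR/PR IB:2 1/0 in mdt-lustre-MDT0001_UUID
    #11331 hold 240000402:1:0 CR/CR IB:2 1/0 in mdt-lustre-MDT0001_UUID
"""


def stuck_resources(dump: str) -> Dict[str, List[str]]:
    """Return resources that have waiting locks but no granted lock.

    Maps each such resource name to the handles of its waiting locks.
    """
    waiting: Dict[str, List[str]] = defaultdict(list)
    granted: Dict[str, int] = defaultdict(int)
    current = None
    for line in dump.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        if fields[0] == "R:":
            # A new resource section starts; the second field is its name.
            current = fields[1]
        elif current is not None and fields[0] == "W:":
            # Waiting lock; the second field is the lock handle.
            waiting[current].append(fields[1])
        elif current is not None and fields[0] == "G:":
            # ASSUMPTION: granted locks are marked "G:" (not in the excerpt).
            granted[current] += 1
        # "#NNNNN hold ..." lines list locks held by a waiter; ignored here.
    return {res: locks for res, locks in waiting.items()
            if locks and granted[res] == 0}


if __name__ == "__main__":
    for res, locks in stuck_resources(SAMPLE).items():
        print(f"{res}: {len(locks)} waiting, none granted")
```

A resource reported by this check is a candidate for the situation described here, where the waiting queue needs to be reprocessed.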



 Comments   
Comment by Gerrit Updater [ 14/Mar/21 ]

Alex Zhuravlev (bzzz@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/42031
Subject: LU-14522 ldlm: reprocess locks if enqueue failed
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 4815632a0f5eb9c9bcaffae944f658e90f951dad

Comment by Andreas Dilger [ 25/Mar/21 ]

It seems that LU-14551 is also fixing the same problem.

Comment by Gerrit Updater [ 06/Apr/21 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/42031/
Subject: LU-14522 ldlm: reprocess locks if enqueue failed
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 9cc7128b9b2bf444657dac6765decf9fb56aee8d

Comment by Peter Jones [ 06/Apr/21 ]

Landed for 2.15

Comment by Gerrit Updater [ 08/Jul/22 ]

"Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/47914
Subject: LU-14522 ldlm: reprocess locks if enqueue failed
Project: fs/lustre-release
Branch: b2_12
Current Patch Set: 1
Commit: c1771961c623f17c8866302cb10905cbfa3e0039
