[LU-11926] Lost lease lock on migrate error Created: 05/Feb/19  Updated: 21/Mar/19  Resolved: 21/Mar/19

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.13.0

Type: Bug
Priority: Minor
Reporter: Andriy Skulysh
Assignee: Andriy Skulysh
Resolution: Fixed
Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

   All file operations take locks in the same order: parent first, then
   child. Once a child lock has been returned to the client, subsequent
   operations on that file are addressed by the child FID.
    
   Migrate is an exception: it takes the lease lock first and then takes
   the PW parent lock during MDS_REINT.
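
   The ordering difference between migrate and the other operations can be
   illustrated with a small sketch. This is a hypothetical model, not Lustre
   code: the operation names, the lock names "parent" and "child", and the
   helper functions are all assumptions made for illustration. It only checks
   whether two operations take the same pair of locks in opposite orders.

```python
# Hypothetical model: each operation lists the locks it takes, in order.
# "child" stands in for the lease lock on the file being migrated.
LOCK_ORDER = {
    "open":    ["parent", "child"],   # usual order: parent, then child
    "migrate": ["child", "parent"],   # lease first, PW parent lock in MDS_REINT
}

def order_edges(ops):
    """Map each (held, wanted) lock pair to the operations that use it."""
    edges = {}
    for name, locks in ops.items():
        for held, wanted in zip(locks, locks[1:]):
            edges.setdefault((held, wanted), []).append(name)
    return edges

def find_inversions(ops):
    """Return lock pairs that different operations take in opposite orders."""
    edges = order_edges(ops)
    return sorted(
        (a, b, edges[(a, b)], edges[(b, a)])
        for (a, b) in edges
        if a < b and (b, a) in edges
    )

inversions = find_inversions(LOCK_ORDER)
for first, second, forward, backward in inversions:
    print(f"{forward} take {first} then {second}; {backward} take the reverse")
```

   With only two locks a pairwise check is enough; a model with more locks
   would need a full cycle search over the lock-order graph.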
    
   Meanwhile, a racing parallel operation (open) that already holds a lock
   on the parent (conflicting with the upcoming MDS_REINT) and is trying to
   take a lock on the child is blocked until the lease cancel arrives.
    
   The lease cancel is piggy-backed on the MDS_REINT RPC and is handled at
   the end of the operation, which first tries to take the conflicting
   parent lock. The result is a deadlock.
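
   The resulting wait cycle can be sketched as a tiny wait-for graph. This is
   a simplified, hypothetical model and not the Lustre LDLM implementation:
   the dictionaries, the lock labels, and find_deadlock() are assumptions for
   illustration. The lease is modelled as held by migrate until its cancel is
   processed, which in the scenario above happens only after the parent lock
   is granted.

```python
# Hypothetical snapshot of the race: who holds what, and who waits for what.
holds = {
    "migrate": {"lease(child)"},   # lease taken before MDS_REINT
    "open": {"parent"},            # parent lock conflicting with the PW request
}
waits_for = {
    "migrate": "parent",           # MDS_REINT wants the PW parent lock
    "open": "lease(child)",        # child lock blocked until the lease cancel
}

def find_deadlock(holds, waits_for):
    """Follow 'waits for a lock held by' links and return a cycle, if any."""
    owner = {lock: op for op, locks in holds.items() for lock in locks}
    for start in waits_for:
        path = [start]
        current = start
        while True:
            blocker = owner.get(waits_for.get(current))
            if blocker is None:
                break              # current is not blocked by anyone we know
            if blocker in path:
                return path[path.index(blocker):]
            path.append(blocker)
            current = blocker
    return []

cycle = find_deadlock(holds, waits_for)
print(" -> ".join(cycle + cycle[:1]) if cycle else "no deadlock")
```

   In this model the cycle disappears if either side is not holding its lock
   while it waits, for example if open does not yet hold the parent lock.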



 Comments   
Comment by Gerrit Updater [ 05/Feb/19 ]

Andriy Skulysh (c17819@cray.com) uploaded a new patch: https://review.whamcloud.com/34182
Subject: LU-11926 ldlm: Lost lease lock on migrate error
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 1eb1a8f060b54d1a1542f8a11c95d916975fa34a

Comment by Gerrit Updater [ 21/Mar/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/34182/
Subject: LU-11926 ldlm: Lost lease lock on migrate error
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: ae7ca90713b444647e682599398b28c8c16b68f7

Comment by Peter Jones [ 21/Mar/19 ]

Landed for 2.13
