[LU-17427] reduce hold time for BFL rename lock Created: 16/Jan/24  Updated: 19/Jan/24

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement
Priority: Minor
Reporter: Andreas Dilger
Assignee: WC Triage
Resolution: Unresolved
Votes: 0
Labels: None

Issue Links:
Related
is related to LU-17426 parallel cross-directory rename of re... Open
is related to LU-17434 DNE3: add exclude list for remote sub... Open
is related to LU-17441 move rename RPC handling to MDS_IO_PO... Open
Severity: 3

Description

During a non-parallel rename, the BFL resource is locked by the MDS thread first, then (up to) 4 child FIDs are locked. This means the BFL can be held for a long time if any client holding one of those locks is non-responsive. There may be hundreds of clients holding PR locks on the source or target directory and/or on the child being renamed.

To reduce the hold time and contention on the BFL resource lock, the MDS could get the 4 child locks first (to cancel the majority of lock holders), drop those locks, then get the BFL resource lock and re-lock the children.

In many cases, this would allow most or all of the contended DLM locks on the child FIDs to be cancelled without holding the BFL lock. That avoids holding the BFL while talking to slow clients, and also reduces the overall time that the BFL lock is held (allowing more renames to be done).
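
As a rough illustration of the reordering described above, the following toy model compares how long the BFL would be held under each ordering. It is only a sketch and not Lustre code: ChildLock, cancel_cost, RENAME_WORK and the time values are all hypothetical, and it assumes that no client re-acquires a conflicting lock between the pre-lock and the re-lock.

```python
from dataclasses import dataclass


@dataclass
class ChildLock:
    """Toy stand-in for a DLM lock on one child FID (hypothetical model)."""
    name: str
    cancel_cost: int  # time units needed to cancel conflicting client locks

    def acquire(self) -> int:
        """Return the time spent waiting for cancellations.

        After the first acquire the conflicting holders are gone, so a
        later acquire costs nothing (assumes no client re-takes the lock).
        """
        waited = self.cancel_cost
        self.cancel_cost = 0
        return waited


RENAME_WORK = 1  # arbitrary time units for the rename itself


def bfl_hold_time_bfl_first(children):
    """Current order: take the BFL, then lock the children under it."""
    held = 0
    for child in children:
        held += child.acquire()  # slow clients are waited on with the BFL held
    return held + RENAME_WORK


def bfl_hold_time_children_first(children):
    """Proposed order: pre-lock children, drop them, take BFL, re-lock."""
    for child in children:
        child.acquire()  # cancellations happen here, without the BFL
    # Child locks are dropped and the BFL is taken at this point.
    held = 0
    for child in children:
        held += child.acquire()  # normally free, holders already cancelled
    return held + RENAME_WORK


def make_children():
    """Four child FIDs with made-up cancellation costs."""
    return [
        ChildLock("src_parent", 5),
        ChildLock("tgt_parent", 30),  # e.g. one slow client
        ChildLock("src_child", 2),
        ChildLock("tgt_child", 0),
    ]


if __name__ == "__main__":
    print("BFL first:     ", bfl_hold_time_bfl_first(make_children()))
    print("children first:", bfl_hold_time_children_first(make_children()))
```

In this model the BFL-first order holds the BFL for the sum of all cancellation waits plus the rename, while the children-first order holds it only for the rename. If clients re-acquire locks in the window after the drop, part of the wait moves back under the BFL.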

A further optimization might be to acquire the child locks first, then "trylock" the BFL lock afterward. If the BFL trylock succeeds (i.e. it is uncontended), then verify that the parent and child objects have not been modified since they were locked, and possibly also check the path connectivity. That would help avoid lock ping-pong in situations where the parent/child locks continue to be contended, since the MDS would only need to get them once if the trylock works.
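
The trylock idea could be sketched along these lines. This is a minimal model, not the MDS implementation: threading.Lock stands in for the BFL resource lock, and Obj, its version counter, snapshot() and try_fast_rename() are hypothetical names introduced only for illustration. The path connectivity check is left out.

```python
import threading


class Obj:
    """Toy parent/child object with a version bumped on each modification."""

    def __init__(self, name):
        self.name = name
        self.version = 0


def snapshot(objs):
    """Record object versions at the time the child locks were taken."""
    return {o.name: o.version for o in objs}


def try_fast_rename(bfl, objs, seen, do_rename):
    """Attempt the rename with a non-blocking BFL acquire.

    Returns True if the rename was done on the fast path. Returns False
    if the BFL was contended or an object changed since `seen` was
    recorded, in which case the caller would fall back to the usual
    BFL-first ordering.
    """
    if not bfl.acquire(blocking=False):  # trylock: never wait for the BFL
        return False
    try:
        if any(o.version != seen[o.name] for o in objs):
            return False  # something changed since the child locks were taken
        do_rename()
        return True
    finally:
        bfl.release()


if __name__ == "__main__":
    bfl = threading.Lock()
    objs = [Obj("src_parent"), Obj("tgt_parent"), Obj("src_child")]
    seen = snapshot(objs)  # taken right after the child locks are granted
    ok = try_fast_rename(bfl, objs, seen, lambda: None)
    print("fast path" if ok else "fall back to BFL-first locking")
```

Because the BFL acquire never blocks, taking it after the child locks cannot make this thread wait on the BFL while it holds child locks. The cost is a fallback pass whenever the trylock or the validation fails.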

