[LU-1157] improve flock deadlock detection: hash of waiting flocks instead of list Created: 01/Mar/12  Updated: 21/Nov/12  Resolved: 01/Jul/12

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.3.0

Type: Improvement Priority: Minor
Reporter: Vitaly Fertman Assignee: WC Triage
Resolution: Fixed Votes: 0
Labels: None

Rank (Obsolete): 4579

Comments
Comment by Andreas Dilger [ 01/Mar/12 ]

Vitaly, please also provide some useful information about this change.

Comment by Vitaly Fertman [ 01/Mar/12 ]

http://review.whamcloud.com/2240

Comment by Vitaly Fertman [ 01/Mar/12 ]

the test:
3000 threads each take one non-conflicting lock. After that, each of these 3000 threads requests the lock its right neighbour thread has taken, so the 3000th lock gets deadlocked on the 1st one. To detect that it is deadlocked, the last enqueue has to search the list 3000 times. This enqueue is done 2000 times. The 2000 enqueues (only the deadlocked one is counted) plus 2000*3000 deadlock detections take the following times, in seconds:
===vanilla===
exports: 1 deadlocks: 2000 locks: 3000 time: 148
exports: 2 deadlocks: 2000 locks: 3000 time: 148
exports: 4 deadlocks: 2000 locks: 3000 time: 148
exports: 8 deadlocks: 2000 locks: 3000 time: 148
===per-export locking & hash===
exports: 1 deadlocks: 2000 locks: 3000 time: 19.5
exports: 2 deadlocks: 2000 locks: 3000 time: 12
exports: 4 deadlocks: 2000 locks: 3000 time: 11.5
exports: 8 deadlocks: 2000 locks: 3000 time: 10
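
For illustration only, here is a minimal Python sketch of why a hash of waiting flocks is cheaper to walk than a list. It is not the Lustre code: the function names, the (owner, blocking owner) representation, and the ring setup are assumptions made for this example, and the per-export locking part of the change is not modelled. Each hop along the wait chain costs a full scan with the list, but a single lookup with the hash.

```python
def detect_deadlock_list(waiting, requester, blocker):
    """waiting: list of (owner, blocking_owner) pairs (hypothetical layout).

    Each hop scans the whole list to find what the current owner waits on,
    so a chain of n waiters costs on the order of n * n comparisons.
    """
    current = blocker
    # Bound the walk so a malformed wait graph cannot loop forever.
    for _ in range(len(waiting) + 1):
        if current == requester:
            return True  # the chain leads back to the requester: deadlock
        nxt = None
        for owner, blocking in waiting:
            if owner == current:
                nxt = blocking
                break
        if nxt is None:
            return False  # current owner is not waiting: no deadlock
        current = nxt
    return False


def detect_deadlock_hash(waiting, requester, blocker):
    """waiting: dict mapping owner -> blocking_owner (hypothetical layout).

    Each hop is one hash lookup, so a chain of n waiters costs about n lookups.
    """
    current = blocker
    for _ in range(len(waiting) + 1):
        if current == requester:
            return True
        if current not in waiting:
            return False
        current = waiting[current]
    return False


def build_ring(n):
    """Owners 0..n-2 already wait on their right neighbour; owner n-1 is
    about to request the lock held by owner 0, as in the test above."""
    return [(i, i + 1) for i in range(n - 1)]


if __name__ == "__main__":
    pairs = build_ring(3000)
    print(detect_deadlock_list(pairs, 2999, 0))
    print(detect_deadlock_hash(dict(pairs), 2999, 0))
```

Both functions walk the same chain and give the same answer; only the cost of finding the next waiter differs.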

Comment by Vitaly Fertman [ 01/Mar/12 ]

this one is done on top of LU-1156

Comment by Peter Jones [ 01/Jul/12 ]

Landed for 2.3

Comment by Nathan Rutman [ 21/Nov/12 ]

Xyratex-bug-id: MRP-385

Generated at Sat Feb 10 01:14:03 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.