[LU-4801] spin lock contention in lock_res_and_lock

Details

    • Bug
    • Resolution: Fixed
    • Major
    • None
    • Lustre 2.4.1
    • lustre-2.4.0-26chaos
    • 3
    • 13209

    Description

      Our MDS experienced severe lock contention in lock_res_and_lock(). This had a large impact on client responsiveness because service threads were starved for CPU time. We have not yet identified the client workload that caused this problem. All active tasks had stack traces like this, but would eventually get scheduled out.

       ...
      __spin_lock
      lock_res_and_lock
      ldlm_handle_enqueue0
      mdt_handle_common
      mds_regular_handle
      ptlrpc_server_handle_request
      ...
      

      This raises the question of why the ldlm resource lock needs to be a spinlock. Couldn't we avoid this issue by converting it to a mutex? This question was raised in LU-3504.
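
As a userspace analogy only (this is not Lustre code, and the thread and iteration counts are arbitrary), the following pthreads program shows the trade-off behind that question: in spinlock mode every waiter burns CPU for as long as the lock is contended, while in mutex mode the waiters sleep and leave the CPU to the holder, at the cost of a scheduling round trip on every handoff.

/*
 * Userspace analogy, not Lustre code.  NTHREADS threads increment a
 * shared counter under either a spinlock or a mutex.  Compare the
 * "user" time reported by time(1) for the two modes: the spinlock run
 * burns CPU in every waiter, which is the same starvation effect the
 * MDS service threads see in lock_res_and_lock().
 *
 * Build: cc -O2 -pthread contention.c -o contention
 * Run:   time ./contention spin      vs.      time ./contention mutex
 */
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define NTHREADS   16
#define ITERATIONS 200000

static pthread_spinlock_t spin;
static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static int use_mutex;
static long counter;

static void *worker(void *arg)
{
	for (int i = 0; i < ITERATIONS; i++) {
		if (use_mutex)
			pthread_mutex_lock(&mtx);
		else
			pthread_spin_lock(&spin);

		/* Deliberately long critical section to force contention. */
		for (volatile int k = 0; k < 200; k++)
			;
		counter++;

		if (use_mutex)
			pthread_mutex_unlock(&mtx);
		else
			pthread_spin_unlock(&spin);
	}
	return NULL;
}

int main(int argc, char **argv)
{
	pthread_t tid[NTHREADS];

	use_mutex = (argc > 1 && strcmp(argv[1], "mutex") == 0);
	pthread_spin_init(&spin, PTHREAD_PROCESS_PRIVATE);

	for (int i = 0; i < NTHREADS; i++)
		pthread_create(&tid[i], NULL, worker, NULL);
	for (int i = 0; i < NTHREADS; i++)
		pthread_join(tid[i], NULL);

	printf("%s mode: counter=%ld\n", use_mutex ? "mutex" : "spinlock",
	       counter);
	return 0;
}

The sketch only illustrates the scheduling behaviour, not the ldlm constraints: a straight conversion to a mutex would also require checking that no lock_res_and_lock() caller is in atomic context, since a mutex cannot be taken there.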

      Attachments

        Issue Links

          Activity


gerrit Gerrit Updater added a comment -

Oleg Drokin (green@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38238
Subject: Revert "LU-4801 ldlm: discard l_lock from struct ldlm_lock."
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 32095b5717954fa7260c9d6e369a208395bc39da

gerrit Gerrit Updater added a comment -

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35483/
Subject: LU-4801 ldlm: discard l_lock from struct ldlm_lock.
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 0584eb73dbb5b4c710a8c7eb1553ed5dad0c18d8

gerrit Gerrit Updater added a comment -

James Simmons (jsimmons@infradead.org) uploaded a new patch: https://review.whamcloud.com/35484
Subject: LU-4801 ldlm: don't access l_resource when not locked.
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 5ea026bb3eb9020233569659189850519bc99a17

gerrit Gerrit Updater added a comment -

James Simmons (jsimmons@infradead.org) uploaded a new patch: https://review.whamcloud.com/35483
Subject: LU-4801 ldlm: discard l_lock from struct ldlm_lock.
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: c7e16e62306abdb62f1039957573400c9114ea3f

simmonsja James A Simmons added a comment -

Neil pushed some patches to address this. I will push them.

nedbass Ned Bass (Inactive) added a comment -

Oleg, it may not be right away, but we'll get this scheduled for testing.

green Oleg Drokin added a comment -

Ned, I see.
Well, I wonder if you could try my patch on your testbed to see whether it hits similar overload issues right away?
Cliff tried it with the patch and the MDS was basically under no load during the test run, but he was then preempted by other important testing, so there was no run without the patch to confirm that the reproducer actually reproduces anything.
So if you have time for that, it might be an interesting exercise.

nedbass Ned Bass (Inactive) added a comment -

Oleg, to clarify my comment regarding threads blocking in kmalloc(): I don't mean that they are doing so while holding a spinlock. My theory is that they spend so much time under the spin lock that they become eligible to reschedule. When they later call kmalloc() outside the spinlock, it internally calls might_sleep() and reschedules the thread. Because other threads are still actively contending in lock_res_and_lock(), those blocked threads get starved for CPU time.
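
To make that sequence concrete, here is an illustrative kernel-module fragment of the pattern; res_lock and handle_request() are made-up names, and this is not the actual mdt or ldlm code path. GFP_KERNEL allocations are allowed to sleep, and with voluntary preemption might_sleep() also acts as a rescheduling point, so a thread that picked up a pending reschedule while contending on the spinlock tends to give up the CPU at the allocation, outside the lock.

/*
 * Illustrative kernel-module fragment only; res_lock and
 * handle_request() are made-up names, not the actual mdt/ldlm code.
 */
#include <linux/module.h>
#include <linux/spinlock.h>
#include <linux/slab.h>
#include <linux/sched.h>

static DEFINE_SPINLOCK(res_lock);

static int handle_request(size_t len)
{
	void *buf;

	spin_lock(&res_lock);
	/*
	 * Long, heavily contended critical section: between spinning to
	 * get the lock and the work done under it, the thread can
	 * accumulate enough CPU time that the scheduler flags it as
	 * needing to reschedule.
	 */
	spin_unlock(&res_lock);

	/*
	 * GFP_KERNEL allocations may sleep and the slab code calls
	 * might_sleep_if() on this path; with voluntary preemption that
	 * is also a rescheduling point, so the pending reschedule takes
	 * effect here, outside the lock, and the thread then competes for
	 * CPU with everyone still spinning in lock_res_and_lock().
	 */
	buf = kmalloc(len, GFP_KERNEL);
	if (!buf)
		return -ENOMEM;
	kfree(buf);
	return 0;
}

static int __init demo_init(void)
{
	return handle_request(4096);
}

static void __exit demo_exit(void)
{
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");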
green Oleg Drokin added a comment -

Attaching parallel_flock_v2.c: this is the same as before, only this version actually works as expected.
green Oleg Drokin added a comment -

The attached parallel_flock.c is my idea for a good reproducer of the flock issue.

It takes three arguments:
-f filename: the file on the Lustre filesystem to work on
-n: the number of iterations
-s: how long the first lock is held, in seconds

To reproduce, run it on a lot of client machines at once, using all available cores on each. The default sleep time is just 6 seconds, so you possibly want more than that, and several iterations (the default is 10).

While it is running, I imagine the MDS should grind to a total halt so that even userspace barely responds, if at all.

This is untested code; I just made sure it compiles.
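
For reference, a hypothetical sketch along these lines; it is not the attached parallel_flock.c, and details such as using flock(2) rather than byte-range fcntl(2) locks and letting thread 0 play the long-lived holder are assumptions for illustration. It follows the option letters above (-f, -n, -s) and the scheme in the comment below: every thread on every client queues a blocking exclusive lock on the same file, one holder keeps it for the sleep time so blocked locks pile up, and everyone else drops the lock as soon as it is granted.

/*
 * Hypothetical sketch, NOT the attached parallel_flock.c.  One thread
 * per core takes a blocking exclusive flock(2) on the same file;
 * thread 0 keeps the lock for -s seconds so that blocked lock requests
 * pile up behind it, every other thread releases as soon as it is
 * granted.  Run concurrently from many clients against one file on a
 * Lustre mount that has the flock option enabled.
 */
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/file.h>
#include <unistd.h>

static const char *path;          /* -f */
static int iterations = 10;       /* -n */
static int sleep_secs = 6;        /* -s */

static void *worker(void *arg)
{
	long idx = (long)arg;

	for (int i = 0; i < iterations; i++) {
		/* A separate fd per thread so the flock() requests conflict. */
		int fd = open(path, O_RDWR | O_CREAT, 0644);
		if (fd < 0) {
			perror("open");
			return NULL;
		}
		if (flock(fd, LOCK_EX) == 0) {  /* blocks behind every other holder */
			if (idx == 0)
				sleep(sleep_secs);  /* let blocked locks accumulate */
			flock(fd, LOCK_UN);
		}
		close(fd);
	}
	return NULL;
}

int main(int argc, char **argv)
{
	int opt;
	long i, nthreads = sysconf(_SC_NPROCESSORS_ONLN);

	while ((opt = getopt(argc, argv, "f:n:s:")) != -1) {
		switch (opt) {
		case 'f': path = optarg; break;
		case 'n': iterations = atoi(optarg); break;
		case 's': sleep_secs = atoi(optarg); break;
		default:
			fprintf(stderr, "usage: %s -f file [-n iters] [-s secs]\n",
				argv[0]);
			return 1;
		}
	}
	if (!path) {
		fprintf(stderr, "-f <file on lustre> is required\n");
		return 1;
	}

	pthread_t *tid = calloc(nthreads, sizeof(*tid));
	for (i = 0; i < nthreads; i++)
		pthread_create(&tid[i], NULL, worker, (void *)i);
	for (i = 0; i < nthreads; i++)
		pthread_join(tid[i], NULL);
	free(tid);
	return 0;
}

As suggested in the comments above and below, a sleep of a minute or more and a run from many clients at once would likely be needed to build up a multi-thousand-entry list of blocked locks.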
green Oleg Drokin added a comment -

I guess an idea for a reproducer, if this is just a long flock-reprocessing issue, is to have a multithreaded app run from a whole bunch of nodes where every thread tries to lock the same range (in blocking mode) in the same file, to accumulate a multi-thousand-entry list of blocked locks.

Make whoever holds the lock first wait a minute, or whatever amount of time is needed to ensure that all threads on all nodes have sent their flock requests and have all blocked.
Then release the original lock, and after that every thread that receives a lock should release it immediately as well.

            People

Assignee: green Oleg Drokin
Reporter: nedbass Ned Bass (Inactive)
Votes: 0
Watchers: 10

              Dates

                Created:
                Updated:
                Resolved: