Details

Type: Bug
Resolution: Fixed
Priority: Major
Fix Version/s: None
Affects Version/s: Lustre 2.4.1
Labels:
- llnl
Environment:
lustre-2.4.0-26chaos

Severity:
3
Rank (Obsolete):
13209

Description

Our MDS experienced severe lock contention in lock_res_and_lock(). This badly degraded client responsiveness because service threads were starved for CPU time. We have not yet identified the client workload that caused the problem. All active tasks had stack traces like the one below, although each would eventually get scheduled out.

 ...
__spin_lock
lock_res_and_lock
ldlm_handle_enqueue0
mdt_handle_common
mds_regular_handle
ptlrpc_server_handle_request
...

This raises the question of why the ldlm resource lock needs to be a spinlock. Couldn't we avoid this issue by converting it to a mutex? This question was raised in LU-3504.
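The trade-off behind that question can be illustrated with a small userspace sketch. It is only an analogy using POSIX threads on Linux, not Lustre or kernel code: `demo_resource`, `demo_lock`, `demo_unlock`, and `demo_run` are hypothetical names invented for this example. Several threads increment one shared counter under either a spinlock (waiters burn CPU while contended) or a mutex (waiters may sleep and free the CPU for other threads). Note that in the kernel a mutex may sleep, so such a conversion is only possible if the lock is never taken in a context that must not sleep.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stddef.h>

#define DEMO_MAX_THREADS 64

/* Hypothetical stand-in for a lock-protected resource;
 * this is NOT Lustre's struct ldlm_resource. */
struct demo_resource {
    pthread_spinlock_t spin;      /* waiters busy-wait while contended */
    pthread_mutex_t    mutex;     /* waiters may sleep while contended */
    int                use_mutex; /* selects which lock guards counter */
    long               counter;   /* state protected by the chosen lock */
};

struct demo_worker_arg {
    struct demo_resource *res;
    long iterations;
};

static void demo_lock(struct demo_resource *res)
{
    if (res->use_mutex)
        pthread_mutex_lock(&res->mutex);
    else
        pthread_spin_lock(&res->spin);
}

static void demo_unlock(struct demo_resource *res)
{
    if (res->use_mutex)
        pthread_mutex_unlock(&res->mutex);
    else
        pthread_spin_unlock(&res->spin);
}

/* Each worker takes the lock once per increment, so all threads
 * contend on the same lock, loosely like the trace above. */
static void *demo_worker(void *p)
{
    struct demo_worker_arg *arg = p;

    for (long i = 0; i < arg->iterations; i++) {
        demo_lock(arg->res);
        arg->res->counter++;
        demo_unlock(arg->res);
    }
    return NULL;
}

/* Runs nthreads workers and returns the final counter, or -1 on error. */
long demo_run(int use_mutex, int nthreads, long iterations)
{
    pthread_t threads[DEMO_MAX_THREADS];
    struct demo_resource res = { .use_mutex = use_mutex, .counter = 0 };
    struct demo_worker_arg arg = { .res = &res, .iterations = iterations };
    int started = 0;

    if (nthreads < 1 || nthreads > DEMO_MAX_THREADS || iterations < 0)
        return -1;
    if (pthread_spin_init(&res.spin, PTHREAD_PROCESS_PRIVATE) != 0)
        return -1;
    if (pthread_mutex_init(&res.mutex, NULL) != 0) {
        pthread_spin_destroy(&res.spin);
        return -1;
    }

    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&threads[i], NULL, demo_worker, &arg) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    pthread_mutex_destroy(&res.mutex);
    pthread_spin_destroy(&res.spin);
    return started == nthreads ? res.counter : -1;
}
```

Built with `-pthread`, calling `demo_run(0, nthreads, iterations)` exercises the spinlock path and `demo_run(1, nthreads, iterations)` the mutex path. Both produce the same final count; what differs is how waiting threads spend their time, which can be observed by comparing CPU usage of the two runs when there are more threads than cores.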

Attachments

- parallel_flock.c (3 kB), 22/Jul/14 3:48 AM, Oleg Drokin
- parallel_flock_v2.c (3 kB), 23/Jul/14 4:09 AM, Oleg Drokin

Issue Links

is related to

LU-12542 LDLM improvements from linux lustre client work

Resolved

is related to

LU-3504 MDS: All cores spinning on ldlm lock in lock_res_and_lock

Resolved

People

Assignee: Oleg Drokin

Reporter: Ned Bass (Inactive)

Votes: 0

Watchers: 10

Dates

Created: 21/Mar/14 7:51 PM

Updated: 22/Apr/21 4:17 PM

Resolved: 22/Apr/21 4:17 PM