[LU-9660] reduce ptlrpcd wakeups on idle system - Whamcloud Community JIRA

Details

Type: Bug
Resolution: Fixed
Priority: Minor
Fix Version/s: Lustre 2.11.0, Lustre 2.10.3
Affects Version/s: None
Labels:
- medium

Severity:
3

Description

When a client is not actively using Lustre, there is still background activity (dirty page flushing, lock cancellation, pings, etc.) that wakes a number of different threads. This background activity can cause small delays in user-space threads, and in tightly-coupled HPC applications running across a large number of clients (typically MPI with barriers) this jitter causes all of the clients to be delayed all of the time.
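The amplification effect described above can be illustrated with a small calculation. This is only a sketch under a simplifying assumption that is not stated in the ticket: each client is delayed independently with the same probability during a timestep, and a barrier is delayed whenever any one client is. The function name and the example numbers are hypothetical.

```python
def barrier_delay_probability(p_client: float, num_clients: int) -> float:
    """Probability that at least one of `num_clients` is delayed in a timestep.

    Assumes (for illustration only) that each client is delayed independently
    with probability `p_client`; a barrier waits for the slowest client.
    """
    if not 0.0 <= p_client <= 1.0:
        raise ValueError("p_client must be between 0 and 1")
    if num_clients < 0:
        raise ValueError("num_clients must be non-negative")
    return 1.0 - (1.0 - p_client) ** num_clients


if __name__ == "__main__":
    # A 0.1% chance of a wakeup delay per client per timestep, 1000 clients.
    print(barrier_delay_probability(0.001, 1000))
```

Under these assumed numbers, roughly 63% of timesteps would hit a delayed barrier even though each individual client is almost never delayed, which is why uncoordinated background wakeups matter at scale.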

It would be beneficial to determine which threads are commonly being woken when there is no work to be done (in particular ptlrpcd) and:

- avoid waking them on a periodic basis (e.g. every 1s) and instead wake them only when there is work to be done (e.g. send a ping, cancel a lock, etc.)
- coordinate wakeups across clients (e.g. on even multiples of 1s, 5s, etc.) so that the delays affect all clients at the same time rather than being spread continuously across the application timesteps. This could be problematic if it causes large load spikes on the servers (essentially a DDoS).
- pre-emptively perform work that would otherwise happen in the near future (e.g. cancel locks that will expire in the next few seconds, send a ping earlier, etc.)
- completely disconnect from the server(s) if there is no activity (LU-7236) to avoid pings completely and reduce the number of clients the servers need to recover when the system is idle.
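The first three ideas in the list can be sketched together as plain scheduling arithmetic. This is a minimal illustration in Python, not the ptlrpcd implementation and not necessarily how the fix was done: the function names, the representation of pending work as a list of deadline timestamps, and the choice of wall-clock seconds are all assumptions made for the example.

```python
import math


def plan_wakeup(deadlines, interval):
    """Pick the next wakeup time for a background thread.

    Returns None when nothing is pending, meaning the thread should block
    until new work is queued instead of polling. Otherwise returns the
    multiple of `interval` at or before the earliest deadline, so that all
    clients sharing a clock wake on the same boundaries and work is done
    slightly early rather than late. The result may already be in the past,
    in which case the caller would run immediately.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    if not deadlines:
        return None
    return math.floor(min(deadlines) / interval) * interval


def split_due_work(deadlines, wakeup, interval):
    """Split pending deadlines into work to do now and work to leave queued.

    Everything due before the next aligned boundary (`wakeup + interval`) is
    handled during this wakeup, so no extra wakeup is needed in between.
    """
    cutoff = wakeup + interval
    due = sorted(d for d in deadlines if d < cutoff)
    later = sorted(d for d in deadlines if d >= cutoff)
    return due, later


if __name__ == "__main__":
    pending = [16, 18.5, 30]          # hypothetical ping/lock-cancel deadlines
    wakeup = plan_wakeup(pending, 5)  # aligned to 5s boundaries
    print(wakeup, split_due_work(pending, wakeup, 5))
```

With aligned boundaries every client contacts the servers at nearly the same instant, which is the load-spike risk noted in the list; a real design would have to weigh that against the reduced jitter.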

Issue Links

is related to
- LU-9441 Use kernel threads in predictable fashion to confine OS noise (Resolved)
- LU-7236 OST connect and disconnect on demand (Resolved)

People

Assignee: Alex Zhuravlev
Reporter: Andreas Dilger
Votes: 0
Watchers: 10

Dates

Created: 13/Jun/17 10:44 PM
Updated: 19/Dec/17 9:28 PM
Resolved: 16/Oct/17 1:42 PM