Details
- Type: Improvement
- Resolution: Unresolved
- Priority: Minor
- Fix Version/s: None
- Affects Version/s: Lustre 2.15.5
- Environment: RHEL 8.9 client running lustre 2.15.5
- Severity: 3
Description
When running a single-shared-file IOR workload on a compute node with a large number of cores, it is possible to trigger soft lockups. Applying LU-17630 helps but does not entirely resolve the issue. The stack traces logged by the soft lockup watchdog indicate the cause is heavy contention on the page cache spin lock in delete_from_page_cache().
RIP: 0010:delete_from_page_cache+0x52/0x70
[ 9375.915829] generic_error_remove_page+0x36/0x60
[ 9375.915837] cl_page_discard+0x47/0x80 [obdclass]
[ 9375.915883] discard_pagevec+0x7d/0x150 [osc]
[ 9375.915900] osc_lru_shrink+0x87f/0x8b0 [osc]
[ 9375.915913] lru_queue_work+0xfd/0x230 [osc]
[ 9375.915925] work_interpreter+0x32/0x110 [ptlrpc]
[ 9375.915992] ptlrpc_check_set+0x5cf/0x1fc0 [ptlrpc]
[ 9375.916052] ptlrpcd+0x6df/0xa70 [ptlrpc]
[ 9375.916176] kthread+0x14c/0x170
It looks like this is possible because:
1. Multiple callers pass 'force=1' to osc_lru_shrink(), allowing multiple threads to run concurrently. lru_queue_work() does use 'force=0', which is good.
2. There is no per-filesystem or per-node limit on how many threads can run osc_lru_shrink(); it is only limited per client_obd via the 'cl_lru_shrinkers' atomic (see the sketch after this list).
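To make the second point concrete, here is a simplified sketch (not the actual Lustre code; the struct and function bodies are invented for illustration, only the 'cl_lru_shrinkers' name comes from the ticket) of how a per-client_obd gate of this kind behaves: non-forced callers back off if another thread is already shrinking, while forced callers skip the check, so several of them can end up discarding pages and taking the same page cache locks at once.

#include <linux/atomic.h>

struct client_obd_sketch {
	atomic_t cl_lru_shrinkers;	/* threads currently shrinking this OSC's LRU */
};

static long osc_lru_shrink_sketch(struct client_obd_sketch *cli,
				  long target, bool force)
{
	long count = 0;

	/* Non-forced callers yield if a shrinker is already running... */
	if (!force && atomic_read(&cli->cl_lru_shrinkers) > 0)
		return 0;

	/* ...but forced callers proceed unconditionally. */
	atomic_inc(&cli->cl_lru_shrinkers);
	/* walk the per-OSC LRU, discard pages, contend on mapping locks */
	atomic_dec(&cli->cl_lru_shrinkers);

	return count;
}

Note this counter is per client_obd, i.e. per OSC, so on a client with many OSCs it places no bound on the total number of shrinker threads hitting the page cache.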
I'll push a patch for review which adds a per-filesystem limit. Interestingly, it looks like portions of this may have been implemented long ago but never completed. The proposed patch still needs to be tested on a system with a large number of OSCs, but I wanted to post it for initial feedback. A rough sketch of the intended behaviour follows.
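The sketch below is only an illustration of the idea, not the proposed patch itself; the field name, cap value, and helper functions are hypothetical. The point is that a shared counter on the filesystem-wide client cache object caps how many threads may discard pages concurrently, regardless of which OSC they entered through, and that even forced callers respect the cap.

#include <linux/atomic.h>

#define LRU_SHRINKERS_MAX	2	/* hypothetical per-filesystem cap */

struct client_cache_sketch {
	atomic_t ccc_lru_shrinkers;	/* active shrinker threads, fs-wide */
};

static bool lru_shrink_trylock(struct client_cache_sketch *cache)
{
	/* Claim a slot; back out if the filesystem-wide cap is exceeded. */
	if (atomic_inc_return(&cache->ccc_lru_shrinkers) <= LRU_SHRINKERS_MAX)
		return true;

	atomic_dec(&cache->ccc_lru_shrinkers);
	return false;
}

static void lru_shrink_unlock(struct client_cache_sketch *cache)
{
	atomic_dec(&cache->ccc_lru_shrinkers);
}

With a gate like this, a caller would take a slot before walking the LRU (and either skip or retry the shrink when none is available), which bounds the number of threads simultaneously spinning on the page cache lock in delete_from_page_cache().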
Attachments
Issue Links
- is related to LU-17630 osc_lru_shrink() should not block scheduling for long (Resolved)