adilger
Was any investigation done here to see why this thread is not "idle"? Was cfs_flush_delayed_fput() just called too often, or was this not correctly detecting the idle state?
With cfs_flush_delayed_fput() inside ptlrpc, we dedicate one RPC thread to flushing. After a period of high load, that thread can be unavailable for general request processing for a long time. Take mdtest as an example: it generates heavy I/O, then syncs and waits for all threads to finish, so the MDT ptlrpc threads become idle for a moment. One thread starts flushing, and then mdtest starts the second phase of the test. Also, all 200+ ptlrpc_main() threads call flush_delayed_fput() when idle, but only one of them does all the work. It is better to keep ptlrpc_main() as small as possible.
The perf regression itself was a bit more complicated. The most important factor is context switching during request processing.
During testing we saw that a good mdt_reint_unlink takes only 115 microseconds on average, and a bad one about ten times longer:
mdt_reint_unlink[120822407.0/1048657=115.0]
mdt_reint_unlink[1170551560.0/1048657=1116.0]
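The two counters above appear to follow a name[sum/count=average] layout, with times in microseconds. That reading is inferred from the arithmetic, not from any documentation, and the printed average looks truncated to a whole number. As a rough check of the "ten times" figure, here is a minimal Python sketch (parse_stat is a hypothetical helper, not a Lustre tool):

```python
import re

# Assumed layout: name[sum/count=avg], inferred from the two lines quoted above.
STAT_RE = re.compile(
    r"^(?P<name>\w+)\[(?P<total>[\d.]+)/(?P<count>\d+)=(?P<avg>[\d.]+)\]$"
)

def parse_stat(line):
    """Return (name, total, count, recomputed average) for one stats line."""
    m = STAT_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized stats line: {line!r}")
    total = float(m.group("total"))
    count = int(m.group("count"))
    return m.group("name"), total, count, total / count

good = parse_stat("mdt_reint_unlink[120822407.0/1048657=115.0]")
bad = parse_stat("mdt_reint_unlink[1170551560.0/1048657=1116.0]")

# Ratio of the recomputed averages; both runs have the same sample count.
slowdown = bad[3] / good[3]
print(f"good avg: {good[3]:.1f} us, bad avg: {bad[3]:.1f} us, "
      f"slowdown: {slowdown:.1f}x")
# prints: good avg: 115.2 us, bad avg: 1116.2 us, slowdown: 9.7x
```

Both runs cover the same 1048657 calls, so the slowdown is also just the ratio of the two totals.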
Next is a profile of a bad mdt_reint_unlink, which took 23751 microseconds. Most of that time was spent in a context switch during lu_object_free.
In addition, to improve this we applied:
- ext4-dquot-commit-speedup.patch
- sysctl -w kernel.sched_cfs_bandwidth_slice_us=50000
It would also be good to move the last object put out of mdt_handler and call it after the reply is sent, and maybe to add a yield() after request processing.
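The ordering suggested here (process, reply, and only then drop the last reference) can be illustrated with a small user-space sketch. It is only an analogy in Python: Request, handle, and serve are hypothetical names, not Lustre functions, and the real change would have to be made in the MDT request path.

```python
# Hypothetical user-space analogy; none of these names exist in Lustre.
class Request:
    def __init__(self, name):
        self.name = name
        self.deferred = []  # cleanup work postponed until after the reply

def handle(req, log):
    """Process the request but queue the final put instead of running it."""
    log.append(f"process {req.name}")
    # The expensive teardown (the analogue of the last object put) is only
    # recorded here, so it cannot delay the reply.
    req.deferred.append(lambda: log.append(f"final put for {req.name}"))

def serve(req, log):
    handle(req, log)
    log.append(f"reply {req.name}")  # the client is unblocked at this point
    for cleanup in req.deferred:     # teardown runs after the reply
        cleanup()
    req.deferred.clear()

log = []
serve(Request("unlink-1"), log)
print(log)
# prints: ['process unlink-1', 'reply unlink-1', 'final put for unlink-1']
```

The point of this ordering is that any sleep or context switch during the teardown no longer counts toward the latency the client sees.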
"Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/57771
Subject: LU-18463 ptlrpc: removing cfs_flush_fput idle
Project: fs/lustre-release
Branch: b2_15
Current Patch Set: 1
Commit: 449c576282d6aca5dea50e97837b05210483ee68