Details
Bug
Resolution: Fixed
Major
Lustre 2.14.0
None
master (commit: 56526a90ae)
3
Description
Commit 76626d6c52 "LU-13344 all: Separate debugfs and procfs handling" caused a write performance regression. The reproducer and tested workload are below.
Single client (Ubuntu 18.04, kernel 5.4.0-47-generic), 16 MiB O_DIRECT, file-per-process (FPP), 128 processes
# mpirun --allow-run-as-root -np 128 --oversubscribe --mca btl_openib_warn_default_gid_prefix 0 --bind-to none ior -u -w -r -k -e -F -t 16384k -b 16384k -s 1000 -u -o /mnt/ai400x/ior.out/file --posix.odirect
"git bisect" identified the commit where the regression started.
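For readers unfamiliar with bisection, the search it performs can be sketched as a binary search over an ordered commit list. This is only an illustration: the function `first_bad`, the placeholder commit names, and the ordering of the history are assumptions, and the "bad" predicate stands in for building each commit and running the IOR workload.

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad(commit) is true.

    Assumes commits are ordered oldest to newest, commits[0] is good,
    commits[-1] is bad, and every commit after the first bad one is bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return commits[hi]


# Hypothetical history; only the three hashes come from this ticket,
# and their relative order here is assumed for illustration.
history = ["older-good", "5bc1fe092c", "76626d6c52", "newer-1", "56526a90ae"]
bad_from = history.index("76626d6c52")

# Stand-in for "build this commit, run IOR, compare write bandwidth".
culprit = first_bad(history, lambda c: history.index(c) >= bad_from)
print(culprit)
```

Each step halves the remaining range, so even a long history needs only a handful of build-and-measure cycles.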
Here are the test results.
76626d6c52 LU-13344 all: Separate debugfs and procfs handling
access  bw(MiB/s)  IOPS     Latency(s)  block(KiB)  xfer(KiB)  open(s)   wr/rd(s)  close(s)  total(s)  iter
------  ---------  ----     ----------  ----------  ---------  --------  --------  --------  --------  ----
write   21861      1366.33  60.78       16384       16384      0.091573  93.68     40.38     93.68     0
read    38547      2409.18  46.14       16384       16384      0.005706  53.13     8.26      53.13     0
5bc1fe092c LU-13196 llite: Remove mutex on dio read
access  bw(MiB/s)  IOPS     Latency(s)  block(KiB)  xfer(KiB)  open(s)   wr/rd(s)  close(s)  total(s)  iter
------  ---------  ----     ----------  ----------  ---------  --------  --------  --------  --------  ----
write   32678      2042.40  58.96       16384       16384      0.105843  62.67     4.98      62.67     0
read    38588      2411.78  45.89       16384       16384      0.004074  53.07     8.11      53.07     0
master (commit 56526a90ae)
access  bw(MiB/s)  IOPS     Latency(s)  block(KiB)  xfer(KiB)  open(s)   wr/rd(s)  close(s)  total(s)  iter
------  ---------  ----     ----------  ----------  ---------  --------  --------  --------  --------  ----
write   17046      1065.37  119.02      16384       16384      0.084449  120.15    67.76     120.15    0
read    38512      2407.00  45.04       16384       16384      0.006462  53.18     9.07      53.18     0
master still has this regression. When commit 76626d6c52 is reverted on master, the performance is restored.
master (commit 56526a90ae) + revert of commit 76626d6c52
access  bw(MiB/s)  IOPS     Latency(s)  block(KiB)  xfer(KiB)  open(s)   wr/rd(s)  close(s)  total(s)  iter
------  ---------  ----     ----------  ----------  ---------  --------  --------  --------  --------  ----
write   32425      2026.59  59.88       16384       16384      0.095842  63.16     4.79      63.16     0
read    39601      2475.09  47.22       16384       16384      0.003637  51.72     5.73      51.72     0
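To put the four write results side by side, here is a small sketch that computes each build's write bandwidth relative to 5bc1fe092c. The bandwidth figures are copied from the IOR runs in this ticket; the helper name `pct_change` and the choice of 5bc1fe092c as the baseline are assumptions made for illustration.

```python
# Write bandwidth in MiB/s, taken from the IOR results in this ticket.
write_bw = {
    "5bc1fe092c": 32678,
    "76626d6c52": 21861,
    "master (56526a90ae)": 17046,
    "master + revert 76626d6c52": 32425,
}

# Assumption: use the last known-good commit as the baseline.
baseline = write_bw["5bc1fe092c"]


def pct_change(value, base):
    """Relative change of value versus base, in percent."""
    return (value - base) / base * 100.0


for build, bw in write_bw.items():
    print(f"{build:28s} {bw:6d} MiB/s  {pct_change(bw, baseline):+6.1f}%")
```

By this measure the write bandwidth is about 33% lower at 76626d6c52 and about 48% lower on master, while master with the revert is within 1% of the baseline. Read bandwidth is essentially unchanged across all four runs.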
tappro made this comment over on LU-14580, and I wanted to bring it here:
"I don't see problems with patch itself. Increment in osc_consume_write_grant() was removed because it is done by atomic_long_add_return() now outside that call and it is done in both places where it is called. But maybe the patch "LU-12687 osc: consume grants for direct I/O" itself causes slowdown? Now grants are taken for Direct IO as well, so maybe that is related to not enough grants problem or similar. Are there any complains about grants on client during IOR run?"
That patch definitely has performance implications. Direct I/O will keep sending even when there are no grants, since it is already synchronous, but it significantly increases the load on the cl_loi_list_lock in some cases. The patches I noted above are aimed at that.
There's still very likely a memory layout issue here, but perhaps these will help...