Details
-
Bug
-
Resolution: Fixed
-
Minor
-
Lustre 2.15.2
-
None
-
Lustre-2.15.2, Rocky Linux 8.6 (4.18.0-372.32.1.el8_6.x86_64), OFED-5.4-3.6.8.1
-
3
Description
A client performance regression was found in 2.15.2-RC1 (commit: e21498bcaa).
The tested workload is a single client writing and reading a single shared file (SSF) from 16 processes.
# mpirun -np 16 ior -a POSIX -i 1 -d 10 -w -r -b 16g -t 1m -C -Q 17 -e -vv -o //exafs/d0/d1/d2/ost_stripe/file
lustre-2.15.1
access  bw(MiB/s)  IOPS     Latency(s)  block(KiB)  xfer(KiB)  open(s)   wr/rd(s)  close(s)  total(s)  iter
------  ---------  ----     ----------  ----------  ---------  --------  --------  --------  --------  ----
write   2489.25    2489.28  0.006428    16777216    1024.00    0.000936  105.31    0.000238  105.31    0
read    4176       4176     0.003803    16777216    1024.00    0.001695  62.77     3.92      62.77     0
write   2423.58    2423.60  0.006452    16777216    1024.00    0.000586  108.16    2.45      108.16    1
read    4197       4197     0.003652    16777216    1024.00    0.001982  62.46     3.98      62.46     1
write   2502.32    2502.34  0.006375    16777216    1024.00    0.000404  104.76    0.305282  104.76    2
read    4211       4211     0.003683    16777216    1024.00    0.001679  62.25     3.99      62.25     2
Max Write: 2502.32 MiB/sec (2623.88 MB/sec)
Max Read:  4211.19 MiB/sec (4415.75 MB/sec)
lustre-2.15.2-RC1
access  bw(MiB/s)  IOPS     Latency(s)  block(KiB)  xfer(KiB)  open(s)   wr/rd(s)  close(s)  total(s)  iter
------  ---------  ----     ----------  ----------  ---------  --------  --------  --------  --------  ----
write   2103.65    2103.68  0.007142    16777216    1024.00    0.001769  124.61    7.60      124.61    0
read    4204       4204     0.003159    16777216    1024.00    0.001461  62.35     10.59     62.35     0
write   2169.58    2169.69  0.006903    16777216    1024.00    0.000912  120.82    7.72      120.83    1
read    4282       4282     0.003722    16777216    1024.00    0.137671  61.22     2.78      61.22     1
write   2133.24    2133.25  0.007500    16777216    1024.00    0.000380  122.88    3.60      122.89    2
read    4088       4088     0.003689    16777216    1024.00    0.001053  64.13     3.68      64.13     2
Max Write: 2169.58 MiB/sec (2274.97 MB/sec)
Max Read:  4282.19 MiB/sec (4490.20 MB/sec)
This is a ~14% write performance regression in 2.15.2-RC1 compared to lustre-2.15.1; read bandwidth is essentially unchanged.
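As a quick check of that figure, the arithmetic can be sketched as follows. This is only an illustration: the write bandwidths are copied from the two ior runs above, and the helper names (`mean`, `regression_pct`) are made up for this example rather than taken from any tool used in the report.

```python
# Write bandwidth (MiB/s) per iteration, copied from the ior results above.
write_2_15_1 = [2489.25, 2423.58, 2502.32]
write_2_15_2_rc1 = [2103.65, 2169.58, 2133.24]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def regression_pct(baseline, candidate):
    """Percentage drop of `candidate` relative to `baseline`."""
    return (baseline - candidate) / baseline * 100.0


# Compare both the per-iteration average and the best iteration.
mean_drop = regression_pct(mean(write_2_15_1), mean(write_2_15_2_rc1))
max_drop = regression_pct(max(write_2_15_1), max(write_2_15_2_rc1))

print(f"mean write drop: {mean_drop:.1f}%")
print(f"max write drop:  {max_drop:.1f}%")
```

Both ways of comparing the runs (average of the three iterations, or best iteration) land between 13% and 14%, which is consistent with the rounded ~14% quoted above.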
After investigation, 'git bisect' identified "commit: [6d4559f6b948a93aaf5e94c4eb47cd9ebcf7ba95] LU-15959 kernel: new kernel [SLES15 SP4 5.14.21-150400.24.18.1]" as the cause of this performance regression.
Here is another test result after reverting the patch "LU-15959 kernel: new kernel [SLES15 SP4 5.14.21-150400.24.18.1]" from lustre-2.15.2-RC1; it confirms that performance is back to the same level as 2.15.1.
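The report does not say how each bisect step was judged good or bad. One way such a step could be automated is sketched below, assuming the ior output of a test run is available as text. The 2300 MiB/s threshold is a hypothetical cut-off chosen between the good (~2500) and bad (~2170) Max Write figures, and the function names are illustrative only. The exit codes follow the `git bisect run` convention: 0 marks a commit good, 1 marks it bad, 125 skips it.

```python
import re

# Hypothetical cut-off between the "good" (~2500 MiB/s) and "bad" (~2170 MiB/s)
# Max Write results; this value is an assumption, not from the original report.
THRESHOLD_MIB_S = 2300.0

# Matches the ior summary line, e.g. "Max Write: 2502.32 MiB/sec (2623.88 MB/sec)".
MAX_WRITE_RE = re.compile(r"Max Write:\s+([0-9.]+)\s+MiB/sec")


def max_write_mib_s(ior_output):
    """Return the 'Max Write' bandwidth from ior output, or None if absent."""
    match = MAX_WRITE_RE.search(ior_output)
    return float(match.group(1)) if match else None


def bisect_exit_code(ior_output, threshold=THRESHOLD_MIB_S):
    """Map ior output to a `git bisect run` exit code: 0 good, 1 bad, 125 skip."""
    bandwidth = max_write_mib_s(ior_output)
    if bandwidth is None:
        return 125  # no summary line found, so this commit cannot be judged
    return 0 if bandwidth >= threshold else 1


# Summary lines taken from the 2.15.1 and 2.15.2-RC1 runs above.
print(bisect_exit_code("Max Write: 2502.32 MiB/sec (2623.88 MB/sec)"))
print(bisect_exit_code("Max Write: 2169.58 MiB/sec (2274.97 MB/sec)"))
```

In a real wrapper script, the value returned by `bisect_exit_code` would be passed to `sys.exit()` after building and mounting the client at the commit under test and running the ior command; those steps are site-specific and are left out here.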
lustre-2.15.2-RC1 + reverted commit:6d4559f6b9 (LU-15959 kernel: new kernel [SLES15 SP4 5.14.21-150400.24.18.1])
access  bw(MiB/s)  IOPS     Latency(s)  block(KiB)  xfer(KiB)  open(s)   wr/rd(s)  close(s)  total(s)  iter
------  ---------  ----     ----------  ----------  ---------  --------  --------  --------  --------  ----
write   2497.41    2497.44  0.006407    16777216    1024.00    0.001115  104.97    0.000791  104.97    0
read    4217       4217     0.003773    16777216    1024.00    0.001680  62.16     3.37      62.16     0
write   2471.13    2471.14  0.006475    16777216    1024.00    0.000375  106.08    0.000292  106.08    1
read    4083       4083     0.003765    16777216    1024.00    0.001659  64.20     3.23      64.20     1
write   2457.91    2457.92  0.006509    16777216    1024.00    0.000412  106.65    0.010367  106.65    2
read    4163       4163     0.003771    16777216    1024.00    0.001909  62.97     6.35      62.97     2
Max Write: 2497.41 MiB/sec (2618.72 MB/sec)
Max Read:  4217.39 MiB/sec (4422.25 MB/sec)
Issue Links
- is related to LU-15959 support for SLES 15 SP4 (Resolved)
Should this issue be re-opened to investigate/address the performance loss for newer kernels?
I don't think only SLES15 SP4 is affected; any kernel since Linux 5.2, where account_page_dirtied() is not exported, should be affected as well, for example Ubuntu 22.04 and RHEL 9.x. The patch landed here only defers the problem for as long as kallsyms_lookup_name() can work around the missing export, but that workaround is also gone in newer kernels.
There should be some way that we can work with the new page cache more efficiently for large page ranges, since that is what xarray and folios are supposed to be for...