Details
- Type: Bug
- Resolution: Fixed
- Priority: Blocker
- Affects Version/s: Lustre 2.15.0
- Labels: None
- Severity: 3
Description
Performance regressions in striped directories were found on 2.15.0 (commit: 4d93fd7) compared against b2_14 (commit: d4b9557).
Here is the configuration:

4 x MDS (1 x MDT per MDS)
4 x OSS (2 x OST per OSS)
40 x client

[root@ec01 ~]# mkdir -p /exafs/d0/d1/d2/mdt_stripe/
[root@ec01 ~]# lfs setdirstripe -c 4 -D /exafs/d0/d1/d2/mdt_stripe/
[root@ec01 ~]# salloc -p 40n -N 40 --ntasks-per-node=16 mpirun --allow-run-as-root -oversubscribe -mca btl_openib_if_include mlx5_1:1 -x UCX_NET_DEVICES=mlx5_1:1 /work/tools/bin/mdtest -n 2000 -F -i 3 -p 10 -v -d /exafs/d0/d1/d2/mdt_stripe/
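As a sanity check, the default layout that new subdirectories will inherit can be verified with lfs getdirstripe before running the benchmark (same path as above; -D prints the default layout, and a plain invocation prints the directory's own stripe info, including the hash type):

[root@ec01 ~]# lfs getdirstripe -D /exafs/d0/d1/d2/mdt_stripe/
[root@ec01 ~]# lfs getdirstripe /exafs/d0/d1/d2/mdt_stripe/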
Here are the test results:
server: version=2.15.0_RC2_22_g4d93fd7
client: version=2.15.0_RC2_22_g4d93fd7

SUMMARY rate: (of 3 iterations)
   Operation          Max            Min            Mean           Std Dev
   ---------          ---            ---            ----           -------
   File creation       103733.203     76276.410      93728.713     15168.101
   File stat           693152.731    656461.448     671671.960     19132.425
   File read           259081.462    247951.008     253393.168      5569.308
   File removal        145137.390    142142.699     143590.068      1499.846
   Tree creation           48.035         1.922         17.475        26.467
   Tree removal            35.643        15.861         24.045        10.323
server: version=2.14.0_21_gd4b9557
client: version=2.14.0_21_gd4b9557

SUMMARY rate: (of 3 iterations)
   Operation          Max            Min            Mean           Std Dev
   ---------          ---            ---            ----           -------
   File creation       138939.425     81336.388     117014.695     31167.261
   File stat          1678888.952   1580356.340    1645190.276     56162.463
   File read           569731.788    528830.155     546121.363     21170.387
   File removal        191837.291    186597.900     188595.661      2832.527
   Tree creation          120.108         0.986         51.078        61.778
   Tree removal            40.863        33.203         37.987         4.171
From what I observed, this appears to be a server-side regression, because performance with lustre-2.15 clients against lustre-2.14 servers was fine, as shown below.
server: version=2.14.0_21_gd4b9557
client: version=2.15.0_RC2_22_g4d93fd7

SUMMARY rate: (of 3 iterations)
   Operation          Max            Min            Mean           Std Dev
   ---------          ---            ---            ----           -------
   File creation       132009.360     74074.615     106514.108     29585.056
   File stat          1570754.679   1457120.401    1532703.082     65457.038
   File read           563710.286    540228.432     553871.772     12194.544
   File removal        189557.092    186065.253     187536.946      1809.374
   Tree creation           54.678         1.883         19.576        30.399
   Tree removal            42.065        41.677         41.875         0.194
It seems that the following patch is where the regressions started:
LU-14459 lmv: change default hash type to crush

    Change the default hash type to CRUSH to minimize the number of
    directory entries that need to be migrated.
server: version=2.14.51_197_gf269497
client: version=2.15.0_RC2_22_g4d93fd7

SUMMARY rate: (of 3 iterations)
   Operation          Max            Min            Mean           Std Dev
   ---------          ---            ---            ----           -------
   File creation       148072.690     87600.145     127000.919     34149.618
   File stat          1523849.471   1388808.972    1441253.182     72393.681
   File read           562840.721    505515.837     538333.864     29552.364
   File removal        197259.873    191117.823     194934.244      3331.372
   Tree creation          111.869         1.707         39.426        62.755
   Tree removal            44.113        30.518         36.562         6.922
server: version=2.14.51_198_gbb60caa
client: version=2.15.0_RC2_22_g4d93fd7

SUMMARY rate: (of 3 iterations)
   Operation          Max            Min            Mean           Std Dev
   ---------          ---            ---            ----           -------
   File creation        86531.781     63506.794      72790.003     12142.761
   File stat           808075.643    746570.771     784071.104     32898.551
   File read           260064.500    249212.881     256291.924      6135.058
   File removal        159592.539    155603.788     157752.556      2012.224
   Tree creation          120.060         1.138         41.069        68.410
   Tree removal            37.780        37.263         37.450         0.287
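If the default-hash change is really the trigger, it should be possible to confirm this on the patched servers by pinning the hash type explicitly per directory and rerunning the same mdtest command against each one (a sketch, assuming the -H/--mdt-hash option of lfs setdirstripe is available in this build; the directory names are just examples):

[root@ec01 ~]# mkdir -p /exafs/d0/d1/d2/hash_fnv /exafs/d0/d1/d2/hash_crush
[root@ec01 ~]# lfs setdirstripe -c 4 -D -H fnv_1a_64 /exafs/d0/d1/d2/hash_fnv
[root@ec01 ~]# lfs setdirstripe -c 4 -D -H crush /exafs/d0/d1/d2/hash_crush

If the fnv_1a_64 directory matches the pre-patch numbers while the crush directory reproduces the slowdown, that would isolate the hash-type change from the rest of the patch.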
I also found that MDT load balancing does not seem to work well after the patch: file distribution across MDTs is unbalanced at create time. For instance, here is a file-creation-only test in a striped directory.
Before patch (commit:f269497)
mpirun -np 640 mdtest -n 2000 -F -C -i 1 -p 10 -v -d /exafs/d0/d1/d2/mdt_stripe/

[root@ec01 ~]# lfs df -i | grep MDT
exafs-MDT0000_UUID     83050496     320298     82730198   1% /exafs[MDT:0]
exafs-MDT0001_UUID     83050496     320283     82730213   1% /exafs[MDT:1]
exafs-MDT0002_UUID     83050496     320334     82730162   1% /exafs[MDT:2]
exafs-MDT0003_UUID     83050496     320293     82730203   1% /exafs[MDT:3]
After patch (commit:bb60caa)
[root@ec01 ~]# lfs df -i | grep MDT
exafs-MDT0000_UUID     83050496     192404     82858092   1% /exafs[MDT:0]
exafs-MDT0001_UUID     83050496     190698     82859798   1% /exafs[MDT:1]
exafs-MDT0002_UUID     83050496     177266     82873230   1% /exafs[MDT:2]
exafs-MDT0003_UUID     83050496     720852     82329644   1% /exafs[MDT:3]
That is why mdtest's numbers were slower: one MDS/MDT (MDT3 in this case) receives far more creates than the others and keeps working after the rest have finished, so mdtest's overall elapsed time is longer than in the balanced case.
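The imbalance can be quantified directly from the lfs df -i output with a one-liner like the following (a sketch; it prints each MDT's share of the total used inodes):

[root@ec01 ~]# lfs df -i | awk '/MDT/ { used[$NF] = $3; total += $3 } END { for (m in used) printf "%s %.1f%%\n", m, 100 * used[m] / total }'

With the post-patch numbers above, this works out to roughly 15% / 15% / 14% / 56% for MDT0-MDT3, instead of the expected ~25% each.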
Attachments
Issue Links
- Clones:
  - LU-15692 performance regressions for files in stripe directory (Resolved)
- is related to:
  - LU-16198 sanity test_33hh: MDT index match 49/250 times (Resolved)
  - LU-15479 sanity: test_316 failed lfs mv: /mnt/lustre/d316.sanity/d/file migrate failed: No such file or directory (2) (Open)
  - LU-15546 Shared Directory File Creates regression seen in 2.15 when comparing to 2.12.6 (Resolved)
  - LU-13481 sanity test_33h: MDT index mismatch 5 times (Resolved)
  - LU-11025 DNE3: directory restripe (Resolved)
  - LU-14459 DNE3: directory auto split during create (Open)