[LU-16433] single client performance regression in SSF workload

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.16.0, Lustre 2.15.2
    • Affects Version/s: Lustre 2.15.2
    • Labels: None
    • Environment: Lustre-2.15.2, Rocky Linux 8.6 (4.18.0-372.32.1.el8_6.x86_64), OFED-5.4-3.6.8.1
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

    Description

      A client performance regression was found in 2.15.2-RC1 (commit: e21498bcaa).
      The tested workload is SSF (single shared file) I/O from 16 processes on a single client.

      # mpirun -np 16 ior -a POSIX -i 1 -d 10 -w -r -b 16g -t 1m -C -Q 17 -e -vv -o //exafs/d0/d1/d2/ost_stripe/file 
      

      lustre-2.15.1

      access    bw(MiB/s)  IOPS       Latency(s)  block(KiB) xfer(KiB)  open(s)    wr/rd(s)   close(s)   total(s)   iter
      ------    ---------  ----       ----------  ---------- ---------  --------   --------   --------   --------   ----
      write     2489.25    2489.28    0.006428    16777216   1024.00    0.000936   105.31     0.000238   105.31     0   
      read      4176       4176       0.003803    16777216   1024.00    0.001695   62.77      3.92       62.77      0   
      write     2423.58    2423.60    0.006452    16777216   1024.00    0.000586   108.16     2.45       108.16     1   
      read      4197       4197       0.003652    16777216   1024.00    0.001982   62.46      3.98       62.46      1   
      write     2502.32    2502.34    0.006375    16777216   1024.00    0.000404   104.76     0.305282   104.76     2   
      read      4211       4211       0.003683    16777216   1024.00    0.001679   62.25      3.99       62.25      2   
      
      Max Write: 2502.32 MiB/sec (2623.88 MB/sec)
      Max Read:  4211.19 MiB/sec (4415.75 MB/sec)
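
A quick consistency check on the figures above: the command runs 16 processes with 16 GiB blocks each, so the aggregate volume divided by the reported bandwidth should reproduce the wr/rd time, and the MiB/s value should convert to the MB/s value in parentheses. This is a minimal sketch of that arithmetic; the helper names are illustrative and are not part of IOR.

```python
# Cross-check IOR's reported figures for the lustre-2.15.1 run.
# Inputs are copied from the command line (-np 16, -b 16g) and the table above.

MIB_PER_GIB = 1024


def total_mib(processes: int, block_gib: int) -> int:
    """Aggregate data moved by all processes, in MiB."""
    return processes * block_gib * MIB_PER_GIB


def expected_seconds(processes: int, block_gib: int, bw_mib_s: float) -> float:
    """Time needed to move the aggregate volume at the given bandwidth."""
    return total_mib(processes, block_gib) / bw_mib_s


def mib_to_mb(mib_per_s: float) -> float:
    """Convert MiB/s (2**20 bytes) to MB/s (10**6 bytes)."""
    return mib_per_s * 2**20 / 10**6


if __name__ == "__main__":
    # Iteration 0 write row: 2489.25 MiB/s, reported wr/rd(s) = 105.31
    print(f"expected write time: {expected_seconds(16, 16, 2489.25):.2f} s")
    # Max Write line: 2502.32 MiB/sec, reported as 2623.88 MB/sec
    print(f"max write: {mib_to_mb(2502.32):.2f} MB/s")
```

Both values should land within rounding of the reported 105.31 s and 2623.88 MB/sec, which suggests the table is internally consistent.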
      

      lustre-2.15.2-RC1

      access    bw(MiB/s)  IOPS       Latency(s)  block(KiB) xfer(KiB)  open(s)    wr/rd(s)   close(s)   total(s)   iter
      ------    ---------  ----       ----------  ---------- ---------  --------   --------   --------   --------   ----
      write     2103.65    2103.68    0.007142    16777216   1024.00    0.001769   124.61     7.60       124.61     0   
      read      4204       4204       0.003159    16777216   1024.00    0.001461   62.35      10.59      62.35      0   
      write     2169.58    2169.69    0.006903    16777216   1024.00    0.000912   120.82     7.72       120.83     1   
      read      4282       4282       0.003722    16777216   1024.00    0.137671   61.22      2.78       61.22      1   
      write     2133.24    2133.25    0.007500    16777216   1024.00    0.000380   122.88     3.60       122.89     2   
      read      4088       4088       0.003689    16777216   1024.00    0.001053   64.13      3.68       64.13      2  
      
      Max Write: 2169.58 MiB/sec (2274.97 MB/sec)
      Max Read:  4282.19 MiB/sec (4490.20 MB/sec)
      

      This is a ~14% write performance regression in 2.15.2-RC1 compared to lustre-2.15.1; read performance is essentially unchanged.
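
The ~14% figure can be reproduced from the write rows of the two tables. The sketch below computes the drop both from the mean of the three iterations and from the best iteration; which of the two the reporter used is not stated, so both are shown. They come out at roughly 13.6% and 13.3% respectively.

```python
from statistics import mean

# Write bandwidths (MiB/s) for iterations 0-2, copied from the tables above.
WRITE_2_15_1 = [2489.25, 2423.58, 2502.32]
WRITE_2_15_2_RC1 = [2103.65, 2169.58, 2133.24]


def regression_pct(baseline: float, candidate: float) -> float:
    """Percentage drop of candidate relative to baseline (positive = slower)."""
    return (baseline - candidate) / baseline * 100.0


if __name__ == "__main__":
    by_mean = regression_pct(mean(WRITE_2_15_1), mean(WRITE_2_15_2_RC1))
    by_max = regression_pct(max(WRITE_2_15_1), max(WRITE_2_15_2_RC1))
    print(f"mean write regression: {by_mean:.1f}%")
    print(f"max write regression:  {by_max:.1f}%")
```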

      After investigation, 'git bisect' identified "commit: [6d4559f6b948a93aaf5e94c4eb47cd9ebcf7ba95] LU-15959 kernel: new kernel [SLES15 SP4 5.14.21-150400.24.18.1]" as the cause of this performance regression.
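
For anyone repeating the bisect, each candidate build has to be classified as good or bad. The report does not say how this was done; the sketch below is one way it could be automated, by reading IOR's "Max Write" line and mapping it to the exit codes that `git bisect run` understands (0 = good, 1 = bad, 125 = skip). The 2350 MiB/s threshold is an assumption, chosen to sit between the ~2500 and ~2150 MiB/s results reported here, and the function names are hypothetical.

```python
import re

# Assumed cut-off between the good (~2500 MiB/s) and bad (~2150 MiB/s) results.
THRESHOLD_MIB_S = 2350.0

MAX_WRITE_RE = re.compile(r"^\s*Max Write:\s+([0-9.]+)\s+MiB/sec", re.MULTILINE)


def max_write_mib_s(ior_output: str) -> float:
    """Extract the 'Max Write' bandwidth in MiB/s from IOR's summary output."""
    match = MAX_WRITE_RE.search(ior_output)
    if match is None:
        raise ValueError("no 'Max Write' line found in IOR output")
    return float(match.group(1))


def bisect_exit_code(ior_output: str, threshold: float = THRESHOLD_MIB_S) -> int:
    """Map an IOR run to a git-bisect-run exit code: 0 good, 1 bad, 125 skip."""
    try:
        bandwidth = max_write_mib_s(ior_output)
    except ValueError:
        return 125  # the run produced no usable result, so skip this revision
    return 0 if bandwidth >= threshold else 1


if __name__ == "__main__":
    good = "Max Write: 2502.32 MiB/sec (2623.88 MB/sec)"  # lustre-2.15.1
    bad = "Max Write: 2169.58 MiB/sec (2274.97 MB/sec)"   # lustre-2.15.2-RC1
    print(bisect_exit_code(good), bisect_exit_code(bad))
```

In practice a wrapper script would build and install the client at each step, run the ior command, and exit with the code returned by `bisect_exit_code()`.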

      Here is another test result after reverting the patch "LU-15959 kernel: new kernel [SLES15 SP4 5.14.21-150400.24.18.1]" from lustre-2.15.2-RC1. It confirms that performance returns to the same level as 2.15.1.

      lustre-2.15.2-RC1 + reverted commit:6d4559f6b9 (LU-15959 kernel: new kernel [SLES15 SP4 5.14.21-150400.24.18.1])

      access    bw(MiB/s)  IOPS       Latency(s)  block(KiB) xfer(KiB)  open(s)    wr/rd(s)   close(s)   total(s)   iter
      ------    ---------  ----       ----------  ---------- ---------  --------   --------   --------   --------   ----
      write     2497.41    2497.44    0.006407    16777216   1024.00    0.001115   104.97     0.000791   104.97     0   
      read      4217       4217       0.003773    16777216   1024.00    0.001680   62.16      3.37       62.16      0   
      write     2471.13    2471.14    0.006475    16777216   1024.00    0.000375   106.08     0.000292   106.08     1   
      read      4083       4083       0.003765    16777216   1024.00    0.001659   64.20      3.23       64.20      1   
      write     2457.91    2457.92    0.006509    16777216   1024.00    0.000412   106.65     0.010367   106.65     2   
      read      4163       4163       0.003771    16777216   1024.00    0.001909   62.97      6.35       62.97      2   
      
      Max Write: 2497.41 MiB/sec (2618.72 MB/sec)
      Max Read:  4217.39 MiB/sec (4422.25 MB/sec)
      

          Activity

            stancheff Shaun Tancheff added a comment - edited

            Sorry, I didn't read through the collapsed comments.

            Patrick is correct. Since the removal of kallsyms*, we have no way to acquire account_page_dirtied / folio_account_dirtied directly.

            On the plus side, it looks like we might be able to 'vectorize' folio_account_dirtied and provide a local vvp_account_dirtied_folios() for those kernels.

            There is now a vvp_set_folio_dirty_batched() under LU-16577 that may be useful.

            adilger Andreas Dilger added a comment -

            Shaun, I see patch https://review.whamcloud.com/49520 "LU-16433 llite: check vvp_account_page_dirtied" on b2_15, which is the cherry-picked version of Jian's 49512 patch that fixed the problem on master. It looks like it was included in 2.15.2, so you just need to update your tree.

            paf0186 Patrick Farrell added a comment -

            Shaun,

            I don't totally understand your question - the performance regression is about whether or not we have access to the necessary symbols to do things in batch. This patch fixes it for some 'intermediate' kernels, where we can still use kallsyms_lookup_name() to find non-exported symbols, but that's gone in newer kernels. So we know exactly why the regression is occurring and where it's occurring.

            If HPE is interested in avoiding the regression on intermediate kernels for 2.15, you could push the patch to b2_15 and I think we'd be happy to land it. But we have no solution for the latest kernels.
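
To illustrate the point about symbols and batching, here is a toy userspace model, not Lustre or kernel code. It assumes a single lock protecting a mapping, a non-exported accounting helper that is only usable if a symbol lookup succeeds, and a public per-page path that takes the lock for every page. All class and function names are hypothetical, and the lock count is only a rough proxy for the real cost. With 4 KiB pages (an assumption), one 1 MiB transfer is 256 pages.

```python
from typing import Callable, Dict, Iterable, Optional, Set


class Mapping:
    """Stand-in for a page-cache mapping protected by a single lock."""

    def __init__(self) -> None:
        self.lock_acquisitions = 0
        self.dirty: Set[int] = set()

    def lock(self) -> None:
        self.lock_acquisitions += 1

    def unlock(self) -> None:
        pass


def account_dirtied(mapping: Mapping, page: int) -> None:
    """Hypothetical non-exported helper; the caller must hold the lock."""
    mapping.dirty.add(page)


def set_page_dirty(mapping: Mapping, page: int) -> None:
    """Public per-page path: takes the lock once for every page."""
    mapping.lock()
    account_dirtied(mapping, page)
    mapping.unlock()


def lookup_symbol(symbols: Dict[str, Callable], name: str) -> Optional[Callable]:
    """Model of a symbol lookup that may fail on newer kernels."""
    return symbols.get(name)


def dirty_pages(mapping: Mapping, pages: Iterable[int],
                symbols: Dict[str, Callable]) -> None:
    """Dirty a batch of pages, batching under one lock when possible."""
    account = lookup_symbol(symbols, "account_dirtied")
    if account is None:
        # Fallback: symbol unavailable, so go through the per-page path.
        for page in pages:
            set_page_dirty(mapping, page)
        return
    # Batched path: one lock acquisition covers the whole batch.
    mapping.lock()
    for page in pages:
        account(mapping, page)
    mapping.unlock()


if __name__ == "__main__":
    pages = range(256)  # one 1 MiB transfer, assuming 4 KiB pages

    batched = Mapping()
    dirty_pages(batched, pages, {"account_dirtied": account_dirtied})
    print("lock acquisitions with symbol:   ", batched.lock_acquisitions)

    fallback = Mapping()
    dirty_pages(fallback, pages, {})
    print("lock acquisitions without symbol:", fallback.lock_acquisitions)
```

In this model the batched path takes the lock once per transfer and the fallback takes it 256 times, which is the kind of per-page overhead the comment above describes losing access to avoid.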

            stancheff Shaun Tancheff added a comment -

            I would note that 2.15.2-RC1 does not have the LU-16433 patch. Could you check whether it fixes the performance regression?

            paf0186 Patrick Farrell added a comment -

            sihara, whether we re-open this or not, be aware this problem exists in Linux 5.2 and newer (and there is no obvious way to fix it). So, as Andreas said, Ubuntu 22.04+ and RHEL9.

            paf0186 Patrick Farrell added a comment -

            We could re-open it, but as it stands, xarray is just a re-API of the radix tree, and non-single-page-folios aren't supported in the page cache yet. Setting folios aside, last I checked, the operations we'd need to do much in batch aren't exported.

            At the very least, my focus is on the DIO stuff - I'm more interested in pushing buffered I/O through the DIO path once unaligned support is fully working. That would offer much larger gains. (Not that it's not worth working on the buffered path, but ...)

            So re-opening is probably a decent idea, but I wouldn't prioritize it.

            adilger Andreas Dilger added a comment -

            Should this issue be re-opened to investigate/address the performance loss for newer kernels?

            I don't think it is only SLES15sp4 that is affected, but any kernel since Linux 5.2 where account_page_dirtied() is not exported, like Ubuntu 22.04 and RHEL9.x. The patch landed here defers the problem for as long as kallsyms_lookup_name() can work around the missing export, but that is also removed in newer kernels.

            There should be some way that we can work with the new page cache more efficiently for large page ranges, since that is what xarray and folios are supposed to be for...

            pjones Peter Jones added a comment -

            Landed for 2.16

            gerrit Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49520/
            Subject: LU-16433 llite: check vvp_account_page_dirtied
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set:
            Commit: 1c6e03a53cb374c10cf2d9e5a22fdb304f81e8bf

            gerrit Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49512/
            Subject: LU-16433 llite: check vvp_account_page_dirtied
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 61c4c2b5e5d7d919149921d913322586ba5ebcab

            People

              Assignee: yujian Jian Yu
              Reporter: sihara Shuichi Ihara