
write performance regression in Lustre-2.14.0-RC1

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.14.0
    • Lustre 2.14.0
    • None
    • 2

    Description

      While running the performance regression tests, I found a write performance regression in Lustre-2.14.0-RC1 with a 4K transfer size.
      Here are a reproducer and the test results.

      # mpirun -np $NP -mca btl_openib_if_include mlx5_1:1 -x UCX_NET_DEVICES=mlx5_1:1 --allow-run-as-root /work/tools/bin/ior -w -r -t 4k -b $((256/NP)) -e -v -F -o /ai400x/ior.out/file
      
      Client Version     NP  Write(MB/s)  Read(MB/s)
      Lustre-2.13.0       1          803        3293
      Lustre-2.14.0-RC1   1          529        3092
      Lustre-2.13.0      16         6962       12021
      Lustre-2.14.0-RC1  16         5127       11951
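
To put the table in relative terms, the following is a minimal sketch that computes the percentage drop between the two client versions. The numbers are copied from the table above; the helper name `drop_percent` is hypothetical and is not part of IOR or Lustre.

```python
# Throughput figures (MB/s) copied from the table in the description.
# Layout: NP -> {"write": (2.13.0, 2.14.0-RC1), "read": (2.13.0, 2.14.0-RC1)}
RESULTS = {
    1: {"write": (803, 529), "read": (3293, 3092)},
    16: {"write": (6962, 5127), "read": (12021, 11951)},
}


def drop_percent(old, new):
    """Throughput lost going from `old` to `new`, as a percentage of `old`."""
    return (old - new) / old * 100.0


if __name__ == "__main__":
    for np, ops in RESULTS.items():
        for op, (old, new) in ops.items():
            print(f"NP={np:2d} {op:5s}: {drop_percent(old, new):.1f}% slower")
```

With these inputs the write drop works out to roughly 34% at NP=1 and 26% at NP=16, while reads change far less, which is consistent with the regression sitting in the write path.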

          Activity

            [LU-14424] write performance regression in Lustre-2.14.0-RC1
            neilb Neil Brown added a comment -

            I believe the problem was that cur->oe_state wasn't being initialised properly.

            I've uploaded a revised version of the reverted patch at https://review.whamcloud.com/41691
            pjones Peter Jones added a comment -

            Landed for 2.14 RC3


            simmonsja James A Simmons added a comment -

            While a revert works around this issue, as Andreas pointed out, this is hot code, so more improvements could be made here.

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/41498/
            Subject: LU-14424 Revert "LU-9679 osc: simplify osc_extent_find()"
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: b592f75446ccd8fea790de4156478fd057c82019

            adilger Andreas Dilger added a comment -

            Some added notes here from my investigation:

            • the test workload is IOR file-per-process with 4KB buffered writes, and it looks like a single client with 1 and 16 threads
            • at 800MiB/s that is 200k x 4KB writes/s, so the osc_extent_merge() function is called between 200k/s and 400k/s (depending on whether it can do both front and back merges), so any extra overhead in this path will affect the bottom line
            • it isn't yet totally clear whether the problem is just more CPU overhead in osc_extent_merge() or a subtle bug that is preventing extents from being merged and sending more RPCs over the wire. Plain CPU overhead alone would struggle to explain a 250MB/s slowdown, but this is a hot path, and CPU overhead is a major contributor to the IOPS number. Bad RPC formation would be an easier explanation for this performance delta.
            • going from in-line coding of the checks to calling osc_extent_merge() added a number of extra calls, which makes me wonder if we can improve performance by reducing more overhead in this call path

            In terms of debugging the source of the slowdown there are a couple of things to check:

            • look at osd-ldiskfs.*.brw_stats and/or osc.*.rpc_stats to see whether the RPCs are badly formed
            • look at the flame graph before/after to see where the extra CPU overhead is coming from
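
The call-rate estimate in the notes above can be reproduced with a short sketch. It assumes, as the comment does, one osc_extent_merge() call per write when only one merge direction is attempted and two when both front and back merges are tried. The function names are hypothetical and exist only for this illustration.

```python
MIB = 1024 * 1024
KIB = 1024


def writes_per_second(throughput_mib_s, write_size_kib):
    """Number of fixed-size writes per second at a given throughput."""
    return throughput_mib_s * MIB // (write_size_kib * KIB)


def merge_call_range(write_rate):
    """(low, high) merge calls per second.

    Assumption from the comment above: at least one merge attempt per
    write, at most two (front and back).
    """
    return write_rate, 2 * write_rate


def per_write_budget_us(write_rate):
    """Average time available per write, in microseconds."""
    return 1_000_000 / write_rate


if __name__ == "__main__":
    rate = writes_per_second(800, 4)
    low, high = merge_call_range(rate)
    print(f"{rate} writes/s, {low}-{high} merge calls/s")
    print(f"about {per_write_budget_us(rate):.2f} us per write")
```

At 800MiB/s the whole per-write budget is only a few microseconds, which is why even a small amount of extra work per call in this path shows up in the throughput number.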

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (green@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/41498
            Subject: LU-14424 Revert "LU-9679 osc: simplify osc_extent_find()"
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 94649b2121449fb80bf9d9b971db9be539db76b1

            sihara Shuichi Ihara added a comment -

            Here is the patch where we started to see this perf regression in the master branch.

            commit 80e21cce3dd6748fd760786cafe9c26d502fd74f
            Author: NeilBrown <neilb@suse.com>
            Date:   Thu Dec 13 11:32:56 2018 +1100
            
                LU-9679 osc: simplify osc_extent_find()
                
                osc_extent_find() contains some code with the same functionality as
                osc_extent_merge().  So replace that code with a call to
                osc_extent_merge().
                
                This requires that we set cur->oe_grants earlier, as
                osc_extent_merge() needs that.
                
                Also:
                
                 - fix a pre-existing bug - osc_extent_merge() should never try to
                   merge two extends with different ->oe_mppr as later alignment
                   checks can get confused.
                 - Remove a redundant list_del_init() which is already included in
                   __osc_extent_remove().
                
                Linux-Commit: 85ebb57ddc5b ("lustre: osc: simplify osc_extent_find()")
            

            Here are the test results before/after commit 80e21cce3d:

            80e21cce3d LU-9679 osc: simplify osc_extent_find()
            Max Write: 540.67 MiB/sec (566.93 MB/sec)
            
            9d914f9cc7 LU-13711 build: fix typo on SSL dependency for Ubuntu
            Max Write: 797.24 MiB/sec (835.96 MB/sec)

            sihara Shuichi Ihara added a comment -

            I will narrow this down and find which patch causes the problem.

            People

              sihara Shuichi Ihara
              sihara Shuichi Ihara