LU-4367

unlink performance regression on lustre-2.5.52 client

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.7.0
    • Affects Version/s: Lustre 2.5.0
    • Severity: 2
    • 11951

    Description

      The lustre-2.5.52 client (and possibly other recent clients as well) shows a metadata performance regression when unlinking files in a single shared directory.
      Here are test results on lustre-2.5.52 clients and lustre-2.4.1 clients; lustre-2.5.52 is running on all servers.

      1 x MDS, 4 x OSS (32 x OST) and 16 clients (64 processes, 20000 files per process)

      lustre-2.4.1 client
      
      4.1-take2.log
      -- started at 12/09/2013 07:31:29 --
      
      mdtest-1.9.1 was launched with 64 total task(s) on 16 node(s)
      Command line used: /work/tools/bin/mdtest -d /lustre/dir.0 -n 20000 -F -i 3
      Path: /lustre
      FS: 1141.8 TiB   Used FS: 0.0%   Inodes: 50.0 Mi   Used Inodes: 0.0%
      
      64 tasks, 1280000 files
      
      SUMMARY: (of 3 iterations)
         Operation                      Max            Min           Mean        Std Dev
         ---------                      ---            ---           ----        -------
         File creation     :      58200.265      56783.559      57589.448        594.589
         File stat         :     123351.857     109571.584     114223.612       6455.043
         File read         :     109917.788      83891.903      99965.718      11472.968
         File removal      :      60825.889      59066.121      59782.774        754.599
         Tree creation     :       4048.556       1971.934       3082.293        853.878
         Tree removal      :         21.269         15.069         18.204          2.532
      
      -- finished at 12/09/2013 07:34:53 --
      
      lustre-2.5.52 client
      
      -- started at 12/09/2013 07:13:42 --
      
      mdtest-1.9.1 was launched with 64 total task(s) on 16 node(s)
      Command line used: /work/tools/bin/mdtest -d /lustre/dir.0 -n 20000 -F -i 3
      Path: /lustre
      FS: 1141.8 TiB   Used FS: 0.0%   Inodes: 50.0 Mi   Used Inodes: 0.0%
      
      64 tasks, 1280000 files
      
      SUMMARY: (of 3 iterations)
         Operation                      Max            Min           Mean        Std Dev
         ---------                      ---            ---           ----        -------
         File creation     :      58286.631      56689.423      57298.286        705.112
         File stat         :     127671.818     116429.261     121610.854       4631.684
         File read         :     173527.817     158205.242     166676.568       6359.445
         File removal      :      46818.194      45638.851      46118.111        506.151
         Tree creation     :       3844.458       2576.354       3393.050        578.560
         Tree removal      :         21.383         18.329         19.844          1.247
      
      -- finished at 12/09/2013 07:17:07 --
      

      46K ops/sec (lustre-2.5.52) vs. 60K ops/sec (lustre-2.4.1) for file removal: roughly a 25% performance drop on lustre-2.5.52 compared to lustre-2.4.1.
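
      For reference, the single-process shape of this workload looks roughly like the sketch below: create N files in one shared directory, then unlink them, timing each phase. This is only an illustration of the create/removal phases that mdtest measures, not the benchmark itself; the numbers above come from mdtest-1.9.1 run over MPI across 16 clients, and /lustre/dir.0 simply matches the mdtest command line shown above.

      #include <fcntl.h>
      #include <stdio.h>
      #include <time.h>
      #include <unistd.h>

      #define NFILES 20000    /* matches mdtest -n 20000 (files per process) */

      static double now(void)
      {
              struct timespec ts;
              clock_gettime(CLOCK_MONOTONIC, &ts);
              return ts.tv_sec + ts.tv_nsec / 1e9;
      }

      int main(int argc, char **argv)
      {
              const char *dir = argc > 1 ? argv[1] : "/lustre/dir.0";
              char path[4096];
              double t0;
              int i;

              /* create phase */
              t0 = now();
              for (i = 0; i < NFILES; i++) {
                      int fd;

                      snprintf(path, sizeof(path), "%s/file.%d", dir, i);
                      fd = open(path, O_CREAT | O_WRONLY, 0644);
                      if (fd < 0) {
                              perror("open");
                              return 1;
                      }
                      close(fd);
              }
              printf("File creation: %8.0f ops/sec\n", NFILES / (now() - t0));

              /* removal phase -- the operation that regressed */
              t0 = now();
              for (i = 0; i < NFILES; i++) {
                      snprintf(path, sizeof(path), "%s/file.%d", dir, i);
                      if (unlink(path) < 0) {
                              perror("unlink");
                              return 1;
                      }
              }
              printf("File removal:  %8.0f ops/sec\n", NFILES / (now() - t0));
              return 0;
      }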

      Attachments

        1. debugfile
          512 kB
        2. LU-4367.xlsx
          99 kB
        3. unlinkmany-result.zip
          4 kB


          Activity

            [LU-4367] unlink performance regression on lustre-2.5.52 client

            jlevi Jodi Levi (Inactive) added a comment - Patches landed to Master. Please reopen ticket if more work is needed.

            cliffw Cliff White (Inactive) added a comment - I ran the patch on Hyperion with 1, 32, 64, and 100 clients, running mdtest dir-per-process and single-shared-dir. Spreadsheet with graphs attached.

            ihara Shuichi Ihara (Inactive) added a comment - (quoting Oleg Drokin below)

            All of this is in this patch: http://review.whamcloud.com/11062
            Ihara-san, please give it a try to see if it helps for your workload?

            Sure, I will test that patch as soon as I can run the benchmark, maybe early next week. Thanks!
            green Oleg Drokin added a comment -

            So it looks like we can still infer whether the open originated from the VFS or not.

            When we come from do_filp_open() (the real open path), we go through filename_lookup() with LOOKUP_OPEN set; when we come through dentry_open(), LOOKUP_OPEN is not set.

            As such, the most brute-force way I see to address this is to have ll_revalidate_dentry() always return 0 if LOOKUP_OPEN is set and LOOKUP_CONTINUE is NOT set (i.e. we are looking up the last component).
            We already do a similar trick for LOOKUP_OPEN|LOOKUP_CONTINUE.

            BTW, while looking at the ll_revalidate_dentry() logic, I think we can also improve it quite a bit in the area of intermediate path component lookup.

            All of this is in this patch: http://review.whamcloud.com/11062
            Ihara-san, please give it a try to see if it helps for your workload?
            This patch passes a medium level of my testing (which does not include any performance testing).

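            A minimal sketch of the check Oleg describes, assuming the older-kernel LOOKUP_CONTINUE flag (newer kernels express "not the last component" differently). This illustrates the idea only; it is not the actual change in http://review.whamcloud.com/11062, and the real ll_revalidate_dentry() also validates the dentry's lock state before trusting it:

            #include <linux/dcache.h>
            #include <linux/namei.h>

            static int ll_revalidate_dentry(struct dentry *dentry,
                                            unsigned int lookup_flags)
            {
                    /* do_filp_open() -> filename_lookup() sets LOOKUP_OPEN;
                     * dentry_open() callers (e.g. NFS export) do not.
                     * LOOKUP_CONTINUE is only set while resolving an
                     * intermediate component, so OPEN && !CONTINUE means
                     * "opening the last component from the VFS open path". */
                    if ((lookup_flags & LOOKUP_OPEN) &&
                        !(lookup_flags & LOOKUP_CONTINUE))
                            return 0;       /* force a lookup with open intent */

                    /* Otherwise trust the cached dentry (the real code also
                     * checks that its MDS lock is still valid here). */
                    return 1;
            }
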
            laisiyao Lai Siyao added a comment -

            Oleg, the cause is the simplified revalidate (see 7475). Originally revalidate would execute IT_OPEN, but that code was a duplicate of lookup, and the opened handle could be lost if another client cancelled the lock. So 7475 simplified revalidate to just return 1 if the dentry is valid, and let .open really open the file. But that open can't be differentiated from an NFS-export open, so both the open-after-revalidate and the NFS-export open take an open lock.

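            To make the failure mode concrete, here is a compilable toy model of the post-7475 flow Lai describes. The types and helpers (cached_lock_is_valid, take_mds_open_lock) are hypothetical stand-ins for illustration, not the Lustre llite API:

            #include <stdbool.h>
            #include <stdio.h>

            struct dentry { bool lock_valid; };
            struct file   { bool has_open_lock; };

            /* hypothetical: "the dentry still has a valid MDS lock" */
            static bool cached_lock_is_valid(const struct dentry *d)
            {
                    return d->lock_valid;
            }

            /* hypothetical: request an open lock from the MDS */
            static void take_mds_open_lock(struct file *f)
            {
                    f->has_open_lock = true;
            }

            /* 7475's simplified revalidate: no IT_OPEN intent is executed
             * here any more; just report whether the dentry is still valid. */
            static int simplified_revalidate(const struct dentry *d)
            {
                    return cached_lock_is_valid(d) ? 1 : 0;
            }

            /* The real open now happens in ->open(). The problem: an open
             * that passed revalidate and an NFS-export open (which enters
             * via dentry_open() with no lookup intent) look identical here,
             * so both are granted an open lock. Those extra open locks must
             * be cancelled by whichever client later unlinks the file,
             * which is where the unlink slowdown comes from. */
            static int deferred_open(struct dentry *d, struct file *f)
            {
                    (void)d;                /* no intent data to distinguish callers */
                    take_mds_open_lock(f);  /* taken for BOTH kinds of open */
                    return 0;
            }

            int main(void)
            {
                    struct dentry d = { .lock_valid = true };
                    struct file f = { .has_open_lock = false };

                    if (simplified_revalidate(&d))
                            deferred_open(&d, &f);
                    printf("open lock taken: %d\n", f.has_open_lock);
                    return 0;
            }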

            adilger Andreas Dilger added a comment -

            Sorry about my earlier confusion with 10426 - I thought that was a different patch, but I see now that it is required for 10398 to work.

            It looks like the 10398 patch does improve the unlink performance, but at the expense of almost every other operation. Since unlink is already faster than create, it doesn't make sense to speed it up and slow down create. It looks like there is also some other change(s) that slowed down the create and stat operations on master compared to 2.5.2.

            It doesn't seem reasonable to land 10398 for 2.6.0 at this point.


            People

              Assignee: laisiyao Lai Siyao
              Reporter: ihara Shuichi Ihara (Inactive)
              Votes: 0
              Watchers: 12
