[LU-6800] Significant performance regression with patch LU-5264

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.8.0
    • None
    • master
    • 2
    • 9223372036854775807

    Description

      During our performance testing, we found a significant metadata performance regression with LU-5264 on master.

      # mpirun -np 128 -ppn 4 -hostfile ./hostfile /work/tools/bin/mdtest -n 1000 -p 10 -i 5 -d /scratch1/mdtest.out
      

      master

      SUMMARY: (of 5 iterations)
         Operation                      Max            Min           Mean        Std Dev
         ---------                      ---            ---           ----        -------
         Directory creation:      39552.671      33039.129      37024.828       2875.617
         Directory stat    :      33462.417      29340.691      31662.586       1384.330
         Directory removal :      40938.777      40238.677      40571.960        283.701
         File creation     :      17696.663      17209.531      17542.185        171.470
         File stat         :      33892.041      33429.312      33680.603        170.577
         File read         :      11284.121      11012.694      11220.417        104.978
         File removal      :      39718.200      39449.348      39556.254         90.590
         Tree creation     :       4583.939        700.335       3652.356       1487.449
         Tree removal      :        170.563        156.738        162.935          5.172
      

      Same client version, but with patch 42fdf8355791cb682c6120f7950bb2ecd50f97aa (LU-5264 obdclass: fix race during key quiescency) reverted on the servers.

      SUMMARY: (of 5 iterations)
         Operation                      Max            Min           Mean        Std Dev
         ---------                      ---            ---           ----        -------
         Directory creation:      44937.511      42117.095      43780.402       1335.927
         Directory stat    :     135310.427     129560.951     133625.293       2077.128
         Directory removal :      51525.499      46852.534      49965.297       1611.759
         File creation     :      42978.506      41435.145      42413.409        586.294
         File stat         :     135882.699     133344.886     134466.144        977.577
         File read         :     121788.787     111332.613     116374.190       3351.730
         File removal      :      84827.815      78120.995      80378.741       2522.662
         Tree creation     :       4650.004       3788.893       4268.099        336.241
         Tree removal      :        198.059        129.234        179.980         25.563
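
To put the two summaries above side by side, the following sketch computes, per operation, how much faster the run with the patch reverted on the servers is than master. It uses only the Mean columns; the dictionary and function names are illustrative and are not part of mdtest.

```python
# Mean ops/sec copied from the two SUMMARY tables above.
MASTER = {
    "Directory creation": 37024.828,
    "Directory stat": 31662.586,
    "Directory removal": 40571.960,
    "File creation": 17542.185,
    "File stat": 33680.603,
    "File read": 11220.417,
    "File removal": 39556.254,
    "Tree creation": 3652.356,
    "Tree removal": 162.935,
}
REVERTED = {
    "Directory creation": 43780.402,
    "Directory stat": 133625.293,
    "Directory removal": 49965.297,
    "File creation": 42413.409,
    "File stat": 134466.144,
    "File read": 116374.190,
    "File removal": 80378.741,
    "Tree creation": 4268.099,
    "Tree removal": 179.980,
}


def speedup(before, after):
    """Return after/before for every operation present in both runs."""
    return {op: after[op] / before[op] for op in before if op in after}


if __name__ == "__main__":
    # Print operations from the largest to the smallest speedup.
    factors = speedup(MASTER, REVERTED)
    for op, factor in sorted(factors.items(), key=lambda kv: -kv[1]):
        print(f"{op:<20} {factor:5.2f}x")
```

By this measure File read changes the most, at roughly a factor of ten, followed by Directory stat and File stat at roughly a factor of four. These are ratios of means only and say nothing about run-to-run variance.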
      

      Activity

            With the potential move to rhashtable, which has lockless lookups, we might be able to resolve these performance issues.

            simmonsja James A Simmons added the comment above.
            pjones Peter Jones added a comment -

            OK, then let's close this ticket for now. If we need to make future improvements to read operations, we can track that separately.

            We only hit this performance regression with mdtest, and all test files are zero bytes in size.
            We agreed that patch http://review.whamcloud.com/15558 helped and that performance was back even with patch LU-5264, except for the "file read" operation.
            We still don't know why read performance doesn't come back with patch 15558.

            ihara Shuichi Ihara (Inactive) added the comment above.

            Ihara, looking at your test results it seems that the mean performance of the original results (before LU-5264) and the results after the LU-6800 patch are very close, within the standard deviation for the tests:
            BEFORE

            master + revert 15558 + revert 13103
            # mpirun -np 128 -ppn 4 -hostfile ./hostfile /work/tools/bin/mdtest -i 3 -n 1000 -d /scratch1/mdtest.out
            
               Operation                      Mean        Std Dev
               ---------                      ----        -------
               Directory creation:       40159.293       3669.695
               Directory stat    :      131383.164       1118.004
               Directory removal :       52790.576       1285.107
               File creation     :       40070.221       2219.840
               File stat         :      130765.529       1013.515
               File read         :       80344.389       8825.741
               File removal      :       82668.050       1604.248
               Tree creation     :        4164.502        143.755
               Tree removal      :         200.008          3.799
            

            AFTER

            master (commit-id: fe60e0135ee2334440247cde167b707b223cf11d, includes LU-5264 and patch 15558)
            # mpirun -np 128 -ppn 4 -hostfile ./hostfile /work/tools/bin/mdtest -i 3 -n 1000 -d /scratch1/mdtest.out
            
               Operation                      Mean        Std Dev
               ---------                      ----        -------
               Directory creation:       43188.774       4041.792
               Directory stat    :      130085.996       3245.214
               Directory removal :       50171.405       4125.732
               File creation     :       40834.020       2776.628
               File stat         :      132934.894       2009.179
               File read         :       91483.603      10776.519
               File removal      :       85021.870       1700.945
               Tree creation     :        4167.598        451.208
               Tree removal      :         197.894          3.043
            

            The mean Directory removal and Directory stat operations are somewhat slower, but this is within the standard deviation of the three test runs. Conversely, the Directory create, File create, and File removal operations are faster, but are also within the standard deviation of the three test runs.

            For the File read it appears that the results are highly variable (stddev more than 10% of the mean). Is this performance loss seen with IO benchmarks like IOR or only the mdtest? What size of files is mdtest using?

            adilger Andreas Dilger added the comment above.
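
The comparison in the comment above can be sketched numerically. The comment does not say exactly what "within the standard deviation" means, so this sketch assumes one possible reading: the one-sigma intervals of the two runs overlap. With only three iterations per run this is a rough heuristic, not a significance test. The function names are illustrative.

```python
# (mean, std dev) in ops/sec, copied from the BEFORE and AFTER tables above.
BEFORE = {
    "Directory creation": (40159.293, 3669.695),
    "Directory stat": (131383.164, 1118.004),
    "Directory removal": (52790.576, 1285.107),
    "File creation": (40070.221, 2219.840),
    "File stat": (130765.529, 1013.515),
    "File read": (80344.389, 8825.741),
    "File removal": (82668.050, 1604.248),
    "Tree creation": (4164.502, 143.755),
    "Tree removal": (200.008, 3.799),
}
AFTER = {
    "Directory creation": (43188.774, 4041.792),
    "Directory stat": (130085.996, 3245.214),
    "Directory removal": (50171.405, 4125.732),
    "File creation": (40834.020, 2776.628),
    "File stat": (132934.894, 2009.179),
    "File read": (91483.603, 10776.519),
    "File removal": (85021.870, 1700.945),
    "Tree creation": (4167.598, 451.208),
    "Tree removal": (197.894, 3.043),
}


def relative_spread(mean, std):
    """Standard deviation as a fraction of the mean."""
    return std / mean


def one_sigma_overlap(a, b):
    """True when [mean - std, mean + std] of the two runs intersect.

    Assumed criterion: the comment above does not define its test.
    """
    (mean_a, std_a), (mean_b, std_b) = a, b
    return abs(mean_a - mean_b) <= std_a + std_b


if __name__ == "__main__":
    for op in BEFORE:
        overlap = one_sigma_overlap(BEFORE[op], AFTER[op])
        spread = relative_spread(*AFTER[op])
        print(f"{op:<20} overlap={overlap}  AFTER spread={spread:.1%}")
```

With these inputs every operation passes the overlap check, including File read, whose standard deviation is about 11 to 12% of the mean in both runs.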

            mdtest has been run in the restricted 2 and restricted 3 states, respectively without the patches and with all patches.

            In the current state (patches LU-5264 and LU-6049 reverted):

            $ mpirun -n 128 xx/mdtest -n 1000 -p 10 -i 5 -d xx/run_MDTest_repro3
            -- started at 08/13/2015 15:26:03 --
            
            mdtest-1.9.3 was launched with 128 total task(s) on 8 node(s)
            Command line used: ./mdtest -n 1000 -p 10 -i 5 -d ./run_MDTest_repro3
            Path: xxxxxxxxxxx
            FS: 155.1 TiB   Used FS: 7.9%   Inodes: 154.1 Mi   Used Inodes: 0.3%
            
            128 tasks, 128000 files/directories
            
            SUMMARY: (of 5 iterations)
               Operation                      Max            Min           Mean        Std Dev
               ---------                      ---            ---           ----        -------
               Directory creation:      12044.248       3622.254       7693.558       2787.514
               Directory stat    :      29808.509      28605.578      29428.433        434.277
               Directory removal :      16316.172      15596.360      16069.509        271.041
               File creation     :       8304.285       2475.372       5950.888       2378.499
               File stat         :      28493.314      28090.363      28265.330        130.886
               File read         :      15694.955      15170.723      15435.999        181.395
               File removal      :      15253.714      14426.384      14981.075        305.384
               Tree creation     :       3077.259       1170.939       1855.926        653.607
               Tree removal      :         95.066         61.637         77.186         11.245
            
            -- finished at 08/13/2015 15:34:26 --
            

            With LU-5264, LU-6049 and LU-6800:

            $ mpirun -n 128 xx/mdtest -n 1000 -p 10 -i 5 -d xx/run_MDTest_repro2
             -- started at 08/13/2015 15:04:09 --
            
            mdtest-1.9.3 was launched with 128 total task(s) on 8 node(s)
            Command line used: ./mdtest -n 1000 -p 10 -i 5 -d ./run_MDTest_repro2
            Path: xxxxxxxxxxxxx
            FS: 155.1 TiB   Used FS: 7.9%   Inodes: 154.0 Mi   Used Inodes: 0.3%
            
            128 tasks, 128000 files/directories
            
            SUMMARY: (of 5 iterations)
               Operation                      Max            Min           Mean        Std Dev
               ---------                      ---            ---           ----        -------
               Directory creation:      11815.599       6041.768       8297.495       2021.031
               Directory stat    :      29708.108      29290.724      29475.438        147.864
               Directory removal :      16459.019      16182.934      16283.041         93.778
               File creation     :       8561.213       8407.310       8496.989         57.227
               File stat         :      28579.728      28018.041      28328.611        184.786
               File read         :      15066.452      14786.594      14943.476         98.652
               File removal      :      14821.486      14289.802      14645.054        190.972
               Tree creation     :       2746.761       1234.708       1675.881        558.742
               Tree removal      :         63.032         51.565         58.417          3.900
            
            -- finished at 08/13/2015 15:11:16 --
            

            We do not observe a significant difference, but the tests were run on only 8 nodes.
            We expect to run a test with 32 nodes or more by the end of the month.

            bruno.travouillon Bruno Travouillon (Inactive) added the comment above.
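
For comparing runs like the two above without retyping numbers, a small parser for the SUMMARY rows can help. This is a sketch that assumes the row layout shown in this ticket (operation name, colon, then Max, Min, Mean and Std Dev as decimal numbers); other mdtest versions may print a different layout. All names in the code are illustrative.

```python
import re
from typing import Dict, NamedTuple


class Row(NamedTuple):
    maximum: float
    minimum: float
    mean: float
    std_dev: float


# One SUMMARY row: "<operation> : <max> <min> <mean> <std dev>".
ROW_RE = re.compile(
    r"^\s*(?P<op>[A-Za-z][A-Za-z ]*?)\s*:"
    r"\s+(?P<max>\d+\.\d+)\s+(?P<min>\d+\.\d+)"
    r"\s+(?P<mean>\d+\.\d+)\s+(?P<std>\d+\.\d+)\s*$"
)


def parse_summary(text: str) -> Dict[str, Row]:
    """Collect SUMMARY rows by operation name; other lines are skipped."""
    rows = {}
    for line in text.splitlines():
        m = ROW_RE.match(line)
        if m:
            rows[m["op"]] = Row(
                float(m["max"]), float(m["min"]),
                float(m["mean"]), float(m["std"]),
            )
    return rows


def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Excerpts of the two 8-node runs above (patches reverted, then all patches).
REVERTED_RUN = """
FS: 155.1 TiB   Used FS: 7.9%   Inodes: 154.1 Mi   Used Inodes: 0.3%
SUMMARY: (of 5 iterations)
   Operation                      Max            Min           Mean        Std Dev
   ---------                      ---            ---           ----        -------
   File creation     :       8304.285       2475.372       5950.888       2378.499
   File read         :      15694.955      15170.723      15435.999        181.395
"""
PATCHED_RUN = """
SUMMARY: (of 5 iterations)
   File creation     :       8561.213       8407.310       8496.989         57.227
   File read         :      15066.452      14786.594      14943.476         98.652
"""

if __name__ == "__main__":
    old, new = parse_summary(REVERTED_RUN), parse_summary(PATCHED_RUN)
    for op in old:
        change = percent_change(old[op].mean, new[op].mean)
        print(f"{op:<15} {change:+6.1f}%")
```

A change in the mean should be read next to the Std Dev column: in the first run, the File creation standard deviation is roughly 40% of its mean, so a large difference in means there is weak evidence on its own.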

            The first tests run with patch #15558 at the TGCC site do not show the same read performance regression.
            The site will soon provide its numbers for this ticket.
            More instrumentation will be done.

            bfaccini Bruno Faccini (Inactive) added the comment above.

            We have removed the patch for LU-5264 from all our file systems. We will discuss whether we can try the current fix for LU-6800 on a test file system by the end of the month.

            I will keep you posted.

            bruno.travouillon Bruno Travouillon (Inactive) added the comment above.

            Since I am the author of the patch for LU-5264 and thus unfortunately responsible for this situation, and given that the DDN team has already produced a very good but partial fix, I would like to work more actively on fixing this remaining read performance regression.

            Aurelien, Bruno, since multi-client contention seems to be the main trigger for the issue, would it be possible for me to work directly with you on a site where you hit this problem heavily?

            bfaccini Bruno Faccini (Inactive) added the comment above.
            ihara Shuichi Ihara (Inactive) added a comment - - edited

            Please re-open LU-6800. We understand that http://review.whamcloud.com/15558 helps a lot, but not all of the performance is back. Here are the test results, with 32 clients and 128 mdtest processes.

            test1 : master branch (commit-id: fe60e0135ee2334440247cde167b707b223cf11d, includes LU-5264 and patch 15558)

            # mpirun -np 128 -ppn 4 -hostfile ./hostfile /work/tools/bin/mdtest -i 3 -n 1000 -d /scratch1/mdtest.out
            
               Operation                      Max            Min           Mean        Std Dev
               ---------                      ---            ---           ----        -------
               Directory creation:      45237.210      36692.398      40159.293       3669.695
               Directory stat    :     132371.575     129820.230     131383.164       1118.004
               Directory removal :      53873.775      50985.149      52790.576       1285.107
               File creation     :      42732.503      37298.342      40070.221       2219.840
               File stat         :     131527.304     129333.170     130765.529       1013.515
               File read         :      87588.987      67919.964      80344.389       8825.741
               File removal      :      84046.477      80418.268      82668.050       1604.248
               Tree creation     :       4364.520       4032.985       4164.502        143.755
               Tree removal      :        203.587        194.749        200.008          3.799
            

            test2 : master + revert 15558

            # mpirun -np 128 -ppn 4 -hostfile ./hostfile /work/tools/bin/mdtest -i 3 -n 1000 -d /scratch1/mdtest.out
            
               Operation                      Max            Min           Mean        Std Dev
               ---------                      ---            ---           ----        -------
               Directory creation:      40422.683      20650.668      30457.842       8072.661
               Directory stat    :      33032.600      27110.270      30459.575       2479.308
               Directory removal :      41611.362      39640.289      40887.059        885.442
               File creation     :      17622.819      17537.572      17581.070         34.824
               File stat         :      33991.557      33935.386      33959.396         23.645
               File read         :      11241.112      10994.112      11104.383        102.558
               File removal      :      40024.327      39973.169      39998.669         20.886
               Tree creation     :       4185.932       3705.216       4007.822        215.092
               Tree removal      :        170.327        164.689        167.062          2.386
            

            test3 : master + revert 15558 + revert 13103

            # mpirun -np 128 -ppn 4 -hostfile ./hostfile /work/tools/bin/mdtest -i 3 -n 1000 -d /scratch1/mdtest.out
            
               Operation                      Max            Min           Mean        Std Dev
               ---------                      ---            ---           ----        -------
               Directory creation:      46423.406      37490.161      43188.774       4041.792
               Directory stat    :     134178.816     126241.328     130085.996       3245.214
               Directory removal :      53737.981      44389.098      50171.405       4125.732
               File creation     :      44199.169      37398.927      40834.020       2776.628
               File stat         :     135524.181     130626.893     132934.894       2009.179
               File read         :     100767.654      76374.732      91483.603      10776.519
               File removal      :      86318.162      82618.862      85021.870       1700.945
               Tree creation     :       4634.590       3557.510       4167.598        451.208
               Tree removal      :        201.814        194.397        197.894          3.043
            

            If we compare the test3 and test2 results, the test2 results are significantly worse, which means patch 13103 caused this performance regression.
            GuZhang at DDN pushed patch 15558 and, as far as we can see from the test1 results, performance is back except for the "file read" operation.
            So patch 15558 helps a lot, but even with it we still see a performance regression on the "file read" operation. We need to investigate further to get all of the performance back.
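
One way to quantify "helps a lot, but not all of the performance is back" is the fraction of the gap between test2 and test3 that test1 recovers. The sketch below uses the Mean columns of the three tables above for the five operations with the largest drop in test2. The metric and the names are illustrative choices, not something defined in the ticket, and the metric ignores the standard deviations.

```python
# Mean ops/sec from the three runs above.
TEST1 = {  # master, includes LU-5264 and patch 15558
    "Directory stat": 131383.164,
    "File creation": 40070.221,
    "File stat": 130765.529,
    "File read": 80344.389,
    "File removal": 82668.050,
}
TEST2 = {  # master + revert 15558
    "Directory stat": 30459.575,
    "File creation": 17581.070,
    "File stat": 33959.396,
    "File read": 11104.383,
    "File removal": 39998.669,
}
TEST3 = {  # master + revert 15558 + revert 13103
    "Directory stat": 130085.996,
    "File creation": 40834.020,
    "File stat": 132934.894,
    "File read": 91483.603,
    "File removal": 85021.870,
}


def recovered_fraction(fixed, regressed, baseline):
    """Share of the regressed-to-baseline gap that the fixed run closes.

    1.0 means the fixed run matches the baseline; 0.0 means no improvement.
    """
    return (fixed - regressed) / (baseline - regressed)


if __name__ == "__main__":
    for op in TEST1:
        frac = recovered_fraction(TEST1[op], TEST2[op], TEST3[op])
        print(f"{op:<15} {frac:6.1%}")
```

Among these five operations, File read recovers the least, about 86% of the gap, while the others come out at roughly 95% or more.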

            Aurélien,

            The issue in the build for bullx has already been reported in duplicate LU-6823. Bull is currently looking at LU-6800 carefully.

            bruno.travouillon Bruno Travouillon (Inactive) added the comment above.

            People

              bfaccini Bruno Faccini (Inactive)
              ihara Shuichi Ihara (Inactive)