
poor stat performance after upgrade from zfs-0.6.2-1/lustre-2.4.0-1 to zfs-0.6.3-1

Details

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.4.2
    • Environment: CentOS 6.5

    Description

      After upgrading a system from lustre 2.4.0-1 / zfs-0.6.2-1 to lustre 2.4.2-1 / zfs-0.6.3-1, mdtest shows significantly lower stat performance - about 8,000 IOPS vs 14,400.

      File reads and file removals are a bit worse, but not as severe. See the attached graph.

      We do see other marked improvements with the upgrade, for example with system processes waiting on the MDS.

      I wonder if this is some kind of expected performance tradeoff for the new version? I'm guessing the absolute numbers for stat are still acceptable for our workload, but it is quite a large relative difference.

      Scott

      Attachments

        1. arcstat-1.png (11 kB)
        2. arcstat-2.png (12 kB)
        3. arcstat-2-95G-arc_meta_limit.png (13 kB)
        4. arcstat-3.png (14 kB)
        5. arcstat-95G-arc_meta_limit.png (10 kB)
        6. arcstat-MB.png (11 kB)
        7. mdtest-zfs063.png (15 kB)
        8. mdtest-zfs063-95G-arc_meta_limit.png (17 kB)


          Activity

            [LU-5212] poor stat performance after upgrade from zfs-0.6.2-1/lustre-2.4.0-1 to zfs-0.6.3-1
            pjones Peter Jones added a comment -

            I imagine the performance is quite different on more current versions of Lustre and ZFS


            sknolin Scott Nolin (Inactive) added a comment -

            Prakash, thanks for letting me know. We won't bother running it then. Ours is a production cluster; we can typically run these tests since it isn't heavily used all the time, but it's not easy.

            Scott


            prakash Prakash Surya (Inactive) added a comment -

            Scott, I was able to squeeze in a test run with lu_cache_nr=1048576 on the MDS and all OSS nodes in the filesystem. I didn't see any significant difference:

            hype355@root:srun -- mdtest -v -i 8 -F -n 4000 -d /p/lcratery/surya1/LU-5212/mdtest-5                                                                          
            -- started at 07/16/2014 09:23:46 --
            
            mdtest-1.8.3 was launched with 64 total task(s) on 64 nodes
            Command line used: /opt/mdtest-1.8.3/bin/mdtest -v -i 8 -F -n 4000 -d /p/lcratery/surya1/LU-5212/mdtest-5
            Path: /p/lcratery/surya1/LU-5212
            FS: 1019.6 TiB   Used FS: 50.3%   Inodes: 834.8 Mi   Used Inodes: 62.5%
            
            64 tasks, 256000 files
            
            SUMMARY: (of 8 iterations)
               Operation                  Max        Min       Mean    Std Dev
               ---------                  ---        ---       ----    -------
               File creation     :   3060.802   2525.669   2719.003    161.410
               File stat         :  72310.501  32382.555  57016.755  11440.553
               File removal      :   4344.489   4043.991   4224.141     97.727
               Tree creation     :    377.644     32.784    147.864    126.958
               Tree removal      :     11.800      9.356     10.626      0.884
            
            -- finished at 07/16/2014 09:45:06 --
            

            sknolin Scott Nolin (Inactive) added a comment -

            Prakash, we will give this a try soon.

            Scott


            prakash Prakash Surya (Inactive) added a comment -

            Scott, can you try increasing the `lu_cache_nr` module option and re-running the test?

            # zwicky-lcy-mds1 /root > cat /sys/module/obdclass/parameters/lu_cache_nr
            256
            

            Try increasing it to something much larger, maybe 1M. I'd try that myself, but our testing resource is busy with other work at the moment.
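
            For reference, a minimal sketch of how that could be set, assuming the obdclass module parameter path shown above (the modprobe.d file name and the 1M value are illustrative, and whether the sysfs file accepts writes at runtime depends on how the parameter was declared):

            # zwicky-lcy-mds1 /root > echo "options obdclass lu_cache_nr=1048576" >> /etc/modprobe.d/lustre.conf   # persists across module reloads; file name illustrative
            # zwicky-lcy-mds1 /root > echo 1048576 > /sys/module/obdclass/parameters/lu_cache_nr   # runtime change, only if the parameter is writable
            # zwicky-lcy-mds1 /root > cat /sys/module/obdclass/parameters/lu_cache_nr
            1048576

            If the sysfs file is read-only, the Lustre modules would need to be reloaded (and the targets remounted) for the new value to take effect.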

            prakash Prakash Surya (Inactive) added a comment - - edited

            Interesting.. I think I see similar reduced performance with stats as well.. Hm..

            So here's the mdtest output with releases based on lustre 2.4.2 and zfs 0.6.3 on the servers:

            hype355@root:srun -- mdtest -i 8 -F -n 4000 -d /p/lcratery/surya1/LU-5212/mdtest-1
            -- started at 07/09/2014 13:09:02 --
            
            mdtest-1.8.3 was launched with 64 total task(s) on 64 nodes
            Command line used: /opt/mdtest-1.8.3/bin/mdtest -i 8 -F -n 4000 -d /p/lcratery/surya1/LU-5212/mdtest-1
            Path: /p/lcratery/surya1/LU-5212
            FS: 1019.6 TiB   Used FS: 50.3%   Inodes: 866.3 Mi   Used Inodes: 60.2%
            
            64 tasks, 256000 files
            
            SUMMARY: (of 8 iterations)
               Operation                  Max        Min       Mean    Std Dev
               ---------                  ---        ---       ----    -------
               File creation     :   2046.438    838.703   1534.565    375.574
               File stat         :  65205.403  23577.494  57837.499  13089.055
               File removal      :   4780.471   4647.670   4719.076     45.088
               Tree creation     :    505.051     34.332    221.404    196.950
               Tree removal      :     12.423     10.049     11.123      0.763
            
            -- finished at 07/09/2014 13:40:52 --
            

            And here's the mdtest output with releases based on lustre 2.4.0 and zfs 0.6.2 on the servers:

            hype355@root:srun -- mdtest -i 8 -F -n 4000 -d /p/lcratery/surya1/LU-5212/mdtest-1
            -- started at 07/09/2014 14:43:06 --
            
            mdtest-1.8.3 was launched with 64 total task(s) on 64 nodes
            Command line used: /opt/mdtest-1.8.3/bin/mdtest -i 8 -F -n 4000 -d /p/lcratery/surya1/LU-5212/mdtest-1
            Path: /p/lcratery/surya1/LU-5212
            FS: 1019.6 TiB   Used FS: 50.3%   Inodes: 861.8 Mi   Used Inodes: 60.5%
            
            64 tasks, 256000 files
            
            SUMMARY: (of 8 iterations)
               Operation                  Max        Min       Mean    Std Dev
               ---------                  ---        ---       ----    -------
               File creation     :   1627.029    810.017   1320.848    239.655
               File stat         :  99560.417  69839.184  88798.194   9632.641
               File removal      :   4352.713   3279.728   4029.607    413.213
               Tree creation     :    348.675     33.174    194.944    141.913
               Tree removal      :     15.176     10.103     12.088      1.386
            
            -- finished at 07/09/2014 15:19:02 --
            

            Which shows about a 34% decrease in the mean "File stat" performance with the lustre 2.4.2 and zfs 0.6.3 release (I'm assuming the number reported is operations per second). That's no good.
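
            For reference, that percentage is just the ratio of the two mean "File stat" rates from the summaries above (bc used here purely for illustration):

            # echo "scale=3; (88798.194 - 57837.499) / 88798.194" | bc
            .348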


            prakash Prakash Surya (Inactive) added a comment -

            > I would also add that while this absolute number from mdtest is worse, in use so far the upgrade has been an improvement. Performance doesn't seem to degrade so quickly with file creates, and things like interactive 'ls -l' are much better.

            Glad to hear it!

            I'm still a bit puzzled regarding the stats, though. I'm going to try and reproduce this using our test cluster; stay tuned.


            sknolin Scott Nolin (Inactive) added a comment -

            I would also add that while this absolute number from mdtest is worse, in use so far the upgrade has been an improvement. Performance doesn't seem to degrade so quickly with file creates, and things like interactive 'ls -l' are much better.

            Scott


            sknolin Scott Nolin (Inactive) added a comment -

            Y-axis is IOPS.

            The command info:

            mdtest-1.9.1 was launched with 64 total task(s) on 4 node(s)
            Command line used: /home/scottn/benchmarks/mdtest -i 2 -F -n 4000 -d /arcdata/scottn/mdtest

            64 tasks, 256000 files

            Scott


            prakash Prakash Surya (Inactive) added a comment -

            Also, what's the Y-axis label in the graph you linked to? I saw that earlier, but I can't make sense of it without labels. My initial interpretation was that the Y-axis is seconds, but that would mean lower is better, which doesn't agree with the claim of a performance decrease.

            Actually, I think I got it now. The Y-axis must be the rate of operations per second, which lines up with your claim of 14400 stat/s prior and 8000 stat/s now.

            When you get a chance, please update us with the command used to generate the workload.


            People

              Assignee: wc-triage WC Triage
              Reporter: sknolin Scott Nolin (Inactive)
              Votes: 0
              Watchers: 9
