  Lustre / LU-3321

2.x single thread/process throughput degraded from 1.8

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Fix Version/s: Lustre 2.6.0
    • Affects Version/s: Lustre 2.4.0
    • Environment: Tested on 2.3.64 and 1.8.9 clients with 4 OSS x 3 - 32 GB OST ramdisks
    • Severity: 3
    • 8259

    Description

      Single thread/process throughput on tag 2.3.64 is degraded from 1.8.9, and it degrades significantly further once the client hits its caching limit (llite.*.max_cached_mb). The attached graph shows LNet stats sampled every second for a single dd writing two 64 GB files, followed by dropping caches and reading the same two files back. The tests were not run simultaneously, but the graph aligns them to the same starting point. It also takes a significant amount of time to drop the caches on 2.3.64.
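
      A minimal sketch of how the cache limit and the cache-drop time mentioned above can be inspected on the client, using the standard lctl tunables (the 16384 MB value is only an illustrative setting, and output formats vary between Lustre versions):

        # Show the client-side cap on cached data per mounted filesystem (in MB)
        lctl get_param llite.*.max_cached_mb

        # Optionally adjust the cap to see how throughput behaves once dd crosses it
        # (16384 is an arbitrary example value, not taken from this ticket)
        lctl set_param llite.*.max_cached_mb=16384

        # Time how long the client takes to release its cached pages
        time sh -c 'echo 1 > /proc/sys/vm/drop_caches'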

      Lustre 2.3.64
      Write (dd if=/dev/zero of=testfile bs=1M)
      68719476736 bytes (69 GB) copied, 110.459 s, 622 MB/s
      68719476736 bytes (69 GB) copied, 147.935 s, 465 MB/s

      Drop caches (echo 1 > /proc/sys/vm/drop_caches)
      real 0m43.075s

      Read (dd if=testfile of=/dev/null bs=1M)
      68719476736 bytes (69 GB) copied, 99.2963 s, 692 MB/s
      68719476736 bytes (69 GB) copied, 142.611 s, 482 MB/s

      Lustre 1.8.9
      Write (dd if=/dev/zero of=testfile bs=1M)
      68719476736 bytes (69 GB) copied, 63.3077 s, 1.1 GB/s
      68719476736 bytes (69 GB) copied, 67.4487 s, 1.0 GB/s

      Drop caches (echo 1 > /proc/sys/vm/drop_caches)
      real 0m9.189s

      Read (dd if=testfile of=/dev/null bs=1M)
      68719476736 bytes (69 GB) copied, 46.4591 s, 1.5 GB/s
      68719476736 bytes (69 GB) copied, 52.3635 s, 1.3 GB/s
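
      A minimal sketch of the test sequence above, assuming the client mount point is /mnt/lustre (illustrative) and a 64 GiB file size (count=65536 with bs=1M, matching the byte counts reported above):

        # Write two 64 GiB files, one after the other
        for i in 1 2; do
          dd if=/dev/zero of=/mnt/lustre/testfile$i bs=1M count=65536
        done

        # Time the cache drop, as in the runs above
        time sh -c 'echo 1 > /proc/sys/vm/drop_caches'

        # Read the same two files back
        for i in 1 2; do
          dd if=/mnt/lustre/testfile$i of=/dev/null bs=1M
        done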

      Attachments

        1. cpustat.scr (0.5 kB)
        2. dd_throughput_comparison_with_change_5446.png (7 kB)
        3. dd_throughput_comparison.png (6 kB)
        4. lu-3321-singlethreadperf.tgz (391 kB)
        5. lu-3321-singlethreadperf2.tgz (564 kB)
        6. mcm8_wcd.png (9 kB)
        7. perf3.png (103 kB)


          Activity


            The Maloo test failed due to an unrelated bug. I will retrigger the test.

            jay Jinshan Xiong (Inactive) added a comment

            John - There are a few components to the review/landing process.

            The patch is built automatically by Jenkins, which contributes a +1 if it builds correctly (that's already true for http://review.whamcloud.com/#/c/8523/).

            Maloo then runs several sets of tests; currently I think there are three. Once all of the test sets have passed, Maloo contributes a +1. It looks like two of the three test sets have completed for the patch in question.

            Separately, human reviewers contribute code reviews. A positive review gives a +1. The general standard is two +1s before landing a patch.

            Then, finally, someone in the gatekeeper role - I believe that's currently Oleg Drokin and Andreas Dilger - approves the patch, which appears as +2. Then the patch is cherry-picked on to master (also by a gatekeeper).

            Once the patch has been cherry-picked, it has landed.

            This one is almost ready to land. The tests need to complete, then it should be approved and cherry-picked quickly. (Since this ticket is a blocker for 2.6, it'll definitely be in before that release.)

            paf Patrick Farrell (Inactive) added a comment - edited

            Can anyone tell me about the status of the final patch for this issue? Looks like some recent testing has been successful, but I don't know the other tools well enough to know if the patch is ready to land, or even already landed. Thanks.

            jfc John Fuchs-Chesney (Inactive) added a comment

            Once http://review.whamcloud.com/#/c/8523/ lands, this ticket can be closed.

            jlevi Jodi Levi (Inactive) added a comment

            No worries, we already have a patch for a per-CPU cl_env cache. It turns out that the overhead of allocating a cl_env is really high, so caching cl_env structures is necessary; we just need a smarter way to cache them.

            jay Jinshan Xiong (Inactive) added a comment

            We are rather disappointed by the revert of commit 93fe562. While I can understand not liking the performance impact, basic functionality trumps performance. On nodes with high processor counts, the Lustre client thrashes to the point of being unusable without that patch.

            I think this ticket is now a blocker for 2.6, because we can't operate with the tree in its current state.

            morrone Christopher Morrone (Inactive) added a comment

            Hi Erich,

            I didn't use striped files, and in the multi-threaded tests each thread wrote its own file.

            Did you use my performance patch to do the test?

            I assume you're benchmarking a single thread against a striped file. In my experience, if your OSTs are fast, striping a file across multiple OSTs won't improve performance, because the bottleneck is the client CPU. You can take a look at the OSC rpc_stats (lctl get_param osc.*.rpc_stats); if the number of RPCs in flight stays low, the client can't generate data fast enough to saturate the OSTs.

            jay Jinshan Xiong (Inactive) added a comment
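
            A rough sketch of the check described above, using the standard client-side parameters (the exact stats layout varies between Lustre versions):

              # Per-OSC RPC statistics; if the "rpcs in flight" histogram stays at
              # small values, the client is not generating data fast enough to keep
              # the OSTs busy
              lctl get_param osc.*.rpc_stats

              # The cap on concurrent RPCs per OSC (historically 8 by default)
              lctl get_param osc.*.max_rpcs_in_flight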

            Jinshan, a quick question: are the results you uploaded on Oct. 11 measured on a striped file? How many stripes? Were these multiple threads on one striped file, or each thread with its own file?

            I've seen very poor performance with striped files and no improvement when increasing the number of stripes with Lustre 2.x; I just want to make sure whether your measurements cover "my" use case or not.

            Thanks,
            Erich

            efocht Erich Focht added a comment
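
            A hedged sketch of how a striped-file variant of this test could be set up; the mount point /mnt/lustre and the stripe count of 4 are illustrative values, not taken from this ticket:

              # Create an empty file striped across 4 OSTs and confirm the layout
              lfs setstripe -c 4 /mnt/lustre/striped_testfile
              lfs getstripe /mnt/lustre/striped_testfile

              # Single-threaded write and read against the striped file
              dd if=/dev/zero of=/mnt/lustre/striped_testfile bs=1M count=65536
              dd if=/mnt/lustre/striped_testfile of=/dev/null bs=1M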

            And here is the test I did in our lab.

            jay Jinshan Xiong (Inactive) added a comment

            Let me share a test result from Jeremy which showed a performance improvement.

            jay Jinshan Xiong (Inactive) added a comment

            Please check the patches for master at:
            http://review.whamcloud.com/7888
            http://review.whamcloud.com/7889
            http://review.whamcloud.com/7890
            http://review.whamcloud.com/7891
            http://review.whamcloud.com/7892
            http://review.whamcloud.com/7893
            http://review.whamcloud.com/7894
            http://review.whamcloud.com/7895

            and the patches for b2_4 at:
            http://review.whamcloud.com/7896
            http://review.whamcloud.com/7897
            http://review.whamcloud.com/7898
            http://review.whamcloud.com/7899
            http://review.whamcloud.com/7900
            http://review.whamcloud.com/7901
            http://review.whamcloud.com/7902
            http://review.whamcloud.com/7903

            jay Jinshan Xiong (Inactive) added a comment

            People

              Assignee: jay Jinshan Xiong (Inactive)
              Reporter: jfilizetti Jeremy Filizetti
              Votes: 0
              Watchers: 23
