
Single client's performance degradation on 2.1

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Critical
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.2.0, Lustre 2.3.0
    • Component/s: None
    • Severity: 3
    • Rank: 4018

    Description

      During performance testing on Lustre 2.1, I saw a single-client performance degradation.
      Here are IOR results on a single client with 2.1, and with lustre-1.8.6.80 for comparison.
      I ran IOR (IOR -t 1m -b 32g -w -r -vv -F -o /lustre/ior.out/file) on the single client with 1, 2, 4 and 8 processes.
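
      For reference, here is what the flags in that invocation mean (standard IOR options; the MPI launcher line is an assumption, since only the process counts are given):

      # assumed launch, e.g. for the 4-process run:
      #   mpirun -np 4 IOR -t 1m -b 32g -w -r -vv -F -o /lustre/ior.out/file
      #
      # -t 1m   transfer size per I/O call (1 MiB)
      # -b 32g  amount of data written per process (32 GiB)
      # -w      run the write test
      # -r      run the read test
      # -vv     extra-verbose output
      # -F      file-per-process mode (one file per task)
      # -o      path of the test file(s)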

      Write (MiB/sec)
      Processes   v1.8.6.80   v2.1
      1           446.25      411.43
      2           808.53      761.30
      4           1484.18     1151.41
      8           1967.42     1172.06

      Read (MiB/sec)
      Processes   v1.8.6.80   v2.1
      1           823.90      595.71
      2           1449.49     1071.76
      4           2502.49     1517.79
      8           3133.43     1746.30

      Tested on the same infrastructure (hardware and network). Checksums were turned off on the client in both tests.
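
      For anyone reproducing this: the checksum setting in question is the client-side wire checksum, which can be toggled per OSC device with lctl. A minimal sketch, run on the client:

      # disable client<->OST wire checksums on all OSC devices
      lctl set_param osc.*.checksums=0
      # confirm the setting
      lctl get_param osc.*.checksums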

      Attachments

        1. 2.4 Single Client 3May2013.xlsx
          34 kB
        2. 574.1.pdf
          169 kB
        3. ior-256gb.tar.gz
          32 kB
        4. ior-32gb.tar.gz
          24 kB
        5. lu744-20120909.tar.gz
          883 kB
        6. lu744-20120915.tar.gz
          874 kB
        7. lu744-20120915-02.tar.gz
          1.02 MB
        8. lu744-20121111.tar.gz
          849 kB
        9. lu744-20121113.tar.gz
          846 kB
        10. lu744-20121117.tar.gz
          2.45 MB
        11. lu744-20130104.tar.gz
          915 kB
        12. lu744-20130104-02.tar.gz
          26 kB
        13. lu744-dls-20121113.tar.gz
          10 kB
        14. orig-collectl.out
          81 kB
        15. orig-ior.out
          2 kB
        16. orig-opreport-l.out
          146 kB
        17. patched-collectl.out
          34 kB
        18. patched-ior.out
          2 kB
        19. patched-opreport-l.out
          137 kB
        20. single-client-performance.xlsx
          42 kB
        21. stats-1.8.zip
          14 kB
        22. stats-2.1.zip
          64 kB
        23. test2-various-version.zip
          264 kB
        24. test-patchset-2.zip
          147 kB


          Activity


            So far all these tests have been done with 2.3.0 on the servers. I've not tried 2.3.54 on any of my test servers yet. I'll try to find some time over the next few days.

            ferner Frederik Ferner (Inactive) added a comment

            Frederik, I'm assuming for your test results that you are running the same version on both the client and server? Would it also be possible for you to test 2.3.0 clients with 2.3.54 servers and vice versa? That would allow us to isolate whether the slowdown seen with 2.3.54 is due to changes in the client or the server.

            adilger Andreas Dilger added a comment

            Jinshan,

            apologies for not providing the information from the start; I've also now realised that this might be better suited to a new ticket, so let me know if you'd prefer me to open one.

            My current test setup is a small file system with all servers on Lustre 2.3: 2 OSSes with 6 OSTs in total (3 per OSS). All servers and test clients are attached via 10GigE. Network throughput has been tested: the test client can send at 1100MB/s to each server in turn using netperf, and LNET selftest throughput also reaches 1100MB/s sending from one client to both servers at the same time.
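
            For context, an LNET selftest session producing numbers like the above looks roughly as follows; the NIDs are placeholders for the test client and the two OSSes:

            modprobe lnet_selftest
            export LST_SESSION=$$
            lst new_session rw_test
            lst add_group clients 10.0.0.10@tcp                 # client NID (placeholder)
            lst add_group servers 10.0.0.1@tcp 10.0.0.2@tcp     # OSS NIDs (placeholders)
            lst add_batch bulk_w
            lst add_test --batch bulk_w --from clients --to servers brw write size=1M
            lst run bulk_w
            lst stat clients servers    # sample throughput; Ctrl-C to stop
            lst end_session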

            I've now repeated a small test with IOR and different versions on the clients. The test client only has 4GB RAM; in my tests on 2.3.54 (master up to commit 8229702 with patches 4245, 4374, 4375, 4471, 4472) I can write small files relatively fast, but 4GB files are slow. I've not tested reading, as this is not my main concern at the moment. (I'm hoping to achieve 900MB/s sustained write speed over 10GigE from a single process to accommodate a new detector we will commission early next year; my hope was for 2.x clients to provide higher single-thread performance than 1.8.)

            IOR command used:
            ior -o /mnt/play01/tmp/stripe-all/ior_dat -w -k -t1m -b 4g -i 1 -e

            Client details                            Write speed [MiB/s]
            1.8.8, checksums on                       487.61
            1.8.8, checksums off                      592.90
            2.3.0, checksums on                       440.36
            2.3.0, checksums off                      441.63
            2.3.54+patches, checksums on              30.21
            2.3.54+patches, checksums off             34.12
            2.3.54+patches, checksums on, 1GB file    313.47

            opreport and collectl output for all the tests with 4GB files is attached in lu744-dls-20121113.tar.gz

            Let me know if you need anything else, or if I need to run oprofile differently; I wasn't familiar with oprofile before.
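
            For reference, the attached opreport-l.out files correspond to the legacy opcontrol-based oprofile workflow, roughly like this (the vmlinux path is an assumption):

            opcontrol --init
            opcontrol --setup --vmlinux=/usr/lib/debug/lib/modules/$(uname -r)/vmlinux
            opcontrol --start
            # ... run the IOR workload ...
            opcontrol --stop
            opcontrol --dump
            opreport -l > opreport-l.out   # per-symbol profile, as attached
            opcontrol --deinit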

            ferner Frederik Ferner (Inactive) added a comment

            Hi Ihara, I still saw high contention in cl_page_put and stats. Can you please try patch 4519, in which I disabled stats completely? For the cl_page_put() part, I will think about a way to solve it.
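
            The stats in question are the client-side counters exposed through lctl; a typical capture around an IOR run would be something like this (the parameter list is illustrative, not exhaustive):

            lctl set_param llite.*.stats=clear osc.*.rpc_stats=clear   # reset counters
            # ... run IOR ...
            lctl get_param llite.*.stats   > llite-stats.out
            lctl get_param osc.*.rpc_stats > osc-rpc-stats.out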

            jay Jinshan Xiong (Inactive) added a comment

            Hi Jinshan, I just ran the same test after applying the two patches (4471 and 4472) to master. Please check all results and statistics.

            ihara Shuichi Ihara (Inactive) added a comment

            Yes, only those two on master.

            jay Jinshan Xiong (Inactive) added a comment

            Jinshan, so just the two patches (4471 and 4472) on master is fine? Then I'll collect stats during the IOR run. No other patches need to be applied to master for this debugging, right?

            ihara Shuichi Ihara (Inactive) added a comment

            Can you please describe the test environment in detail and tell me the specific performance numbers before and after applying the patch? Also, please collect performance data with oprofile and collectl, as Ihara did.

            There are two new patches (4471 and 4472) I submitted yesterday; can you please also give them a try?
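
            For the record, a collectl invocation matching the attached output would be along these lines (the subsystem selection is an assumption):

            # sample CPU, disk, memory and network once per second, with timestamps
            collectl -scdmn -oT -i1 > collectl.out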

            jay Jinshan Xiong (Inactive) added a comment

            Using those patches, I managed to compile a client from the git master branch and run my IOR benchmark. It didn't improve performance, but my client didn't suffer an OOM either. I've not added any other patches on top of master (as of Monday evening: commit 82297027514416985a5557cfe154e174014804ba), as none of them seemed to apply cleanly. Were you expecting me to see higher performance? Are there any other patches I should test?

            Frederik

            ferner Frederik Ferner (Inactive) added a comment

            Hi Ihara, I pushed two patches to address the stats problem: http://review.whamcloud.com/4471 and http://review.whamcloud.com/4472. Can you please give them a try? Please collect stats while you're running with the patches, thanks.

            Hi Frederik, can you please try patches http://review.whamcloud.com/4245, http://review.whamcloud.com/4374 and http://review.whamcloud.com/4375? They may solve your problem if you're hitting the same one reported by LLNL.
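
            For anyone following along: these numbers are Gerrit change IDs on review.whamcloud.com; one way to apply a change on top of a master checkout is sketched below (the project path and patchset number are assumptions; check the review page for the current patchset):

            # Gerrit keeps change N under refs/changes/<last 2 digits of N>/<N>/<patchset>
            git fetch http://review.whamcloud.com/fs/lustre-release refs/changes/71/4471/1
            git cherry-pick FETCH_HEAD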

            jay Jinshan Xiong (Inactive) added a comment

            I'm quite interested in these patches, as I'm currently trying to implement a file system where all traffic is via Ethernet, with the OSSes attached via (dual bonded) 10GigE. A small number of clients connected via 10GigE should each be able to write at 900MB/s from a single stream. Currently, with a 1.8.8 client writing to 2.3.0 OSSes and network checksums turned off, I get about 700MB/s. Upgrading the client to 2.3.0, I don't seem to get above 450MB/s; checksums don't make much difference here. (IOR, 1M block size.)

            I've so far not had much luck trying the patches attached to this ticket without an OOM on my client.
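
            As a sanity check of the raw single-stream network path before measuring Lustre itself, a netperf run of the kind referenced elsewhere in this ticket looks like this (the hostname is a placeholder):

            # on the OSS: start netserver; then on the client:
            netperf -H oss01 -t TCP_STREAM -l 30   # 30-second single-stream TCP test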

            ferner Frederik Ferner (Inactive) added a comment (edited)

            People

              Assignee: jay Jinshan Xiong (Inactive)
              Reporter: ihara Shuichi Ihara (Inactive)
              Votes: 1
              Watchers: 35
