LU-10373: LNet OPA Performance Drop

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.11.0
    • Affects Version/s: None
    • Labels: None
    • Environment: CentOS 7.3
    • Severity: 3

    Description

      LNet bandwidth over OPA has dropped since Lustre 2.10.0. The LNet selftest (lst) results below compare 2.10.0 with 2.10.55_127_g063a83a.

      # lctl --version
      lctl 2.10.0
      
      ----------------------------------------------------------
      Running test: lst add_test --batch rperf --concurrency 32 --distribute 1:1 --from clients --to servers brw read size=1M
      Client Read RPC/s: 23426.6428571429
      Client Write RPC/s: 11714.1428571429
      Client Read MiB/s: 11713.6164285714
      Client Write MiB/s: 1.78714285714286
      ----------------------------------------------------------
      Running test: lst add_test --batch rperf --concurrency 64 --distribute 1:1 --from clients --to servers brw read size=1M
      Client Read RPC/s: 23577.5714285714
      Client Write RPC/s: 11790.2857142857
      Client Read MiB/s: 11789.2135714286
      Client Write MiB/s: 1.79928571428571
      ----------------------------------------------------------
      Running test: lst add_test --batch rperf --concurrency 128 --distribute 1:1 --from clients --to servers brw read size=1M
      Client Read RPC/s: 23595.5714285714
      Client Write RPC/s: 11798.2857142857
      Client Read MiB/s: 11799.1114285714
      Client Write MiB/s: 1.8
      ----------------------------------------------------------
      Running test: lst add_test --batch wperf --concurrency 32 --distribute 1:1 --from clients --to servers brw write size=1M
      Client Read RPC/s: 21268.3571428571
      Client Write RPC/s: 10635.2142857143
      Client Read MiB/s: 1.62357142857143
      Client Write MiB/s: 10634.2071428571
      ----------------------------------------------------------
      Running test: lst add_test --batch wperf --concurrency 64 --distribute 1:1 --from clients --to servers brw write size=1M
      Client Read RPC/s: 22236.9285714286
      Client Write RPC/s: 11118.9285714286
      Client Read MiB/s: 1.69714285714286
      Client Write MiB/s: 11118.7914285714
      ----------------------------------------------------------
      Running test: lst add_test --batch wperf --concurrency 128 --distribute 1:1 --from clients --to servers brw write size=1M
      Client Read RPC/s: 22178.6428571429
      Client Write RPC/s: 11087.2142857143
      Client Read MiB/s: 1.69142857142857
      Client Write MiB/s: 11089.0557142857
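
For side-by-side comparison it can help to turn this report text into structured records. The following is a minimal sketch, assuming the report always uses the "Running test:" and "Client ... :" line shapes shown above; `parse_report` and the record layout are hypothetical helpers for illustration, not part of lst or lctl.

```python
import re

# Excerpt copied from the report above (Lustre 2.10.0, read test, concurrency 32).
SAMPLE = """\
----------------------------------------------------------
Running test: lst add_test --batch rperf --concurrency 32 --distribute 1:1 --from clients --to servers brw read size=1M
Client Read RPC/s: 23426.6428571429
Client Write RPC/s: 11714.1428571429
Client Read MiB/s: 11713.6164285714
Client Write MiB/s: 1.78714285714286
"""

# Pulls concurrency, operation and transfer size out of the lst add_test line.
TEST_RE = re.compile(r"--concurrency (\d+) .* brw (read|write) size=(\S+)")
# Matches one "Client <direction> <unit>: <value>" metric line.
METRIC_RE = re.compile(r"Client (Read|Write) (RPC/s|MiB/s): ([\d.]+)")


def parse_report(text):
    """Return one dict per 'Running test:' section (hypothetical helper)."""
    records = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Running test:"):
            match = TEST_RE.search(line)
            if match is None:
                raise ValueError(f"unrecognised test line: {line!r}")
            records.append({
                "concurrency": int(match.group(1)),
                "op": match.group(2),
                "size": match.group(3),
                "metrics": {},
            })
        elif records:
            match = METRIC_RE.match(line)
            if match:
                key = f"{match.group(1)} {match.group(2)}"
                records[-1]["metrics"][key] = float(match.group(3))
    return records


if __name__ == "__main__":
    for record in parse_report(SAMPLE):
        print(record)
```

Separator lines and blank lines are skipped because they match neither pattern, so the full report for either version can be pasted in unchanged.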
      
      
      
      # lctl --version
      lctl 2.10.55_127_g063a83a
      
      ----------------------------------------------------------
      Running test: lst add_test --batch rperf --concurrency 32 --distribute 1:1 --from clients --to servers brw read size=1M
      Client Read RPC/s: 16879.5
      Client Write RPC/s: 8441.14285714286
      Client Read MiB/s: 8439.57857142857
      Client Write MiB/s: 1.28785714285714
      ----------------------------------------------------------
      Running test: lst add_test --batch rperf --concurrency 64 --distribute 1:1 --from clients --to servers brw read size=1M
      Client Read RPC/s: 21844
      Client Write RPC/s: 10923.2857142857
      Client Read MiB/s: 10922.4635714286
      Client Write MiB/s: 1.66714285714286
      ----------------------------------------------------------
      Running test: lst add_test --batch rperf --concurrency 128 --distribute 1:1 --from clients --to servers brw read size=1M
      Client Read RPC/s: 21928.4285714286
      Client Write RPC/s: 10964.7857142857
      Client Read MiB/s: 10965.17
      Client Write MiB/s: 1.67357142857143
      ----------------------------------------------------------
      Running test: lst add_test --batch wperf --concurrency 32 --distribute 1:1 --from clients --to servers brw write size=1M
      Client Read RPC/s: 17288.2142857143
      Client Write RPC/s: 8645.07142857143
      Client Read MiB/s: 1.32
      Client Write MiB/s: 8643.84928571428
      ----------------------------------------------------------
      Running test: lst add_test --batch wperf --concurrency 64 --distribute 1:1 --from clients --to servers brw write size=1M
      Client Read RPC/s: 18382.8571428571
      Client Write RPC/s: 9192.92857142857
      Client Read MiB/s: 1.40214285714286
      Client Write MiB/s: 9191.25285714285
      ----------------------------------------------------------
      Running test: lst add_test --batch wperf --concurrency 128 --distribute 1:1 --from clients --to servers brw write size=1M
      Client Read RPC/s: 14966.3571428571
      Client Write RPC/s: 7486.07142857143
      Client Read MiB/s: 1.14285714285714
      Client Write MiB/s: 7482.79071428571
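
To quantify the regression, the sketch below computes the percentage drop in client MiB/s between the two versions. The figures are copied from the two reports above (read MiB/s for the rperf batches, write MiB/s for the wperf batches); the dictionary layout and the `percent_drop` and `drops` helpers are illustrative assumptions, not output of any Lustre tool.

```python
# Client MiB/s in the direction under test, copied from the two runs above.
# Keys are (operation, concurrency).
BASELINE_2_10_0 = {
    ("read", 32): 11713.6164285714,
    ("read", 64): 11789.2135714286,
    ("read", 128): 11799.1114285714,
    ("write", 32): 10634.2071428571,
    ("write", 64): 11118.7914285714,
    ("write", 128): 11089.0557142857,
}
CURRENT_2_10_55 = {
    ("read", 32): 8439.57857142857,
    ("read", 64): 10922.4635714286,
    ("read", 128): 10965.17,
    ("write", 32): 8643.84928571428,
    ("write", 64): 9191.25285714285,
    ("write", 128): 7482.79071428571,
}


def percent_drop(before, after):
    """Bandwidth lost relative to the baseline, in percent."""
    return (before - after) / before * 100.0


def drops(baseline, current):
    """Map each (operation, concurrency) key to its percentage drop."""
    return {key: percent_drop(baseline[key], current[key]) for key in baseline}


if __name__ == "__main__":
    for (op, conc), drop in drops(BASELINE_2_10_0, CURRENT_2_10_55).items():
        print(f"{op:5s} concurrency={conc:3d}: {drop:5.1f}% lower")
```

On these figures the drop is largest for writes at concurrency 128 (about 32.5%) and reads at concurrency 32 (about 28%), and smallest for reads at concurrency 64 and 128 (about 7%).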
      
      
      

      LNet configuration is:

      # cat /etc/lnet.conf
      net:
          - net type: o2ib1
            local NI(s):
              - nid: 10.2.0.40@o2ib1
                interfaces:
                    0: ib0
                tunables:
                    peer_timeout: 180
                    peer_credits: 128
                    peer_buffer_credits: 0
                    credits: 1024
                lnd tunables:
                    peercredits_hiw: 64
                    map_on_demand: 256
                    concurrent_sends: 256
                    fmr_pool_size: 2048
                    fmr_flush_trigger: 512
                    fmr_cache: 1
                    ntx: 2048
                    conns_per_peer: 2
                CPT: "[0,1]"
      
      

      OPA driver configuration is:

      # cat /etc/modprobe.d/hfi1.conf
      options hfi1 piothreshold=0 sge_copy_mode=2 wss_threshold=70
      
      

            People

              Assignee: ashehata Amir Shehata (Inactive)
              Reporter: iziemba Ian Ziemba (Inactive)