Lustre / LU-10394

IB_MR_TYPE_SG_GAPS mlx5 LNet performance drop

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.11.0
    • Affects Version/s: Lustre 2.11.0
    • Labels: None
    • Environment: CentOS 7.4
      2.10.56_1_g11aae87-1.el7.centos.x86_64
    • Severity: 3

    Description

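      LNet selftest bandwidth on mlx5 drops by more than 2 GiB/s when using IB_MR_TYPE_SG_GAPS as compared to IB_MR_TYPE_MEM_REG. With 1M bulk reads and writes, the client sees roughly 8,700 to 8,850 MiB/s with SG GAPS versus roughly 11,100 to 11,200 MiB/s without it, at every concurrency level tested (32, 64, 128).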

       

      mlx5 with SG GAPS

      ----------------------------------------------------------
      Running test: lst add_test --batch rperf --concurrency 32 --distribute 1:1 --from clients --to servers brw read size=1M
      Client Read RPC/s: 17426.3333333333
      Client Write RPC/s: 8713.77777777778
      Client Read MiB/s: 8713.46111111111
      Client Write MiB/s: 1.33
      ----------------------------------------------------------
      Running test: lst add_test --batch rperf --concurrency 64 --distribute 1:1 --from clients --to servers brw read size=1M
      Client Read RPC/s: 17408.3333333333
      Client Write RPC/s: 8705.22222222222
      Client Read MiB/s: 8704.06666666667
      Client Write MiB/s: 1.33
      ----------------------------------------------------------
      Running test: lst add_test --batch rperf --concurrency 128 --distribute 1:1 --from clients --to servers brw read size=1M
      Client Read RPC/s: 17388.4444444444
      Client Write RPC/s: 8697
      Client Read MiB/s: 8695.54666666667
      Client Write MiB/s: 1.32777777777778
      ----------------------------------------------------------
      Running test: lst add_test --batch wperf --concurrency 32 --distribute 1:1 --from clients --to servers brw write size=1M
      Client Read RPC/s: 17712.1111111111
      Client Write RPC/s: 8856.55555555555
      Client Read MiB/s: 1.35
      Client Write MiB/s: 8855.53111111111
      ----------------------------------------------------------
      Running test: lst add_test --batch wperf --concurrency 64 --distribute 1:1 --from clients --to servers brw write size=1M
      Client Read RPC/s: 17705.7777777778
      Client Write RPC/s: 8853.66666666667
      Client Read MiB/s: 1.35
      Client Write MiB/s: 8853.18555555556
      ----------------------------------------------------------
      Running test: lst add_test --batch wperf --concurrency 128 --distribute 1:1 --from clients --to servers brw write size=1M
      Client Read RPC/s: 17697.3333333333
      Client Write RPC/s: 8854.44444444445
      Client Read MiB/s: 1.34888888888889
      Client Write MiB/s: 8850.95777777778
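The summary blocks above follow a regular layout: a "Running test:" line carrying the lst arguments, then four "Client ..." metric lines. A minimal Python sketch for turning that text into records is shown here. It assumes the layout is exactly as pasted in this ticket; the names `parse_runs`, `TEST_RE`, `METRIC_RE` and `SAMPLE` are hypothetical helpers, not part of lst or Lustre.

```python
import re

# Two blocks copied from the output above (concurrency 32, read and write).
SAMPLE = """\
----------------------------------------------------------
Running test: lst add_test --batch rperf --concurrency 32 --distribute 1:1 --from clients --to servers brw read size=1M
Client Read RPC/s: 17426.3333333333
Client Write RPC/s: 8713.77777777778
Client Read MiB/s: 8713.46111111111
Client Write MiB/s: 1.33
----------------------------------------------------------
Running test: lst add_test --batch wperf --concurrency 32 --distribute 1:1 --from clients --to servers brw write size=1M
Client Read RPC/s: 17712.1111111111
Client Write RPC/s: 8856.55555555555
Client Read MiB/s: 1.35
Client Write MiB/s: 8855.53111111111
"""

# Pulls concurrency, operation and I/O size out of the lst command line.
TEST_RE = re.compile(r"--concurrency (\d+) .* brw (read|write) size=(\S+)")
# Matches one metric line, e.g. "Client Read MiB/s: 8713.46".
METRIC_RE = re.compile(r"Client (Read|Write) (RPC/s|MiB/s): ([\d.]+)")


def parse_runs(text):
    """Return one dict per 'Running test:' block found in the text."""
    runs = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Running test:"):
            match = TEST_RE.search(line)
            current = None
            if match:
                current = {
                    "concurrency": int(match.group(1)),
                    "op": match.group(2),
                    "size": match.group(3),
                }
                runs.append(current)
            continue
        match = METRIC_RE.fullmatch(line)
        if match and current is not None:
            # Keys look like "read_MiB/s" or "write_RPC/s".
            key = f"{match.group(1).lower()}_{match.group(2)}"
            current[key] = float(match.group(3))
    return runs


if __name__ == "__main__":
    for run in parse_runs(SAMPLE):
        print(run)
```

Separator lines and blank lines are ignored, so the full output of either configuration can be pasted in unchanged.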
      
      
      

      mlx5 without SG GAPS

      ----------------------------------------------------------
      Running test: lst add_test --batch rperf --concurrency 32 --distribute 1:1 --from clients --to servers brw read size=1M
      Client Read RPC/s: 22449.5555555556
      Client Write RPC/s: 11227
      Client Read MiB/s: 11224.5033333333
      Client Write MiB/s: 1.71222222222222
      ----------------------------------------------------------
      Running test: lst add_test --batch rperf --concurrency 64 --distribute 1:1 --from clients --to servers brw read size=1M
      Client Read RPC/s: 22308.6666666667
      Client Write RPC/s: 11154.3333333333
      Client Read MiB/s: 11155.7288888889
      Client Write MiB/s: 1.7
      ----------------------------------------------------------
      Running test: lst add_test --batch rperf --concurrency 128 --distribute 1:1 --from clients --to servers brw read size=1M
      Client Read RPC/s: 21549.1
      Client Write RPC/s: 10737.4
      Client Read MiB/s: 11135.278
      Client Write MiB/s: 1.638
      ----------------------------------------------------------
      Running test: lst add_test --batch wperf --concurrency 32 --distribute 1:1 --from clients --to servers brw write size=1M
      Client Read RPC/s: 22178.3333333333
      Client Write RPC/s: 11090.8888888889
      Client Read MiB/s: 1.69
      Client Write MiB/s: 11088.7822222222
      ----------------------------------------------------------
      Running test: lst add_test --batch wperf --concurrency 64 --distribute 1:1 --from clients --to servers brw write size=1M
      Client Read RPC/s: 22198.6666666667
      Client Write RPC/s: 11099.8888888889
      Client Read MiB/s: 1.69111111111111
      Client Write MiB/s: 11100.1666666667
      ----------------------------------------------------------
      Running test: lst add_test --batch wperf --concurrency 128 --distribute 1:1 --from clients --to servers brw write size=1M
      Client Read RPC/s: 22162.6666666667
      Client Write RPC/s: 11085.7777777778
      Client Read MiB/s: 1.68777777777778
      Client Write MiB/s: 11083.5477777778
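To put a number on the drop, the two result sets can be compared directly. The sketch below averages the primary-direction bandwidth (Read MiB/s for the rperf runs, Write MiB/s for the wperf runs) over the three concurrency levels and reports the difference. The figures are copied from the output above; the averaging over concurrency levels and the names `results`, `mean` and `summarize` are choices made for this illustration, not something the reporter did.

```python
# Primary-direction bandwidth in MiB/s, copied from the lst output above,
# ordered by concurrency 32, 64, 128.
results = {
    "read": {
        "sg_gaps": [8713.46111111111, 8704.06666666667, 8695.54666666667],
        "no_sg_gaps": [11224.5033333333, 11155.7288888889, 11135.278],
    },
    "write": {
        "sg_gaps": [8855.53111111111, 8853.18555555556, 8850.95777777778],
        "no_sg_gaps": [11088.7822222222, 11100.1666666667, 11083.5477777778],
    },
}


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def summarize(data):
    """Per operation: mean bandwidth with and without SG GAPS, and the drop."""
    summary = {}
    for op, runs in data.items():
        with_gaps = mean(runs["sg_gaps"])
        without_gaps = mean(runs["no_sg_gaps"])
        drop = without_gaps - with_gaps
        summary[op] = {
            "with_sg_gaps": with_gaps,
            "without_sg_gaps": without_gaps,
            "drop_mib_s": drop,
            "drop_pct": 100.0 * drop / without_gaps,
        }
    return summary


if __name__ == "__main__":
    for op, row in summarize(results).items():
        print(
            f"{op}: {row['with_sg_gaps']:.0f} vs {row['without_sg_gaps']:.0f} MiB/s, "
            f"drop {row['drop_mib_s']:.0f} MiB/s ({row['drop_pct']:.1f}%)"
        )
```

For both reads and writes the drop comes out above 2,000 MiB/s, roughly a fifth of the bandwidth measured without SG GAPS.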
      
      
      

      o2iblnd parameters:

      options ko2iblnd timeout=10
      options ko2iblnd peer_timeout=0
      options ko2iblnd keepalive=30
      options ko2iblnd credits=2048
      options ko2iblnd ntx=2048
      options ko2iblnd peer_credits=16
      options ko2iblnd concurrent_sends=16
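The settings above are modprobe `options` lines for the ko2iblnd module. When reproducing the test on another node it can help to load them into a dictionary, for example to diff two nodes or to write them back as a single line. The following is a small sketch under the assumption that every value is an integer, as it is here; `parse_options` and `to_single_line` are hypothetical helper names, and where the lines are stored (typically a file under /etc/modprobe.d/) is not stated in this ticket.

```python
# The option lines exactly as listed above.
OPTION_LINES = """\
options ko2iblnd timeout=10
options ko2iblnd peer_timeout=0
options ko2iblnd keepalive=30
options ko2iblnd credits=2048
options ko2iblnd ntx=2048
options ko2iblnd peer_credits=16
options ko2iblnd concurrent_sends=16
"""


def parse_options(text, module="ko2iblnd"):
    """Collect name=value pairs from 'options <module> ...' lines."""
    params = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, comments and options for other modules.
        if len(fields) < 3 or fields[0] != "options" or fields[1] != module:
            continue
        for assignment in fields[2:]:
            name, _, value = assignment.partition("=")
            params[name] = int(value)  # assumption: integer values only
    return params


def to_single_line(params, module="ko2iblnd"):
    """Render the parameters as one modprobe 'options' line."""
    settings = " ".join(f"{name}={value}" for name, value in params.items())
    return f"options {module} {settings}"


if __name__ == "__main__":
    params = parse_options(OPTION_LINES)
    print(params)
    print(to_single_line(params))
```

modprobe accepts several name=value pairs on one `options` line, so the single-line form is equivalent to the seven separate lines.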
      
      

            People

              Assignee: ashehata Amir Shehata (Inactive)
              Reporter: iziemba Ian Ziemba (Inactive)