  1. Lustre
  2. LU-9998

Default partition setup is not optimal for best metadata performance

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.11.0, Lustre 2.10.4
    • None
    • None
    • b2_10
    • 3

    Description

      Here is the CPU configuration of the MDS.

      [root@mds11 ~]# lscpu 
      Architecture:          x86_64
      CPU op-mode(s):        32-bit, 64-bit
      Byte Order:            Little Endian
      CPU(s):                48
      On-line CPU(s) list:   0-47
      Thread(s) per core:    2
      Core(s) per socket:    24
      Socket(s):             1
      NUMA node(s):          1
      Vendor ID:             GenuineIntel
      CPU family:            6
      Model:                 85
      Model name:            Intel(R) Xeon(R) Platinum 8160 CPU @ 2.10GHz
      Stepping:              4
      CPU MHz:               2101.000
      BogoMIPS:              4200.00
      Virtualization:        VT-x
      L1d cache:             32K
      L1i cache:             32K
      L2 cache:              1024K
      L3 cache:              33792K
      NUMA node0 CPU(s):     0-47
      
      [root@mds11 ~]# numactl -H
      available: 1 nodes (0)
      node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
      node 0 size: 96940 MB
      node 0 free: 90229 MB
      node distances:
      node   0 
        0:  10 
      

      Only a single CPU partition is created by default on this single-socket (single NUMA node) configuration.

      [root@mds11 ~]# cat /proc/sys/lnet/cpu_partition_table 
      0	: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
      

      This default partition configuration is not optimal and has a large impact on metadata performance, especially for stat and read operations.
      See the test results below, which compare the default setting with a manual setting of 6 partitions.
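
For reference, the six-partition table used in the second test places four physical cores and their hyper-thread siblings in each partition. The following is a minimal Python sketch that reproduces that layout from the lscpu values above. It assumes that the sibling thread of core c is numbered c + 24, which is consistent with the partition table in this report but was not checked against the sysfs topology files. The function names are hypothetical and are not part of Lustre or libcfs.

```python
def build_partitions(n_cores=24, threads_per_core=2, n_partitions=6):
    """Split physical cores evenly and add each core's hyper-thread siblings."""
    if n_cores % n_partitions:
        raise ValueError("cores must divide evenly across partitions")
    per_part = n_cores // n_partitions
    table = {}
    for part in range(n_partitions):
        cores = range(part * per_part, (part + 1) * per_part)
        # Assumption: thread t of core c has CPU number c + t * n_cores.
        table[part] = [c + t * n_cores
                       for t in range(threads_per_core)
                       for c in cores]
    return table


def format_table(table):
    """Render the mapping in the same style as cpu_partition_table."""
    return "\n".join(
        f"{part}\t: {' '.join(str(cpu) for cpu in cpus)}"
        for part, cpus in table.items()
    )


if __name__ == "__main__":
    print(format_table(build_partitions()))
```

With the default arguments this prints six lines, one per partition, in the format shown by /proc/sys/lnet/cpu_partition_table.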

      Default partition (npartition=1)

      mpirun -np 128 /work/tools/bin/mdtest -n 5000 -v -d /scratch0/dir0 -F -i 3 -p 10 -w 0 -u
      
      SUMMARY: (of 3 iterations)
         Operation                      Max            Min           Mean        Std Dev
         ---------                      ---            ---           ----        -------
         File creation     :      90269.484      73210.911      83067.818       7212.787
         File stat         :     192519.466     191217.586     191843.135        532.702
         File read         :      84278.190      74407.351      78726.036       4123.061
         File removal      :     152552.089     141405.693     148541.612       5058.776
         Tree creation     :        576.227        129.569        332.039        184.718
         Tree removal      :         28.016         12.466         18.019          7.083
      V-1: Entering timestamp...
      

      npartition=6

      [root@mds11 ~]# cat /proc/sys/lnet/cpu_partition_table 
      0	: 0 1 2 3 24 25 26 27
      1	: 4 5 6 7 28 29 30 31
      2	: 8 9 10 11 32 33 34 35
      3	: 12 13 14 15 36 37 38 39
      4	: 16 17 18 19 40 41 42 43
      5	: 20 21 22 23 44 45 46 47
      
      mpirun -np 128 /work/tools/bin/mdtest -n 5000 -v -d /scratch0/dir0 -F -i 3 -p 10 -w 0 -u
      SUMMARY: (of 3 iterations)
         Operation                      Max            Min           Mean        Std Dev
         ---------                      ---            ---           ----        -------
         File creation     :     130215.199     112298.894     123903.497       8216.228
         File stat         :     447219.644     422373.391     436421.078      10400.374
         File read         :     224856.656     216383.752     219513.555       3796.625
         File removal      :     142603.040     138102.147     139843.976       1973.252
         Tree creation     :        561.879        170.631        379.767        160.865
         Tree removal      :         41.908         41.042         41.509          0.357
      V-1: Entering timestamp...
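
To make the comparison concrete, the mean rates from the two mdtest summaries above can be turned into ratios. This is only a small Python sketch over the numbers reported here; the dictionary and function names are illustrative.

```python
# Mean rates copied from the two mdtest summaries in this report.
DEFAULT_1_PARTITION = {
    "File creation": 83067.818,
    "File stat": 191843.135,
    "File read": 78726.036,
    "File removal": 148541.612,
}
MANUAL_6_PARTITIONS = {
    "File creation": 123903.497,
    "File stat": 436421.078,
    "File read": 219513.555,
    "File removal": 139843.976,
}


def speedups(before, after):
    """Return after/before for every operation present in 'before'."""
    return {op: after[op] / before[op] for op in before}


if __name__ == "__main__":
    for op, ratio in speedups(DEFAULT_1_PARTITION, MANUAL_6_PARTITIONS).items():
        print(f"{op:<14} {ratio:.2f}x")
```

On these means, stat improves by roughly 2.3x, read by roughly 2.8x, and creation by roughly 1.5x, while removal is about 6% lower with 6 partitions.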
      

          People

            dmiter Dmitry Eremin (Inactive)
            ihara Shuichi Ihara (Inactive)