  1. Lustre
  2. LU-9840

LU-3529 causes 25% metadata performance regressions even without DNE

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.11.0, Lustre 2.10.2
    • master
    • 3

    Description

      We have finally identified the commit and root cause of a 25% metadata performance regression (file creation in a single shared directory).
      The regression was introduced between lustre-2.5 and lustre-2.6, and it still exists.
      Our investigation with "git bisect" shows that the following patch causes the performance regression.

      5f3e926ac9ff8ad134ad920d0e8545e16395ef3b is the first bad commit
      commit 5f3e926ac9ff8ad134ad920d0e8545e16395ef3b
      Author: wang di <di.wang@intel.com>
      Date:   Wed Jul 31 00:00:40 2013 -0700
      
          LU-3529 lod: create striped directory
          
          1. Add "lfs setdirstripe -i -c" to create striped
          directory.
          
          2. client send create request to the master MDT, which
          will allocate FIDs and create slaves. for all of slaves.
          
          3. Client needs to revalidate slaves during intent getattr
          and open request.
          
          4. lmv_stripe_md will include attributes(size, nlink etc)
          from all of stripe, which will be protected by UPDATE lock.
          client needs to merge these attributes when update inode.
          
          5. send create request to the MDT where the file is located,
          which can help creating master stripe of striped directory.
          
          Signed-off-by: wang di <di.wang@intel.com>
          Change-Id: I7ac560e39dcb415e310dc5e6ade531d76227ffae
          Reviewed-on: http://review.whamcloud.com/7196
          Tested-by: Jenkins
          Tested-by: Maloo <hpdd-maloo@intel.com>
          Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
          Reviewed-by: John L. Hammond <john.hammond@intel.com>
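
A bisection like the one above can be automated with `git bisect run`, which marks each commit from a script's exit code (0 = good, 1 = bad, 125 = skip). The following is only a minimal sketch of such a predicate, not the procedure actually used for this ticket: the function names and the idea of thresholding the mean "File creation" rate are assumptions made for illustration, and building, installing and benchmarking each Lustre revision is left out.

```python
import re

# Exit codes understood by `git bisect run`:
# 0 marks the commit good, 1 marks it bad, 125 asks git to skip it.
GOOD, BAD, SKIP = 0, 1, 125


def mean_rate(mdtest_output, operation="File creation"):
    """Return the Mean column for one row of an mdtest SUMMARY table.

    Returns None when the row is missing or incomplete.
    """
    pattern = re.compile(
        r"^\s*" + re.escape(operation) + r"\s*:\s*(.+)$", re.MULTILINE
    )
    match = pattern.search(mdtest_output)
    if match is None:
        return None
    columns = match.group(1).split()  # Max, Min, Mean, Std Dev
    if len(columns) < 3:
        return None
    return float(columns[2])


def bisect_exit_code(mdtest_output, min_rate):
    """Map mdtest output to a `git bisect run` exit code.

    min_rate is a hypothetical threshold chosen by the operator,
    for example a value between the known good and known bad means.
    """
    rate = mean_rate(mdtest_output)
    if rate is None:
        return SKIP  # benchmark did not produce a usable result
    return GOOD if rate >= min_rate else BAD
```

A wrapper script would run the benchmark for the checked-out revision, pass its output to `bisect_exit_code`, and exit with the returned value. Returning 125 for missing results keeps a failed build or an aborted run from being misread as a regression.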
      
      

      Here is the test configuration:
      1 x MDS (2 x E5-2690 v3, 128GB memory)
      32 x Client (2 x E5-2650, 128GB memory)
      4 x OSS and 40 x OST
      RHEL6.5

      1. mpirun -np 128 -ppn 4 -hostfile ./hostfile.32 mdtest -n 5000 -v -d /scratch/mdtest.out -p 30 -i 3 -F
      [e19b51372ad94818a7a79b1fbae5b55c665ba59f] LU-4196 build: Reenable OFED-3.5 support on SLES11
      SUMMARY: (of 3 iterations)
         Operation                      Max            Min           Mean        Std Dev
         ---------                      ---            ---           ----        -------
         File creation     :      88017.734      82493.520      84509.613       2489.825
         File stat         :     151997.740     142649.444     148656.020       4256.270
         File read         :     162847.716     154605.697     158734.759       3364.809
         File removal      :      84993.971      78063.127      80600.677       3118.973
         Tree creation     :       3692.169       2931.030       3262.108        318.519
         Tree removal      :         51.245         47.758         49.678          1.446
      V-1: Entering timestamp...
      
      
      [5f3e926ac9ff8ad134ad920d0e8545e16395ef3b] LU-3529 lod: create striped directory
      SUMMARY: (of 3 iterations)
         Operation                      Max            Min           Mean        Std Dev
         ---------                      ---            ---           ----        -------
         File creation     :      66695.074      66242.158      66524.596        201.138
         File stat         :     152026.405     143681.866     148948.529       3741.740
         File read         :     165470.364     163291.085     164307.211        895.740
         File removal      :      86953.641      82117.776      84285.373       2005.726
         Tree creation     :       4165.148       2841.669       3603.150        558.417
         Tree removal      :         59.119         52.690         55.581          2.664
      V-1: Entering timestamp...
      
      

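      Even without DNE, we lose about 25% of file creation performance: the mean rate drops from 84509.613 to 66524.596 (about 21%) and the maximum from 88017.734 to 66695.074 (about 24%), while the other operations are unchanged or slightly faster.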
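
The comparison between the two mdtest summaries can be checked with a few lines of Python. This is a rough sketch: the helper names `parse_summary` and `percent_change` are made up for illustration, and the two tables are copied from the results above.

```python
import re

# One SUMMARY row: "<operation> : <max> <min> <mean> <std dev>"
ROW = re.compile(
    r"^\s*(?P<op>[A-Za-z ]+?)\s*:\s+(?P<max>[\d.]+)\s+(?P<min>[\d.]+)"
    r"\s+(?P<mean>[\d.]+)\s+(?P<std>[\d.]+)\s*$"
)

BEFORE = """
   File creation     :      88017.734      82493.520      84509.613       2489.825
   File stat         :     151997.740     142649.444     148656.020       4256.270
   File read         :     162847.716     154605.697     158734.759       3364.809
   File removal      :      84993.971      78063.127      80600.677       3118.973
   Tree creation     :       3692.169       2931.030       3262.108        318.519
   Tree removal      :         51.245         47.758         49.678          1.446
"""

AFTER = """
   File creation     :      66695.074      66242.158      66524.596        201.138
   File stat         :     152026.405     143681.866     148948.529       3741.740
   File read         :     165470.364     163291.085     164307.211        895.740
   File removal      :      86953.641      82117.776      84285.373       2005.726
   Tree creation     :       4165.148       2841.669       3603.150        558.417
   Tree removal      :         59.119         52.690         55.581          2.664
"""


def parse_summary(text):
    """Return {operation: mean rate} for every row that looks like a result."""
    rates = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rates[match.group("op")] = float(match.group("mean"))
    return rates


def percent_change(before, after):
    """Relative change of the mean rate, in percent, per shared operation."""
    return {
        op: (after[op] - before[op]) / before[op] * 100.0
        for op in before
        if op in after
    }


changes = percent_change(parse_summary(BEFORE), parse_summary(AFTER))
for op, change in changes.items():
    print(f"{op:15s} {change:+.1f}%")
```

Only "File creation" comes out negative, at about -21.3% by mean; the remaining operations show small positive changes.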

          People

            laisiyao Lai Siyao
            ihara Shuichi Ihara (Inactive)