Lustre / LU-11999

DNE performance improvement

Details

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.13.0, Lustre 2.12.1

    Description

      In one case, DNE creates files under a remote directory slowly. When a remote directory has been created and a process has to walk the path to create files under it, excessive UPDATE lock requests and revocations are seen. To reproduce the problem:

       # lfs mkdir -i 0 /mnt/lustre/dir1
       # lfs mkdir -i 1 /mnt/lustre/dir1/dir2
       # touch /mnt/lustre/dir1/dir2/f1
       # touch /mnt/lustre/dir1/dir2/f2
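
      Two creates are not enough to make the slowdown visible, so a timing loop can help. The following is a minimal sketch rather than part of the ticket: time_creates() is a hypothetical helper that uses only POSIX calls, creates a batch of empty files by full path, and returns the elapsed time.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the ticket): create `count` empty files
 * named f0 .. f<count-1> under `dir` and return the elapsed wall-clock
 * time in seconds, or -1.0 on the first failure.
 */
static double time_creates(const char *dir, int count)
{
	struct timespec start, end;
	char path[4096];

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (int i = 0; i < count; i++) {
		snprintf(path, sizeof(path), "%s/f%d", dir, i);
		/* Opening by full path makes the client walk the path,
		 * as the touch commands in the reproducer do. */
		int fd = open(path, O_CREAT | O_WRONLY, 0644);
		if (fd < 0) {
			perror(path);
			return -1.0;
		}
		close(fd);
	}
	clock_gettime(CLOCK_MONOTONIC, &end);

	return (double)(end.tv_sec - start.tv_sec) +
	       (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

      Calling time_creates("/mnt/lustre/dir1/dir2", 1000) and then running the same call against a directory that lives on the same MDT as its parent should give a rough comparison. The count of 1000 is an arbitrary choice.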
       

      Each time a file is created under /mnt/lustre/dir1/dir2, the client walks the path and, because this is a remote directory, requests an UPDATE lock in lmv_intent_remote(), as the code shows:

              /*
               * Unfortunately, we have to lie to MDC/MDS to retrieve
               * attributes llite needs and provide proper locking.
               */
              if (it->it_op & IT_LOOKUP)
                      it->it_op = IT_GETATTR;
      

      The code above causes an UPDATE + PERM lock to be returned.

      Then, when the create intent RPC is sent to the MDS, the MDT revokes the UPDATE lock. This request-and-revoke pattern repeats for as long as files keep being created.

      This piece of code was written before the PERM lock was introduced, and it causes drastic performance degradation. Now that the PERM lock is in place, the code is no longer required.
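
      The request-and-revoke cycle can be illustrated with a toy model. This is a sketch and not Lustre code: the names toy_stats, simulate(), BIT_UPDATE and BIT_PERM are made up for illustration. It also assumes that, without the override, the path walk obtains a lock with no UPDATE bit that survives later creates, which the description implies but does not state outright.

```c
#include <stdbool.h>

/* Lock bits in this toy model; the names echo the ticket, the values
 * are invented and do not match any Lustre definition. */
enum { BIT_UPDATE = 1 << 0, BIT_PERM = 1 << 1 };

struct toy_stats {
	int lock_requests;	/* lock requests sent during path walks */
	int revocations;	/* cached locks revoked by a create */
};

/*
 * Simulate `creates` file creations under one remote directory.
 * lookup_as_getattr == true models the old behaviour, where the lookup
 * is turned into a getattr and UPDATE + PERM is granted.
 * ASSUMPTION: with false, only PERM is granted and a create leaves it alone.
 */
static struct toy_stats simulate(int creates, bool lookup_as_getattr)
{
	struct toy_stats st = { 0, 0 };
	unsigned int cached = 0;	/* lock bits held on the directory */

	for (int i = 0; i < creates; i++) {
		/* Path walk: ask for a lock only if none is cached. */
		if (cached == 0) {
			cached = lookup_as_getattr ?
				 (BIT_UPDATE | BIT_PERM) : BIT_PERM;
			st.lock_requests++;
		}
		/* Create: the directory changes, so a cached lock that
		 * carries UPDATE is revoked as a whole in this model. */
		if (cached & BIT_UPDATE) {
			cached = 0;
			st.revocations++;
		}
	}
	return st;
}
```

      Under these assumptions, simulate(1000, true) counts one lock request and one revocation per create, while simulate(1000, false) counts a single lock request and no revocations.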

      Thanks to Di for providing insight into the DNE code.

      A patch and test are coming soon.

            People

              Jinshan Xiong