[LU-10262] Lock contention when doing creates for the same name

Details

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.14.0, Lustre 2.12.7

    Description

      It has been frequently documented that programs (e.g. Fortran ones) in which many threads write into the same file are slow. The reason is that every time a file is opened for write in Fortran, the O_CREAT flag is added to the open. We currently decide the parent locking mode based on the open flags only, so O_CREAT makes the locking much less cooperative: when the name is the same, basically every thread is bottlenecked on that same lock as the opens are processed.

      A similar problem exists for other kinds of creates, such as mkdir.

      For the open-create case it's in mdt_reint_open():

      again:
              lh = &info->mti_lh[MDT_LH_PARENT];
              mdt_lock_pdo_init(lh,
                                (create_flags & MDS_OPEN_CREAT) ? LCK_PW : LCK_PR,
                                &rr->rr_name);
      
              parent = mdt_object_find(info->mti_env, mdt, rr->rr_fid1);
              if (IS_ERR(parent))
                      GOTO(out, result = PTR_ERR(parent));
      
              result = mdt_object_lock(info, parent, lh, MDS_INODELOCK_UPDATE);
      

      There is similar code for mkdir/mknod/... in mdt_create():

              lh = &info->mti_lh[MDT_LH_PARENT];
              mdt_lock_pdo_init(lh, LCK_PW, &rr->rr_name);
              rc = mdt_object_lock(info, parent, lh, MDS_INODELOCK_UPDATE);
      

      It looks like we should be able to do a lockless lookup of the name in the parent (internal fs locking should ensure that the parent does not disappear from under us), then relock the parent with the desired read/write mode and redo the lookup. If the file has disappeared by then and we hold the parent in PR mode, we need to drop the PR lock, reobtain the lock in PW mode, and try again.

      There is still a risk: if many threads do this for a nonexistent file, some number of them will still be stuck on the same ldlm lock, but hopefully fewer than before.
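The relock-and-relookup idea described above can be sketched as a toy C model. This is only an illustration under stated assumptions: `toy_dir`, `toy_open`, `toy_unlink`, `toy_race_fn`, and the `TOY_LCK_*` modes are hypothetical stand-ins, not Lustre's MDT or ldlm API, and the parent lock is modeled as a pair of counters rather than a real lock. It follows the proposal in the description and is not the merged patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy stand-ins (assumptions): not the real ldlm modes or MDT structures. */
enum toy_lock_mode { TOY_LCK_PR, TOY_LCK_PW };

#define TOY_DIR_MAX 8

struct toy_dir {
	const char *names[TOY_DIR_MAX];
	int count;
	int pr_locks;	/* times the parent was "locked" in PR mode */
	int pw_locks;	/* times the parent was "locked" in PW mode */
};

/* Hypothetical hook simulating a concurrent change that lands between
 * the lockless lookup and the locked relookup. */
typedef void (*toy_race_fn)(struct toy_dir *dir, const char *name);

static bool toy_lookup(const struct toy_dir *dir, const char *name)
{
	for (int i = 0; i < dir->count; i++)
		if (strcmp(dir->names[i], name) == 0)
			return true;
	return false;
}

/* Remove a name if present; has the toy_race_fn signature. */
static void toy_unlink(struct toy_dir *dir, const char *name)
{
	for (int i = 0; i < dir->count; i++) {
		if (strcmp(dir->names[i], name) == 0) {
			dir->names[i] = dir->names[--dir->count];
			return;
		}
	}
}

/* Returns 0 on success, -1 if the name is missing and cannot be created. */
static int toy_open(struct toy_dir *dir, const char *name, bool o_creat,
		    toy_race_fn race, bool *created)
{
	enum toy_lock_mode mode;

	assert(dir->count <= TOY_DIR_MAX);
	*created = false;

	/* Step 1: lockless lookup. Only ask for PW if we expect to create. */
	mode = (o_creat && !toy_lookup(dir, name)) ? TOY_LCK_PW : TOY_LCK_PR;

	if (race != NULL)
		race(dir, name);

	for (;;) {
		/* Step 2: take the parent lock in the chosen mode. */
		if (mode == TOY_LCK_PW)
			dir->pw_locks++;
		else
			dir->pr_locks++;

		/* Step 3: redo the lookup under the lock. */
		if (toy_lookup(dir, name))
			return 0;
		if (!o_creat)
			return -1;
		if (mode == TOY_LCK_PW) {
			if (dir->count == TOY_DIR_MAX)
				return -1;
			dir->names[dir->count++] = name;
			*created = true;
			return 0;
		}
		/* Name vanished while we held PR: drop it and retry with PW. */
		mode = TOY_LCK_PW;
	}
}
```

In this model, an O_CREAT open of an existing name only ever takes the PR mode, which is the case the description is trying to make cheap. Passing `toy_unlink` as the `race` argument exercises the retry path, where the name disappears after the lockless lookup and the open falls back to PW.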

          Activity

            adilger Andreas Dilger made changes -
            Link New: This issue is related to DDN-3129 [ DDN-3129 ]
            adilger Andreas Dilger made changes -
            Resolution New: Fixed [ 1 ]
            Status Original: Open [ 1 ] New: Resolved [ 5 ]

            adilger Andreas Dilger added a comment - This bug was fixed in 2.14.0 and the patch was backported to 2.12.7.
            adilger Andreas Dilger made changes -
            Fix Version/s New: Lustre 2.12.7 [ 14793 ]
            adilger Andreas Dilger made changes -
            Fix Version/s New: Lustre 2.14.0 [ 14490 ]
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-15546 [ LU-15546 ]

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/41172/
            Subject: LU-10262 mdt: mdt_reint_open: check EEXIST without lock
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set:
            Commit: 515255019d4589354eb3bf393dabc689fc37407c


            Etienne AUJAMES (eaujames@ddn.com) uploaded a new patch: https://review.whamcloud.com/41172
            Subject: LU-10262 mdt: mdt_reint_open: check EEXIST without lock
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: dca678c450d517a3fce631eac076bc1ea4c502d7


            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33098/
            Subject: LU-10262 mdt: mdt_reint_open: check EEXIST without lock
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 33dc40d58ef6eb8b384fce1da9f8d21cad4ef6d8


            adilger Andreas Dilger added a comment - Thanks for the update. Could you please put a condensed version of these results into the commit message of the patch: a simple table of the absolute times without/with the patch, taken from the last table, with a last column showing the % improvement.

            People

              eaujames Etienne Aujames
              green Oleg Drokin