Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.10.0

    Description

      mdt_path_current() looks up an object with mdt_object_find() while the caller already holds a reference to that same object, which can deadlock in lu_object_find_at():

      static int mdt_path_current(struct mdt_thread_info *info,
                                  struct mdt_object *obj,
                                  struct getinfo_fid2path *fp)
      {
              struct mdt_device       *mdt = info->mti_mdt;
              struct mdt_object       *mdt_obj;
              struct link_ea_header   *leh;
              struct link_ea_entry    *lee;
              struct lu_name          *tmpname = &info->mti_name;
              struct lu_fid           *tmpfid = &info->mti_tmp_fid1;
              struct lu_buf           *buf = &info->mti_big_buf;
              char                    *ptr;
              int                     reclen;
              struct linkea_data      ldata = { NULL };
              int                     rc = 0;
              bool                    first = true;
              ENTRY;
      
              /* temp buffer for path element, the buffer will be finally freed
               * in mdt_thread_info_fini */
              buf = lu_buf_check_and_alloc(buf, PATH_MAX);
              if (buf->lb_buf == NULL)
                      RETURN(-ENOMEM);
      
              ldata.ld_buf = buf;
              ptr = fp->gf_path + fp->gf_pathlen - 1;
              *ptr = 0;
              --ptr;
              *tmpfid = fp->gf_fid = *mdt_object_fid(obj);
      
              /* root FID only exists on MDT0, and fid2path should also ends at MDT0,
               * so checking root_fid can only happen on MDT0. */
              while (!lu_fid_eq(&mdt->mdt_md_root_fid, &fp->gf_fid)) {
                      struct lu_buf           lmv_buf;
      
                      mdt_obj = mdt_object_find(info->mti_env, mdt, tmpfid);
                      ...
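
The general shape of a fix can be sketched outside of Lustre. The toy model below is only an illustration under stated assumptions: `toy_obj`, `toy_cache`, `toy_find()`, `toy_get()`, `toy_put()` and `toy_path_depth()` are hypothetical stand-ins, not Lustre APIs, and the merged patch may be structured differently. It shows one way to avoid the second lookup: on the first pass through the loop, take an extra reference on the object the caller already holds, and only go through the cache for the ancestors. The sketch counts lookups; it does not model the deadlock itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the Lustre types; all names here are hypothetical. */
struct toy_obj {
	int id;        /* plays the role of the FID */
	int parent_id; /* id of the parent directory */
	int refcount;  /* references currently held */
};

struct toy_cache {
	struct toy_obj *objs;
	size_t          count;
	int             finds; /* number of cache lookups performed */
};

/* Look an object up by id and take a reference (like a "find" call). */
static struct toy_obj *toy_find(struct toy_cache *c, int id)
{
	c->finds++;
	for (size_t i = 0; i < c->count; i++) {
		if (c->objs[i].id == id) {
			c->objs[i].refcount++;
			return &c->objs[i];
		}
	}
	return NULL;
}

/* Take an extra reference on an object that is already held. */
static void toy_get(struct toy_obj *o)
{
	o->refcount++;
}

/* Drop one reference. */
static void toy_put(struct toy_obj *o)
{
	assert(o->refcount > 0);
	o->refcount--;
}

/*
 * Walk from obj up to root_id and return the number of path components
 * visited, or -1 if an ancestor is missing from the cache.  The caller
 * is assumed to hold a reference on obj for the whole call.
 */
static int toy_path_depth(struct toy_cache *c, struct toy_obj *obj,
			  int root_id)
{
	int  cur_id = obj->id;
	int  depth = 0;
	bool first = true;

	while (cur_id != root_id) {
		struct toy_obj *cur;

		if (first) {
			/* Already held by the caller: no second lookup. */
			cur = obj;
			toy_get(cur);
			first = false;
		} else {
			cur = toy_find(c, cur_id);
			if (cur == NULL)
				return -1;
		}

		cur_id = cur->parent_id;
		toy_put(cur);
		depth++;
	}

	return depth;
}
```

With a three-level chain (root, directory, file) and the caller holding one reference on the file, `toy_path_depth()` performs a single cache lookup, for the directory, and leaves the caller's reference on the file untouched.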
      

      One way to see a hang from this is to enable HSM and do:

      # cd /mnt/lustre
      # while true; do
          echo XXX > f0
          lfs hsm_archive f0
          sys_unlink f0
      done
      

      Note that in the archive path the copytool (CT) uses the fid2path ioctl for debug messages. In the restore path it uses the fid2path ioctl to find the parent directory of the file being restored when creating the volatile file.

          Activity

            [LU-8821] double find in mdt_path_current()
            pjones Peter Jones added a comment -

            Landed for 2.10


            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/23701/
            Subject: LU-8821 mdt: avoid double find in mdt_path_current()
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: c6383473e74262eaf8f822dcb6b28b22b130f364

            rread Robert Read added a comment -

            FWIW, Lemur doesn't use fid2path in archive path (we just print FIDs in debug messages), and it is liblustreapi_hsm.c that is using fid2path in restore path, so out of our control currently.

            I wonder if we can avoid the fid2path in restore by using the parent fid from the lsm xattr and then openat() to create the recovery file.
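
The openat() half of that suggestion can be sketched in plain POSIX C. This is a minimal illustration, not liblustreapi code: `create_in_parent()` is a hypothetical helper, and the parent directory is opened by path here only because the sketch has no Lustre mount to work with; in the scheme described above the directory descriptor would instead come from opening the parent by its FID.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a new file called name inside the
 * directory parent_path and return a writable descriptor, or -1 on
 * error.  Once the directory descriptor is open, the file is created
 * relative to it, so no full path to the new file is ever built.
 */
static int create_in_parent(const char *parent_path, const char *name)
{
	int dirfd;
	int fd;

	assert(parent_path != NULL && name != NULL);

	/* Stand-in for opening the parent directory by FID. */
	dirfd = open(parent_path, O_RDONLY | O_DIRECTORY);
	if (dirfd < 0)
		return -1;

	/* O_EXCL: fail instead of reusing an existing file. */
	fd = openat(dirfd, name, O_CREAT | O_EXCL | O_WRONLY, 0600);

	close(dirfd);
	return fd;
}
```

The point of the design is that only a descriptor for the parent directory is needed, so a path lookup such as fid2path would not be required just to place the new file.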


            gerrit Gerrit Updater added a comment -

            John L. Hammond (john.hammond@intel.com) uploaded a new patch: http://review.whamcloud.com/23701
            Subject: LU-8821 mdt: avoid double find in mdt_path_current()
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 302192da25bc9e53735a5f81895af862eba78ddf


            People

              jhammond John Hammond