Lustre / LU-4429

clients leaking open handles/bad lock matching in ll_md_blocking_ast



    Description

      In ll_md_blocking_ast() we try to avoid calling ll_md_real_close() by looking for another OPEN lock of the same mode on the file.

      case LDLM_CB_CANCELING: {
          struct inode *inode = ll_inode_from_resource_lock(lock);
          __u64 bits = lock->l_policy_data.l_inodebits.bits;
          ...
      
          if (bits & MDS_INODELOCK_XATTR)
              ll_xattr_cache_destroy(inode);
      
          /* For OPEN locks we differentiate between lock modes
           * LCK_CR, LCK_CW, LCK_PR - bug 22891 */
          if (bits & (MDS_INODELOCK_LOOKUP | MDS_INODELOCK_UPDATE |
                      MDS_INODELOCK_LAYOUT | MDS_INODELOCK_PERM))
              ll_have_md_lock(inode, &bits, LCK_MINMODE);
      
          if (bits & MDS_INODELOCK_OPEN)
              ll_have_md_lock(inode, &bits, mode);
      
          ...
          if (bits & MDS_INODELOCK_OPEN) {
              int flags = 0;
              switch (lock->l_req_mode) {
              case LCK_CW:
                  flags = FMODE_WRITE;
                  break;
              case LCK_PR:
                  flags = FMODE_EXEC;
                  break;
              case LCK_CR:
                  flags = FMODE_READ;
                  break;
              ...
              ll_md_real_close(inode, flags);
      }
      

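      However, the ll_have_md_lock(inode, &bits, LCK_MINMODE) call may match a lock that happens to include MDS_INODELOCK_OPEN but has an inappropriate mode. Since bits is passed by pointer and the bits of a matched lock are cleared from it, the later if (bits & MDS_INODELOCK_OPEN) test then fails. This prevents ll_md_real_close() from being called when it should be and leaves a stale obd_client_handle in the lli.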
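
      The failure mode can be illustrated with a small standalone model. This is only a sketch, not Lustre code: the sketch_* names, the mode and bit values, and the flat array of held locks are all hypothetical, and it assumes that a match clears every bit carried by the matched lock, as the description implies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; NOT the real Lustre definitions or values. */
enum sketch_mode { SK_CR = 1, SK_CW = 2, SK_PR = 4, SK_ANY = 7 };
#define SK_BIT_LOOKUP 0x1u
#define SK_BIT_UPDATE 0x2u
#define SK_BIT_OPEN   0x4u

struct sketch_lock {
    enum sketch_mode mode;  /* mode the lock is held in */
    uint64_t bits;          /* inodebits covered by the lock */
};

/* Model of the matching step (assumption): any held lock with an
 * acceptable mode that shares a wanted bit is a match, and ALL of
 * that lock's bits are then cleared from *bits. */
static void sketch_have_md_lock(const struct sketch_lock *held, size_t n,
                                uint64_t *bits, int mode_mask)
{
    for (size_t i = 0; i < n && *bits != 0; i++) {
        if ((held[i].mode & mode_mask) && (held[i].bits & *bits))
            *bits &= ~held[i].bits;
    }
}

/* Returns 1 if the real close would run for the cancelled lock. */
static int sketch_close_needed(const struct sketch_lock *held, size_t n,
                               const struct sketch_lock *cancelled)
{
    uint64_t bits = cancelled->bits;

    /* Any-mode match for the non-OPEN bits. */
    if (bits & (SK_BIT_LOOKUP | SK_BIT_UPDATE))
        sketch_have_md_lock(held, n, &bits, SK_ANY);

    /* Same-mode match for the OPEN bit, if it is still set. */
    if (bits & SK_BIT_OPEN)
        sketch_have_md_lock(held, n, &bits, cancelled->mode);

    return (bits & SK_BIT_OPEN) != 0;
}

/* Two scenarios for a cancelled CW lock covering LOOKUP | OPEN. */
void sketch_demo(void)
{
    struct sketch_lock cancelled = { SK_CW, SK_BIT_LOOKUP | SK_BIT_OPEN };
    struct sketch_lock cr_open   = { SK_CR, SK_BIT_LOOKUP | SK_BIT_OPEN };
    struct sketch_lock cr_lookup = { SK_CR, SK_BIT_LOOKUP };

    /* A CR lock that also carries OPEN swallows the OPEN bit during the
     * any-mode pass, so the close is skipped although no CW OPEN lock
     * remains. */
    assert(sketch_close_needed(&cr_open, 1, &cancelled) == 0);

    /* A CR lock carrying only LOOKUP leaves OPEN set; the close runs. */
    assert(sketch_close_needed(&cr_lookup, 1, &cancelled) == 1);
}
```

      In this model the first scenario corresponds to the leak: the any-mode pass matches a lock of a different mode and clears the OPEN bit before the mode-specific check is reached.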

      That handles really are being leaked is easy to see with the patch http://review.whamcloud.com/#/c/6386/ from LU-946. With that patch applied, run:

      # llmount.sh
      ...
      # DURATION=10 sh ./lustre/tests/racer.sh
      ...
      # lsof /mnt/lustre
      # lctl set_param ldlm.namespaces.*mdc*.lru_size=clear
      # lctl get_param ldlm.namespaces.*mdc*.lru_size
      # lctl dk > 1.dk
      # cat /proc/fs/lustre/mdt/lustre-MDT0000/exports/0\@lo/open_files
      [0x200000400:0x9d1:0x0] 04240000001 0xbd7c9c99fdb17cfc
      [0x200000401:0x893:0x0] 04240000001 0xbd7c9c99fdad82ad
      

      Note that I have modified the patch to also print the flags and cookie of the MFD.

            People

              jhammond John Hammond