Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-2487

2.2 Client deadlock between ll_md_blocking_ast, sys_close, and sys_open

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Minor
    • None
    • Lustre 2.2.0
    • 2
    • 5837

    Description

      Spinlock Usage

      Walking through the code and digging through pointers in the stack frames of the 3 threads leads to the following 3 suspect structures and their related spinlocks:

      inode = 0xffff880581a26638 (i_lock)
      dentry = 0xffff88051f7242c0 (d_lock)
      ldlm_resource = 0xffff880308f916c0 (lr_lock)

      Looks like there is a 3 way deadlock between ll_md_blocking_ast, sys_open, and sys_close using the above spinlocks.

      CPU 10: ll_md_blocking_ast()
      + ll_inode_from_resource() gets lock-l_resource->lr_lock

      • igrab() wants lock->l_resource->lr_lvb_inode->i_lock

      CPU 7: sys_close()
      + dput() gets dentry->d_lock

      • ldlm_resource_foreach() wants res->lr_lock

      CPU 15: sys_open
      + ll_splice_alias()/ll_find_alias() gets inode->i_lock

      • ll_splice_alias()/ll_find_alias() wants dentry->d_lock

      The lr_lock, d_lock, and i_lock are the same in all cases. So ll_md_blocking_ast() waits for sys_open(), sys_open waits for sys_close(), and sys_close() waits for ll_md_blocking_ast. A 3-way deadlock.

      This deadlock is possible because of two pathces:
      1) LU-903(not landed yet) (bz24555) - get inode lock. If we get ldlm_lock lock thin no deadlock possible.
      resource dentry inode
      ldlm_lock resource dentry
      2) dcache_lock removing (http://review.whamcloud.com/#change,1865). Before this patch (with bz24555 enabled)also no deadlock possible.
      resource dentry inode
      inode resource dcache_lock

      Attachments

        Issue Links

          Activity

            [LU-2487] 2.2 Client deadlock between ll_md_blocking_ast, sys_close, and sys_open

            I think the comment you added in the LU-903 patch should be enough for now. Thanks for the update.

            I'm going to close this as Cannot Reproduce, since the problem never existed in any of the public Lustre releases.

            adilger Andreas Dilger added a comment - I think the comment you added in the LU-903 patch should be enough for now. Thanks for the update. I'm going to close this as Cannot Reproduce, since the problem never existed in any of the public Lustre releases.

            Can we close this issue with "wan't fix"?

            artem_blagodarenko Artem Blagodarenko (Inactive) added a comment - Can we close this issue with "wan't fix"?

            Yes, we do not need this patch, because one of deadlock's branches is disabled in current master. But I have added comment "LU-2487 need to be resolved before branch can be enabled again" to LU-903 patch to prevent possible deadlock if somebody enable that branch.

            artem_blagodarenko Artem Blagodarenko (Inactive) added a comment - Yes, we do not need this patch, because one of deadlock's branches is disabled in current master. But I have added comment " LU-2487 need to be resolved before branch can be enabled again" to LU-903 patch to prevent possible deadlock if somebody enable that branch.

            Which client kernel is this? The new dcache_lock removal is only in use for kernels > 2.6.37, so I guess this is SLES11 SP2 (3.0)? I'm trying to see where there is a deadlock in the code, because your original comment is not showing the callpath to the function getting the second lock. I'm guessing something like:

            CPU 10: ll_md_blocking_ast()
               + ll_inode_from_resource()/ll_inode_from_lock() gets lock->l_lock and lock->l_resource->lr_lock
               - igrab() wants lock->l_resource->lr_lvb_inode->i_lock
            
            CPU 7: sys_close()
               + dput() gets dentry->d_lock
               - ll_ddelete->find_cbdata->ldlm_resource_foreach() wants res->lr_lock
            
            CPU 15: sys_open
               + ll_splice_alias()/ll_find_alias() gets inode->i_lock 
               - ll_splice_alias()/ll_find_alias() wants dentry->d_lock
            

            The ll_ddelete->find_cbdata() path has been disabled in 9f3469f1:

                    /* Disable this piece of code temproarily because this is called
                     * inside dcache_lock so it's not appropriate to do lots of work
                     * here. */
            #if 0
                    /* if not ldlm lock for this inode, set i_nlink to 0 so that
                     * this inode can be recycled later b=20433 */
                    if (de->d_inode && !find_cbdata(de->d_inode))
                            clear_nlink(de->d_inode);
            #endif
            

            So it seems there is no need for this complex patch?

            adilger Andreas Dilger added a comment - Which client kernel is this? The new dcache_lock removal is only in use for kernels > 2.6.37, so I guess this is SLES11 SP2 (3.0)? I'm trying to see where there is a deadlock in the code, because your original comment is not showing the callpath to the function getting the second lock. I'm guessing something like: CPU 10: ll_md_blocking_ast() + ll_inode_from_resource()/ll_inode_from_lock() gets lock->l_lock and lock->l_resource->lr_lock - igrab() wants lock->l_resource->lr_lvb_inode->i_lock CPU 7: sys_close() + dput() gets dentry->d_lock - ll_ddelete->find_cbdata->ldlm_resource_foreach() wants res->lr_lock CPU 15: sys_open + ll_splice_alias()/ll_find_alias() gets inode->i_lock - ll_splice_alias()/ll_find_alias() wants dentry->d_lock The ll_ddelete->find_cbdata() path has been disabled in 9f3469f1: /* Disable this piece of code temproarily because this is called * inside dcache_lock so it's not appropriate to do lots of work * here. */ # if 0 /* if not ldlm lock for this inode, set i_nlink to 0 so that * this inode can be recycled later b=20433 */ if (de->d_inode && !find_cbdata(de->d_inode)) clear_nlink(de->d_inode); #endif So it seems there is no need for this complex patch?

            Xyratex MRP-675

            artem_blagodarenko Artem Blagodarenko (Inactive) added a comment - Xyratex MRP-675

            I set minor priority because race is possible only with patch from LU-903. I already have patch, but need to rebase it ot master.

            artem_blagodarenko Artem Blagodarenko (Inactive) added a comment - I set minor priority because race is possible only with patch from LU-903 . I already have patch, but need to rebase it ot master.

            People

              adilger Andreas Dilger
              artem_blagodarenko Artem Blagodarenko (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: