Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-4725

wrong lock ordering in rename leads to deadlocks

Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • Lustre 2.6.0, Lustre 2.5.4
    • Lustre 2.1.6, Lustre 2.6.0, Lustre 2.5.2, Lustre 2.4.3
    • 3
    • 12984

    Description

      the current rename code locks objects in the order: src parent, dst parent, src child, dst child. it may happen that dst is a parent of src, what may lead to deadlock.

      example from a core dump:
      res1 - dst parent
      res2 - dst parent, PDO
      res3 - src parent

      Thread 1 (T1), rename:
      Has RES3 (CW,0x2)
      Wants RES1 (CW,0x2)

      Thread 2 (T2), getattr:
      Has RES1(CR,0x2)
      Has RES2(PR,0x2)
      Wants RES3(PR,0x2) - blocked by T1

      Thread 3 (T3), create or open|create
      Has RES1(CW,0x2)
      Wants RES2(PW,0x2) - blocked by T2

      Thread4 (T4), getattr or similar
      Wants RES1(PR,0x2) - blocked by T3

      T1 has no conflicts, but is sitting in the waiting queue behind T4, thus not granted.

      Attachments

        Issue Links

          Activity

            [LU-4725] wrong lock ordering in rename leads to deadlocks
            yujian Jian Yu added a comment -

            Just combined the above two patches into one: http://review.whamcloud.com/11615

            yujian Jian Yu added a comment - Just combined the above two patches into one: http://review.whamcloud.com/11615

            The result for first patch shows two tests, this because first patch is not complete.
            All 19 tests pass for second patch. (I was not sure where to mention it in gerrit, so updating here).

            http://review.whamcloud.com/#/c/10916/
            http://review.whamcloud.com/#/c/10917/

            rdeshmukh_xyratex Rahul Deshmukh (Inactive) added a comment - The result for first patch shows two tests, this because first patch is not complete. All 19 tests pass for second patch. (I was not sure where to mention it in gerrit, so updating here). http://review.whamcloud.com/#/c/10916/ http://review.whamcloud.com/#/c/10917/

            I have back ported LU-4725 patches to b2_5 and here are the links:

            http://review.whamcloud.com/#/c/10916/
            http://review.whamcloud.com/#/c/10917/

            Not sure if I need to create new bug for this? Please help.

            rdeshmukh_xyratex Rahul Deshmukh (Inactive) added a comment - I have back ported LU-4725 patches to b2_5 and here are the links: http://review.whamcloud.com/#/c/10916/ http://review.whamcloud.com/#/c/10917/ Not sure if I need to create new bug for this? Please help.

            Patches landed to Master. Closing this ticket and LU-5144 as the patch moved to that ticket was also landed.

            jlevi Jodi Levi (Inactive) added a comment - Patches landed to Master. Closing this ticket and LU-5144 as the patch moved to that ticket was also landed.

            the last patch is moved to a separate ticket: LU-5144

            vitaly_fertman Vitaly Fertman added a comment - the last patch is moved to a separate ticket: LU-5144

            independently of the previous fixes, but still related to rename, another deadlock is found:
            http://review.whamcloud.com/10570

            vitaly_fertman Vitaly Fertman added a comment - independently of the previous fixes, but still related to rename, another deadlock is found: http://review.whamcloud.com/10570
            vitaly_fertman Vitaly Fertman added a comment - alternatives for the last problem: http://review.whamcloud.com/10342 http://review.whamcloud.com/10484
            green Oleg Drokin added a comment -

            Sothe current deadlocks in rename I am observinghave a trace like this:

            [24234.786010] Call Trace:
            [24234.786802]  [<ffffffffa052a304>] ? htable_lookup+0x1c4/0x1e0 [obdclass]
            [24234.787351]  [<ffffffffa052a91b>] lu_object_find_at+0xab/0x350 [obdclass]
            [24234.787869]  [<ffffffff8105de00>] ? default_wake_function+0x0/0x20
            [24234.788398]  [<ffffffffa052abd6>] lu_object_find+0x16/0x20 [obdclass]
            [24234.788918]  [<ffffffffa0ce4e06>] mdt_object_find+0x56/0x170 [mdt]
            [24234.789480]  [<ffffffffa0d03e32>] mdt_rename_sanity+0xa2/0x4a0 [mdt]
            [24234.790006]  [<ffffffffa0d08ad8>] mdt_reint_rename_internal+0xe08/0x1a80 [mdt]
            [24234.790890]  [<ffffffffa06be630>] ? ldlm_lock_enqueue+0x1d0/0x8b0 [ptlrpc]
            [24234.791424]  [<ffffffffa0d09994>] mdt_reint_rename_or_migrate+0x244/0x660 [mdt]
            [24234.792291]  [<ffffffffa06dc6c0>] ? ldlm_blocking_ast+0x0/0x180 [ptlrpc]
            [24234.792832]  [<ffffffffa06ddd30>] ? ldlm_completion_ast+0x0/0x930 [ptlrpc]
            [24234.793414]  [<ffffffffa0d09de0>] mdt_reint_rename+0x10/0x20 [mdt]
            [24234.793959]  [<ffffffffa0d024b1>] mdt_reint_rec+0x41/0xe0 [mdt]
            [24234.795042]  [<ffffffffa0ce7e03>] mdt_reint_internal+0x4c3/0x7c0 [mdt]
            [24234.795374]  [<ffffffffa0ce868b>] mdt_reint+0x6b/0x120 [mdt]
            [24234.795915]  [<ffffffffa0767b7c>] tgt_request_handle+0x23c/0xac0 [ptlrpc]
            [24234.796475]  [<ffffffffa0718fb8>] ptlrpc_main+0xcc8/0x1950 [ptlrpc]
            [24234.797078]  [<ffffffffa07182f0>] ? ptlrpc_main+0x0/0x1950 [ptlrpc]
            [24234.797579]  [<ffffffff81098c06>] kthread+0x96/0xa0
            [24234.798041]  [<ffffffff8100c24a>] child_rip+0xa/0x20
            [24234.798495]  [<ffffffff81098b70>] ? kthread+0x0/0xa0
            [24234.798981]  [<ffffffff8100c240>] ? child_rip+0x0/0x20
            [24234.799449] 
            
            (gdb) l *mdt_rename_sanity+0xa2
            0x23e82 is in mdt_rename_sanity (/home/green/git/lustre-release-double/lustre/mdt/mdt_reint.c:1263).
            1258			return 0;
            1259	
            1260		do {
            1261			LASSERT(fid_is_sane(&dst_fid));
            1262			dst = mdt_object_find(info->mti_env, info->mti_mdt, &dst_fid);
            1263			if (!IS_ERR(dst)) {
            1264				rc = mdo_is_subdir(info->mti_env,
            1265						   mdt_object_child(dst), fid,
            1266						   &dst_fid);
            1267				mdt_object_put(info->mti_env, dst);
            (gdb) l *(mdt_reint_rename_internal+0xe08)
            0x28b28 is in mdt_reint_rename_internal (/home/green/git/lustre-release-double/lustre/mdt/mdt_reint.c:1783).
            1778			GOTO(out_unlock_parents, rc = PTR_ERR(mold));
            1779	
            1780		/* Check if @mtgtdir is subdir of @mold, before locking child
            1781		 * to avoid reverse locking. */
            1782		rc = mdt_rename_sanity(info, rr->rr_fid2, old_fid);
            1783		if (rc)
            1784			GOTO(out_put_old, rc);
            1785	
            1786		tgt_vbr_obj_set(info->mti_env, mdt_obj2dt(mold));
            1787		/* save version after locking */
            

            The apparent problem is such that while we are holding object reference on target (or it's parent) after we looked it up it got unlinked and attempted to be destroyed, but we are trying to find it again ewhile lholding one reference and get deadlocked in find_object_at waiting for all referenced to go away due to HEARD_BANSHEE being set.

            green Oleg Drokin added a comment - Sothe current deadlocks in rename I am observinghave a trace like this: [24234.786010] Call Trace: [24234.786802] [<ffffffffa052a304>] ? htable_lookup+0x1c4/0x1e0 [obdclass] [24234.787351] [<ffffffffa052a91b>] lu_object_find_at+0xab/0x350 [obdclass] [24234.787869] [<ffffffff8105de00>] ? default_wake_function+0x0/0x20 [24234.788398] [<ffffffffa052abd6>] lu_object_find+0x16/0x20 [obdclass] [24234.788918] [<ffffffffa0ce4e06>] mdt_object_find+0x56/0x170 [mdt] [24234.789480] [<ffffffffa0d03e32>] mdt_rename_sanity+0xa2/0x4a0 [mdt] [24234.790006] [<ffffffffa0d08ad8>] mdt_reint_rename_internal+0xe08/0x1a80 [mdt] [24234.790890] [<ffffffffa06be630>] ? ldlm_lock_enqueue+0x1d0/0x8b0 [ptlrpc] [24234.791424] [<ffffffffa0d09994>] mdt_reint_rename_or_migrate+0x244/0x660 [mdt] [24234.792291] [<ffffffffa06dc6c0>] ? ldlm_blocking_ast+0x0/0x180 [ptlrpc] [24234.792832] [<ffffffffa06ddd30>] ? ldlm_completion_ast+0x0/0x930 [ptlrpc] [24234.793414] [<ffffffffa0d09de0>] mdt_reint_rename+0x10/0x20 [mdt] [24234.793959] [<ffffffffa0d024b1>] mdt_reint_rec+0x41/0xe0 [mdt] [24234.795042] [<ffffffffa0ce7e03>] mdt_reint_internal+0x4c3/0x7c0 [mdt] [24234.795374] [<ffffffffa0ce868b>] mdt_reint+0x6b/0x120 [mdt] [24234.795915] [<ffffffffa0767b7c>] tgt_request_handle+0x23c/0xac0 [ptlrpc] [24234.796475] [<ffffffffa0718fb8>] ptlrpc_main+0xcc8/0x1950 [ptlrpc] [24234.797078] [<ffffffffa07182f0>] ? ptlrpc_main+0x0/0x1950 [ptlrpc] [24234.797579] [<ffffffff81098c06>] kthread+0x96/0xa0 [24234.798041] [<ffffffff8100c24a>] child_rip+0xa/0x20 [24234.798495] [<ffffffff81098b70>] ? kthread+0x0/0xa0 [24234.798981] [<ffffffff8100c240>] ? child_rip+0x0/0x20 [24234.799449] (gdb) l *mdt_rename_sanity+0xa2 0x23e82 is in mdt_rename_sanity (/home/green/git/lustre-release-double/lustre/mdt/mdt_reint.c:1263). 1258 return 0; 1259 1260 do { 1261 LASSERT(fid_is_sane(&dst_fid)); 1262 dst = mdt_object_find(info->mti_env, info->mti_mdt, &dst_fid); 1263 if (!IS_ERR(dst)) { 1264 rc = mdo_is_subdir(info->mti_env, 1265 mdt_object_child(dst), fid, 1266 &dst_fid); 1267 mdt_object_put(info->mti_env, dst); (gdb) l *(mdt_reint_rename_internal+0xe08) 0x28b28 is in mdt_reint_rename_internal (/home/green/git/lustre-release-double/lustre/mdt/mdt_reint.c:1783). 1778 GOTO(out_unlock_parents, rc = PTR_ERR(mold)); 1779 1780 /* Check if @mtgtdir is subdir of @mold, before locking child 1781 * to avoid reverse locking. */ 1782 rc = mdt_rename_sanity(info, rr->rr_fid2, old_fid); 1783 if (rc) 1784 GOTO(out_put_old, rc); 1785 1786 tgt_vbr_obj_set(info->mti_env, mdt_obj2dt(mold)); 1787 /* save version after locking */ The apparent problem is such that while we are holding object reference on target (or it's parent) after we looked it up it got unlinked and attempted to be destroyed, but we are trying to find it again ewhile lholding one reference and get deadlocked in find_object_at waiting for all referenced to go away due to HEARD_BANSHEE being set.

            Reopened because of a new problem found. http://review.whamcloud.com/10342

            adilger Andreas Dilger added a comment - Reopened because of a new problem found. http://review.whamcloud.com/10342
            pjones Peter Jones added a comment -

            Landed for 2.6

            pjones Peter Jones added a comment - Landed for 2.6

            People

              hongchao.zhang Hongchao Zhang
              vitaly_fertman Vitaly Fertman
              Votes:
              0 Vote for this issue
              Watchers:
              15 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: