[LU-15285] same dir rename deadlock Created: 29/Nov/21  Updated: 11/Oct/23  Resolved: 31/Jan/22

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.15.0
Fix Version/s: Lustre 2.15.0

Type: Bug Priority: Critical
Reporter: Oleg Drokin Assignee: Oleg Drokin
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-12125 Allow parallel rename of regular files Resolved
is related to LU-12834 MDT hung during failover Open
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

Removal of the big rename lock uncovered a deadlock situation.

When two threads race to perform opposite renames:

mv a b &
mv b a 

each thread obtains its PDO locks in source-target order, which opens up an ABBA deadlock (easily seen on master):

 Deadlock!
ungranted lock
   -- Lock: 0xffff8800b3d24b40/0x45f9a916d27c17ce (pid: 14611)
       Resource: 8589935618:1:0:13873
       Req mode: PW, grant mode: --, read: 0, write: 1
       Bits: 0x2
waits for granted lock
   -- Lock: 0xffff8800b4795200/0x45f9a916d27c17c0 (pid: 12773)
       Resource: 8589935618:1:0:13873
       Req mode: PW, grant mode: PW, read: 0, write: 1
       Bits: 0x2
that is blocked waiting on another lock
   -- Lock: 0xffff8800b4797600/0x45f9a916d27c17d5 (pid: 12773)
       Resource: 8589935618:1:0:13361
       Req mode: PW, grant mode: --, read: 0, write: 1
       Bits: 0x2
that is held by the first thread wanting the first lock in the chain
   -- Lock: 0xffff8800b4513840/0x45f9a916d27c17c7 (pid: 14611)
       Resource: 8589935618:1:0:13361
       Req mode: PW, grant mode: PW, read: 0, write: 1
       Bits: 0x2
Deadlock!
ungranted lock
   -- Lock: 0xffff8800b4797600/0x45f9a916d27c17d5 (pid: 12773)
       Resource: 8589935618:1:0:13361
       Req mode: PW, grant mode: --, read: 0, write: 1
       Bits: 0x2
waits for granted lock
   -- Lock: 0xffff8800b4513840/0x45f9a916d27c17c7 (pid: 14611)
       Resource: 8589935618:1:0:13361
       Req mode: PW, grant mode: PW, read: 0, write: 1
       Bits: 0x2
that is blocked waiting on another lock
   -- Lock: 0xffff8800b3d24b40/0x45f9a916d27c17ce (pid: 14611)
       Resource: 8589935618:1:0:13873
       Req mode: PW, grant mode: --, read: 0, write: 1
       Bits: 0x2
that is held by the first thread wanting the first lock in the chain
   -- Lock: 0xffff8800b4795200/0x45f9a916d27c17c0 (pid: 12773)
       Resource: 8589935618:1:0:13873
       Req mode: PW, grant mode: PW, read: 0, write: 1
       Bits: 0x2
    rr_opcode = REINT_RENAME,
    rr_open_handle = 0x0,
    rr_lease_handle = 0x0,
    rr_fid1 = 0xffff8800a9d81b40,
    rr_fid2 = 0xffff8800a9d81b50,
    rr_name = {
      ln_name = 0xffff8800a9d81ba0 "14",
      ln_namelen = 2
    },
    rr_tgt_name = {
      ln_name = 0xffff8800a9d81ba8 "16",
      ln_namelen = 2
    },
---
    rr_opcode = REINT_RENAME,
    rr_open_handle = 0x0,
    rr_lease_handle = 0x0,
    rr_fid1 = 0xffff8800a9d84b58,
    rr_fid2 = 0xffff8800a9d84b68,
    rr_name = {
      ln_name = 0xffff8800a9d84bb8 "16",
      ln_namelen = 2
    },
    rr_tgt_name = {
      ln_name = 0xffff8800a9d84bc0 "14",
      ln_namelen = 2
    }, 
crash> bt 12773
PID: 12773  TASK: ffff8800b827c440  CPU: 3   COMMAND: "mdt00_005"
 #0 [ffff8800b7b37920] __schedule at ffffffff817e3e22
 #1 [ffff8800b7b37988] schedule at ffffffff817e4339
 #2 [ffff8800b7b37998] ldlm_completion_ast at ffffffffa05ec3dd [ptlrpc]
 #3 [ffff8800b7b37a38] ldlm_cli_enqueue_local at ffffffffa05ea219 [ptlrpc]
 #4 [ffff8800b7b37ad8] mdt_reint_rename at ffffffffa0d53948 [mdt]
 #5 [ffff8800b7b37bf0] mdt_reint_rec at ffffffffa0d5dfb7 [mdt]
 #6 [ffff8800b7b37c18] mdt_reint_internal at ffffffffa0d32acc [mdt]
 #7 [ffff8800b7b37c58] mdt_reint at ffffffffa0d3d647 [mdt]
 #8 [ffff8800b7b37c88] tgt_request_handle at ffffffffa06852be [ptlrpc]
 #9 [ffff8800b7b37d18] ptlrpc_server_handle_request at ffffffffa06309c0 [ptlrpc]
#10 [ffff8800b7b37dd0] ptlrpc_main at ffffffffa0632559 [ptlrpc]
#11 [ffff8800b7b37ea8] kthread at ffffffff810ba114
#12 [ffff8800b7b37f50] ret_from_fork_nospec_begin at ffffffff817f1e5d
crash> bt 14611
PID: 14611  TASK: ffff8800c930b330  CPU: 1   COMMAND: "mdt00_013"
 #0 [ffff8800b7efb920] __schedule at ffffffff817e3e22
 #1 [ffff8800b7efb988] schedule at ffffffff817e4339
 #2 [ffff8800b7efb998] ldlm_completion_ast at ffffffffa05ec3dd [ptlrpc]
 #3 [ffff8800b7efba38] ldlm_cli_enqueue_local at ffffffffa05ea219 [ptlrpc]
 #4 [ffff8800b7efbad8] mdt_reint_rename at ffffffffa0d53948 [mdt]
 #5 [ffff8800b7efbbf0] mdt_reint_rec at ffffffffa0d5dfb7 [mdt]
 #6 [ffff8800b7efbc18] mdt_reint_internal at ffffffffa0d32acc [mdt]
 #7 [ffff8800b7efbc58] mdt_reint at ffffffffa0d3d647 [mdt]
 #8 [ffff8800b7efbc88] tgt_request_handle at ffffffffa06852be [ptlrpc]
 #9 [ffff8800b7efbd18] ptlrpc_server_handle_request at ffffffffa06309c0 [ptlrpc]
#10 [ffff8800b7efbdd0] ptlrpc_main at ffffffffa0632559 [ptlrpc]
#11 [ffff8800b7efbea8] kthread at ffffffff810ba114
#12 [ffff8800b7efbf50] ret_from_fork_nospec_begin at ffffffff817f1e5d 

Additionally, while looking at this code, it's not yet clear to me why we need mdt_pdir_hash_lock at all, since everything it does is also done by mdt_object_local_lock (which we reach via the mdt_object_lock_save call).



 Comments   
Comment by Gerrit Updater [ 29/Nov/21 ]

"Oleg Drokin <green@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/45676
Subject: LU-15285 mdt: fix same-dir racing rename deadlock
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 4fdbe3209492fd567ed6cee3f6c876776c41f122

Comment by Andreas Dilger [ 30/Nov/21 ]

Oleg, I was thinking about your mdt_pdir_hash_lock vs. mdt_object_local_lock comment. If they were taking exactly the same lock, wouldn't the thread deadlock on itself in that case?

Comment by Gerrit Updater [ 31/Jan/22 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45676/
Subject: LU-15285 mdt: fix same-dir racing rename deadlock
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 82ec537d8b4cc9261828f4efe6b03d8d33f38432

Comment by Peter Jones [ 31/Jan/22 ]

Landed for 2.15
