[LU-15535] deadlock on lli->lli_lsm_sem Created: 08/Feb/22 Updated: 21/Aug/23 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical |
| Reporter: | Andriy Skulysh | Assignee: | Lai Siyao |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None |
| Issue Links: | |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
<struct rw_semaphore 0xffff91737c998370> counter: 102 owner: ffff9174359adf00

PID: 233888  TASK: ffff9174748a5f00  CPU: 23  COMMAND: "mv"
 #1 [ffffbba8a4b0f588] schedule at ffffffff9594a448
 #2 [ffffbba8a4b0f598] rwsem_down_write_slowpath at ffffffff9513c57a
 #3 [ffffbba8a4b0f640] ll_update_default_lsm_md at ffffffffc1339fe3 [lustre]
 #4 [ffffbba8a4b0f678] ll_update_lsm_md at ffffffffc1340439 [lustre]
 #5 [ffffbba8a4b0f6e0] ll_update_inode at ffffffffc1344766 [lustre]
 #6 [ffffbba8a4b0f718] ll_iget at ffffffffc1357a47 [lustre]
 #7 [ffffbba8a4b0f738] ll_prep_inode at ffffffffc1346d5a [lustre]
 #8 [ffffbba8a4b0f800] ll_lookup_it_finish.constprop.28 at ffffffffc1358611 [lustre]
 #9 [ffffbba8a4b0f8e8] ll_lookup_it at ffffffffc1359cb9 [lustre]
#10 [ffffbba8a4b0fb58] ll_lookup_nd at ffffffffc135ca8c [lustre]
#11 [ffffbba8a4b0fbe0] __lookup_slow at ffffffff95325667

PID: 242859  TASK: ffff9174359adf00  CPU: 23  COMMAND: "rm"
#2 [ffffbba88ba1f7f8] schedule_timeout at ffffffff9594dad3
#3 [ffffbba88ba1f890] ldlm_completion_ast at ffffffffc10409bc [ptlrpc]
#4 [ffffbba88ba1f920] ldlm_cli_enqueue_fini at ffffffffc103fa9d [ptlrpc]
#5 [ffffbba88ba1f990] ldlm_cli_enqueue at ffffffffc10433ed [ptlrpc]
#6 [ffffbba88ba1fa38] mdc_enqueue_base at ffffffffc119d24d [mdc]
#7 [ffffbba88ba1fb30] mdc_intent_lock at ffffffffc119f1b9 [mdc]
#8 [ffffbba88ba1fbd8] mdc_read_page at ffffffffc118c42c [mdc]
#9 [ffffbba88ba1fcb8] lmv_read_page at ffffffffc12e22d0 [lmv]
#10 [ffffbba88ba1fd00] ll_get_dir_page at ffffffffc1313c5f [lustre]
#11 [ffffbba88ba1fd48] ll_dir_read at ffffffffc1313f38 [lustre]
ll_prep_md_op_data()
#12 [ffffbba88ba1fe00] ll_iterate at ffffffffc13144dc [lustre]
|
| Comments |
| Comment by Oleg Drokin [ 08/Feb/22 ] |
|
Is this from racer, or how did it come to be? |
| Comment by Patrick Farrell [ 08/Feb/22 ] |
|
What is the actual deadlock? What's the cycle here? Basically, how is 242859 waiting for 233888? Or is it something simpler, like 242859 should not be holding the semaphore at this time? |
| Comment by Andriy Skulysh [ 09/Feb/22 ] |
|
242859 takes lli_lsm_sem and sends a lock enqueue; 233888 processes the lock reply and tries to acquire lli_lsm_sem to update the directory striping.
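Below is a minimal user-space analogue of that cycle (pthreads only, not Lustre code; all names are made up): lli_lsm_sem is modelled as a rwlock and the enqueue reply as a condition variable that can only be signalled after the "mv" side has taken the rwlock, so the program hangs by design.

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_rwlock_t lsm_sem = PTHREAD_RWLOCK_INITIALIZER;
static pthread_mutex_t  mtx     = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t   reply   = PTHREAD_COND_INITIALIZER;
static int granted;

/* analogue of PID 242859 ("rm"): holds lli_lsm_sem while waiting for the
 * lock-enqueue reply */
static void *rm_thread(void *arg)
{
    (void)arg;
    pthread_rwlock_rdlock(&lsm_sem);
    pthread_mutex_lock(&mtx);
    while (!granted)
        pthread_cond_wait(&reply, &mtx);   /* never signalled: stuck here */
    pthread_mutex_unlock(&mtx);
    pthread_rwlock_unlock(&lsm_sem);
    return NULL;
}

/* analogue of PID 233888 ("mv"): must take lli_lsm_sem for write to finish
 * handling its reply, which is what would let the other side make progress */
static void *mv_thread(void *arg)
{
    (void)arg;
    sleep(1);                              /* let "rm" take the semaphore first */
    pthread_rwlock_wrlock(&lsm_sem);       /* blocks forever: "rm" holds it */
    pthread_mutex_lock(&mtx);
    granted = 1;
    pthread_cond_signal(&reply);
    pthread_mutex_unlock(&mtx);
    pthread_rwlock_unlock(&lsm_sem);
    return NULL;
}

int main(void)
{
    pthread_t a, b;

    pthread_create(&a, NULL, rm_thread, NULL);
    pthread_create(&b, NULL, mv_thread, NULL);
    pthread_join(a, NULL);                 /* never returns: the two threads deadlock */
    pthread_join(b, NULL);
    puts("not reached");
    return 0;
}

The sketch only models the client-side ordering ("hold lli_lsm_sem, then wait for an enqueue reply" versus "handle the reply, then take lli_lsm_sem"); in the real case the grant additionally depends on the MDS resolving the conflicting locks. |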
| Comment by Lai Siyao [ 10/Feb/22 ] |
|
IMO mdc_read_page() doesn't need to enqueue a lock when reading a directory page, because the lock doesn't guarantee anything: once the read finishes and the lock is released, the real directory content may change on the server side at any time. |
| Comment by Lai Siyao [ 12/Feb/22 ] |
|
mdc_read_page() could revalidate the lock first; if there is a lock already (quite likely, because readdir will getattr first), it can go on reading the dir page without a new enqueue. After reading the page it revalidates the lock again: if the lock is still valid and the lock handle is unchanged, the page just read can be kept in the page cache, otherwise the page is discarded.
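A self-contained sketch of that "check before, check after" flow (plain C for illustration; dir_state, lock_match() and the cookie field are invented stand-ins, not the Lustre lock-handle API):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct dir_state {
    uint64_t lock_cookie;               /* 0 means no directory lock is cached */
};

/* hypothetical: returns true and fills *cookie if a dir lock is currently held */
static bool lock_match(const struct dir_state *d, uint64_t *cookie)
{
    if (d->lock_cookie == 0)
        return false;
    *cookie = d->lock_cookie;
    return true;
}

/* read one directory page without enqueuing a new lock; returns true if the
 * page may stay in the page cache, false if it must be discarded */
static bool read_dir_page_lockless(const struct dir_state *d)
{
    uint64_t before, after;

    if (!lock_match(d, &before))
        return false;                   /* no lock held: fall back to the enqueue path */

    /* ... fetch the directory page from the server here ... */

    /* keep the page only if the same lock is still held afterwards */
    return lock_match(d, &after) && after == before;
}

int main(void)
{
    struct dir_state d = { .lock_cookie = 42 };

    printf("page cacheable: %d\n", read_dir_page_lockless(&d));
    return 0;
}

No new lock is enqueued on this path, so it cannot block while lli_lsm_sem may be held further up the call chain; a handle mismatch between the two checks only costs a discarded page. |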
| Comment by Etienne Aujames [ 16/Feb/22 ] |
|
Hi, what are the symptoms/consequences of this? Does this result in client threads hanging endlessly, or in a client eviction? |
| Comment by Gerrit Updater [ 18/Feb/22 ] |
|
"Lai Siyao <lai.siyao@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/46551 |
| Comment by Andreas Dilger [ 12/Mar/22 ] |
|
Please see also LU-3308 #comment-58611 and the following one, and patch https://review.whamcloud.com/7909: "There is no requirement under POSIX that multiple readdir() calls be completely coherent with files being created/unlinked in the directory, so long as this is consistent from the time of the directory open() call, or if rewinddir() is called, so it would also be possible to read multiple pages into a temporary cache for the file descriptor, and discard those pages when the fd is closed or on rewinddir()."
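A tiny sketch of that per-file-descriptor cache idea (the structures and names are invented for the example, not Lustre code): pages read since open()/rewinddir() live on a private list and are simply dropped when the handle is rewound or closed, so coherence only has to hold within one open()/rewinddir() window.

#include <stdlib.h>

struct dir_page {
    struct dir_page *next;
    char data[4096];                      /* one cached directory page */
};

struct open_dir {                         /* per-fd state, lives from open() to close() */
    struct dir_page *pages;               /* pages read since open()/rewinddir() */
};

/* drop everything cached for this open handle */
static void open_dir_flush(struct open_dir *od)
{
    while (od->pages) {
        struct dir_page *p = od->pages;

        od->pages = p->next;
        free(p);
    }
}

/* rewinddir() and close() both just discard the temporary cache */
static void open_dir_rewind(struct open_dir *od) { open_dir_flush(od); }
static void open_dir_close(struct open_dir *od)  { open_dir_flush(od); }

int main(void)
{
    struct open_dir od = { .pages = NULL };
    struct dir_page *p = calloc(1, sizeof(*p));

    p->next = od.pages;
    od.pages = p;                         /* pretend one page was read after open() */
    open_dir_rewind(&od);                 /* rewinddir(): cache discarded */
    open_dir_close(&od);
    return 0;
}

In such a scheme an entry created or unlinked while the directory is already open may or may not show up in the current readdir stream, which is exactly what the quoted POSIX wording allows. |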
| Comment by Andriy Skulysh [ 14/Mar/22 ] |
|
The same deadlock happens with getattr:

PID: 13996  TASK: ffff98f0791c0000  CPU: 1  COMMAND: "setfattr"
 #0 [ffffa7850d8fb5a8] __schedule at ffffffff91349fac
 #1 [ffffa7850d8fb638] schedule at ffffffff9134a448
 #2 [ffffa7850d8fb648] rwsem_down_write_slowpath at ffffffff90b3c57a
 #3 [ffffa7850d8fb6f0] ll_update_default_lsm_md at ffffffffc14100a3 [lustre]
 #4 [ffffa7850d8fb728] ll_update_lsm_md at ffffffffc1416369 [lustre]
 #5 [ffffa7850d8fb790] ll_update_inode at ffffffffc141a696 [lustre]
 #6 [ffffa7850d8fb7c8] ll_iget at ffffffffc142dc57 [lustre]
 #7 [ffffa7850d8fb7e8] ll_prep_inode at ffffffffc141d01a [lustre]
 #8 [ffffa7850d8fb8b0] ll_lookup_it_finish.constprop.28 at ffffffffc142e821 [lustre]
 #9 [ffffa7850d8fb998] ll_lookup_it at ffffffffc142fec9 [lustre]
#10 [ffffa7850d8fbc08] ll_lookup_nd at ffffffffc1432c9c [lustre]
#11 [ffffa7850d8fbc90] __lookup_slow at ffffffff90d25667

PID: 13876  TASK: ffff98f09f570000  CPU: 7  COMMAND: "mkdir"
 #0 [ffffa7850d94ba70] __schedule at ffffffff91349fac
 #1 [ffffa7850d94bb00] schedule at ffffffff9134a448
 #2 [ffffa7850d94bb10] schedule_timeout at ffffffff9134dad3
 #3 [ffffa7850d94bba8] ptlrpc_set_wait at ffffffffc1131a80 [ptlrpc]
 #4 [ffffa7850d94bc28] ptlrpc_queue_wait at ffffffffc1131c71 [ptlrpc]
 #5 [ffffa7850d94bc40] mdc_getattr_common at ffffffffc12619b0 [mdc]
 #6 [ffffa7850d94bc70] mdc_getattr at ffffffffc12621ce [mdc]
 #7 [ffffa7850d94bcc0] lmv_getattr at ffffffffc13b54fe [lmv]
 #8 [ffffa7850d94bcf8] ll_dir_get_default_layout at ffffffffc13e7d32 [lustre]
 #9 [ffffa7850d94bd68] ll_dir_getstripe at ffffffffc13eb181 [lustre]
#10 [ffffa7850d94bdb8] ll_new_node at ffffffffc1434257 [lustre]
#11 [ffffa7850d94be90] ll_mkdir at ffffffffc1434faf [lustre]
#12 [ffffa7850d94beb8] vfs_mkdir at ffffffff90d27aa2 |
| Comment by Lai Siyao [ 14/Mar/22 ] |
|
Why does mdc_getattr() get stuck? I don't see an LDLM lock involved. |
| Comment by Andriy Skulysh [ 14/Mar/22 ] |
|
For example, pid 13996 sends a lock request, then 8 conflicting lock requests are sent; pid 13876 takes the mutex and sends a getattr request, but it cannot be sent because of the max_rpcs_in_flight limit. |
| Comment by Andriy Skulysh [ 15/Mar/22 ] |
|
Another failure:

PID: 25082  TASK: ffff948ed3e79680  CPU: 0  COMMAND: "ls"
 #0 [ffffa6fa81d5f9f0] __schedule at ffffffff8eb47d74
 #1 [ffffa6fa81d5fa88] schedule at ffffffff8eb481e8
 #2 [ffffa6fa81d5fa98] rwsem_down_write_slowpath at ffffffff8e33c02a
 #3 [ffffa6fa81d5fb40] ll_update_default_lsm_md at ffffffffc0f9a0a3 [lustre]
 #4 [ffffa6fa81d5fb78] ll_update_lsm_md at ffffffffc0fa0369 [lustre]
 #5 [ffffa6fa81d5fbe0] ll_update_inode at ffffffffc0fa4696 [lustre]
 #6 [ffffa6fa81d5fc18] ll_prep_inode at ffffffffc0fa6dfd [lustre]
 #7 [ffffa6fa81d5fce0] ll_revalidate_it_finish at ffffffffc0f701c6 [lustre]
 #8 [ffffa6fa81d5fd30] ll_inode_revalidate at ffffffffc0f82382 [lustre]
 #9 [ffffa6fa81d5fdb0] ll_getattr_dentry at ffffffffc0f8f0be [lustre]
#10 [ffffa6fa81d5fe60] vfs_statx_fd at ffffffff8e51e2c4

PID: 25074  TASK: ffff948ed1824380  CPU: 0  COMMAND: "ls"
 #0 [ffffa6fa8183b850] __schedule at ffffffff8eb47d74
 #1 [ffffa6fa8183b8e8] schedule at ffffffff8eb481e8
 #2 [ffffa6fa8183b8f8] schedule_timeout at ffffffff8eb4b873
 #3 [ffffa6fa8183b990] ptlrpc_set_wait at ffffffffc0cd2a80 [ptlrpc]
 #4 [ffffa6fa8183ba10] ptlrpc_queue_wait at ffffffffc0cd2c71 [ptlrpc]
 #5 [ffffa6fa8183ba28] ldlm_cli_enqueue at ffffffffc0cba3d7 [ptlrpc]
 #6 [ffffa6fa8183bab0] mdc_enqueue_base at ffffffffc0f0c3fd [mdc]
 #7 [ffffa6fa8183bba8] mdc_intent_lock at ffffffffc0f0e369 [mdc]
 #8 [ffffa6fa8183bc50] lmv_intent_lookup at ffffffffc0e17702 [lmv]
 #9 [ffffa6fa8183bcb8] lmv_intent_lock at ffffffffc0e1814c [lmv]
#10 [ffffa6fa8183bd30] ll_inode_revalidate at ffffffffc0f8235d [lustre]
#11 [ffffa6fa8183bdb0] ll_getattr_dentry at ffffffffc0f8f0be [lustre]
#12 [ffffa6fa8183be60] vfs_statx_fd at ffffffff8e51e2c4

Pid 25074 sends a lock enqueue for a lock that does not conflict with pid 25082's, but the MDS may have a conflicting lock enqueue from another client. |
| Comment by Gerrit Updater [ 31/Mar/23 ] |
|
"Vitaly Fertman <vitaly.fertman@hpe.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50488 |
| Comment by Gerrit Updater [ 31/Mar/23 ] |
|
"Vitaly Fertman <vitaly.fertman@hpe.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50489 |
| Comment by Gerrit Updater [ 19/Jul/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50488/ |
| Comment by Gerrit Updater [ 19/Jul/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50489/ |