[LU-16958] migrate vs regular ops deadlock Created: 12/Jul/23 Updated: 20/Nov/23 Resolved: 18/Nov/23 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.16.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Alex Zhuravlev | Assignee: | Zhenyu Xu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Issue Links: | |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
PID: 350193 TASK: ffff9bd65af446c0 CPU: 0 COMMAND: "getfattr"
#0 [ffff9bd63ffb7950] __schedule at ffffffffba5a232d
/tmp/kernel/kernel/sched/core.c: 3109
#1 [ffff9bd63ffb79d8] schedule at ffffffffba5a2748
/tmp/kernel/./arch/x86/include/asm/preempt.h: 84
#2 [ffff9bd63ffb79e8] rwsem_down_write_slowpath at ffffffffba0f41a7
/tmp/kernel/./arch/x86/include/asm/current.h: 15
#3 [ffff9bd63ffb7a88] down_write at ffffffffba5a691a
/tmp/kernel/./include/linux/err.h: 36
#4 [ffff9bd63ffb7ac0] vvp_inode_ops at ffffffffc116d57f [lustre]
/home/lustre/linux-4.18.0-305.25.1.el8_4/./arch/x86/include/asm/current.h: 15
#5 [ffff9bd63ffb7ae0] cl_object_inode_ops at ffffffffc0454a50 [obdclass]
/home/lustre/master-mine/lustre/obdclass/cl_object.c: 442
#6 [ffff9bd63ffb7b18] lov_conf_set at ffffffffc0aa36c4 [lov]
/home/lustre/master-mine/lustre/lov/lov_object.c: 1465
#7 [ffff9bd63ffb7b88] cl_conf_set at ffffffffc04542d8 [obdclass]
/home/lustre/master-mine/lustre/obdclass/cl_object.c: 299
#8 [ffff9bd63ffb7bb8] ll_layout_conf at ffffffffc111d110 [lustre]
/home/lustre/master-mine/lustre/llite/file.c: 5995
#9 [ffff9bd63ffb7c28] ll_layout_refresh at ffffffffc111dad3 [lustre]
/home/lustre/master-mine/libcfs/include/libcfs/libcfs_debug.h: 155
#10 [ffff9bd63ffb7cf0] vvp_io_init at ffffffffc116d019 [lustre]
/home/lustre/master-mine/lustre/llite/vvp_io.c: 1870
#11 [ffff9bd63ffb7d20] __cl_io_init at ffffffffc045e66f [obdclass]
/home/lustre/master-mine/lustre/obdclass/cl_io.c: 134
#12 [ffff9bd63ffb7d58] cl_glimpse_size0 at ffffffffc11642ca [lustre]
/home/lustre/master-mine/lustre/llite/glimpse.c: 204
#13 [ffff9bd63ffb7da0] ll_getattr_dentry at ffffffffc111c65d [lustre]
/home/lustre/master-mine/lustre/llite/llite_internal.h: 1677
#14 [ffff9bd63ffb7e50] vfs_statx at ffffffffba1d4be9
/tmp/kernel/fs/stat.c: 204
Checking the stack of the process above, the inode was found at 0xffff9bd60367d350:

crash> p *(struct ll_inode_info *)(0xffff9bd60367d350-0x150)
  lli_inode_magic = 287116773,
  ...
  lli_inode_lock_owner = 0xffff9bd68f51d380

(0x150 is the offset of the embedded VFS inode, lli_vfs_inode, within struct ll_inode_info, so subtracting it recovers the enclosing ll_inode_info.)

Now check task 0xffff9bd68f51d380:
crash> p *(struct task_struct *)0xffff9bd68f51d380|more
...
pid = 348428,
...
PID: 348428 TASK: ffff9bd68f51d380 CPU: 1 COMMAND: "lfs"
#0 [ffff9bd613c37968] __schedule at ffffffffba5a232d
/tmp/kernel/kernel/sched/core.c: 3109
#1 [ffff9bd613c379f0] schedule at ffffffffba5a2748
/tmp/kernel/./arch/x86/include/asm/preempt.h: 84
#2 [ffff9bd613c37a00] schedule_preempt_disabled at ffffffffba5a2a6c
/tmp/kernel/./arch/x86/include/asm/preempt.h: 79
#3 [ffff9bd613c37a08] __mutex_lock at ffffffffba5a3a40
/tmp/kernel/kernel/locking/mutex.c: 1038
#4 [ffff9bd613c37ac8] ll_layout_refresh at ffffffffc111d577 [lustre]
/home/lustre/master-mine/lustre/llite/llite_internal.h: 1536
#5 [ffff9bd613c37b88] vvp_io_init at ffffffffc116d019 [lustre]
/home/lustre/master-mine/lustre/llite/vvp_io.c: 1870
#6 [ffff9bd613c37bb8] __cl_io_init at ffffffffc045e66f [obdclass]
/home/lustre/master-mine/lustre/obdclass/cl_io.c: 134
#7 [ffff9bd613c37bf0] ll_ioc_data_version at ffffffffc110c665 [lustre]
/home/lustre/master-mine/lustre/llite/file.c: 3193
#8 [ffff9bd613c37c28] ll_migrate at ffffffffc111b244 [lustre]
/home/lustre/master-mine/lustre/llite/file.c: 3227
#9 [ffff9bd613c37ca8] ll_dir_ioctl at ffffffffc1105563 [lustre]
/home/lustre/master-mine/lustre/llite/dir.c: 2277
#10 [ffff9bd613c37e88] do_vfs_ioctl at ffffffffba1e3199
/tmp/kernel/fs/ioctl.c: 48
It seems this is a lock-ordering issue: getfattr holds lli_layout_mutex (taken in ll_layout_refresh()) and waits for the inode write lock in vvp_inode_ops(), while lfs migrate holds the inode lock (its task is the lli_inode_lock_owner above) and waits for lli_layout_mutex in ll_layout_refresh(). |
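To make the inversion concrete, here is a minimal userspace sketch. It is not the Lustre code: the two pthread mutexes merely stand in for lli_layout_mutex and the inode write lock taken in vvp_inode_ops(), and the sleeps only widen the race window.

/*
 * AB-BA sketch of the two stacks above.  "layout_mutex" stands in for
 * lli_layout_mutex, "inode_lock" for the per-inode write lock taken in
 * vvp_inode_ops(); names and call paths are illustrative only.
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t layout_mutex = PTHREAD_MUTEX_INITIALIZER; /* lli_layout_mutex */
static pthread_mutex_t inode_lock   = PTHREAD_MUTEX_INITIALIZER; /* inode write lock */

/* getfattr path: ll_layout_refresh() -> ... -> vvp_inode_ops() */
static void *getattr_thread(void *arg)
{
        (void)arg;
        pthread_mutex_lock(&layout_mutex);   /* ll_layout_refresh() */
        sleep(1);                            /* widen the race window */
        pthread_mutex_lock(&inode_lock);     /* vvp_inode_ops(): blocks here */
        pthread_mutex_unlock(&inode_lock);
        pthread_mutex_unlock(&layout_mutex);
        return NULL;
}

/* lfs migrate path: holds the inode lock, then refreshes the layout */
static void *migrate_thread(void *arg)
{
        (void)arg;
        pthread_mutex_lock(&inode_lock);     /* lli_inode_lock_owner = current */
        sleep(1);
        pthread_mutex_lock(&layout_mutex);   /* ll_layout_refresh(): blocks -> AB-BA */
        pthread_mutex_unlock(&layout_mutex);
        pthread_mutex_unlock(&inode_lock);
        return NULL;
}

int main(void)
{
        pthread_t t1, t2;

        pthread_create(&t1, NULL, getattr_thread, NULL);
        pthread_create(&t2, NULL, migrate_thread, NULL);
        pthread_join(t1, NULL);              /* never returns: deadlock */
        pthread_join(t2, NULL);
        printf("no deadlock\n");
        return 0;
}

Built with gcc -pthread, the join never returns once the two threads have taken their first locks in opposite order.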
| Comments |
| Comment by Gerrit Updater [ 12/Jul/23 ] |
|
"Zhenyu Xu <bobijam@hotmail.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51641 |
| Comment by Gerrit Updater [ 01/Aug/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51641/ |
| Comment by Peter Jones [ 01/Aug/23 ] |
|
Landed for 2.16 |
| Comment by Zhenyu Xu [ 21/Sep/23 ] |
|
Another deadlock was found.
T1:
vvp_io_init()
->ll_layout_refresh() <= take lli_layout_mutex
->ll_layout_intent()
->ll_take_md_lock() <= take the CR layout lock ref
->ll_layout_conf()
->vvp_prune()
->vvp_inode_ops() <= release lli_layout_mutex
->vvp_inode_ops() <= try to acquire lli_layout_mutex
-> racer wait here
T2:
->ll_file_write_iter()
->vvp_io_init()
->ll_layout_refresh() <= take lli_layout_mutex
->ll_layout_intent() <= Request layout from MDT
-> racer wait ...
T3: occurs in PCC-RO attach, but it can also happen in the normal case without PCC-RO.
->pcc_readonly_attach()
->ll_layout_intent_write()
->ll_intent_lock()
-> on MDT, it will try to obtain EX layout lock to change layout.
But client T1 holds the CR layout lock, and T2's lock request is in the lock waiting list, waiting for T3 to finish, thus causing a deadlock (T1 -> T2 -> T3 -> T1)...
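A minimal userspace model of that cycle, with each wait reduced to a plain mutex: "layout_mutex" for lli_layout_mutex, "cr_ref" for T1's CR layout lock reference, and "ex_grant" for the MDT grant of T3's EX request that T2's own request queues behind. The names and the reduction to mutexes are illustrative only, not the real client/MDT code.

/* Three-way circular wait: T1 -> T2 -> T3 -> T1. */
#include <pthread.h>

static pthread_mutex_t layout_mutex = PTHREAD_MUTEX_INITIALIZER; /* lli_layout_mutex   */
static pthread_mutex_t cr_ref       = PTHREAD_MUTEX_INITIALIZER; /* T1's CR layout ref */
static pthread_mutex_t ex_grant     = PTHREAD_MUTEX_INITIALIZER; /* T3's EX grant      */
static pthread_barrier_t barrier;

static void *t1(void *arg)  /* holds CR ref, re-takes lli_layout_mutex */
{
        (void)arg;
        pthread_mutex_lock(&cr_ref);
        pthread_barrier_wait(&barrier);
        pthread_mutex_lock(&layout_mutex);  /* held by T2: blocks */
        return NULL;
}

static void *t2(void *arg)  /* holds lli_layout_mutex, waits for its layout grant */
{
        (void)arg;
        pthread_mutex_lock(&layout_mutex);
        pthread_barrier_wait(&barrier);
        pthread_mutex_lock(&ex_grant);      /* queued behind T3's EX: blocks */
        return NULL;
}

static void *t3(void *arg)  /* EX request needs T1's CR ref cancelled */
{
        (void)arg;
        pthread_mutex_lock(&ex_grant);
        pthread_barrier_wait(&barrier);
        pthread_mutex_lock(&cr_ref);        /* held by T1: blocks -> cycle closed */
        return NULL;
}

int main(void)
{
        pthread_t a, b, c;

        pthread_barrier_init(&barrier, NULL, 3);
        pthread_create(&a, NULL, t1, NULL);
        pthread_create(&b, NULL, t2, NULL);
        pthread_create(&c, NULL, t3, NULL);
        pthread_join(a, NULL);              /* never returns: T1 -> T2 -> T3 -> T1 */
        return 0;
}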
|
| Comment by Zhenyu Xu [ 28/Sep/23 ] |
|
I thought the deadlock was due to this patch, but I reverted the essential part of it in https://review.whamcloud.com/52388 and the racer test still hangs on the server; it looks more like LU-15491 |
| Comment by Qian Yingjin [ 11/Oct/23 ] |
|
Found another deadlock for parallel DIO:

T1: writer
obtain DLM extent lock: L1=PW[0, EOF]

T2: DIO reader: 50M data, iosize=64M, max_pages_per_rpc=1024 (4M), max_rpcs_in_flight=8
ll_direct_IO_impl() uses all available RPC slots: number of read RPCs in flight is 9
on the server side:
->tgt_brw_read()
->tgt_brw_lock() # server side locking
-> try to cancel the conflicting lock on the client: L1=PW[0, EOF]

T3: reader
take DLM lock ref on L1=PW[0, EOF]
read-ahead pages (prepare pages); wait for RPC slots to send the read RPCs to the OST

deadlock:
T2->T3: T2 is waiting for T3 to release DLM extent lock L1;
T3->T2: T3 is waiting for T2 to finish and free RPC slots...

A possible solution: when all RPC slots are found to be used by srvlock DIO and there is urgent I/O, force sending the I/O RPC to the OST? |
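A rough userspace sketch of that idea: a slot pool where an "urgent" request (e.g. pages pinned under a DLM lock being called back) may exceed max_rpcs_in_flight when every RPC currently in flight is a srvlock DIO. All names here (rpc_slot_pool, rpc_slot_get, ...) are invented for illustration; this is not the osc/ptlrpc code.

/* Slot pool with an urgent-I/O bypass when srvlock DIO holds every slot. */
#include <pthread.h>
#include <stdbool.h>

struct rpc_slot_pool {
        pthread_mutex_t lock;
        pthread_cond_t  freed;
        int             max_in_flight;  /* max_rpcs_in_flight                */
        int             in_flight;      /* all RPCs currently sent           */
        int             srvlock_dio;    /* subset doing server-side lock DIO */
};

void rpc_slot_pool_init(struct rpc_slot_pool *p, int max)
{
        pthread_mutex_init(&p->lock, NULL);
        pthread_cond_init(&p->freed, NULL);
        p->max_in_flight = max;
        p->in_flight = 0;
        p->srvlock_dio = 0;
}

/* Take a slot; urgent I/O may overcommit if srvlock DIO holds every slot. */
void rpc_slot_get(struct rpc_slot_pool *p, bool srvlock, bool urgent)
{
        pthread_mutex_lock(&p->lock);
        while (p->in_flight >= p->max_in_flight &&
               !(urgent && p->srvlock_dio == p->in_flight))
                pthread_cond_wait(&p->freed, &p->lock);
        p->in_flight++;
        if (srvlock)
                p->srvlock_dio++;
        pthread_mutex_unlock(&p->lock);
}

void rpc_slot_put(struct rpc_slot_pool *p, bool srvlock)
{
        pthread_mutex_lock(&p->lock);
        p->in_flight--;
        if (srvlock)
                p->srvlock_dio--;
        pthread_cond_broadcast(&p->freed);
        pthread_mutex_unlock(&p->lock);
}

In this model, T3's read-ahead RPCs would pass rpc_slot_get() with urgent=true once all in-flight RPCs are srvlock DIO, letting T2's server-side lock cancel complete and breaking the T2<->T3 wait.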
| Comment by Andreas Dilger [ 27/Oct/23 ] |
|
Another patch was pushed under this ticket. |
| Comment by Andreas Dilger [ 27/Oct/23 ] |
There could definitely be multiple different issues affecting racer testing, so that doesn't mean the above patch is not fixing a problem. |
| Comment by Gerrit Updater [ 18/Nov/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/52388/ |
| Comment by Peter Jones [ 18/Nov/23 ] |
|
Landed for 2.16 |