[LU-16315] sync I/O stuck on refreshing layout for PFL Created: 15/Nov/22  Updated: 16/Nov/22

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Li Xi Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Attachments: File reproduce_sync_stuck.sh    

 Description   

While testing PFL (Progressive File Layout), the client got stuck, with multiple dd processes blocked on a mutex in ll_layout_refresh():

 

[  360.010400] INFO: task dd:1542 blocked for more than 120 seconds.
[  360.014017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  360.018530] dd              D ffff8c3578f10000     0  1542      1 0x00000080
[  360.022370] Call Trace:
[  360.023463]  [<ffffffff9018d309>] schedule_preempt_disabled+0x29/0x70
[  360.026263]  [<ffffffff9018b457>] __mutex_lock_slowpath+0xc7/0x1d0
[  360.029179]  [<ffffffff9018a82f>] mutex_lock+0x1f/0x2f
[  360.031242]  [<ffffffffc0b8ff9e>] ll_layout_refresh+0x1ee/0x910 [lustre]
[  360.033751]  [<ffffffff8faad8b0>] ? requeue_timers+0x170/0x170
[  360.036043]  [<ffffffff8fac6b41>] ? remove_wait_queue+0x31/0x40
[  360.038309]  [<ffffffffc0bd81df>] vvp_io_init+0x34f/0x490 [lustre]
[  360.040660]  [<ffffffffc057725b>] cl_io_init0.isra.14+0x8b/0x160 [obdclass]
[  360.043188]  [<ffffffffc05773f3>] cl_io_init+0x43/0x80 [obdclass]
[  360.045369]  [<ffffffffc0b8ca78>] cl_sync_file_range+0x1b8/0x380 [lustre]
[  360.047729]  [<ffffffffc0b8cf01>] ll_fsync+0x2c1/0x4e0 [lustre]
[  360.049821]  [<ffffffff8fc8408f>] generic_write_sync+0x4f/0x70
[  360.052050]  [<ffffffffc0bd77b0>] vvp_io_write_start+0x660/0xd40 [lustre]
[  360.054437]  [<ffffffffc0a3a265>] ? lov_lock_enqueue+0x95/0x150 [lov]
[  360.056721]  [<ffffffff8fac7050>] ? wake_bit_function_rh+0x40/0x40
[  360.058785]  [<ffffffffc0576d20>] cl_io_start+0x70/0x140 [obdclass]
[  360.060868]  [<ffffffffc057922f>] cl_io_loop+0x9f/0x200 [obdclass]
[  360.062917]  [<ffffffffc0b864cc>] ll_file_io_generic+0x90c/0xe50 [lustre]
[  360.065092]  [<ffffffffc0b86ca5>] ll_file_aio_write+0x295/0x6d0 [lustre]
[  360.067185]  [<ffffffffc0b871e0>] ll_file_write+0x100/0x1c0 [lustre]
[  360.069146]  [<ffffffff8fc4e590>] vfs_write+0xc0/0x1f0
[  360.070728]  [<ffffffff8fc4f36f>] SyS_write+0x7f/0xf0
[  360.072323]  [<ffffffff90199f92>] system_call_fastpath+0x25/0x2a
[  360.074179] INFO: task dd:1544 blocked for more than 120 seconds.
[  360.076000] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  360.078951] dd              D ffff8c3575eb12c0     0  1544      1 0x00000080
[  360.082129] Call Trace:
[  360.083834]  [<ffffffff9018d309>] schedule_preempt_disabled+0x29/0x70
[  360.087082]  [<ffffffff9018b457>] __mutex_lock_slowpath+0xc7/0x1d0
[  360.090202]  [<ffffffff8fadae02>] ? default_wake_function+0x12/0x20
[  360.093104]  [<ffffffff9018a82f>] mutex_lock+0x1f/0x2f
[  360.095580]  [<ffffffffc0b8ff9e>] ll_layout_refresh+0x1ee/0x910 [lustre]
[  360.098712]  [<ffffffffc084b370>] ? lustre_msg_buf_v2+0x1e0/0x1e0 [ptlrpc]
[  360.101765]  [<ffffffffc0873293>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[  360.104431]  [<ffffffffc0bd81df>] vvp_io_init+0x34f/0x490 [lustre]
[  360.106929]  [<ffffffffc0b0db93>] ? mdc_reint+0xd3/0x160 [mdc]
[  360.109495]  [<ffffffffc057725b>] cl_io_init0.isra.14+0x8b/0x160 [obdclass]
[  360.112127]  [<ffffffffc05773f3>] cl_io_init+0x43/0x80 [obdclass]
[  360.114365]  [<ffffffffc0bcda2e>] cl_setattr_ost+0x14e/0x3e0 [lustre]
[  360.116748]  [<ffffffffc0ba5978>] ll_setattr_raw+0xd58/0x10d0 [lustre]
[  360.119143]  [<ffffffffc0ba5d53>] ll_setattr+0x63/0xc0 [lustre]
[  360.121326]  [<ffffffff8fc6e1ac>] notify_change+0x30c/0x4d0
[  360.123258]  [<ffffffff8fc4c2c5>] do_truncate+0x75/0xc0
[  360.125079]  [<ffffffff8fc514e8>] ? __sb_start_write+0x58/0x120
[  360.127025]  [<ffffffff8fc4c6e9>] do_sys_ftruncate.constprop.14+0x139/0x1a0
[  360.129204]  [<ffffffff8fc4c78e>] SyS_ftruncate+0xe/0x10
[  360.131046]  [<ffffffff90199f92>] system_call_fastpath+0x25/0x2a
[  360.132917] INFO: task dd:1546 blocked for more than 120 seconds.
[  360.134923] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  360.137185] dd              D ffff8c35359d0000     0  1546      1 0x00000080
[  360.139334] Call Trace:
[  360.140439]  [<ffffffff9018d309>] schedule_preempt_disabled+0x29/0x70
[  360.142365]  [<ffffffff9018b457>] __mutex_lock_slowpath+0xc7/0x1d0
[  360.144239]  [<ffffffff8fadae02>] ? default_wake_function+0x12/0x20
[  360.146119]  [<ffffffff9018a82f>] mutex_lock+0x1f/0x2f
[  360.147743]  [<ffffffffc0b8ff9e>] ll_layout_refresh+0x1ee/0x910 [lustre]
[  360.149742]  [<ffffffffc0545a59>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[  360.152566]  [<ffffffffc0bd81df>] vvp_io_init+0x34f/0x490 [lustre]
[  360.155100]  [<ffffffffc0566803>] ? lu_context_init+0xd3/0x1f0 [obdclass]
[  360.157804]  [<ffffffffc056693a>] ? lu_env_init+0x1a/0x30 [obdclass]
[  360.160377]  [<ffffffffc057725b>] cl_io_init0.isra.14+0x8b/0x160 [obdclass]
[  360.163160]  [<ffffffffc05773f3>] cl_io_init+0x43/0x80 [obdclass]
[  360.165699]  [<ffffffffc0bcda2e>] cl_setattr_ost+0x14e/0x3e0 [lustre]
[  360.168375]  [<ffffffffc0ba5978>] ll_setattr_raw+0xd58/0x10d0 [lustre]
[  360.171046]  [<ffffffffc0bb0978>] ? ll_stats_ops_tally+0x98/0x100 [lustre]
[  360.173759]  [<ffffffffc0ba5d53>] ll_setattr+0x63/0xc0 [lustre]
[  360.176170]  [<ffffffff8fc6e1ac>] notify_change+0x30c/0x4d0
[  360.178468]  [<ffffffff8fc4c2c5>] do_truncate+0x75/0xc0
[  360.180644]  [<ffffffff8fc514e8>] ? __sb_start_write+0x58/0x120
[  360.183035]  [<ffffffff8fc4c6e9>] do_sys_ftruncate.constprop.14+0x139/0x1a0
[  360.185726]  [<ffffffff8fc4c78e>] SyS_ftruncate+0xe/0x10
[  360.187880]  [<ffffffff90199f92>] system_call_fastpath+0x25/0x2a
[  360.190291] INFO: task dd:1548 blocked for more than 120 seconds.
[  360.192658] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  360.195512] dd              D ffff8c3575eb2580     0  1548      1 0x00000080
[  360.198271] Call Trace:
[  360.199497]  [<ffffffff9018d309>] schedule_preempt_disabled+0x29/0x70
[  360.201642]  [<ffffffff9018b457>] __mutex_lock_slowpath+0xc7/0x1d0
[  360.203630]  [<ffffffff8fadae02>] ? default_wake_function+0x12/0x20
[  360.205629]  [<ffffffff9018a82f>] mutex_lock+0x1f/0x2f
[  360.207363]  [<ffffffffc0b8ff9e>] ll_layout_refresh+0x1ee/0x910 [lustre]
[  360.209491]  [<ffffffffc0545a59>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[  360.211639]  [<ffffffffc0bd81df>] vvp_io_init+0x34f/0x490 [lustre]
[  360.213527]  [<ffffffffc0566803>] ? lu_context_init+0xd3/0x1f0 [obdclass]
[  360.215616]  [<ffffffffc056693a>] ? lu_env_init+0x1a/0x30 [obdclass]
[  360.218169]  [<ffffffffc057725b>] cl_io_init0.isra.14+0x8b/0x160 [obdclass]
[  360.221050]  [<ffffffffc05773f3>] cl_io_init+0x43/0x80 [obdclass]
[  360.223613]  [<ffffffffc0bcda2e>] cl_setattr_ost+0x14e/0x3e0 [lustre]
[  360.226325]  [<ffffffffc0ba5978>] ll_setattr_raw+0xd58/0x10d0 [lustre]
[  360.229000]  [<ffffffffc0bb0978>] ? ll_stats_ops_tally+0x98/0x100 [lustre]
[  360.231775]  [<ffffffffc0ba5d53>] ll_setattr+0x63/0xc0 [lustre]
[  360.234252]  [<ffffffff8fc6e1ac>] notify_change+0x30c/0x4d0
[  360.236582]  [<ffffffff8fc4c2c5>] do_truncate+0x75/0xc0
[  360.238765]  [<ffffffff8fc514e8>] ? __sb_start_write+0x58/0x120
[  360.241177]  [<ffffffff8fc4c6e9>] do_sys_ftruncate.constprop.14+0x139/0x1a0
[  360.243907]  [<ffffffff8fc4c78e>] SyS_ftruncate+0xe/0x10
[  360.246121]  [<ffffffff90199f92>] system_call_fastpath+0x25/0x2a
[  360.248472] INFO: task dd:1550 blocked for more than 120 seconds.
[  360.250771] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  360.253713] dd              D ffff8c3575eb4b00     0  1550      1 0x00000080
[  360.256549] Call Trace:
[  360.257818]  [<ffffffff9018d309>] schedule_preempt_disabled+0x29/0x70
[  360.260412]  [<ffffffff9018b457>] __mutex_lock_slowpath+0xc7/0x1d0
[  360.262933]  [<ffffffff8fadae02>] ? default_wake_function+0x12/0x20
[  360.265412]  [<ffffffff9018a82f>] mutex_lock+0x1f/0x2f
[  360.267590]  [<ffffffffc0b8ff9e>] ll_layout_refresh+0x1ee/0x910 [lustre]
[  360.270278]  [<ffffffffc084b370>] ? lustre_msg_buf_v2+0x1e0/0x1e0 [ptlrpc]
[  360.272960]  [<ffffffffc0545a59>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[  360.275668]  [<ffffffffc0bd81df>] vvp_io_init+0x34f/0x490 [lustre]
[  360.278056]  [<ffffffffc0566803>] ? lu_context_init+0xd3/0x1f0 [obdclass]
[  360.280641]  [<ffffffffc056693a>] ? lu_env_init+0x1a/0x30 [obdclass]
[  360.282985]  [<ffffffffc057725b>] cl_io_init0.isra.14+0x8b/0x160 [obdclass]
[  360.285188]  [<ffffffffc05773f3>] cl_io_init+0x43/0x80 [obdclass]
[  360.287171]  [<ffffffffc0bcda2e>] cl_setattr_ost+0x14e/0x3e0 [lustre]
[  360.289272]  [<ffffffffc0ba5978>] ll_setattr_raw+0xd58/0x10d0 [lustre]
[  360.291359]  [<ffffffffc0bb0978>] ? ll_stats_ops_tally+0x98/0x100 [lustre]
[  360.293485]  [<ffffffffc0ba5d53>] ll_setattr+0x63/0xc0 [lustre]
[  360.295310]  [<ffffffff8fc6e1ac>] notify_change+0x30c/0x4d0
[  360.297073]  [<ffffffff8fc4c2c5>] do_truncate+0x75/0xc0
[  360.298735]  [<ffffffff8fc514e8>] ? __sb_start_write+0x58/0x120
[  360.300574]  [<ffffffff8fc4c6e9>] do_sys_ftruncate.constprop.14+0x139/0x1a0
[  360.302787]  [<ffffffff8fc4c78e>] SyS_ftruncate+0xe/0x10
[  360.305020]  [<ffffffff90199f92>] system_call_fastpath+0x25/0x2a
[  360.307466] INFO: task dd:1552 blocked for more than 120 seconds.
[  360.309924] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  360.312957] dd              D ffff8c3577ff2580     0  1552      1 0x00000080
[  360.315859] Call Trace:
[  360.317176]  [<ffffffff9018d309>] schedule_preempt_disabled+0x29/0x70
[  360.319743]  [<ffffffff9018b457>] __mutex_lock_slowpath+0xc7/0x1d0
[  360.322237]  [<ffffffff8fadae02>] ? default_wake_function+0x12/0x20
[  360.324739]  [<ffffffff9018a82f>] mutex_lock+0x1f/0x2f
[  360.326887]  [<ffffffffc0b8ff9e>] ll_layout_refresh+0x1ee/0x910 [lustre]
[  360.329540]  [<ffffffffc084b370>] ? lustre_msg_buf_v2+0x1e0/0x1e0 [ptlrpc]
[  360.332210]  [<ffffffffc0545a59>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[  360.334944]  [<ffffffffc0bd81df>] vvp_io_init+0x34f/0x490 [lustre]
[  360.337388]  [<ffffffffc0566803>] ? lu_context_init+0xd3/0x1f0 [obdclass]
[  360.339590]  [<ffffffffc056693a>] ? lu_env_init+0x1a/0x30 [obdclass]
[  360.341641]  [<ffffffffc057725b>] cl_io_init0.isra.14+0x8b/0x160 [obdclass]
[  360.343843]  [<ffffffffc05773f3>] cl_io_init+0x43/0x80 [obdclass]
[  360.345835]  [<ffffffffc0bcda2e>] cl_setattr_ost+0x14e/0x3e0 [lustre]
[  360.348485]  [<ffffffffc0ba5978>] ll_setattr_raw+0xd58/0x10d0 [lustre]
[  360.350812]  [<ffffffffc0bb0978>] ? ll_stats_ops_tally+0x98/0x100 [lustre]
[  360.353149]  [<ffffffffc0ba5d53>] ll_setattr+0x63/0xc0 [lustre]
[  360.355632]  [<ffffffff8fc6e1ac>] notify_change+0x30c/0x4d0
[  360.358007]  [<ffffffff8fc4c2c5>] do_truncate+0x75/0xc0
[  360.360257]  [<ffffffff8fc514e8>] ? __sb_start_write+0x58/0x120
[  360.362753]  [<ffffffff8fc4c6e9>] do_sys_ftruncate.constprop.14+0x139/0x1a0
[  360.365512]  [<ffffffff8fc4c78e>] SyS_ftruncate+0xe/0x10
[  360.367642]  [<ffffffff90199f92>] system_call_fastpath+0x25/0x2a
[  360.370023] INFO: task dd:1553 blocked for more than 120 seconds.
[  360.372519] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  360.375566] dd              D ffff8c3577ea2100     0  1553      1 0x00000080
[  360.378447] Call Trace:
[  360.379713]  [<ffffffff9018d309>] schedule_preempt_disabled+0x29/0x70
[  360.382254]  [<ffffffff9018b457>] __mutex_lock_slowpath+0xc7/0x1d0
[  360.384734]  [<ffffffff8fadae02>] ? default_wake_function+0x12/0x20
[  360.387242]  [<ffffffff9018a82f>] mutex_lock+0x1f/0x2f
[  360.389380]  [<ffffffffc0b8ff9e>] ll_layout_refresh+0x1ee/0x910 [lustre]
[  360.391983]  [<ffffffffc084b370>] ? lustre_msg_buf_v2+0x1e0/0x1e0 [ptlrpc]
[  360.394676]  [<ffffffffc0873293>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[  360.397328]  [<ffffffffc0bd81df>] vvp_io_init+0x34f/0x490 [lustre]
[  360.399580]  [<ffffffffc0b0db93>] ? mdc_reint+0xd3/0x160 [mdc]
[  360.401481]  [<ffffffffc057725b>] cl_io_init0.isra.14+0x8b/0x160 [obdclass]
[  360.404077]  [<ffffffffc05773f3>] cl_io_init+0x43/0x80 [obdclass]
[  360.406518]  [<ffffffffc0bcda2e>] cl_setattr_ost+0x14e/0x3e0 [lustre]
[  360.409073]  [<ffffffffc0ba5978>] ll_setattr_raw+0xd58/0x10d0 [lustre]
[  360.411643]  [<ffffffffc0ba5d53>] ll_setattr+0x63/0xc0 [lustre]
[  360.413942]  [<ffffffff8fc6e1ac>] notify_change+0x30c/0x4d0
[  360.416027]  [<ffffffff8fc4c2c5>] do_truncate+0x75/0xc0
[  360.418064]  [<ffffffff8fc514e8>] ? __sb_start_write+0x58/0x120
[  360.420292]  [<ffffffff8fc4c6e9>] do_sys_ftruncate.constprop.14+0x139/0x1a0
[  360.422788]  [<ffffffff8fc4c78e>] SyS_ftruncate+0xe/0x10
[  360.424538]  [<ffffffff90199f92>] system_call_fastpath+0x25/0x2a
[  360.426441] INFO: task dd:1554 blocked for more than 120 seconds.
[  360.428364] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  360.430641] dd              D ffff8c3577ff3200     0  1554      1 0x00000080
[  360.432802] Call Trace:
[  360.433824]  [<ffffffff9018d309>] schedule_preempt_disabled+0x29/0x70
[  360.436037]  [<ffffffff9018b457>] __mutex_lock_slowpath+0xc7/0x1d0
[  360.437969]  [<ffffffffc05667c6>] ? lu_context_init+0x96/0x1f0 [obdclass]
[  360.440009]  [<ffffffff9018a82f>] mutex_lock+0x1f/0x2f
[  360.441665]  [<ffffffffc0b8ff9e>] ll_layout_refresh+0x1ee/0x910 [lustre]
[  360.443668]  [<ffffffffc05667c6>] ? lu_context_init+0x96/0x1f0 [obdclass]
[  360.446100]  [<ffffffffc0545a59>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[  360.448858]  [<ffffffffc0bd81df>] vvp_io_init+0x34f/0x490 [lustre]
[  360.451385]  [<ffffffffc0566803>] ? lu_context_init+0xd3/0x1f0 [obdclass]
[  360.454093]  [<ffffffffc056693a>] ? lu_env_init+0x1a/0x30 [obdclass]
[  360.456657]  [<ffffffffc057725b>] cl_io_init0.isra.14+0x8b/0x160 [obdclass]
[  360.459390]  [<ffffffffc05773f3>] cl_io_init+0x43/0x80 [obdclass]
[  360.461852]  [<ffffffffc0bcda2e>] cl_setattr_ost+0x14e/0x3e0 [lustre]
[  360.464397]  [<ffffffffc0ba5978>] ll_setattr_raw+0xd58/0x10d0 [lustre]
[  360.466973]  [<ffffffffc0bb0978>] ? ll_stats_ops_tally+0x98/0x100 [lustre]
[  360.469597]  [<ffffffffc0ba5d53>] ll_setattr+0x63/0xc0 [lustre]
[  360.471907]  [<ffffffff8fc6e1ac>] notify_change+0x30c/0x4d0
[  360.473816]  [<ffffffff8fc4c2c5>] do_truncate+0x75/0xc0
[  360.475572]  [<ffffffff8fc514e8>] ? __sb_start_write+0x58/0x120
[  360.477526]  [<ffffffff8fc4c6e9>] do_sys_ftruncate.constprop.14+0x139/0x1a0
[  360.479686]  [<ffffffff8fc4c78e>] SyS_ftruncate+0xe/0x10
[  360.481434]  [<ffffffff90199f92>] system_call_fastpath+0x25/0x2a
[  360.483323] INFO: task dd:1555 blocked for more than 120 seconds.
[  360.485141] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  360.487359] dd              D ffff8c3577ea0000     0  1555      1 0x00000080
[  360.489460] Call Trace:
[  360.490507]  [<ffffffff9018d309>] schedule_preempt_disabled+0x29/0x70
[  360.492432]  [<ffffffff9018b457>] __mutex_lock_slowpath+0xc7/0x1d0
[  360.494284]  [<ffffffff9018a82f>] mutex_lock+0x1f/0x2f
[  360.495913]  [<ffffffffc0b8ff9e>] ll_layout_refresh+0x1ee/0x910 [lustre]
[  360.497887]  [<ffffffffc0545a59>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[  360.499951]  [<ffffffffc0bd81df>] vvp_io_init+0x34f/0x490 [lustre]
[  360.501812]  [<ffffffffc0566803>] ? lu_context_init+0xd3/0x1f0 [obdclass]
[  360.503857]  [<ffffffffc056693a>] ? lu_env_init+0x1a/0x30 [obdclass]
[  360.505747]  [<ffffffffc057725b>] cl_io_init0.isra.14+0x8b/0x160 [obdclass]
[  360.507767]  [<ffffffffc05773f3>] cl_io_init+0x43/0x80 [obdclass]
[  360.509599]  [<ffffffffc0bcda2e>] cl_setattr_ost+0x14e/0x3e0 [lustre]
[  360.511507]  [<ffffffffc0ba5978>] ll_setattr_raw+0xd58/0x10d0 [lustre]
[  360.513432]  [<ffffffffc0bb0978>] ? ll_stats_ops_tally+0x98/0x100 [lustre]
[  360.515433]  [<ffffffffc0ba5d53>] ll_setattr+0x63/0xc0 [lustre]
[  360.517218]  [<ffffffff8fc6e1ac>] notify_change+0x30c/0x4d0
[  360.518923]  [<ffffffff8fc4c2c5>] do_truncate+0x75/0xc0
[  360.520542]  [<ffffffff8fc514e8>] ? __sb_start_write+0x58/0x120
[  360.522326]  [<ffffffff8fc4c6e9>] do_sys_ftruncate.constprop.14+0x139/0x1a0
[  360.524853]  [<ffffffff8fc4c78e>] SyS_ftruncate+0xe/0x10
[  360.527127]  [<ffffffff90199f92>] system_call_fastpath+0x25/0x2a
[  360.529619] INFO: task dd:1558 blocked for more than 120 seconds.
[  360.532135] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  360.535914] dd              D ffff8c3578473180     0  1558      1 0x00000080
[  360.539991] Call Trace:
[  360.541984]  [<ffffffff9018d309>] schedule_preempt_disabled+0x29/0x70
[  360.545418]  [<ffffffff9018b457>] __mutex_lock_slowpath+0xc7/0x1d0
[  360.548747]  [<ffffffff9018a82f>] mutex_lock+0x1f/0x2f
[  360.551666]  [<ffffffffc0b8ff9e>] ll_layout_refresh+0x1ee/0x910 [lustre]
[  360.555106]  [<ffffffffc0545a59>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[  360.558524]  [<ffffffffc0bd81df>] vvp_io_init+0x34f/0x490 [lustre]
[  360.561638]  [<ffffffffc0566803>] ? lu_context_init+0xd3/0x1f0 [obdclass]
[  360.564847]  [<ffffffffc056693a>] ? lu_env_init+0x1a/0x30 [obdclass]
[  360.567832]  [<ffffffffc057725b>] cl_io_init0.isra.14+0x8b/0x160 [obdclass]
[  360.571007]  [<ffffffffc05773f3>] cl_io_init+0x43/0x80 [obdclass]
[  360.574113]  [<ffffffffc0bcda2e>] cl_setattr_ost+0x14e/0x3e0 [lustre]
[  360.576943]  [<ffffffffc0ba5978>] ll_setattr_raw+0xd58/0x10d0 [lustre]
[  360.579788]  [<ffffffffc0bb0978>] ? ll_stats_ops_tally+0x98/0x100 [lustre]
[  360.582826]  [<ffffffffc0ba5d53>] ll_setattr+0x63/0xc0 [lustre]
[  360.585399]  [<ffffffff8fc6e1ac>] notify_change+0x30c/0x4d0
[  360.587813]  [<ffffffff8fc4c2c5>] do_truncate+0x75/0xc0
[  360.590117]  [<ffffffff8fc514e8>] ? __sb_start_write+0x58/0x120
[  360.592621]  [<ffffffff8fc4c6e9>] do_sys_ftruncate.constprop.14+0x139/0x1a0
[  360.595400]  [<ffffffff8fc4c78e>] SyS_ftruncate+0xe/0x10
[  360.597612]  [<ffffffff90199f92>] system_call_fastpath+0x25/0x2a
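
The traces above are long and mostly repetitive, so a small helper can make the pattern easier to see. The following is a minimal sketch, not part of the attached reproducer: it assumes the hung-task output has been saved as text, and the function names (parse_hung_tasks, group_by_syscall) are hypothetical. It drops the unreliable "?" frames and groups the blocked tasks by the syscall they entered through.

```python
import re
from collections import defaultdict

# Matches the hung-task header, e.g. "INFO: task dd:1542 blocked for more than 120 seconds."
TASK_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than")
# Matches one stack frame, e.g. "[<ffffffff9018a82f>] mutex_lock+0x1f/0x2f";
# group 1 is set when the frame carries the "?" (unreliable) marker.
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def parse_hung_tasks(log_text):
    """Return {pid: [reliable frame names, innermost first]}."""
    tasks = {}
    current = None
    for line in log_text.splitlines():
        header = TASK_RE.search(line)
        if header:
            current = int(header.group(2))
            tasks[current] = []
            continue
        frame = FRAME_RE.search(line)
        # Keep only frames that belong to a task and are not marked "?".
        if frame and current is not None and not frame.group(1):
            tasks[current].append(frame.group(2))
    return tasks


def group_by_syscall(tasks):
    """Group PIDs by the first SyS_* frame found in each stack."""
    groups = defaultdict(list)
    for pid, frames in tasks.items():
        entry = next((f for f in frames if f.startswith("SyS_")), "unknown")
        groups[entry].append(pid)
    return dict(groups)
```

Applied to the trace above (for example via group_by_syscall(parse_hung_tasks(open("dmesg.txt").read())), where dmesg.txt is an assumed file name), this puts dd:1542 under SyS_write, reached through ll_fsync from vvp_io_write_start, and the other nine tasks under SyS_ftruncate. All of them are waiting in mutex_lock called from ll_layout_refresh.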



 Comments   
Comment by Li Xi [ 15/Nov/22 ]

With the attached script reproduce_sync_stuck.sh, the problem is very easy to reproduce.

 

Comment by Li Xi [ 15/Nov/22 ]

This looks similar to LU-14877, but it is not the same issue.

Comment by Li Xi [ 15/Nov/22 ]

The reproducer script can also reproduce LU-14877.

Comment by Gerrit Updater [ 15/Nov/22 ]

"Zhenyu Xu <bobijam@hotmail.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49164
Subject: LU-16315 llite: invoke sync after write IO
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 02fe7dfad2de0352bd34801b18d1060f6c567eac
