Details
- Bug
- Resolution: Fixed
- Minor
Description
Test replay-dual/26 timed out with the following stack trace:
[Tue Sep 6 05:21:32 2022] task:ptlrpcd_00_01 state:I stack: 0 pid: 8026 ppid: 2 flags:0x80004080
[Tue Sep 6 05:21:32 2022] Call Trace:
[Tue Sep 6 05:21:32 2022] __schedule+0x2bd/0x760
[Tue Sep 6 05:21:32 2022] schedule+0x37/0xa0
[Tue Sep 6 05:21:32 2022] osc_extent_wait+0x44d/0x560 [osc]
[Tue Sep 6 05:21:32 2022] ? finish_wait+0x80/0x80
[Tue Sep 6 05:21:32 2022] osc_cache_wait_range+0x2b8/0x930 [osc]
[Tue Sep 6 05:21:32 2022] osc_io_fsync_end+0x67/0x80 [osc]
[Tue Sep 6 05:21:32 2022] cl_io_end+0x58/0x130 [obdclass]
[Tue Sep 6 05:21:32 2022] lov_io_end_wrapper+0xcf/0xe0 [lov]
[Tue Sep 6 05:21:32 2022] lov_io_fsync_end+0x6f/0x1c0 [lov]
[Tue Sep 6 05:21:32 2022] cl_io_end+0x58/0x130 [obdclass]
[Tue Sep 6 05:21:32 2022] cl_io_loop+0xa7/0x200 [obdclass]
[Tue Sep 6 05:21:32 2022] cl_sync_file_range+0x2c9/0x340 [lustre]
[Tue Sep 6 05:21:32 2022] vvp_prune+0x5d/0x1e0 [lustre]
[Tue Sep 6 05:21:32 2022] cl_object_prune+0x58/0x130 [obdclass]
[Tue Sep 6 05:21:32 2022] lov_layout_change.isra.47+0x1ba/0x640 [lov]
[Tue Sep 6 05:21:32 2022] lov_conf_set+0x38d/0x4e0 [lov]
[Tue Sep 6 05:21:32 2022] cl_conf_set+0x60/0x140 [obdclass]
[Tue Sep 6 05:21:32 2022] cl_file_inode_init+0xc8/0x380 [lustre]
[Tue Sep 6 05:21:32 2022] ll_update_inode+0x432/0x6e0 [lustre]
[Tue Sep 6 05:21:32 2022] ll_iget+0x227/0x320 [lustre]
[Tue Sep 6 05:21:32 2022] ll_prep_inode+0x344/0xb60 [lustre]
[Tue Sep 6 05:21:32 2022] ll_statahead_interpret_common.isra.26+0x69/0x830 [lustre]
[Tue Sep 6 05:21:32 2022] ll_statahead_interpret+0x2c8/0x5b0 [lustre]
[Tue Sep 6 05:21:32 2022] mdc_intent_getattr_async_interpret+0x14a/0x3e0 [mdc]
[Tue Sep 6 05:21:32 2022] ptlrpc_check_set+0x5b8/0x1fe0 [ptlrpc]
[Tue Sep 6 05:21:32 2022] ptlrpcd+0x6c6/0xa50 [ptlrpc]
[Tue Sep 6 05:21:32 2022] ? do_wait_intr_irq+0xb0/0xb0
[Tue Sep 6 05:21:32 2022] ? ptlrpcd_add_req+0x2f0/0x2f0 [ptlrpc]
[Tue Sep 6 05:21:32 2022] kthread+0x116/0x130
[Tue Sep 6 05:21:32 2022] ? kthread_flush_work_fn+0x10/0x10
[Tue Sep 6 05:21:32 2022] ret_from_fork+0x35/0x40
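The reason is that a layout change on a regular file waits for a file range sync (cl_sync_file_range(), reached through vvp_prune()), and here that wait runs inside the ptlrpcd interpret callback, via ll_statahead_interpret() and ll_prep_inode(). Blocking the ptlrpcd interpret callback context for a long time is dangerous.
The solution is to use a work queue so that the ll_prep_inode() call runs in a separate thread.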
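The hand-off described above can be sketched in userspace C with pthreads. This is only an analogue of the idea, not the actual Lustre patch: the names work_queue, wq_start(), wq_queue(), wq_stop(), prep_inode_work() and interpret_callback() are all hypothetical, and a real kernel fix would presumably use the kernel's own workqueue facilities rather than a hand-rolled queue.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical work item: stands in for a deferred ll_prep_inode() call. */
struct work_item {
	void (*fn)(void *arg);
	void *arg;
	struct work_item *next;
};

/* Hypothetical single-worker queue (userspace analogue of a work queue). */
struct work_queue {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	struct work_item *head, *tail;
	bool stopping;
	pthread_t worker;
};

/* Worker thread: the only place where potentially blocking work runs. */
static void *worker_main(void *data)
{
	struct work_queue *wq = data;

	for (;;) {
		pthread_mutex_lock(&wq->lock);
		while (!wq->head && !wq->stopping)
			pthread_cond_wait(&wq->cond, &wq->lock);
		struct work_item *item = wq->head;
		if (!item) {		/* stopping and queue drained */
			pthread_mutex_unlock(&wq->lock);
			return NULL;
		}
		wq->head = item->next;
		if (!wq->head)
			wq->tail = NULL;
		pthread_mutex_unlock(&wq->lock);

		item->fn(item->arg);	/* may block; only this thread stalls */
		free(item);
	}
}

int wq_start(struct work_queue *wq)
{
	pthread_mutex_init(&wq->lock, NULL);
	pthread_cond_init(&wq->cond, NULL);
	wq->head = wq->tail = NULL;
	wq->stopping = false;
	return pthread_create(&wq->worker, NULL, worker_main, wq);
}

/* Enqueue work and return immediately; never waits for the work itself. */
int wq_queue(struct work_queue *wq, void (*fn)(void *), void *arg)
{
	struct work_item *item = malloc(sizeof(*item));

	if (!item)
		return -1;
	item->fn = fn;
	item->arg = arg;
	item->next = NULL;

	pthread_mutex_lock(&wq->lock);
	if (wq->tail)
		wq->tail->next = item;
	else
		wq->head = item;
	wq->tail = item;
	pthread_cond_signal(&wq->cond);
	pthread_mutex_unlock(&wq->lock);
	return 0;
}

/* Let the worker drain the queue, then join it. */
void wq_stop(struct work_queue *wq)
{
	pthread_mutex_lock(&wq->lock);
	wq->stopping = true;
	pthread_cond_signal(&wq->cond);
	pthread_mutex_unlock(&wq->lock);
	pthread_join(wq->worker, NULL);
	assert(wq->head == NULL);	/* worker exits only once drained */
	pthread_mutex_destroy(&wq->lock);
	pthread_cond_destroy(&wq->cond);
}

/* Hypothetical stand-in for the blocking inode preparation. */
void prep_inode_work(void *arg)
{
	int *done = arg;

	(*done)++;
}

/* Hypothetical interpret callback: defers the work instead of running it. */
int interpret_callback(struct work_queue *wq, int *done)
{
	return wq_queue(wq, prep_inode_work, done);
}
```

In this sketch the callback only pays for an allocation and a short lock hold, so a slow sync stalls the worker thread rather than the thread that processes RPC replies. The caller must keep the argument alive until the work has run, which is why wq_stop() drains the queue before joining the worker.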
Issue Links
- is related to: LU-14139 batched statahead processing (Resolved)