Details
Bug
Resolution: Fixed
Minor
Lustre 2.17.0
3
9223372036854775807
Description
This issue was created by maloo for jianyu <yujian@whamcloud.com>
This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/999f320c-4a63-4a47-a9fd-050795558019
Test session details:
clients: https://build.whamcloud.com/job/lustre-reviews/110246 - 5.14.0-427.42.1.el9_4.x86_64
servers: https://build.whamcloud.com/job/lustre-reviews/110246 - 5.14.0-427.42.1_lustre.el9.x86_64
obd_memory max: 129361747, obd_memory current: 124305179
obd_memory max: 129361747, obd_memory current: 124305179
obd_memory max: 129361747, obd_memory current: 124305179
obd_memory max: 129361747, obd_memory current: 124305179
Autotest: Test running for 200 minutes (lustre-reviews_full-dne-zfs-part-1_110246.1003)
INFO: task txg_sync:265572 blocked for more than 122 seconds.
      Tainted: P OE ------- --- 5.14.0-427.42.1_lustre.el9.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:txg_sync state:D stack:0 pid:265572 ppid:2 flags:0x00004000
Call Trace:
 <TASK>
 __schedule+0x21b/0x550
 ? _raw_spin_unlock_irqrestore+0xa/0x30
 schedule+0x2d/0x70
 schedule_timeout+0x88/0x160
 ? __pfx_process_timeout+0x10/0x10
 ? prepare_to_wait_exclusive+0x4f/0xb0
 io_schedule_timeout+0x4c/0x80
 __cv_timedwait_common+0x12d/0x170 [spl]
 ? __pfx_autoremove_wake_function+0x10/0x10
 __cv_timedwait_io+0x15/0x20 [spl]
 zio_wait+0x130/0x290 [zfs]
 dsl_pool_sync+0xfe/0x530 [zfs]
 spa_sync_iterate_to_convergence+0xf0/0x300 [zfs]
 ? _raw_spin_unlock_irqrestore+0xa/0x30
 spa_sync+0x48d/0x980 [zfs]
 txg_sync_thread+0x204/0x3a0 [zfs]
 ? __pfx_txg_sync_thread+0x10/0x10 [zfs]
 ? __pfx_thread_generic_wrapper+0x10/0x10 [spl]
 thread_generic_wrapper+0x59/0x70 [spl]
 kthread+0xe0/0x100
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x2c/0x50
 </TASK>
INFO: task lctl:292876 blocked for more than 122 seconds.
      Tainted: P OE ------- --- 5.14.0-427.42.1_lustre.el9.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:lctl state:D stack:0 pid:292876 ppid:292869 flags:0x00004002
Call Trace:
 <TASK>
 __schedule+0x21b/0x550
 schedule+0x2d/0x70
 io_schedule+0x42/0x70
 cv_wait_common+0xaa/0x130 [spl]
 ? __pfx_autoremove_wake_function+0x10/0x10
 txg_wait_synced_impl+0xcb/0x110 [zfs]
 txg_wait_synced+0xc/0x40 [zfs]
 dmu_tx_wait+0x20d/0x410 [zfs]
 ? dmu_tx_try_assign+0x29b/0x360 [zfs]
 dmu_tx_assign+0x14e/0x190 [zfs]
 osd_trans_start+0xbf/0x460 [osd_zfs]
 ofd_commitrw_write+0x482/0x1220 [ofd]
 ofd_commitrw+0x6bd/0xc50 [ofd]
 ? obd_commitrw.constprop.0+0x1c8/0x3a0 [obdecho]
 obd_commitrw.constprop.0+0x1c8/0x3a0 [obdecho]
 echo_client_prep_commit.constprop.0+0x3ea/0x9c0 [obdecho]
 ? echo_client_brw_ioctl+0x222/0x2d0 [obdecho]
 echo_client_brw_ioctl+0x222/0x2d0 [obdecho]
 echo_client_iocontrol+0x534/0x10f0 [obdecho]
 obd_iocontrol.constprop.0+0x190/0x360 [obdclass]
 class_handle_ioctl+0xc9a/0x1640 [obdclass]
 ? switch_fpu_return+0x4c/0xd0
 ? exit_to_user_mode_prepare+0xec/0x100
 ? __pfx_bpf_lsm_capable+0x10/0x10
 ? security_capable+0x36/0x60
 obd_class_ioctl+0x155/0x1a0 [obdclass]
 __x64_sys_ioctl+0x8a/0xc0
 do_syscall_64+0x5c/0x90
 ? do_syscall_64+0x69/0x90
 ? exc_page_fault+0x62/0x150
 entry_SYSCALL_64_after_hwframe+0x77/0xe1
RIP: 0033:0x7f6e63d0357b
RSP: 002b:00007ffd669e6518 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f6e63f03310 RCX: 00007f6e63d0357b
RDX: 00007ffd669e6840 RSI: 00000000c008667e RDI: 0000000000000003
RBP: 00000000c008667e R08: 0000000000000000 R09: 0000000000000000
R10: 00007f6e63c11d78 R11: 0000000000000246 R12: 00007ffd669e6840
R13: 0000559edcc0763f R14: 00000000c008667e R15: 0000000000000000
 </TASK>