Details
- Bug
- Resolution: Fixed
- Minor
- Lustre 2.17.0
- 3
- 9223372036854775807
Description
This issue was created by maloo for jianyu <yujian@whamcloud.com>
This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/999f320c-4a63-4a47-a9fd-050795558019
Test session details:
clients: https://build.whamcloud.com/job/lustre-reviews/110246 - 5.14.0-427.42.1.el9_4.x86_64
servers: https://build.whamcloud.com/job/lustre-reviews/110246 - 5.14.0-427.42.1_lustre.el9.x86_64
obd_memory max: 129361747, obd_memory current: 124305179
obd_memory max: 129361747, obd_memory current: 124305179
obd_memory max: 129361747, obd_memory current: 124305179
obd_memory max: 129361747, obd_memory current: 124305179
Autotest: Test running for 200 minutes (lustre-reviews_full-dne-zfs-part-1_110246.1003)
INFO: task txg_sync:265572 blocked for more than 122 seconds.
Tainted: P OE ------- --- 5.14.0-427.42.1_lustre.el9.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:txg_sync state:D stack:0 pid:265572 ppid:2 flags:0x00004000
Call Trace:
<TASK>
__schedule+0x21b/0x550
? _raw_spin_unlock_irqrestore+0xa/0x30
schedule+0x2d/0x70
schedule_timeout+0x88/0x160
? __pfx_process_timeout+0x10/0x10
? prepare_to_wait_exclusive+0x4f/0xb0
io_schedule_timeout+0x4c/0x80
__cv_timedwait_common+0x12d/0x170 [spl]
? __pfx_autoremove_wake_function+0x10/0x10
__cv_timedwait_io+0x15/0x20 [spl]
zio_wait+0x130/0x290 [zfs]
dsl_pool_sync+0xfe/0x530 [zfs]
spa_sync_iterate_to_convergence+0xf0/0x300 [zfs]
? _raw_spin_unlock_irqrestore+0xa/0x30
spa_sync+0x48d/0x980 [zfs]
txg_sync_thread+0x204/0x3a0 [zfs]
? __pfx_txg_sync_thread+0x10/0x10 [zfs]
? __pfx_thread_generic_wrapper+0x10/0x10 [spl]
thread_generic_wrapper+0x59/0x70 [spl]
kthread+0xe0/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2c/0x50
</TASK>
INFO: task lctl:292876 blocked for more than 122 seconds.
Tainted: P OE ------- --- 5.14.0-427.42.1_lustre.el9.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:lctl state:D stack:0 pid:292876 ppid:292869 flags:0x00004002
Call Trace:
<TASK>
__schedule+0x21b/0x550
schedule+0x2d/0x70
io_schedule+0x42/0x70
cv_wait_common+0xaa/0x130 [spl]
? __pfx_autoremove_wake_function+0x10/0x10
txg_wait_synced_impl+0xcb/0x110 [zfs]
txg_wait_synced+0xc/0x40 [zfs]
dmu_tx_wait+0x20d/0x410 [zfs]
? dmu_tx_try_assign+0x29b/0x360 [zfs]
dmu_tx_assign+0x14e/0x190 [zfs]
osd_trans_start+0xbf/0x460 [osd_zfs]
ofd_commitrw_write+0x482/0x1220 [ofd]
ofd_commitrw+0x6bd/0xc50 [ofd]
? obd_commitrw.constprop.0+0x1c8/0x3a0 [obdecho]
obd_commitrw.constprop.0+0x1c8/0x3a0 [obdecho]
echo_client_prep_commit.constprop.0+0x3ea/0x9c0 [obdecho]
? echo_client_brw_ioctl+0x222/0x2d0 [obdecho]
echo_client_brw_ioctl+0x222/0x2d0 [obdecho]
echo_client_iocontrol+0x534/0x10f0 [obdecho]
obd_iocontrol.constprop.0+0x190/0x360 [obdclass]
class_handle_ioctl+0xc9a/0x1640 [obdclass]
? switch_fpu_return+0x4c/0xd0
? exit_to_user_mode_prepare+0xec/0x100
? __pfx_bpf_lsm_capable+0x10/0x10
? security_capable+0x36/0x60
obd_class_ioctl+0x155/0x1a0 [obdclass]
__x64_sys_ioctl+0x8a/0xc0
do_syscall_64+0x5c/0x90
? do_syscall_64+0x69/0x90
? exc_page_fault+0x62/0x150
entry_SYSCALL_64_after_hwframe+0x77/0xe1
RIP: 0033:0x7f6e63d0357b
RSP: 002b:00007ffd669e6518 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f6e63f03310 RCX: 00007f6e63d0357b
RDX: 00007ffd669e6840 RSI: 00000000c008667e RDI: 0000000000000003
RBP: 00000000c008667e R08: 0000000000000000 R09: 0000000000000000
R10: 00007f6e63c11d78 R11: 0000000000000246 R12: 00007ffd669e6840
R13: 0000559edcc0763f R14: 00000000c008667e R15: 0000000000000000
</TASK>
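In the two traces above, txg_sync is blocked in zio_wait and lctl is blocked in txg_wait_synced (reached through dmu_tx_assign from osd_trans_start). A minimal sketch for pulling that summary out of a console log like this one is shown below. The function name summarize_hung_tasks and the dictionary layout are hypothetical and are not part of Lustre, ZFS, or the autotest tooling. It assumes the log uses the same hung-task header and <TASK> markers as the excerpt in this ticket.

```python
import re

# Kernel hung-task header, e.g.
# "INFO: task txg_sync:265572 blocked for more than 122 seconds."
HUNG_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")

# A definite stack frame such as "zio_wait+0x130/0x290 [zfs]".
# Frames prefixed with "?" are unreliable guesses and do not match.
FRAME_RE = re.compile(
    r"^([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?$"
)


def summarize_hung_tasks(log_text):
    """Return one dict per hung task with its name, pid, and reliable frames."""
    tasks = []
    current = None
    in_trace = False
    for raw in log_text.splitlines():
        line = raw.strip()
        header = HUNG_RE.search(line)
        if header:
            current = {
                "comm": header.group(1),
                "pid": int(header.group(2)),
                "secs": int(header.group(3)),
                "frames": [],
            }
            tasks.append(current)
            in_trace = False
            continue
        if line == "<TASK>":
            in_trace = True
        elif line == "</TASK>":
            in_trace = False
        elif in_trace and current is not None:
            frame = FRAME_RE.match(line)
            if frame:
                # (function, module); module is None for core kernel frames
                current["frames"].append((frame.group(1), frame.group(2)))
    return tasks
```

Given the console log as a string, the first frame whose module is zfs, spl, or a Lustre module shows where each task is waiting, for example by scanning task["frames"] for the first entry with a non-None module.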