Details
Bug
Resolution: Fixed
Minor
Description
A client may get stuck in cl_object_put_last():
[465342.626191] INFO: task nwchem:108679 blocked for more than 120 seconds.
[465342.632934] Tainted: G OE --------- -t - 4.18.0-193.el8.x86_64 #1
[465342.640783] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[465342.648720] nwchem D 0 108679 108659 0x00004082
[465342.655115] Call Trace:
[465342.658245] ? __schedule+0x24f/0x650
[465342.662607] schedule+0x2f/0xa0
[465342.666446] cl_inode_fini+0x137/0x1e0 [lustre]
[465342.671705] ? wake_up_q+0x70/0x70
[465342.675813] ll_clear_inode+0x1b3/0x570 [lustre]
[465342.681197] ll_delete_inode+0x58/0x220 [lustre]
[465342.686571] evict+0xd2/0x1a0
[465342.690291] do_unlinkat+0x250/0x2e0
[465342.694604] do_syscall_64+0x5b/0x1a0
[465342.698910] entry_SYSCALL_64_after_hwframe+0x65/0xca
[465342.704672] RIP: 0033:0x7fd72cf373cb
[465342.708989] Code: Bad RIP value.
[465342.712929] RSP: 002b:00007ffe0b19b5d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000057
[465342.721236] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd72cf373cb
[465342.729119] RDX: 0000000000000010 RSI: 0000000000000000 RDI: 00007ffe0b19ba10
[465342.736982] RBP: 00007ffe0b19ba10 R08: 0000000000000000 R09: 0000000000000000
[465342.744820] R10: 0000000000000011 R11: 0000000000000246 R12: 0000000000000000
[465342.752638] R13: 00007ffe0b19dd70 R14: 00007ffe0b19beb0 R15: 00000000010a9e9d
The blocked task should be woken up by lu_object_free():
    if (waitqueue_active(wq))
        wake_up(wq);
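However, according to the documentation comment for waitqueue_active(), the waker needs either a full memory barrier (smp_mb()) before the check or must hold the wait-queue spinlock for the wake-up to be reliable. Without one of these, the waker can see an empty wait queue while the waiter has not yet seen the updated condition, so the wake-up is skipped and the waiter sleeps forever.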
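The ordering requirement can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the Lustre fix: `struct model_wq`, `waiter_thread()`, `waker()` and `demo_wakeup()` are hypothetical names, pthreads stands in for the kernel wait queue, and a C11 `atomic_thread_fence(memory_order_seq_cst)` stands in for `smp_mb()`. In kernel code the equivalent would presumably be an `smp_mb()` placed before `waitqueue_active()`, or a helper such as `wq_has_sleeper()` that issues the barrier itself.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-ins; none of these names come from Lustre. */
struct model_wq {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    atomic_int      waiters; /* models "wait queue is non-empty" */
    atomic_int      freed;   /* the condition the waiter sleeps on */
};

static void *waiter_thread(void *arg)
{
    struct model_wq *wq = arg;

    pthread_mutex_lock(&wq->lock);
    /* Register as a waiter BEFORE testing the condition. */
    atomic_store_explicit(&wq->waiters, 1, memory_order_relaxed);
    /* Full fence: orders the registration before the condition check. */
    atomic_thread_fence(memory_order_seq_cst);
    while (!atomic_load_explicit(&wq->freed, memory_order_relaxed))
        pthread_cond_wait(&wq->cond, &wq->lock);
    atomic_store_explicit(&wq->waiters, 0, memory_order_relaxed);
    pthread_mutex_unlock(&wq->lock);
    return NULL;
}

static void waker(struct model_wq *wq)
{
    /* Publish the condition first. */
    atomic_store_explicit(&wq->freed, 1, memory_order_relaxed);
    /*
     * Models smp_mb(): orders the condition store before the lockless
     * "any waiters?" check. Without it, this thread could read
     * waiters == 0 while the waiter still reads freed == 0.
     */
    atomic_thread_fence(memory_order_seq_cst);
    if (atomic_load_explicit(&wq->waiters, memory_order_relaxed)) {
        pthread_mutex_lock(&wq->lock);
        pthread_cond_broadcast(&wq->cond);
        pthread_mutex_unlock(&wq->lock);
    }
}

/* Runs one waiter/waker pair; returns 1 if the waiter finished cleanly. */
int demo_wakeup(void)
{
    struct model_wq wq;
    pthread_t tid;
    int ok;

    pthread_mutex_init(&wq.lock, NULL);
    pthread_cond_init(&wq.cond, NULL);
    atomic_init(&wq.waiters, 0);
    atomic_init(&wq.freed, 0);

    if (pthread_create(&tid, NULL, waiter_thread, &wq) != 0)
        return 0;
    waker(&wq);
    pthread_join(tid, NULL); /* would hang here if the wake-up were lost */

    ok = atomic_load(&wq.freed) == 1 && atomic_load(&wq.waiters) == 0;
    pthread_cond_destroy(&wq.cond);
    pthread_mutex_destroy(&wq.lock);
    return ok;
}
```

With both fences in place, at least one side is guaranteed to observe the other's store: either the waker sees the registered waiter and signals it, or the waiter sees the condition already set and never sleeps. Removing the fence in `waker()` reintroduces the window described above. The sketch is meant to be built with `-pthread`.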