Description
It would be useful to have a pause of a few seconds between when LBUG() is called and when panic() is triggered, so that the stack trace can be written to the serial console, and ideally also to give it a chance to be written to /var/log/messages if no serial console is available.
The code currently calls panic() immediately after dumping the stack:
lbug_with_loc(struct libcfs_debug_msg_data *msgdata)
{
	libcfs_catastrophe = 1;
	libcfs_debug_msg(msgdata, "LBUG\n");

	if (in_interrupt()) {
		panic("LBUG in interrupt.\n");
		/* not reached */
	}

	libcfs_debug_dumpstack(NULL);
	if (libcfs_panic_on_lbug)
		panic("LBUG");
	else
		libcfs_debug_dumplog();

	set_current_state(TASK_UNINTERRUPTIBLE);
	while (1)
		schedule();
}
It would be reasonable to allow the libcfs_panic_on_lbug tunable to store the number of seconds (or milliseconds?) to delay before calling panic(). The delay should probably be a busy-wait (mdelay()) rather than msleep(), since msleep() sleeps and lets the task be scheduled out. In the meantime, a task could be dispatched to a work queue to try to sync and flush whatever can be written during this delay (if the system is not locked up), equivalent to "sysrq-w" and "sysrq-s".
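The proposal above could be modelled roughly as follows. This is a minimal userspace sketch, not libcfs code: `struct lbug_plan`, `lbug_plan_for()`, and `LBUG_MAX_DELAY_SEC` are hypothetical names, and the encoding of the tunable (0 = no panic, 1 = panic immediately as today, N > 1 = wait N seconds) is only an assumption about one way the existing value might be extended. It shows only the decision logic; in the kernel, the delay itself would be something like mdelay(), and the flush would be a work item queued before the delay starts.

```c
#include <stdbool.h>

/* Hypothetical upper bound on the delay, so a bad value cannot stall forever. */
#define LBUG_MAX_DELAY_SEC 60u

/* Hypothetical record of what lbug_with_loc() would do; not part of libcfs. */
struct lbug_plan {
	bool panic;            /* call panic() at the end? */
	bool flush;            /* queue a sync/flush work item before waiting? */
	unsigned int delay_ms; /* time to wait between stack dump and panic() */
};

/*
 * Assumed encoding of the tunable:
 *   0     = do not panic (dump the debug log and park the thread)
 *   1     = panic immediately (current behaviour)
 *   N > 1 = wait N seconds, capped at LBUG_MAX_DELAY_SEC, then panic
 */
static struct lbug_plan lbug_plan_for(unsigned int panic_on_lbug, bool in_irq)
{
	struct lbug_plan plan = { false, false, 0 };
	unsigned int delay_sec;

	/* In interrupt context nothing can be deferred: panic right away. */
	if (in_irq) {
		plan.panic = true;
		return plan;
	}

	if (panic_on_lbug == 0)
		return plan;

	plan.panic = true;
	if (panic_on_lbug > 1) {
		delay_sec = panic_on_lbug;
		if (delay_sec > LBUG_MAX_DELAY_SEC)
			delay_sec = LBUG_MAX_DELAY_SEC;
		plan.delay_ms = delay_sec * 1000u;
		plan.flush = true;
	}
	return plan;
}
```

Keeping 1 as "panic immediately" would preserve the behaviour of existing configurations, at the cost of making a one-second delay inexpressible; a separate delay tunable would avoid that ambiguity.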
Issue Links
- is related to
  LU-16297 ptl_send_rpc() ASSERTION ( (at_max == 0) || imp->imp_state != LUSTRE_IMP_FULL || (imp->imp_msghdr_flags & MSGHDR_AT_SUPPORT) || !(imp->imp_connect_data.ocd_connect_flags & 0x1000000ULL) ) (Resolved)