Description
It would be useful to have a pause of a few seconds between when LBUG() is called and when panic() is triggered, so that the stack trace can be written to the serial console, and ideally also to /var/log/messages if no serial console is available.
The code currently calls panic() immediately after dumping the stack:
lbug_with_loc(struct libcfs_debug_msg_data *msgdata)
{
	libcfs_catastrophe = 1;
	libcfs_debug_msg(msgdata, "LBUG\n");

	if (in_interrupt()) {
		panic("LBUG in interrupt.\n");
		/* not reached */
	}

	libcfs_debug_dumpstack(NULL);
	if (libcfs_panic_on_lbug)
		panic("LBUG");
	else
		libcfs_debug_dumplog();
	set_current_state(TASK_UNINTERRUPTIBLE);
	while (1)
		schedule();
}
It would be reasonable to allow libcfs_panic_on_lbug to store the number of seconds (or milliseconds?) to delay before calling panic(), probably busy-waiting with mdelay() instead of being scheduled (msleep() would sleep and yield the CPU). In the meantime, a task could be dispatched to a work queue to try to sync and flush whatever can be written during this delay (if the system is not locked up), equivalent to "sysrq-w" and "sysrq-s".
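The proposed ordering can be sketched as a small user-space mock-up. This is only an illustration under assumptions, not a patch: panic_on_lbug_delay_sec, lbug_sketch() and every stub_*() function are hypothetical names that stand in for libcfs_panic_on_lbug, lbug_with_loc(), libcfs_debug_dumpstack(), a work-queue dispatch, mdelay(), panic() and libcfs_debug_dumplog(). The stubs only record the order in which they are called.

```c
#include <stdio.h>

/* Hypothetical tunable: 0 = do not panic, N > 0 = panic after N seconds. */
static unsigned int panic_on_lbug_delay_sec = 5;

/* Event log so the order of operations can be inspected. */
static char events[8][32];
static int nevents;
static unsigned int delayed_ms;

static void record(const char *what)
{
	if (nevents < 8)
		snprintf(events[nevents++], sizeof(events[0]), "%s", what);
}

/* User-space stand-ins for the kernel facilities (assumptions, not real APIs). */
static void stub_dumpstack(void) { record("dumpstack"); }
static void stub_queue_sync_work(void) { record("queue_sync"); }
static void stub_mdelay(unsigned int ms) { delayed_ms += ms; record("delay"); }
static void stub_panic(const char *msg) { (void)msg; record("panic"); }
static void stub_dumplog(void) { record("dumplog"); }

/* Sketch of the proposed flow: dump, queue a sync, wait, then panic. */
static void lbug_sketch(int in_irq)
{
	if (in_irq) {
		/* Interrupt context: panic immediately, as the current code does. */
		stub_panic("LBUG in interrupt.\n");
		return;	/* the real panic() does not return */
	}

	stub_dumpstack();
	if (panic_on_lbug_delay_sec > 0) {
		/* Give the console and a sync attempt time to make progress. */
		stub_queue_sync_work();
		stub_mdelay(panic_on_lbug_delay_sec * 1000);
		stub_panic("LBUG");
	} else {
		stub_dumplog();
	}
}
```

Calling lbug_sketch(0) with the default of 5 records dumpstack, queue_sync, delay, panic in that order. In the kernel, the tunable would presumably remain a module parameter, and whether a value of 1 should still mean "panic immediately" for backward compatibility is an open question.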
Issue Links
- is related to LU-16297 ptl_send_rpc() ASSERTION ( (at_max == 0) || imp->imp_state != LUSTRE_IMP_FULL || (imp->imp_msghdr_flags & MSGHDR_AT_SUPPORT) || !(imp->imp_connect_data.ocd_connect_flags & 0x1000000ULL) ) (Resolved)