  1. Lustre
  2. LU-17540

sync and delay before LBUG() calls panic()

Details

    • Improvement
    • Resolution: Fixed
    • Minor

    Description

      It would be useful to have a pause of a few seconds between the LBUG() call and the resulting panic(), so that the stack trace can be written to the serial console and, ideally, also to /var/log/messages when no serial console is available.

      The code currently calls panic() immediately after dumping the stack:

      void lbug_with_loc(struct libcfs_debug_msg_data *msgdata)
      {
              libcfs_catastrophe = 1;
              libcfs_debug_msg(msgdata, "LBUG\n");

              /* cannot sleep or dump logs safely in interrupt context */
              if (in_interrupt()) {
                      panic("LBUG in interrupt.\n");
                      /* not reached */
              }

              libcfs_debug_dumpstack(NULL);
              if (libcfs_panic_on_lbug)
                      panic("LBUG");  /* immediate, no time to flush the trace */
              else
                      libcfs_debug_dumplog();
              /* park the calling thread forever */
              set_current_state(TASK_UNINTERRUPTIBLE);
              while (1)
                      schedule();
      }
      

      It would be reasonable to let the libcfs_panic_on_lbug parameter store the number of seconds (or milliseconds?) to delay before calling panic(), probably using mdelay() to busy-wait rather than msleep(), which sleeps and lets the thread be scheduled out. In the meantime, a task could be dispatched to a work queue to try to sync and flush whatever can be written during this delay (if the system is not locked up), equivalent to "sysrq-w" and "sysrq-s".
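
      The proposed control flow can be sketched as a small userspace model. This is not kernel code and not the eventual patch: the function names, the hook type, and the encoding of the parameter (0 means "do not panic", N > 0 means "panic after N seconds") are all assumptions made for illustration. The dispatch hook stands in for queueing the sync work, the panic hook stands in for panic(), and the wait spins on a monotonic clock without yielding, as mdelay() does in the kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical hook type: stand-in for "queue sync work" and for panic(). */
typedef void (*lbug_hook_t)(void);

/*
 * Assumed encoding (not confirmed by the ticket):
 * 0 = do not panic, N > 0 = panic after N seconds.
 */
static uint64_t lbug_panic_delay_ms(unsigned int panic_on_lbug)
{
	return (uint64_t)panic_on_lbug * 1000u;
}

/* Current monotonic time in milliseconds. */
static uint64_t now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000u + (uint64_t)ts.tv_nsec / 1000000u;
}

/* Spin until the deadline passes, never yielding voluntarily. */
static void busy_wait_ms(uint64_t ms)
{
	uint64_t deadline = now_ms() + ms;

	while (now_ms() < deadline)
		;
}

/*
 * Model of the tail of lbug_with_loc(): kick off the sync, wait out the
 * delay, then panic. Returns true if the panic path was taken.
 */
static bool lbug_model(unsigned int panic_on_lbug,
		       lbug_hook_t dispatch_sync, lbug_hook_t do_panic)
{
	if (panic_on_lbug == 0)
		return false;	/* caller would dump the debug log instead */

	dispatch_sync();	/* expected to return immediately */
	busy_wait_ms(lbug_panic_delay_ms(panic_on_lbug));
	do_panic();
	return true;
}

/* Demo hooks that only record the order in which they were called. */
static int lbug_calls, lbug_sync_seq, lbug_panic_seq;
static void demo_dispatch_sync(void) { lbug_sync_seq = ++lbug_calls; }
static void demo_panic(void) { lbug_panic_seq = ++lbug_calls; }
```

      Calling lbug_model(3, demo_dispatch_sync, demo_panic) would record the sync dispatch first, spin for about three seconds, and only then reach the panic hook. In a real kernel implementation the dispatch step would have to be asynchronous, since a sync that blocks on a wedged filesystem must not prevent the panic.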

            People

              yujian Jian Yu
              adilger Andreas Dilger