Lustre / LU-9492

MDT reports passing recovery deadline prematurely

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.8.0
    • None
    • 3
    • 9223372036854775807

    Description

      During MDT recovery, multiple console messages appear containing the phrase "Recovery already passed deadline MM:SS". The MM:SS value shows the minutes and seconds remaining until the recovery deadline expires, so the message claims the deadline has passed while it is still in the future. This is confusing to system administrators. There are two issues to address here.

      1. The wording of the message seems to be incorrect.
      2. Even if the wording were correct, it is unclear why this message is emitted.

      The clarity of log messages pertaining to recovery is critically important, as that is a time when system administrators tend to watch the logs closely and they need to understand what is happening.

      May 10 09:17:07 zinc1 kernel: Lustre: lsh-MDT0000: Will be in recovery for at least 5:00, or until 2827 clients reconnect
      May 10 09:18:37 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 3:30. If you do not want to wait more, please abort the recovery by force.
      May 10 09:18:37 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 3:29. If you do not want to wait more, please abort the recovery by force.
      May 10 09:18:38 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 3:28. If you do not want to wait more, please abort the recovery by force.
      May 10 09:18:40 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 3:26. If you do not want to wait more, please abort the recovery by force.
      May 10 09:18:45 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 3:22. If you do not want to wait more, please abort the recovery by force.
      May 10 09:18:53 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 3:14. If you do not want to wait more, please abort the recovery by force.
      May 10 09:19:09 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 2:58. If you do not want to wait more, please abort the recovery by force.
      May 10 09:19:41 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 2:26. If you do not want to wait more, please abort the recovery by force.
      May 10 09:20:45 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 1:22. If you do not want to wait more, please abort the recovery by force.
      May 10 09:22:07 zinc1 kernel: Lustre: lsh-MDT0000: Recovery over after 5:01, of 2827 clients 2651 recovered and 0 were evicted.
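
      The reading that MM:SS is time remaining can be cross-checked from the log itself: adding MM:SS to each message's timestamp should give the same wall-clock deadline every time. The Python sketch below does that for three of the lines above. The regular expression, the function name `implied_deadlines`, and the placeholder year are illustrative assumptions (syslog timestamps carry no year); nothing here is part of Lustre.

```python
import re
from datetime import datetime, timedelta

# Three of the console lines quoted above, copied verbatim.
LOG = """\
May 10 09:18:37 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 3:30. If you do not want to wait more, please abort the recovery by force.
May 10 09:19:09 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 2:58. If you do not want to wait more, please abort the recovery by force.
May 10 09:20:45 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 1:22. If you do not want to wait more, please abort the recovery by force.
"""

# Hypothetical pattern: syslog timestamp, then the MM:SS value in the message.
PATTERN = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) .*Recovery already passed deadline (\d+):(\d\d)\."
)

def implied_deadlines(log_text, year=2000):
    """Return timestamp + MM:SS for each matching line.

    `year` is an arbitrary placeholder; only the time of day matters here.
    """
    deadlines = []
    for line in log_text.splitlines():
        match = PATTERN.search(line.strip())
        if not match:
            continue
        stamp = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
        remaining = timedelta(minutes=int(match.group(2)), seconds=int(match.group(3)))
        deadlines.append(stamp + remaining)
    return deadlines

deadlines = implied_deadlines(LOG)
for d in deadlines:
    print(d.strftime("%H:%M:%S"))  # prints 09:22:07 for each of the three lines
```

      All three lines imply a deadline of 09:22:07, which matches the 09:17:07 start plus the announced 5:00 window. The deadline had therefore not passed when the messages were printed.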
      
      

          People

            emoly.liu Emoly Liu
            nedbass Ned Bass (Inactive)