  1. Lustre
  2. LU-9492

MDT reports passing recovery deadline prematurely

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: Lustre 2.8.0
    • Fix Version/s: None
    • Labels:
      None
    • Severity:
      3

      Description

      During MDT recovery, multiple console messages appear containing the phrase "Recovery already passed deadline MM:SS". The MM:SS value is the minutes and seconds remaining until the recovery deadline expires, not the time elapsed past it. This is confusing to system administrators. There are two issues to address here.

      1. The wording of the message seems to be incorrect.
      2. Even if the wording were correct, it is unclear why this message is emitted.

      The clarity of log messages pertaining to recovery is critically important, as that is a time when system administrators tend to watch the logs closely and they need to understand what is happening.

      May 10 09:17:07 zinc1 kernel: Lustre: lsh-MDT0000: Will be in recovery for at least 5:00, or until 2827 clients reconnect
      May 10 09:18:37 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 3:30. If you do not want to wait more, please abort the recovery by force.
      May 10 09:18:37 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 3:29. If you do not want to wait more, please abort the recovery by force.
      May 10 09:18:38 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 3:28. If you do not want to wait more, please abort the recovery by force.
      May 10 09:18:40 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 3:26. If you do not want to wait more, please abort the recovery by force.
      May 10 09:18:45 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 3:22. If you do not want to wait more, please abort the recovery by force.
      May 10 09:18:53 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 3:14. If you do not want to wait more, please abort the recovery by force.
      May 10 09:19:09 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 2:58. If you do not want to wait more, please abort the recovery by force.
      May 10 09:19:41 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 2:26. If you do not want to wait more, please abort the recovery by force.
      May 10 09:20:45 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 1:22. If you do not want to wait more, please abort the recovery by force.
      May 10 09:22:07 zinc1 kernel: Lustre: lsh-MDT0000: Recovery over after 5:01, of 2827 clients 2651 recovered and 0 were evicted.
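
The claim that MM:SS is the time remaining can be checked against the excerpt above. The following is a minimal sketch, not Lustre code: it assumes the deadline is the timestamp of the "Will be in recovery" message plus the announced 5:00, and the names `RECOVERY_START`, `RECOVERY_WINDOW`, `MESSAGES`, and `remaining_vs_reported` are made up for this illustration. The timestamps and MM:SS values are copied from the log lines.

```python
from datetime import datetime, timedelta

# Values copied from the console excerpt. Assumption: the deadline is the
# time of the "Will be in recovery for at least 5:00" message plus 5:00.
RECOVERY_START = "09:17:07"
RECOVERY_WINDOW = timedelta(minutes=5)

# (syslog time, MM:SS printed in "Recovery already passed deadline MM:SS")
MESSAGES = [
    ("09:18:37", "3:30"),
    ("09:18:37", "3:29"),
    ("09:18:38", "3:28"),
    ("09:18:40", "3:26"),
    ("09:18:45", "3:22"),
    ("09:18:53", "3:14"),
    ("09:19:09", "2:58"),
    ("09:19:41", "2:26"),
    ("09:20:45", "1:22"),
]


def parse_clock(text):
    """Parse an HH:MM:SS syslog time (the date is irrelevant here)."""
    return datetime.strptime(text, "%H:%M:%S")


def parse_mmss(text):
    """Parse the MM:SS value printed in the console message."""
    minutes, seconds = text.split(":")
    return timedelta(minutes=int(minutes), seconds=int(seconds))


def remaining_vs_reported():
    """Return (time, reported MM:SS, seconds remaining, difference in seconds)."""
    deadline = parse_clock(RECOVERY_START) + RECOVERY_WINDOW
    rows = []
    for stamp, reported in MESSAGES:
        remaining = deadline - parse_clock(stamp)
        diff = remaining - parse_mmss(reported)
        rows.append((stamp, reported,
                     int(remaining.total_seconds()),
                     int(diff.total_seconds())))
    return rows


if __name__ == "__main__":
    for stamp, reported, remaining, diff in remaining_vs_reported():
        print(f"{stamp}  reported {reported}  remaining {remaining}s  diff {diff}s")
```

Under that assumption, every reported value is within one second of the time still left before the deadline, which is consistent with the message being printed before the deadline rather than after it.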
      
      

            People

            • Assignee:
              emoly.liu Emoly Liu
            • Reporter:
              nedbass Ned Bass
