Lustre / LU-5724

IR recovery doesn't behave properly with Lustre 2.5


    • Type: Bug
    • Resolution: Duplicate
    • Priority: Critical
    • None
    • Affects Version/s: Lustre 2.5.3
    • Environment: MDS server running RHEL 6.5 with the ORNL 2.5.3 branch plus about 12 patches.
    • 2
    • 16076

      Today we experienced a hardware failure with our MDS. The MDS rebooted and then came back. We restarted the MDS, but Imperative Recovery (IR) behaved strangely. Four clients were evicted, but when the recovery countdown timer reached zero, IR restarted all over again. Then, once it got to the 700-second range, the timer started to go up. It did this a few times before finally letting the timer run out. Once the timer did reach zero, the recovery state was still reported as being in recovery. It remained this way for several more minutes before finally reaching a recovered state. In all, it took 54 minutes to recover.

            Assignee: Hongchao Zhang (hongchao.zhang)
            Reporter: James A Simmons (simmonsja)
            Votes: 0
            Watchers: 16
