  1. Lustre
  2. LU-5042

Recovery Lock Replay

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version: Lustre 2.7.0
    • Affects Version: Lustre 2.4.3
    • 3
    • 13935

    Description

      While performing load testing on one of our filesystems this week, we power cycled the OSSs to test recovery. To my surprise, the OSS took several hours to complete recovery, and the vast majority of that time was spent in the lock replay stage.

      What I know for certain is that the OST had roughly 500,000 locks outstanding before it was power cycled. When it came back up, all the clients reconnected to it properly and seem to have decided to replay all of their locks, used and unused. I thought we fixed this years ago, so I verified that the tunables were set such that we shouldn't replay unused locks. They appeared to be set properly, but those 500,000 locks were still resent to the OST.
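
      A minimal sketch of how a lock total like this could be tallied on the server from the output of `lctl get_param ldlm.namespaces.*.lock_count`. The `name=value` line format is an assumption about the local Lustre version, and the namespace names and counts in the sample are hypothetical, not taken from this report.

```python
import re

def total_lock_count(get_param_output: str) -> int:
    """Sum per-namespace lock counts from `lctl get_param` style output.

    Each line is assumed to look like
    ldlm.namespaces.<namespace>.lock_count=<integer>
    Lines that do not match this shape are ignored.
    """
    pattern = re.compile(r"^ldlm\.namespaces\.(\S+)\.lock_count=(\d+)$")
    total = 0
    for line in get_param_output.splitlines():
        match = pattern.match(line.strip())
        if match:
            total += int(match.group(2))
    return total

# Hypothetical sample output; namespace names and counts are made up.
sample = """\
ldlm.namespaces.filter-testfs-OST0000_UUID.lock_count=498750
ldlm.namespaces.filter-testfs-OST0001_UUID.lock_count=1250
"""
print(total_lock_count(sample))
```

      Capturing this number before and after the power cycle would show whether the clients replayed more locks than they actually had in use.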

      After the recovery timer dropped to zero and I didn't quickly see a recovery-complete message, I dumped some stacks from the OST. They showed that the tgt_recov thread was in stage two, sequentially replaying all of those 500,000 locks. Because this was being done sequentially from a single thread, the disk was hardly working and the system looked idle.
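
      A back-of-envelope sketch of why a single sequential thread turns into hours: total time is just lock count times per-lock latency. The 3-hour figure is an assumed stand-in for "several hours", and the worker count is hypothetical. The parallel number is an idealized bound for comparison only, not a description of how Lustre recovery is implemented.

```python
def per_lock_latency_ms(lock_count: int, total_hours: float) -> float:
    """Average time spent per lock if replay is strictly sequential."""
    return total_hours * 3600.0 * 1000.0 / lock_count

def ideal_parallel_hours(lock_count: int, latency_ms: float, workers: int) -> float:
    """Idealized wall-clock time if the same work were split evenly
    across `workers` threads with no contention."""
    return lock_count * latency_ms / workers / 1000.0 / 3600.0

LOCKS = 500_000        # from the report
ASSUMED_HOURS = 3.0    # assumption standing in for "several hours"
ASSUMED_WORKERS = 16   # hypothetical thread count

latency = per_lock_latency_ms(LOCKS, ASSUMED_HOURS)
print(f"per-lock latency: {latency:.1f} ms")
print(f"ideal time with {ASSUMED_WORKERS} workers: "
      f"{ideal_parallel_hours(LOCKS, latency, ASSUMED_WORKERS):.2f} h")
```

      Under these assumptions each lock costs on the order of tens of milliseconds, which fits a thread that waits on one small operation at a time while the disk sits mostly idle.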

      This exact behavior has been reported on our production machines, and I can easily understand why an administrator might think the system was hung or deadlocked and give up on it. Basically, the recovery timer drops to zero and then recovery doesn't actually complete for several hours.

      You should be able to reproduce this fairly easily on any test system. Just ensure your server has a large number of locks enqueued and then power cycle it.
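
      One possible way to build up a large number of locks before the power cycle is to write a small amount of data to many files from a client. This is only a sketch: it assumes each written file leaves the client holding a cached lock on the corresponding OST object and that the client's lock LRU settings allow that many locks to stay cached. The function name, file naming scheme, and directory are hypothetical.

```python
import os
import tempfile

def populate(directory: str, file_count: int) -> int:
    """Create `file_count` one-byte files under `directory`.

    Returns the number of files written. On a Lustre client the
    directory would be a path inside the mounted filesystem.
    """
    os.makedirs(directory, exist_ok=True)
    created = 0
    for i in range(file_count):
        path = os.path.join(directory, f"lockfile.{i:07d}")
        with open(path, "wb") as f:
            f.write(b"x")  # a single byte is enough to touch the object
        created += 1
    return created

# Small local demonstration in a temporary directory; a real run
# would point at the Lustre mount and use a much larger count.
with tempfile.TemporaryDirectory() as demo_dir:
    print(populate(demo_dir, 10))
```

      Checking the server-side lock count afterwards confirms whether the locks actually accumulated before the server is power cycled.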

          People

            Assignee: Bruno Faccini (Inactive)
            Reporter: Brian Behlendorf
            Votes: 0
            Watchers: 10
