  1. Lustre
  2. LU-15514

Do not wait for clients to start recovery if there are no clients.

Details

    • Improvement
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.15.0
    • None

    Description

      With the idle-disconnect code, the entire cluster can sit idle for some time. If all the servers restart during that period, recovery on the OSTs does not start because there are no client connections. The MDT connections to the OSTs are rejected because they are considered new connections.

      We need to either accept new MDT connections, similar to what we do when the MDT and OST are colocated on the same node, or start the recovery timer on the first such connection and then proceed with eviction when the timeout expires, allowing the MDTs to rejoin as the new clients they are.

      Failing to do this delays the entire cluster: when the idle-disconnected clients become active again, they have to wait for recovery to finish first, even if the server restart happened long ago.
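
      To make the two options concrete, here is a minimal toy model of the connection handling described above. It is not Lustre code: the class `OstRecovery`, its fields, and the `accept_new_mdt` switch are all hypothetical names chosen for illustration, and the real target recovery logic is considerably more involved. The sketch assumes that, today, the recovery timer only starts when a previously known client reconnects.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    ACCEPT = "accept"
    REJECT = "reject"


@dataclass
class OstRecovery:
    """Toy model of an OST target after a restart (all names hypothetical)."""
    known_clients: set          # peers that were connected before the restart
    recovery_timeout: float     # seconds until stale clients are evicted
    accept_new_mdt: bool        # True: option 1, False: option 2
    timer_started_at: Optional[float] = None
    recovering: bool = True

    def _expire(self, now: float) -> None:
        # Once the timer has run out, recovery ends and missing clients
        # are treated as evicted.
        if (self.recovering and self.timer_started_at is not None
                and now - self.timer_started_at >= self.recovery_timeout):
            self.recovering = False

    def handle_connect(self, peer: str, is_mdt: bool, now: float) -> Decision:
        self._expire(now)
        if not self.recovering:
            return Decision.ACCEPT          # normal operation
        if peer in self.known_clients:
            # Assumed current behaviour: a returning client starts the timer.
            if self.timer_started_at is None:
                self.timer_started_at = now
            return Decision.ACCEPT
        if is_mdt:
            if self.accept_new_mdt:
                # Option 1: let the new MDT in, as in the colocated case.
                return Decision.ACCEPT
            # Option 2: the first new MDT connection starts the timer,
            # so recovery can time out even with no clients around.
            if self.timer_started_at is None:
                self.timer_started_at = now
        # New connections are refused while recovery is in progress.
        return Decision.REJECT
```

      With option 2 in this model, an MDT that retries after `recovery_timeout` seconds is accepted as a new client, so recovery is already over by the time the idle-disconnected clients come back.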

          People

            Assignee: WC Triage (wc-triage)
            Reporter: Oleg Drokin (green)