Lustre / LU-16214

Minimize dropping kfilnd messages at target due to stale peer

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.16.0
    • Affects Version/s: None
    • Labels: None
    • Severity: 3

    Description

      When LNet is restarted on a target, the kfilnd peer is marked as stale. An incoming message from the initiator peer is then silently dropped, and a peer handshake exchange is started.
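
      The current behaviour described above can be modelled roughly as follows. This is a simplified sketch for discussion, not the kfilnd source: the names peer_state, peer, target_receive and handshake_done are hypothetical, and the real driver tracks considerably more state.

```c
#include <stdbool.h>

/* Hypothetical peer states; these are NOT the real kfilnd definitions. */
enum peer_state { PEER_UPTODATE, PEER_STALE, PEER_HANDSHAKING };

struct peer {
	enum peer_state state;
	unsigned int dropped;	/* messages dropped while the peer was not up to date */
};

/*
 * Model of the receive path: returns true if the message is delivered,
 * false if it is dropped. The sender gets no notification of a drop.
 */
bool target_receive(struct peer *p)
{
	if (p->state == PEER_UPTODATE)
		return true;

	/* The first message seen for a stale peer starts the hello exchange. */
	if (p->state == PEER_STALE)
		p->state = PEER_HANDSHAKING;

	p->dropped++;
	return false;
}

/* Called when the hello exchange completes (assumed to always succeed here). */
void handshake_done(struct peer *p)
{
	p->state = PEER_UPTODATE;
}
```

      In this model every message that arrives before handshake_done() is lost, which is the cost this ticket aims to reduce.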

      While this works as designed, since kfilnd is connectionless, it is not optimal: kfilnd clients must rely on their error handling to detect that a message was dropped and then retry.

      The purpose of this story is to determine how to minimize the occurrence of dropped kfilnd messages.

      One thought is to have kfilnd proactively perform a hello handshake on send if no message has been received from a peer for some period of time, similar to what it does when it knows the peer is stale.
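
      A minimal sketch of that send-side check, under stated assumptions: the send_peer structure, the needs_hello_before_send() helper and the 30-second PEER_IDLE_TIMEOUT_SEC threshold are all hypothetical and chosen only for illustration. The ticket does not specify a timeout value or where such a check would live in kfilnd.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical idle threshold; the ticket only says "some period of time". */
#define PEER_IDLE_TIMEOUT_SEC 30

/* Hypothetical per-peer bookkeeping on the sending side. */
struct send_peer {
	bool stale;		/* peer is already known to be stale */
	int64_t last_rx_sec;	/* time the last message was received from it */
};

/*
 * Decide whether a hello handshake should precede the next send.
 * A known-stale peer always needs one (existing behaviour); a peer that
 * has been silent for at least the idle threshold is treated the same
 * way (the proposed behaviour).
 */
bool needs_hello_before_send(const struct send_peer *p, int64_t now_sec)
{
	if (p->stale)
		return true;

	return (now_sec - p->last_rx_sec) >= PEER_IDLE_TIMEOUT_SEC;
}
```

      The trade-off is an extra handshake round trip for peers that were merely idle, in exchange for fewer silently dropped messages when the peer really was restarted.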

          People

            Assignee: Chris Horn (hornc)
            Reporter: Chris Horn (hornc)