Details

    • Type: Technical task
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.13.0, Lustre 2.12.3

    Description

      Consider the following scenario:

      Lustre calls LNetPrimaryNID() with 0@lo.
      Discovery is initiated on 0@lo.
      The peer is created with 0@lo and <addr>@<net>.
      The interface health of the peer's <addr>@<net> is decremented.
      LNetPut() is issued to self.
      The selection algorithm selects 0@lo to send to.

      This exposes an issue where we try to go through the peer credit management algorithm, but because there are no credits associated with 0@lo, the message is queued indefinitely. ptlrpc then gets stuck waiting for send completion on the message.
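
      The failure mode can be illustrated with a small toy model in C. This is only a sketch: every name in it (toy_peer_ni, toy_select, toy_send, the bypass_lo flag) is hypothetical and is not an LNet symbol, and the loopback bypass is an assumed mitigation for illustration, not necessarily what the landed patch does. The model picks the healthiest peer interface, then applies a credit check before sending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model; none of these names are real LNet symbols. */
struct toy_peer_ni {
    const char *nid;   /* e.g. "0@lo" or "<addr>@<net>" */
    int health;        /* higher means healthier */
    int tx_credits;    /* peer credits available for sending */
    int queued;        /* messages parked while waiting for a credit */
};

enum toy_send_result { TOY_SENT, TOY_QUEUED };

static bool toy_is_loopback(const struct toy_peer_ni *ni)
{
    return strcmp(ni->nid, "0@lo") == 0;
}

/* Pick the interface with the highest health; the first entry wins ties. */
static struct toy_peer_ni *toy_select(struct toy_peer_ni *nis, size_t n)
{
    struct toy_peer_ni *best = NULL;

    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        if (best == NULL || nis[i].health > best->health)
            best = &nis[i];
    }
    return best;
}

/*
 * Send through the credit check. With bypass_lo == false, a loopback
 * interface that has no credits parks the message and nothing ever
 * returns a credit to release it. With bypass_lo == true (an assumed
 * mitigation), loopback traffic skips credit accounting entirely.
 */
static enum toy_send_result toy_send(struct toy_peer_ni *ni, bool bypass_lo)
{
    if (bypass_lo && toy_is_loopback(ni))
        return TOY_SENT;

    if (ni->tx_credits <= 0) {
        ni->queued++;
        return TOY_QUEUED;
    }

    ni->tx_credits--;
    return TOY_SENT;
}
```

      In this model, once the health of <addr>@<net> drops below that of 0@lo, toy_select() returns the loopback entry, and toy_send() without the bypass queues the message with no path to completion, which mirrors the stuck ptlrpc send described above.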

      The issue was exposed by conf-sanity test 32.

       


        Activity

          [LU-12339] LNet Health: selecting loopback interface for sending
          pjones Peter Jones made changes -
          Fix Version/s New: Lustre 2.12.3 [ 14418 ]
          jgmitter Joseph Gmitter (Inactive) made changes -
          Fix Version/s New: Lustre 2.13.0 [ 14290 ]
          Resolution New: Fixed [ 1 ]
          Status Original: Open [ 1 ] New: Resolved [ 5 ]
          ashehata Amir Shehata (Inactive) created issue -

          People

            ashehata Amir Shehata (Inactive)
            ashehata Amir Shehata (Inactive)