
Wrong NI selection on asymmetric Multi-rail environment

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • None
    • None
    • 3

    Description

      If the sending node is MultiRail and the receiving node is non-MultiRail,
      the sending node always uses the same NI (even if that NI is broken, the broken NI is still used).
      This may be how MultiRail is specified, but a broken device should not be used.

          REMOTE  IB0              (non-MultiRail node)
                  ↑
                  x
                  |
           LOCAL  IB0      IB1     <- always uses IB0 (not in round-robin fashion)
                  failure    
      

      If the receiving node is non-MultiRail, we check whether its device is normal or out of service and reset the device in case of failure.

          Activity

            [LU-12291] Wrong NI selection on asymmetric Multi-rail environment

            ashehata Amir Shehata (Inactive) added a comment:

            We always stick with the same device because doing otherwise would confuse the non-MR peer. If the non-MR peer initiated the connection on a specific NID, it always expects communication from that same NID. If the MR node uses another NID, the peer will treat it as communication from a different node.

            Resetting the device on failure sounds interesting. How do you do that?
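
            To make the behavior discussed in this comment concrete, here is a minimal Python sketch of the selection rule as described: a non-MR peer is pinned to one source NID, while an MR peer gets round-robin over healthy NIs. This is not LNet code. The names `LocalNI`, `Peer`, and `Selector`, the NID strings, and the choice to pin to the first configured NI are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LocalNI:
    nid: str              # illustrative NID string, e.g. "10.0.0.1@o2ib"
    healthy: bool = True


@dataclass
class Peer:
    nid: str
    multi_rail: bool
    # Source NID the non-MR peer already associates with this node (hypothetical field).
    pinned_src_nid: Optional[str] = None


class Selector:
    """Toy model of source-NI selection; not the real LNet algorithm."""

    def __init__(self, nis: List[LocalNI]) -> None:
        self.nis = list(nis)
        self._rr = 0  # round-robin counter used for MR peers only

    def select(self, peer: Peer) -> str:
        if not peer.multi_rail:
            # A non-MR peer identifies the sender by its NID, so the sender
            # keeps using the same NID, regardless of that NI's health.
            # Assumption: the first configured NI is the one the peer knows.
            if peer.pinned_src_nid is None:
                peer.pinned_src_nid = self.nis[0].nid
            return peer.pinned_src_nid

        # MR peer: rotate over the NIs that are currently healthy.
        healthy = [ni for ni in self.nis if ni.healthy]
        if not healthy:
            raise RuntimeError("no healthy local NI available")
        ni = healthy[self._rr % len(healthy)]
        self._rr += 1
        return ni.nid


nis = [LocalNI("10.0.0.1@o2ib"), LocalNI("10.0.1.1@o2ib1")]
sel = Selector(nis)
non_mr = Peer("10.0.0.2@o2ib", multi_rail=False)
mr = Peer("10.0.0.3@o2ib", multi_rail=True)

first = sel.select(non_mr)
nis[0].healthy = False            # simulate the IB0 failure from the report
after_failure = sel.select(non_mr)
mr_choice = sel.select(mr)
```

            In this toy model `after_failure` is still the failed NI's NID, which mirrors the reported behavior, while the MR peer is steered to the remaining healthy NI. Switching the non-MR peer to the other NID would avoid the failed device but, per the comment above, the peer would see the traffic as coming from a different node.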

            People

              takamura Tatsushi Takamura
              takamura Tatsushi Takamura
              Votes: 0
              Watchers: 3