LNet routing: wrong gw ni may be selected to reach undiscovered peer

    Description

      Test scenario is as follows:

      PeerA: two networks, one interface per network (A1@tcp, A2@tcp1)

      PeerB: one network, two interfaces (B1@tcp2, B2@tcp2)

      GW1: two interfaces on peerA's first net, two facing peerB (R1@tcp, R2@tcp, R3@tcp2, R4@tcp2)

      GW2: two interfaces on peerA's second net, two facing peerB (R1@tcp1, R2@tcp1, R3@tcp2, R4@tcp2)

      Routes on peer A: reach tcp2 via GW1, reach tcp2 via GW2

      Routes on peer B: reach tcp via GW1, reach tcp1 via GW2

      Do not run discovery on peer A from peer B or vice versa.
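
      The scenario above can be modeled in a few lines to show which next hops peer A should be able to use. This is only an illustrative sketch: the NID labels are the placeholders from the description, the gateway names are used as dictionary keys to keep the two sets of R1..R4 apart, and the helper names (net_of, valid_next_hops) are hypothetical, not LNet functions.

```python
# Illustrative model of the test topology; not LNet code.
PEER_A_NIS = ["A1@tcp", "A2@tcp1"]
GATEWAYS = {
    "GW1": ["R1@tcp", "R2@tcp", "R3@tcp2", "R4@tcp2"],
    "GW2": ["R1@tcp1", "R2@tcp1", "R3@tcp2", "R4@tcp2"],
}
# Routes on peer A: remote net -> gateways that can reach it.
ROUTES_A = {"tcp2": ["GW1", "GW2"]}


def net_of(nid):
    """Return the network part of a NID such as 'A1@tcp'."""
    return nid.split("@", 1)[1]


def valid_next_hops(dest_nid):
    """List (local NI, gateway, gateway NI) triples that share a network."""
    triples = []
    for gw in ROUTES_A.get(net_of(dest_nid), []):
        for gw_ni in GATEWAYS[gw]:
            for local_ni in PEER_A_NIS:
                if net_of(local_ni) == net_of(gw_ni):
                    triples.append((local_ni, gw, gw_ni))
    return triples


if __name__ == "__main__":
    for triple in valid_next_hops("B1@tcp2"):
        print(triple)
```

      Under this model, a send from A1@tcp should only ever be paired with a GW1 interface on tcp, and a send from A2@tcp1 only with a GW2 interface on tcp1.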

      Once everything is configured, the following ping may fail from peer A to peer B:

      lnetctl ping B1@tcp2
      

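      It looks like peer A may select the wrong gateway NI. In the trace below, the send goes out from 192.168.122.103@tcp, but the gateway NID is 192.168.122.150@tcp1, which is on a different network: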

      (lib-move.c:1921:lnet_handle_send()) TRACE: 192.168.122.103@tcp(192.168.122.103@tcp:<?>) -> 192.168.122.110@tcp2(192.168.122.110@tcp2:192.168.122.150@tcp1) <?> : GET try# 0
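
      A trace line like the one above can be checked mechanically. The sketch below is a hypothetical helper, not part of Lustre. It assumes the first parenthesized pair holds the local NI and the second holds the destination NI followed by the next-hop NID; that reading is inferred from this one log line and should be confirmed against lnet_handle_send() before relying on it.

```python
import re

# Assumed layout: src(local_ni:src_hint) -> dst(dst_ni:next_hop) ...
TRACE_RE = re.compile(
    r"TRACE: (?P<src>\S+?)\((?P<local_ni>[^:()]+):(?P<src_hint>[^()]+)\)"
    r" -> (?P<dst>\S+?)\((?P<dst_ni>[^:()]+):(?P<next_hop>[^()]+)\)"
)


def net_of(nid):
    """Return the network part of a NID, or None for placeholders like '<?>'."""
    return nid.split("@", 1)[1] if "@" in nid else None


def check_trace(line):
    """Report whether the local NI and the next hop are on the same network."""
    m = TRACE_RE.search(line)
    if m is None:
        return None
    local_net = net_of(m["local_ni"])
    hop_net = net_of(m["next_hop"])
    return {
        "local_ni": m["local_ni"],
        "next_hop": m["next_hop"],
        "same_net": local_net is not None and local_net == hop_net,
    }


if __name__ == "__main__":
    line = (
        "(lib-move.c:1921:lnet_handle_send()) TRACE: "
        "192.168.122.103@tcp(192.168.122.103@tcp:<?>) -> "
        "192.168.122.110@tcp2(192.168.122.110@tcp2:192.168.122.150@tcp1) "
        "<?> : GET try# 0"
    )
    print(check_trace(line))
```

      For the line in this report, the helper flags a mismatch: the local NI is on tcp while the next hop is on tcp1.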
      

       

      Attachments

        Activity

          [LU-13461] LNet routing: wrong gw ni may be selected to reach undiscovered peer
          ssmirnov Serguei Smirnov made changes -
          Labels Original: lnet New: lnet lnet-router
          ssmirnov Serguei Smirnov made changes -
          Labels New: lnet
          pjones Peter Jones made changes -
          Fix Version/s New: Lustre 2.14.0 [ 14490 ]
          Resolution New: Fixed [ 1 ]
          Status Original: Open [ 1 ] New: Resolved [ 5 ]
          ashehata Amir Shehata (Inactive) made changes -
          Assignee Original: WC Triage [ wc-triage ] New: Amir Shehata [ ashehata ]
          ssmirnov Serguei Smirnov made changes -
          Description: added the sentence "Do not run discovery on peer A from peer B or vice versa." after the route definitions. The rest of the description is unchanged.
          ssmirnov Serguei Smirnov created issue -

          People

            Assignee: ashehata Amir Shehata (Inactive)
            Reporter: ssmirnov Serguei Smirnov