[LU-13461] LNet routing: wrong gw ni may be selected to reach undiscovered peer
Created: 17/Apr/20  Updated: 02/Feb/21  Resolved: 14/May/20

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.14.0
Fix Version/s: Lustre 2.14.0

Type: Bug
Priority: Minor
Reporter: Serguei Smirnov
Assignee: Amir Shehata (Inactive)
Resolution: Fixed
Votes: 0
Labels: lnet, lnet-router

Epic/Theme: lnet
Severity: 3

Description

Test scenario is as follows:

PeerA: two networks, one interface per network (A1@tcp, A2@tcp1)

PeerB: one network, two interfaces (B1@tcp2, B2@tcp2)

GW1: two interfaces on peerA's first net, two facing peerB (R1@tcp, R2@tcp, R3@tcp2, R4@tcp2)

GW2: two interfaces on peerA's second net, two facing peerB (R1@tcp1, R2@tcp1, R3@tcp2, R4@tcp2)

Routes on peer A: reach tcp2 via GW1, reach tcp2 via GW2

Routes on peer B: reach tcp via GW1, reach tcp1 via GW2

Do not run discovery from peer A toward peer B, or vice versa, so that each peer remains undiscovered by the other.
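
To make the topology easier to reason about, here is a minimal Python sketch of the scenario. It only models NID strings and network names; it is not LNet code. The "GW1-"/"GW2-" prefixes are hypothetical labels added to tell the two routers' identically named interfaces apart, and the rule "a gateway NI is usable only from a local NI on the same network" is the assumption the sketch illustrates.

```python
def net_of(nid):
    """Return the LNet network part of a NID string such as 'A1@tcp'."""
    return nid.rsplit("@", 1)[1]


# Peer A's local interfaces, as listed in the scenario.
PEER_A_NIS = ["A1@tcp", "A2@tcp1"]

# Gateway interfaces from the scenario. The "GW1-"/"GW2-" prefixes are
# hypothetical; they only keep the two routers' interface names distinct.
GATEWAYS = {
    "GW1": ["GW1-R1@tcp", "GW1-R2@tcp", "GW1-R3@tcp2", "GW1-R4@tcp2"],
    "GW2": ["GW2-R1@tcp1", "GW2-R2@tcp1", "GW2-R3@tcp2", "GW2-R4@tcp2"],
}


def usable_gateway_nis(local_ni):
    """List gateway NIs that share a network with local_ni.

    Assumption: only these NIs are directly reachable from local_ni, so
    only these are valid next hops for a message sent from it.
    """
    net = net_of(local_ni)
    return sorted(
        nid for nis in GATEWAYS.values() for nid in nis if net_of(nid) == net
    )


if __name__ == "__main__":
    for ni in PEER_A_NIS:
        print(ni, "->", usable_gateway_nis(ni))
```

Under this model, A1@tcp can only use the tcp-facing interfaces of GW1, and A2@tcp1 can only use the tcp1-facing interfaces of GW2.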

Once everything is configured, the following ping from peer A to peer B may fail:

lnetctl ping B1@tcp2

It looks like the wrong gateway NI may be selected by peer A:

(lib-move.c:1921:lnet_handle_send()) TRACE: 192.168.122.103@tcp(192.168.122.103@tcp:<?>) -> 192.168.122.110@tcp2(192.168.122.110@tcp2:192.168.122.150@tcp1) <?> : GET try# 0
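
The mismatch in this trace can be checked mechanically. The Python sketch below parses the TRACE line and compares the network of the sending NI with the network of the selected next hop. The field interpretation is an assumption: the first parenthesised pair is read as (local send NI : requested source) and the second as (requested destination : selected next-hop NID). The helper names are hypothetical.

```python
import re

TRACE = (
    "TRACE: 192.168.122.103@tcp(192.168.122.103@tcp:<?>) -> "
    "192.168.122.110@tcp2(192.168.122.110@tcp2:192.168.122.150@tcp1) "
    "<?> : GET try# 0"
)

# Assumed layout: src(tx_ni:sd_src) -> dst(sd_dst:next_hop)
PATTERN = re.compile(
    r"TRACE: (?P<src>\S+?)\((?P<tx_ni>[^:()]+):(?P<sd_src>[^()]+)\) -> "
    r"(?P<dst>\S+?)\((?P<sd_dst>[^:()]+):(?P<next_hop>[^()]+)\)"
)


def net_of(nid):
    """Return the network part of a NID such as '192.168.122.103@tcp'."""
    return nid.rsplit("@", 1)[1]


def check_trace(line):
    """Report whether the next hop is on a different net than the send NI."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("line does not look like a TRACE entry")
    tx_net = net_of(m.group("tx_ni"))
    hop_net = net_of(m.group("next_hop"))
    return {
        "tx_ni": m.group("tx_ni"),
        "next_hop": m.group("next_hop"),
        "tx_net": tx_net,
        "next_hop_net": hop_net,
        "mismatch": tx_net != hop_net,
    }


if __name__ == "__main__":
    print(check_trace(TRACE))
```

Read this way, the message leaves through an NI on tcp while the chosen gateway NI is on tcp1, which the local NI cannot reach directly.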

 



Comments
Comment by Gerrit Updater [ 21/Apr/20 ]

Amir Shehata (ashehata@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38298
Subject: LU-13461 lnet: restrict gateway selection
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: cd9e4f54c55934c8a9e9d8fd6469c9ebb9e2404e

Comment by Gerrit Updater [ 14/May/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/38298/
Subject: LU-13461 lnet: restrict gateway selection
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: ceb92c5512bad844b7f741ff6fb37c62c652e66f

Comment by Peter Jones [ 14/May/20 ]

Landed for 2.14
