Details
Type: Bug
Resolution: Fixed
Priority: Major
Description
This issue is very similar to LU-13575, but it relates to the round robin across multiple nets whereas that ticket was about round robin across interfaces within a single net.
Currently, if a peer has multiple network types (either multiple LNDs or multiple nets on its interfaces), there are situations where traffic is routed only to the interfaces on one net, for example when the peer is talking to another peer that only has interfaces on one of the nets, or when interfaces on the other net go down for an extended period of time. This causes the peer net/local net sequence numbers to diverge in the same manner documented in LU-13575. Future traffic can then funnel to just one of the available nets, leading to degraded performance.
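The divergence can be illustrated with a toy model. The sketch below is not LNet code: it assumes selection simply picks the eligible net with the lowest sequence number and then increments it, and the net names `tcp0` and `o2ib0`, along with the helpers `pick_net` and `simulate`, are hypothetical. Real LNet selection weighs additional criteria, so treat this only as an illustration of how a period of single-net traffic skews later choices.

```python
def pick_net(seq, eligible):
    """Pick the eligible net with the lowest sequence number, then bump it.

    Ties are broken by net name so the result is deterministic.
    """
    chosen = min(eligible, key=lambda net: (seq[net], net))
    seq[chosen] += 1
    return chosen


def simulate(restricted_sends, mixed_sends):
    """Run a single-net phase followed by a phase where both nets are usable."""
    seq = {"tcp0": 0, "o2ib0": 0}  # hypothetical per-net sequence numbers

    # Phase 1: only tcp0 is usable (e.g. the remote peer has no o2ib0
    # interface, or the o2ib0 interfaces are down).
    for _ in range(restricted_sends):
        pick_net(seq, ["tcp0"])

    # Phase 2: both nets are usable again.
    picks = [pick_net(seq, ["o2ib0", "tcp0"]) for _ in range(mixed_sends)]
    return seq, picks


if __name__ == "__main__":
    # With no single-net phase the two nets alternate.
    print(simulate(0, 4)[1])
    # After 100 tcp0-only sends, the next 10 sends all land on o2ib0.
    seq, picks = simulate(100, 10)
    print(seq, picks)
```

In this model the lagging net absorbs every send until its sequence number catches up, which is the funneling described above. Keeping the per-net sequence numbers from drifting apart is what restores even round robin.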
"Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51547
Subject: LU-15713 lnet: Ensure round robin across nets
Project: fs/lustre-release
Branch: b2_15
Current Patch Set: 1
Commit: 4b616f00cacc448f1be6607754feb77dbb347167