Details
Type: Improvement
Resolution: Duplicate
Priority: Minor
Description
Currently, if a peer NI is added late, after communication with the peer has already started, and the peer NIs are otherwise equal, LNet switches to the newly discovered NI and ignores the others until the "sequence count" (i.e. the count of packets sent to the NI) on the new peer NI catches up with the counts on the peer NIs that were available previously. The same issue can be seen with a peer NI that comes back from an "unhealthy" state.
This creates an imbalance that can last for quite a while: the longer the peer has been in use, the larger the gap in counts the new NI has to close before traffic is spread across all peer NIs again.
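
The effect can be illustrated with a toy model. This is only a sketch, assuming that selection among otherwise equal peer NIs reduces to "pick the NI with the lowest sequence count"; it is not the actual LNet selection code, and the names `pick_lowest_count` and `simulate` are hypothetical.

```python
def pick_lowest_count(counts):
    """Return the index of the NI with the smallest sequence count.

    Ties go to the lowest index. This stands in for the selection step
    only; health, credits and other criteria are assumed equal.
    """
    return min(range(len(counts)), key=lambda i: counts[i])


def simulate(initial_counts, sends):
    """Send `sends` messages, always choosing the lowest-count NI.

    Returns the final per-NI counts and the list of NI indices chosen.
    """
    counts = list(initial_counts)
    picks = []
    for _ in range(sends):
        i = pick_lowest_count(counts)
        counts[i] += 1
        picks.append(i)
    return counts, picks


# Two peer NIs have each carried 1000 messages; a third NI is added late
# with a sequence count of 0.
counts, picks = simulate([1000, 1000, 0], 1000)

# Share of the next 1000 messages that went to the late NI (index 2).
late_ni_share = picks.count(2) / len(picks)
```

In this model the late NI receives every message until its count reaches that of the older NIs, and only then does the selection start alternating between all three.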