[LU-15905] LNet: ping router to check its ni status if received a push from the router Created: 31/May/22  Updated: 11/Jan/24

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.12.8
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Serguei Smirnov Assignee: Serguei Smirnov
Resolution: Unresolved Votes: 0
Labels: None

Severity: 3

Description
NodeA <-- o2ib0 --> GWn <-- o2ib1 --> NodeB 

In the diagram above, each LNet router (GWn) has one NID on o2ib0 and another on o2ib1.

With the current b2_12 code, if a router's o2ib0 NID goes down (e.g. the link is disconnected), NodeB finds out about it only the next time it pings the router. Until then, NodeB can still select the router for sending.

The router does send a "push" to NodeB when the router's NI on o2ib0 goes down. So instead of waiting for its next ping, NodeB should ping the router for its updated state as soon as the push is received.
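
The difference between the current and the proposed behavior can be illustrated with a small toy model. This is only a sketch and not LNet code: the `Router` and `Node` classes, the `ping_on_push` flag, the router name "GW1", and the rule that a router is usable only when all of its cached NIs are up are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    """Toy gateway with one NI per network (hypothetical model, not LNet code)."""
    name: str
    ni_up: dict = field(default_factory=lambda: {"o2ib0": True, "o2ib1": True})

    def ping_reply(self) -> dict:
        # A ping reply carries a snapshot of the status of every NI on the router.
        return dict(self.ni_up)


@dataclass
class Node:
    """Toy NodeB: caches the last NI status learned from each router."""
    ping_on_push: bool
    cached: dict = field(default_factory=dict)

    def ping(self, router: Router) -> None:
        # Refresh the cached view of the router's NIs.
        self.cached[router.name] = router.ping_reply()

    def on_push(self, router: Router) -> None:
        # Proposed behavior: treat the push as a hint and ping right away.
        # Without it, the cache stays stale until the next regular ping.
        if self.ping_on_push:
            self.ping(router)

    def usable(self, router: Router) -> bool:
        # Assumption: a router is usable only if all of its cached NIs are up.
        status = self.cached.get(router.name, {})
        return bool(status) and all(status.values())


def simulate(ping_on_push: bool) -> bool:
    """Return whether NodeB still selects the router after o2ib0 goes down."""
    gw = Router("GW1")
    node_b = Node(ping_on_push=ping_on_push)
    node_b.ping(gw)              # initial ping: both NIs are up
    gw.ni_up["o2ib0"] = False    # the link on o2ib0 is disconnected
    node_b.on_push(gw)           # the router pushes to NodeB
    return node_b.usable(gw)
```

In this model, `simulate(False)` returns `True` (NodeB keeps selecting the router from its stale view), while `simulate(True)` returns `False` (the push triggers a ping and the router is no longer selected).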

