Details
Type: Improvement
Resolution: Unresolved
Priority: Minor
Description
To service an LNet router (power cycle, hardware changes, configuration changes, etc.) we currently need to delete the associated routes from all LNet peers. This can lead to temporary situations where the route tables are asymmetric. A better solution would be to administratively disable the route(s), so that new sends do not use the router while messages already in flight are not suddenly dropped if the route becomes asymmetric.
We should provide the capability to:
1. Disable a route locally, i.e. run something like the following to disable the route only on the local host:
lnetctl route set --net <net> --gateway <nid> disable
2. Corresponding enable:
lnetctl route set --net <net> --gateway <nid> enable
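The intended behaviour of the two commands above can be sketched as a toy model. This is not LNet code: the `Route`, `RouteTable`, and `AdminState` names are hypothetical, and picking the gateway with the fewest in-flight messages is only a stand-in for LNet's real route selection. The sketch shows the one property the proposal asks for: a disabled route is skipped for new sends, but messages already in flight on it are left to complete.

```python
from dataclasses import dataclass, field
from enum import Enum


class AdminState(Enum):
    # Hypothetical states mirroring the proposed "enable" / "disable" keywords.
    ENABLED = "enable"
    DISABLED = "disable"


@dataclass
class Route:
    net: str
    gateway: str
    admin_state: AdminState = AdminState.ENABLED
    in_flight: int = 0  # messages already handed to this gateway


@dataclass
class RouteTable:
    routes: list = field(default_factory=list)

    def set_state(self, net, gateway, state):
        """Toy equivalent of the proposed 'route set ... enable|disable'."""
        for route in self.routes:
            if route.net == net and route.gateway == gateway:
                route.admin_state = state
                return route
        raise KeyError((net, gateway))

    def select(self, net):
        """Pick a route for a NEW send; administratively disabled routes are skipped."""
        candidates = [
            r for r in self.routes
            if r.net == net and r.admin_state is AdminState.ENABLED
        ]
        if not candidates:
            return None
        # Assumption: least-loaded selection stands in for LNet's real policy.
        best = min(candidates, key=lambda r: r.in_flight)
        best.in_flight += 1
        return best

    def complete(self, route):
        """An in-flight message finished, regardless of the route's admin state."""
        route.in_flight -= 1

    def drained(self, net, gateway):
        """True once a disabled route has no messages left in flight."""
        for route in self.routes:
            if route.net == net and route.gateway == gateway:
                return (route.admin_state is AdminState.DISABLED
                        and route.in_flight == 0)
        raise KeyError((net, gateway))


if __name__ == "__main__":
    table = RouteTable([Route("o2ib1", "10.0.0.1@tcp"),
                        Route("o2ib1", "10.0.0.2@tcp")])
    msg = table.select("o2ib1")                      # lands on the first gateway
    table.set_state("o2ib1", msg.gateway, AdminState.DISABLED)
    print(table.drained("o2ib1", msg.gateway))       # message still in flight
    table.complete(msg)
    print(table.drained("o2ib1", msg.gateway))       # now safe to service
```

In this model, the router would be safe to service once `drained()` returns true for every route that uses it as a gateway.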
It might also be nice if we could set a flag on an LNet router that could be discovered by other LNet peers. We can currently disable routing on an LNet router, but I believe this tears down the router buffers and doesn't allow traffic to drain gracefully from the route(s). It would be nice if we could do something like:
lnetctl set routing enable (equivalent to lnetctl set routing 1)
lnetctl set routing disable (equivalent to lnetctl set routing 0)
lnetctl set routing admindown (equivalent to lnetctl set routing 3)
Eventually, any peers that discover the router will find a corresponding flag in the router's peer state and set the corresponding route(s) to the administratively disabled state.
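A rough sketch of the peer-side handling described above, under stated assumptions: the routing values 0, 1, and 3 are taken from the proposal, while the function name, the route dictionary layout, and the state strings are hypothetical and do not correspond to any existing LNet structure. The proposal does not say how a peer should react to plain routing-disabled (0), so the sketch leaves those routes untouched.

```python
# Routing values as proposed in the ticket.
ROUTING_DISABLED = 0
ROUTING_ENABLED = 1
ROUTING_ADMINDOWN = 3


def apply_discovered_routing_state(routes, gateway, routing_value):
    """Update local routes after discovering a router's advertised flag.

    routes: list of dicts with 'net', 'gateway', and 'state' keys
            (hypothetical layout, for illustration only).
    Returns the list of routes whose state changed.
    """
    if routing_value == ROUTING_ADMINDOWN:
        new_state = "admin_disabled"
    elif routing_value == ROUTING_ENABLED:
        new_state = "enabled"
    else:
        # Assumption: any other value is out of scope here; change nothing.
        return []

    changed = []
    for route in routes:
        # Only routes through the discovered router are affected.
        if route["gateway"] == gateway and route["state"] != new_state:
            route["state"] = new_state
            changed.append(route)
    return changed


if __name__ == "__main__":
    local_routes = [
        {"net": "o2ib1", "gateway": "10.0.0.1@tcp", "state": "enabled"},
        {"net": "o2ib2", "gateway": "10.0.0.1@tcp", "state": "enabled"},
        {"net": "o2ib1", "gateway": "10.0.0.2@tcp", "state": "enabled"},
    ]
    apply_discovered_routing_state(local_routes, "10.0.0.1@tcp", ROUTING_ADMINDOWN)
    for route in local_routes:
        print(route["net"], route["gateway"], route["state"])
```

Driving the state from the router's own flag means an administrator would only need to touch the router being serviced, rather than every peer that has a route through it.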
"Chris Horn <chris.horn@hpe.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/57490
Subject: LU-15135 lnet: Graceful router removal and addition
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: a4a20ab1625415dde2f8056795869bde71e4e833