[LU-16086] add generic LNet network number support Created: 09/Aug/22  Updated: 05/Feb/24

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor
Reporter: Andreas Dilger Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: medium

Issue Links:
Related
is related to LU-10360 use Imperative Recovery logs for clie... Open
is related to LU-16035 Land kfilnd implementation Resolved
is related to LU-15983 Reserve KFI LND Resolved
Rank (Obsolete): 9223372036854775807

 Description   

In conjunction with the new kfilnd, it would make sense to add a generic mechanism for connecting to new/unknown LNet network types (via a router) by directly specifying the LNet network type and number, something like "1234@lnet17n5".

That wouldn't help us today in the case of kfilnd, since it is just as easy to backport patch https://review.whamcloud.com/47830 "LU-15983 lnet: Define KFILND network type" to the older branches as it is to backport whatever patch is added for this ticket.

However, the goal of this ticket is to allow arbitrarily old clients to mount the filesystem without being patched, by specifying a manual LNet number at mount time (to connect to the MGS and/or LNet routers). This would work together with LU-10360, which would have the MGS send the binary server NIDs (including the LNet network number) directly to the client, rather than the client having to parse the NID strings from the config logs itself.

For the LNet network number, I propose a strawman @lnetNNNn as the network name, where NNN is the numeric network type. This is short enough to be useful but long enough to be unique. It does not end with a digit itself, so the network type can be parsed unambiguously, and it is followed by the network number (e.g. "lnet17n5" for network type 17, network number 5).
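As a rough illustration of the strawman naming above, the following Python sketch parses a name such as "lnet17n5" into a network type and network number, then packs them into a 32-bit value with the type in the upper 16 bits and the number in the lower 16 bits. The function names and the regular expression are hypothetical and exist only for this example. The 16-bit limits and the packing layout are assumptions about how such a name would map onto a network ID, and the real parser would live in the LNet C code.

```python
import re

# Hypothetical pattern for the strawman "lnet<type>n<number>" network name.
_GENERIC_NET_RE = re.compile(r"lnet([0-9]+)n([0-9]+)")


def parse_generic_net(name):
    """Return (net_type, net_num) parsed from a strawman generic name."""
    m = _GENERIC_NET_RE.fullmatch(name)
    if m is None:
        raise ValueError(f"not a generic LNet network name: {name!r}")
    net_type, net_num = int(m.group(1)), int(m.group(2))
    # Assumption: type and number each fit in 16 bits.
    if net_type > 0xFFFF or net_num > 0xFFFF:
        raise ValueError(f"network type or number out of range: {name!r}")
    return net_type, net_num


def make_net(net_type, net_num):
    """Pack type and number into one 32-bit value (assumed layout)."""
    return (net_type << 16) | net_num


if __name__ == "__main__":
    net_type, net_num = parse_generic_net("lnet17n5")
    print(net_type, net_num, hex(make_net(net_type, net_num)))
```

Because the literal "n" separates the two numeric fields, the type and number cannot run together, which is the property the strawman relies on.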

For the node number, I would propose trying to parse both a single number and a dotted quad, which should work with all networks today except IPv6. If the digit parsing were made long enough (e.g. parsing hex digits into __u32 values and ignoring non-hex characters), it could potentially parse any address (including IPv6 and IB GUIDs, or whatever). That may be unwieldy, but is possibly preferable to having to patch/upgrade very old clients.
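The two parsing approaches described above could look roughly like the following Python sketch. It is not the proposed implementation: the function names are hypothetical, and the choices to pack a dotted quad most-significant octet first and to left-pad the hex string to a multiple of eight digits are assumptions made for illustration.

```python
import re


def parse_simple_addr(text):
    """Parse a single decimal number or a dotted quad into a 32-bit value."""
    if re.fullmatch(r"[0-9]+", text):
        value = int(text)
        if value > 0xFFFFFFFF:
            raise ValueError(f"address out of 32-bit range: {text!r}")
        return value
    m = re.fullmatch(r"([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})", text)
    if m is None:
        raise ValueError(f"not a number or dotted quad: {text!r}")
    value = 0
    for group in m.groups():
        octet = int(group)
        if octet > 255:
            raise ValueError(f"octet out of range: {text!r}")
        # Assumption: most-significant octet first.
        value = (value << 8) | octet
    return value


def parse_hex_addr(text):
    """Collect hex digits, ignoring all other characters, into 32-bit words."""
    digits = "".join(c for c in text if c in "0123456789abcdefABCDEF")
    if not digits:
        raise ValueError(f"no hex digits in address: {text!r}")
    # Assumption: left-pad with zeros to a whole number of 8-digit words.
    width = -(-len(digits) // 8) * 8
    digits = digits.zfill(width)
    return [int(digits[i:i + 8], 16) for i in range(0, len(digits), 8)]


if __name__ == "__main__":
    print(parse_simple_addr("1234"))
    print(hex(parse_simple_addr("192.168.0.1")))
    print([hex(w) for w in parse_hex_addr("fe80:0000:0000:0000:0202:b3ff:fe1e:8329")])
```

Note that ignoring non-hex characters means a compressed IPv6 address (one using "::") would not be expanded by this sketch, so the address would have to be written out in full. The same string also gives different results under the two parsers, since one reads decimal and the other hex, so the caller would need some way to select between them.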

This would likely only be needed for user tools, the MGS, and LNet router NIDs, if LU-10360 is available to supply all of the server NIDs in binary format.



 Comments   
Comment by Andreas Dilger [ 09/Aug/22 ]

Chris, James,
this seems like it could save us some grief on the next go-round, in case there are e.g. "old" 2.12.x clients that need to connect to e.g. kfi2lnd (or whatever is next).
