Details
Type: Improvement
Resolution: Duplicate
Priority: Minor
Description
(HPE LUS-12254)
Motivation
Changing server NIDs in Lustre has historically been challenging, originally requiring the scary writeconf. lctl replace_nids is a slight improvement in that the entire config isn't wiped out, but according to the manual it still requires a full system shutdown, and in any case it is another step after futzing with cabling and LNet configuration.
Also, fixed addressing is generally a challenge for more dynamic environments: cloud, virtual, etc.
Proposal
When servers start up, they contact the MGS and self-report their NIDs. The MGS updates its in-memory config with these NIDs and notifies all other nodes of the new values via imperative recovery.
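The registration flow could be sketched roughly as follows. This is a toy model in Python, not Lustre code: the MgsRegistry class, its method names, and the listener callback are all hypothetical, and the callback only stands in for whatever imperative recovery would actually deliver to other nodes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical callback signature: (server_uuid, new_nids).
Listener = Callable[[str, List[str]], None]


@dataclass
class MgsRegistry:
    """Toy stand-in for the MGS in-memory config (assumption, not Lustre code)."""
    nids_by_uuid: Dict[str, List[str]] = field(default_factory=dict)
    listeners: List[Listener] = field(default_factory=list)

    def subscribe(self, listener: Listener) -> None:
        # Other nodes ask to be told about NID changes.
        self.listeners.append(listener)

    def register(self, uuid: str, nids: List[str]) -> bool:
        """Record self-reported NIDs; notify listeners only if they changed."""
        changed = self.nids_by_uuid.get(uuid) != nids
        self.nids_by_uuid[uuid] = list(nids)
        if changed:
            for notify in self.listeners:
                notify(uuid, list(nids))
        return changed
```

A server re-registering with unchanged NIDs triggers no notification in this sketch; whether the real MGS should suppress such no-op updates is a design choice the proposal leaves open.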
But: remove all NID information from the persistent Lustre config files. The MGS should remember the existence of the servers (UUID), but not know their NIDs when starting up. For a server that hasn't registered yet, the MGS should make up an address, either something like "nothing@lo0" or perhaps the MGS address if that helps the old-client compat case. Clients will fail to contact anything at this address and will retry forever until they get an update from the MGS with the new addresses.
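A minimal sketch of the placeholder-and-retry behavior, again as a toy Python model rather than Lustre code. The function names are hypothetical, the placeholder string is simply the example from the text above, and the retry loop is bounded here only so the sketch terminates (the proposal has clients retry forever).

```python
from typing import Callable, Dict, List, Optional, Set

# Example placeholder taken from the proposal text; the real value is undecided.
PLACEHOLDER_NID = "nothing@lo0"


def resolve_nids(known_uuids: Set[str],
                 registered: Dict[str, List[str]],
                 uuid: str) -> List[str]:
    """Return the registered NIDs, or a made-up address for a known server
    that has not registered since the MGS started."""
    if uuid not in known_uuids:
        raise KeyError(f"unknown server {uuid}")
    return registered.get(uuid, [PLACEHOLDER_NID])


def connect_with_retry(get_nids: Callable[[], List[str]],
                       try_connect: Callable[[str], bool],
                       max_attempts: int) -> Optional[str]:
    """Re-read the current NID list on every attempt, so an update from
    the MGS is picked up as soon as it arrives."""
    for _ in range(max_attempts):
        for nid in get_nids():
            if try_connect(nid):
                return nid
    return None
```

The point of re-reading the NID list inside the loop is that the client never needs to know the placeholder is special: connections to it simply fail until a real address replaces it.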
Heading off objections:
For people or sites that hate change or are worried about server impostor attacks, we could perhaps keep both methods. Change mkfs.lustre to set a new flag, "dynamic_nid", by default. If the flag is not present when a server first registers with the MGS, the MGS stores that server's NIDs persistently, as today. If it is present, the MGS does not store NIDs for that server.
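The two modes could coexist along these lines. This is a toy Python sketch under stated assumptions: ToyMgs and its fields are hypothetical, the `persistent` dict merely simulates the on-disk config, and the decision is taken once, at first registration, as described above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyMgs:
    """Hypothetical model of the MGS; not Lustre code."""
    # Simulated on-disk config: uuid -> stored NIDs, or None for dynamic servers.
    persistent: Dict[str, Optional[List[str]]] = field(default_factory=dict)
    # In-memory view of the NIDs currently known for each server.
    live: Dict[str, List[str]] = field(default_factory=dict)

    def register(self, uuid: str, nids: List[str], dynamic_nid: bool) -> None:
        self.live[uuid] = list(nids)
        if uuid not in self.persistent:
            # First registration: remember the server either way, but only
            # store its NIDs when the dynamic_nid flag is absent.
            self.persistent[uuid] = None if dynamic_nid else list(nids)

    def restart(self) -> None:
        # After a restart only persistently stored NIDs are known; dynamic
        # servers stay address-less until they register again.
        self.live = {u: list(n) for u, n in self.persistent.items()
                     if n is not None}
```

For example, after registering one server with the flag and one without, a restart leaves only the second server's NIDs in `live`, while both UUIDs remain in `persistent`.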