Details
Type: Improvement
Resolution: Duplicate
Priority: Minor
Description
(HPE LUS-12254)
Motivation
Changing server NIDs in Lustre has historically been challenging, originally requiring the scary writeconf. lctl replace_nids is a slight improvement in that the entire config isn't wiped out, but according to the manual it still requires a full system shutdown, and in any case it is another step after futzing with cabling and LNET configuration.
Also, fixed addressing is generally a challenge for more dynamic environments: cloud, virtual, etc.
Proposal
When servers start up, they contact the MGS and self-report their NIDs. The MGS updates its in-memory config with these settings, and notifies all other nodes of the new value via imperative recovery.
But: remove all NID information from persistent Lustre config files. The MGS should remember the existence of the servers (uuid), but not know their NIDs when starting up. For a server that hasn't registered yet, the MGS should make up an address, either something like "nothing@lo0" or perhaps the MGS address if that helps the old client compat case. Clients will fail to contact anything at this address, and will retry forever until they get an update from the MGS with the new addresses.
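To make the proposed flow concrete, here is a rough toy model in Python. It is not Lustre code: the class ToyMGS, its method names, and the example UUID and NID are all made up for illustration, and the listener callbacks only stand in for imperative recovery notifications. It shows the three behaviours described above: only UUIDs survive a restart, unregistered targets resolve to a placeholder address, and a registration pushes the new NIDs to everyone listening.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Placeholder address suggested in the proposal; not a reachable NID.
PLACEHOLDER_NID = "nothing@lo0"


@dataclass
class ToyMGS:
    """Toy model of the proposed MGS behaviour (hypothetical, not Lustre code)."""
    # Persisted across restarts: target UUIDs only, no NIDs.
    known_targets: List[str]
    # In-memory only: NIDs self-reported by servers since the last MGS start.
    nids: Dict[str, List[str]] = field(default_factory=dict)
    # Stand-in for imperative recovery: callbacks invoked on every NID update.
    listeners: List[Callable[[str, List[str]], None]] = field(default_factory=list)

    def lookup(self, uuid: str) -> List[str]:
        """Return the current NIDs, or the placeholder if not yet registered."""
        if uuid not in self.known_targets:
            raise KeyError(uuid)
        return self.nids.get(uuid, [PLACEHOLDER_NID])

    def register(self, uuid: str, nids: List[str]) -> None:
        """A server starts up and self-reports its NIDs."""
        if uuid not in self.known_targets:
            self.known_targets.append(uuid)
        self.nids[uuid] = list(nids)
        for notify in self.listeners:
            notify(uuid, list(nids))

    def restart(self) -> "ToyMGS":
        """Simulate an MGS restart: UUIDs are remembered, NIDs are forgotten."""
        return ToyMGS(known_targets=list(self.known_targets))
```

In this model a client that looks up a target before its server has registered gets "nothing@lo0", fails to connect, and keeps retrying until its listener is called with the real NIDs.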
Heading off objections:
For people/sites that hate change or are worried about server impostor attacks, we could perhaps keep both methods. Change mkfs.lustre to include a new flag "dynamic_nid" by default. If the flag is not present when a server first registers with the MGS, the MGS stores its NIDs persistently as today. If it is present, the MGS does not store NIDs for this server.
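The two-mode idea can be sketched the same way. Again this is only a toy Python model under stated assumptions: "dynamic_nid" is the flag proposed above and does not exist in mkfs.lustre today, and ToyConfigStore with its fields is invented here purely to show which data would and would not survive an MGS restart.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ToyConfigStore:
    """Toy model of MGS storage with both NID modes (hypothetical)."""
    # Written to persistent config: NIDs of targets registered WITHOUT dynamic_nid.
    persistent_nids: Dict[str, List[str]] = field(default_factory=dict)
    # Written to persistent config: UUIDs of dynamic targets, with no NIDs attached.
    dynamic_targets: Set[str] = field(default_factory=set)
    # Memory only: NIDs self-reported by dynamic targets.
    volatile_nids: Dict[str, List[str]] = field(default_factory=dict)

    def first_registration(self, uuid: str, nids: List[str], dynamic_nid: bool) -> None:
        """Record a target the first time it registers with the MGS."""
        if dynamic_nid:
            # Proposed behaviour: remember the target exists, keep NIDs in memory.
            self.dynamic_targets.add(uuid)
            self.volatile_nids[uuid] = list(nids)
        else:
            # Behaviour as today: NIDs go into the persistent config.
            self.persistent_nids[uuid] = list(nids)

    def after_restart(self) -> "ToyConfigStore":
        """Return the state an MGS would reload from disk after a restart."""
        return ToyConfigStore(
            persistent_nids={u: list(n) for u, n in self.persistent_nids.items()},
            dynamic_targets=set(self.dynamic_targets),
        )
```

A site that does not trust self-reported addresses would format without the flag and keep the static path; everything else would get the dynamic path by default.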
Now we just need someone to test/fix/confirm that this functionality is working.
It looks like the initial patch https://review.whamcloud.com/39613 "LU-10360 mgc: Use IR for client->MDS/OST connections" landed in commit v2_13_55-106-g37be05eca3, so it should be available in all recent clients, but I don't think it was really finished before Amir moved on to another project. Hacking the client config log on the MGS to delete the NID records might quickly show what state the functionality is in. Ideally, the client config log wouldn't need any target configuration records at all; the client would just know "there is an MDT in the IR log, I know how to set up the MDC for it, proceed". Not only does this avoid the complexity of managing static NIDs, it also reduces problems if the MGS config log becomes corrupt, etc.