Details
-
Improvement
-
Resolution: Not a Bug
-
Minor
Description
My knowledge of Lustre is limited, so please correct me where necessary.
Imagine the following situation: you have a Lustre (2.14) file system running, and the clients can access it.
Now you want to resolve some issues with the startup order on the clients, and in doing so you get the order wrong in which the lnet and lustre modules are loaded and configured. In my particular case the lustre module was loaded before the LNet configuration for InfiniBand was done, so the lustre module configured LNet on Ethernet, even though there is no Ethernet connection between the clients and the Lustre servers.
This resulted in two NIs (@tcp and @o2ib) being configured per client, with @tcp as the primary NID. The Lustre servers happily accept these peer configurations, but Lustre operation gets slower because the servers try to reach the clients via @tcp first (and vice versa).
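To catch this kind of misconfiguration on a client before mounting, something like the following could be used. It is only a minimal sketch: it assumes the text comes from `lnetctl net show`, the sample below is made up for illustration, and the function name `unexpected_nets` is hypothetical.

```python
import re

# Hypothetical sample in the general shape of `lnetctl net show` output.
# The addresses are invented for illustration only.
SAMPLE_NET_SHOW = """\
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
    - net type: tcp
      local NI(s):
        - nid: 192.168.1.10@tcp
          status: up
    - net type: o2ib
      local NI(s):
        - nid: 10.0.0.10@o2ib
          status: up
"""


def unexpected_nets(net_show_text, allowed=("lo", "o2ib")):
    """Return the configured net types that are not in `allowed`, in order."""
    found = re.findall(r"^\s*-\s*net type:\s*(\S+)", net_show_text, re.MULTILINE)
    return [net for net in found if net not in allowed]


if __name__ == "__main__":
    # With the sample above, the stray tcp network is reported.
    print(unexpected_nets(SAMPLE_NET_SHOW))
```

A startup script could refuse to mount Lustre while this list is non-empty, which would have flagged the stray @tcp NI in my case.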
Having spotted that mistake and corrected the order in which LNet is configured and the lustre module is loaded, the clients now get only one NI configured (@o2ib), which naturally is the primary NID. But the Lustre servers do not update the LNet peer entries they have already discovered and keep @tcp as the primary NID for the clients. As a result, the servers still try to connect to the clients using @tcp.
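To illustrate what I mean by stale entries, here is a rough sketch that lists peers whose primary NID is still on a given network. It assumes the text comes from `lnetctl peer show` on a server; the sample data is invented, and the helper names `parse_peers` and `peers_with_primary_on` are hypothetical.

```python
import re

# Hypothetical sample in the general shape of `lnetctl peer show` output.
# The addresses are invented for illustration only.
SAMPLE_PEER_SHOW = """\
peer:
    - primary nid: 192.168.1.10@tcp
      Multi-Rail: True
      peer ni:
        - nid: 192.168.1.10@tcp
          state: NA
        - nid: 10.0.0.10@o2ib
          state: NA
    - primary nid: 10.0.0.11@o2ib
      Multi-Rail: True
      peer ni:
        - nid: 10.0.0.11@o2ib
          state: NA
"""


def parse_peers(text):
    """Collect each peer's primary NID and the NIDs listed under it."""
    peers = []
    for line in text.splitlines():
        prim = re.match(r"\s*-\s*primary nid:\s*(\S+)", line)
        if prim:
            peers.append({"primary": prim.group(1), "nids": []})
            continue
        nid = re.match(r"\s*-\s*nid:\s*(\S+)", line)
        if nid and peers:
            peers[-1]["nids"].append(nid.group(1))
    return peers


def peers_with_primary_on(text, net="tcp"):
    """Return peers whose primary NID ends in exactly '@<net>'."""
    suffix = "@" + net
    return [p for p in parse_peers(text) if p["primary"].endswith(suffix)]


if __name__ == "__main__":
    for peer in peers_with_primary_on(SAMPLE_PEER_SHOW):
        print(peer["primary"], "->", ", ".join(peer["nids"]))
```

This only reports the affected peers; it does not change anything on the server.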
A fool's resolution would be to just remove the peer entry on the Lustre servers and immediately add back a correct one. But this leads to hiccups that affect the whole file system, possibly leading to reboots of the Lustre servers.
So the solutions to this situation that I can think of are:
- allow lnetctl to remove primary NIDs from peer entries
- dynamically update a peer entry if a peer reconnects with a different configuration
Is one or the other possible, or is a primary NID more than just "the first interface for that peer"?
Is there another way to remove wrong entries in a Lustre server's peer configuration (other than rebooting)?
Thanks,
Uwe