Hi Hiroya,
My apologies for the delay. For some reason I wasn't getting any Jira notifications when comments were added. I will look into that.
I have looked at the case you indicated, and you are absolutely right. The im_a_router flag will not be set when LNetNIInit() is initially called, so the "forwarding" parameter is never checked and an LNet router node does not behave as expected. Thanks for pointing this out.
There are a couple of ways to approach this:
1. Ensure that, for a router node, lustre.conf contains a routes module param entry that corresponds to the NID of the router. This way, when the routes param is parsed, the router node will set the "im_a_router" flag properly and thus allocate the pools. The "lustre_routes_config" script can be used later on to add extra routing entries as desired.
This would be a temporary workaround until DLC, a feature we have in the pipeline, lands. DLC will allow LNet configuration to be modified dynamically.
2. Modify the patch so that, when routes are added, a route NID that corresponds to the local NID causes im_a_router to be set and the pools to be created immediately.
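To make the second approach concrete, here is a rough, self-contained sketch of the intended logic. It is only a model, not the actual LNet code: struct node_state, nid_is_local(), alloc_router_pools() and add_route() are all hypothetical names invented for illustration, and the assumption that the pools are only allocated when forwarding is enabled is mine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for per-node LNet state; not the real structures. */
struct node_state {
    const char *local_nids[4]; /* NIDs configured on this node */
    size_t      n_local;       /* number of valid entries in local_nids */
    bool        forwarding;    /* models the "forwarding" module parameter */
    bool        im_a_router;   /* models the im_a_router flag */
    bool        pools_ready;   /* true once router buffer pools exist */
};

/* Hypothetical helper: does nid name one of this node's own NIDs? */
bool nid_is_local(const struct node_state *st, const char *nid)
{
    for (size_t i = 0; i < st->n_local; i++)
        if (strcmp(st->local_nids[i], nid) == 0)
            return true;
    return false;
}

/* Hypothetical stand-in for allocating the router buffer pools. */
int alloc_router_pools(struct node_state *st)
{
    st->pools_ready = true;
    return 0;
}

/*
 * Model of approach 2: if the gateway NID of a newly added route is one
 * of our own NIDs, mark this node as a router and create the pools right
 * away instead of relying on the check made at initialization time.
 */
int add_route(struct node_state *st, const char *gateway_nid)
{
    assert(st != NULL && gateway_nid != NULL);

    if (!nid_is_local(st, gateway_nid))
        return 0; /* ordinary route through a remote gateway */

    st->im_a_router = true;

    /* Assumption: pools are only needed when forwarding is enabled. */
    if (st->forwarding && !st->pools_ready)
        return alloc_router_pools(st);

    return 0;
}
```

The point of the sketch is that the router check moves from initialization time to route-add time, so routes added later (for example through lustre_routes_config) would still turn the node into a router.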
Please let me know your thoughts on these approaches.
Thanks again for catching this case.
amir
Since this change added the ability to configure LNet routes from a file, and updated the lnet init script to load that config file at start time, I think the reload and probe actions should also be updated so that they reload the configuration from the file when necessary.
From the LSB, this is what the reload action should do:
Similarly probe should do the following (according to SUSE, at least):
I am curious about this because I am working on porting at least the lnet init script to SLES. Amir, am I right to think that the lnet init script should call lustre_routes_config during the reload operation?