[LU-2950] Port LNET routes_config change done by Fujitsu in latest lustre Created: 12/Mar/13 Updated: 07/Aug/14 Due: 20/Mar/13 Resolved: 01/May/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.5.0 |
| Fix Version/s: | Lustre 2.4.0 |
| Type: | Improvement | Priority: | Major |
| Reporter: | Amir Shehata (Inactive) | Assignee: | Amir Shehata (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | Fujitsu, lnet | ||
| Attachments: |
|
||||||||
| Issue Links: |
|
||||||||
| Rank (Obsolete): | 7082 | ||||||||
| Description |
|
Port a change made by Fujitsu where they added a new lnet module param: routes_config, which can specify a file that contains a set of routes to be used by lnet |
| Comments |
| Comment by Keith Mannthey (Inactive) [ 12/Mar/13 ] |
|
Please be aware of: I really don't like the fact of reading a file into the kernel space but I guess it is seen as a necessary evil at this point. Please safeguard all file actions as much as you can. |
| Comment by Amir Shehata (Inactive) [ 15/Mar/13 ] |
|
We took Issac's suggestion and we created a shell script that reads a routes config file then uses lctl to configure lnet with the routes. This avoids opening files in kernel space. |
| Comment by Amir Shehata (Inactive) [ 25/Mar/13 ] |
|
test files |
| Comment by Amir Shehata (Inactive) [ 25/Mar/13 ] |
|
We created two scripts. One to convert from the legacy syntax to the new syntax, and the second to take a file of the new syntax and configure lustre using lctl. The legacy syntax (which is currently being used in the lustre.conf file): The new syntax (which is used in the new scripts): I would think the intention is to move the syntax to the new form in the future, so there isn't three different syntax for configuring routes. Below is sample test runs. I have also attached two files to this Bug that were used for testing the scripts. [ashehata@localhost scripts]$ ~/lustre-master/lustre/scripts/lustre_routes_conversion legacy_routes_test new_routes_test |
| Comment by Keith Mannthey (Inactive) [ 26/Mar/13 ] |
|
Can you submit your scripts into gerrit? |
| Comment by Amir Shehata (Inactive) [ 26/Mar/13 ] |
|
The code inspection is @ http://review.whamcloud.com/#change,5757 |
| Comment by Hiroya Nozaki [ 04/Apr/13 ] |
|
Hi, Amir. Nice to meet you. I heartily agree your implementation because I also think reading a file into kernel space can be a kind of dangerous. Thank you |
| Comment by Amir Shehata (Inactive) [ 12/Apr/13 ] |
|
Hi Hiroya, My apologies for the delay, for some reason I wasn't getting any notifications on Jira of comments being added. I will look into that. I have looked at the case that you have indicated; and you are absolutely right. The im_a_router flag will not be set when initially LNetNIInit() is called, and that will cause the "forwarding" parameter not to be checked, resulting in an lnet router node not behaving as expected. Thanks for pointing this out. There are a couple of way to approach this: 1. Is to ensure that for a router node, in the luster.conf you enter a routes module param entry that corresponds to the NID of the router. This way when the routes param is being parsed a router node will set the "im_a_router" param properly and thus allocate the pools. The script: "lustre_routes_config" can be used later on to add extra routing entries as desired. 2. Is to modify the patch to ensure that when routes are added, if a route NID is added which corresponds to the local NID, then im_a_route is set, and immediately the pools are created. Please let me know your thoughts on these approaches. Thanks again for catching this case. amir |
| Comment by Amir Shehata (Inactive) [ 12/Apr/13 ] |
|
Hi Hiroya, A note to highlight: The DLC (Dynamic Lnet Configuration) will be backwards compatible. (IE, existing configurations will not need to be changed, when DLC is introduced) thanks |
| Comment by Amir Shehata (Inactive) [ 24/Apr/13 ] |
|
I have updated review: |
| Comment by Ryan Haasken [ 07/Aug/14 ] |
|
Since this change added the ability to configure LNet routes from a file, and updated the lnet init script to load that config file at start time, I think the reload and probe actions should be updated so that they will reload the configuration from the file, if necessary. From the LSB, this is what the reload action should do:
Similarly probe should do the following (according to SUSE, at least):
I am curious about this because I am working on porting at least the lnet init script to SLES. Amir, am I right to think that the lnet init script should call lustre_routes_config during the reload operation? |