[LU-10003] lnetctl error "cannot add network: invalid argument" Created: 19/Sep/17 Updated: 26/Jan/24 |
|
| Status: | Reopened |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.10.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Mahmoud Hanafi | Assignee: | James A Simmons |
| Resolution: | Unresolved | Votes: | 2 |
| Labels: | None |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Trying to add a second interface fails with an error:

nbpt-serv1 ~ # ip -4 addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
6: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65520 qdisc mq state UP qlen 1024
    inet 192.168.41.10/24 brd 192.168.41.255 scope global ib0
       valid_lft forever preferred_lft forever
7: ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65520 qdisc mq state UP qlen 1024
    inet 10.151.20.103/18 brd 10.151.63.255 scope global ib1
       valid_lft forever preferred_lft forever
8: ib2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65520 qdisc mq state UP qlen 1024
    inet 192.168.44.10/24 brd 192.168.44.255 scope global ib2
       valid_lft forever preferred_lft forever
9: ib3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65520 qdisc mq state UP qlen 1024
    inet 10.151.63.233/18 brd 10.151.63.255 scope global ib3
       valid_lft forever preferred_lft forever
10: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000
    inet 172.17.0.156/16 brd 172.17.255.255 scope global bond0
       valid_lft forever preferred_lft forever

nbpt-serv1 ~ # lnetctl net show
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
    - net type: o2ib
      local NI(s):
        - nid: 10.151.20.103@o2ib
          status: up
          interfaces:
              0: ib1

nbpt-serv1 ~ # lnetctl net add --net o2ib --if ib3
add:
    - net:
          errno: -22
          descr: "cannot add network: Invalid argument"

nbpt-serv1 ~ # lnetctl net del --net o2ib --if ib1
del:
    - net:
          errno: -22
          descr: "cannot del network: Invalid argument"
|
| Comments |
| Comment by Mahmoud Hanafi [ 19/Sep/17 ] |
|
Looks like it fails here: rc = l_ioctl(LNET_DEV_ID, IOC_LIBCFS_ADD_LOCAL_NI, data);
(gdb) s
l_ioctl (dev_id=dev_id@entry=0, opc=opc@entry=3233310047, buf=buf@entry=0x618190) at util/l_ioctl.c:106
$1 = {lic_cfg_hdr = {ioc_len = 2728, ioc_version = 65547}, lic_nid = 1407375061237737, lic_ni_intf = {
"ib3", '\000' <repeats 124 times>, '\000' <repeats 127 times> <repeats 15 times>},
lic_legacy_ip2nets = '\000' <repeats 127 times>, lic_cpts = {0 <repeats 128 times>}, lic_ncpts = 0, lic_status = 0,
lic_tcp_bonding = 0, lic_idx = 0, lic_dev_cpt = 0, pad = "\000\000\000", lic_bulk = 0x618b98 "q\004\002"}
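The two large integers in the gdb output above (opc and lic_nid) can be unpacked by hand. The following is a minimal sketch, not part of any Lustre tool. It assumes the common Linux _IOC bit layout (8-bit command number, 8-bit type, 14-bit size, 2-bit direction, as on x86) and assumes that a 64-bit LNet NID keeps the network in its upper 32 bits (LND type in the top 16 of those, network number in the bottom 16) and the IPv4 address in its lower 32 bits. The function names decode_ioctl and decode_nid are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for reading the gdb values; not Lustre code.

# Split an ioctl request number into its _IOC fields
# (assumed layout: nr bits 0-7, type bits 8-15, size bits 16-29, dir bits 30-31).
decode_ioctl() {
    opc=$1
    nr=$(( opc & 0xff ))
    type=$(( (opc >> 8) & 0xff ))
    size=$(( (opc >> 16) & 0x3fff ))
    dir=$(( (opc >> 30) & 0x3 ))
    printf 'dir=%d type=0x%x nr=%d size=%d\n' "$dir" "$type" "$nr" "$size"
}

# Split a 64-bit NID into IPv4 address, LND type and network number
# (assumed layout: address in the low 32 bits, network in the high 32 bits).
decode_nid() {
    nid=$1
    addr=$(( nid & 0xffffffff ))
    net=$(( (nid >> 32) & 0xffffffff ))
    lnd=$(( (net >> 16) & 0xffff ))
    num=$(( net & 0xffff ))
    printf '%d.%d.%d.%d lnd=%d netnum=%d\n' \
        $(( (addr >> 24) & 0xff )) $(( (addr >> 16) & 0xff )) \
        $(( (addr >> 8) & 0xff )) $(( addr & 0xff )) "$lnd" "$num"
}

decode_ioctl 3233310047        # prints: dir=3 type=0x65 nr=95 size=184
decode_nid 1407375061237737    # prints: 10.151.63.233 lnd=5 netnum=0
```

Under those assumptions the request is a read/write ioctl of type 0x65 ('e'), number 95, and the NID carries 10.151.63.233, which is the ib3 address shown in the description, so the user-space side appears to be building the request for the intended interface.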
|
| Comment by Peter Jones [ 19/Sep/17 ] |
|
Amir, could you please advise? Thanks, Peter |
| Comment by Amir Shehata (Inactive) [ 20/Sep/17 ] |
|
If you try to add the same interface twice, it'll fail the second time, but it should be able to add a different interface. Can you please enable debug logs and attach them here:
lctl set_param debug=+net
lctl set_param debug=+neterror
lnetctl net add --net o2ib --if ib3
lctl dk > log
Also, just to verify, can you please paste the output from:
lnetctl -h |
| Comment by Mahmoud Hanafi [ 21/Sep/17 ] |
|
Here is the output:

elrtr10 ~ # lctl clear;lctl set_param debug=+net;lctl set_param debug=+neterror;lnetctl net add --net o2ib --if ib0;lctl dk > log
debug=+net
debug=+neterror
add:
    - net:
          errno: -22
          descr: "cannot add network: Invalid argument"

elrtr10 ~ # cat log
00000400:00000080:38.0F:1506037088.299967:0:3941:0:(module.c:121:libcfs_ioctl()) libcfs ioctl cmd 3233310047
Debug log: 1 lines, 1 kept, 0 dropped, 0 bad.

elrtr10 ~ # ip -o -4 add show
1: lo inet 127.0.0.1/8 scope host lo\ valid_lft forever preferred_lft forever
6: ib0 inet 10.151.26.140/18 brd 10.151.63.255 scope global ib0\ valid_lft forever preferred_lft forever
7: ib1 inet 10.151.26.60/18 brd 10.151.63.255 scope global ib1\ valid_lft forever preferred_lft forever
8: ib2 inet 10.149.26.140/18 brd 10.149.63.255 scope global ib2\ valid_lft forever preferred_lft forever
9: ib3 inet 10.149.26.60/18 brd 10.149.63.255 scope global ib3\ valid_lft forever preferred_lft forever
10: bond0 inet 172.17.0.160/16 brd 172.17.255.255 scope global bond0\ valid_lft forever preferred_lft forever

elrtr10 ~ # lnetctl -h
Try interactive use without arguments or use one of:
"lnet" "route" "net" "routing" "set" "import" "export" "stats" "numa" "peer" "help" "exit" "quit" "--list-commands"
as argument. |
| Comment by Amir Shehata (Inactive) [ 22/Sep/17 ] |
|
Can you try this:
lnetctl lnet configure
lnetctl net add --net o2ib --if ib0
If you're bringing up LNet using lctl net up then you'll need to run lnetctl lnet configure before you're able to use the rest of the lnetctl commands. In the future, it'll be a good idea to use the lnetctl utility for everything:
modprobe lnet
lnetctl lnet configure
lnetctl net add --net o2ib --if ib0,ib1 # or whatever interfaces you'd like to add
# other lnetctl commands
|
| Comment by Mahmoud Hanafi [ 22/Sep/17 ] |
|
We load the lustre module and that starts everything. Here is our /etc/modprobe.d/lustre.conf:
options ko2iblnd require_privileged_port=0 use_privileged_port=0
options ko2iblnd timeout=150 retry_count=7 map_on_demand=32 peer_credits=63 concurrent_sends=63
options ko2iblnd ntx=32768 credits=32768 fmr_pool_size=8193
#lnet
options lnet networks="o2ib(ib0,ib1),o2ib313(ib2,ib3)" forwarding=enabled
#options lnet networks="o2ib(ib1),o2ib313(ib3)" forwarding=enabled
options lnet avoid_asym_router_failure=1 check_routers_before_use=1 small_router_buffers=65536 large_router_buffers=8192
options ptlrpc at_max=600 at_min=150 |
| Comment by Amir Shehata (Inactive) [ 22/Sep/17 ] |
|
Yes, so in that case you'll need to call "lnetctl lnet configure"; otherwise you won't be able to add networks or use other lnetctl commands. What you're doing with the module parameters can also be done dynamically using lnetctl. The added benefit of doing it dynamically is that you're able to assign tunables per network instead of globally, for example if you want map-on-demand to differ between networks (this becomes useful if you're using OPA and MLX, say, and want to tune them differently on the router). You can put all the YAML configuration in /etc/lnet.conf and then start lnet as a service, which will import that file. |
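To illustrate the dynamic route described above, here is a minimal sketch that turns a networks= module-parameter string into the equivalent lnetctl net add commands. It only prints the commands and does not run them. The function name networks_to_lnetctl is hypothetical, and the sketch assumes every network entry carries an explicit interface list in parentheses; tunables and forwarding=enabled are not translated.

```shell
#!/bin/sh
# Hypothetical helper: print one "lnetctl net add" line per network found in
# an lnet networks= string such as "o2ib(ib0,ib1),o2ib313(ib2,ib3)".
# Assumes each entry has the form net(if1,if2,...).
networks_to_lnetctl() {
    # Splitting on ')' leaves one "net(if1,if2" fragment per line.
    printf '%s\n' "$1" | tr ')' '\n' | while IFS= read -r entry; do
        entry=${entry#,}            # drop the comma that separated entries
        [ -n "$entry" ] || continue # skip the empty trailing fragment
        net=${entry%%\(*}           # text before the first '('
        intfs=${entry#*\(}          # text after the first '('
        printf 'lnetctl net add --net %s --if %s\n' "$net" "$intfs"
    done
}

# Example using the networks string from the lustre.conf quoted in this ticket.
echo 'modprobe lnet'
echo 'lnetctl lnet configure'
networks_to_lnetctl 'o2ib(ib0,ib1),o2ib313(ib2,ib3)'
```

The printed sequence follows the order given earlier in this ticket: load the module, run lnetctl lnet configure, then add each network.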
| Comment by Mahmoud Hanafi [ 24/Sep/17 ] |
|
The issue was not running The documentation should be more clear to indicate this. |
| Comment by Amir Shehata (Inactive) [ 25/Sep/17 ] |
|
i'll push a patch to the manual to clarify that. |
| Comment by Amir Shehata (Inactive) [ 04/Oct/17 ] |
| Comment by James A Simmons [ 13/Oct/17 ] |
|
So in reality lnetctl is no longer optional. It has to be installed on a node and run at startup time. In reality lctl net no longer works by itself. |
| Comment by Amir Shehata (Inactive) [ 13/Oct/17 ] |
|
If you don't want to use MR (Multi-Rail) and you configure everything from module parameters, then you don't need to use lnetctl. For MR and future LNet features, lnetctl will be the utility to use. No more updates will be made to lctl. I'm looking at modifying the makefiles to always build lnetctl and install it properly. |
| Comment by James A Simmons [ 13/Oct/17 ] |
|
Awesome. I was looking at doing the same thing, especially since the lustre tools are also looking to use libyaml. The reason I was also looking at this is that I use the maloo test suite but set up lnet using lnetctl. The test suite doesn't really like that, so I was looking to make the test suite work flawlessly with lnetctl. |
| Comment by Peter Jones [ 18/Dec/17 ] |
|
Is any further work needed on this ticket or can the ticket be closed now that the manual has been updated? |
| Comment by James A Simmons [ 18/Dec/17 ] |
|
Andreas talked about pushing patches that tell the user that lctl net is obsolete. Perhaps this is the perfect ticket to push those patches under? |
| Comment by Gerrit Updater [ 05/Jan/18 ] |
|
Amir Shehata (amir.shehata@intel.com) uploaded a new patch: https://review.whamcloud.com/30755 |
| Comment by James A Simmons [ 05/Jan/18 ] |
|
Patch pushed to mark the lctl net functions as deprecated, now that the work to make lnetctl a hard requirement has landed for Lustre 2.11. |
| Comment by Gerrit Updater [ 20/Jan/18 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/30755/ |
| Comment by Peter Jones [ 20/Jan/18 ] |
|
Landed for 2.11 |
| Comment by Gerrit Updater [ 22/Jan/18 ] |
|
Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/30968 |
| Comment by Andreas Dilger [ 24/Jan/18 ] |
|
The deprecation messages are not very clear. For example, now when I run sanity.sh it reports (among other things):
This command has been deprecated. Plesae use 'lnetctl net show'.
...
but it isn't clear what "This command" is, so I don't even know what command to start looking for in the test scripts. It should print the command name, like:
lctl: 'list_nids' is deprecated. Please use 'lnetctl net show'.
Please fix the typo "Plesae" in the error message as well.
Also, use of the deprecated commands like "lctl list_nids" by the test scripts should be replaced by "lnetctl net show", and equivalent for "lctl network" and "lctl ping". It looks like lnetctl has existed since version v2_6_54_0-13-g0f753ea, so this should be safe to land for master (we test 2.11 interop against 2.7, but no longer against 2.5). |
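The message format requested above can be sketched as a small helper. This is only an illustration of the wording; the real change would be in lctl's C source, and the function name deprecation_notice and its argument order are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build a deprecation message that names the tool and
# the deprecated command, and points at the replacement.
#   $1 = tool name, $2 = deprecated command, $3 = suggested replacement
deprecation_notice() {
    tool=$1
    old=$2
    new=$3
    printf "%s: '%s' is deprecated. Please use '%s'.\n" "$tool" "$old" "$new"
}

deprecation_notice lctl list_nids 'lnetctl net show'
# prints: lctl: 'list_nids' is deprecated. Please use 'lnetctl net show'.
```

Naming the command in the message lets someone reading test output search the scripts for the exact call that triggered it.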
| Comment by Amir Shehata (Inactive) [ 25/Jan/18 ] |
|
Andreas, besides fixing the deprecation message, are you suggesting we also change lctl usage to lnetctl usage in the test scripts as part of this ticket? |
| Comment by Gerrit Updater [ 26/Jan/18 ] |
|
Amir Shehata (amir.shehata@intel.com) uploaded a new patch: https://review.whamcloud.com/31030 |
| Comment by Gerrit Updater [ 26/Jan/18 ] |
|
Amir Shehata (amir.shehata@intel.com) uploaded a new patch: https://review.whamcloud.com/31031 |
| Comment by Amir Shehata (Inactive) [ 02/Feb/18 ] |
|
Chatted with jhammond regarding the deprecation, and he thinks that deprecating list_nids is going to cause trouble for customers. We wanted to get some feedback on whether we should remove list_nids from the set of commands being deprecated. |
| Comment by Gerrit Updater [ 06/Feb/18 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/31030/ |
| Comment by John Hammond [ 27/Feb/18 ] |
|
I think the deprecation warnings should be removed entirely. |
| Comment by James A Simmons [ 27/Feb/18 ] |
|
In order for that to happen lctl would have to work again. The changes to the LNet layer for multirail support broke the lctl net functionality. As a side note, the current lnetctl tools don't work with pre-2.10 versions. |
| Comment by Peter Jones [ 06/Mar/18 ] |
|
John L. Hammond (john.hammond@intel.com) uploaded a new patch: https://review.whamcloud.com/31534 |
| Comment by Gerrit Updater [ 08/Mar/18 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/31534/ |
| Comment by Mahmoud Hanafi [ 29/Aug/18 ] |
|
this can be closed |
| Comment by Peter Jones [ 30/Aug/18 ] |
|
I think that this ticket had been kept open because https://review.whamcloud.com/#/c/31031/ has not landed yet. Is this still needed? If so, let's reopen the ticket and rebase the patch to get it landed; if not, let's abandon the patch. |
| Comment by Gerrit Updater [ 10/Oct/22 ] |
|
"James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/48814 |
| Comment by James A Simmons [ 10/Oct/22 ] |
|
Using this ticket to unify the MR and pre-MR APIs. We can kill off the old ioctls and keep the lctl LNet functionality, since it will never go away. |
| Comment by Gerrit Updater [ 08/Nov/22 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/48814/ |
| Comment by James A Simmons [ 08/Nov/22 ] |
|
More patches to go. |
| Comment by Gerrit Updater [ 08/Nov/22 ] |
|
"James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49068 |
| Comment by Gerrit Updater [ 10/Nov/22 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49068/ |
| Comment by Gerrit Updater [ 10/Dec/22 ] |
|
"James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49360 |
| Comment by Gerrit Updater [ 19/Jan/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49360/ |
| Comment by James A Simmons [ 19/Jan/23 ] |
|
More work left. |
| Comment by Gerrit Updater [ 27/Mar/23 ] |
|
"James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50440 |
| Comment by Gerrit Updater [ 03/Nov/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50440/ |
| Comment by Gerrit Updater [ 27/Dec/23 ] |
|
"James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/53556 |
| Comment by Gerrit Updater [ 10/Jan/24 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/53556/ |
| Comment by Gerrit Updater [ 26/Jan/24 ] |
|
"Chris Horn <chris.horn@hpe.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/53835 |