[LU-12824] Unable to add single Infiniband interface to multiple o2ib LNets Created: 30/Sep/19 Updated: 05/Feb/20 Resolved: 09/Oct/19 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.13.0, Lustre 2.12.3 |
| Fix Version/s: | Lustre 2.13.0, Lustre 2.12.4 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Chris Horn | Assignee: | Chris Horn |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch | ||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||
| Description |
|
Configuring a single IB interface on multiple LNets was broken by:

commit 75ab841d92a7109cf9f4da69a58ae4d21d360a4c
Author: James Simmons <jsimmons@infradead.org>
Date: Mon Jul 8 10:42:47 2019 -0700

    LU-11893 lnet: consoldate secondary IP address handling

Prior to this commit, when configuring an ib device for multiple LNets, we would only create a single struct ib_dev object. This object was created via a call to kiblnd_create_dev(). That function initializes the ib_dev object with a call to kiblnd_dev_failover(). kiblnd_dev_failover() creates the struct rdma_cm_id object, and calls rdma_bind_addr(). When the ib_dev object is created successfully, it is added to a global list of devices:

    list_add_tail(&dev->ibd_list, &kiblnd_data.kib_devs);

When the interface is added to additional LNets, the kiblnd_startup() routine searches the kiblnd_data.kib_devs list to see if there is an existing ib_dev object for the interface being configured. If it finds one, then that ib_dev object is re-used.

Reproducer with kernel module parameter:

[root@snx11922n002 ~]# cat /etc/lustre/ip2nets.dat
o2ib040(ib0) 10.12.0.*; o2ib041(ib0) 10.12.0.50;
[root@snx11922n002 ~]# modprobe lnet
[root@snx11922n002 ~]# lctl net up
LNET configure error 100: Network is down
[root@snx11922n002 ~]# dmesg | tail
[604327.506043] alg: No test for adler32 (adler32-zlib)
[604327.512517] alg: No test for crc32 (crc32-table)
[604328.280286] LNet: live_router_check_interval and dead_router_check_interval have been deprecated. Use alive_router_check_interval instead. Ignoring these deprecated parameters.
[604330.561491] LNet: 3809:0:(config.c:1641:lnet_inet_enumerate()) lnet: Ignoring interface eth2: it's down
[604330.591143] LNet: Using FastReg for registration
[604330.614353] LNet: Added LNI 10.12.0.50@o2ib40 [16/2048/0/0]
[604330.621410] LNetError: 3809:0:(o2iblnd.c:2776:kiblnd_dev_failover()) Failed to bind ib0:10.12.0.50 to device(ffff881f96ff8000): -98
[604330.636010] LNetError: 3809:0:(o2iblnd.c:3266:kiblnd_startup()) ko2iblnd: Can't initialize device: rc = -98
[604330.647163] LNetError: 105-4: Error -100 starting up LNI o2ib
[604331.659240] LNet: Removed LNI 10.12.0.50@o2ib40

Reproducer with lnetctl:

[root@snx11922n002 ~]# modprobe lnet
[root@snx11922n002 ~]# lctl mark mark
[root@snx11922n002 ~]# lnetctl lnet configure
[root@snx11922n002 ~]# lnetctl net add --net o2ib040 --if ib0
[root@snx11922n002 ~]# lnetctl net add --net o2ib041 --if ib0
add:
    - net:
          errno: -100
          descr: "cannot add network: Network is down"
[root@snx11922n002 ~]# dmesg | tail
[604760.221364] alg: No test for crc32 (crc32-table)
[604760.983433] LNet: live_router_check_interval and dead_router_check_interval have been deprecated. Use alive_router_check_interval instead. Ignoring these deprecated parameters.
[604763.557036] Lustre: DEBUG MARKER: mark
[604777.372005] LNet: 7487:0:(config.c:1641:lnet_inet_enumerate()) lnet: Ignoring interface eth2: it's down
[604777.382924] LNet: Using FastReg for registration
[604777.402400] LNet: Added LNI 10.12.0.50@o2ib40 [16/2048/0/0]
[604781.025699] LNet: 7528:0:(config.c:1641:lnet_inet_enumerate()) lnet: Ignoring interface eth2: it's down
[604781.036209] LNetError: 7528:0:(o2iblnd.c:2776:kiblnd_dev_failover()) Failed to bind ib0:10.12.0.50 to device(ffff881f96ff8000): -98
[604781.050103] LNetError: 7528:0:(o2iblnd.c:3266:kiblnd_startup()) ko2iblnd: Can't initialize device: rc = -98
[604781.060933] LNetError: 105-4: Error -100 starting up LNI o2ib
[root@snx11922n002 ~]#
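
The device re-use described above (search a global list for the interface, create and bind only on a miss) can be illustrated with a small user-space sketch. This is not the ko2iblnd code: the demo_* names, the singly linked list, and the demo_binds counter are all hypothetical stand-ins, and the "bind" is only simulated by incrementing a counter.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_IFNAME_MAX 16

/* Hypothetical stand-in for the per-interface device object. */
struct demo_dev {
    char ifname[DEMO_IFNAME_MAX];
    int nnets;                  /* number of LNets sharing this device */
    struct demo_dev *next;
};

struct demo_dev *demo_devs;     /* global device list (assumption) */
int demo_binds;                 /* how many times an address was "bound" */

/* Look for an existing device for this interface name. */
struct demo_dev *demo_dev_search(const char *ifname)
{
    struct demo_dev *dev;

    for (dev = demo_devs; dev != NULL; dev = dev->next)
        if (strcmp(dev->ifname, ifname) == 0)
            return dev;
    return NULL;
}

/* Create a device, simulate the one-time bind, append it to the list. */
struct demo_dev *demo_dev_create(const char *ifname)
{
    struct demo_dev *dev;
    struct demo_dev **tail;

    assert(ifname != NULL);
    if (strlen(ifname) >= DEMO_IFNAME_MAX)
        return NULL;

    dev = calloc(1, sizeof(*dev));
    if (dev == NULL)
        return NULL;

    strcpy(dev->ifname, ifname);
    demo_binds++;               /* the address is bound once per device */

    for (tail = &demo_devs; *tail != NULL; tail = &(*tail)->next)
        ;
    *tail = dev;
    return dev;
}

/* Called once per LNet: re-use the device if one already exists. */
struct demo_dev *demo_startup(const char *ifname)
{
    struct demo_dev *dev = demo_dev_search(ifname);

    if (dev == NULL)
        dev = demo_dev_create(ifname);
    if (dev != NULL)
        dev->nnets++;
    return dev;
}
```

In this sketch, calling demo_startup("ib0") twice returns the same object and binds only once. Binding the same address a second time is what fails with -98 (EADDRINUSE) in the logs above.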
|
| Comments |
| Comment by Gerrit Updater [ 30/Sep/19 ] |
|
Chris Horn (hornc@cray.com) uploaded a new patch: https://review.whamcloud.com/36324 |
| Comment by Gerrit Updater [ 30/Sep/19 ] |
|
Chris Horn (hornc@cray.com) uploaded a new patch: https://review.whamcloud.com/36325 |
| Comment by Gerrit Updater [ 30/Sep/19 ] |
|
Chris Horn (hornc@cray.com) uploaded a new patch: https://review.whamcloud.com/36326 |
| Comment by James A Simmons [ 01/Oct/19 ] |
|
If this is really supported I don't think this has been explored for ksocklnd. Does it work there as well? |
| Comment by Chris Horn [ 01/Oct/19 ] |
|
Yes, it works with ksocklnd.

sles15build01:~ # lnetctl net add --net tcp --if eth0
sles15build01:~ # lnetctl net add --net tcp1 --if eth0
sles15build01:~ # lctl list_nids
192.168.2.20@tcp
192.168.2.20@tcp1
sles15build01:~ # |
| Comment by James A Simmons [ 01/Oct/19 ] |
|
If such a configuration is allowed, this opens up issues about failover pairs and how health behaves in this kind of setup. I do expect there are corner cases hidden in such a setup. If this is allowed, I guess IP alias support is not really needed. |
| Comment by Chris Horn [ 01/Oct/19 ] |
|
I'm really surprised by your reaction to this change. Cray published the LNet fine grained routing paper at CUG 2013. We've been using this kind of config in production for 6 years. |
| Comment by James A Simmons [ 01/Oct/19 ] |
|
At ORNL we implemented this differently. It just comes as a surprise that such a setup was possible. Talking to Olaf, he had the same reaction. I'm not against supporting such a setup, but with LNet health and failover pairing I wonder what corner cases could exist. We should really exercise this in your LNet test suite. I have had this happen in the past on other projects. The API is not clearly defined in some areas, and some company implements something no one expected. Then the change is brought before the standards board to sort it out. In this case it's Amir. |
| Comment by Amir Shehata (Inactive) [ 01/Oct/19 ] |
|
This config has historically been supported. LNet is designed to act as a virtual network over the physical network. One use case for this configuration is to segregate LNet traffic going over the same interface. As regards interaction with health: since at the LNet level these are two different NIDs, their health values will be managed independently. When there is a failure to send over one of these NIDs, its health value will be decremented and it will be added to the recovery queue. As far as I can see it should work at the LNet level. The drawbacks I see with this type of configuration are performance and security: performance since you're sharing the same link, and security because traffic is using the same link, so you can sniff traffic on both NIDs. |
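
The per-NID health behaviour described in the comment above can be sketched as follows. This is an illustration only, not LNet code: the demo_* names, the starting value of 1000, and the decrement step of 100 are assumptions chosen for the example. The point it shows is that two NIDs on the same physical interface carry separate health state, so a send failure on one leaves the other untouched.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_HEALTH  1000   /* assumed starting/maximum health value */
#define DEMO_HEALTH_STEP 100    /* assumed decrement per send failure */

/* Hypothetical per-NID state; one instance per NID, not per interface. */
struct demo_ni {
    const char *nid;
    int health;
    int on_recovery_queue;
};

void demo_ni_init(struct demo_ni *ni, const char *nid)
{
    assert(ni != NULL);
    ni->nid = nid;
    ni->health = DEMO_MAX_HEALTH;
    ni->on_recovery_queue = 0;
}

/* A send over this NID failed: lower its health and queue it for recovery. */
void demo_send_failed(struct demo_ni *ni)
{
    assert(ni != NULL);
    ni->health -= DEMO_HEALTH_STEP;
    if (ni->health < 0)
        ni->health = 0;
    ni->on_recovery_queue = 1;
}
```

With two instances initialised for 10.12.0.50@o2ib40 and 10.12.0.50@o2ib41, a call to demo_send_failed() on the first changes only that instance; the second keeps its full health value and stays off the recovery queue.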
| Comment by Gerrit Updater [ 04/Oct/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36324/ |
| Comment by Gerrit Updater [ 09/Oct/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36325/ |
| Comment by Gerrit Updater [ 09/Oct/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36326/ |
| Comment by Peter Jones [ 09/Oct/19 ] |
|
Landed for 2.13 |
| Comment by Alex Parga [ 21/Oct/19 ] |
|
Is this fix expected to land for 2.12? |
| Comment by Peter Jones [ 22/Oct/19 ] |
|
I have marked it as a candidate for a future 2.12.x release. |
| Comment by Gerrit Updater [ 22/Oct/19 ] |
|
Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36545 |
| Comment by Gerrit Updater [ 22/Oct/19 ] |
|
Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36546 |
| Comment by Gerrit Updater [ 22/Oct/19 ] |
|
Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36547 |
| Comment by Gerrit Updater [ 21/Nov/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36545/ |
| Comment by Gerrit Updater [ 21/Nov/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36546/ |
| Comment by Gerrit Updater [ 21/Nov/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36547/ |