LU-12824

Unable to add single Infiniband interface to multiple o2ib LNets

    Description

      Configuring a single IB interface on multiple LNets was broken by

      commit 75ab841d92a7109cf9f4da69a58ae4d21d360a4c
      Author: James Simmons <jsimmons@infradead.org>
      Date:   Mon Jul 8 10:42:47 2019 -0700
      
         LU-11893 lnet: consoldate secondary IP address handling
      

      Prior to this commit, when configuring an ib device for multiple LNets, we would only create a single struct ib_dev object. This object was created via a call to kiblnd_create_dev(). That function initializes the ib_dev object with a call to kiblnd_dev_failover(). kiblnd_dev_failover() creates the struct rdma_cm_id object, and calls rdma_bind_addr(). When the ib_dev object is created successfully, it is added to a global list of devices:

              list_add_tail(&dev->ibd_list,
                                &kiblnd_data.kib_devs);
      

      When the interface was added to additional LNets, the kiblnd_startup() routine searched the kiblnd_data.kib_devs list to see if there was an existing ib_dev object for the interface being configured. If it found one, that ib_dev object was re-used.
      The LU-11893 patch noted above removed the logic for searching this list for an existing ib_dev object. kiblnd_startup() now always creates a new ib_dev object, so the same address is bound a second time, which I believe is what results in the EADDRINUSE (-98) error.
      It should be pretty straightforward to re-introduce the logic for searching the kib_devs list.
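The search-before-create behaviour described above can be illustrated with a small userspace sketch. This is not the o2iblnd code: every `demo_*` name is hypothetical, a plain singly linked list stands in for `kiblnd_data.kib_devs`, and a counter stands in for the bind that the real driver performs when a device is created. It only shows why re-using an existing device object avoids binding the same address twice.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-interface device object. */
struct demo_dev {
    char             ifname[16]; /* interface name, e.g. "ib0" */
    int              nnets;      /* number of LNets sharing this device */
    struct demo_dev *next;
};

/* Hypothetical stand-in for the global device list. */
static struct demo_dev *demo_devs;

/* Counts simulated binds; the real driver binds an address instead. */
static int demo_bind_calls;

/* Return the existing device for ifname, or NULL if there is none. */
struct demo_dev *demo_dev_search(const char *ifname)
{
    struct demo_dev *dev;

    for (dev = demo_devs; dev != NULL; dev = dev->next)
        if (strcmp(dev->ifname, ifname) == 0)
            return dev;
    return NULL;
}

/* Create a device, "bind" it once, and link it into the list. */
struct demo_dev *demo_dev_create(const char *ifname)
{
    struct demo_dev *dev;

    assert(ifname != NULL);
    dev = calloc(1, sizeof(*dev));
    if (dev == NULL)
        return NULL;
    snprintf(dev->ifname, sizeof(dev->ifname), "%s", ifname);
    demo_bind_calls++;      /* a second bind of the same address would fail */
    dev->next = demo_devs;  /* list order is irrelevant for this sketch */
    demo_devs = dev;
    return dev;
}

/* Startup path: re-use an existing device, create one only if needed. */
struct demo_dev *demo_startup(const char *ifname)
{
    struct demo_dev *dev = demo_dev_search(ifname);

    if (dev == NULL)
        dev = demo_dev_create(ifname);
    if (dev != NULL)
        dev->nnets++;
    return dev;
}
```

Calling `demo_startup("ib0")` twice returns the same object and performs only one simulated bind. Dropping the `demo_dev_search()` call would reproduce the shape of the regression: a second create, and a second bind, for the same interface.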

      Reproducer with kernel module parameter:

      [root@snx11922n002 ~]# cat /etc/lustre/ip2nets.dat
      o2ib040(ib0) 10.12.0.*;
      o2ib041(ib0) 10.12.0.50;
      [root@snx11922n002 ~]# modprobe lnet
      [root@snx11922n002 ~]# lctl net up
      LNET configure error 100: Network is down
      [root@snx11922n002 ~]# dmesg | tail
      [604327.506043] alg: No test for adler32 (adler32-zlib)
      [604327.512517] alg: No test for crc32 (crc32-table)
      [604328.280286] LNet: live_router_check_interval and dead_router_check_interval have been deprecated. Use alive_router_check_interval instead. Ignoring these deprecated parameters.
      [604330.561491] LNet: 3809:0:(config.c:1641:lnet_inet_enumerate()) lnet: Ignoring interface eth2: it's down
      [604330.591143] LNet: Using FastReg for registration
      [604330.614353] LNet: Added LNI 10.12.0.50@o2ib40 [16/2048/0/0]
      [604330.621410] LNetError: 3809:0:(o2iblnd.c:2776:kiblnd_dev_failover()) Failed to bind ib0:10.12.0.50 to device(ffff881f96ff8000): -98
      [604330.636010] LNetError: 3809:0:(o2iblnd.c:3266:kiblnd_startup()) ko2iblnd: Can't initialize device: rc = -98
      [604330.647163] LNetError: 105-4: Error -100 starting up LNI o2ib
      [604331.659240] LNet: Removed LNI 10.12.0.50@o2ib40
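
The two return codes in the log above are negated Linux errno values: on Linux, 98 is EADDRINUSE (the bind failure) and 100 is ENETDOWN (the error reported when the LNI fails to start). A minimal sketch for decoding them follows; the `demo_*` helpers are hypothetical and are not part of Lustre or LNet.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Map a kernel-style negative return code to its symbolic errno name.
 * Only the two codes that appear in the log are handled. */
const char *demo_errname(int rc)
{
    switch (-rc) {
    case EADDRINUSE:
        return "EADDRINUSE";
    case ENETDOWN:
        return "ENETDOWN";
    default:
        return "unknown";
    }
}

/* Format rc, its symbolic name, and the C library's message into buf.
 * Returns the snprintf() result. */
int demo_describe(int rc, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "rc = %d (%s): %s",
                    rc, demo_errname(rc), strerror(-rc));
}
```

Passing `-98` and `-100` to `demo_describe()` on a Linux host yields the symbolic names next to the library's message text, which makes it easier to see that the "Network is down" reported by lctl is a secondary error and the bind failure is the root cause.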
      

      Reproducer with lnetctl:

      [root@snx11922n002 ~]# modprobe lnet
      [root@snx11922n002 ~]# lctl mark mark
      [root@snx11922n002 ~]# lnetctl lnet configure
      [root@snx11922n002 ~]# lnetctl net add --net o2ib040 --if ib0
      [root@snx11922n002 ~]# lnetctl net add --net o2ib041 --if ib0
      add:
          - net:
                errno: -100
                descr: "cannot add network: Network is down"
      [root@snx11922n002 ~]# dmesg | tail
      [604760.221364] alg: No test for crc32 (crc32-table)
      [604760.983433] LNet: live_router_check_interval and dead_router_check_interval have been deprecated. Use alive_router_check_interval instead. Ignoring these deprecated parameters.
      [604763.557036] Lustre: DEBUG MARKER: mark
      [604777.372005] LNet: 7487:0:(config.c:1641:lnet_inet_enumerate()) lnet: Ignoring interface eth2: it's down
      [604777.382924] LNet: Using FastReg for registration
      [604777.402400] LNet: Added LNI 10.12.0.50@o2ib40 [16/2048/0/0]
      [604781.025699] LNet: 7528:0:(config.c:1641:lnet_inet_enumerate()) lnet: Ignoring interface eth2: it's down
      [604781.036209] LNetError: 7528:0:(o2iblnd.c:2776:kiblnd_dev_failover()) Failed to bind ib0:10.12.0.50 to device(ffff881f96ff8000): -98
      [604781.050103] LNetError: 7528:0:(o2iblnd.c:3266:kiblnd_startup()) ko2iblnd: Can't initialize device: rc = -98
      [604781.060933] LNetError: 105-4: Error -100 starting up LNI o2ib
      [root@snx11922n002 ~]#
      

        Activity

          Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36547
          Subject: LU-12824 o2ib: Record rc in debug log on startup failure
          Project: fs/lustre-release
          Branch: b2_12
          Current Patch Set: 1
          Commit: 9d021ae9f819f8a15812c90af33a0604452b4bf9


          Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36546
          Subject: LU-12824 o2ib: Fix whitespace in kiblnd_startup
          Project: fs/lustre-release
          Branch: b2_12
          Current Patch Set: 1
          Commit: 1f04b73ce39a9d181d0ba689bbaf993f348ea250


          Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36545
          Subject: LU-12824 o2ib: Reintroduce kiblnd_dev_search
          Project: fs/lustre-release
          Branch: b2_12
          Current Patch Set: 1
          Commit: fe6666b21f421d0fd948489ce8d30c007f5d94f1

          pjones Peter Jones added a comment -

          I have marked it as a candidate for a future 2.12.x release.

          apargal Alex Parga added a comment -

          Is this fix expected to land for 2.12?

          pjones Peter Jones added a comment -

          Landed for 2.13


          Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36326/
          Subject: LU-12824 o2ib: Reintroduce kiblnd_dev_search
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: e25e45c612a061031e8b4b5233137fbb57b50cc4


          Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36325/
          Subject: LU-12824 o2ib: Record rc in debug log on startup failure
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: 99f85541a685df82265f18167e91c161c523ce50


          Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36324/
          Subject: LU-12824 o2ib: Fix whitespace in kiblnd_startup
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: 50300e83e4cab3157149107eb735825cc4c3aff1


          ashehata Amir Shehata (Inactive) added a comment -

          This config has historically been supported. LNet is designed to act as a virtual network over the physical network. One use case for this configuration is to segregate LNet traffic going over the same interface.

          As regards interaction with health: since these are two different NIDs at the LNet level, their health values will be managed independently. When a send over one of these NIDs fails, that NID's health value will be decremented and the NID will be added to the recovery queue. As far as I can see, it should work at the LNet level.

          The drawbacks I see with this type of configuration are performance and security: performance, because both LNets share the same link; security, because the traffic uses the same link, so anyone sniffing it sees traffic on both NIDs.
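
The point about independent health values can be sketched as follows. This is a hypothetical model rather than LNet code: the `demo_*` names, the maximum health value, and the per-failure decrement are all assumptions chosen for illustration. It shows two NIDs on the same interface whose health is tracked separately, so a failed send on one leaves the other untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_HEALTH  1000 /* assumed starting health, for illustration */
#define DEMO_DECREMENT    100 /* assumed penalty per failed send */

/* Hypothetical per-NID state; one physical interface may back several. */
struct demo_ni {
    const char *nid;               /* e.g. "10.12.0.50@o2ib40" */
    const char *ifname;            /* shared interface, e.g. "ib0" */
    int         health;            /* tracked per NID, not per interface */
    bool        on_recovery_queue; /* set once a send has failed */
};

/* Initialize a NID at full health and off the recovery queue. */
void demo_ni_init(struct demo_ni *ni, const char *nid, const char *ifname)
{
    assert(ni != NULL);
    ni->nid = nid;
    ni->ifname = ifname;
    ni->health = DEMO_MAX_HEALTH;
    ni->on_recovery_queue = false;
}

/* Record a failed send: lower this NID's health and queue it for recovery. */
void demo_send_failed(struct demo_ni *ni)
{
    assert(ni != NULL);
    ni->health -= DEMO_DECREMENT;
    if (ni->health < 0)
        ni->health = 0;
    ni->on_recovery_queue = true;
}
```

With `10.12.0.50@o2ib40` and `10.12.0.50@o2ib41` both initialized on `ib0`, calling `demo_send_failed()` on the first changes only that NID's state. The sketch does not model what happens when the shared link itself fails, which is the kind of corner case worth covering in testing.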

          simmonsja James A Simmons added a comment -

          At ORNL we implemented this differently. It just comes as a surprise that such a setup was possible. When I talked to Olaf, he had the same reaction. I'm not against supporting such a setup, but with LNet health and failover pairing I wonder what corner cases could exist. We should really exercise this in your LNet test suite.

          I have had this happen in the past on other projects. The API is not clearly defined in some area, and some company implements something no one expected. Then the change is brought before the standards board to sort it out. In this case it's Amir.

          People

            hornc Chris Horn