[LU-11893] doesn't handle logical network interface properly.

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.14.0
    • None
    • None
    • 2.12
    • 3

    Description

       # ifconfig | grep ib
       Infiniband hardware address can be incorrect! Please read BUGS section in ifconfig(8).
       Infiniband hardware address can be incorrect! Please read BUGS section in ifconfig(8).
       Infiniband hardware address can be incorrect! Please read BUGS section in ifconfig(8).
       Infiniband hardware address can be incorrect! Please read BUGS section in ifconfig(8).
       ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 2044
       infiniband 20:00:10:86:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00 txqueuelen 256 (InfiniBand)
       ib0:0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 2044
       infiniband 20:00:10:86:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00 txqueuelen 256 (InfiniBand)
       ib1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 2044
       infiniband 20:00:18:86:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00 txqueuelen 256 (InfiniBand)
       ib1:0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 2044
       infiniband 20:00:18:86:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00 txqueuelen 256 (InfiniBand)
      

      Lustre 2.10.5 works well: all four LNIs come up with the following configuration.

       # cat /etc/modprobe.d/lustre.conf 
       options lnet networks="o2ib0(ib0), o2ib1(ib0:0), o2ib2(ib1), o2ib3(ib1:0)"
       # modprobe lustre
      
       Jan 28 12:52:17 ai200-7f94-vm00 kernel: LNet: HW NUMA nodes: 1, HW CPU cores: 16, npartitions: 4
       Jan 28 12:52:17 ai200-7f94-vm00 kernel: alg: No test for adler32 (adler32-zlib)
       Jan 28 12:52:18 ai200-7f94-vm00 kernel: Lustre: Lustre: Build Version: 2.10.5_ddn7_2_g7fd8383
       Jan 28 12:52:18 ai200-7f94-vm00 kernel: LNet: Using FastReg for registration
       Jan 28 12:52:18 ai200-7f94-vm00 kernel: LNet: Added LNI 172.16.251.20@o2ib [8/256/0/180]
       Jan 28 12:52:18 ai200-7f94-vm00 kernel: LNet: Added LNI 172.16.252.20@o2ib1 [8/256/0/180]
       Jan 28 12:52:18 ai200-7f94-vm00 kernel: LNet: Added LNI 172.16.253.20@o2ib2 [8/256/0/180]
       Jan 28 12:52:18 ai200-7f94-vm00 kernel: LNet: Added LNI 172.16.254.20@o2ib3 [8/256/0/180]
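
To make the mapping in the `networks` option explicit, here is a minimal Python sketch that splits the string into (network, interface) pairs. It is only an illustration: `parse_lnet_networks` is a hypothetical helper, not part of Lustre, and it assumes the simple one-interface-per-network form used in this report (the real LNet syntax accepts more than this).

```python
import re

def parse_lnet_networks(spec):
    """Split an LNet `networks` string into (net, interface) pairs.

    Assumes each entry has the form net(interface) with exactly one
    interface, as in the configuration shown in this report.
    """
    pairs = []
    for entry in spec.split(","):
        entry = entry.strip()
        match = re.fullmatch(r"([A-Za-z0-9]+)\(([^)]+)\)", entry)
        if match is None:
            raise ValueError(f"unrecognised entry: {entry!r}")
        pairs.append((match.group(1), match.group(2).strip()))
    return pairs

spec = "o2ib0(ib0), o2ib1(ib0:0), o2ib2(ib1), o2ib3(ib1:0)"
for net, iface in parse_lnet_networks(spec):
    # Interfaces containing ":" are alias labels on a base interface.
    kind = "alias label" if ":" in iface else "base interface"
    print(f"{net}: {iface} ({kind})")
```

Two of the four networks (o2ib1 and o2ib3) are bound to alias labels rather than base interfaces; those are the entries that matter for this issue.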
      

      Lustre 2.12 doesn't handle logical interfaces properly: module loading fails with "Network is down" when LNet queries ib0:0.

      # modprobe lustre
      modprobe: ERROR: could not insert 'lustre': Network is down
      
      Jan 28 13:00:56 ai200-7f94-vm00 kernel: LNet: HW NUMA nodes: 1, HW CPU cores: 16, npartitions: 4
      Jan 28 13:00:56 ai200-7f94-vm00 kernel: alg: No test for adler32 (adler32-zlib)
      Jan 28 13:00:57 ai200-7f94-vm00 kernel: Lustre: Lustre: Build Version: 2.12.0
      Jan 28 13:00:57 ai200-7f94-vm00 kernel: LNet: Using FastReg for registration
      Jan 28 13:00:57 ai200-7f94-vm00 kernel: LNet: Added LNI 172.16.251.20@o2ib [8/256/0/180]
      Jan 28 13:00:57 ai200-7f94-vm00 kernel: LNetError: 6305:0:(lib-socket.c:105:lnet_ipif_query()) Can't get flags for interface ib0:0
      Jan 28 13:00:57 ai200-7f94-vm00 kernel: LNetError: 6305:0:(o2iblnd.c:2879:kiblnd_create_dev()) Can't query IPoIB interface ib0:0: -19
      Jan 28 13:00:57 ai200-7f94-vm00 kernel: LNetError: 105-4: Error -100 starting up LNI o2ib
      Jan 28 13:00:58 ai200-7f94-vm00 kernel: LNet: Removed LNI 172.16.251.20@o2ib
      Jan 28 13:00:58 ai200-7f94-vm00 kernel: LustreError: 6305:0:(events.c:625:ptlrpc_init_portals()) network initialisation failed
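
The -19 in the log above is -ENODEV: the name `ib0:0` is being looked up as if it were a network device, but an alias such as `ib0:0` is a label on an address of `ib0`, not a device of its own. The sketch below models that difference in plain Python. It is not the code from the patches attached to this issue; the `DEVICES` table and both lookup functions are hypothetical, with addresses taken from the 2.10.5 log.

```python
import errno

# Hypothetical model of the host: real net devices, each carrying
# labelled addresses. "ib0:0" is a label on ib0, not a separate device.
DEVICES = {
    "ib0": {"ib0": "172.16.251.20", "ib0:0": "172.16.252.20"},
    "ib1": {"ib1": "172.16.253.20", "ib1:0": "172.16.254.20"},
}

def lookup_by_device_name(name):
    """Naive lookup: treat the configured name as a device name."""
    if name not in DEVICES:
        return -errno.ENODEV  # -19, as reported for ib0:0
    return DEVICES[name][name]

def lookup_by_label(name):
    """Alias-aware lookup: find the base device, then match the label."""
    base = name.split(":", 1)[0]
    labels = DEVICES.get(base)
    if labels is None or name not in labels:
        return -errno.ENODEV
    return labels[name]

print(lookup_by_device_name("ib0:0"))  # -19
print(lookup_by_label("ib0:0"))        # 172.16.252.20
```

Under this model, the naive lookup reproduces the failure for alias labels, while resolving the base device first and then matching the label recovers the secondary address.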
      

          Activity

            Jian Yu (yujian@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/35159
            Subject: LU-11893 ksocklnd: add secondary IP address handling
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: 9f98099692cf4bfc18000226ac09bee3be2d6e74

            pjones Peter Jones added a comment -

            The countdown continues- one to go

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/34476/
            Subject: LU-11893 o2iblnd: add secondary IP address handling
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: c4b39bf56bbcacd49d7f888a0745cd4b5580b36b

            simmonsja James A Simmons added a comment -

            Two patches left.

            James Simmons (uja.ornl@yahoo.com) uploaded a new patch: https://review.whamcloud.com/34993
            Subject: LU-11893 lnet: consoldate secondary IP address handling
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 0e53ce7c9c31d37fc6514608843a6049e9167ddd

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/34392/
            Subject: LU-11893 ksocklnd: add secondary IP address handling
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 9a2013af0668737dc56424c5c6eaac01621f6c17

            simmonsja James A Simmons added a comment -

            Please review https://review.whamcloud.com/#/c/34392. It's blocking RHEL8 support.

            simmonsja James A Simmons added a comment -

            Amir asked me to break up the patch, so two patches now address this issue. Later patches will unify the handling, which was also requested.

            James Simmons (uja.ornl@yahoo.com) uploaded a new patch: https://review.whamcloud.com/34476
            Subject: LU-11893 o2iblnd: add secondary IP address handling
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 8fce282ce0de93e54d373d64e275d161025063f4

            simmonsja James A Simmons added a comment -

            I updated patch https://review.whamcloud.com/#/c/34392/ so everything should work now, including ko2iblnd. Please try it out. I have been testing on my end.

            sihara Shuichi Ihara added a comment -

            Hi James,
            Thanks! I also confirmed the patch works for socklnd, but o2iblnd still didn't work. I think o2iblnd needs an approach similar to https://review.whamcloud.com/#/c/33968.

            People

              simmonsja James A Simmons
              sihara Shuichi Ihara