[LU-17889] Autotest nodes should not use 'fe80:' link-local IPv6 addresses

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • None
    • Lustre 2.16.0
    • None
    • Maloo test bed nodes
    • 3

    Description

      Pushing patches to Maloo that run the sanity-lnet tests with IPv6, which work for me locally, revealed that the only IPv6 addresses set up on the test nodes are link-local (fe80:.....). These addresses are valid only on the local link and are not routable, so the more complex LNet tests can't run. LNet also ignores these addresses. Proper IPv6 addresses need to be set up.
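To make the address scopes concrete, here is a minimal Python sketch using the standard `ipaddress` module. The helper name `describe` is made up for illustration, and `fd33:3981:3213:f010::1` is a hypothetical host address inside one of the prefixes quoted later in this ticket. Note that link-local (`fe80::/10`) and loopback (`::1`) are separate scopes.

```python
import ipaddress


def describe(addr: str) -> str:
    """Return a coarse scope label for an IPv6 address string."""
    ip = ipaddress.IPv6Address(addr)
    if ip.is_loopback:
        return "loopback"
    if ip.is_link_local:
        return "link-local"
    return "other"


# ::1 and the fe80:: address come from the "ip a" output in this ticket;
# the fd33:... host part is an assumption for illustration only.
for a in ("::1", "fe80::5054:ff:fec3:dafa", "fd33:3981:3213:f010::1"):
    print(a, "->", describe(a))
```

A check like this could be used to confirm that a node has at least one address in the "other" category before an IPv6 LNet test is attempted.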

          Activity

            adilger Andreas Dilger added a comment -

            mkvardakov, looking at the following test runs from James' patch:
            https://testing.whamcloud.com/test_sets/3c1e5454-23e2-4162-958c-af443ed4fde3
            https://testing.whamcloud.com/test_sets/9f145a47-09f3-4494-a094-d27ea661de38

            I see the following near the start of the sanity-lnet.suite_log.onyx-116vm1.log file, from the output of "do_lnetctl net show" and "ip a":

            /usr/sbin/lnetctl lnet configure -all --large
            /usr/sbin/lnetctl net show
            net:
            -     net type: lo
                  local NI(s):
                  -     nid: 0@lo
                        status: up
            -     net type: tcp
                  local NI(s):
                  -     nid: 10.240.29.128@tcp
                        status: up
                        interfaces:
                              0: eth0
            1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
                link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
                inet 127.0.0.1/8 scope host lo
                   valid_lft forever preferred_lft forever
                inet6 ::1/128 scope host
                   valid_lft forever preferred_lft forever
            2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
                link/ether 52:54:00:c3:da:fa brd ff:ff:ff:ff:ff:ff
                altname enp0s3f0
                altname ens3f0
                inet 10.240.29.128/20 brd 10.240.31.255 scope global dynamic eth0
                   valid_lft 21113sec preferred_lft 21113sec
                inet6 fe80::5054:ff:fec3:dafa/64 scope link
                   valid_lft forever preferred_lft forever
            3: test1pl@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
                link/ether 8e:7f:44:90:4e:ec brd ff:ff:ff:ff:ff:ff link-netns test_ns
                inet6 fe80::30aa:ddff:feb8:56b9/64 scope link tentative
                   valid_lft forever preferred_lft forever

            So it looks like the inet6 address assignments are still using the link-local fe80:: range instead of the fd33:3981:3213: ranges that you mentioned for onyx and trevis. Is there something that needs to be specified for an IPv6 test session to assign these non-local addresses? Is there a reason that shouldn't be done automatically for all test sessions?

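The observation in the comment above can be checked mechanically. The following is a rough Python sketch, not part of autotest or the Lustre test framework: it scans `ip a` output and reports interfaces whose only inet6 addresses are link-local. The function name `link_local_only` and the trimmed `SAMPLE` text (a shortened copy of the log above) are illustrative assumptions.

```python
import ipaddress
import re

# Trimmed copy of the "ip a" output quoted in this ticket.
SAMPLE = """\
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    inet 10.240.29.128/20 brd 10.240.31.255 scope global dynamic eth0
    inet6 fe80::5054:ff:fec3:dafa/64 scope link
3: test1pl@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    inet6 fe80::30aa:ddff:feb8:56b9/64 scope link tentative
"""


def link_local_only(ip_a_output: str) -> list[str]:
    """Return interfaces that have inet6 addresses, all of them link-local."""
    flags: dict[str, list[bool]] = {}
    iface = None
    for line in ip_a_output.splitlines():
        # Interface header lines look like "2: eth0: <...>" or "3: test1pl@if2: <...>".
        header = re.match(r"\s*\d+:\s+([^:@\s]+)", line)
        if header:
            iface = header.group(1)
            flags.setdefault(iface, [])
            continue
        inet6 = re.match(r"\s*inet6\s+([0-9a-fA-F:]+)/\d+", line)
        if inet6 and iface is not None:
            ip = ipaddress.IPv6Address(inet6.group(1))
            flags[iface].append(ip.is_link_local)
    # Keep only interfaces with at least one inet6 address, all link-local.
    return [name for name, seen in flags.items() if seen and all(seen)]


print(link_local_only(SAMPLE))
```

On the sample, `lo` is excluded because `::1` is a loopback address, not a link-local one.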
            mdiep Minh Diep added a comment -

            ssmirnov, could you please advise on what we should do next?

            simmonsja James A Simmons added a comment -

            test1pg is used to create a fake IP address for testing. It doesn't correspond to anything real, so the 2001:db8:0:f101::1@tcp NID is correct. The above is a very special test case.

            mkvardakov Michael Kvardakov added a comment -

            simmonsja, I cannot reproduce it; I am getting a different lnetctl ping error. If I understand correctly, --source should take a local IPv6 address, which starts with fd33:3981:3213:f010 on onyx and fd33:3981:3213:f020 on trevis. Could we review how this parameter is generated?
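As a sketch of the check implied by the comment above, the following Python maps an address to a test cluster by prefix. The two prefixes are the ones quoted in the comment; the `/64` prefix length, the `SITE_PREFIXES` table, and the `site_for` helper are assumptions made for illustration, not autotest configuration.

```python
import ipaddress

# Prefixes quoted in the comment. The /64 length is an assumption.
SITE_PREFIXES = {
    "onyx": ipaddress.IPv6Network("fd33:3981:3213:f010::/64"),
    "trevis": ipaddress.IPv6Network("fd33:3981:3213:f020::/64"),
}


def site_for(addr: str):
    """Return the cluster whose prefix contains addr, or None if there is none."""
    ip = ipaddress.IPv6Address(addr)
    for site, net in SITE_PREFIXES.items():
        if ip in net:
            return site
    return None


# The first host address is hypothetical; the second is the fake test NID address.
print(site_for("fd33:3981:3213:f010::10"))
print(site_for("2001:db8:0:f101::1"))
```

The address `2001:db8:0:f101::1` used by the test falls in neither prefix, which is consistent with the explanation that it is a deliberately fake address.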

            mkvardakov Michael Kvardakov added a comment -

            /usr/sbin/lnetctl net add --net tcp --if test1pg
            default via 10.240.16.1 dev eth0
            NI 0@lo expect status "up" found "up"
            NI 10.240.25.232@tcp expect status "up" found "up"
            NI 2001:db8:0:f101::1@tcp expect status "up" found "up"
            /usr/sbin/lnetctl ping --source 2001:db8:0:f101::1@tcp 10.240.25.232@tcp
            manage:
            - ping:
              errno: -5
              descr: ! 'failed to ping 10.240.25.232@tcp: Input/output error'
             sanity-lnet test_214: @@@@@@ FAIL: /usr/sbin/lnetctl ping --source 2001:db8:0:f101::1@tcp 10.240.25.232@tcp failed
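When reading logs like the one above, it can help to split each NID into its address and network parts and note the address family. The sketch below does only that; the `nid_family` helper is hypothetical, and nothing here is meant to imply that the family difference between the source and target NIDs is the cause of the `-5` error.

```python
import ipaddress


def nid_family(nid: str):
    """Split 'address@net' and return (net, IP version or None)."""
    addr, _, net = nid.rpartition("@")
    try:
        version = ipaddress.ip_address(addr).version
    except ValueError:
        # Not an IP literal, for example the "0" in 0@lo.
        version = None
    return net, version


# NIDs taken from the log in this ticket.
for nid in ("0@lo", "10.240.25.232@tcp", "2001:db8:0:f101::1@tcp"):
    print(nid, nid_family(nid))
```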

            simmonsja James A Simmons added a comment -

            We shall see what fails.

            gerrit Gerrit Updater added a comment -

            "James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/55435
            Subject: LU-17889 lnet: test IPv6
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: bbacbd65ed4cf72d676aedd6f74e0fde65ae361e

            mkvardakov Michael Kvardakov added a comment -

            simmonsja, which distros are affected?

            colmstea Charlie Olmstead added a comment -

            mkvardakov, please take a look. The IPv6 addresses are only available to the local node.

            pjones Peter Jones added a comment -

            Thanks Lee. I assigned this to you last week because Charlie was on PTO. I'll reassign to him to provide further updates as things progress. It looks like LU-16822 should have fixes merging shortly.


            leonel8a Lee Ochoa (Inactive) added a comment -

            pjones, I haven't been involved with this effort, so I'm only mildly in tune with the details. Charlie and Minh can provide more insight, but as far as I know the AT-side development is blocked by LU-16822.

            People

              mkvardakov Michael Kvardakov
              simmonsja James A Simmons
              Votes: 0
              Watchers: 7
