
DLC: failed adding an existing network interface when there is traffic ongoing

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.7.0
    • Lustre 2.7.0
    • None
    • 3
    • 16427

    Description

      1. Set up the system and run sanity.
      2. Add an existing network interface on the client side; the command fails with the error messages below.

      [root@onyx-28 ~]# lnetctl net show -v
      net:
          - nid: 0@lo
            status: up
            tunables:
                peer_timeout: 0
                peer_credits: 0
                peer_buffer_credits: 0
                credits: 0
                CPTs:  0
          - nid: 10.2.4.74@tcp
            status: up
            interfaces:
                0: eth0
            tunables:
                peer_timeout: 180
                peer_credits: 8
                peer_buffer_credits: 0
                credits: 256
                CPTs:  0
      
      [root@onyx-28 ~]# lnetctl net add --net tcp --if eth0
      add:
          - net:
                errno: -22
                descr: "cannot add network: Invalid argument"
      
      Lustre: DEBUG MARKER: == sanity test 24v: list directory with large files (handle hash collision, bug: 17560) == 12:10:03 (1415218203)
      LNetError: 31891:0:(api-ni.c:1488:lnet_startup_lndnis()) Net tcp is not unique
      LNetError: 31897:0:(api-ni.c:1488:lnet_startup_lndnis()) Net tcp is not unique
      LNetError: 31899:0:(api-ni.c:1488:lnet_startup_lndnis()) Net tcp is not unique
      
       - created 10000 (time 1415218213.63 total 10.13 last 10.13)
      Lustre: DEBUG MARKER: cancel_lru_locks mdc start
       - created 20000 (time 1415218224.04 total 20.54 last 10.41)
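
For reference, the `errno: -22` in the lnetctl output above is -EINVAL, which is why the description reads "Invalid argument" rather than anything pointing at a duplicate network. A minimal Python sketch for decoding such values follows; the helper name `describe_errno` is hypothetical and not part of any Lustre tool, and the message text comes from the local C library.

```python
import errno
import os


def describe_errno(value):
    """Return (symbolic name, message) for a negative, kernel-style errno."""
    code = abs(value)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# The value reported by lnetctl in this ticket.
print(describe_errno(-22))
# The value proposed in the comments for a duplicate network.
print(describe_errno(-errno.EEXIST))
```

Comparing the two results shows why -EEXIST would give the user a clearer hint about what went wrong.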
      

          Activity

            [LU-5875] DLC: failed adding an existing network interface when there is traffic ongoing
            pjones Peter Jones added a comment -

            Landed for 2.7


            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/13056/
            Subject: LU-5875 lnet: return -EEXIST if NI is not unique
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 7d63c00e24d77d931642ce6cf5e8ff4cc2cad255

            gerrit Gerrit Updater added a comment -

            Amir Shehata (amir.shehata@intel.com) uploaded a new patch: http://review.whamcloud.com/13056
            Subject: LU-5875 lnet: return -EEXIST if NI is not unique
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: e2606de22e31866de70df0f1b1c8178aef3ff49f

            sarah Sarah Liu added a comment -

            Andreas,

            Adding a different network interface requires at least two test nodes configured with 3 interfaces. Currently all Onyx nodes have only 2 (tcp0 and ib0) connected, so I have opened TEI-2972 to request test nodes configured with 3 interfaces.


            adilger Andreas Dilger added a comment -

            Amir, it would be good to fix the error to return -EEXIST in this case, not -EINVAL, so that it prints out a more useful error message for the user.

            Sarah, can you please re-run this test with a different network interface to verify that this is working correctly. This functionality is the whole reason for DLC so it should work.

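
The distinction requested above can be illustrated with a toy model. This is only a sketch in Python, not the LNet implementation: the class `NetTable` and its `add` method are hypothetical names, and the real check lives in the kernel code referenced in the log (`lnet_startup_lndnis()`). It shows a malformed request mapped to -EINVAL and a duplicate network mapped to -EEXIST.

```python
import errno


class NetTable:
    """Toy stand-in for the set of configured LNet networks (not LNet code)."""

    def __init__(self):
        self._nets = {}  # network name -> interface name

    def add(self, net, interface):
        """Return 0 on success or a negative errno, kernel style."""
        if not net or not interface:
            return -errno.EINVAL  # malformed request
        if net in self._nets:
            return -errno.EEXIST  # network is already configured
        self._nets[net] = interface
        return 0


table = NetTable()
print(table.add("tcp", "eth0"))  # first add succeeds
print(table.add("tcp", "eth0"))  # second add reports a duplicate
```

With separate codes, the caller can tell "you typed something invalid" apart from "this network is already up".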
            sarah Sarah Liu added a comment -

            Then could you please update the test plan?
            "Test Case Name dynLNet.system.net_existing"

            ashehata Amir Shehata (Inactive) added a comment - - edited

            You cannot re-add an existing network. As reported in the errors, the failure to add is due to the network not being unique.

            This is not a bug.

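
Since re-adding a configured network is expected to fail, a test script could check for the network first. The following is a rough Python sketch that assumes the `lnetctl net show` output layout shown in the description; the function name `configured_nets` and the `SAMPLE` text are illustrative only.

```python
import re


def configured_nets(net_show_output):
    """Collect network names (the part after '@') from 'nid:' lines."""
    nets = set()
    for line in net_show_output.splitlines():
        match = re.match(r"\s*-\s*nid:\s*\S+@(\S+)", line)
        if match:
            nets.add(match.group(1))
    return nets


# Abbreviated copy of the output from the description.
SAMPLE = """\
net:
    - nid: 0@lo
      status: up
    - nid: 10.2.4.74@tcp
      status: up
"""

nets = configured_nets(SAMPLE)
if "tcp" in nets:
    print("tcp is already configured; adding it again is expected to fail")
```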

            jlevi Jodi Levi (Inactive) added a comment -

            Amir,
            Can you please have a look at this one?
            Thank you!

            People

              ashehata Amir Shehata (Inactive)
              sarah Sarah Liu
              Votes: 0
              Watchers: 8
