Lustre / LU-13548

LNet: b2_12 discovery of non-MR peers may yield unreachable peer NIs

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: Lustre 2.12.4

    Description

      If a non-MR (non-Multi-Rail) peer running 2.10.8 is discovered by a 2.12 MR peer, the following problem can occur: if the non-MR peer has LNets that are not defined on the MR peer, a NID on one of those undefined LNets may be listed as the primary NID. This later causes communication problems when mounting.

      Here's an example of the buggy discovery:

       

      lnetctl discover 192.168.1.123@o2ib4
      discover:
          - primary nid: 192.168.1.123@o2ib
            Multi-Rail: False
            peer ni:
              - nid: 192.168.1.123@o2ib4
              - nid: 192.168.1.123@o2ib

      lnetctl peer show
      peer:
          - primary nid: 192.168.1.123@o2ib
            Multi-Rail: False
            peer ni:
              - nid: 192.168.1.123@o2ib4
                state: NA
              - nid: 192.168.1.123@o2ib
                state: NA

       

      In the example above, the peer running discovery has only one NID, on o2ib4, so designating 192.168.1.123@o2ib as the peer's primary NID is a problem: the local node has no interface on o2ib and cannot reach that NID.
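      The mismatch described above can be checked mechanically. The following is a minimal sketch, not LNet code: the helper names `nid_net` and `primary_is_reachable` are hypothetical, and it assumes no LNet routes are configured, so a NID counts as reachable only if its net is also configured locally. The NIDs are the ones from this report.

```python
def nid_net(nid: str) -> str:
    """Return the LNet name of a NID such as '192.168.1.123@o2ib4'."""
    addr, sep, net = nid.rpartition("@")
    if not sep or not addr or not net:
        raise ValueError(f"not a NID: {nid!r}")
    return net


def primary_is_reachable(primary_nid: str, local_nids: list[str]) -> bool:
    """True if the primary NID is on a net that the local node also has.

    Assumption: no routing, so only directly attached nets count.
    The loopback net 'lo' is ignored because it never reaches a remote peer.
    """
    local_nets = {nid_net(n) for n in local_nids} - {"lo"}
    return nid_net(primary_nid) in local_nets


# Local NIDs of the MR peer running discovery (from its `lnetctl net show`).
local_nids = ["0@lo", "192.168.1.105@o2ib4"]

# Primary NID chosen by the buggy discovery vs. the NID that was discovered.
print(primary_is_reachable("192.168.1.123@o2ib", local_nids))   # False
print(primary_is_reachable("192.168.1.123@o2ib4", local_nids))  # True
```

      With these inputs the primary NID reported by discovery fails the check, while the NID that was passed to `lnetctl discover` passes it.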

       

      Here's the lnet config on the MR peer (the peer running discovery):

      lnetctl net show
      net:
          - net type: lo
            local NI(s):
              - nid: 0@lo
                status: up
          - net type: o2ib4
            local NI(s):
              - nid: 192.168.1.105@o2ib4
                status: up
                interfaces:
                    0: ib0

      Here's the lnet config on the non-MR peer (the peer being discovered):

      lnetctl net show
      net:
          - net type: lo
            local NI(s):
              - nid: 0@lo
                status: up
          - net type: o2ib
            local NI(s):
              - nid: 192.168.1.123@o2ib
                status: up
                interfaces:
                    0: ib0
          - net type: o2ib4
            local NI(s):
              - nid: 192.168.1.123@o2ib4
                status: up
                interfaces:
                    0: ib0
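      To compare the two configurations, the `nid:` lines of both `lnetctl net show` outputs can be extracted and intersected by net. This is a rough sketch for triage only, not a proposed fix: it uses a plain regular expression rather than a YAML parser, the function names are hypothetical, and it assumes no LNet routing between the nets. The outputs are abbreviated copies of the ones in this report.

```python
import re

# Matches lines of the form "  - nid: <value>" in `lnetctl net show` output.
NID_LINE = re.compile(r"^\s*-\s*nid:\s*(\S+)\s*$", re.MULTILINE)


def nids_from_show(text: str) -> list[str]:
    """Extract every 'nid:' value, in order of appearance."""
    return NID_LINE.findall(text)


def net_of(nid: str) -> str:
    """Return the part after the last '@', i.e. the LNet name."""
    return nid.rpartition("@")[2]


def reachable_peer_nids(local_text: str, peer_text: str) -> list[str]:
    """Peer NIDs on a net that the local node also has (loopback excluded)."""
    local_nets = {net_of(n) for n in nids_from_show(local_text)} - {"lo"}
    return [n for n in nids_from_show(peer_text) if net_of(n) in local_nets]


MR_PEER = """\
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
    - net type: o2ib4
      local NI(s):
        - nid: 192.168.1.105@o2ib4
          status: up
"""

NON_MR_PEER = """\
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
    - net type: o2ib
      local NI(s):
        - nid: 192.168.1.123@o2ib
          status: up
    - net type: o2ib4
      local NI(s):
        - nid: 192.168.1.123@o2ib4
          status: up
"""

print(reachable_peer_nids(MR_PEER, NON_MR_PEER))  # ['192.168.1.123@o2ib4']
```

      Only the o2ib4 NID of the non-MR peer is on a net the MR peer shares, which is why a primary NID on o2ib cannot be used from the discovering node.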

      People

        Assignee: WC Triage (wc-triage)
        Reporter: Serguei Smirnov (ssmirnov)