  1. Lustre
  2. LU-11840

Multi-Rail dynamic discovery prevents mounting a filesystem when some NIC is unreachable


Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.11.0, Lustre 2.12.0
    • None
    • 3

    Description

      In recent Lustre releases, some filesystems cannot be mounted because of a communication error between clients and servers, depending on the LNET configuration.

      Consider a filesystem running on a host with two interfaces, say tcp0 and tcp1, where the devices are set up to reply on both interfaces (formatted with --servicenode IP1@tcp0,IP2@tcp1).

      If a client connected only to tcp0 tries to mount this filesystem, the mount fails with an I/O error because the client tries to connect through the tcp1 interface.
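
      The reachability constraint described above can be sketched as follows. This is an illustrative model only, not Lustre code: the helper names (normalize_net, nid_network, reachable_nids) are hypothetical, and the addresses are the placeholders used in this report. The only LNet fact relied on is that a network name without a number, such as tcp, refers to network 0 (tcp0).

```python
def normalize_net(net):
    """Treat a network name without a trailing number as network 0 (tcp == tcp0)."""
    return net if net[-1:].isdigit() else net + "0"


def nid_network(nid):
    """Return the normalized LNet network of a NID such as 'x.y.z.a@tcp'."""
    return normalize_net(nid.rpartition("@")[2])


def reachable_nids(server_nids, client_networks):
    """Keep only the server NIDs that sit on a network the client is attached to."""
    local = {normalize_net(n) for n in client_networks}
    return [nid for nid in server_nids if nid_network(nid) in local]


# Server formatted with both NIDs, client attached to tcp0 only.
server_nids = ["x.y.z.a@tcp0", "a.b.c.d@tcp1"]
print(reachable_nids(server_nids, ["tcp"]))  # ['x.y.z.a@tcp0']
```

      With this setup, only the tcp0 NID is usable from the client, so any attempt to reach the server through its tcp1 NID cannot succeed.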

      Mount failed:

       

      # mount -t lustre x.y.z.a@tcp:/lustre /mnt/lustre
      mount.lustre: mount x.y.z.a@tcp:/lustre at /mnt/client failed: Input/output error
      Is the MGS running?
      

      dmesg shows that communication fails because the client uses the wrong IP:

      [422880.743179] LNetError: 19787:0:(lib-move.c:1714:lnet_select_pathway()) no route to a.b.c.d@tcp1

      # lnetctl peer show
      peer:
          - primary nid: a.b.c.d@tcp1
            Multi-Rail: False
            peer ni:
              - nid: x.y.z.a@tcp
                state: NA
              - nid: 0@<0:0>
                state:

      Ping is OK though:

      # lctl ping x.y.z.a@tcp
      12345-0@lo
      12345-a.b.c.d@tcp1
      12345-x.y.z.a@tcp
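
      To compare the NIDs a server announces with the networks a client can actually use, the ping output above can be parsed with a few lines of Python. This is a minimal sketch: parse_ping_output is a hypothetical helper, and it assumes every NID line has the form <pid>-<nid> as shown in this report.

```python
def parse_ping_output(text):
    """Extract NIDs, in announced order, from 'lctl ping' output lines like '12345-0@lo'."""
    nids = []
    for line in text.splitlines():
        pid, sep, nid = line.strip().partition("-")
        # Keep only lines shaped like '<numeric pid>-<address>@<network>'.
        if sep and pid.isdigit() and "@" in nid:
            nids.append(nid)
    return nids


sample = """\
12345-0@lo
12345-a.b.c.d@tcp1
12345-x.y.z.a@tcp
"""
print(parse_ping_output(sample))  # ['0@lo', 'a.b.c.d@tcp1', 'x.y.z.a@tcp']
```

      In the sample, the tcp1 NID is listed before the tcp NID, even though the ping was sent to x.y.z.a@tcp.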

       

      This was tested with 2.10.5 and 2.12 as server versions and 2.10, 2.11 and 2.12 as client versions.

      Only the 2.10 client is able to mount the filesystem properly with this configuration.

       

      I git-bisected the regression down to commit 0f1aaad ("LU-9480 lnet: implement Peer Discovery").

      According to the debug log, the client:

      • sets up the peer with the proper NI
      • then pings the peer
      • updates the local peer info with the wrong NI coming from the ping reply

      The data in the reply seems to announce the tcp1 IP as the primary NID.

      The client then uses this NI to contact the server, even though it has no direct connection to that network (tcp1) and has a valid NI for the same peer (tcp0).
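
      The sequence described above can be illustrated with a deliberately simplified model. It is not the LNet implementation: the Peer class and the functions apply_ping_reply, pick_by_primary and pick_any_local are hypothetical names. The model assumes that the first non-loopback NID in the ping reply becomes the primary NID, which matches the lnetctl output in this report but is not confirmed against the source code.

```python
from dataclasses import dataclass, field


def net_of(nid):
    """Return the LNet network of a NID, treating 'tcp' as 'tcp0'."""
    net = nid.rpartition("@")[2]
    return net if net[-1:].isdigit() else net + "0"


@dataclass
class Peer:
    primary_nid: str
    nids: list = field(default_factory=list)


def apply_ping_reply(peer, reply_nids):
    """Assumed behaviour: overwrite the peer info with the NIDs from the reply."""
    remote = [n for n in reply_nids if net_of(n) != "lo0"]  # skip loopback
    peer.primary_nid = remote[0]
    peer.nids = remote


def pick_by_primary(peer, local_nets):
    """Reported behaviour: only the primary NID is tried; None means 'no route'."""
    return peer.primary_nid if net_of(peer.primary_nid) in local_nets else None


def pick_any_local(peer, local_nets):
    """Behaviour the report expects: use any peer NID on a locally attached network."""
    return next((n for n in peer.nids if net_of(n) in local_nets), None)


local_nets = {"tcp0"}                             # client is attached to tcp0 only
peer = Peer("x.y.z.a@tcp", ["x.y.z.a@tcp"])       # step 1: peer set up with the proper NI
apply_ping_reply(peer, ["0@lo", "a.b.c.d@tcp1", "x.y.z.a@tcp"])  # steps 2 and 3
print(peer.primary_nid)                   # a.b.c.d@tcp1
print(pick_by_primary(peer, local_nets))  # None
print(pick_any_local(peer, local_nets))   # x.y.z.a@tcp
```

      In this model the peer still has a usable tcp0 NID after the update, but a selection that only considers the primary NID finds no route, which corresponds to the error seen in dmesg.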


            People

              ashehata Amir Shehata (Inactive)
              degremoa Aurelien Degremont (Inactive)