LNet: do discovery in the background

Details

    • Improvement
    • Resolution: Fixed
    • Minor
    • Lustre 2.16.0, Lustre 2.15.4

    Description

      When the file system is mounted, the llog is traversed and a local representation of each peer is created at the ptlrpc layer. As part of this process the ptlrpc_connection_get() -> LNetPrimaryNID() path is executed, and LNet runs the discovery protocol to update its local representation of the peer. This involves communicating with the NID provided by the ptlrpc_connection_get() call. Prior to the introduction of LNetPrimaryNID(), no communication with the remote peer was performed at this point. As a result, when the llog contains references to old NIDs, or NIDs for bad interfaces, the connection to each such NID can take up to the LND timeout (in the 50s range) to expire. This can extend the mount time considerably.
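
      To give a feel for the scale of the problem, the following is a back-of-the-envelope sketch rather than anything taken from the Lustre code. It assumes that each unreachable NID blocks the mount path for one full LND timeout and that the attempts happen one after another. The function name and the count of stale NIDs are hypothetical.

```python
# Back-of-the-envelope model, not Lustre code. All numbers are assumptions.
LND_TIMEOUT_S = 50  # the description puts the LND timeout "in the 50s range"


def worst_case_extra_mount_time(unreachable_nids: int,
                                lnd_timeout_s: int = LND_TIMEOUT_S) -> int:
    """Upper bound, in seconds, on the delay added to mount when discovery
    of each unreachable NID blocks the caller until the LND timeout expires
    and the attempts happen one after another."""
    if unreachable_nids < 0:
        raise ValueError("unreachable_nids must be non-negative")
    return unreachable_nids * lnd_timeout_s


# Hypothetical llog that references 6 stale server NIDs.
print(worst_case_extra_mount_time(6))  # 300
```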

      To avoid this issue we can change the concept of the Primary NID. The Primary NID is currently a global concept derived from the first interface configured on the node. However, there doesn't seem to be a need to make this a global concept. Each node can have a different view of the Primary NID of the peers it communicates with, as long as it keeps the Primary NID consistent throughout the life of the peer.

      Since Lustre is the one which requests the initial connection to the peer, it already provides LNet with the NID which it prefers to use (likely the one configured). LNet can lock that NID as the primary NID of the node, even if it is not the first interface configured on the node.

      This actually clarifies some confusion encountered on some sites, where the first interface configured on the system is not on the same network as the peer's interface.

      For example, a TCP client can mount a server over the TCP network while the server has its o2ib interface configured first. On the TCP client the peer then shows the o2ib NID as the primary NID. This can be confusing when viewing the configuration.

      If the primary NID of the peer is locked to the tcp NID, the peer configuration viewed from the tcp client makes more sense.

      This way the primary NID concept becomes a node-local concept. It is the NID by which a Lustre node references a peer. Different Lustre nodes can reference the same peer by different NIDs.
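
      The tcp/o2ib example can be illustrated with a small model. This is only a sketch of the idea, not LNet code: the function name, its parameters and the NID values are all hypothetical.

```python
from typing import Sequence


def primary_nid_seen_by_node(peer_nids_in_config_order: Sequence[str],
                             nid_used_by_lustre: str,
                             lock_primary: bool) -> str:
    """Return the NID a node would show as the peer's primary NID.

    Without locking, the primary NID is the first interface configured on
    the peer (the global concept). With locking, it is whichever NID Lustre
    used when it first referenced the peer (the node-local concept).
    """
    if lock_primary:
        return nid_used_by_lustre
    return peer_nids_in_config_order[0]


# Hypothetical server with o2ib configured first and tcp second.
server_nids = ["10.0.0.1@o2ib", "192.168.1.1@tcp"]

# A TCP client mounts using the tcp NID.
print(primary_nid_seen_by_node(server_nids, "192.168.1.1@tcp", False))
print(primary_nid_seen_by_node(server_nids, "192.168.1.1@tcp", True))
```

      In this model the first call reports the o2ib NID and the second reports the tcp NID, which is the behaviour the description argues is easier to reason about from the TCP client.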

      Practically speaking, the FS is usually configured with the first NID which is reachable. From a TCP client it would be the first tcp interface configured, and the same applies to other networks. However, the solution doesn't demand that.

      The solution will be spread across the following patches:

      1. Introduce a LOCK_PRIMARY state to the peer. This is set when LNetPrimaryNID() is called on a new peer or a peer is explicitly added by Lustre.
      2. When a peer is in LOCK_PRIMARY state, the primary NID provided by Lustre will not change. The peer can be populated with other interfaces' NIDs; however, the primary NID will not change.
      3. Get Lustre to pre-define the Primary NID and the constituent NIDs, such that a call to LNetPrimaryNID() on a constituent NID returns a consistent result and is not dependent on the completion of the discovery protocol.
      4. If a peer was manually discovered and Lustre then explicitly adds it using a different primary NID, the Lustre configuration path will take precedence. The peer will be deleted and recreated with the primary NID Lustre uses.
      5. When Lustre deletes the UUID, the lock on the LNet peer should be removed.
      6. TBD: Should we be removing the lock from an LNet Peer when Lustre evicts a node or when Lustre is unmounted?
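
      The steps above can be sketched as a small state model. This is an illustration under stated assumptions, not the LNet implementation: the Peer and PeerTable classes, their method names and the NID values are hypothetical, and a boolean stands in for the LOCK_PRIMARY state. The open question in item 6 is not modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence


@dataclass
class Peer:
    primary_nid: str
    nids: List[str] = field(default_factory=list)  # non-primary NIDs
    lock_primary: bool = False                      # stands in for LOCK_PRIMARY


class PeerTable:
    """One node's local view of its peers (hypothetical model)."""

    def __init__(self) -> None:
        self.peers: Dict[str, Peer] = {}  # primary NID -> Peer

    def find(self, nid: str) -> Optional[Peer]:
        for peer in self.peers.values():
            if nid == peer.primary_nid or nid in peer.nids:
                return peer
        return None

    @staticmethod
    def _merge(peer: Peer, nids: Sequence[str]) -> None:
        # Add new NIDs and make sure the primary is never listed twice.
        for nid in nids:
            if nid not in peer.nids:
                peer.nids.append(nid)
        peer.nids = [n for n in peer.nids if n != peer.primary_nid]

    def lustre_add(self, primary_nid: str, nids: Sequence[str] = ()) -> Peer:
        """Items 1, 3 and 4: Lustre defines and locks the primary NID."""
        peer = self.find(primary_nid)
        if peer is not None and peer.primary_nid != primary_nid:
            # Item 4: a previously discovered peer uses another primary NID,
            # so delete it and recreate it the way Lustre wants it.
            del self.peers[peer.primary_nid]
            peer = None
        if peer is None:
            peer = Peer(primary_nid)
            self.peers[primary_nid] = peer
        peer.lock_primary = True
        self._merge(peer, nids)
        return peer

    def discovery_update(self, nid: str, reported: Sequence[str]) -> Peer:
        """Item 2: discovery may add NIDs but never moves a locked primary.

        reported[0] is assumed to be the NID the remote node itself
        considers primary.
        """
        peer = self.find(nid)
        if peer is None:
            peer = Peer(reported[0])
            self.peers[peer.primary_nid] = peer
        elif not peer.lock_primary and peer.primary_nid != reported[0]:
            del self.peers[peer.primary_nid]
            peer.primary_nid = reported[0]
            self.peers[peer.primary_nid] = peer
        self._merge(peer, reported)
        return peer

    def lustre_del_uuid(self, primary_nid: str) -> None:
        """Item 5: deleting the UUID removes the lock."""
        peer = self.peers.get(primary_nid)
        if peer is not None:
            peer.lock_primary = False

    def primary_nid(self, nid: str) -> str:
        """Answer immediately from local state, without waiting for discovery."""
        peer = self.find(nid)
        return peer.primary_nid if peer is not None else nid


table = PeerTable()
table.lustre_add("192.168.1.1@tcp")
table.discovery_update("192.168.1.1@tcp", ["10.0.0.1@o2ib", "192.168.1.1@tcp"])
print(table.primary_nid("10.0.0.1@o2ib"))  # 192.168.1.1@tcp
```

      In the usage example, discovery reports the o2ib NID first, but because Lustre added the peer by its tcp NID the lookup on either NID keeps returning the tcp NID.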

      This solution should avoid long mount delays. However, it will not help in the case when the Primary NID used by Lustre is not reachable or LNet encounters network delays reaching that NID.

      On mount, Lustre needs to reach the MGS to retrieve the server NID information in the llog. It then goes through the following path to connect:

      obd_connect() -> lmv_connect() -> lmv_connect_mdc() -> client_connect_import() -> ptlrpc_connect_import()

      Next, it does a synchronous OBD_STATFS to MDT0000 to test its aliveness (maybe to wait for the MDT0000 connection to complete), checks some connection features on the MDT to verify it is not too old, and gets the root directory FID from MDT0000 for the mount. After that, it follows a similar process to connect to the OSTs, but it doesn't wait for them to finish.

      The purpose of this solution is to avoid delaying the mount because of servers which might not be reachable at mount time. By pushing discovery into the background, discovery can complete in its own time. Any messages to a node under discovery will be sent only after discovery is complete. Therefore, the NIDs provided by the Lustre client for the servers necessary for the mount will by definition need to be reachable for the mount to complete. Other nodes which are not needed at mount time will not block the mount.
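
      The queuing behaviour described here can be sketched as follows. This is a single-threaded illustration, not the LNet implementation: the class name, its methods and the NID values are hypothetical, and the completion of background discovery is simulated by an explicit call.

```python
from collections import deque
from typing import Deque, List


class PeerUnderDiscovery:
    """Hypothetical model of a peer whose discovery runs in the background."""

    def __init__(self, nid: str) -> None:
        self.nid = nid
        self.discovered = False
        self.pending: Deque[str] = deque()  # messages held back for now
        self.sent: List[str] = []           # messages actually sent

    def send(self, msg: str) -> None:
        """Never blocks: queue the message until discovery has completed."""
        if self.discovered:
            self.sent.append(msg)
        else:
            self.pending.append(msg)

    def discovery_complete(self) -> None:
        """Called when background discovery finishes; flush in order."""
        self.discovered = True
        while self.pending:
            self.sent.append(self.pending.popleft())


# Hypothetical peers: one needed for mount, one not.
mdt = PeerUnderDiscovery("192.168.1.1@tcp")
ost = PeerUnderDiscovery("192.168.1.9@tcp")

mdt.send("statfs")
ost.send("connect")

# The mount-critical peer is reachable, so its discovery completes and the
# queued message goes out. The other peer stays pending without blocking.
mdt.discovery_complete()
print(mdt.sent, list(ost.pending))  # ['statfs'] ['connect']
```

      The caller returns from send() immediately in both cases. Only a request that the mount actually waits on, such as the one to the first peer, makes the mount depend on that peer's discovery completing.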

          Activity

            [LU-14668] LNet: do discovery in the background
            pjones Peter Jones added a comment -

            AFAICT this is merged for 2.15.4 and 2.16 (there is just one outstanding patch that should be abandoned)

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51135/
            Subject: LU-14668 tests: verify state of peer added with '--lock_prim'
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set:
            Commit: 7ee579d25a614946ba22a5a08fdc4373c41ef8f1

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51134/
            Subject: LU-14668 lnet: add 'lock_prim_nid" lnet module parameter
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set:
            Commit: 6cfc8e55a2e77c9c91b81a8842e2cbd886025298

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51133/
            Subject: LU-14668 lnet: add 'force' option to lnetctl peer del
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set:
            Commit: 8c4df87ec21bf5d61dab4b6580fc7f7ecfa91e37

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51132/
            Subject: LU-14668 lnet: don't delete peer created by Lustre
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set:
            Commit: 26d11f254795a2869ae30a7e5d6ebf2bee59f879

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51131/
            Subject: LU-14668 lnet: Peers added via kernel API should be permanent
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set:
            Commit: f63e87f0a88a856d5cc38039afef704676ff5521

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51130/
            Subject: LU-14668 lnet: Lock primary NID logic
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set:
            Commit: b341288179d9b3ad594b461586d826d6811db5a1

            "Serguei Smirnov <ssmirnov@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51135
            Subject: LU-14668 tests: verify state of peer added with '--lock_prim'
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set: 1
            Commit: 7e46cd092120ea8807fb78598df15de042e8bae5

            "Serguei Smirnov <ssmirnov@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51134
            Subject: LU-14668 lnet: add 'lock_prim_nid" lnet module parameter
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set: 1
            Commit: c01c95917ff9dd46a382d9c3a81660820eb89080

            "Serguei Smirnov <ssmirnov@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51133
            Subject: LU-14668 lnet: add 'force' option to lnetctl peer del
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set: 1
            Commit: d6bb43d5e8598af1828de31b42c2f8f1cb17b023

            People

              ashehata Amir Shehata (Inactive)
              ashehata Amir Shehata (Inactive)