[LU-14668] LNet: do discovery in the background - Whamcloud Community JIRA

Details

Type: Improvement
Resolution: Fixed
Priority: Minor
Fix Version/s: Lustre 2.16.0, Lustre 2.15.4
Affects Version/s: None
Labels:
None

Severity:
3
Rank (Obsolete):
9223372036854775807

Description

When the file system is being mounted, the llog is traversed and a local representation of each peer is created at the ptlrpc layer. As part of this process the ptlrpc_connection_get() -> LNetPrimaryNID() path is executed, which causes LNet to run the discovery protocol to update its local representation of the peer. This involves communicating with the NID provided by the ptlrpc_connection_get() call. Prior to the introduction of LNetPrimaryNID(), no communication with the remote peer was performed at this point. As a result, when the llog contains references to old NIDs, or NIDs for bad interfaces, the connection to that NID can take up to the LND timeout (in the 50s range) to expire. This can extend the mount time considerably.

To avoid this issue we can change the concept of the Primary NID. The Primary NID is currently a global concept derived from the first interface configured on the node. However, there doesn't seem to be a need to make this a global concept. Each node can have a different view of the primary NID of the peers it communicates with, as long as it keeps the Primary NID consistent throughout the life of the peer.

Since Lustre is the one which requests the initial connection to the peer, it already provides LNet with the NID which it prefers to use (likely the one configured). LNet can lock that NID as the primary NID of the node, even if it is not the first interface configured on the node.

This actually clarifies some confusion encountered on some sites, where the first interface configured on the system is not on the same network as the peer's interface.

For example, a tcp client can mount a server over the TCP network even though the server has the o2ib interface configured first. On the TCP client, the peer shows the o2ib NID as the primary NID. This can be confusing when viewing the configuration.

If the primary NID of the peer is locked to the tcp NID, viewing the peer configuration from the tcp client makes more sense.

This way the primary NID concept becomes a node-local concept. It is the NID by which a Lustre node references a peer. Different Lustre nodes can reference the same peer by different NIDs.

Practically speaking, the FS is usually configured with the first NID which is reachable. From a TCP client that would be the first tcp interface configured, and likewise for other networks. However, the solution doesn't demand that.
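
The node-local view described above can be illustrated with a small Python model. This is only a sketch under stated assumptions, not LNet code: the NodeView class, its method names, and the example NIDs are hypothetical, and real LNet keeps this state in kernel peer structures.

```python
class NodeView:
    """Hypothetical model of one node's private view of its peers."""

    def __init__(self):
        # Maps every NID known for a peer to the primary NID this node uses.
        self._primary = {}

    def primary_nid(self, nid):
        """Return the primary NID for nid; a new peer locks nid as its primary."""
        return self._primary.setdefault(nid, nid)

    def add_peer_nids(self, primary, nids):
        """Record NIDs learned later (e.g. by discovery) under the existing primary."""
        for n in nids:
            self._primary.setdefault(n, primary)


# Example server: the o2ib interface is configured first, tcp second.
SERVER_NIDS = ["10.0.0.1@o2ib", "192.168.1.1@tcp"]

tcp_client = NodeView()
ib_client = NodeView()

# Each client locks whichever NID Lustre handed to it first.
p_tcp = tcp_client.primary_nid("192.168.1.1@tcp")
p_ib = ib_client.primary_nid("10.0.0.1@o2ib")

# Later discovery fills in the remaining interfaces without changing the primary.
tcp_client.add_peer_nids(p_tcp, SERVER_NIDS)
ib_client.add_peer_nids(p_ib, SERVER_NIDS)
```

In this model the two clients reference the same server by different primary NIDs, and each keeps its own choice stable once made.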

The solution will be spread across the following patches:

Introduce a LOCK_PRIMARY state to the peer. This is set when LNetPrimaryNID() is called on a new peer or when a peer is explicitly added by Lustre.
When a peer is in the LOCK_PRIMARY state, the primary NID provided by Lustre will not change. The peer can be populated with other interfaces' NIDs, but the primary NID stays the same.
Get Lustre to pre-define the Primary NID and the constituent NIDs, so that a call to LNetPrimaryNID() on a constituent NID returns a consistent result and does not depend on the completion of the discovery protocol.
If a peer was manually discovered and Lustre then explicitly adds it using a different primary NID, the Lustre configuration path takes precedence: the peer is deleted and recreated with the primary NID Lustre uses.
When Lustre deletes the UUID, the lock on the LNet peer should be removed.
TBD: Should we remove the lock from an LNet peer when Lustre evicts a node or when Lustre is unmounted?
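
The locking rules listed above can be sketched as a toy state model in Python. Everything here is an assumption made for illustration: the Peer and PeerTable classes, the lustre_add / discovery_update / lustre_del method names, and the example NIDs are hypothetical and do not correspond to actual LNet functions.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    primary_nid: str
    nids: list = field(default_factory=list)
    lock_primary: bool = False  # models the LOCK_PRIMARY state


class PeerTable:
    """Hypothetical peer table; not the real LNet data structure."""

    def __init__(self):
        self.peers = []

    def find(self, nid):
        for p in self.peers:
            if nid in p.nids:
                return p
        return None

    def lustre_add(self, primary, nids=()):
        """Lustre configuration path: takes precedence over manual discovery."""
        peer = self.find(primary)
        if peer is not None and peer.primary_nid != primary:
            # Known under a different primary NID: delete and recreate.
            self.peers.remove(peer)
            peer = None
        if peer is None:
            peer = Peer(primary, [primary])
            self.peers.append(peer)
        peer.lock_primary = True
        for n in nids:
            if n not in peer.nids:
                peer.nids.append(n)
        return peer

    def discovery_update(self, nid, reported):
        """Discovery result; reported[0] is the remote's first configured NID."""
        peer = self.find(nid)
        if peer is None:
            peer = Peer(reported[0], [])
            self.peers.append(peer)
        if not peer.lock_primary:
            peer.primary_nid = reported[0]
        for n in [nid, *reported]:
            if n not in peer.nids:
                peer.nids.append(n)
        return peer

    def lustre_del(self, primary):
        """Lustre deleted the UUID: drop the lock on the peer."""
        peer = self.find(primary)
        if peer is not None:
            peer.lock_primary = False


REPORTED = ["10.0.0.1@o2ib", "192.168.1.1@tcp"]
table = PeerTable()
table.discovery_update("192.168.1.1@tcp", REPORTED)   # manual discovery first
before = table.find("192.168.1.1@tcp").primary_nid    # primary is the o2ib NID
peer = table.lustre_add("192.168.1.1@tcp")            # Lustre path wins
table.discovery_update("192.168.1.1@tcp", REPORTED)   # adds NIDs, primary kept
```

In this model a later discovery still populates the peer with its other NIDs, but the primary NID chosen by Lustre is only open to change again after lustre_del() clears the lock.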

This solution should avoid long mount delays. However, it will not help in the case when the Primary NID used by Lustre is not reachable or LNet encounters network delays reaching that NID.

On mount, Lustre needs to reach the MGS to retrieve the server NID information in the llog.

obd_connect() -> lmv_connect -> lmv_connect_mdc -> client_connect_import -> ptlrpc_connect_import() to connect

It then does a sync OBD_STATFS to MDT0000 to test its aliveness (maybe to wait for the MDT0000 connection to complete), then checks some connection features on the MDT to verify it is not too old, then gets the root directory FID from MDT0000 for the mount. After that, it follows a similar process to connect to the OSTs, but it doesn't wait for them to finish.

The purpose of this solution is to avoid delaying the mount because of servers which might not be reachable at mount time. By pushing discovery into the background, discovery can complete in its own time. Any messages to the node under discovery will be sent only after discovery is complete. Therefore, the NIDs provided by the Lustre client for servers necessary for the mount must, by definition, be reachable for the mount to complete. Other nodes which are not needed at mount time will not block the mount.
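
The deferred-send behaviour described above can be sketched with a simple event-driven model in Python. This is a minimal illustration under assumptions, not the LNet implementation: the Peer class, the primary_nid_nonblocking helper, the message strings, and the example NIDs are all hypothetical, and the real code runs discovery on a kernel thread rather than an explicit queue.

```python
from collections import deque


class Peer:
    """Hypothetical peer that holds messages until discovery completes."""

    def __init__(self, nid):
        self.nid = nid
        self.discovered = False
        self.pending = deque()  # messages waiting for discovery
        self.sent = []          # messages actually put on the wire

    def send(self, msg):
        if self.discovered:
            self.sent.append(msg)
        else:
            self.pending.append(msg)

    def discovery_complete(self):
        self.discovered = True
        while self.pending:
            self.sent.append(self.pending.popleft())


def primary_nid_nonblocking(peers, discovery_queue, nid):
    """Queue discovery for a new peer and return immediately (assumed behaviour)."""
    if nid not in peers:
        peers[nid] = Peer(nid)
        discovery_queue.append(peers[nid])
    return nid


peers = {}
discovery_queue = deque()

# Traversing the llog no longer blocks, even for a stale NID.
for nid in ["192.168.1.1@tcp", "192.168.1.99@tcp"]:
    primary_nid_nonblocking(peers, discovery_queue, nid)

# A message to a server needed for mount waits for that server's discovery only.
peers["192.168.1.1@tcp"].send("connect")
queued_before = list(peers["192.168.1.1@tcp"].pending)

# Background discovery finishes for the reachable server; the stale one never does.
discovery_queue.popleft().discovery_complete()
```

Here the stale NID stays in the discovery queue without holding up anything else, while the message to the reachable server is released as soon as its discovery completes.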

Issue Links

is related to

LU-17544 with lock_prim_nid=1 it seems to be possible that an unreachable nid gets primary nid

Open

LU-15169 Regression in "024f9303bc LU-14668 lnet: Lock primary NID logic" breaks client mounts

Resolved

LU-15541 Soft lockups in LNetPrimaryNID() and lnet_discover_peer_locked()

Resolved

LU-17664 Regression in 2.15.4 backport of LU-14668 lnet: add 'lock_prim_nid" lnet module parameter

Resolved

LU-18572 Regression in 2.15.4 backport of b341288179 LU-14668 lnet: Lock primary NID logic

Resolved

LU-14566 Skip discovery in LNetPrimaryNID when lnet_peer_discovery_disabled is set

Resolved

is related to

LU-10360 use Imperative Recovery logs for client->MDT/OST connections

Open

Activity

Peter Jones added a comment - 13/Nov/23 1:42 PM

AFAICT this is merged for 2.15.4 and 2.16 (there is just one outstanding patch that should be abandoned)

Gerrit Updater added a comment - 02/Aug/23 6:20 AM

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51135/
Subject: LU-14668 tests: verify state of peer added with '--lock_prim'
Project: fs/lustre-release
Branch: b2_15
Current Patch Set:
Commit: 7ee579d25a614946ba22a5a08fdc4373c41ef8f1

Gerrit Updater added a comment - 02/Aug/23 6:20 AM

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51134/
Subject: LU-14668 lnet: add 'lock_prim_nid" lnet module parameter
Project: fs/lustre-release
Branch: b2_15
Current Patch Set:
Commit: 6cfc8e55a2e77c9c91b81a8842e2cbd886025298

Gerrit Updater added a comment - 02/Aug/23 6:20 AM

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51133/
Subject: LU-14668 lnet: add 'force' option to lnetctl peer del
Project: fs/lustre-release
Branch: b2_15
Current Patch Set:
Commit: 8c4df87ec21bf5d61dab4b6580fc7f7ecfa91e37

Gerrit Updater added a comment - 02/Aug/23 6:19 AM

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51132/
Subject: LU-14668 lnet: don't delete peer created by Lustre
Project: fs/lustre-release
Branch: b2_15
Current Patch Set:
Commit: 26d11f254795a2869ae30a7e5d6ebf2bee59f879

Gerrit Updater added a comment - 02/Aug/23 6:19 AM

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51131/
Subject: LU-14668 lnet: Peers added via kernel API should be permanent
Project: fs/lustre-release
Branch: b2_15
Current Patch Set:
Commit: f63e87f0a88a856d5cc38039afef704676ff5521

Gerrit Updater added a comment - 02/Aug/23 6:19 AM

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51130/
Subject: LU-14668 lnet: Lock primary NID logic
Project: fs/lustre-release
Branch: b2_15
Current Patch Set:
Commit: b341288179d9b3ad594b461586d826d6811db5a1

Gerrit Updater added a comment - 25/May/23 12:36 AM

"Serguei Smirnov <ssmirnov@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51135
Subject: LU-14668 tests: verify state of peer added with '--lock_prim'
Project: fs/lustre-release
Branch: b2_15
Current Patch Set: 1
Commit: 7e46cd092120ea8807fb78598df15de042e8bae5

Gerrit Updater added a comment - 25/May/23 12:36 AM

"Serguei Smirnov <ssmirnov@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51134
Subject: LU-14668 lnet: add 'lock_prim_nid" lnet module parameter
Project: fs/lustre-release
Branch: b2_15
Current Patch Set: 1
Commit: c01c95917ff9dd46a382d9c3a81660820eb89080

Gerrit Updater added a comment - 25/May/23 12:36 AM

"Serguei Smirnov <ssmirnov@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51133
Subject: LU-14668 lnet: add 'force' option to lnetctl peer del
Project: fs/lustre-release
Branch: b2_15
Current Patch Set: 1
Commit: d6bb43d5e8598af1828de31b42c2f8f1cb17b023

People

Assignee: Amir Shehata (Inactive)

Reporter: Amir Shehata (Inactive)

Votes: 0

Watchers: 11

Dates

Created: 04/May/21 11:28 PM

Updated: 17/Dec/24 5:01 PM

Resolved: 13/Nov/23 1:42 PM