[LU-14660] Fix destination NID for discovery PUSH Created: 30/Apr/21  Updated: 09/Jun/21  Resolved: 09/Jun/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.15.0

Type: Bug Priority: Minor
Reporter: Chris Horn Assignee: Chris Horn
Resolution: Fixed Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

When we send a discovery PUSH after receiving a discovery REPLY, we
want to send it via the same local NID that the REPLY was sent to.
This makes it hard to select an appropriate destination NID for the
PUSH, because lnet_select_pathway() does not run the MR (Multi-Rail)
selection algorithm to choose a peer NI when the source NI has been
specified.

It is reasonable to assume that the NID the message originator used
to send the REPLY is a suitable destination for the discovery PUSH.
Thus, we record this NID in the same place where we currently record
lp_disc_src_nid, and use it when sending the PUSH. With this change,
the only remaining user of lnet_peer_select_nid() is
lnet_peer_send_ping(). In the ping case we do not set a source NID,
so lnet_select_pathway() is free to choose any peer NI. This change
therefore lets us remove lnet_peer_select_nid() altogether.
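
A minimal sketch of the idea, not the actual patch. It assumes a
reduced peer structure and a simplified integer NID type. Only
lp_disc_src_nid is named in this ticket; the field lp_disc_dst_nid and
the names peer_sketch, push_target, record_reply_path() and push_nids()
are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t nid_t;            /* simplified stand-in for an LNet NID */
#define NID_ANY ((nid_t)0)         /* stand-in for "no NID specified" */

/* Hypothetical, reduced peer state (not the real struct lnet_peer). */
struct peer_sketch {
	nid_t lp_disc_src_nid;     /* local NID the REPLY was sent to */
	nid_t lp_disc_dst_nid;     /* assumed field: NID the REPLY came from */
};

/* Source/destination pair to use for the discovery PUSH. */
struct push_target {
	nid_t src;
	nid_t dst;
};

/* On receipt of a discovery REPLY, remember both ends of its path. */
static void record_reply_path(struct peer_sketch *lp,
			      nid_t reply_from, nid_t reply_to)
{
	assert(lp != NULL);
	lp->lp_disc_src_nid = reply_to;    /* our NID that received the REPLY */
	lp->lp_disc_dst_nid = reply_from;  /* peer NID that sent the REPLY */
}

/*
 * Pick the NIDs for the PUSH. If a REPLY path was recorded, reuse it,
 * so no peer NI selection is needed while the source is pinned.
 * Otherwise leave both unspecified so that path selection may choose.
 */
static struct push_target push_nids(const struct peer_sketch *lp)
{
	struct push_target t = { NID_ANY, NID_ANY };

	assert(lp != NULL);
	if (lp->lp_disc_src_nid != NID_ANY &&
	    lp->lp_disc_dst_nid != NID_ANY) {
		t.src = lp->lp_disc_src_nid;
		t.dst = lp->lp_disc_dst_nid;
	}
	return t;
}
```

In this sketch the PUSH goes back along the path the REPLY arrived on,
so no separate NID selection helper is needed for the PUSH case.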

The alternative would be to reproduce much of the path selection
algorithm inside lnet_peer_select_nid() in order to avoid sending to
unhealthy NIDs. Duplicating that logic seems undesirable and
unnecessary.



 Comments   
Comment by Gerrit Updater [ 30/Apr/21 ]

Chris Horn (chris.horn@hpe.com) uploaded a new patch: https://review.whamcloud.com/43507
Subject: LU-14660 lnet: Fix destination NID for discovery PUSH
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 61cd56a6d0abf99b995fd5877fb0146771f14f6f

Comment by Gerrit Updater [ 08/Jun/21 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/43507/
Subject: LU-14660 lnet: Fix destination NID for discovery PUSH
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: dce2f7d1987711dfdced903b13e67091cffe9628

Comment by Peter Jones [ 09/Jun/21 ]

Landed for 2.15
