[LU-15909] Track Peer/Network credits at peer net/net level - use for path selection Created: 02/Jun/22  Updated: 03/Jun/22

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor
Reporter: Chris Horn Assignee: Chris Horn
Resolution: Unresolved Votes: 0
Labels: None

 Description   

If an LNet peer has multiple networks configured, we currently round robin between them regardless of the relative capabilities/capacities of each network. This can lead to a situation where 1/(# nets) of the traffic is sent via a slow protocol (like tcp) when it would be better to use a faster protocol.

This situation can be remedied in Lustre 2.15 by defining net selection rules with UDSP (User Defined Selection Policy). These rules tell LNet to prioritize some nets over others. But this may still not be ideal.

Suppose we have a peer with two cassini interfaces and two ethernet interfaces:

    - net type: tcp
      local NI(s):
        - nid: 172.18.2.3@tcp
          status: up
          interfaces:
              0: enp65s0
        - nid: 172.18.2.4@tcp
          status: up
          interfaces:
              0: enp65s1
    - net type: gni
      local NI(s):
        - nid: 17@gni
          status: up
        - nid: 18@gni
          status: up

Without UDSP, half of all traffic will be sent over the tcp network, which is much slower than the gni network.

With UDSP, we can add a rule so that all traffic is sent over the gni network, unless there is a problem and the tcp interfaces have a higher health value than the gni interfaces.

This may seem ideal, but it could be the case that all available resources on the gni interfaces are consumed. In this case, LNet will queue messages on the gni interfaces until a resource becomes available. Meanwhile, the tcp interfaces may be completely idle.

I propose to add resource tracking at the local net/peer net level. This will allow LNet to choose a network (local or peer) based on the resources available in that network (which are simply the sum of the resources available to the NIs belonging to the network).

This should allow us to get most of the benefit of the UDSP network selection rule but also enable us to fully leverage all network capacity in a more intelligent manner than round-robin across all nets.
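
To make the proposal concrete, here is a minimal sketch of credit-based net selection. It is not the code from the patches on this ticket: the struct and function names (sketch_ni, sketch_net, sketch_net_credits, sketch_select_net) are hypothetical stand-ins, and letting a negative credit count represent queued messages is an assumption of the sketch. It only illustrates the idea that a net's available resources are the sum of its NIs' credits and that the net with the most available credits is preferred.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are NOT the real LNet structures. */
struct sketch_ni {
	int credits;	/* available send credits; assumed negative when messages are queued */
};

struct sketch_net {
	const char *name;		/* e.g. "tcp" or "gni" */
	const struct sketch_ni *nis;	/* NIs belonging to this net */
	size_t num_nis;
};

/* Resources available to a net: the sum of the credits of its NIs. */
static int sketch_net_credits(const struct sketch_net *net)
{
	int total = 0;

	for (size_t i = 0; i < net->num_nis; i++)
		total += net->nis[i].credits;
	return total;
}

/*
 * Pick the net with the most available credits.
 * Ties keep the earlier entry; returns NULL when there are no nets.
 */
static const struct sketch_net *
sketch_select_net(const struct sketch_net *nets, size_t num_nets)
{
	const struct sketch_net *best = NULL;
	int best_credits = 0;

	for (size_t i = 0; i < num_nets; i++) {
		int credits = sketch_net_credits(&nets[i]);

		if (best == NULL || credits > best_credits) {
			best = &nets[i];
			best_credits = credits;
		}
	}
	return best;
}
```

In this sketch, a gni net whose NIs are out of credits loses to an idle tcp net, which is the case described above. A real implementation would presumably still need to combine this with health values and any UDSP priorities.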

A side benefit is that we can get rid of the round robin behavior altogether. Round robin relies on sequence numbers, and there is a potential scenario where the round robin behavior can be broken.

On every send to some peer we increment a sequence number for the source interface and a sequence number for the peer. The sequence numbers are unsigned 32-bit ints, so if a sequence number happens to wrap in just the right way, we can end up in a situation where the sequence number of some NI is UINT_MAX but the next send sets some other NI's sequence number to 0. All future sends then get funneled to the NI with the lower sequence number.
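
The wraparound problem can be illustrated with a deliberately simplified model. This is not the actual LNet selection code: sketch_rr_ni and sketch_rr_send are hypothetical names, and the model assumes each send simply picks the NI with the lowest sequence number and increments it. It uses uint32_t, so the maximum value appears as UINT32_MAX.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one NI taking part in round robin selection. */
struct sketch_rr_ni {
	uint32_t seq;		/* incremented each time this NI is chosen */
	unsigned long sends;	/* number of sends routed to this NI */
};

/*
 * Choose the NI with the lowest sequence number (ties go to the lowest
 * index), then bump its sequence number. Returns the chosen index.
 */
static size_t sketch_rr_send(struct sketch_rr_ni *nis, size_t num_nis)
{
	size_t best = 0;

	for (size_t i = 1; i < num_nis; i++)
		if (nis[i].seq < nis[best].seq)
			best = i;

	nis[best].seq++;	/* unsigned arithmetic wraps UINT32_MAX to 0 */
	nis[best].sends++;
	return best;
}
```

In this model, once one NI wraps to 0 while the other still sits at UINT32_MAX, the comparison keeps choosing the wrapped NI and the other receives no traffic until the first one catches up, roughly 2^32 sends later.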



 Comments   
Comment by Gerrit Updater [ 03/Jun/22 ]

"Chris Horn <chris.horn@hpe.com>" uploaded a new patch: https://review.whamcloud.com/47525
Subject: LU-15909 lnet: Add peer NI send lists
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 3047ef173f50333f6867d591b3b219f5daace548

Comment by Gerrit Updater [ 03/Jun/22 ]

"Chris Horn <chris.horn@hpe.com>" uploaded a new patch: https://review.whamcloud.com/47526
Subject: LU-15909 lnet: Use lnet_send_data for NI selection
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: ef7d4d75ae0938d5097dc5c5ec13c06c5d84ed8f

Comment by Gerrit Updater [ 03/Jun/22 ]

"Chris Horn <chris.horn@hpe.com>" uploaded a new patch: https://review.whamcloud.com/47527
Subject: LU-15909 lnet: Correct net selection for router ping
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 32e781b1f6652a8520e47bae9917fba153020ab5

Comment by Gerrit Updater [ 03/Jun/22 ]

"Chris Horn <chris.horn@hpe.com>" uploaded a new patch: https://review.whamcloud.com/47528
Subject: LU-15909 lnet: Add peer net send lists
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: a28ed176663bbe7c57a1c7f4b99aeae19a746f2f

Comment by Gerrit Updater [ 03/Jun/22 ]

"Chris Horn <chris.horn@hpe.com>" uploaded a new patch: https://review.whamcloud.com/47529
Subject: LU-15909 lnet: Use net/peer net credits for net selection
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 464e85adbce415db0187de52da009480d2d2ad70

Comment by Gerrit Updater [ 03/Jun/22 ]

"Chris Horn <chris.horn@hpe.com>" uploaded a new patch: https://review.whamcloud.com/47530
Subject: LU-15909 lnet: Refactor lnet_find_route_locked
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 2c26ca984c47618336df4ca258c62ade31b8c8dd