1) In some cases, we're routing packets from a cluster compute node, which may be OPA or IB, to the cluster's router node, then to a data center router node, across a campus WAN, and back to another data center router that sits on the IB SAN where the Lustre cluster lives. So, ko2iblnd <> ko2iblnd <> ko2iblnd/ksocklnd <> ksocklnd/ko2iblnd <> ko2iblnd. Does the {{credits}} parameter for the ksocklnd module need to match the {{credits}} parameter for ko2iblnd on router nodes with both interfaces?
2) Given the context in 1), does the number of ko2iblnd credits need to match on servers along the entire path, or is it appropriate for router nodes to have a larger number of credits set?
The {{peer_credits}} parameter determines how many concurrent messages can be in flight to the same peer. Since o2iblnd is generally more performant than socklnd, it would make sense to have a larger number of {{peer_credits}} for the socklnd network. The o2iblnd negotiates the peer credits per connection, so even if {{peer_credits}} differs between two nodes, the connection is negotiated down to the lower of the two values. I would therefore recommend keeping {{peer_credits}} the same across homogeneous networks. That said, we know of a limitation with socklnd where the per-interface performance isn't great. One workaround we currently have is to create multiple virtual interfaces on the same Ethernet interface and then configure those in a Multi-Rail (MR) config, which increases the performance. We're tracking this under: https://jira.whamcloud.com/browse/LU-12815. So this might be a way for you to increase the performance on the socklnd side.
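To illustrate the per-connection negotiation described above, here is a minimal Python sketch. It is not LNet code: the node names and {{peer_credits}} values are hypothetical, and the only rule it models is the one stated above, that each o2iblnd connection ends up with the lower of the two endpoints' settings.

```python
def negotiated_peer_credits(local: int, remote: int) -> int:
    """Effective peer_credits for one connection: the lower of the two sides."""
    return min(local, remote)


def per_hop_peer_credits(path):
    """Given [(node_name, peer_credits), ...] in path order, return the
    effective value for each adjacent pair of nodes (each connection)."""
    hops = []
    for (a_name, a_pc), (b_name, b_pc) in zip(path, path[1:]):
        hops.append((a_name, b_name, negotiated_peer_credits(a_pc, b_pc)))
    return hops


# Hypothetical values, for illustration only.
example_path = [("compute", 8), ("cluster-router", 32), ("dc-router", 32)]
for a, b, pc in per_hop_peer_credits(example_path):
    print(f"{a} <> {b}: effective peer_credits = {pc}")
```

In this sketch a router configured with a larger value than its client gains nothing on the client-facing connection, but still benefits on the router-to-router connection where both sides are set high.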
I would also differentiate between the {{credits}} module parameter and the {{peer_credits}} module parameter. The former limits the total number of concurrent sends to all peers on a particular Network Interface (NI), while the latter limits the number of concurrent sends per peer. So if you increase {{peer_credits}}, you'd want to increase the global {{credits}} for the NI as well.
The credits are calculated per CPT. You can take a look at {{lnet_ni_tq_credits()}} for more details.
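As a rough illustration of the per-CPT calculation, the following Python sketch mirrors the logic of {{lnet_ni_tq_credits()}} as I recall it from the LNet source. Treat the details as assumptions and check them against your Lustre version, in particular the floor of 8 times {{peer_credits}}. The function and argument names here are hypothetical.

```python
def ni_tq_credits(credits: int, peer_credits: int, ncpts: int) -> int:
    """Approximate per-CPT transmit credits for one NI.

    credits      -- the NI-wide 'credits' tunable
    peer_credits -- the 'peer_credits' tunable
    ncpts        -- number of CPTs the NI is bound to
    """
    if ncpts < 1:
        raise ValueError("an NI is bound to at least one CPT")
    if ncpts == 1:
        return credits
    per_cpt = credits // ncpts
    # Assumed floor: leave room for several peers per CPT.
    per_cpt = max(per_cpt, 8 * peer_credits)
    # Never hand a single CPT more than the NI-wide total.
    return min(per_cpt, credits)


# Hypothetical tunables, for illustration only.
print(ni_tq_credits(credits=256, peer_credits=8, ncpts=4))
print(ni_tq_credits(credits=256, peer_credits=16, ncpts=4))
```

Under these assumptions, raising {{peer_credits}} without raising {{credits}} quickly pins each CPT at the NI-wide total, which is one way to see why the two parameters should be scaled together.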
3) Should there be a relation between the number of credits defined for a node's LND driver and its buffers (or other relevant settings)?
For a router, I think you need to look at the total number of large/small/tiny buffers you've specified. {{lnetctl routing show}} shows you stats on these buffers, including the minimum credits for each. If the minimum credits dip into negative values, that means there are instances where messages are being queued due to a lack of buffers. In that case you can increase the number of buffers allocated for that size. This can be done dynamically via: {{lnetctl set [tiny_buffers|small_buffers|large_buffers] <value>}}.
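The check described above can be sketched in Python as follows. This is only an illustration: it assumes you have already extracted the minimum credits per buffer pool from the {{lnetctl routing show}} output into a dictionary, and the dictionary layout, function name, and numbers are all hypothetical.

```python
def starved_pools(min_credits):
    """Return the buffer pools whose minimum credits went negative,
    i.e. pools where messages had to queue waiting for a buffer."""
    return sorted(name for name, value in min_credits.items() if value < 0)


# Hypothetical readings, for illustration only.
observed = {"tiny": 512, "small": -3, "large": -1}
for pool in starved_pools(observed):
    # The command form is the one quoted above; the value is left to the admin.
    print(f"consider: lnetctl set {pool}_buffers <value>")
```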
My above comments explain the peer_credits/credits relationship.
Let me know if you have other questions.
Andreas Dilger (adilger@whamcloud.com) merged in patch https://review.whamcloud.com/40143/
Subject: LUDOC-479 lnet: Clarify transmit and routing credits
Project: doc/manual
Branch: master
Current Patch Set:
Commit: d2c7df42886ed80cf2e5a82d9a1521c0003dddf8