Details
- Type: Improvement
- Resolution: Fixed
- Priority: Critical
Description
The OPA driver's optimizations are based on the MPI model, which expects multiple endpoints between any two given nodes. To enable this optimization for Lustre, we need to make it possible, via an LND-specific tunable, to create multiple endpoints between peers and to balance traffic across them.
I have already created an experimental patch to test this theory. With just 2 QPs between the nodes and messages distributed round-robin across them, I was able to push OPA performance to 12.4 GB/s.
This Jira ticket is for productizing my patch and testing it thoroughly on both OPA and IB. Test results will be posted to this ticket.
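The round-robin idea in the description can be illustrated with a small sketch. This is not the experimental patch, which lives in the kernel LND code; it is a toy model in Python under stated assumptions. The names `Peer`, `pick_conn`, and `distribute` are hypothetical, the `qp0`/`qp1` strings stand in for real queue pairs, and `conns_per_peer` here is only a count of connections, borrowed from the parameter name in the linked LUDOC-374 title.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    """Hypothetical peer holding a fixed set of connections (stand-ins for QPs)."""
    conns: list
    next_idx: int = 0  # index of the connection to use for the next message

    def pick_conn(self):
        """Return the next connection in round-robin order."""
        conn = self.conns[self.next_idx]
        # Advance and wrap so traffic is spread evenly over all connections.
        self.next_idx = (self.next_idx + 1) % len(self.conns)
        return conn


def distribute(messages, conns_per_peer=2):
    """Assign each message to a connection, cycling through the peer's connections."""
    if conns_per_peer < 1:
        raise ValueError("conns_per_peer must be at least 1")
    peer = Peer(conns=[f"qp{i}" for i in range(conns_per_peer)])
    return [(peer.pick_conn(), msg) for msg in messages]


if __name__ == "__main__":
    for conn, msg in distribute(["m0", "m1", "m2", "m3", "m4"], conns_per_peer=2):
        print(conn, msg)
```

With `conns_per_peer=1` the sketch reduces to the single-connection case, so a real tunable with that default would presumably leave existing behaviour unchanged; that is an assumption about the design, not something this ticket states.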
Issue Links
- has to be finished together with: LUDOC-374 Add notes about conns_per_peer ko2iblnd parameter (Resolved)
Are you going to add these to the /usr/sbin/ko2iblnd-probe script, will they be set by default in some other manner, or will it be up to the user to discover and set them? At a very minimum, the Lustre User Manual should be updated (see https://wiki.hpdd.intel.com/display/PUB/Making+changes+to+the+Lustre+Manual), but providing good performance out of the box is preferred.