This patch has landed to master, but as yet there is no documentation patch for the manual.
Comments from Jeremy in LU-3322:
Summary:
Normally the Lustre ko2iblnd can only operate with identical peer_credits and map_on_demand values on both systems. The patch affects the active (initiator) connections in IB for clients/routers and the passive (responder) connections in IB for servers/routers. Passive connections will automatically negotiate down if the parameters permit, and reject if they are too high for the remote request. Active connections will send their defaults initially and, if rejected, attempt to use the lower values from the reject message. If the values in the reject message are higher, the active connection won't retry because that's not supported.
There are 3 parameters in the ko2iblnd of interest here: peer_credits, map_on_demand, and concurrent_sends. The default settings for ko2iblnd are peer_credits=8, concurrent_sends=8, and map_on_demand=0 (disabled).
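The negotiation described above can be sketched as a small model. This is only one reading of the summary, not the ko2iblnd code: the names Params, fits, passive_respond and active_connect are hypothetical, the rule "a request fits if neither value exceeds the responder's" is an assumption, and the special meaning of map_on_demand=0 (disabled) is deliberately not modeled.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Params:
    """Hypothetical container for the two negotiated values."""
    peer_credits: int
    map_on_demand: int  # assumed non-zero here; 0 (disabled) is not modeled


def fits(request: Params, limit: Params) -> bool:
    """Assumed rule: a request fits if no value exceeds the limit."""
    return (request.peer_credits <= limit.peer_credits
            and request.map_on_demand <= limit.map_on_demand)


def passive_respond(local: Params, request: Params) -> Tuple[bool, Params]:
    """Passive side: negotiate down to the request, or reject with own values."""
    if fits(request, local):
        return True, request   # accept, using the (lower or equal) requested values
    return False, local        # reject; the reject message carries our values


def active_connect(local: Params, remote: Params) -> Optional[Params]:
    """Active side: send defaults, retry once with lower values from a reject."""
    accepted, params = passive_respond(remote, local)
    if accepted:
        return params
    if fits(params, local):
        # The reject message carried lower values, so retry with them.
        accepted, params = passive_respond(remote, params)
        return params if accepted else None
    # The reject message carried higher values: no retry in this model.
    return None


if __name__ == "__main__":
    print(active_connect(Params(16, 64), Params(8, 32)))  # retry with lower values
    print(active_connect(Params(8, 32), Params(16, 64)))  # passive negotiates down
    print(active_connect(Params(16, 32), Params(8, 64)))  # mixed case: no connection
```

In this model a connection is established whenever one side's values are both less than or equal to the other's; a mixed case, where each side is higher on one parameter, fails.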
peer_credits determines how many messages you can receive from a single connection (queue pair).
map_on_demand determines the number of DMA segments per credit that are sent. Each segment is usually (always?) a page and is sent as a separate work request so these are typically (256 * 4k pages) going across the wire.
concurrent_sends determines how many send messages you can queue at a time to a single connection. concurrent_sends can't be less than half of peer_credits, and it needs to be <= 62 so as not to exceed the maximum number of work requests per queue pair in the standard Mellanox ConnectX[123] HCAs.
The relation between those values is: work_requests_allocated = (map_on_demand + 1) * concurrent_sends
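As a worked example of the relation and the constraints stated above, here is a minimal sketch. The function names and the max_qp_wr argument are hypothetical, and the limit of 16000 used in the example is a placeholder, not a value taken from any HCA; substitute the limit your hardware reports.

```python
from typing import List


def work_requests_allocated(map_on_demand: int, concurrent_sends: int) -> int:
    """Relation quoted in the ticket: (map_on_demand + 1) * concurrent_sends."""
    return (map_on_demand + 1) * concurrent_sends


def check_settings(peer_credits: int, concurrent_sends: int,
                   map_on_demand: int, max_qp_wr: int) -> List[str]:
    """Return a list of problems found; an empty list means no rule was broken."""
    problems = []
    # concurrent_sends can't be less than half of peer_credits.
    if concurrent_sends * 2 < peer_credits:
        problems.append("concurrent_sends is less than half of peer_credits")
    # Upper bound quoted for standard Mellanox ConnectX HCAs.
    if concurrent_sends > 62:
        problems.append("concurrent_sends is greater than 62")
    # Compare against the per-queue-pair limit supplied by the caller.
    needed = work_requests_allocated(map_on_demand, concurrent_sends)
    if needed > max_qp_wr:
        problems.append(f"{needed} work requests exceed the limit of {max_qp_wr}")
    return problems


if __name__ == "__main__":
    limit = 16000  # placeholder value, not read from real hardware
    print(work_requests_allocated(256, 62))   # (256 + 1) * 62 = 15934
    print(check_settings(8, 8, 0, limit))     # the defaults quoted in the ticket
    print(check_settings(128, 64, 256, limit))
```

With map_on_demand=256 and concurrent_sends=62 the relation gives 15934 work requests, which shows how quickly the total grows with the number of segments per credit.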
You can see your max work requests per queue pair with:
My recommendation is the following for maximum compatibility.
Lustre MDS/OSS Servers and Routers, use the patch and run with:
Existing Lustre clients:
Need to apply the patch or lower the current values for peer_credits and concurrent_sends to match the OSS/MDS setting.
Do we need to include some or all of Amir's compatibility matrix (https://jira.hpdd.intel.com/secure/attachment/18893/compatibility%20matrix.xlsx) in the manual to tell users what options exist for configuring with mixed-Lustre-version networks?
The manual section related to this should be marked conditional for Lustre 2.8 using the <para condition="l28"> tag.
The attached spreadsheet lists the 4th case as "OPN = Map-on-demand->on", but it likely should read "OPN = Map-on-demand->off".