[LU-15828] o2iblnd: limit peer credits hiw at half the queue size Created: 05/May/22 Updated: 21/Jan/23 Resolved: 13/Jan/23 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.16.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Serguei Smirnov | Assignee: | Serguei Smirnov |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | o2iblnd |
| Severity: | 3 |
| Description |
|
The o2iblnd "high watermark" parameter (peercredits_hiw) determines when the accumulated credits are returned to the peer:
/* when eagerly to return credits */
#define IBLND_CREDITS_HIGHWATER(t, conn) ((conn->ibc_version) == IBLND_MSG_VERSION_1 ? \
        IBLND_CREDIT_HIGHWATER_V1 : \
        min(t->lnd_peercredits_hiw, (__u32)conn->ibc_queue_depth - 1))
If the peer_credits and peercredits_hiw values configured via module parameters are higher than those on the other node, the macro above can set the hiw too close to the negotiated queue depth. For example, with peer_credits/peercredits_hiw at 32/16 on this node versus 16/8 on the other end, the queue depth is 16 and the hiw is set to min(16, 16 - 1) = 15. This node then holds on to credits for as long as possible, which can cause the other node to run out of credits under load. |
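The arithmetic in the example can be checked with a small standalone sketch. It is not the o2iblnd code: the helper names `min_u32`, `hiw_before` and `hiw_half_cap` are hypothetical, and the half-queue cap is only an assumption drawn from this ticket's title, since the merged patch itself is not quoted here.

```c
#include <stdint.h>

/* Hypothetical helper standing in for the kernel's min() macro. */
static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/*
 * Behaviour described in the ticket for non-V1 connections:
 * the configured hiw is capped only at queue_depth - 1.
 * Assumes queue_depth >= 1.
 */
static uint32_t hiw_before(uint32_t peercredits_hiw, uint32_t queue_depth)
{
	return min_u32(peercredits_hiw, queue_depth - 1);
}

/*
 * Assumed shape of the limit named in the ticket title:
 * the configured hiw is capped at half the negotiated queue depth.
 */
static uint32_t hiw_half_cap(uint32_t peercredits_hiw, uint32_t queue_depth)
{
	return min_u32(peercredits_hiw, queue_depth / 2);
}
```

With the values from the example (local peercredits_hiw of 16, queue depth of 16), `hiw_before(16, 16)` gives 15, while `hiw_half_cap(16, 16)` gives 8, which matches the hiw configured on the other node.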
| Comments |
| Comment by Gerrit Updater [ 22/Dec/22 ] |
|
"Serguei Smirnov <ssmirnov@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49497 |
| Comment by Gerrit Updater [ 13/Jan/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49497/ |
| Comment by Peter Jones [ 13/Jan/23 ] |
|
Landed for 2.16 |