[LU-15828] o2iblnd: limit peer credits hiw at half the queue size Created: 05/May/22  Updated: 21/Jan/23  Resolved: 13/Jan/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.16.0

Type: Bug
Priority: Minor
Reporter: Serguei Smirnov
Assignee: Serguei Smirnov
Resolution: Fixed
Votes: 0
Labels: o2iblnd

Issue Links:
Related
is related to LU-12569 IBLND_CREDITS_HIGHWATER does not chec... Resolved
Severity: 3

 Description   

The o2iblnd "high watermark" (hiw) parameter determines when accumulated credits are returned to the peer:

 /* when eagerly to return credits */
 #define IBLND_CREDITS_HIGHWATER(t, conn) \
         ((conn->ibc_version) == IBLND_MSG_VERSION_1 ? \
          IBLND_CREDIT_HIGHWATER_V1 : \
          min(t->lnd_peercredits_hiw, (__u32)conn->ibc_queue_depth - 1))

If peer_credits and peercredits_hiw are configured (via module parameters) with higher values on this node than on the other node, the code above may set hiw too close to the negotiated queue depth. For example, with peer_credits/peercredits_hiw at 32/16 on this node and 16/8 on the other end, the negotiated queue depth is 16 and hiw evaluates to min(16, 16 - 1) = 15.
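The arithmetic in the example can be checked with a small stand-alone sketch. It reproduces only the non-V1 branch of the macro; the function name hiw_current and its plain integer arguments are hypothetical stand-ins for the tunables and connection fields, not Lustre code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the non-V1 branch of IBLND_CREDITS_HIGHWATER:
 * the configured hiw is capped only at (queue_depth - 1).
 */
static uint32_t hiw_current(uint32_t configured_hiw, uint32_t queue_depth)
{
	/* A zero depth would wrap the unsigned subtraction below. */
	assert(queue_depth > 0);

	uint32_t cap = queue_depth - 1;

	return configured_hiw < cap ? configured_hiw : cap;
}
```

Under these assumptions, hiw_current(16, 16) gives 15 for the node configured with 32/16, while the node configured with 16/8 keeps hiw_current(8, 16) = 8.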

As a result, this node holds on to credits for as long as possible, which can cause the other node to run out of credits under load.
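The issue title suggests limiting hiw to half the queue size, and the patch subject mentions resetting hiw proportionally. The following is a minimal sketch of what each idea could look like, not the code of the merged patch. Both function names, their parameters, and the floor of 1 are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: cap hiw at half of the negotiated queue depth. */
static uint32_t hiw_half_cap(uint32_t configured_hiw, uint32_t queue_depth)
{
	assert(queue_depth > 0);

	uint32_t cap = queue_depth / 2;

	if (cap == 0)
		cap = 1;	/* assumed floor so hiw never drops to zero */

	return configured_hiw < cap ? configured_hiw : cap;
}

/*
 * Hypothetical: scale hiw by the ratio of the negotiated queue depth to
 * the locally configured peer_credits. Assumes the negotiated depth does
 * not exceed the configured value, so the result fits in 32 bits.
 */
static uint32_t hiw_proportional(uint32_t configured_hiw,
				 uint32_t configured_credits,
				 uint32_t queue_depth)
{
	assert(configured_credits > 0 && queue_depth > 0);

	/* Widen before multiplying to avoid 32-bit overflow. */
	uint64_t scaled = (uint64_t)configured_hiw * queue_depth /
			  configured_credits;

	return scaled ? (uint32_t)scaled : 1;
}
```

For the 32/16 configuration with a negotiated queue depth of 16, both sketches yield 8 instead of 15, which matches the hiw configured on the other node.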



 Comments   
Comment by Gerrit Updater [ 22/Dec/22 ]

"Serguei Smirnov <ssmirnov@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49497
Subject: LU-15828 o2iblnd: reset hiw proportionally
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 46cddb76a0fdc1532396ba83ed552ecc4fdc395f

Comment by Gerrit Updater [ 13/Jan/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49497/
Subject: LU-15828 o2iblnd: reset hiw proportionally
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: e1944c29793d489429730a9445e243b448c3d751

Comment by Peter Jones [ 13/Jan/23 ]

Landed for 2.16
