[LU-15137] socklnd: decrement typed connection counters on close Created: 20/Oct/21  Updated: 19/Jan/24  Resolved: 06/Jan/22

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.15.0, Lustre 2.12.10

Type: Bug Priority: Minor
Reporter: Serguei Smirnov Assignee: Serguei Smirnov
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

If some connections are being closed while others are still being created for the same control block, the resulting race may trigger an LBUG in ksocknal_connect:

if ((wanted & BIT(SOCKLND_CONN_ANY)) != 0) {
    type = SOCKLND_CONN_ANY;
} else if ((wanted & BIT(SOCKLND_CONN_CONTROL)) != 0) {
    type = SOCKLND_CONN_CONTROL;
} else if ((wanted & BIT(SOCKLND_CONN_BULK_IN)) != 0 &&
           conn_cb->ksnr_blki_conn_count <= conn_cb->ksnr_blko_conn_count) {
    type = SOCKLND_CONN_BULK_IN;
} else {
    LASSERT ((wanted & BIT(SOCKLND_CONN_BULK_OUT)) != 0);
    type = SOCKLND_CONN_BULK_OUT;
}

This may happen if the previously created BULK_IN connection was closed and no more BULK_OUT connections are needed, but ksocknal_connect tries again to create a BULK_IN connection. Because ksnr_blki_conn_count was not decremented on close, it can still exceed ksnr_blko_conn_count, so the BULK_IN branch is skipped and the LASSERT in the final else branch fails because BULK_OUT is not set in wanted.

Fix this by making sure counters are decremented on close.
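
The scenario above can be replayed with a minimal user-space sketch. This is not the kernel code or the landed patch: every model_* and MODEL_* name is hypothetical, the two counters only stand in for ksnr_blki_conn_count and ksnr_blko_conn_count, and MODEL_CONN_NONE is returned where the real code would hit the LASSERT.

```c
#include <assert.h>

/* Hypothetical user-space model of the type selection; not the kernel code. */
enum model_conn_type {
	MODEL_CONN_NONE = -1,	/* stands in for the LASSERT failure */
	MODEL_CONN_ANY = 0,
	MODEL_CONN_CONTROL,
	MODEL_CONN_BULK_IN,
	MODEL_CONN_BULK_OUT
};

#define MODEL_BIT(n) (1u << (n))

struct model_conn_cb {
	int blki_conn_count;	/* models ksnr_blki_conn_count */
	int blko_conn_count;	/* models ksnr_blko_conn_count */
};

/* Mirrors the if/else chain quoted in the description. */
int model_select_type(const struct model_conn_cb *cb, unsigned int wanted)
{
	if (wanted & MODEL_BIT(MODEL_CONN_ANY))
		return MODEL_CONN_ANY;
	if (wanted & MODEL_BIT(MODEL_CONN_CONTROL))
		return MODEL_CONN_CONTROL;
	if ((wanted & MODEL_BIT(MODEL_CONN_BULK_IN)) &&
	    cb->blki_conn_count <= cb->blko_conn_count)
		return MODEL_CONN_BULK_IN;
	if (wanted & MODEL_BIT(MODEL_CONN_BULK_OUT))
		return MODEL_CONN_BULK_OUT;
	return MODEL_CONN_NONE;	/* the real code would assert here */
}

/* Count a newly created bulk connection. */
void model_conn_opened(struct model_conn_cb *cb, int type)
{
	if (type == MODEL_CONN_BULK_IN)
		cb->blki_conn_count++;
	else if (type == MODEL_CONN_BULK_OUT)
		cb->blko_conn_count++;
}

/* Close a connection; 'decrement' toggles the behaviour proposed as the fix. */
void model_conn_closed(struct model_conn_cb *cb, int type, int decrement)
{
	if (!decrement)
		return;	/* old behaviour: the counter stays stale */
	if (type == MODEL_CONN_BULK_IN) {
		assert(cb->blki_conn_count > 0);
		cb->blki_conn_count--;
	} else if (type == MODEL_CONN_BULK_OUT) {
		assert(cb->blko_conn_count > 0);
		cb->blko_conn_count--;
	}
}

/* Open and close one BULK_IN connection, then ask for BULK_IN again. */
int model_replay(int decrement_on_close)
{
	struct model_conn_cb cb = { 0, 0 };

	model_conn_opened(&cb, MODEL_CONN_BULK_IN);
	model_conn_closed(&cb, MODEL_CONN_BULK_IN, decrement_on_close);
	return model_select_type(&cb, MODEL_BIT(MODEL_CONN_BULK_IN));
}
```

In this model, model_replay(0) leaves the BULK_IN counter at 1 against a BULK_OUT counter of 0, so no branch matches and MODEL_CONN_NONE comes back, which corresponds to the failed assertion. model_replay(1) decrements on close and selects MODEL_CONN_BULK_IN.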



 Comments   
Comment by Gerrit Updater [ 30/Oct/21 ]

"Serguei Smirnov <ssmirnov@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/45422
Subject: LU-15137 socklnd: decrement connection counters on close
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: b513a383226bfa190e398a33ac011e5cfa5c8272

Comment by Gerrit Updater [ 04/Nov/21 ]

"Serguei Smirnov <ssmirnov@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/45461
Subject: LU-15137 socklnd: expect two control connections maximum
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 6c0bcf741e425f9732bc60552d3adb74437ba7fc

Comment by Gerrit Updater [ 23/Dec/21 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45461/
Subject: LU-15137 socklnd: expect two control connections maximum
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: ee9a03d8308c5918a17e2e45fd59ee5a4c38acaf

Comment by Gerrit Updater [ 06/Jan/22 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45422/
Subject: LU-15137 socklnd: decrement connection counters on close
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 7e26413aa85fdc931721cde36bae3bf2bb97e63f

Comment by Peter Jones [ 06/Jan/22 ]

Landed for 2.15

Comment by Gerrit Updater [ 09/May/22 ]

"Cyril Bordage <cbordage@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/47253
Subject: LU-15137 socklnd: decrement connection counters on close
Project: fs/lustre-release
Branch: b2_12
Current Patch Set: 1
Commit: 79527349324cb8d75904c90f7c0d70939b05cab4

Comment by Gerrit Updater [ 09/May/22 ]

"Cyril Bordage <cbordage@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/47254
Subject: LU-15137 socklnd: expect two control connections maximum
Project: fs/lustre-release
Branch: b2_12
Current Patch Set: 1
Commit: 76e38c06682a23bac58da3c5ddfcbf3ecb94fe6e

Comment by Gerrit Updater [ 20/Sep/22 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/47253/
Subject: LU-15137 socklnd: decrement connection counters on close
Project: fs/lustre-release
Branch: b2_12
Current Patch Set:
Commit: 9e3f7e3ed8ad90795638785c6fc2acc22d80fdf2

Comment by Gerrit Updater [ 20/Sep/22 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/47254/
Subject: LU-15137 socklnd: expect two control connections maximum
Project: fs/lustre-release
Branch: b2_12
Current Patch Set:
Commit: ed368445ef7028d2f62e30e3c4a40a1475908477
