[LU-15885] o2iblnd: RDMA_CM_EVENT_UNREACHABLE may be received after conn clean-up Created: 24/May/22 Updated: 14/Oct/22 Resolved: 10/Oct/22 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.16.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Serguei Smirnov | Assignee: | Serguei Smirnov |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | o2iblnd | ||
| Severity: | 3 | ||||
| Rank (Obsolete): | 9223372036854775807 | ||||
| Description |
|
There is a scenario in which an IB port going down triggers the following assertion:
case RDMA_CM_EVENT_UNREACHABLE:
        conn = cmid->context;
        LASSERT(conn->ibc_state == IBLND_CONN_ACTIVE_CONNECT ||
                conn->ibc_state == IBLND_CONN_PASSIVE_WAIT);
The assertion fails because the connection has already been disconnected due to an earlier "RDMA Timeout". Since it appears to be possible to receive RDMA_CM_EVENT_UNREACHABLE after the decision to close the connection has been made, this code should be changed so that it does not assert in that case. |
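One possible shape for such a change, sketched outside the kernel so it can be compiled on its own: check the connection state and ignore the late event instead of asserting. This is only an illustration and is not taken from the patch that landed. The type demo_conn, the state DEMO_CONN_CLOSING, the helper names and the return values are all hypothetical; only the two IBLND_CONN_* state names and the ibc_state field come from the snippet above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the o2iblnd connection states. Only the first
 * two names appear in the ticket; DEMO_CONN_CLOSING is an assumption that
 * represents "clean-up has already started". */
enum demo_conn_state {
        IBLND_CONN_ACTIVE_CONNECT,
        IBLND_CONN_PASSIVE_WAIT,
        DEMO_CONN_CLOSING
};

/* Hypothetical, minimal connection object. */
struct demo_conn {
        enum demo_conn_state ibc_state;
};

/* True if the connection is still in one of the states the original
 * LASSERT expected when RDMA_CM_EVENT_UNREACHABLE arrives. */
bool demo_unreachable_expected(const struct demo_conn *conn)
{
        assert(conn != NULL);
        return conn->ibc_state == IBLND_CONN_ACTIVE_CONNECT ||
               conn->ibc_state == IBLND_CONN_PASSIVE_WAIT;
}

/* Hypothetical handler: returns -1 when the event should be treated as a
 * connection failure, and 0 when it arrived after clean-up began and can
 * be ignored instead of tripping an assertion. */
int demo_handle_unreachable(const struct demo_conn *conn)
{
        if (!demo_unreachable_expected(conn))
                return 0;   /* late event: connection is already closing */

        return -1;          /* connection attempt failed: peer unreachable */
}
```

With this shape, a connection that was already closed by an earlier timeout yields 0 from demo_handle_unreachable() and the event is dropped, while a connection still in the connect or wait state yields -1.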
| Comments |
| Comment by Gerrit Updater [ 08/Sep/22 ] |
|
"Serguei Smirnov <ssmirnov@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/48492 |
| Comment by Gerrit Updater [ 10/Oct/22 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/48492/ |
| Comment by Peter Jones [ 10/Oct/22 ] |
|
Landed for 2.16 |