[LU-16995] LNetError: 1094:0:(kfilnd_tn.c:1340:kfilnd_tn_state_fail()) LBUG Created: 27/Jul/23 Updated: 22/Aug/23 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Chris Horn | Assignee: | Chris Horn |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
The fabric can delay packets long enough that the retry handler cancels a message which is nevertheless delivered to the target. With the right timing, the initiator then receives a TAG_RX_OK event after the transaction has already transitioned to TN_STATE_FAIL. This currently trips an LBUG in kfilnd_tn_state_fail(), but kfilnd can instead be modified to allow the transaction to complete normally. |
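A minimal sketch of the idea, not the actual patch: the types and the helper tn_state_fail_event() below are simplified, hypothetical stand-ins rather than the real kfilnd definitions, and treating "complete normally" as clearing the error status is an assumption. It only illustrates accepting a late TAG_RX_OK in the fail state instead of asserting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real kfilnd definitions. */
enum tn_state { TN_STATE_WAIT_TAG_COMP, TN_STATE_FAIL, TN_STATE_DONE };
enum tn_event { TN_EVENT_TAG_RX_OK, TN_EVENT_TAG_RX_CANCEL, TN_EVENT_UNKNOWN };

struct tn {
	enum tn_state state;
	int status;	/* 0 on success, negative errno-style value on failure */
};

/*
 * Handle an event for a transaction that is already in TN_STATE_FAIL.
 * Returns false for events this sketch does not expect in that state,
 * which is where an assertion (LBUG) would otherwise fire.
 */
static bool tn_state_fail_event(struct tn *tn, enum tn_event ev)
{
	assert(tn != NULL);

	switch (ev) {
	case TN_EVENT_TAG_RX_OK:
		/*
		 * The cancel lost the race and the data arrived anyway.
		 * Assumption: completing normally means clearing the error.
		 */
		tn->status = 0;
		tn->state = TN_STATE_DONE;
		return true;
	case TN_EVENT_TAG_RX_CANCEL:
		/* The cancel won: finish and keep the recorded failure status. */
		tn->state = TN_STATE_DONE;
		return true;
	default:
		/* Unexpected event: leave the transaction untouched. */
		return false;
	}
}
```

In this sketch the late success and the cancel both finish the transaction. Only the status reported to the upper layer differs, so the race described above no longer has an outcome that ends in an assertion.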
| Comments |
| Comment by Gerrit Updater [ 27/Jul/23 ] |
|
"Chris Horn <chris.horn@hpe.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51787 |
| Comment by Gerrit Updater [ 22/Aug/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51787/ |