[LU-17270] TN status is lost if TN_EVENT_TAG_RX_OK occurs before TN_EVENT_TX_OK Created: 07/Nov/23  Updated: 07/Nov/23

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Chris Horn Assignee: Chris Horn
Resolution: Unresolved Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

When kfilnd issues a bulk GET, the originator posts a tagged receive buffer, and the BULK_GET TN then transitions to TN_STATE_WAIT_COMP, where it waits for both the TN_EVENT_TX_OK and TN_EVENT_TAG_RX_OK events to complete the transaction.

These events may arrive at the originator in either order. If TX_OK arrives first, the BULK_GET TN transitions to TN_STATE_WAIT_TAG_COMP. If TAG_RX_OK then arrives with a non-zero status, that status is handled correctly.

If TAG_RX_OK arrives first, its status is not recorded in the TN, and the BULK_GET TN transitions to TN_STATE_WAIT_SEND_COMP. When the TX_OK event arrives, the message is finalized as though the TN completed successfully when in fact it failed. This can result in data corruption.

This shows the good/working case:

# TX_OK first with status 0
00000800:40000000:3.0:1699032572.037560:0:61649:0:(kfilnd_tn.c:299:kfilnd_tn_state_change()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000000105cad8: 17@kfi:1 -> 0@kfi(00000000eb641e7b):0x0 TN_STATE_TAGGED_RECV_POSTED -> TN_STATE_WAIT_COMP state change
00000800:40000000:5.0:1699032572.037584:0:61651:0:(kfilnd_tn.c:1084:kfilnd_tn_state_wait_comp()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000000105cad8: 17@kfi:1 -> 0@kfi(00000000eb641e7b):0x0 TN_EVENT_TX_OK event status 0 <<<< TX_OK first

# TAG_RX_OK second with status -61
00000800:40000000:5.0:1699032572.374710:0:61651:0:(kfilnd_tn.c:1241:kfilnd_tn_state_wait_tag_comp()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000000105cad8: 17@kfi:1 -> 0@kfi(00000000eb641e7b):0x0 TN_EVENT_TAG_RX_OK event status -61
00000800:40000000:5.0:1699032572.374712:0:61651:0:(kfilnd_tn.c:313:kfilnd_tn_status_update()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000000105cad8: 17@kfi:1 -> 0@kfi(00000000eb641e7b):0x0 0 -> -61 status change
00000800:40000000:5.0:1699032572.374713:0:61651:0:(kfilnd_tn.c:319:kfilnd_tn_status_update()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000000105cad8: 17@kfi:1 -> 0@kfi(00000000eb641e7b):0x0 0 -> 0 health status change
00000400:00000200:5.0:1699032572.374714:0:61651:0:(lib-msg.c:1046:lnet_is_health_check()) Msg 0000000039f6c737 is in inconsistent state, don't perform health checking (-61, 0)
00000400:00000200:5.0:1699032572.374715:0:61651:0:(lib-msg.c:1051:lnet_is_health_check()) health check = 0, status = -61, hstatus = 0
00000400:40000000:5.0:1699032572.374715:0:61651:0:(lib-msg.c:979:lnet_msg_detach_md()) md 00000000b5e76aa5 msg 0000000039f6c737

# Correct status is passed to upper layer
00000100:40000000:5.0:1699032572.374716:0:61651:0:(events.c:481:server_bulk_callback()) event type 5, status -61, desc 000000002a3c77d3

This shows the broken case:

# TAG_RX_OK first with status -61
00000800:40000000:5.0:1699032576.237177:0:61651:0:(kfilnd_tn.c:1084:kfilnd_tn_state_wait_comp()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000000105cad8: 17@kfi:1 -> 0@kfi(00000000eb641e7b):0x0 TN_EVENT_TAG_RX_OK event status -61
00000800:40000000:5.0:1699032576.237178:0:61651:0:(kfilnd_tn.c:299:kfilnd_tn_state_change()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000000105cad8: 17@kfi:1 -> 0@kfi(00000000eb641e7b):0x0 TN_STATE_WAIT_COMP -> TN_STATE_WAIT_SEND_COMP state change

# TX_OK second with status 0
00000800:40000000:5.0:1699032576.237242:0:61651:0:(kfilnd_tn.c:1178:kfilnd_tn_state_wait_send_comp()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000000105cad8: 17@kfi:1 -> 0@kfi(00000000eb641e7b):0x0 TN_EVENT_TX_OK event status 0
00000400:00000200:5.0:1699032576.237243:0:61651:0:(lib-msg.c:1051:lnet_is_health_check()) health check = 1, status = 0, hstatus = 0
00000400:00000200:5.0:1699032576.237244:0:61651:0:(lib-msg.c:822:lnet_health_check()) health check: 17@kfi->0@kfi: GET: OK
00000400:40000000:5.0:1699032576.237247:0:61651:0:(lib-msg.c:979:lnet_msg_detach_md()) md 00000000c76d55f1 msg 00000000c54fcba7
00000400:40000000:5.0:1699032576.237248:0:61651:0:(lib-msg.c:979:lnet_msg_detach_md()) md 00000000c76d55f1 msg 000000003842aa4c
00000400:00000200:5.0:1699032576.237249:0:61651:0:(lib-md.c:68:lnet_md_unlink()) Unlinking md 00000000c76d55f1
00000800:40000000:5.0:1699032576.237251:0:61651:0:(kfilnd_tn.c:1473:kfilnd_tn_free()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000000105cad8: 17@kfi:1 -> 0@kfi(00000000eb641e7b):0x0 Transaction freed

# Upper layer got status 0, corruption detected
00000001:00020000:4.0:1699032576.261312:0:58034:0:(brw_test.c:230:brw_check_page()) Bad data in page 000000008c1819d6: 0xbeefbeefbeefbeef, 0xeeb0eeb1eeb2eeb3 expected
00000001:00020000:4.0:1699032576.261312:0:58034:0:(brw_test.c:266:brw_check_bulk()) Bulk page 000000008c1819d6 (0/256) is corrupted!
00000001:00020000:4.0:1699032576.261313:0:58034:0:(brw_test.c:432:brw_bulk_ready()) Bulk data rpc 0000000007c0e22a from 12345-0@kfi is corrupted!


 Comments   
Comment by Gerrit Updater [ 07/Nov/23 ]

"Chris Horn <chris.horn@hpe.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/53027
Subject: LU-17270 kfilnd: Check status of TAG_RX_OK in WAIT_COMP
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: c10b92f65813eec7712fdb62432dd393323be350

Generated at Sat Feb 10 03:34:03 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.