Details
Type: Bug
Resolution: Fixed
Priority: Minor
Severity: 3
Description
In the log below, the server sends a bulk GET using peer 000000006aedb49e, then frees that peer and creates a new one, 00000000d640004c. The server sends a HELLO to the client, and the client processes this HELLO before posting the write for the bulk GET.
The server posted tagged recv (1) using kp_local_session_key 66, but the client responds using kp_remote_session_key 67. The client's write therefore does not match recv (1); it matches the next tagged recv that the server posts (2). The client ends up performing two writes using the same RKEY, which results in data corruption.
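The mismatch can be illustrated with a small sketch. The tag layout used here (session key shifted left by 16 bits, OR-ed with the MR key) is an assumption inferred from the logged values (klsk 66 and tmk 107 appear with tag 0x42006b), not taken from the kfilnd source. The helpers `make_tag` and `match_recv` are hypothetical, and treating the client's write as a tag lookup is a simplification of the real RMA path.

```python
# Assumed layout, inferred from the log: tag = (session key << 16) | MR key.
SESSION_KEY_SHIFT = 16  # assumption, not taken from the kfilnd source


def make_tag(session_key: int, mr_key: int) -> int:
    """Combine a session key and an MR key into one tag (hypothetical helper)."""
    return (session_key << SESSION_KEY_SHIFT) | mr_key


def match_recv(posted_tags: dict, write_tag: int):
    """Return the name of the posted recv whose tag equals write_tag, or None."""
    for name, tag in posted_tags.items():
        if tag == write_tag:
            return name
    return None


# Tags the server posted, as seen in the log.
recv1 = make_tag(66, 107)  # logged as 0x42006b
recv2 = make_tag(67, 107)  # logged as 0x43006b

# The client has already processed the HELLO, so both writes use session key 67.
write1 = make_tag(67, 107)
write2 = make_tag(67, 107)

# While only recv (1) is outstanding, write (1) has nothing to match.
early = match_recv({"recv1": recv1}, write1)
# Once recv (2) is posted, write (1) matches it instead of recv (1).
late = match_recv({"recv2": recv2}, write1)
```

Under this model both client writes resolve to the same target, which is consistent with write (1) completing against recv (2) and write (2) failing afterwards.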
Server posts recv (1) using kp_local_session_key 66 and tn_mr_key 107 / deletes peer with kp_local_session_key 66, creates peer with kp_local_session_key 67
00000800:40000000:17.0:1714513430.861113:0:2105922:0:(kfilnd_ep.c:360:kfilnd_ep_post_tagged_recv()) 5@kfi:5 Transaction ID 000000004bf22b7d: Posted tagged recv of 1048576 bytes (256 frags) with tag 0x42006b klsk 66 tmk 107: rc=0
00000800:40000000:17.0:1714513430.861115:0:2105922:0:(kfilnd_tn.c:299:kfilnd_tn_state_change()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000004bf22b7d: 5@kfi:5 -> 1@kfi(000000006aedb49e):0x0 TN_STATE_IDLE -> TN_STATE_TAGGED_RECV_POSTED state change
00000800:40000000:17.0:1714513430.861117:0:2105922:0:(kfilnd_tn.c:585:kfilnd_tn_state_tagged_recv_posted()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000004bf22b7d: 5@kfi:5 -> 1@kfi(000000006aedb49e):0x0 TN_EVENT_INIT_BULK event status 0 tmk 107 trr 5
00000800:40000000:17.0:1714513430.861120:0:2105922:0:(kfilnd_tn.c:592:kfilnd_tn_state_tagged_recv_posted()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000004bf22b7d: 5@kfi:5 -> 1@kfi(000000006aedb49e):0x0 Using peer 1@kfi(0x100000000000041)
00000800:40000000:17.0:1714513430.861123:0:2105922:0:(kfilnd_tn.c:299:kfilnd_tn_state_change()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000004bf22b7d: 5@kfi:5 -> 1@kfi(000000006aedb49e):0x0 TN_STATE_TAGGED_RECV_POSTED -> TN_STATE_WAIT_COMP state change
00000800:40000000:17.0:1714513430.861129:0:2105922:0:(kfilnd_peer.c:167:kfilnd_peer_put()) 1@kfi(000000006aedb49e):0x41 removed from peer cache
00000800:40000000:17.0:1714513430.861146:0:2105922:0:(kfilnd_peer.c:281:kfilnd_peer_get()) 1@kfi(00000000d640004c):0x42 peer entry allocated
00000800:40000000:17.0:1714513430.861148:0:2105922:0:(kfilnd_tn.c:299:kfilnd_tn_state_change()) KFILND_MSG_HELLO_REQ Transaction ID 000000002a74c314: 5@kfi:5 -> 1@kfi(00000000d640004c):0x0 TN_STATE_IDLE -> TN_STATE_IMM_SEND state change

Client processes hello / posts write (1) using kp_remote_session_key 67 (???) tn_mr_key 107
00000800:40000000:0.0F:1714513430.861967:0:630:0:(kfilnd_peer.c:363:kfilnd_peer_process_hello()) Peer 5@kfi(00000000e19bde77):0x2 version: 1; local version 1; negotiated version: 1
00000800:40000000:5.0:1714513430.861970:0:24112:0:(kfilnd.c:339:kfilnd_recv()) KFILND_MSG_INVALID Transaction ID 0000000005695d8a: 1@kfi:1 <- 5@kfi(00000000e19bde77):0x0 KFILND_MSG_BULK_GET_REQ in 1048576 bytes in 256 frags trmk 107 trr 5
00000800:40000000:0.0:1714513430.861970:0:630:0:(kfilnd_peer.c:373:kfilnd_peer_process_hello()) kp 5@kfi(00000000e19bde77):0x2 is up-to-date
00000800:40000000:5.0:1714513430.861992:0:24112:0:(kfilnd_ep.c:498:kfilnd_ep_post_write()) 1@kfi:1 Transaction ID 0000000005695d8a: Posted write of 1048576 bytes in 256 frags with key 0x6b to peer 0x500000000000002: rc=0
00000800:40000000:5.0:1714513430.861996:0:24112:0:(kfilnd_tn.c:299:kfilnd_tn_state_change()) KFILND_MSG_INVALID Transaction ID 0000000005695d8a: 1@kfi:1 <- 5@kfi(00000000e19bde77):0x0 TN_STATE_IMM_RECV -> TN_STATE_WAIT_TAG_RMA_COMP state change

Server TX_OK (1)
00000800:40000000:16.0:1714513430.862762:0:2105921:0:(kfilnd_tn.c:1087:kfilnd_tn_state_wait_comp()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000004bf22b7d: 5@kfi:5 -> 1@kfi(000000006aedb49e):0x0 TN_EVENT_TX_OK event status 0
00000800:40000000:16.0:1714513430.862764:0:2105921:0:(kfilnd_tn.c:299:kfilnd_tn_state_change()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000004bf22b7d: 5@kfi:5 -> 1@kfi(000000006aedb49e):0x0 TN_STATE_WAIT_COMP -> TN_STATE_WAIT_TAG_COMP state change

Server timeout (1)
00000800:40000000:16.0:1714513557.837113:0:2105921:0:(kfilnd_tn.c:1267:kfilnd_tn_state_wait_tag_comp()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000004bf22b7d: 5@kfi:5 -> 1@kfi(000000006aedb49e):0x0 TN_EVENT_TIMEOUT event status 0
00000800:40000000:16.0:1714513557.837120:0:2105921:0:(kfilnd_tn.c:299:kfilnd_tn_state_change()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000004bf22b7d: 5@kfi:5 -> 1@kfi(000000006aedb49e):0x0 TN_STATE_WAIT_TAG_COMP -> TN_STATE_WAIT_TIMEOUT_TAG_COMP state change
00000800:40000000:17.0:1714513557.837409:0:2105922:0:(kfilnd_tn.c:313:kfilnd_tn_status_update()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000004bf22b7d: 5@kfi:5 -> 1@kfi(000000006aedb49e):0x0 0 -> -110 status change
00000800:40000000:17.0:1714513557.837411:0:2105922:0:(kfilnd_tn.c:319:kfilnd_tn_status_update()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000004bf22b7d: 5@kfi:5 -> 1@kfi(000000006aedb49e):0x0 0 -> 10 health status change
00000800:40000000:17.0:1714513557.837418:0:2105922:0:(kfilnd_tn.c:1514:kfilnd_tn_free()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000004bf22b7d: 5@kfi:5 -> 1@kfi(000000006aedb49e):0x0 Transaction freed

Server posts recv (2) kp_local_session_key 67 tn_mr_key 107
00000800:40000000:17.0:1714513762.768860:0:1912050:0:(kfilnd_ep.c:360:kfilnd_ep_post_tagged_recv()) 5@kfi:5 Transaction ID 000000002cc27c20: Posted tagged recv of 1048576 bytes (256 frags) with tag 0x43006b klsk 67 tmk 107: rc=0
00000800:40000000:17.0:1714513762.768861:0:1912050:0:(kfilnd_tn.c:299:kfilnd_tn_state_change()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000002cc27c20: 5@kfi:5 -> 1@kfi(00000000d640004c):0x0 TN_STATE_IDLE -> TN_STATE_TAGGED_RECV_POSTED state change
00000800:40000000:17.0:1714513762.768863:0:1912050:0:(kfilnd_tn.c:585:kfilnd_tn_state_tagged_recv_posted()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000002cc27c20: 5@kfi:5 -> 1@kfi(00000000d640004c):0x0 TN_EVENT_INIT_BULK event status 0 tmk 107 trr 5
00000800:40000000:17.0:1714513762.768864:0:1912050:0:(kfilnd_tn.c:592:kfilnd_tn_state_tagged_recv_posted()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000002cc27c20: 5@kfi:5 -> 1@kfi(00000000d640004c):0x0 Using peer 1@kfi(0x100000000000042)
00000800:40000000:17.0:1714513762.768866:0:1912050:0:(kfilnd_tn.c:299:kfilnd_tn_state_change()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000002cc27c20: 5@kfi:5 -> 1@kfi(00000000d640004c):0x0 TN_STATE_TAGGED_RECV_POSTED -> TN_STATE_WAIT_COMP state change

Server TX_OK (2)
00000800:40000000:17.0:1714513762.769486:0:2105922:0:(kfilnd_tn.c:1087:kfilnd_tn_state_wait_comp()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000002cc27c20: 5@kfi:5 -> 1@kfi(00000000d640004c):0x0 TN_EVENT_TX_OK event status 0
00000800:40000000:17.0:1714513762.769488:0:2105922:0:(kfilnd_tn.c:299:kfilnd_tn_state_change()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000002cc27c20: 5@kfi:5 -> 1@kfi(00000000d640004c):0x0 TN_STATE_WAIT_COMP -> TN_STATE_WAIT_TAG_COMP state change

Client posts write (2) kp_remote_session_key 67 (???) tn_mr_key 107
00000800:40000000:5.0:1714513762.769733:0:24112:0:(kfilnd.c:339:kfilnd_recv()) KFILND_MSG_INVALID Transaction ID 00000000ff4c1a3b: 1@kfi:1 <- 5@kfi(00000000e19bde77):0x0 KFILND_MSG_BULK_GET_REQ in 1048576 bytes in 256 frags trmk 107 trr 5
00000800:40000000:5.0:1714513762.769743:0:24112:0:(kfilnd_ep.c:498:kfilnd_ep_post_write()) 1@kfi:1 Transaction ID 00000000ff4c1a3b: Posted write of 1048576 bytes in 256 frags with key 0x6b to peer 0x500000000000002: rc=0
00000800:40000000:5.0:1714513762.769745:0:24112:0:(kfilnd_tn.c:299:kfilnd_tn_state_change()) KFILND_MSG_INVALID Transaction ID 00000000ff4c1a3b: 1@kfi:1 <- 5@kfi(00000000e19bde77):0x0 TN_STATE_IMM_RECV -> TN_STATE_WAIT_TAG_RMA_COMP state change

Client TAG_TX_OK (1)
00000800:40000000:4.0:1714513762.769980:0:1810:0:(kfilnd_tn.c:1228:kfilnd_tn_state_wait_tag_rma_comp()) KFILND_MSG_INVALID Transaction ID 0000000005695d8a: 1@kfi:1 <- 5@kfi(00000000e19bde77):0x0 TN_EVENT_TAG_TX_OK event status 0 key 0x6b peer 0x500000000000002
00000800:40000000:4.0:1714513762.769984:0:1810:0:(kfilnd_tn.c:1514:kfilnd_tn_free()) KFILND_MSG_INVALID Transaction ID 0000000005695d8a: 1@kfi:1 <- 5@kfi(00000000e19bde77):0x0 Transaction freed

Server TAG_RX_OK (2)
00000800:40000000:17.0:1714513762.771187:0:2105922:0:(kfilnd_tn.c:1267:kfilnd_tn_state_wait_tag_comp()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000002cc27c20: 5@kfi:5 -> 1@kfi(00000000d640004c):0x0 TN_EVENT_TAG_RX_OK event status 0
00000800:40000000:17.0:1714513762.771212:0:2105922:0:(kfilnd_tn.c:1514:kfilnd_tn_free()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000002cc27c20: 5@kfi:5 -> 1@kfi(00000000d640004c):0x0 Transaction freed

Client TAG_TX_FAIL (2)
00000800:40000000:5.0:1714513762.780467:0:24112:0:(kfilnd_tn.c:1228:kfilnd_tn_state_wait_tag_rma_comp()) KFILND_MSG_INVALID Transaction ID 00000000ff4c1a3b: 1@kfi:1 <- 5@kfi(00000000e19bde77):0x0 TN_EVENT_TAG_TX_FAIL event status -5 key 0x6b peer 0x500000000000002
00000800:40000000:5.0:1714513762.780469:0:24112:0:(kfilnd_tn.c:313:kfilnd_tn_status_update()) KFILND_MSG_INVALID Transaction ID 00000000ff4c1a3b: 1@kfi:1 <- 5@kfi(00000000e19bde77):0x0 0 -> -5 status change
00000800:40000000:5.0:1714513762.780470:0:24112:0:(kfilnd_tn.c:319:kfilnd_tn_status_update()) KFILND_MSG_INVALID Transaction ID 00000000ff4c1a3b: 1@kfi:1 <- 5@kfi(00000000e19bde77):0x0 0 -> 7 health status change
00000800:40000000:5.0:1714513762.780475:0:24112:0:(kfilnd_tn.c:1514:kfilnd_tn_free()) KFILND_MSG_INVALID Transaction ID 00000000ff4c1a3b: 1@kfi:1 <- 5@kfi(00000000e19bde77):0x0 Transaction freed
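
The timing in the log supports this reading: write (1) stays outstanding long after recv (1) has timed out and only completes once recv (2) exists. A minimal sketch of that arithmetic follows, using timestamps copied from the log entries; the dictionary keys and variable names are illustrative labels, not kfilnd identifiers.

```python
from decimal import Decimal

# Timestamps in seconds, copied from the log entries in this ticket.
events = {
    "server_recv1_posted": Decimal("1714513430.861113"),
    "client_write1_posted": Decimal("1714513430.861992"),
    "server_recv1_timeout": Decimal("1714513557.837113"),
    "server_recv2_posted": Decimal("1714513762.768860"),
    "client_write1_tx_ok": Decimal("1714513762.769980"),
}

# How long recv (1) waited before the server gave up on it.
recv1_wait = events["server_recv1_timeout"] - events["server_recv1_posted"]

# How long write (1) stayed outstanding on the client.
write1_outstanding = events["client_write1_tx_ok"] - events["client_write1_posted"]

# Delay between recv (2) being posted and write (1) completing.
completion_gap = events["client_write1_tx_ok"] - events["server_recv2_posted"]

print(f"recv (1) waited {recv1_wait} s before timing out")
print(f"write (1) was outstanding for {write1_outstanding} s")
print(f"write (1) completed {completion_gap} s after recv (2) was posted")
```

Write (1) completes roughly a millisecond after recv (2) is posted, which fits the explanation that it matched recv (2) and left write (2) to fail with status -5.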