Details
Bug
Resolution: Fixed
Minor
Description
When two kfilnd peers try to talk to each other simultaneously and for the first time, their hello exchanges can race, and one side can end up dropping a message from the other.
gaza sends hello to cassb:
00000800:00000200:17.0:1686768092.809471:0:3151313:0:(kfilnd_tn.c:299:kfilnd_tn_state_change()) KFILND_MSG_HELLO_REQ Transaction ID 000000006c812243: 17@kfi:1 -> 0@kfi(0000000057925e46):0x0 TN_STATE_IDLE -> TN_STATE_IMM_SEND state change
00000800:00000200:15.0:1686768092.809484:0:2429020:0:(kfilnd_tn.c:912:kfilnd_tn_state_imm_send()) KFILND_MSG_HELLO_REQ Transaction ID 000000006c812243: 17@kfi:1 -> 0@kfi(0000000057925e46):0x0 TN_EVENT_TX_OK event status 0
cassb sends hello to gaza:
00000800:00000200:2.0:1686768092.808477:0:21655:0:(kfilnd_tn.c:649:kfilnd_tn_state_idle()) KFILND_MSG_HELLO_REQ Transaction ID 00000000aa9fce91: 0@kfi:1 -> 17@kfi(00000000315faae4):0x0 TN_EVENT_TX_HELLO event status 0
00000800:00000200:21.0:1686768092.808517:0:14623:0:(kfilnd_tn.c:912:kfilnd_tn_state_imm_send()) KFILND_MSG_HELLO_REQ Transaction ID 00000000aa9fce91: 0@kfi:1 -> 17@kfi(00000000315faae4):0x0 TN_EVENT_TX_OK event status 0
At the same time, cassb receives the hello from gaza and marks the peer up to date:
00000800:00000200:16.0:1686768092.808478:0:17916:0:(kfilnd_tn.c:649:kfilnd_tn_state_idle()) KFILND_MSG_INVALID Transaction ID 00000000318c963d: 0@kfi:0 <- 17@kfi(00000000315faae4):0x0 TN_EVENT_RX_HELLO event status 0
00000800:00000200:16.0:1686768092.808483:0:17916:0:(kfilnd_peer.c:368:kfilnd_peer_process_hello()) kp 17@kfi(00000000315faae4):0x0 is up-to-date
Since the peer is up to date, cassb sends new transactions:
00000800:00000200:2.0:1686768092.808500:0:21655:0:(kfilnd_tn.c:299:kfilnd_tn_state_change()) KFILND_MSG_IMMEDIATE Transaction ID 0000000012d400f7: 0@kfi:1 -> 17@kfi(00000000315faae4):0x0 TN_STATE_IDLE -> TN_STATE_IMM_SEND state change
00000800:00000200:16.0:1686768092.808599:0:17916:0:(kfilnd_tn.c:299:kfilnd_tn_state_change()) KFILND_MSG_IMMEDIATE Transaction ID 00000000967f4305: 0@kfi:0 -> 17@kfi(00000000315faae4):0x0 TN_STATE_IDLE -> TN_STATE_IMM_SEND state change
gaza has not yet processed the hello response from cassb, so it drops one of these messages:
00000800:00000200:15.0:1686768092.809536:0:2429020:0:(kfilnd_tn.c:649:kfilnd_tn_state_idle()) KFILND_MSG_INVALID Transaction ID 00000000448b76f9: 17@kfi:1 <- 0@kfi(0000000057925e46):0x0 TN_EVENT_RX_HELLO event status 0
00000800:00000100:16.0:1686768092.809538:0:2428909:0:(kfilnd_tn.c:794:kfilnd_tn_state_idle()) Transaction ID 0000000098e1bbe3: 17@kfi:1 <- 0@kfi(0000000057925e46):0x0 Dropping message from 0@kfi due to stale peer
00000800:00000200:15.0:1686768092.809540:0:2429020:0:(kfilnd_peer.c:368:kfilnd_peer_process_hello()) kp 0@kfi(0000000057925e46):0x1 is up-to-date
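The ordering seen in the logs can be replayed with a small toy model. This is only an illustrative sketch in Python, not the kfilnd code: the `Node` class, its fields, and the `replay()` helper are all hypothetical names, and the only rule modelled is the one the logs suggest, namely that a non-hello message from a peer whose hello has not been processed yet is dropped as coming from a stale peer.

```python
from dataclasses import dataclass, field

HELLO = "HELLO"
IMMEDIATE = "IMMEDIATE"


@dataclass
class Node:
    """Hypothetical stand-in for one side of the exchange (not a kfilnd type)."""
    name: str
    up_to_date: set = field(default_factory=set)   # peers whose hello was processed
    delivered: list = field(default_factory=list)  # accepted (sender, payload) pairs
    dropped: list = field(default_factory=list)    # rejected (sender, payload) pairs

    def receive(self, sender, kind, payload=None):
        if kind == HELLO:
            # Processing a hello marks the sending peer up to date.
            self.up_to_date.add(sender)
            return
        if sender not in self.up_to_date:
            # Assumed rule: traffic from a stale peer is dropped.
            self.dropped.append((sender, payload))
            return
        self.delivered.append((sender, payload))


def replay():
    """Replay the event order from the logs above."""
    gaza, cassb = Node("gaza"), Node("cassb")
    # Both sides have a hello in flight; cassb processes gaza's hello first.
    cassb.receive("gaza", HELLO)
    # cassb now treats gaza as up to date and starts sending immediately.
    assert "gaza" in cassb.up_to_date
    # gaza handles the immediate message before it has processed cassb's hello.
    gaza.receive("cassb", IMMEDIATE, "msg-1")
    # Only now does gaza process the hello from cassb.
    gaza.receive("cassb", HELLO)
    # Later traffic is accepted.
    gaza.receive("cassb", IMMEDIATE, "msg-2")
    return gaza, cassb


if __name__ == "__main__":
    gaza, _ = replay()
    print("dropped:", gaza.dropped)
    print("delivered:", gaza.delivered)
```

In this model the first immediate message is lost only because of the order in which gaza handles the two incoming events; swapping the first two `gaza.receive()` calls makes both messages deliverable.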