Details
Type: Bug
Resolution: Fixed
Priority: Minor
Description
Before LND startup is complete, there is a window of time during which o2iblnd can reject incoming connection requests with errors similar to the following:
Nov 16 08:24:18 ai400x2vm-008 kernel: LNetError: 7758:0:(o2iblnd_cb.c:2480:kiblnd_passive_connect()) Can't accept conn from 172.16.16.12@o2ib on NA (ib0:0:172.16.0.192): bad dst nid 172.16.0.192@o2ib
Nov 16 08:24:19 ai400x2vm-008 kernel: LNetError: 7758:0:(o2iblnd_cb.c:2480:kiblnd_passive_connect()) Can't accept conn from 172.16.16.187@o2ib on NA (ib0:0:172.16.0.192): bad dst nid 172.16.0.192@o2ib
Nov 16 08:24:19 ai400x2vm-008 kernel: LNetError: 7758:0:(o2iblnd_cb.c:2480:kiblnd_passive_connect()) Skipped 54 previous similar messages
Nov 16 08:24:19 ai400x2vm-008 kernel: LNet: Added LNI 172.16.0.192@o2ib [32/5120/0/180]
Nov 16 08:24:19 ai400x2vm-008 kernel: LNet: Using FastReg for registration
Nov 16 08:24:20 ai400x2vm-008 kernel: LNetError: 7758:0:(o2iblnd_cb.c:2480:kiblnd_passive_connect()) Can't accept conn from 172.16.0.58@o2ib on NA (ib0:1:172.16.0.192): bad dst nid 172.16.0.192@o2ib
Nov 16 08:24:20 ai400x2vm-008 kernel: LNetError: 7758:0:(o2iblnd_cb.c:2480:kiblnd_passive_connect()) Skipped 180 previous similar messages
Nov 16 08:24:20 ai400x2vm-008 kernel: LNet: Added LNI 172.16.16.192@o2ib [32/5120/0/180]
Look into getting rid of this race condition.
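The window described above can be pictured with a toy model. This is only a sketch under the assumption that the listener starts accepting requests before the local NID has been registered; it is not the o2iblnd code, and `ToyNode`, `racy_startup`, and `ordered_startup` are hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Hypothetical stand-in for a node bringing up a network interface."""
    listening: bool = False
    nids: set = field(default_factory=set)

    def start_listener(self):
        self.listening = True

    def add_nid(self, nid):
        self.nids.add(nid)

    def passive_connect(self, src, dst):
        """Return a verdict string for an incoming connection request."""
        if not self.listening:
            return "no listener"
        if dst not in self.nids:
            # Mirrors the shape of the "bad dst nid" message in the logs.
            return f"reject {src}: bad dst nid {dst}"
        return f"accept {src}"


def racy_startup(node, nid, early_peer):
    # Assumed ordering: the listener is up before the NID is registered,
    # so a request arriving in between is rejected.
    node.start_listener()
    verdict = node.passive_connect(early_peer, nid)
    node.add_nid(nid)
    return verdict


def ordered_startup(node, nid, early_peer):
    # Alternative ordering: register the NID first, then start listening.
    node.add_nid(nid)
    node.start_listener()
    return node.passive_connect(early_peer, nid)
```

In this model, `racy_startup` returns a rejection for a peer that connects inside the window, while `ordered_startup` accepts the same peer. Whether reordering is how the race was actually resolved is not stated in this ticket.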
Issue Links
- is related to LU-17071 o2iblnd: Oops caused by IBLND_REJECT_EARLY code (Resolved)