Details
Type: Bug
Resolution: Unresolved
Priority: Minor
Description
Setting debug_peer_on_timeout=1 on a client and then rebooting an LNet router triggers a kernel NULL pointer dereference on the client:
===
[Fri Oct 7 15:50:28 2022] LNetError: 246:0:(o2iblnd_cb.c:3044:kiblnd_rejected()) 172.27.243.18@o2ib240 rejected: o2iblnd fatal error
[Fri Oct 7 15:50:28 2022] Lustre: 3803:0:(client.c:2182:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1665150627/real 1665150627] req@00000000e4b7f0d8 x1746036694498560/t0(0) o400->stor10-MDT0003-mdc-ffff8a3a1e709800@172.27.1.33@o2ib1:12/10 lens 224/224 e 0 to 1 dl 1665151070 ref 1 fl Rpc:eXN/0/ffffffff rc 0/-1
[Fri Oct 7 15:50:28 2022] BUG: kernel NULL pointer dereference, address: 000000000000003
===
The client is running DDN Lustre 2.12.8-ddn9, but I suspect upstream is affected too.
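
For anyone checking whether another client hit the same sequence, here is a minimal sketch that scans a saved kernel log for a kiblnd_rejected() message followed by the NULL pointer dereference. The function name and regular expressions are hypothetical helpers written against the two log lines quoted above; they are not part of Lustre or any existing tool and may need adjusting for other log formats.

```python
import re

# Assumed pattern for the o2iblnd rejection line; captures the rejecting peer NID.
REJECTED = re.compile(r"kiblnd_rejected\(\)\)\s+(\S+@\S+)\s+rejected:")
# Assumed pattern for the kernel oops line; captures the address exactly as printed.
NULL_DEREF = re.compile(
    r"BUG: kernel NULL pointer dereference, address: ([0-9a-fA-F]+)"
)


def find_rejection_then_oops(lines):
    """Return (nid, address) for the first oops seen after a rejection, else None."""
    last_nid = None
    for line in lines:
        m = REJECTED.search(line)
        if m:
            # Remember the most recent peer that rejected a connection.
            last_nid = m.group(1)
            continue
        m = NULL_DEREF.search(line)
        if m and last_nid is not None:
            return last_nid, m.group(1)
    return None


# The two relevant lines from the log in this report.
sample = [
    "[Fri Oct 7 15:50:28 2022] LNetError: 246:0:(o2iblnd_cb.c:3044:kiblnd_rejected()) "
    "172.27.243.18@o2ib240 rejected: o2iblnd fatal error",
    "[Fri Oct 7 15:50:28 2022] BUG: kernel NULL pointer dereference, "
    "address: 000000000000003",
]
print(find_rejection_then_oops(sample))
```

The sketch only reports that the two messages appear in order; it does not show that the rejection caused the oops.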