Details
Type: Bug
Resolution: Not a Bug
Priority: Critical
Fix Version/s: None
Affects Version/s: Lustre 2.10.0
Labels: None
Severity: 2
Description
We have the patch from LU-8752 applied.
We are testing a Lustre 2.10.1 pre-release on a host with an mlx5 HCA. lnet_selftest fails, and mounting the filesystem produces the following errors:
[ 435.503071] mlx5_warn:mlx5_1:dump_cqe:257:(pid 4031): dump error cqe
[ 435.503072] 00000000 00000000 00000000 00000000
[ 435.503072] 00000000 00000000 00000000 00000000
[ 435.503073] 00000000 00000000 00000000 00000000
[ 435.503075] 00000000 9d005304 08000069 005878d2
[ 435.503078] LNet: 4031:0:(o2iblnd_cb.c:3475:kiblnd_complete()) RDMA (tx: ffffc90063356f28) failed: 4
[ 435.503292] LNet: 4029:0:(o2iblnd_cb.c:967:kiblnd_tx_complete()) Tx -> 10.151.20.103@o2ib cookie 0x67 sending 1 waiting 0: failed 5
[ 435.503295] LNet: 4029:0:(o2iblnd_cb.c:1919:kiblnd_close_conn_locked()) Closing conn to 10.151.20.103@o2ib: error -5(waiting)
[ 435.503304] LNet: 4029:0:(rpc.c:1413:srpc_lnet_ev_handler()) LNet event status -5 type 1, RPC errors 11
[ 435.503306] LNet: 4029:0:(rpc.c:1413:srpc_lnet_ev_handler()) Skipped 1 previous similar message
[ 435.503396] LNet: 4151:0:(rpc.c:1143:srpc_client_rpc_done()) Client RPC done: service 5, peer 12345-10.151.20.103@o2ib, status SWI_STATE_REQUEST_SUBMITTED:1:-4
[ 440.503751] LNet: 4152:0:(lib-move.c:830:lnet_post_send_locked()) Dropping message for 12345-10.151.20.103@o2ib: peer not alive
[ 440.503754] LNet: 4152:0:(lib-move.c:2827:LNetPut()) Error sending PUT to 12345-10.151.20.103@o2ib: -113
[ 440.503757] LNet: 4152:0:(rpc.c:1413:srpc_lnet_ev_handler()) LNet event status -113 type 5, RPC errors 16
[ 440.503758] LNet: 4152:0:(rpc.c:1413:srpc_lnet_ev_handler()) Skipped 4 previous similar messages
[ 440.503765] LNet: 4152:0:(rpc.c:1143:srpc_client_rpc_done()) Client RPC done: service 5, peer 12345-10.151.20.103@o2ib, status SWI_STATE_REQUEST_SUBMITTED:1:-4
[ 506.581347] LNet: 4173:0:(rpc.c:1069:srpc_client_rpc_expired()) Client RPC expired: service 11, peer 12345-10.151.20.103@o2ib, timeout 64.
[ 506.581363] LNet: 4147:0:(rpc.c:1143:srpc_client_rpc_done()) Client RPC done: service 11, peer 12345-10.151.20.103@o2ib, status SWI_STATE_REQUEST_SENT:1:-4
[ 506.581367] LustreError: 4147:0:(brw_test.c:344:brw_client_done_rpc()) BRW RPC to 12345-10.151.20.103@o2ib failed with -110
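For readers triaging a log like the one above, the negative status codes can be read as Linux errno values. The following is a minimal sketch, not part of the original report: the helper name `decode_errnos` and the small lookup table are hypothetical, and the table covers only the four codes that appear in this log.

```python
import re

# Linux errno numbers for the codes that appear in the log above.
# Only these four are listed; any other code is reported as "unknown".
LINUX_ERRNO = {
    4: "EINTR",
    5: "EIO",
    110: "ETIMEDOUT",
    113: "EHOSTUNREACH",
}

# A minus sign followed by digits, where the minus is not glued to a
# preceding word character or dot. This skips the "12345-10.151..." peer IDs.
NEG_CODE = re.compile(r"(?<![\w.])-(\d+)\b")


def decode_errnos(line):
    """Return (code, name) pairs for each negative status code in a log line."""
    pairs = []
    for match in NEG_CODE.finditer(line):
        number = int(match.group(1))
        pairs.append((-number, LINUX_ERRNO.get(number, "unknown")))
    return pairs


if __name__ == "__main__":
    sample = [
        "Closing conn to 10.151.20.103@o2ib: error -5(waiting)",
        "Error sending PUT to 12345-10.151.20.103@o2ib: -113",
        "BRW RPC to 12345-10.151.20.103@o2ib failed with -110",
    ]
    for line in sample:
        print(decode_errnos(line), "<-", line)
```

Read this way, the sequence in the log is an I/O error on the connection (-5), followed by the peer being treated as unreachable (-113), and finally the BRW test RPC timing out (-110). The positive values in the "failed: 4" and "failed 5" messages are not errno values and are deliberately left alone by this sketch.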