Details
Bug
Resolution: Unresolved
Minor
Description
This issue was created by maloo for eaujames <eaujames@ddn.com>
This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/a5dba607-7817-42ca-9075-4c7880f9082c
test_220 failed with the following error:
Timeout occurred after 369 minutes, last suite running was sanity-lnet
Test session details:
clients: https://build.whamcloud.com/job/lustre-reviews/97846 - 4.18.0-425.10.1.el8_7.aarch64
servers: https://build.whamcloud.com/job/lustre-reviews/97846 - 4.18.0-477.15.1.el8_lustre.x86_64
The route to tcp2 goes down during lnet_selftest:
[Thu Sep 7 20:51:51 2023] Lustre: DEBUG MARKER: Start LST rw
[Thu Sep 7 20:51:51 2023] LNet: 1042010:0:(rpc.c:641:srpc_service_add_buffers()) waiting for adding buffer
[Thu Sep 7 20:51:51 2023] LNet: 943043:0:(rpc.c:641:srpc_service_add_buffers()) waiting for adding buffer
[Thu Sep 7 20:52:05 2023] LNetError: 1068901:0:(lib-lnet.h:1305:lnet_set_route_aliveness()) route to tcp2 through 10.240.44.207@tcp1 has gone from up to down
[Thu Sep 7 20:52:05 2023] LNetError: 1068901:0:(lib-lnet.h:1305:lnet_set_route_aliveness()) Skipped 1 previous similar message
[Thu Sep 7 20:52:06 2023] LNetError: 943043:0:(lib-move.c:2341:lnet_handle_find_routed_path()) no route to 10.240.45.24@tcp2 from <?>
[Thu Sep 7 20:52:06 2023] Lustre: DEBUG MARKER: lst stop brw_rw
[Thu Sep 7 20:52:07 2023] Lustre: DEBUG MARKER: lst stop brw_rw
[Thu Sep 7 20:52:07 2023] Lustre: DEBUG MARKER: Stop LST rw
[Thu Sep 7 20:52:07 2023] LNetError: 1042010:0:(lib-move.c:2341:lnet_handle_find_routed_path()) no route to 10.240.45.24@tcp2 from 10.240.44.206@tcp1
[Thu Sep 7 20:52:07 2023] LNetError: 1042010:0:(lib-move.c:2341:lnet_handle_find_routed_path()) Skipped 1 previous similar message
[Thu Sep 7 20:52:07 2023] LustreError: 1042010:0:(brw_test.c:388:brw_server_rpc_done()) Bulk transfer from 12345-10.240.45.24@tcp2 has failed: -113
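For anyone triaging similar runs, the route state change can be pulled out of a pasted log like the one above with a small script. This is only a rough sketch: the helper names split_entries and route_transitions are made up for illustration and are not part of Lustre, Maloo, or any test framework, and the patterns assume the timestamp and message formats seen in this excerpt.

```python
import re

# Matches a dmesg-style timestamp such as "[Thu Sep 7 20:52:05 2023]".
# Assumption: every entry in the pasted log starts with this format.
TIMESTAMP = re.compile(r"\[(\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4})\]")

# Message text logged by lnet_set_route_aliveness() in the excerpt above.
ROUTE_CHANGE = re.compile(
    r"route to (\S+) through (\S+) has gone from (\w+) to (\w+)")


def split_entries(blob):
    """Split a run-together kernel log into (timestamp, message) pairs."""
    parts = TIMESTAMP.split(blob)
    # parts[0] is any text before the first timestamp; after that,
    # timestamps and messages alternate.
    return [(ts, msg.strip()) for ts, msg in zip(parts[1::2], parts[2::2])]


def route_transitions(entries):
    """Return (timestamp, network, gateway, old_state, new_state) tuples."""
    found = []
    for ts, msg in entries:
        match = ROUTE_CHANGE.search(msg)
        if match:
            found.append((ts,) + match.groups())
    return found


# A shortened copy of the log from this ticket, joined on one line.
sample = (
    "[Thu Sep 7 20:51:51 2023] Lustre: DEBUG MARKER: Start LST rw "
    "[Thu Sep 7 20:52:05 2023] LNetError: 1068901:0:(lib-lnet.h:1305:"
    "lnet_set_route_aliveness()) route to tcp2 through 10.240.44.207@tcp1 "
    "has gone from up to down "
    "[Thu Sep 7 20:52:07 2023] LustreError: 1042010:0:(brw_test.c:388:"
    "brw_server_rpc_done()) Bulk transfer from 12345-10.240.45.24@tcp2 "
    "has failed: -113"
)

entries = split_entries(sample)
transitions = route_transitions(entries)
for transition in transitions:
    print(transition)
```

On the shortened sample this reports a single transition, the tcp2 route through 10.240.44.207@tcp1 going from up to down at 20:52:05, which is about two seconds before the bulk transfer failure.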
VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
sanity-lnet test_220 - Timeout occurred after 369 minutes, last suite running was sanity-lnet