Details
Bug
Resolution: Unresolved
Major
Lustre 2.16.0, Lustre 2.15.1, Lustre 2.15.2, Lustre 2.15.3
Description
This issue was created by maloo for sarah <sarah@whamcloud.com>
This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/abd05f73-e3f1-4f4d-8d6b-70923bb58dd5
test_30 failed with the following error:
Timeout occurred after 119 minutes, last suite running was replay-dual
This may be related to LU-15657.
MDS dmesg shows:
[Mon Jul 18 05:59:21 2022] Lustre: 6599:0:(client.c:2295:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1658123361/real 1658123361] req@000000008b9f0b5d x1738664899455104/t0(0) o6->lustre-OST0002-osc-MDT0000@10.240.28.246@tcp:28/4 lens 544/432 e 20 to 1 dl 1658123962 ref 1 fl Rpc:XQr/0/ffffffff rc 0/-1 job:'osp-syn-2-0.0'
[Mon Jul 18 05:59:21 2022] Lustre: lustre-OST0001-osc-MDT0000: Connection to lustre-OST0001 (at 10.240.28.246@tcp) was lost; in progress operations using this service will wait for recovery to complete
[Mon Jul 18 05:59:21 2022] Lustre: 6599:0:(client.c:2295:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
[Mon Jul 18 05:59:21 2022] Lustre: lustre-OST0002-osc-MDT0000: Connection restored to (at 10.240.28.246@tcp)
[Mon Jul 18 05:59:22 2022] Lustre: lustre-OST0005-osc-MDT0000: Connection to lustre-OST0005 (at 10.240.28.246@tcp) was lost; in progress operations using this service will wait for recovery to complete
[Mon Jul 18 05:59:22 2022] Lustre: Skipped 1 previous similar message
[Mon Jul 18 05:59:22 2022] Lustre: lustre-OST0003-osc-MDT0000: Connection restored to (at 10.240.28.246@tcp)
[Mon Jul 18 05:59:22 2022] Lustre: Skipped 1 previous similar message
[Mon Jul 18 05:59:35 2022] LustreError: 199852:0:(osp_precreate.c:966:osp_precreate_cleanup_orphans()) lustre-OST0005-osc-MDT0000: cannot cleanup orphans: rc = -107
[Mon Jul 18 05:59:35 2022] Lustre: lustre-OST0000-osc-MDT0000: Connection to lustre-OST0000 (at 10.240.28.246@tcp) was lost; in progress operations using this service will wait for recovery to complete
[Mon Jul 18 05:59:35 2022] LustreError: 199852:0:(osp_precreate.c:966:osp_precreate_cleanup_orphans()) Skipped 9 previous similar messages
[Mon Jul 18 05:59:35 2022] Lustre: Skipped 2 previous similar messages
[Mon Jul 18 05:59:35 2022] Lustre: lustre-OST0000-osc-MDT0000: Connection restored to (at 10.240.28.246@tcp)
[Mon Jul 18 05:59:35 2022] Lustre: Skipped 3 previous similar messages
[Mon Jul 18 06:09:22 2022] Lustre: 6599:0:(client.c:2295:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1658123963/real 1658123963] req@000000008b9f0b5d x1738664899455104/t0(0) o6->lustre-OST0002-osc-MDT0000@10.240.28.246@tcp:28/4 lens 544/432 e 20 to 1 dl 1658124564 ref 1 fl Rpc:XQr/2/ffffffff rc 0/-1 job:'osp-syn-2-0.0'
[Mon Jul 18 06:09:22 2022] Lustre: lustre-OST0001-osc-MDT0000: Connection to lustre-OST0001 (at 10.240.28.246@tcp) was lost; in progress operations using this service will wait for recovery to complete
[Mon Jul 18 06:09:22 2022] Lustre: 6599:0:(client.c:2295:ptlrpc_expire_one_request()) Skipped 47 previous similar messages
[Mon Jul 18 06:09:22 2022] Lustre: lustre-OST0001-osc-MDT0000: Connection restored to (at 10.240.28.246@tcp)
[Mon Jul 18 06:12:22 2022] Lustre: lustre-OST0000-osc-MDT0000: Connection to lustre-OST0000 (at 10.240.28.246@tcp) was lost; in progress operations using this service will wait for recovery to complete
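For triage, the two ptlrpc_expire_one_request() entries in the dmesg above can be compared with a small script. The following is only a sketch: the function name `parse_expired` and the regular expression are illustrative assumptions, not part of Lustre or Maloo, and the sample lines are trimmed after the `dl` field. It pulls out the request xid, the target, and the `sent` and `dl` (deadline) timestamps so the timeout window of each attempt can be checked.

```python
import re

# Two timed-out request entries copied from the MDS dmesg, trimmed after "ref 1".
LOG = (
    "[Mon Jul 18 05:59:21 2022] Lustre: 6599:0:(client.c:2295:ptlrpc_expire_one_request()) "
    "@@@ Request sent has timed out for slow reply: [sent 1658123361/real 1658123361] "
    "req@000000008b9f0b5d x1738664899455104/t0(0) "
    "o6->lustre-OST0002-osc-MDT0000@10.240.28.246@tcp:28/4 lens 544/432 e 20 to 1 "
    "dl 1658123962 ref 1\n"
    "[Mon Jul 18 06:09:22 2022] Lustre: 6599:0:(client.c:2295:ptlrpc_expire_one_request()) "
    "@@@ Request sent has timed out for slow reply: [sent 1658123963/real 1658123963] "
    "req@000000008b9f0b5d x1738664899455104/t0(0) "
    "o6->lustre-OST0002-osc-MDT0000@10.240.28.246@tcp:28/4 lens 544/432 e 20 to 1 "
    "dl 1658124564 ref 1\n"
)

# Hypothetical pattern written for the line layout seen in this ticket only.
EXPIRED = re.compile(
    r"\[sent (\d+)/real \d+\] req@\w+ (x\d+)/t\d+\(\d+\) "
    r"o(\d+)->(\S+?)@\S+ .*?\bdl (\d+)\b"
)


def parse_expired(log):
    """Return one record per timed-out request line found in `log`."""
    records = []
    for line in log.splitlines():
        match = EXPIRED.search(line)
        if match is None:
            continue  # not a timed-out request entry
        sent, xid, opcode, target, deadline = match.groups()
        records.append({
            "xid": xid,
            "opcode": int(opcode),
            "target": target,
            "sent": int(sent),
            "deadline": int(deadline),
            # seconds between sending the request and its deadline
            "window": int(deadline) - int(sent),
        })
    return records


if __name__ == "__main__":
    for rec in parse_expired(LOG):
        print(rec["xid"], rec["target"], rec["sent"], rec["window"])
```

On these two entries the xid is the same (x1738664899455104, sent to lustre-OST0002-osc-MDT0000), and each attempt has a 601-second window between `sent` and `dl`, which suggests the same request is being resent and timing out again rather than a new request failing each time.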
VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
replay-dual test_30 - Timeout occurred after 119 minutes, last suite running was replay-dual