[LU-16066] replay-dual test_30: timeout, cannot cleanup orphans: rc = -107
Created: 02/Aug/22 Updated: 21/Jun/23 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.16.0, Lustre 2.15.1, Lustre 2.15.2, Lustre 2.15.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | failing_tests |
| Severity: | 3 |
| Description |
|
This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/abd05f73-e3f1-4f4d-8d6b-70923bb58dd5

test_30 failed with the following error:

Timeout occurred after 119 minutes, last suite running was replay-dual

This may be related to LU-15657.

MDS dmesg shows:

[Mon Jul 18 05:59:21 2022] Lustre: 6599:0:(client.c:2295:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1658123361/real 1658123361] req@000000008b9f0b5d x1738664899455104/t0(0) o6->lustre-OST0002-osc-MDT0000@10.240.28.246@tcp:28/4 lens 544/432 e 20 to 1 dl 1658123962 ref 1 fl Rpc:XQr/0/ffffffff rc 0/-1 job:'osp-syn-2-0.0'
[Mon Jul 18 05:59:21 2022] Lustre: lustre-OST0001-osc-MDT0000: Connection to lustre-OST0001 (at 10.240.28.246@tcp) was lost; in progress operations using this service will wait for recovery to complete
[Mon Jul 18 05:59:21 2022] Lustre: 6599:0:(client.c:2295:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
[Mon Jul 18 05:59:21 2022] Lustre: lustre-OST0002-osc-MDT0000: Connection restored to (at 10.240.28.246@tcp)
[Mon Jul 18 05:59:22 2022] Lustre: lustre-OST0005-osc-MDT0000: Connection to lustre-OST0005 (at 10.240.28.246@tcp) was lost; in progress operations using this service will wait for recovery to complete
[Mon Jul 18 05:59:22 2022] Lustre: Skipped 1 previous similar message
[Mon Jul 18 05:59:22 2022] Lustre: lustre-OST0003-osc-MDT0000: Connection restored to (at 10.240.28.246@tcp)
[Mon Jul 18 05:59:22 2022] Lustre: Skipped 1 previous similar message
[Mon Jul 18 05:59:35 2022] LustreError: 199852:0:(osp_precreate.c:966:osp_precreate_cleanup_orphans()) lustre-OST0005-osc-MDT0000: cannot cleanup orphans: rc = -107
[Mon Jul 18 05:59:35 2022] Lustre: lustre-OST0000-osc-MDT0000: Connection to lustre-OST0000 (at 10.240.28.246@tcp) was lost; in progress operations using this service will wait for recovery to complete
[Mon Jul 18 05:59:35 2022] LustreError: 199852:0:(osp_precreate.c:966:osp_precreate_cleanup_orphans()) Skipped 9 previous similar messages
[Mon Jul 18 05:59:35 2022] Lustre: Skipped 2 previous similar messages
[Mon Jul 18 05:59:35 2022] Lustre: lustre-OST0000-osc-MDT0000: Connection restored to (at 10.240.28.246@tcp)
[Mon Jul 18 05:59:35 2022] Lustre: Skipped 3 previous similar messages
[Mon Jul 18 06:09:22 2022] Lustre: 6599:0:(client.c:2295:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1658123963/real 1658123963] req@000000008b9f0b5d x1738664899455104/t0(0) o6->lustre-OST0002-osc-MDT0000@10.240.28.246@tcp:28/4 lens 544/432 e 20 to 1 dl 1658124564 ref 1 fl Rpc:XQr/2/ffffffff rc 0/-1 job:'osp-syn-2-0.0'
[Mon Jul 18 06:09:22 2022] Lustre: lustre-OST0001-osc-MDT0000: Connection to lustre-OST0001 (at 10.240.28.246@tcp) was lost; in progress operations using this service will wait for recovery to complete
[Mon Jul 18 06:09:22 2022] Lustre: 6599:0:(client.c:2295:ptlrpc_expire_one_request()) Skipped 47 previous similar messages
[Mon Jul 18 06:09:22 2022] Lustre: lustre-OST0001-osc-MDT0000: Connection restored to (at 10.240.28.246@tcp)
[Mon Jul 18 06:12:22 2022] Lustre: lustre-OST0000-osc-MDT0000: Connection to lustre-OST0000 (at 10.240.28.246@tcp) was lost; in progress operations using this service will wait for recovery to complete

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV |
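As a triage aid, the two ptlrpc_expire_one_request() messages in the dmesg output can be compared field by field. The sketch below is not part of Lustre or Maloo: the helper name parse_expired_request and the regular expression are hypothetical, and they assume the message layout seen in this excerpt, with "sent" and "dl" read as epoch seconds for the send time and the request deadline. The two sample strings are trimmed copies of the messages in this ticket.

```python
import re

# Trimmed copies of the two "timed out for slow reply" messages from this ticket.
LOG_LINES = [
    "[sent 1658123361/real 1658123361] req@000000008b9f0b5d "
    "x1738664899455104/t0(0) o6->lustre-OST0002-osc-MDT0000@10.240.28.246@tcp:28/4 "
    "lens 544/432 e 20 to 1 dl 1658123962 ref 1 fl Rpc:XQr/0/ffffffff rc 0/-1",
    "[sent 1658123963/real 1658123963] req@000000008b9f0b5d "
    "x1738664899455104/t0(0) o6->lustre-OST0002-osc-MDT0000@10.240.28.246@tcp:28/4 "
    "lens 544/432 e 20 to 1 dl 1658124564 ref 1 fl Rpc:XQr/2/ffffffff rc 0/-1",
]

# Assumed layout: [sent S/real R] ... xXID/t.. oOPC->TARGET@NID ... dl DEADLINE ...
PATTERN = re.compile(
    r"\[sent (?P<sent>\d+)/real \d+\].*?"
    r"x(?P<xid>\d+)/t\d+\(\d+\) o(?P<opc>\d+)->(?P<target>[^@\s]+)@.*?"
    r" dl (?P<dl>\d+) "
)


def parse_expired_request(line):
    """Return xid, opcode, target, sent, dl and (dl - sent), or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    sent = int(match.group("sent"))
    deadline = int(match.group("dl"))
    return {
        "xid": match.group("xid"),
        "opc": int(match.group("opc")),
        "target": match.group("target"),
        "sent": sent,
        "dl": deadline,
        "window": deadline - sent,  # seconds between send time and deadline
    }


if __name__ == "__main__":
    for line in LOG_LINES:
        info = parse_expired_request(line)
        print(info["xid"], info["target"], "dl - sent =", info["window"], "s")
```

For the two messages above, this reports the same xid (1738664899455104) and the same target (lustre-OST0002-osc-MDT0000) both times, with dl - sent equal to 601 seconds in each case. The second send time (1658123963) is one second after the first deadline (1658123962), which is consistent with the same request being resent once its previous attempt expired.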
| Comments |
| Comment by Chris Horn [ 12/May/23 ] |
|
+1 on master - https://testing.whamcloud.com/test_sets/e92cca2e-39ea-459e-9528-c4f2226cf840 |
| Comment by Andreas Dilger [ 13/Jun/23 ] |
|
Failed 17x on master and b2_15 in the past week. |