[LU-15185] sanityn test_77c: Error: 'dd (write) failed (2)' Created: 01/Nov/21 Updated: 01/Nov/21 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
This issue was created by maloo for Chris Horn <hornc@cray.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/a207aa40-96c4-434c-bd4d-eef570e96859

test_77c failed with the following error:

dd (write) failed (2)
trevis-54vm1: 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.00923709 s, 114 MB/s

Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:6332:error()
= /usr/lib64/lustre/tests/sanityn.sh:3735:nrs_write_read()
= /usr/lib64/lustre/tests/sanityn.sh:3794:orr_trr()
= /usr/lib64/lustre/tests/sanityn.sh:3819:test_77c()
= /usr/lib64/lustre/tests/test-framework.sh:6636:run_one()
= /usr/lib64/lustre/tests/test-framework.sh:6683:run_one_logged()
= /usr/lib64/lustre/tests/test-framework.sh:6524:run_test()
= /usr/lib64/lustre/tests/sanityn.sh:3824:main()
Dumping lctl log to /autotest/autotest-2/2021-10-29/lustre-reviews_review-dne-zfs-part-5_83998_1_14_304f7726-8602-463b-b32a-48a2b320195d//sanityn.test_77c.*.1635570109.log
CMD: trevis-54vm1.trevis.whamcloud.com,trevis-54vm2,trevis-54vm3,trevis-54vm4,trevis-54vm5 /usr/sbin/lctl dk > /autotest/autotest-2/2021-10-29/lustre-reviews_review-dne-zfs-part-5_83998_1_14_304f7726-8602-463b-b32a-48a2b320195d//sanityn.test_77c.debug_log.\$(hostname -s).1635570109.log; dmesg > /autotest/autotest-2/2021-10-29/lustre-reviews_review-dne-zfs-part-5_83998_1_14_304f7726-8602-463b-b32a-48a2b320195d//sanityn.test_77c.dmesg.\$(hostname -s).1635570109.log

Cluster hit some network errors. Not clear why:

[17983.580383] Lustre: DEBUG MARKER: declare -a pids_r;
for ((i = 0; i lustre-OST0003-osc-ffff8ae7a6f75800@10.9.6.102@tcp:17/18 lens 328/224 e 0 to 1 dl 1635570078 ref 2 fl Rpc:Xr/0/ffffffff rc 0/-1 job:'ldlm_bl_09.0'
[18040.721602] Lustre: lustre-OST0003-osc-ffff8ae7a6f75800: Connection to lustre-OST0003 (at 10.9.6.102@tcp) was lost; in progress operations using this service will wait for recovery to complete
[18057.099196] Lustre: 196580:0:(client.c:2290:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1635570036/real 0] req@00000000d56fcfbd x1715011554636160/t0(0) o400->lustre-OST0000-osc-ffff8ae7a6f75800@10.9.6.102@tcp:28/4 lens 224/224 e 0 to 1 dl 1635570093 ref 2 fl Rpc:XNr/0/ffffffff rc 0/-1 job:'kworker/u4:1.0'
[18057.104557] Lustre: 196580:0:(client.c:2290:ptlrpc_expire_one_request()) Skipped 1 previous similar message
[18057.106345] Lustre: lustre-OST0000-osc-ffff8ae7a6f75800: Connection to lustre-OST0000 (at 10.9.6.102@tcp) was lost; in progress operations using this service will wait for recovery to complete
[18067.338589] Lustre: 196579:0:(client.c:2290:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1635570046/real 0] req@00000000a0cd1f2a x1715011554638208/t0(0) o400->lustre-OST0000-osc-ffff8ae7a6f75800@10.9.6.102@tcp:28/4 lens 224/224 e 0 to 1 dl 1635570104 ref 2 fl Rpc:XNr/0/ffffffff rc 0/-1 job:'kworker/u4:1.0'
[18067.344444] Lustre: 196579:0:(client.c:2290:ptlrpc_expire_one_request()) Skipped 5 previous similar messages
[18067.346517] Lustre: lustre-OST0002-osc-ffff8ae7a6f75800: Connection to lustre-OST0002 (at 10.9.6.102@tcp) was lost; in progress operations using this service will wait for recovery to complete
[18067.349960] Lustre: Skipped 4 previous similar messages
[18067.838060] Lustre: Evicted from lustre-OST0000_UUID (at 10.9.6.102@tcp) after server handle changed from 0x44b76394139bb0ba to 0x44b76394139c0ce7
[18067.842456] LustreError: 167-0: lustre-OST0000-osc-ffff8ae7a6f75800: This client was evicted by lustre-OST0000; in progress operations using this service will fail.
[18067.850084] Lustre: 196577:0:(llite_lib.c:3360:ll_dirty_page_discard_warn()) lustre: dirty page discard: 10.9.6.103@tcp:/lustre/fid: [0x200000404:0xb32:0x0]// may get corrupted (rc -5)
[18067.850086] Lustre: 196579:0:(llite_lib.c:3360:ll_dirty_page_discard_warn()) lustre: dirty page discard: 10.9.6.103@tcp:/lustre/fid: [0x200000404:0xb33:0x0]// may get corrupted (rc -5)
[18067.869310] Lustre: lustre-OST0003-osc-ffff8ae7a6f75800: Connection restored to 10.9.6.102@tcp (at 10.9.6.102@tcp)
[18067.878592] LustreError: 488862:0:(ldlm_resource.c:1124:ldlm_resource_complain()) lustre-OST0002-osc-ffff8ae7a6f75800: namespace resource [0x184:0x0:0x0].0x0 (000000008626a371) refcount nonzero (1) after lock cleanup; forcing cleanup.
[18067.885709] LustreError: 488862:0:(ldlm_resource.c:1124:ldlm_resource_complain()) Skipped 3 previous similar messages
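
To make the sequence of connection events in the console log above easier to follow, the sketch below groups the "was lost", "evicted" and "Connection restored" messages by OST target. It is only an illustrative triage helper, not part of the Lustre test framework: the function name summarize_events and the regular expressions are assumptions written against the exact message wording quoted in this ticket, and they may need adjusting for other Lustre versions.

```python
import re
from collections import defaultdict

# Kernel timestamp prefix, e.g. "[18040.721602]".
_TS = r"\[\s*(?P<ts>\d+\.\d+)\]"
# Import name, e.g. "lustre-OST0003-osc-ffff8ae7a6f75800:";
# only the "lustre-OST0003" part is captured as the target.
_TARGET = r"(?P<target>\S+?)-osc-\S+:"

# Hypothetical patterns, modelled on the messages quoted in this ticket.
EVENT_PATTERNS = {
    "lost": re.compile(
        _TS + r" Lustre: " + _TARGET + r" Connection to \S+ \(at \S+\) was lost"
    ),
    "evicted": re.compile(
        _TS + r" LustreError: 167-0: " + _TARGET + r" This client was evicted by"
    ),
    "restored": re.compile(
        _TS + r" Lustre: " + _TARGET + r" Connection restored to"
    ),
}


def summarize_events(lines):
    """Return {target: [(timestamp, event), ...]} in log order."""
    events = defaultdict(list)
    for line in lines:
        for name, pattern in EVENT_PATTERNS.items():
            match = pattern.search(line)
            if match:
                events[match.group("target")].append(
                    (float(match.group("ts")), name)
                )
                break  # a line carries at most one of these events
    return dict(events)
```

Passing the lines of a saved dmesg log (for example, open(path).read().splitlines()) to summarize_events gives a per-target timeline, which shows quickly which OSTs were only disconnected and which ones evicted the client.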
VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV |