[LU-2832] Test failure on test suite replay-ost-single test_3: ost_write failure -19 Created: 19/Feb/13  Updated: 05/Mar/13  Resolved: 05/Mar/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: Lustre 2.4.0

Type: Bug Priority: Blocker
Reporter: Maloo Assignee: Hongchao Zhang
Resolution: Fixed Votes: 0
Labels: HB, ldiskfs

Issue Links:
Related
is related to LU-2285 Test failure on replay-ost-single tes... Resolved
Severity: 3
Rank (Obsolete): 6858

 Description   

This issue was created by maloo for nasf <fan.yong@intel.com>

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/72d5ff8c-7a7c-11e2-b916-52540035b04c.

Logs show that:

5242880 bytes (5.2 MB) copied, 18.2551 s, 287 kB/s
tee: write error: Input/output error
replay-ost-single test_3: @@@@@@ FAIL: test_3 failed with 1
Trace dump:
...

21:50:35:Lustre: DEBUG MARKER: == replay-ost-single test 3: Fail OST during write, with verification == 21:50:28 (1361253028)
21:50:35:LustreError: 11-0: lustre-OST0000-osc-ffff88007bc74c00: Communicating with 10.10.17.29@tcp, operation ost_write failed with -19.
21:50:35:Lustre: lustre-OST0000-osc-ffff88007bc74c00: Connection to lustre-OST0000 (at 10.10.17.29@tcp) was lost; in progress operations using this service will wait for recovery to complete
21:50:47:Lustre: lustre-OST0000-osc-ffff88007bc74c00: Connection restored to lustre-OST0000 (at 10.10.17.29@tcp)
21:50:47:LustreError: 32117:0:(osc_request.c:1156:check_write_rcs()) Unexpected # bytes transferred: 1482752 (requested 741376)
21:50:47:LustreError: 32118:0:(osc_request.c:1156:check_write_rcs()) Unexpected # bytes transferred: 2097152 (requested 1048576)
21:50:48:Lustre: DEBUG MARKER: /usr/sbin/lctl mark replay-ost-single test_3: @@@@@@ FAIL: test_3 failed with 1
21:50:50:Lustre: DEBUG MARKER: replay-ost-single test_3: @@@@@@ FAIL: test_3 failed with 1
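
Two things stand out in the console log above: the return code -19 corresponds to ENODEV, and in both check_write_rcs() messages the transferred byte count is exactly twice the requested count. A minimal sketch to pull these out of the log follows. The helper name summarize and the two regular expressions are illustrative assumptions, not part of the Lustre test framework; the log lines are copied from this ticket.

```python
import errno
import re

# Console lines copied verbatim from the log above.
LOG_LINES = [
    "21:50:35:LustreError: 11-0: lustre-OST0000-osc-ffff88007bc74c00: Communicating with 10.10.17.29@tcp, operation ost_write failed with -19.",
    "21:50:47:LustreError: 32117:0:(osc_request.c:1156:check_write_rcs()) Unexpected # bytes transferred: 1482752 (requested 741376)",
    "21:50:47:LustreError: 32118:0:(osc_request.c:1156:check_write_rcs()) Unexpected # bytes transferred: 2097152 (requested 1048576)",
]

# Hypothetical patterns written for these two message shapes only.
RC_RE = re.compile(r"operation (\w+) failed with (-?\d+)")
BYTES_RE = re.compile(r"Unexpected # bytes transferred: (\d+) \(requested (\d+)\)")


def summarize(lines):
    """Return (failed_ops, mismatches) extracted from console log lines."""
    failed_ops = []   # (operation, rc, symbolic errno name)
    mismatches = []   # (transferred, requested, transferred / requested)
    for line in lines:
        m = RC_RE.search(line)
        if m:
            rc = int(m.group(2))
            failed_ops.append((m.group(1), rc, errno.errorcode.get(-rc, "?")))
        m = BYTES_RE.search(line)
        if m:
            transferred, requested = int(m.group(1)), int(m.group(2))
            mismatches.append((transferred, requested, transferred / requested))
    return failed_ops, mismatches


if __name__ == "__main__":
    ops, mismatches = summarize(LOG_LINES)
    for op, rc, name in ops:
        print(f"{op} failed with {rc} ({name})")
    for transferred, requested, ratio in mismatches:
        print(f"transferred {transferred}, requested {requested}, ratio {ratio}")
```

The 2x ratio suggests that the byte count of the first attempt and that of the resend were both charged to the same request.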



 Comments   
Comment by nasf (Inactive) [ 19/Feb/13 ]

Similar failures I have hit recently:

https://maloo.whamcloud.com/sub_tests/query?commit=Update+results&page=1&sub_test%5Bquery_bugs%5D=&sub_test%5Bstatus%5D=FAIL&sub_test%5Bsub_test_script_id%5D=7a217e64-3db2-11e0-80c0-52540025f9af&test_node%5Barchitecture_type_id%5D=&test_node%5Bdistribution_type_id%5D=&test_node%5Bfile_system_type_id%5D=&test_node%5Blustre_branch_id%5D=&test_node%5Bos_type_id%5D=&test_node_network%5Bnetwork_type_id%5D=&test_session%5Bquery_date%5D=&test_session%5Bquery_recent_period%5D=604800&test_session%5Btest_group%5D=&test_session%5Btest_host%5D=&test_session%5Buser_id%5D=&test_set%5Btest_set_script_id%5D=79cbec9c-3db2-11e0-80c0-52540025f9af&utf8=

Comment by Nathaniel Clark [ 25/Feb/13 ]

This failure occurs only with ldiskfs.

Comment by Hongchao Zhang [ 26/Feb/13 ]

The issue is caused by the reuse of the ptlrpc_bulk_desc when resending the request.
The patch is tracked at http://review.whamcloud.com/#change,5532
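
To illustrate how reusing a descriptor across a resend could produce the doubled byte counts seen in the log, here is a toy model. It is only a sketch under stated assumptions: BulkDesc, send and write_with_resend are hypothetical names, the model assumes the transferred-byte counter is accumulated per descriptor and not reset before the resend, and none of this is taken from the actual Lustre code or from the patch.

```python
from dataclasses import dataclass


@dataclass
class BulkDesc:
    """Hypothetical stand-in for a bulk descriptor; NOT the real ptlrpc_bulk_desc."""
    requested: int
    transferred: int = 0


def send(desc, fail=False):
    """Simulate one bulk transfer attempt.

    Assumption: the bytes moved are added to the descriptor's counter even
    when the attempt ultimately fails (for example, the OST goes away).
    Returns True if the attempt succeeded.
    """
    desc.transferred += desc.requested
    return not fail


def write_with_resend(requested, reuse_desc):
    """Fail the first attempt, resend, and return the final transferred count."""
    desc = BulkDesc(requested)
    if not send(desc, fail=True):            # first attempt: OST fails over
        if not reuse_desc:
            desc = BulkDesc(requested)       # fresh descriptor for the resend
        send(desc)                           # resend after recovery
    return desc.transferred


if __name__ == "__main__":
    for size in (741376, 1048576):
        print(size, write_with_resend(size, reuse_desc=True),
              write_with_resend(size, reuse_desc=False))
```

With reuse_desc=True the model reports 1482752 and 2097152 for the two request sizes in the log, which matches the "Unexpected # bytes transferred" values; with a fresh descriptor the counts equal the requested sizes.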

Comment by Peter Jones [ 05/Mar/13 ]

Landed for 2.4

Generated at Sat Feb 10 01:28:35 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.