[LU-2817] Failure on test suite replay-ost-single test_5: iozone failed
Created: 15/Feb/13
Updated: 05/Jun/13
Resolved: 12/Mar/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: None

Type: Bug
Priority: Blocker
Reporter: Maloo
Assignee: nasf (Inactive)
Resolution: Cannot Reproduce
Votes: 0
Labels: HB

Issue Links:
Related
is related to LU-3438 replay-ost-single test_5 failed with ... Resolved
Severity: 3
Rank (Obsolete): 6821

 Description   

This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/f3358f28-755c-11e2-bf59-52540035b04c.

The sub-test test_5 failed with the following error:

iozone failed

test log:

mount facets: ost1
CMD: client-16vm4 test -b /dev/lvm-OSS/P1
Starting ost1:   /dev/lvm-OSS/P1 /mnt/ost1
CMD: client-16vm4 mkdir -p /mnt/ost1; mount -t lustre   		                   /dev/lvm-OSS/P1 /mnt/ost1
CMD: client-16vm4 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/openmpi/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin: NAME=autotest_config sh rpc.sh set_default_debug \"vfstrace rpctrace dlmtrace neterror ha config ioctl super\" \"all -lnet -lnd -pinger\" 2 
CMD: client-16vm4 e2label /dev/lvm-OSS/P1 2>/dev/null
Started lustre-OST0000
fsync: Input/output error
1048576       4
iozone: interrupted

exiting iozone

iozone failed!
iozone rc=1
 replay-ost-single test_5: @@@@@@ FAIL: iozone failed 
Lustre: DEBUG MARKER: == replay-ost-single test 5: Fail OST during iozone == 23:53:00 (1360655580)
LustreError: 11-0: lustre-OST0000-osc-ffff88004c2f5000: Communicating with 10.10.4.123@tcp, operation ost_write failed with -107.
Lustre: lustre-OST0000-osc-ffff88004c2f5000: Connection to lustre-OST0000 (at 10.10.4.123@tcp) was lost; in progress operations using this service will wait for recovery to complete
Lustre: lustre-OST0000-osc-ffff88004c2f5000: Connection restored to lustre-OST0000 (at 10.10.4.123@tcp)
LustreError: 17140:0:(osc_request.c:1156:check_write_rcs()) Unexpected # bytes transferred: 2097152 (requested 1048576)
LustreError: 17140:0:(osc_request.c:1156:check_write_rcs()) Unexpected # bytes transferred: 2097152 (requested 1048576)
Lustre: DEBUG MARKER: /usr/sbin/lctl mark iozone rc=1
Lustre: DEBUG MARKER: iozone rc=1
Lustre: DEBUG MARKER: /usr/sbin/lctl mark  replay-ost-single test_5: @@@@@@ FAIL: iozone failed 
Lustre: DEBUG MARKER: replay-ost-single test_5: @@@@@@ FAIL: iozone failed
Lustre: DEBUG MARKER: /usr/sbin/lctl dk > /logdir/test_logs/2013-02-11/lustre-master-el6-x86_64--full--2_1_1__1256__-70152466552720-140916/replay-ost-single.test_5.debug_log.$(hostname -s).1360655688.log;
         dmesg > /logdir/test_logs/2013-02-11/lustre-master-el6-x86_64--
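
The two check_write_rcs() messages in the log above report more bytes transferred than were requested. As a rough triage aid, a minimal sketch for pulling those two numbers out of a client log line could look like the following. The regular expression is written only against the message text as it appears in this ticket, and the helper name parse_write_mismatch is hypothetical, not part of Lustre or its test framework.

```python
import re

# Matches the client-side message exactly as it appears in this ticket's log.
# Other Lustre versions may word the message differently (assumption).
WRITE_RC_RE = re.compile(
    r"check_write_rcs\(\)\) Unexpected # bytes transferred: (\d+) \(requested (\d+)\)"
)


def parse_write_mismatch(line):
    """Return (transferred, requested) as ints, or None if the line does not match."""
    m = WRITE_RC_RE.search(line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


# Sample line copied from the log in this ticket.
line = (
    "LustreError: 17140:0:(osc_request.c:1156:check_write_rcs()) "
    "Unexpected # bytes transferred: 2097152 (requested 1048576)"
)

transferred, requested = parse_write_mismatch(line)
print(transferred, requested, transferred // requested)
```

For the line in this ticket, the transferred count is exactly twice the requested count.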


 Comments   
Comment by Doug Oucharek (Inactive) [ 15/Feb/13 ]

Hi Jinshan,

This is a high-priority blocker. Can you determine whether this is something you can look at? If not, let me know.

Comment by Sarah Liu [ 18/Feb/13 ]

Another failure seen on ZFS: https://maloo.whamcloud.com/test_sets/c34fa2a2-7788-11e2-987d-52540035b04c

Comment by Sarah Liu [ 21/Feb/13 ]

Also seen in an interop test between a 2.3.0 server and a 2.4 client:
https://maloo.whamcloud.com/test_sets/23aca6c0-7574-11e2-93d9-52540035b04c

Comment by Sarah Liu [ 25/Feb/13 ]

Another instance seen on ldiskfs: https://maloo.whamcloud.com/test_sets/1811c6a4-7e59-11e2-8f4f-52540035b04c

Comment by Peter Jones [ 26/Feb/13 ]

Fanyong

Could you please look into this one?

Thanks

Peter

Comment by nasf (Inactive) [ 06/Mar/13 ]

It looks quite similar to LU-2832. Let's see what happens after http://review.whamcloud.com/#change,5532 lands on master.

Comment by Peter Jones [ 07/Mar/13 ]

OK. Sarah, please comment on whether this reproduces in 2.3.62.

Comment by nasf (Inactive) [ 11/Mar/13 ]

Sarah, any update on this one?

Comment by Sarah Liu [ 12/Mar/13 ]

It passed in tag-2.3.62: https://maloo.whamcloud.com/test_sessions/dbf0e494-8906-11e2-83c6-52540035b04c

Comment by Peter Jones [ 12/Mar/13 ]

OK, so let's close this for now and reopen it if it recurs.
