[LU-4108] Failure on test suite performance-sanity test_4 Created: 15/Oct/13  Updated: 29/May/14  Resolved: 26/Mar/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.0, Lustre 2.4.2, Lustre 2.5.1, Lustre 2.4.3
Fix Version/s: Lustre 2.6.0, Lustre 2.5.2

Type: Bug Priority: Blocker
Reporter: Maloo Assignee: Nathaniel Clark
Resolution: Fixed Votes: 0
Labels: mn4, performance, zfs
Environment:

client and server: lustre-b2_5 RHEL6 build #2


Issue Links:
Duplicate
duplicates LU-1357 Test failure on test suite performanc... Resolved
duplicates LU-2887 sanity-quota test_12a: slow due to ZF... Resolved
Related
is related to LU-2600 lustre metadata performance is very s... Resolved
Severity: 3
Rank (Obsolete): 11060

 Description   

This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/daabb6e4-3505-11e3-b76d-52540035b04c.

The sub-test test_4 failed with the following error:

test failed to respond and timed out

Info required for matching: performance-sanity 4



 Comments   
Comment by Sarah Liu [ 15/Oct/13 ]

I cannot find any useful logs; it looks like a slow run simply caused the timeout. The following test session shows a similar situation for parallel-scale, parallel-scale-nfsv3/4 and obdfilter-survey:

https://maloo.whamcloud.com/test_sessions/3f307b78-3500-11e3-b76d-52540035b04c

Comment by Andreas Dilger [ 16/Oct/13 ]

This is almost certainly caused by slowness due to many ZFS pools sharing the same underlying disk.

Comment by Jian Yu [ 23/Dec/13 ]

Lustre Build: http://build.whamcloud.com/job/lustre-b2_4/70/ (2.4.2 RC2)
Distro/Arch: RHEL6.4/x86_64

FSTYPE=zfs
MDSCOUNT=1
MDSSIZE=2097152
OSTCOUNT=2
OSTSIZE=8388608

performance-sanity test 8 timed out in 28800s:
https://maloo.whamcloud.com/test_sets/37e26e00-6b4f-11e3-99ba-52540035b04c

parallel-scale test metabench timed out in 14400s:
https://maloo.whamcloud.com/test_sets/92f82460-6b4f-11e3-99ba-52540035b04c

conf-sanity test 69 timed out in 3600s:
https://maloo.whamcloud.com/test_sets/d2e9712c-6b4b-11e3-99ba-52540035b04c

sanity-benchmark test iozone timed out in 14400s:
https://maloo.whamcloud.com/test_sets/3935574e-6b4b-11e3-99ba-52540035b04c

Nothing abnormal in the console logs.

Comment by Jian Yu [ 03/Jan/14 ]

Lustre Build: http://build.whamcloud.com/job/lustre-b2_5/5/
Distro/Arch: RHEL6.4/x86_64

FSTYPE=zfs
MDSCOUNT=1
MDSSIZE=2097152
OSTCOUNT=2
OSTSIZE=8388608

parallel-scale test metabench timed out in 14400s:
https://maloo.whamcloud.com/test_sets/628b4e78-73c5-11e3-b4ff-52540035b04c

conf-sanity test 69 timed out in 3600s:
https://maloo.whamcloud.com/test_sets/93e12716-73c2-11e3-b4ff-52540035b04c

Comment by Nathaniel Clark [ 06/Feb/14 ]

performance-sanity/4 hasn't timed out on master since 2013-07-25 (where it was reported as LU-1357).
This bug has happened several times on b2_5 and many times on b2_4.

Comment by Jian Yu [ 10/Feb/14 ]

Lustre Build: http://build.whamcloud.com/job/lustre-b2_5/20/
Distro/Arch: RHEL6.4/x86_64

FSTYPE=zfs
MDSCOUNT=1
MDSSIZE=2097152
OSTCOUNT=2
OSTSIZE=8388608

parallel-scale-nfsv4 test compilebench timed out in 7200s:
https://maloo.whamcloud.com/test_sets/863a9f90-91a7-11e3-ba94-52540035b04c

The following sub-tests timed out in 3600s:

sanity-benchmark test bonnie
replay-ost-single test 8a
metadata-updates
ost-pools test 23a
obdfilter-survey test 1a

Maloo report: https://maloo.whamcloud.com/test_sessions/d343de6e-91a2-11e3-ba94-52540035b04c

Comment by Jian Yu [ 17/Mar/14 ]

Lustre Build: http://build.whamcloud.com/job/lustre-b2_4/73/ (2.4.3 RC1)
Distro/Arch: RHEL6.4/x86_64
FSTYPE=zfs

https://maloo.whamcloud.com/test_sets/7183f11a-ac5e-11e3-81d7-52540035b04c
https://maloo.whamcloud.com/test_sets/ef26bbce-ac5f-11e3-81d7-52540035b04c

Comment by Nathaniel Clark [ 19/Mar/14 ]

This bug is basically the same issue as LU-2600 (poor metadata performance on ZFS). The NUM_FILES value for the run should either be decreased (similar to parallel-scale.sh) or the test should be marked as SLOW for ZFS.
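
A minimal sketch of the first option (decreasing the file count on ZFS). This is not the landed patch: the helper name `scale_num_files`, the divisor of 10 and the default count are hypothetical, illustrative values; only `FSTYPE` and `NUM_FILES` come from this ticket.

```shell
#!/bin/bash
# Hypothetical helper (not from the Lustre tree): return a reduced file
# count when the backing filesystem type is zfs, otherwise return the
# count unchanged.
scale_num_files() {
    local fstype=$1            # e.g. "zfs" or "ldiskfs"
    local num_files=$2         # requested number of files
    local zfs_divisor=${3:-10} # illustrative reduction factor, an assumption

    if [ "$fstype" = "zfs" ]; then
        num_files=$((num_files / zfs_divisor))
        # Never scale down to zero files.
        if [ "$num_files" -lt 1 ]; then
            num_files=1
        fi
    fi
    echo "$num_files"
}

# Example use in a test script; the default values here are placeholders.
NUM_FILES=$(scale_num_files "${FSTYPE:-ldiskfs}" "${NUM_FILES:-1000}")
```

With `FSTYPE=zfs` and the placeholder default above, `NUM_FILES` becomes 100; on any other backend it stays at the requested value. Marking the test as SLOW instead would skip it in short runs rather than shrink it.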

Comment by Nathaniel Clark [ 19/Mar/14 ]

http://review.whamcloud.com/9725

Comment by Nathaniel Clark [ 26/Mar/14 ]

Patch landed on master.
