LU-4108

Failure on test suite performance-sanity test_4

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Fix Version/s: Lustre 2.6.0, Lustre 2.5.2
    • Affects Version/s: Lustre 2.5.0, Lustre 2.4.2, Lustre 2.5.1, Lustre 2.4.3
    • Environment: client and server: lustre-b2_5 RHEL6 build #2
    • 3
    • 11060

    Description

      This issue was created by maloo for sarah <sarah@whamcloud.com>

      This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/daabb6e4-3505-11e3-b76d-52540035b04c.

      The sub-test test_4 failed with the following error:

      test failed to respond and timed out

      Info required for matching: performance-sanity 4

          Activity

            utopiabound Nathaniel Clark added a comment - Patch Landed to master
            utopiabound Nathaniel Clark added a comment - http://review.whamcloud.com/9725

            utopiabound Nathaniel Clark added a comment - This bug is basically the same issue as LU-2600 (poor metadata performance on ZFS). The NUM_FILES run should either be decreased (similar to parallel-scale.sh) or the test should be marked as SLOW for zfs
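
The two options suggested in the comment above can be sketched in shell as follows. This is only an illustration and not the patch that landed: the helper names `pick_num_files` and `should_skip_slow`, the `ZFS_MAX_FILES` cap and its default of 10000 are all assumptions made for this example, and the `SLOW=yes` convention is assumed rather than taken from this ticket.

```shell
#!/bin/sh
# Hypothetical helpers: not part of the Lustre test framework.
# ZFS_MAX_FILES is an assumed cap, chosen only for illustration.
ZFS_MAX_FILES=${ZFS_MAX_FILES:-10000}

# pick_num_files FSTYPE REQUESTED
# Prints the file count to use: REQUESTED, capped at ZFS_MAX_FILES on zfs.
pick_num_files() {
    fstype=$1
    requested=$2
    if [ "$fstype" = "zfs" ] && [ "$requested" -gt "$ZFS_MAX_FILES" ]; then
        echo "$ZFS_MAX_FILES"
    else
        echo "$requested"
    fi
}

# should_skip_slow FSTYPE SLOW
# Returns 0 (skip the test) when the backend is zfs and SLOW is not "yes".
should_skip_slow() {
    [ "$1" = "zfs" ] && [ "$2" != "yes" ]
}
```

A test script could then call something like `NUM_FILES=$(pick_num_files "$FSTYPE" "$NUM_FILES")` before starting the run, or return early when `should_skip_slow "$FSTYPE" "$SLOW"` succeeds. Capping the count keeps some coverage on ZFS, while skipping avoids the timeout entirely at the cost of not exercising the test there.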
            yujian Jian Yu added a comment -

            Lustre Build: http://build.whamcloud.com/job/lustre-b2_4/73/ (2.4.3 RC1)
            Distro/Arch: RHEL6.4/x86_64

            FSTYPE=zfs

            https://maloo.whamcloud.com/test_sets/7183f11a-ac5e-11e3-81d7-52540035b04c
            https://maloo.whamcloud.com/test_sets/ef26bbce-ac5f-11e3-81d7-52540035b04c

            yujian Jian Yu added a comment -

            Lustre Build: http://build.whamcloud.com/job/lustre-b2_5/20/
            Distro/Arch: RHEL6.4/x86_64

            FSTYPE=zfs
            MDSCOUNT=1
            MDSSIZE=2097152
            OSTCOUNT=2
            OSTSIZE=8388608

            parallel-scale-nfsv4 test compilebench timed out in 7200s:
            https://maloo.whamcloud.com/test_sets/863a9f90-91a7-11e3-ba94-52540035b04c

            The following sub-tests timed out in 3600s:

            sanity-benchmark test bonnie
            replay-ost-single test 8a
            metadata-updates
            ost-pools test 23a
            obdfilter-survey test 1a
            

            Maloo report: https://maloo.whamcloud.com/test_sessions/d343de6e-91a2-11e3-ba94-52540035b04c

            utopiabound Nathaniel Clark added a comment -

            performance-sanity/4 hasn't timed out on master since 2013-07-25 (where it was LU-1357).
            This bug has happened several times on b2_5 and many times on b2_4.

            yujian Jian Yu added a comment -

            Lustre Build: http://build.whamcloud.com/job/lustre-b2_5/5/
            Distro/Arch: RHEL6.4/x86_64

            FSTYPE=zfs
            MDSCOUNT=1
            MDSSIZE=2097152
            OSTCOUNT=2
            OSTSIZE=8388608

            parallel-scale test metabench timed out in 14400s:
            https://maloo.whamcloud.com/test_sets/628b4e78-73c5-11e3-b4ff-52540035b04c

            conf-sanity test 69 timed out in 3600s:
            https://maloo.whamcloud.com/test_sets/93e12716-73c2-11e3-b4ff-52540035b04c

            yujian Jian Yu added a comment -

            Lustre Build: http://build.whamcloud.com/job/lustre-b2_4/70/ (2.4.2 RC2)
            Distro/Arch: RHEL6.4/x86_64

            FSTYPE=zfs
            MDSCOUNT=1
            MDSSIZE=2097152
            OSTCOUNT=2
            OSTSIZE=8388608

            performance-sanity test 8 timed out in 28800s:
            https://maloo.whamcloud.com/test_sets/37e26e00-6b4f-11e3-99ba-52540035b04c

            parallel-scale test metabench timed out in 14400s:
            https://maloo.whamcloud.com/test_sets/92f82460-6b4f-11e3-99ba-52540035b04c

            conf-sanity test 69 timed out in 3600s:
            https://maloo.whamcloud.com/test_sets/d2e9712c-6b4b-11e3-99ba-52540035b04c

            sanity-benchmark test iozone timed out in 14400s:
            https://maloo.whamcloud.com/test_sets/3935574e-6b4b-11e3-99ba-52540035b04c

            Nothing abnormal in the console logs.

            adilger Andreas Dilger added a comment - This is almost certainly caused by slowness due to many ZFS pools sharing the same underlying disk.
            sarah Sarah Liu added a comment -

            I cannot find any useful logs; it looks like a slow run simply caused the timeout. The following session shows a similar situation for parallel-scale, parallel-scale-nfsv3/4 and obdfilter-survey:

            https://maloo.whamcloud.com/test_sessions/3f307b78-3500-11e3-b76d-52540035b04c

            People

              Assignee:
              utopiabound Nathaniel Clark
              Reporter:
              maloo Maloo