
Test failure on test suite parallel-scale-nfsv4, subtest test_metabench

Details

    • 3
    • 4488

    Description

      This issue was created by maloo for sarah <sarah@whamcloud.com>

      This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/4a115426-cba8-11e1-8847-52540035b04c.

      The sub-test test_metabench failed with the following error:

      test failed to respond and timed out

      From the log, this test ran for more than 35 minutes before it was ended. I checked several passing runs and it usually takes less than 1800s, so the test may simply have been killed by the system.
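
To put the two figures above in the same unit, here is a minimal sketch of the arithmetic. The variable names are illustrative only; the numbers are the ones quoted in the description (a failed run of more than 35 minutes, passing runs under 1800s).

```shell
# Figures quoted in the description above; names are illustrative.
elapsed_min=35        # failed run lasted more than this many minutes
typical_max_s=1800    # passing runs finish in less than this many seconds

# Convert minutes to seconds so both values are comparable.
elapsed_s=$((elapsed_min * 60))

echo "failed run: more than ${elapsed_s}s; passing runs: under ${typical_max_s}s"
echo "overrun: more than $((elapsed_s - typical_max_s))s"
```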

          Activity

            [LU-1625] Test failure on test suite parallel-scale-nfsv4, subtest test_metabench
            pjones Peter Jones added a comment -

            Extra tweak landed too

            keith Keith Mannthey (Inactive) added a comment -

            One extra change is needed. I missed part of the needed patch. http://review.whamcloud.com/3701 has been pushed to fix this issue.

            pjones Peter Jones added a comment -

            Patch landed for 2.1.3 and 2.3. If there are still issues with this with Minh's changes in place then please reopen

            yujian Jian Yu added a comment -

            RHEL6.3/x86_64 (2.1.3 Server + 1.8.8-wc1 Client):
            https://maloo.whamcloud.com/test_sets/7422aff4-e42a-11e1-b6d3-52540035b04c

            mdiep Minh Diep added a comment -

            patch to reduce parallel-scale for nfs
            http://review.whamcloud.com/#change,3596

            adilger Andreas Dilger added a comment -

            Minh, could you please just change the default compilebench numbers to "2" and "2" for parallel-scale-nfs.sh? This is a trivial change, and it reduces the testing time rather than making it longer. I don't think the benefit of testing NFS for such a long time is matched by the number of users who use NFS.

            I think this is simply the following:

            test_compilebench() {
                # Default both settings to 2 unless already set in the environment
                export cbench_IDIRS=${cbench_IDIRS:-2}
                export cbench_RUNS=${cbench_RUNS:-2}

                run_compilebench
            }
            run_test compilebench "compilebench"

            If there are other sub-parts of parallel-scale-nfsv4 that are taking a long time, I think they can also be shortened in a similar manner.

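The `${VAR:-2}` form in the suggestion above only supplies a default: a value already set in the environment still wins. A minimal sketch of that behavior follows. `show_cbench_defaults` is a hypothetical helper written for illustration and is not part of the Lustre test framework; only the variable names come from the comment.

```shell
# Hypothetical helper: prints the values the suggested defaults would yield.
# ${VAR:-2} expands to 2 only when VAR is unset or empty.
show_cbench_defaults() {
    echo "IDIRS=${cbench_IDIRS:-2} RUNS=${cbench_RUNS:-2}"
}

# No override: both settings fall back to 2.
unset cbench_IDIRS cbench_RUNS
show_cbench_defaults

# An explicit value for one setting is respected; the other still defaults.
cbench_IDIRS=10
show_cbench_defaults
```

With this pattern, a longer run can still be requested by setting the variables before invoking the test, without editing the script.
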
            sarah Sarah Liu added a comment -

            Bobi, I think the timeout is set to 3600s for each test; 9499s was the total for 5 tests.

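A minimal sketch of the arithmetic behind this reply, assuming (as stated in the comment) that the 3600s limit applies to each test separately and that 9499s is the sum over 5 tests. `within_timeout` is a hypothetical helper, not an autotest function.

```shell
PER_TEST_TIMEOUT=3600   # seconds, per-test limit quoted in this thread
TOTAL_SECONDS=9499      # total quoted for the parallel-scale-nfsv3 run
NUM_TESTS=5             # number of tests that total covers

# Hypothetical helper: succeeds if one test's duration is within the limit.
within_timeout() {
    [ "$1" -le "$PER_TEST_TIMEOUT" ]
}

# Integer average duration per test.
avg=$((TOTAL_SECONDS / NUM_TESTS))
echo "average per test: ${avg}s"

if within_timeout "$avg"; then
    echo "average is within the ${PER_TEST_TIMEOUT}s per-test limit"
fi
```

Under that assumption, a suite total above 3600s does not by itself imply that any single test exceeded its limit.
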
            bobijam Zhenyu Xu added a comment -

            Sarah,

            What is the timeout rule for autotest? In https://maloo.whamcloud.com/test_sessions/3b113b66-e157-11e1-b541-52540035b04c, I saw that parallel-scale-nfsv3 can run for 9499 seconds, while parallel-scale-nfsv4 timed out at 3600 seconds.

            pjones Peter Jones added a comment -

            Bobijam will help with this one

            ys Yang Sheng added a comment -

            I suspect this issue relates to an NFS problem. From the stack trace, the nfsv4-svc thread is always running at the same location. I'll check this further.

            ys Yang Sheng added a comment -

            So it looks like compilebench works normally; only metabench was killed by the timeout. The logs contain little information, though, so I'll search for other failed instances to investigate.

            People

              keith Keith Mannthey (Inactive)
              maloo Maloo
              Votes: 0
              Watchers: 8
