
LU-4344: Test marked FAIL with "No sub tests failed in this test set"

Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major
    • patches pushed to git

    Description

      Context can be found at LU-764 ("test marked 'No sub tests failed in this test set'"):

       
      Keith Mannthey added a comment - 11/Mar/13 12:44 PM 
      I have seen a bit of this over the last week or so on master. This is a good example.
      
      https://maloo.whamcloud.com/test_sessions/bf361f32-8919-11e2-b643-52540035b04c
      
      There was one real error at the very end of this test. Other than that, all subtests "passed" even though 3 whole sections are marked FAILED. Which logs should we look at to see the cleanup issues previously mentioned? How can we tell whether it is a problem with the patch or some autotest anomaly?
      

      Minh Diep added a comment - 21/Mar/13 9:37 AM

      I looked at the log above; sanity actually ran much longer, but Maloo only shows a few hundred seconds:
      == sanity test complete, duration 2584 sec == 14:03:39 (1362866619)

      I don't think this is the same problem. Please file a TT ticket.

      
      

      This has happened a bit here and there. It would be nice to always know what failed when something "fails".

      Please update the LU if this is seen as an LU issue.


          Activity

            dmiter Dmitry Eremin (Inactive) added a comment - one more fail: https://testing.hpdd.intel.com/test_sets/7fc6919c-3863-11e4-b7d4-5254006e85c2

            jamesanunez James Nunez (Inactive) added a comment -

            Matt,

            I think that one failed due to TEI-1403.
            ezell Matt Ezell added a comment -

            I don't see why this failed, is this another instance?
            https://maloo.whamcloud.com/test_sets/9bea8fb6-b77d-11e3-98de-52540035b04c

            mjmac Michael MacDonald (Inactive) added a comment - More instances? https://maloo.whamcloud.com/test_sets/392d7044-a794-11e3-ba84-52540035b04c https://maloo.whamcloud.com/test_sets/bb8b2514-a80b-11e3-9505-52540035b04c
            dmiter Dmitry Eremin (Inactive) added a comment - One more time it happens: https://maloo.whamcloud.com/test_sets/9618ff94-81e2-11e3-94d9-52540035b04c

            keith Keith Mannthey (Inactive) added a comment -

            Perhaps LU-764 should be reopened?

            chris Chris Gearing (Inactive) added a comment -

            This needs to be investigated in the Lustre code and so needs to be fixed under an LU ticket.

            keith Keith Mannthey (Inactive) added a comment -

            I agree with Chris. Most of the time there is some real issue, normally in the unmounting of file systems or some other cleanup phase of the test.

            Perhaps a setup and cleanup phase for each subtest could catch all these extra issues in a less confusing way.
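            Keith's per-subtest setup/cleanup idea can be sketched as follows. This is a minimal, hypothetical illustration in Python (the real Lustre test framework is shell-based, and `run_subtest` and its callbacks are invented names): each phase records its own named result, so a cleanup failure such as a stuck unmount shows up attributed to a subtest instead of as a bare suite-level FAIL.

```python
def run_subtest(name, setup, test, cleanup):
    """Run setup/test/cleanup phases; record one named result per phase run."""
    results = []
    for phase, fn in (("setup", setup), ("run", test), ("cleanup", cleanup)):
        try:
            fn()
            results.append({"name": f"{name}.{phase}", "status": "PASS"})
        except Exception as exc:
            results.append({"name": f"{name}.{phase}", "status": "FAIL",
                            "error": str(exc)})
            if phase == "setup":
                break  # if setup failed, skip the test and its cleanup
    return results

def failing_cleanup():
    # Stand-in for a real cleanup problem, e.g. "umount: target is busy"
    raise RuntimeError("umount: target is busy")

results = run_subtest("test_0a", lambda: None, lambda: None, failing_cleanup)
print([(r["name"], r["status"]) for r in results])
# [('test_0a.setup', 'PASS'), ('test_0a.run', 'PASS'), ('test_0a.cleanup', 'FAIL')]
```

            With results shaped like this, a suite-level FAIL is always traceable to a specific phase of a specific subtest, which is exactly the attribution this ticket says is missing.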

            chris Chris Gearing (Inactive) added a comment -

            Right, so the problem here is in the Lustre test framework.

            The results.yaml file created by the results framework looks like this (abbreviated).

            Tests:
            -
                    name: sanity
                    description: auster sanity
                    submission: Thu Jul 25 06:26:09 PDT 2013
                    report_version: 2
                    SubTests:
                    -
                        name: test_0a
                        status: PASS
                        duration: 0
                        return_code: 0
                        error:
                    -
                        name: test_0b
                        status: PASS
                        duration: 3
                        return_code: 0
                        error:
                    status: PASS
            

            but sometimes it looks like this

            Tests:
            -
                    name: sanity
                    description: auster sanity
                    submission: Thu Jul 25 06:26:09 PDT 2013
                    report_version: 2
                    SubTests:
                    -
                        name: test_0a
                        status: PASS
                        duration: 0
                        return_code: 0
                        error:
                    -
                        name: test_0b
                        status: PASS
                        duration: 3
                        return_code: 0
                        error:
                    status: FAIL
            

            Which autotest/maloo faithfully reports.

            So the framework needs to be fixed up.

            Now, I think we could remove the final status from the YAML and let Maloo decide whether the suite passed, but while the result is there, Maloo should report the result it receives.

            My guess is that it is showing a real issue.

            This needs to be fixed in the test framework, so I will change this to a Lustre ticket.
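            The alternative Chris mentions (drop the top-level status from results.yaml and let Maloo derive the suite result from the subtests) could look roughly like this. A minimal sketch, assuming results.yaml has already been parsed into a list of subtest dicts with the fields shown above; `derive_suite_status` is an invented helper, not part of Maloo or the test framework:

```python
def derive_suite_status(subtests):
    """Derive the suite result from subtest results instead of trusting
    the top-level status: field in results.yaml."""
    if not subtests:
        # 0/0 subtests ran (as in the broken ZFS runs below):
        # surface empty runs as failures rather than silent passes.
        return "FAIL"
    if all(s.get("status") == "PASS" for s in subtests):
        return "PASS"
    return "FAIL"

# Subtest dicts shaped like the abbreviated results.yaml above.
subtests = [
    {"name": "test_0a", "status": "PASS", "duration": 0, "return_code": 0},
    {"name": "test_0b", "status": "PASS", "duration": 3, "return_code": 0},
]
print(derive_suite_status(subtests))  # PASS
print(derive_suite_status([]))        # FAIL
```

            Under this scheme the inconsistent case in the second YAML dump above (all subtests PASS, suite FAIL) could not occur, though, as noted, the suite-level FAIL may be flagging a real cleanup problem that would then need its own subtest entry to remain visible.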


            keith Keith Mannthey (Inactive) added a comment -

            This is a totally broken ZFS run:
            https://maloo.whamcloud.com/test_sessions/6fcdc280-c837-11e2-8dd9-52540035b04c

            For example, on the lnetself test:
            https://maloo.whamcloud.com/test_sets/9c5751b0-c83a-11e2-8dd9-52540035b04c

            The duration is 376 seconds, it is marked FAIL, and 0/0 subtests ran.

            There are no lustre-initialization logs to look at, either.

            https://maloo.whamcloud.com/test_sessions/9917ca1e-c832-11e2-8dd9-52540035b04c is similar.

            utopiabound Nathaniel Clark added a comment - Another review-zfs run (sanity-quota): https://maloo.whamcloud.com/test_sets/4704e34c-c83c-11e2-b8c5-52540035b04c

            People

              wc-triage WC Triage
              keith Keith Mannthey (Inactive)
              Votes: 0
              Watchers: 8