
sanity-dom test failure not triggering test suite marked as FAIL

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.13.0, Lustre 2.14.0, Lustre 2.12.5
    • 3

    Description

      In LU-13759, we see that a sanityn test run as part of sanity-dom can fail without the failure being propagated to a level where Maloo recognizes it as a failure. There are also several problems with the sanity-dom logs: logs for the failed tests are not collected, and client logs for successful tests are not collected. Note that this seems to be an issue with the test framework and not with Maloo.

      For the failure at https://testing.whamcloud.com/test_sets/08c6fa9d-e2a3-457b-a1ed-b4318dbf166a, we can see that the sanity-dom/sanityn test 20 failure is recognized, but this does not lead to the sanity-dom test suite being marked as failed. In the results file at https://testing.whamcloud.com/test_sessions/b31ca578-dd4f-4725-9a0c-5e19f3031c69/show_results, we see that the failure is registered for test_20, but the overall status at the end of the excerpt is still PASS:

          -   name: test_sanityn
          -   name: test_1
              status: PASS
              duration: 4
              return_code: 0
              error: 
      …
          -   name: test_19
              status: SKIP
              duration: 2
              return_code: 0
              error: not cache-capable obdfilter
          -   name: test_20
              status: FAIL
              duration: 5
              return_code: 1
              error: 1 page left in cache after lock cancel
          -   name: test_23
              status: PASS
              duration: 65
              return_code: 0
              error: 
      …
          -   name: test_51d
              status: PASS
              duration: 435
              return_code: 0
              error: 
          duration: 1129
          status: PASS
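
The inconsistency in the excerpt above can be checked mechanically. The following is a minimal sketch, not Maloo's or the Lustre test framework's actual logic: the dictionary layout and the `subtests` key are assumptions made for illustration, and only the sub-tests visible in the excerpt are included.

```python
def failed_subtests(suite):
    """Return the names of sub-tests whose status is FAIL."""
    return [t["name"] for t in suite["subtests"] if t["status"] == "FAIL"]


def expected_status(suite):
    """A suite should be FAIL if any sub-test failed, otherwise PASS."""
    return "FAIL" if failed_subtests(suite) else "PASS"


# Hypothetical in-memory form of the excerpt (the "subtests" key is assumed).
suite = {
    "name": "test_sanityn",
    "status": "PASS",
    "subtests": [
        {"name": "test_1", "status": "PASS", "return_code": 0},
        {"name": "test_19", "status": "SKIP", "return_code": 0},
        {"name": "test_20", "status": "FAIL", "return_code": 1},
        {"name": "test_23", "status": "PASS", "return_code": 0},
        {"name": "test_51d", "status": "PASS", "return_code": 0},
    ],
}

# Report a suite whose recorded status disagrees with its sub-test results.
if suite["status"] != expected_status(suite):
    print("inconsistent:", suite["name"], "reported", suite["status"],
          "but these sub-tests failed:", failed_subtests(suite))
```

With the data above, the check flags test_sanityn because test_20 failed while the recorded status is PASS.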
      
      

      We had a related issue where a sanity-dom/sanityn test failure would cause the whole test suite to fail, but Maloo would report that the last test in sanityn had failed, which is false; see LU-10589 and one example failure at https://testing.whamcloud.com/test_sets/870c7a78-467f-11e9-9646-52540065bddc. In this case, it looks like all logs are not collected.

      Patrick Farrell produced a patch, https://review.whamcloud.com/#/c/34186/ , that addresses the problem of a failing sub-test in a sub-suite not causing the suite to be marked as failed. I think the issue of logs not being collected/displayed still exists with this proposed solution.

        Activity

          [LU-13773] sanity-dom test failure not triggering test suite marked as FAIL

          Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39409/
          Subject: LU-13773 tests: subscript failure propagation
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: e2cb43c409b98b97fdb84a678a93651d2176b242

          Comment added by Gerrit Updater (gerrit).

          Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39552/
          Subject: LU-13773 tests: use TESTLOG_PREFIX in run_one_logged
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: cf6ac632a18c3acb3534cd07b717673997a58515

          Comment added by Gerrit Updater (gerrit).

          James Nunez (jnunez@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/39591
          Subject: LU-13773 tests: modify test and log names
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: 5366881770fde7fd7cc244b0eff258f316388797

          Comment added by Gerrit Updater (gerrit).

          James Nunez (jnunez@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/39552
          Subject: LU-13773 tests: use TESTLOG_PREFIX in run_one_logged
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: 051ace294af11fb1da6241f04471deb092f90cd8

          Comment added by Gerrit Updater (gerrit).

          The patch https://review.whamcloud.com/39409 should fix the problem where sanity and/or sanityn fails but sanity-dom does not recognize that a test in the sub-test suite failed and, thus, Maloo does not recognize the failure either.
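
The general idea of propagating a sub-script failure to its parent can be sketched as follows. This is only an illustration of the concept, not the content of patch 39409: the helper name `run_subscripts` is hypothetical, and short Python one-liners stand in for the real sub-test-suite scripts.

```python
import subprocess
import sys


def run_subscripts(commands):
    """Run every command; return 0 only if all of them exited with 0."""
    overall = 0
    for cmd in commands:
        rc = subprocess.run(cmd).returncode
        if rc != 0 and overall == 0:
            overall = rc  # remember the first failure, but keep running
    return overall


# Stand-ins for sub-test-suite scripts: one passes, one fails, one passes.
commands = [
    [sys.executable, "-c", "raise SystemExit(0)"],
    [sys.executable, "-c", "raise SystemExit(1)"],
    [sys.executable, "-c", "raise SystemExit(0)"],
]

rc = run_subscripts(commands)
print("overall return code:", rc)
```

The parent keeps running the remaining sub-scripts after a failure, so later results are still produced, but it reports a non-zero code if any sub-script failed.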

          There is one other issue that needs to be corrected: test logs for the sub-test suites' tests are not correctly associated with the test results. A solution that Maloo can work with out of the box is to prefix the log file names with <suite name>.<sub-testsuite_name>_test<test_number>. ... and make the same change to the results.yml file.
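
As a rough illustration of the naming scheme proposed above, the sketch below builds only the prefix portion of a name. The function name `log_prefix` is hypothetical, the exact form of the test number is an assumption, and whatever follows the prefix in a real log file name is left out because the comment elides it.

```python
def log_prefix(suite, sub_suite, test_number):
    """Build '<suite name>.<sub-testsuite_name>_test<test_number>'."""
    return f"{suite}.{sub_suite}_test{test_number}"


# Using the same prefix for the log files and for the entry in results.yml
# is what would let the two be matched up.
prefix = log_prefix("sanity-dom", "sanityn", "20")
print(prefix)  # sanity-dom.sanityn_test20
```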

          Comment added by James Nunez (Inactive) (jamesanunez).

          James Nunez (jnunez@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/39409
          Subject: LU-13773 tests: testing failure of subscripts
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: 7be9f991dc4221dab5df0d2a01a1a863816b9825

          Comment added by Gerrit Updater (gerrit).

          People

            wc-triage WC Triage
            jamesanunez James Nunez (Inactive)