conf-sanity test_32a: Timeout occurred after 143 mins

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.12.6, Lustre 2.15.0
    • Lustre 2.14.0, Lustre 2.12.6
    • None
    • 3

    Description

      This issue was created by maloo for Chris Horn <hornc@cray.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/4960ea6e-2914-4b3d-a77d-e0e5a0a4c9a6

      test_32a failed with the following error:

      Timeout occurred after 143 mins, last suite running was conf-sanity
      

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      conf-sanity test_32a - Timeout occurred after 143 mins, last suite running was conf-sanity

          Activity

            gerrit Gerrit Updater added a comment - Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39109/
            Subject: LU-13514 tests: remove upgrade images for conf-sanity
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: d574a55778f035691bd3bed621cfcdb8200a9785

            gerrit Gerrit Updater added a comment - James Nunez (jnunez@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/39109
            Subject: LU-13514 tests: remove upgrade images for conf-sanity
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: b33b1a85c843be4ffffd181605d8ae2ac07c3bac

            gerrit Gerrit Updater added a comment - James Nunez (jnunez@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/39108
            Subject: LU-13514 tests: stop running conf-sanity test 32a ldiskfs
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 440654a0768c4bebec20fa99571318e2bf429ccc

            adilger Andreas Dilger added a comment - Looking at the test results, it seems that review-dne-part-3 (ldiskfs) is the only session that is timing out, and never review-dne-zfs-part-3, so the failure must be related to one of the ldiskfs test images.
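
            The comparison in the comment above (timeouts only in the ldiskfs session, never in the ZFS one) can be sketched as a simple tally. This is only an illustration: the `(session_name, status)` record format, the `TIMEOUT`/`PASS` status strings, and the sample rows are assumptions made up for the example, not data or an API taken from Maloo.

```python
from collections import Counter


def timeouts_by_session(results):
    """Count timed-out runs per test session name.

    `results` is an iterable of (session_name, status) pairs.
    This record shape is hypothetical, chosen only for the sketch.
    """
    counts = Counter()
    for session, status in results:
        if status == "TIMEOUT":
            counts[session] += 1
    return counts


# Made-up sample rows for illustration; NOT real test results.
sample = [
    ("review-dne-part-3", "TIMEOUT"),
    ("review-dne-zfs-part-3", "PASS"),
    ("review-dne-part-3", "TIMEOUT"),
    ("review-dne-zfs-part-3", "PASS"),
]

counts = timeouts_by_session(sample)
for session in ("review-dne-part-3", "review-dne-zfs-part-3"):
    # A Counter returns 0 for sessions that never timed out.
    print(session, counts[session])
```

            If every timeout lands in the ldiskfs session and none in the ZFS session, that points at something backend-specific, such as the ldiskfs test images, rather than at the test logic the two sessions share.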

            sebastien Sebastien Buisson added a comment - It seems that almost no patch has managed to pass conf-sanity test_32a in the last couple of days.
            hornc Chris Horn added a comment - +1 on master: https://testing.whamcloud.com/test_sessions/807df025-8fd7-45d1-8961-b7fbaf84cdc2
            emoly.liu Emoly Liu added a comment - more on master: https://testing.whamcloud.com/test_sets/b52e5506-8e0a-48c7-8c97-3666e4df460e https://testing.whamcloud.com/test_sets/3c7736eb-a501-445a-ad51-4e4d2e004212
            arshad512 Arshad Hussain added a comment - +1 on Master: https://testing.whamcloud.com/sub_tests/e3575182-057a-4057-965e-fd0c293e939b

            adilger Andreas Dilger added a comment - I see that conf-sanity test_32a is still failing with this same error even for a patch based on the latest master commit v2_13_53-165-gebaf3b1b9980 "LU-11643 tests: revert new images and tests for upgrade patch":
            https://testing.whamcloud.com/test_sets/a4c8f3d3-b8e5-4e7e-9192-2b0bc22279b4

            sebastien Sebastien Buisson added a comment - Well, commit 6b979daaff "LU-11643 tests: add new images and tests for upgrade tests" explicitly adds stuff that conf-sanity test_32c goes through. Two more patches landed after this one, but one adds a new test script and the other changes the file lustre/osd-zfs/osd_scrub.c, so they cannot be responsible for the failure in the review-dne-part-3 test group, which runs on ldiskfs.

            The explanation I see for the test failure in master now is that patch https://review.whamcloud.com/35049 (6b979daaff "LU-11643 tests: add new images and tests for upgrade tests") was tested on too old a branch. I can see that patchset 16 was based on commit a83c820f89 "LU-12312 lnet: handle no discovery flag", which dates back to April 23rd.

            adilger Andreas Dilger added a comment - Sebastien, do you know what is broken in the tests, and why/how that patch passed testing before it landed? Is it just because the testing is now slower and timing out, or is there a code defect (hang)?

            People

              ys Yang Sheng
              maloo Maloo