[LU-15151] conf-sanity test_119: mds1: ssh: Could not resolve hostname mds1: Name or service not known

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.15.0

    Description

      This issue was created by maloo for Elena <elena.gryaznova@hpe.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/4f01fe03-3433-4b2b-9ca9-ebce8607c27b

      test_119 PASSED, but

      mds1: ssh: Could not resolve hostname mds1: Name or service not known
      

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      conf-sanity test_119 -

          Activity

            bzzz Alex Zhuravlev added a comment -

            I use the default local.sh as the configuration; this is a single VM. And yes, wait_update_facet_cond() just times out, while I set an explicit time limit for the whole test suite (conf-sanity) in this case. I don't quite understand the test: it doesn't fail if wait_update_facet_cond() times out. What's the point of this waiting?

            adilger Andreas Dilger added a comment -

            Alex, what do you have in your config for $mds1_HOST (maps to $mds_HOST if unset)? Can the client $PDSH to that node (should be a no-op if it is a single client)?

            According to the Gerrit Janitor test logs, the test is taking 1200s to finish, and wait_update_facet_cond() is continually timing out, but this does not cause an error return:

            https://testing-archive.whamcloud.com/gerrit-janitor/19360/results.html

            I'm not sure whether that is how this test is supposed to work, but 1200s is a long time, and the test should probably be added to the SLOW list.
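
The behavior described in this comment (a wait that times out without the test failing) can be illustrated with a minimal sketch. `wait_for_sketch` is a hypothetical helper written for this illustration; it is not wait_update_facet_cond() and does not reproduce its arguments. It only shows that a polling helper can report a timeout through its exit status, and that the timeout is lost when the caller does not check that status.

```shell
# Hypothetical polling helper, for illustration only.
# Usage: wait_for_sketch MAX_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND once per second until it succeeds or MAX_SECONDS elapse.
# Returns 0 on success, 1 on timeout.
wait_for_sketch() {
    local max_wait=$1
    shift
    local waited=0
    while ! "$@"; do
        if [ "$waited" -ge "$max_wait" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Caller that ignores the exit status: the timeout goes unnoticed
# and the script continues as if the condition had been met.
wait_for_sketch 1 false
echo "continuing regardless of the timeout"

# Caller that checks the exit status: the timeout becomes visible.
if ! wait_for_sketch 1 false; then
    echo "condition not met within 1s" >&2
fi
```

Whether test_119 is meant to fail on such a timeout is the open question raised in the comment; the sketch only shows the mechanism by which a timeout can pass silently.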

            bzzz Alex Zhuravlev added a comment -

            conf-sanity/119 times out every time with the patch:

            COMMIT          TESTED  PASSED  FAILED          COMMIT DESCRIPTION
            dc27b1c3ea      3       0       3       BAD     LU-15151 tests: use facet check instead of node check
            fa36c6b0b9      3       3       0       GOOD    LU-15160 kernel: kernel update SLES12 SP5 [4.12.14-122.91.2]

            pjones Peter Jones added a comment -

            Landed for 2.15

            gerrit Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45369/
            Subject: LU-15151 tests: use facet check instead of node check
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: dc27b1c3ea1852617012753a3cce1f6c76b164af

            gerrit Gerrit Updater added a comment -

            "Elena Gryaznova <elena.gryaznova@hpe.com>" uploaded a new patch: https://review.whamcloud.com/45369
            Subject: LU-15151 tests: use facet check intead of node check
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 59432387b259d06b0bee1b2e769ee2b639b9c46d

            adilger Andreas Dilger added a comment -

            It looks like this was introduced by patch https://review.whamcloud.com/27753 "LU-9699 osp: don't assert on OSP duplicating".

            The wait_update_cond() call in test_119 should take a hostname as an argument instead of a facet name.
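
The distinction between a facet name and a hostname can be sketched as follows. A facet such as `mds1` is a label for a service, not necessarily a resolvable host, which is why passing it where a hostname is expected ends in "ssh: Could not resolve hostname mds1". `facet_to_host_sketch` is a hypothetical helper written for this illustration, not the test-framework.sh implementation; it assumes only the convention mentioned in this ticket, that $mds1_HOST falls back to $mds_HOST when unset. The host names are made-up example values.

```shell
# Hypothetical helper, for illustration only: resolve a facet name
# (e.g. "mds1") to the host that serves it.
# Lookup order: <facet>_HOST, then <type>_HOST (e.g. mds_HOST),
# then the local hostname as a last resort.
facet_to_host_sketch() {
    local facet=$1
    local type=${facet%%[0-9]*}     # strip the index: "mds1" -> "mds"
    local host=""
    eval "host=\${${facet}_HOST:-}"
    if [ -z "$host" ]; then
        eval "host=\${${type}_HOST:-}"
    fi
    if [ -z "$host" ]; then
        host=$(hostname)
    fi
    printf '%s\n' "$host"
}

# Assumed example values, not taken from any real configuration.
mds_HOST=node-a.example
unset mds1_HOST
facet_to_host_sketch mds1    # prints node-a.example (falls back to mds_HOST)

mds1_HOST=node-b.example
facet_to_host_sketch mds1    # prints node-b.example (facet-specific setting wins)
```

A check that expects a hostname needs the resolved value, while a facet-aware check can take `mds1` directly and do the resolution itself.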

            People

              egryaznova Elena Gryaznova
              maloo Maloo