[LU-15151] conf-sanity test_119: mds1: ssh: Could not resolve hostname mds1: Name or service not known
Created: 22/Oct/21 | Updated: 05/Nov/21 | Resolved: 03/Nov/21 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.15.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | Elena Gryaznova |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
This issue was created by maloo for Elena <elena.gryaznova@hpe.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/4f01fe03-3433-4b2b-9ca9-ebce8607c27b

test_119 PASSED, but the log contains:

mds1: ssh: Could not resolve hostname mds1: Name or service not known

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV |
| Comments |
| Comment by Andreas Dilger [ 22/Oct/21 ] |
|
It looks like this was introduced by patch https://review.whamcloud.com/27753. The wait_update_cond() call in test_119 should take a hostname as an argument instead of a facet name. |
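The difference between a facet name and a hostname can be illustrated with a small bash sketch. This is not the test-framework.sh code: sketch_facet_host, sketch_check_on_node, sketch_check_on_facet and the host name node-a are hypothetical stand-ins, and the fallback from <facet>_HOST to <type>_HOST is an assumption about how such a lookup might work.

```shell
#!/bin/bash
# Hypothetical sketch, not the real Lustre test-framework.sh helpers.

# Resolve a facet name such as "mds1" to a hostname.
# Tries <facet>_HOST first, then <type>_HOST (facet with trailing digits removed).
sketch_facet_host() {
	local facet=$1
	local var=${facet}_HOST
	local host=${!var}

	if [ -z "$host" ]; then
		var=${facet%%[0-9]*}_HOST
		host=${!var}
	fi
	echo "$host"
}

# Stand-in for a node-level check. Real code would run the command on
# the node over ssh/pdsh; here it only reports where it would run.
sketch_check_on_node() {
	local node=$1
	shift
	echo "would run on $node: $*"
}

# Facet-level wrapper: resolve the facet to a host, then do the node check.
sketch_check_on_facet() {
	local facet=$1
	shift
	sketch_check_on_node "$(sketch_facet_host "$facet")" "$@"
}

mds_HOST=node-a                      # hypothetical host name
sketch_check_on_node mds1 uptime     # facet name used as a host: the bug pattern
sketch_check_on_facet mds1 uptime    # facet resolved to node-a first
```

Passing "mds1" straight to the node-level helper makes the remote shell try to resolve "mds1" as a hostname, which matches the "Could not resolve hostname mds1" message in the description.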
| Comment by Gerrit Updater [ 26/Oct/21 ] |
|
"Elena Gryaznova <elena.gryaznova@hpe.com>" uploaded a new patch: https://review.whamcloud.com/45369 |
| Comment by Gerrit Updater [ 03/Nov/21 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45369/ |
| Comment by Peter Jones [ 03/Nov/21 ] |
|
Landed for 2.15 |
| Comment by Alex Zhuravlev [ 04/Nov/21 ] |
|
conf-sanity/119 times out every time with the patch:

COMMIT      TESTED  PASSED  FAILED        COMMIT DESCRIPTION
dc27b1c3ea  3       0       3       BAD   LU-15151 tests: use facet check instead of node check
fa36c6b0b9  3       3       0       GOOD  LU-15160 kernel: kernel update SLES12 SP5 [4.12.14-122.91.2] |
| Comment by Andreas Dilger [ 04/Nov/21 ] |
|
Alex, what do you have in your config for $mds1_HOST (maps to $mds_HOST if unset)? Can the client $PDSH to that node (should be a no-op if it is a single client)?

According to the Gerrit Janitor test logs, the test is taking 1200s to finish, and wait_update_facet_cond() is continually timing out, but this does not cause an error return: https://testing-archive.whamcloud.com/gerrit-janitor/19360/results.html

I'm not sure if that is how this test is supposed to work or not, but 1200s is a long time, and the test should probably be added to the SLOW list. |
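To make the "times out but does not cause an error return" behaviour concrete, here is a minimal bash sketch of a polling helper and two ways of calling it. sketch_wait_for is a hypothetical stand-in, not wait_update_facet_cond() itself, and the check commands and timeout values are illustrative assumptions.

```shell
#!/bin/bash
# Hypothetical sketch, not the real Lustre test-framework.sh helper.

# Poll a check command until its output equals the expected value or
# max_wait seconds have elapsed. Returns 0 on match, 1 on timeout.
sketch_wait_for() {
	local check=$1
	local expect=$2
	local max_wait=${3:-90}
	local interval=${4:-1}
	local waited=0
	local result

	while true; do
		result=$(eval "$check" 2>/dev/null)
		[ "$result" = "$expect" ] && return 0
		[ "$waited" -ge "$max_wait" ] && return 1
		sleep "$interval"
		waited=$((waited + interval))
	done
}

# Caller that ignores the status: a timeout only costs time and the
# script carries on as if the condition had been met.
sketch_wait_for "echo stale" "fresh" 1
echo "continuing after wait"

# Caller that checks the status: a timeout becomes visible.
if ! sketch_wait_for "echo stale" "fresh" 1; then
	echo "timed out waiting for expected value" >&2
fi
```

If the test only calls the helper without checking its status, each timed-out wait adds its full timeout to the run time and the test still passes, which would be consistent with a 1200s run that reports no failure.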
| Comment by Alex Zhuravlev [ 05/Nov/21 ] |
|
I use the default local.sh as a configuration; this is a single VM. And yes, wait_update_facet_cond() just times out, while I set an explicit limit for the whole test suite (conf-sanity) in this case. I don't quite understand the test: it doesn't fail if wait_update_facet_cond() times out. What's the point of this waiting? |