[LU-5076] Test failure on test suite conf-sanity, subtest test_46a test failed to respond and timed out Created: 17/May/14 Updated: 10/Jun/14 Resolved: 10/Jun/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | Minh Diep |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 14010 |
| Description |
|
This issue was created by maloo for wangdi <di.wang@intel.com>. This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/0638f47c-dd56-11e3-8e9b-52540035b04c. The sub-test test_46a failed with the following error:
This failure is a bit strange. According to the syslog on MDS0:
Lustre: lustre-MDT0000: Client lustre-MDT0000-lwp-OST0006_UUID seen on new nid 10.10.4.199@tcp when existing nid 10.10.4.203@tcp is already connected
But the IP of the OSS should be 10.10.4.199, and I do not know where 10.10.4.203 comes from. So I am not sure whether this is a TEI ticket. If someone confirms this is a TEI ticket, please close this one. Thanks.
Info required for matching: conf-sanity 46a |
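To find other occurrences of this message, something like the following sketch could be run over an MDS syslog. The pattern is derived only from the single line quoted above, so other variants of the message may not match; `find_nid_conflicts` is a hypothetical helper name, not part of Lustre or the test framework.

```python
import re

# Pattern built from the one message quoted in this ticket; the field
# layout is an assumption and may not cover other message variants.
NID_CONFLICT = re.compile(
    r"(?P<target>\S+): Client (?P<client>\S+) seen on new nid (?P<new_nid>\S+) "
    r"when existing nid (?P<old_nid>\S+) is already connected"
)


def find_nid_conflicts(lines):
    """Return one dict per log line that reports a client on a new NID."""
    return [m.groupdict() for line in lines if (m := NID_CONFLICT.search(line))]


sample = (
    "Lustre: lustre-MDT0000: Client lustre-MDT0000-lwp-OST0006_UUID seen on "
    "new nid 10.10.4.199@tcp when existing nid 10.10.4.203@tcp is already connected"
)
conflicts = find_nid_conflicts([sample])
for c in conflicts:
    print(c["client"], c["old_nid"], "->", c["new_nid"])
```

Collecting the `old_nid` values across runs would show whether 10.10.4.203 turns up repeatedly, which may help identify the cluster it belongs to.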
| Comments |
| Comment by Andreas Dilger [ 20/May/14 ] |
|
It would be worthwhile to track down which VM cluster this other IP address belongs to, and why it thinks it should be connecting to this MDS. Separately, one option to avoid such problems is to use a more unique $NAME variable for each test cluster (e.g. the hostname of the master test node instead of ALWAYS "lustre") so that clients and servers are not able to connect to the wrong system being tested. |
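As a rough sketch of the per-cluster naming idea, a short name could be derived from the master test node's hostname. `cluster_name` is a hypothetical helper, not part of the test framework, and the 8-character default is an assumption about the filesystem name length limit that should be checked against the Lustre version in use.

```python
import hashlib
import socket


def cluster_name(hostname=None, max_len=8):
    """Derive a short, stable name from a hostname (hypothetical helper).

    max_len=8 is an assumed limit on the name length; adjust as needed.
    """
    # Use only the short hostname so "node1" and "node1.example.com" agree.
    host = (hostname or socket.gethostname()).split(".")[0]
    digest = hashlib.sha1(host.encode()).hexdigest()
    # Prefix with a letter so the name never starts with a digit.
    return ("l" + digest)[:max_len]


print(cluster_name())
```

Hashing keeps the name within the length limit even for long hostnames, at the cost of being less readable than the hostname itself. How the resulting value would be passed to the test scripts is left open here.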
| Comment by Andreas Dilger [ 10/Jun/14 ] |
|
Closing this as a duplicate of TEI-1993. There are two possible fixes in the test infrastructure:
|