[LU-8404] When service node nid is incorrect, MDT log message missing bad nid Created: 15/Jul/16 Updated: 18/Aug/16 Resolved: 18/Aug/16 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.7.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Gary Hagensen (Inactive) | Assignee: | Hongchao Zhang |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 | ||||
| Rank (Obsolete): | 9223372036854775807 | ||||
| Description |
|
I entered a bad NID in the --servicenode argument. As a result, the MDT could not establish a connection and reported the message below in the log. Note the 0@<0:0> at the end, which I am guessing should be the servicenode failover arguments but is reported as "0:0" instead. If the message had reported the actual arguments, this would have been a much easier problem to figure out. Also in the MDS log: class_config_llog_handler()) MGC192.168.20.121@o2ib: cfg command failed: rc = -2 cmd=cf003 0:lustrefs-OST0010-osc-MDT0000 1:lustrefs-OST0010_UUID 2:0@<0:0> |
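For readers triaging similar messages, the sketch below splits the failed cfg command line quoted above into its return code, command code, and numbered arguments. It is only an illustration: `parse_cfg_failure` is a hypothetical helper, not part of Lustre, and mapping `rc = -2` to `ENOENT` assumes the usual kernel convention of returning negated errno values.

```python
import errno
import re

# The log line quoted in the description above.
LOG = ("class_config_llog_handler()) MGC192.168.20.121@o2ib: cfg command failed: "
       "rc = -2 cmd=cf003 0:lustrefs-OST0010-osc-MDT0000 "
       "1:lustrefs-OST0010_UUID 2:0@<0:0>")


def parse_cfg_failure(line):
    """Hypothetical helper: extract rc, cmd and the numbered args from the line."""
    m = re.search(r"rc = (-?\d+)\s+cmd=(\S+)\s+(.*)$", line)
    if m is None:
        return None
    rc, cmd, rest = m.groups()
    args = {}
    for token in rest.split():
        # Each argument looks like "<index>:<value>"; split on the first colon only,
        # so a value such as "0@<0:0>" keeps its own colon.
        index, sep, value = token.partition(":")
        if sep and index.isdigit():
            args[int(index)] = value
    rc = int(rc)
    return {
        "rc": rc,
        # Assumption: rc is a negated errno value, so -2 maps to ENOENT.
        "errno_name": errno.errorcode.get(-rc),
        "cmd": cmd,
        "args": args,
    }


if __name__ == "__main__":
    print(parse_cfg_failure(LOG))
```

Argument 2 is the field in question: it holds `0@<0:0>` rather than the NID that was actually passed to --servicenode.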
| Comments |
| Comment by Andreas Dilger [ 15/Jul/16 ] |
|
Gary, can you please provide more details about how the NID was incorrect, so that this can be debugged and fixed properly and a test case can be added for it? As it is, I have no idea how to reproduce this problem. |
| Comment by Gary Hagensen (Inactive) [ 15/Jul/16 ] |
|
The bad servicenode argument I used was the node's management interface @tcp0. So it was a valid IP, but not a valid NID, as it is not on an LNET network. The LNET interface was the IB interface. So I assume that to reproduce it, all you would need is a system (or VMs) with two interfaces on the HA pair, one for management and one for LNET, with the service nodes defined on the non-LNET interface. |
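The misconfiguration described above could in principle be caught before formatting the target. The sketch below checks whether the network part of a service node NID is one of the LNet networks configured on the node. It is not Lustre code: `normalize_net`, `nid_network`, and `servicenode_on_lnet` are hypothetical helpers, the address `10.0.0.5` is invented for illustration, and the list of local networks is assumed to be supplied by the caller. The only LNet rule it relies on is that a network name without a number, such as `tcp`, means network 0 (`tcp0`).

```python
import string


def normalize_net(net):
    """Return the LNet network name with an explicit number, e.g. 'o2ib' -> 'o2ib0'."""
    base = net.rstrip(string.digits)
    number = net[len(base):]
    if not base:
        raise ValueError(f"not an LNet network name: {net!r}")
    return f"{base}{int(number) if number else 0}"


def nid_network(nid):
    """Split '<address>@<network>' and return the normalized network part."""
    address, sep, net = nid.partition("@")
    if not sep or not address or not net:
        raise ValueError(f"not a NID of the form address@network: {nid!r}")
    return normalize_net(net)


def servicenode_on_lnet(nid, local_networks):
    """True if the NID's network is among the locally configured LNet networks."""
    configured = {normalize_net(n) for n in local_networks}
    return nid_network(nid) in configured


if __name__ == "__main__":
    # Hypothetical node whose only LNet network is the IB interface.
    local = ["o2ib"]
    for nid in ("192.168.20.121@o2ib", "10.0.0.5@tcp0"):
        status = "ok" if servicenode_on_lnet(nid, local) else "not on a configured LNet network"
        print(f"{nid}: {status}")
```

With these inputs, the management-interface NID on `tcp0` is flagged, because the node only has `o2ib0` configured. This matches the scenario in the comment above.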
| Comment by Hongchao Zhang [ 29/Jul/16 ] |
|
the patch http://review.whamcloud.com/19933/ from |
| Comment by Peter Jones [ 18/Aug/16 ] |
|
So this is confirmed as a duplicate of |