Details
-
Bug
-
Resolution: Fixed
-
Critical
Description
There is an MDT-MDT connection problem after failover.
Here is the config for MDT0000 (replace_nids was not applied to it):
#10 (224)marker 17 (flags=0x01, v2.5.1.0) lustre-MDT0000 ‘add osp’ Wed Aug 30 10:34:36 2017-
#11 (088)add_uuid nid=192.168.0.112@tcp(0x20000c0a80070) 0: 1:192.168.0.112@tcp
#12 (144)attach 0:lustre-MDT0000-osp-MDT0001 1:osp 2:lustre-MDT0001-mdtlov_UUID
#13 (152)setup 0:lustre-MDT0000-osp-MDT0001 1:lustre-MDT0000_UUID 2:192.168.0.112@tcp
#14 (088)add_uuid nid=192.168.0.113@tcp(0x20000c0a80071) 0: 1:192.168.0.113@tcp
#15 (120)add_conn 0:lustre-MDT0000-osp-MDT0001 1:192.168.0.113@tcp
#16 (136)modify_mdc_tgts add 0:lustre-MDT0001-mdtlov 1:lustre-MDT0000_UUID 2:0 3:1
#17 (224)END marker 17 (flags
And here is the MDT0001 config after replace_nids:
#19 (224)marker 20 (flags=0x01, v2.5.1.0) lustre-MDT0001 ‘add osp’ Wed Aug 30 10:34:36 2017-
#20 (088)add_uuid nid=192.168.0.113@tcp(0x20000c0a80071) 0: 1:192.168.0.113@tcp
#21 (144)attach 0:lustre-MDT0001-osp-MDT0000 1:osp 2:lustre-MDT0000-mdtlov_UUID
#22 (152)setup 0:lustre-MDT0001-osp-MDT0000 1:lustre-MDT0001_UUID 2:192.168.0.113@tcp
#23 (136)modify_mdc_tgts add 0:lustre-MDT0000-mdtlov 1:lustre-MDT0001_UUID 2:1 3:1
#24 (224)END marker 20 (flags=0x02, v2.5.1.0) lustre-MDT0001 ‘add osp’ Wed Aug 30 10:34:36 2017-
replace_nids does not add the failover NID (add_uuid) or the add_conn record to the config. This is why the OSP connection cannot be established after failover.
The solution is to add an option to replace_nids that adds the failover record.
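The difference between the two configs can be checked mechanically. The following is only a minimal sketch, assuming a config dump with one record per line in the format quoted above; the helper `failover_nids` and the sample constants are hypothetical names for illustration and are not part of any Lustre tool. It lists, for each device that has a `setup` record, the NIDs added by `add_conn` records.

```python
import re

# Matches one record in the quoted format, e.g. "#15 (120)add_conn 0:dev 1:nid".
# Groups: record index, record length, command name, remaining arguments.
RECORD_RE = re.compile(r"#(\d+) \((\d+)\)(\w+)\s*(.*)")

# Sample records copied from the two configs in this report (markers omitted).
MDT0000_CONFIG = """\
#11 (088)add_uuid nid=192.168.0.112@tcp(0x20000c0a80070) 0: 1:192.168.0.112@tcp
#12 (144)attach 0:lustre-MDT0000-osp-MDT0001 1:osp 2:lustre-MDT0001-mdtlov_UUID
#13 (152)setup 0:lustre-MDT0000-osp-MDT0001 1:lustre-MDT0000_UUID 2:192.168.0.112@tcp
#14 (088)add_uuid nid=192.168.0.113@tcp(0x20000c0a80071) 0: 1:192.168.0.113@tcp
#15 (120)add_conn 0:lustre-MDT0000-osp-MDT0001 1:192.168.0.113@tcp
"""

MDT0001_CONFIG = """\
#20 (088)add_uuid nid=192.168.0.113@tcp(0x20000c0a80071) 0: 1:192.168.0.113@tcp
#21 (144)attach 0:lustre-MDT0001-osp-MDT0000 1:osp 2:lustre-MDT0000-mdtlov_UUID
#22 (152)setup 0:lustre-MDT0001-osp-MDT0000 1:lustre-MDT0001_UUID 2:192.168.0.113@tcp
"""


def failover_nids(config_text):
    """Map each device with a `setup` record to the NIDs of its `add_conn` records."""
    nids = {}
    for line in config_text.splitlines():
        match = RECORD_RE.match(line.strip())
        if not match:
            continue
        cmd, args = match.group(3), match.group(4)
        # Arguments look like "0:device 1:value"; keep only "index:value" tokens.
        fields = dict(tok.split(":", 1) for tok in args.split() if ":" in tok)
        if cmd == "setup":
            nids.setdefault(fields["0"], [])
        elif cmd == "add_conn":
            nids.setdefault(fields["0"], []).append(fields["1"])
    return nids


if __name__ == "__main__":
    for name, text in (("MDT0000", MDT0000_CONFIG), ("MDT0001", MDT0001_CONFIG)):
        for device, conns in failover_nids(text).items():
            status = ", ".join(conns) if conns else "no failover connection"
            print(f"{name}: {device}: {status}")
```

An empty list for a device means its config holds only the primary NID from `setup`, which is the situation reported here for lustre-MDT0001-osp-MDT0000.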
Issue Links
- is related to LUDOC-523 add proper documentation for replace_nids command (Open)