[LU-1500] how to change failover mds? Created: 11/Jun/12  Updated: 06/Nov/13  Resolved: 06/Nov/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 1.8.7
Fix Version/s: None

Type: Improvement Priority: Minor
Reporter: chenlianghua Assignee: WC Triage
Resolution: Cannot Reproduce Votes: 0
Labels: None

Rank (Obsolete): 10706

 Description   

I have two MDS nodes configured as an HA pair. I formatted the MDT with this command:
mkfs.lustre --fsname=hydx --mdt --reformat --mgs --failnode=12.12.12.6 /dev/drbd0
and the Lustre clients mount the file system like this:
mount -t lustre 12.12.12.5@o2ib:12.12.12.6@o2ib:/hydx /lustre/

However, on a Lustre client the MDC failover_nids list contains a tcp NID instead of an o2ib one, while the MGC list is correct:

[root@polaris-mgmt lustre]# lctl get_param mdc.*.import
mdc.bak-MDT0000-mdc-ffff810611a9c000.import=
import:
name: bak-MDT0000-mdc-ffff810611a9c000
target: bak-MDT0000_UUID
state: FULL
connect_flags: [version, inode_bit_locks, join_file, getattr_by_fid, no_oh_for_devices, early_lock_cancel, adaptive_timeouts, lru_resize, version_recovery, pools]
import_flags: [replayable, pingable]
connection:
failover_nids: [12.12.12.16@o2ib, 12.12.12.15@tcp]
current_connection: 12.12.12.16@o2ib
connection_attempts: 1
generation: 1
in-progress_invalidations: 0
rpcs:
inflight: 0
unregistering: 0
timeouts: 0
avg_waittime: 1381 usec
service_estimates:
services: 1 sec
network: 1 sec
transactions:
last_replay: 0
peer_committed: 21475297287
last_checked: 21475297287

[root@polaris-mgmt bak-MDT0000-mdc-ffff810611a9c000]# lctl get_param mgc.*.import
mgc.MGC12.12.12.16@o2ib.import=
import:
name: MGC12.12.12.16@o2ib
target: MGS
state: FULL
connect_flags: [version, adaptive_timeouts]
import_flags: [pingable, recon_bk]
connection:
failover_nids: [12.12.12.16@o2ib, 12.12.12.15@o2ib]
current_connection: 12.12.12.16@o2ib
connection_attempts: 1
generation: 1
in-progress_invalidations: 0
rpcs:
inflight: 0
unregistering: 0
timeouts: 0
avg_waittime: 0 <NULL>
service_estimates:
services: 1 sec
network: 1 sec
transactions:
last_replay: 0
peer_committed: 0
last_checked: 0
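
One way to spot this kind of mismatch is to compare the network types within each failover_nids line. The sketch below is a minimal, illustrative check and not part of Lustre: the helper names nid_networks and check_import are hypothetical, and it assumes the import output uses the single-line "failover_nids: [...]" layout shown above and describes one import at a time. A likely cause of the tcp entry, assuming LNet's usual behaviour of treating a NID without an @network suffix as tcp, is that --failnode=12.12.12.6 was given without @o2ib.

```shell
#!/bin/sh
# Hypothetical helpers (not Lustre tools): read `lctl get_param ... import`
# output on stdin and report which LNet networks appear in failover_nids.

# Print each distinct network name (the part after '@'), one per line.
nid_networks() {
    sed -n 's/^[[:space:]]*failover_nids:[[:space:]]*\[\(.*\)\][[:space:]]*$/\1/p' |
        tr ',' '\n' |
        sed -n 's/^.*@\([^[:space:]]*\).*$/\1/p' |
        sort -u
}

# Return non-zero when more than one network type is present.
check_import() {
    nets=$(nid_networks)
    count=$(printf '%s\n' "$nets" | grep -c . || true)
    if [ "$count" -gt 1 ]; then
        # $nets is left unquoted on purpose so the names print on one line.
        echo "mixed networks:" $nets
        return 1
    fi
    echo "ok: ${nets:-none}"
}

# Demo using the MDC line from the report above.
printf 'failover_nids: [12.12.12.16@o2ib, 12.12.12.15@tcp]\n' | check_import || true
# prints "mixed networks: o2ib tcp"
```

On a live client the same check could be fed directly, for example with lctl get_param mdc.*.import | check_import.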

This may be caused by incorrect format parameters. How can I change failover_nids with tunefs.lustre?
Can you help me?



 Comments   
Comment by Andreas Dilger [ 06/Nov/13 ]

In newer versions of Lustre, the "lctl replace_nids" command can change the NID. In older versions of Lustre it is necessary to run "tunefs.lustre --writeconf" to rewrite the whole configuration. Please see the manual for details.
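
For the older-version route, the writeconf sequence might look roughly like the sketch below. It is a print-only outline under stated assumptions, not a tested procedure: plan_failnode_fix is a hypothetical helper that only echoes commands, the device /dev/drbd0 and mount point /lustre are taken from the report, and the failover NID is written with an explicit @o2ib suffix on the assumption that the missing suffix caused the tcp entry. Note that --erase-params drops every previously stored parameter, so any other settings would need to be given again; the manual remains the authority on the exact order.

```shell
#!/bin/sh
# Print-only sketch: nothing here touches a device or a mount.
MDT_DEV=/dev/drbd0           # MGS/MDT device from the report
FAILNODE=12.12.12.6@o2ib     # failover partner, network given explicitly (assumption)

# Hypothetical helper: echo the steps for rewriting the failover NID.
plan_failnode_fix() {
    dev=$1
    nid=$2
    # Refuse a NID with no @network suffix, the suspected original mistake.
    case $nid in
        *@*) ;;
        *) echo "error: NID '$nid' has no @network suffix" >&2; return 1 ;;
    esac
    echo "# 1. Unmount all clients, then all OSTs, then the MDT"
    echo "umount /lustre"
    echo "# 2. Preview, then rewrite the MDT parameters and regenerate the config logs"
    echo "tunefs.lustre --dryrun --erase-params --failnode=$nid --writeconf $dev"
    echo "tunefs.lustre --erase-params --failnode=$nid --writeconf $dev"
    echo "# 3. On every OST: tunefs.lustre --writeconf <ost device>"
    echo "# 4. Remount the MGS/MDT first, then the OSTs, then the clients"
}

plan_failnode_fix "$MDT_DEV" "$FAILNODE"
```

After remounting, lctl get_param mdc.*.import on a client should show whether both failover_nids entries now use o2ib.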
