  1. Lustre
  2. LU-1500

How to change the failover MDS?

Details

    • Improvement
    • Resolution: Cannot Reproduce
    • Minor
    • None
    • Lustre 1.8.7
    • None
    • 10706

    Description

      I have two MDS nodes configured as an HA pair. I formatted the MDT with this command:
      mkfs.lustre --fsname=hydx --mdt --reformat --mgs --failnode=12.12.12.6 /dev/drbd0
      and the Lustre client mounts the file system like this:
      mount -t lustre 12.12.12.5@o2ib:12.12.12.6@o2ib:/hydx /lustre/

      However, on the Lustre client the MDC import lists a failover NID on tcp instead of o2ib, while the MGC import looks normal:

      [root@polaris-mgmt lustre]# lctl get_param mdc.*.import
      mdc.bak-MDT0000-mdc-ffff810611a9c000.import=
      import:
          name: bak-MDT0000-mdc-ffff810611a9c000
          target: bak-MDT0000_UUID
          state: FULL
          connect_flags: [version, inode_bit_locks, join_file, getattr_by_fid, no_oh_for_devices, early_lock_cancel, adaptive_timeouts, lru_resize, version_recovery, pools]
          import_flags: [replayable, pingable]
          connection:
              failover_nids: [12.12.12.16@o2ib, 12.12.12.15@tcp]
              current_connection: 12.12.12.16@o2ib
              connection_attempts: 1
              generation: 1
              in-progress_invalidations: 0
          rpcs:
              inflight: 0
              unregistering: 0
              timeouts: 0
              avg_waittime: 1381 usec
          service_estimates:
              services: 1 sec
              network: 1 sec
          transactions:
              last_replay: 0
              peer_committed: 21475297287
              last_checked: 21475297287

      [root@polaris-mgmt bak-MDT0000-mdc-ffff810611a9c000]# lctl get_param mgc.*.import
      mgc.MGC12.12.12.16@o2ib.import=
      import:
          name: MGC12.12.12.16@o2ib
          target: MGS
          state: FULL
          connect_flags: [version, adaptive_timeouts]
          import_flags: [pingable, recon_bk]
          connection:
              failover_nids: [12.12.12.16@o2ib, 12.12.12.15@o2ib]
              current_connection: 12.12.12.16@o2ib
              connection_attempts: 1
              generation: 1
              in-progress_invalidations: 0
          rpcs:
              inflight: 0
              unregistering: 0
              timeouts: 0
              avg_waittime: 0 <NULL>
          service_estimates:
              services: 1 sec
              network: 1 sec
          transactions:
              last_replay: 0
              peer_committed: 0
              last_checked: 0
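
To spot this kind of mismatch quickly, the `failover_nids` line of an import can be filtered for entries that are not on the expected network. The following is only a sketch: `list_foreign_nids` is a hypothetical helper name, and the match is on the exact network suffix, so a network named `o2ib1` would count as foreign when checking for `o2ib`.

```shell
#!/bin/sh
# Hypothetical helper: read "lctl get_param" import output on stdin and
# print every failover NID that is NOT on the given LNET network.
list_foreign_nids() {
    net=$1
    # Keep only the bracketed list from the failover_nids line,
    # split it on commas, trim whitespace, drop NIDs on the expected net.
    sed -n 's/^[[:space:]]*failover_nids:[[:space:]]*\[\(.*\)\][[:space:]]*$/\1/p' |
        tr ',' '\n' |
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |
        grep -v "@${net}\$"
}

# Demo with the failover_nids line from the MDC import shown above.
printf '      failover_nids: [12.12.12.16@o2ib, 12.12.12.15@tcp]\n' |
    list_foreign_nids o2ib
# prints: 12.12.12.15@tcp
```

On a live client the same function could be fed from `lctl get_param -n mdc.*.import` instead of the sample line.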

      This may be caused by incorrect format parameters. How can I change failover_nids using tunefs.lustre?
      Can you help me?
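
One possible cause, offered as an assumption rather than a confirmed diagnosis: the `--failnode` value in the mkfs.lustre command above has no `@o2ib` suffix, and a NID written without a network type is generally treated as tcp. The sketch below builds a fully qualified NID and prints a candidate `tunefs.lustre` command as a dry run instead of executing it. `qualify_nid` is a hypothetical helper, and the address and device are taken from the commands above.

```shell
#!/bin/sh
# Hypothetical helper: append an LNET network to a bare address,
# leaving an already qualified NID untouched.
qualify_nid() {
    nid=$1
    net=${2:-o2ib}
    case "$nid" in
        *@*) printf '%s\n' "$nid" ;;
        *)   printf '%s@%s\n' "$nid" "$net" ;;
    esac
}

FAILNODE=$(qualify_nid 12.12.12.6 o2ib)
DEVICE=/dev/drbd0

# Dry run only: print the command so it can be reviewed before use.
echo tunefs.lustre --erase-params --failnode="$FAILNODE" --writeconf "$DEVICE"
# prints: tunefs.lustre --erase-params --failnode=12.12.12.6@o2ib --writeconf /dev/drbd0
```

Note that `--erase-params` discards all stored parameters, so any others in use would have to be given again, and `--writeconf` regenerates the configuration logs, which normally means unmounting clients and targets first and running it on the other targets as well. The Lustre manual for the installed version should be checked before running anything like this on a production file system.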

          People

            wc-triage WC Triage
            chenlianghua chenlianghua
