
snapshot testing: ssh_exchange_identification: Connection closed by remote host

Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.10.0
    • Labels: None
    • Environment: 1 Client, 1 OST, 1/2/4/8/16 MDTs
      master, build# 3550
    • Severity: 3

    Description

      Scalability testing of snapshots with 1/2/4/8/16 MDTs.
      Everything works fine up to 8 MDTs.
      After moving to 16 MDTs, the following messages appeared.
      Creating a snapshot for 16 MDTs:

      [root@eagle-52vm2 ~]# lctl snapshot_create -c TEST5 -F lustre -n MDT16
      ssh_exchange_identification: Connection closed by remote host
      ssh_exchange_identification: Connection closed by remote host
      [root@eagle-52vm2 ~]# lctl snapshot_list -F lustre
      
      filesystem_name: lustre
      snapshot_name: MDT16
      create_time: Thu Apr 20 05:28:21 2017
      modify_time: Thu Apr 20 05:28:21 2017
      comment: TEST5 
      snapshot_fsname: 5594a65 
      status: not mount
      
      filesystem_name: lustre
      snapshot_name: MDT8
      modify_time: Thu Apr 20 05:14:13 2017
      create_time: Thu Apr 20 05:14:13 2017
      snapshot_fsname: 58754c5 
      comment: TEST4 
      status: not mount
      [root@eagle-52vm2 ~]# lctl snapshot_destroy -f -F lustre -n MDT8
      ssh_exchange_identification: Connection closed by remote host
      ssh_exchange_identification: Connection closed by remote host
      [root@eagle-52vm2 ~]# lctl snapshot_list -F lustre
      
      filesystem_name: lustre
      snapshot_name: MDT16
      create_time: Thu Apr 20 05:28:21 2017
      modify_time: Thu Apr 20 05:28:21 2017
      comment: TEST5 
      snapshot_fsname: 5594a65 
      status: not mount
      [root@eagle-52vm2 ~]# lctl snapshot_destroy -F lustre -n MDT16
      ssh_exchange_identification: Connection closed by remote host
      ssh_exchange_identification: Connection closed by remote host
      ssh_exchange_identification: Connection closed by remote host
      Miss snapshot piece on the MDT0003. Use '-f' option if want to destroy it by force.
      Can't destroy the snapshot MDT16
      [root@eagle-52vm2 ~]# lctl snapshot_destroy -f -F lustre -n MDT16
      ssh_exchange_identification: Connection closed by remote host
      ssh_exchange_identification: Connection closed by remote host
      [root@eagle-52vm2 ~]# lctl snapshot_list -F lustre
      

      The snapshot is created and destroyed when the corresponding commands are run, but each command still prints the "Connection closed by remote host" message. This is only seen with the 16 MDT configuration, not with 1/2/4/8 MDTs.

        Activity

          [LU-9368] snapshot testing: ssh_exchange_identification: Connection closed by remote host

          Saurabh,

          Any feedback for this? Thanks!

          yong.fan nasf (Inactive) added a comment

          Have you restarted the sshd service after changing "MaxStartups"?
          Would you please verify how many plain ssh connections (without lsnapshot) you can establish with the Lustre server nodes in parallel?

          Thanks!

          yong.fan nasf (Inactive) added a comment
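
One way to run the parallel check requested above is sketched here. It is only an illustration: the helper name `count_parallel_ssh` and the `SSH_CMD` override are hypothetical, and the host name in the usage line is a placeholder. The function starts N ssh sessions to one server at the same time and prints how many of them succeeded.

```shell
# Hypothetical helper: open N ssh sessions to HOST at once and print how many succeed.
# SSH_CMD may be overridden for a dry run; by default the real ssh client is used.
count_parallel_ssh() {
    host=$1
    n=$2
    ssh_cmd=${SSH_CMD:-ssh}
    pids=""
    i=0
    while [ "$i" -lt "$n" ]; do
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        $ssh_cmd -o BatchMode=yes -o ConnectTimeout=10 "$host" true >/dev/null 2>&1 &
        pids="$pids $!"
        i=$((i + 1))
    done
    ok=0
    for pid in $pids; do
        # wait returns the exit status of that background ssh.
        if wait "$pid"; then
            ok=$((ok + 1))
        fi
    done
    echo "$ok"
}

# Example (placeholder host name): count_parallel_ssh mds-node 32
```

If the printed count drops below N only for larger values of N, the limit is likely on the sshd side rather than in the snapshot tooling.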

          It seems to be related to the 'ssh' configuration. Would you please verify how many plain ssh connections (without lsnapshot) you can establish with the Lustre server nodes in parallel? Thanks!

          yong.fan nasf (Inactive) added a comment

          Did as above but still got the following:

          [root@eagle-52vm2 ~]# lctl snapshot_create -c TEST5 -F lustre -n MDT16
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          Can't create the snapshot MDT16
          [root@eagle-52vm2 ~]# 
          
          standan Saurabh Tandan (Inactive) added a comment

          Edit /etc/ssh/sshd_config, enable MaxStartups and set it to a larger value, such as 128, then restart "sshd".

          yong.fan nasf (Inactive) added a comment
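
The change suggested above could be scripted roughly as follows. This is a sketch under assumptions: the helper name `set_max_startups` is hypothetical, it relies on GNU `sed -i`, and it assumes the config file has no `Match` blocks (a directive appended after a `Match` line would land inside that block). For context, OpenSSH's default is `MaxStartups 10:30:100`, so sshd may begin dropping unauthenticated connections once more than 10 are pending at the same time.

```shell
# Hypothetical helper: enable MaxStartups in an sshd_config-style file.
# Usage: set_max_startups FILE VALUE
set_max_startups() {
    file=$1
    value=$2
    if grep -Eq '^[[:space:]]*#?[[:space:]]*MaxStartups([[:space:]]|$)' "$file"; then
        # Rewrite the existing (possibly commented-out) directive in place.
        sed -i -E "s/^[[:space:]]*#?[[:space:]]*MaxStartups([[:space:]].*)?\$/MaxStartups $value/" "$file"
    else
        # No directive present: append one at the end of the file.
        printf 'MaxStartups %s\n' "$value" >> "$file"
    fi
}

# On each Lustre server, as root, something like:
#   set_max_startups /etc/ssh/sshd_config 128
#   sshd -t && systemctl restart sshd     # validate the config, then restart
#   sshd -T | grep -i maxstartups         # show the effective value
```

Running `sshd -T` after the restart is a quick way to confirm that the new value is actually in effect.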

          Tried with another name and got the same result:

          [root@eagle-52vm2 ~]# lctl snapshot_create -c TEST5 -F lustre -n MDT16_2
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          could not find any snapshots to destroy; check snapshot names.
          Can't create the snapshot MDT16_2
          [root@eagle-52vm2 ~]# lctl snapshot_list -F lustre -n MDT16_2 --detail
          cannot open 'lustre-mdt1/mdt1@MDT16_2': dataset does not exist
          Can't list the snapshot MDT16_2
          
          standan Saurabh Tandan (Inactive) added a comment

          It only requires passwordless ssh from the current node to all Lustre servers (MGS/MDS/OSS).
          Would you please try again with another name, such as DMT16_2?

          yong.fan nasf (Inactive) added a comment

          Yes, I can ssh from the MDS to the OSS without a password. Is the reverse also required? I have not set up passwordless ssh from the OSS to the MDS.
          I had already destroyed the MDT16 snapshot, so to give you the output of "lctl snapshot_list -F lustre -n MDT16 --detail" I have to re-create the snapshot. But when I re-create it, I get the following:

          [root@eagle-52vm2 ~]# lctl snapshot_create -c TEST5 -F lustre -n MDT16
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          cannot create snapshot 'lustre-mdt2/mdt2@MDT16': dataset already exists
          cannot create snapshot 'lustre-mdt11/mdt11@MDT16': dataset already exists
          ssh_exchange_identification: Connection closed by remote host
          ssh_exchange_identification: Connection closed by remote host
          could not find any snapshots to destroy; check snapshot names.
          could not find any snapshots to destroy; check snapshot names.
          Can't create the snapshot MDT16
          [root@eagle-52vm2 ~]# lctl snapshot_list -F lustre -n MDT16 --detail
          cannot open 'lustre-mdt1/mdt1@MDT16': dataset does not exist
          Can't list the snapshot MDT16
          
          standan Saurabh Tandan (Inactive) added a comment (edited)

          Would you please check whether you can "ssh" from the current node to all other Lustre servers without a password? Thanks!

          yong.fan nasf (Inactive) added a comment
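
A minimal sketch of the passwordless-ssh check asked for above. The helper name `check_passwordless_ssh`, the `SSH_CMD` override, and the host names in the usage line are hypothetical. It tries a non-interactive login to each listed server and prints the ones that fail.

```shell
# Hypothetical helper: print every host that refuses non-interactive ssh.
# Returns non-zero if at least one host failed.
check_passwordless_ssh() {
    ssh_cmd=${SSH_CMD:-ssh}
    failed=0
    for host in "$@"; do
        # BatchMode=yes disables password prompts, so a missing key shows up as a failure.
        if ! $ssh_cmd -n -o BatchMode=yes -o ConnectTimeout=10 "$host" true >/dev/null 2>&1; then
            echo "$host"
            failed=1
        fi
    done
    return "$failed"
}

# Example (placeholder host names), run from the node that issues lctl snapshot_*:
#   check_passwordless_ssh mgs-node mds-node oss-node
```

An empty output with exit status 0 means every listed server accepted key-based login from the current node.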

          What is the output of "lctl snapshot_list -F lustre -n MDT16 --detail"?

          yong.fan nasf (Inactive) added a comment

          People

            yong.fan nasf (Inactive)
            standan Saurabh Tandan (Inactive)