[LU-9368] snapshot testing: ssh_exchange_identification: Connection closed by remote host Created: 20/Apr/17 Updated: 04/Dec/17 Resolved: 04/Dec/17 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.10.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Saurabh Tandan (Inactive) | Assignee: | nasf (Inactive) |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Environment: |
1 Client, 1 OST, 1/2/4/8/16 MDTs |
||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Scalability testing for 1/2/4/8/16 MDTs
[root@eagle-52vm2 ~]# lctl snapshot_create -c TEST5 -F lustre -n MDT16
ssh_exchange_identification: Connection closed by remote host
ssh_exchange_identification: Connection closed by remote host
[root@eagle-52vm2 ~]# lctl snapshot_list -F lustre
filesystem_name: lustre
snapshot_name: MDT16
create_time: Thu Apr 20 05:28:21 2017
modify_time: Thu Apr 20 05:28:21 2017
comment: TEST5
snapshot_fsname: 5594a65
status: not mount
filesystem_name: lustre
snapshot_name: MDT8
modify_time: Thu Apr 20 05:14:13 2017
create_time: Thu Apr 20 05:14:13 2017
snapshot_fsname: 58754c5
comment: TEST4
status: not mount
[root@eagle-52vm2 ~]# lctl snapshot_destroy -f -F lustre -n MDT8
ssh_exchange_identification: Connection closed by remote host
ssh_exchange_identification: Connection closed by remote host
[root@eagle-52vm2 ~]# lctl snapshot_list -F lustre
filesystem_name: lustre
snapshot_name: MDT16
create_time: Thu Apr 20 05:28:21 2017
modify_time: Thu Apr 20 05:28:21 2017
comment: TEST5
snapshot_fsname: 5594a65
status: not mount
[root@eagle-52vm2 ~]# lctl snapshot_destroy -F lustre -n MDT16
ssh_exchange_identification: Connection closed by remote host
ssh_exchange_identification: Connection closed by remote host
ssh_exchange_identification: Connection closed by remote host
Miss snapshot piece on the MDT0003. Use '-f' option if want to destroy it by force.
Can't destroy the snapshot MDT16
[root@eagle-52vm2 ~]# lctl snapshot_destroy -f -F lustre -n MDT16
ssh_exchange_identification: Connection closed by remote host
ssh_exchange_identification: Connection closed by remote host
[root@eagle-52vm2 ~]# lctl snapshot_list -F lustre
The snapshot is created and destroyed when the corresponding command is run, but each command still prints the "Connection closed by remote host" message. This is seen only with the 16-MDT configuration, not with 1/2/4/8 MDTs. |
| Comments |
| Comment by Joseph Gmitter (Inactive) [ 20/Apr/17 ] |
|
Hi Fan Yong, Can you please look into this snapshot issue? Thanks. |
| Comment by nasf (Inactive) [ 20/Apr/17 ] |
|
What is the output of "lctl snapshot_list -F lustre -n MDT16 --detail"? |
| Comment by nasf (Inactive) [ 20/Apr/17 ] |
|
Would you please check whether you can "ssh" from the current node to all other Lustre servers without a password? Thanks! |
| Comment by Saurabh Tandan (Inactive) [ 20/Apr/17 ] |
|
Yes, I can ssh from the MDS to the OSS without a password. Is the reverse also required? I have not set up passwordless ssh from the OSS to the MDS.
[root@eagle-52vm2 ~]# lctl snapshot_create -c TEST5 -F lustre -n MDT16
ssh_exchange_identification: Connection closed by remote host
ssh_exchange_identification: Connection closed by remote host
ssh_exchange_identification: Connection closed by remote host
ssh_exchange_identification: Connection closed by remote host
ssh_exchange_identification: Connection closed by remote host
cannot create snapshot 'lustre-mdt2/mdt2@MDT16': dataset already exists
cannot create snapshot 'lustre-mdt11/mdt11@MDT16': dataset already exists
ssh_exchange_identification: Connection closed by remote host
ssh_exchange_identification: Connection closed by remote host
could not find any snapshots to destroy; check snapshot names.
could not find any snapshots to destroy; check snapshot names.
Can't create the snapshot MDT16
[root@eagle-52vm2 ~]# lctl snapshot_list -F lustre -n MDT16 --detail
cannot open 'lustre-mdt1/mdt1@MDT16': dataset does not exist
Can't list the snapshot MDT16 |
| Comment by nasf (Inactive) [ 20/Apr/17 ] |
|
It only requires passwordless ssh from the current node to all Lustre servers (MGS/MDS/OSS). |
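A quick way to confirm this requirement is to try a non-interactive login to every server from the node where lctl is run. The following is a minimal sketch, not part of the Lustre tooling: the function name `check_passwordless_ssh`, the `SSH_CMD` override, and the host names in the usage line are all hypothetical. It relies on the OpenSSH `BatchMode=yes` option, which makes ssh fail instead of prompting for a password.

```shell
#!/bin/sh
# Report which hosts reject non-interactive (key-based) ssh login.
# SSH_CMD is a hypothetical override so the check can be dry-run with a stub.
SSH_CMD=${SSH_CMD:-ssh -o BatchMode=yes -o ConnectTimeout=10}

check_passwordless_ssh() {
    failed=0
    for host in "$@"; do
        # Run the no-op command "true" remotely; only the exit status matters.
        if $SSH_CMD "$host" true >/dev/null 2>&1; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=$((failed + 1))
        fi
    done
    # Return non-zero if any host could not be reached without a password.
    [ "$failed" -eq 0 ]
}

# Example (host names are placeholders for the MGS/MDS/OSS nodes):
# check_passwordless_ssh mgs-node mds-node oss-node
```

The check runs sequentially, so it only tests the key setup and does not exercise any per-host connection limits.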
| Comment by Saurabh Tandan (Inactive) [ 20/Apr/17 ] |
|
Tried with another name. Got same result:
[root@eagle-52vm2 ~]# lctl snapshot_create -c TEST5 -F lustre -n MDT16_2
ssh_exchange_identification: Connection closed by remote host
ssh_exchange_identification: Connection closed by remote host
ssh_exchange_identification: Connection closed by remote host
ssh_exchange_identification: Connection closed by remote host
ssh_exchange_identification: Connection closed by remote host
ssh_exchange_identification: Connection closed by remote host
could not find any snapshots to destroy; check snapshot names.
Can't create the snapshot MDT16_2
[root@eagle-52vm2 ~]# lctl snapshot_list -F lustre -n MDT16_2 --detail
cannot open 'lustre-mdt1/mdt1@MDT16_2': dataset does not exist
Can't list the snapshot MDT16_2 |
| Comment by nasf (Inactive) [ 20/Apr/17 ] |
|
Edit /etc/ssh/sshd_config, enable MaxStartups and set it to a larger value, such as 128, then restart "sshd". |
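For reference, MaxStartups in sshd_config limits the number of concurrent unauthenticated connections sshd will accept, and connections beyond that limit are dropped. The change suggested above could be applied roughly as follows. This is a sketch under assumptions: the helper name `set_max_startups` is hypothetical, the service name `sshd` may differ by distribution, and the change has to be made on each server node that receives the ssh connections.

```shell
#!/bin/sh
# Set MaxStartups in an sshd_config-style file.
# Usage: set_max_startups <config-file> <value>
set_max_startups() {
    conf=$1
    value=$2
    if grep -Eq '^[[:space:]]*#?[[:space:]]*MaxStartups([[:space:]]|$)' "$conf"; then
        # Replace the existing line, whether it is commented out or not.
        # A backup copy is left next to the file with a .bak suffix.
        sed -i.bak -E "s/^[[:space:]]*#?[[:space:]]*MaxStartups([[:space:]].*)?\$/MaxStartups $value/" "$conf"
    else
        # No existing entry: append one at the end of the file.
        printf 'MaxStartups %s\n' "$value" >> "$conf"
    fi
}

# Example, to be run as root on each Lustre server:
# set_max_startups /etc/ssh/sshd_config 128
# sshd -t && systemctl restart sshd      # validate the config, then restart
# sshd -T | grep -i maxstartups          # show the value sshd actually uses
```

If the file ends with a Match block, an appended line would land inside that block, so in that case the setting should be placed above the first Match line by hand.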
| Comment by Saurabh Tandan (Inactive) [ 21/Apr/17 ] |
|
Did as above but still got the following: [root@eagle-52vm2 ~]# lctl snapshot_create -c TEST5 -F lustre -n MDT16 ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host 
ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host ssh_exchange_identification: Connection closed by remote host Can't create the snapshot MDT16 [root@eagle-52vm2 ~]# |
| Comment by nasf (Inactive) [ 27/Apr/17 ] |
|
It seems to be related to the 'ssh' configuration. Would you please verify how many plain ssh connections (without lsnapshot) you can establish to the Lustre server nodes in parallel? Thanks! |
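One way to run the parallel-connection check asked for here is to start N ssh sessions to a server at the same time and count how many succeed. The following is only a sketch: the function name `count_parallel_ssh`, the `SSH_CMD` override, and the host name in the usage line are hypothetical and not part of lsnapshot or lctl.

```shell
#!/bin/sh
# Open N ssh connections to HOST at the same time and print how many succeed.
# SSH_CMD is a hypothetical override so the check can be dry-run with a stub.
SSH_CMD=${SSH_CMD:-ssh -o BatchMode=yes -o ConnectTimeout=10}

count_parallel_ssh() {
    host=$1
    n=$2
    pids=""
    i=0
    while [ "$i" -lt "$n" ]; do
        # Each connection runs the no-op command "true" on the remote side.
        $SSH_CMD "$host" true >/dev/null 2>&1 &
        pids="$pids $!"
        i=$((i + 1))
    done
    ok=0
    for pid in $pids; do
        # wait returns the exit status of that particular background ssh.
        if wait "$pid"; then
            ok=$((ok + 1))
        fi
    done
    echo "$ok"
}

# Example (host name is a placeholder for one of the Lustre servers):
# count_parallel_ssh mds-node 32
```

If the printed count is lower than N, repeating the run with smaller values of N gives a rough idea of where the server starts refusing connections.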
| Comment by nasf (Inactive) [ 18/May/17 ] |
|
Have you restarted the sshd service after changing "MaxStartups"? Thanks! |
| Comment by nasf (Inactive) [ 13/Jun/17 ] |
|
Saurabh, any feedback on this? Thanks! |