[LU-13316] frequent "Connection restored" messages without any prior errors/indicators Created: 02/Mar/20  Updated: 05/Mar/20  Resolved: 05/Mar/20

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.12.2
Fix Version/s: None

Type: Question/Request Priority: Minor
Reporter: James Erwin Assignee: WC Triage
Resolution: Duplicate Votes: 0
Labels: None

Issue Links:
Duplicate
duplicates LU-13098 supress connection restore message on... Resolved

 Description   

On a cluster of about 150 nodes, I often see messages like these on my Lustre servers:

 

[Mon Mar 2 14:30:24 2020] Lustre: zfs-OST0000: Connection restored to 4cfa4498-94fb-e2ca-912e-ab884d2e0570 (at 192.168.237.66@o2ib)
[Mon Mar 2 14:36:02 2020] Lustre: zfs-OST0002: Connection restored to 124ec1af-a48b-2cca-c990-bfdf282d5991 (at 192.168.237.120@o2ib)
[Mon Mar 2 14:54:32 2020] Lustre: zfs-OST0002: Connection restored to 30f267b9-2731-3c0f-54f2-4308757d83d9 (at 192.168.237.67@o2ib)
[Mon Mar 2 14:55:03 2020] Lustre: zfs-OST0000: Connection restored to b8172f2e-053f-a5d4-b8c5-28fbffea9691 (at 192.168.237.194@o2ib)

 

However, I do not see any messages about the connection being lost, on either the storage server or the affected client. I also do not see any functional issues with Lustre. Is this an actual problem I need to worry about? If so, how can I investigate further?
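
One way to start investigating is to tally the messages per client NID and per target, to see whether they cluster on particular nodes. The following is a minimal sketch in Python, assuming the log lines keep the format shown above. The names `RESTORED`, `count_restores`, and `SAMPLE` are illustrative and not part of Lustre.

```python
import re
from collections import Counter

# Matches lines such as:
# [Mon Mar 2 14:30:24 2020] Lustre: zfs-OST0000: Connection restored to <uuid> (at <nid>)
# The pattern is an assumption based on the four sample lines in this ticket.
RESTORED = re.compile(
    r"Lustre: (?P<target>\S+): Connection restored to "
    r"(?P<uuid>[0-9a-f-]+) \(at (?P<nid>[^)]+)\)"
)


def count_restores(lines):
    """Return (per-NID counts, per-target counts) of 'Connection restored' lines."""
    by_nid = Counter()
    by_target = Counter()
    for line in lines:
        match = RESTORED.search(line)
        if match:
            by_nid[match.group("nid")] += 1
            by_target[match.group("target")] += 1
    return by_nid, by_target


# The four log lines quoted in the description.
SAMPLE = [
    "[Mon Mar 2 14:30:24 2020] Lustre: zfs-OST0000: Connection restored to 4cfa4498-94fb-e2ca-912e-ab884d2e0570 (at 192.168.237.66@o2ib)",
    "[Mon Mar 2 14:36:02 2020] Lustre: zfs-OST0002: Connection restored to 124ec1af-a48b-2cca-c990-bfdf282d5991 (at 192.168.237.120@o2ib)",
    "[Mon Mar 2 14:54:32 2020] Lustre: zfs-OST0002: Connection restored to 30f267b9-2731-3c0f-54f2-4308757d83d9 (at 192.168.237.67@o2ib)",
    "[Mon Mar 2 14:55:03 2020] Lustre: zfs-OST0000: Connection restored to b8172f2e-053f-a5d4-b8c5-28fbffea9691 (at 192.168.237.194@o2ib)",
]

if __name__ == "__main__":
    by_nid, by_target = count_restores(SAMPLE)
    for nid, count in by_nid.most_common():
        print(f"{nid}\t{count}")
    for target, count in by_target.most_common():
        print(f"{target}\t{count}")
```

To run it against a real server, replace `SAMPLE` with the lines of a saved kernel log, for example the output of `dmesg -T` written to a file.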



 Comments   
Comment by Andreas Dilger [ 02/Mar/20 ]

This looks like a duplicate of LU-13098, which is fixed in the recent 2.12.4 release.
