Details
-
Question/Request
-
Resolution: Duplicate
-
Minor
-
None
-
Lustre 2.12.2
-
None
Description
On a cluster of about 150 nodes, I often see messages like these on my Lustre servers:
[Mon Mar 2 14:30:24 2020] Lustre: zfs-OST0000: Connection restored to 4cfa4498-94fb-e2ca-912e-ab884d2e0570 (at 192.168.237.66@o2ib)
[Mon Mar 2 14:36:02 2020] Lustre: zfs-OST0002: Connection restored to 124ec1af-a48b-2cca-c990-bfdf282d5991 (at 192.168.237.120@o2ib)
[Mon Mar 2 14:54:32 2020] Lustre: zfs-OST0002: Connection restored to 30f267b9-2731-3c0f-54f2-4308757d83d9 (at 192.168.237.67@o2ib)
[Mon Mar 2 14:55:03 2020] Lustre: zfs-OST0000: Connection restored to b8172f2e-053f-a5d4-b8c5-28fbffea9691 (at 192.168.237.194@o2ib)
However, I do not see any messages about a connection being lost, on either the storage servers or the affected clients. I also do not see any functional issues with Lustre. Is this an actual problem I need to worry about? If so, how can I investigate further?
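As a possible first step in investigating, the "Connection restored" messages could be tallied per target and client NID to see whether a few clients account for most of them. The following is only a rough sketch: the regular expression is inferred from the four sample lines above and may not match other Lustre message variants, and the function name `tally_restores` and the `SAMPLE` list are made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the sample log lines in this ticket only (an assumption);
# other Lustre versions or message variants may be formatted differently.
RESTORED = re.compile(
    r"Lustre: (?P<target>\S+): Connection restored to "
    r"(?P<uuid>[0-9a-f-]+) \(at (?P<nid>[^)]+)\)"
)


def tally_restores(lines):
    """Count 'Connection restored' messages per (target, client NID) pair."""
    counts = Counter()
    for line in lines:
        match = RESTORED.search(line)
        if match:
            counts[(match.group("target"), match.group("nid"))] += 1
    return counts


# The four lines quoted in the description, used here as sample input.
SAMPLE = [
    "[Mon Mar 2 14:30:24 2020] Lustre: zfs-OST0000: Connection restored to 4cfa4498-94fb-e2ca-912e-ab884d2e0570 (at 192.168.237.66@o2ib)",
    "[Mon Mar 2 14:36:02 2020] Lustre: zfs-OST0002: Connection restored to 124ec1af-a48b-2cca-c990-bfdf282d5991 (at 192.168.237.120@o2ib)",
    "[Mon Mar 2 14:54:32 2020] Lustre: zfs-OST0002: Connection restored to 30f267b9-2731-3c0f-54f2-4308757d83d9 (at 192.168.237.67@o2ib)",
    "[Mon Mar 2 14:55:03 2020] Lustre: zfs-OST0000: Connection restored to b8172f2e-053f-a5d4-b8c5-28fbffea9691 (at 192.168.237.194@o2ib)",
]

if __name__ == "__main__":
    # Print the most frequently reconnecting (target, NID) pairs first.
    for (target, nid), n in tally_restores(SAMPLE).most_common():
        print(f"{target}\t{nid}\t{n}")
```

In practice the input would be the saved kernel log from each server rather than `SAMPLE`. If the counts are spread evenly across all clients, that points in a different direction than a handful of NIDs reconnecting repeatedly.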