Lustre / LU-13316

frequent "Connection restored" messages without any prior errors/indicators


Details

    • Question/Request
    • Resolution: Duplicate
    • Minor
    • None
    • Lustre 2.12.2
    • None

    Description

      On a cluster of about 150 nodes, I often see messages like these on my Lustre servers:

       

      [Mon Mar 2 14:30:24 2020] Lustre: zfs-OST0000: Connection restored to 4cfa4498-94fb-e2ca-912e-ab884d2e0570 (at 192.168.237.66@o2ib)
      [Mon Mar 2 14:36:02 2020] Lustre: zfs-OST0002: Connection restored to 124ec1af-a48b-2cca-c990-bfdf282d5991 (at 192.168.237.120@o2ib)
      [Mon Mar 2 14:54:32 2020] Lustre: zfs-OST0002: Connection restored to 30f267b9-2731-3c0f-54f2-4308757d83d9 (at 192.168.237.67@o2ib)
      [Mon Mar 2 14:55:03 2020] Lustre: zfs-OST0000: Connection restored to b8172f2e-053f-a5d4-b8c5-28fbffea9691 (at 192.168.237.194@o2ib)

       

      However, I do not see any messages about the connection being lost, on either the storage server or the affected client. I also do not see any functional issues with Lustre. Is this an actual problem I need to worry about? If so, how can I investigate further?
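
As a first step in investigating, it may help to tally how often each client NID and each OST shows up in these messages, to see whether the reconnects cluster on a few nodes or are spread evenly. The following is a minimal sketch, not a Lustre tool: the function name `count_restores` and the regular expression are illustrative assumptions based only on the message format quoted above, and other Lustre versions may format the line differently.

```python
import re
from collections import Counter

# Assumed pattern, derived only from the four log lines quoted above.
RESTORED = re.compile(
    r"Lustre: (?P<target>\S+): Connection restored to "
    r"(?P<uuid>[0-9a-f-]+) \(at (?P<nid>[^)]+)\)"
)


def count_restores(lines):
    """Return (per_nid, per_target) Counters of 'Connection restored' events."""
    per_nid, per_target = Counter(), Counter()
    for line in lines:
        match = RESTORED.search(line)
        if match:
            per_nid[match.group("nid")] += 1
            per_target[match.group("target")] += 1
    return per_nid, per_target


# The log lines from the description, used as sample input.
SAMPLE = [
    "[Mon Mar 2 14:30:24 2020] Lustre: zfs-OST0000: Connection restored to "
    "4cfa4498-94fb-e2ca-912e-ab884d2e0570 (at 192.168.237.66@o2ib)",
    "[Mon Mar 2 14:36:02 2020] Lustre: zfs-OST0002: Connection restored to "
    "124ec1af-a48b-2cca-c990-bfdf282d5991 (at 192.168.237.120@o2ib)",
    "[Mon Mar 2 14:54:32 2020] Lustre: zfs-OST0002: Connection restored to "
    "30f267b9-2731-3c0f-54f2-4308757d83d9 (at 192.168.237.67@o2ib)",
    "[Mon Mar 2 14:55:03 2020] Lustre: zfs-OST0000: Connection restored to "
    "b8172f2e-053f-a5d4-b8c5-28fbffea9691 (at 192.168.237.194@o2ib)",
]

if __name__ == "__main__":
    per_nid, per_target = count_restores(SAMPLE)
    for nid, n in per_nid.most_common():
        print(f"{nid}: {n}")
    for target, n in per_target.most_common():
        print(f"{target}: {n}")
```

In practice the input would be the server's kernel log rather than `SAMPLE`, for example the lines of a saved `dmesg -T` capture. A tally like this only shows where the messages concentrate; it does not by itself explain why the reconnects occur.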

            People

              wc-triage WC Triage
              jerwin James Erwin