Add option for gnilnd to not reconnect after connection timeout

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.9.0
    • None
    • 3

    Description

      When routers time out a client connection during a catastrophic
      network disturbance such as a cabinet EPO (emergency power off),
      the file system may still be sending traffic that uses the router
      as the return path to the client. That traffic makes the router try
      to form a new connection before the network has quiesced. The result
      is multiple failed connection attempts, each of which must be put in
      purgatory because it could still connect in the future. This can
      cause the GART space to be consumed with registrations.

      To avoid this, we'll add an option to not reconnect after a
      connection timeout.
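
The failure mode described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the gnilnd implementation: the class `Router`, the flag `reconnect_after_timeout`, and the return strings are hypothetical names chosen for illustration, and the issue does not say what happens to traffic that arrives while reconnects are suppressed (here it is simply reported as not sent).

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    """Toy model of a router's view of one client peer (illustrative only)."""

    # Hypothetical option name; the real gnilnd tunable may be named differently.
    reconnect_after_timeout: bool = True
    network_quiesced: bool = False
    connected: bool = True
    timed_out: bool = False
    # Failed connection attempts that must be kept around because they
    # could still connect later; each one holds registrations.
    purgatory: list = field(default_factory=list)

    def on_conn_timeout(self):
        """The router times out the client connection."""
        self.connected = False
        self.timed_out = True

    def on_return_traffic(self, msg_id):
        """File system traffic arrives that must go back to the client."""
        if self.connected:
            return "sent"
        if self.timed_out and not self.reconnect_after_timeout:
            # Option enabled: do not start a new connection attempt.
            return "not_sent"
        if self.network_quiesced:
            self.connected = True
            self.timed_out = False
            return "sent"
        # Network still disturbed: the attempt fails and is parked.
        self.purgatory.append(msg_id)
        return "connect_failed"


def simulate(reconnect_after_timeout, messages=5):
    """Time out the peer, then deliver return traffic before the network quiesces."""
    router = Router(reconnect_after_timeout=reconnect_after_timeout)
    router.on_conn_timeout()
    results = [router.on_return_traffic(i) for i in range(messages)]
    return results, len(router.purgatory)
```

In this model, `simulate(True)` parks one failed attempt in purgatory per message, while `simulate(False)` leaves purgatory empty, which is the effect the proposed option is meant to have on registration usage.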

        Activity

          [LU-8429] Add option for gnilnd to not reconnect after connection timeout
          pjones Peter Jones added a comment -

          Landed for 2.9


          gerrit Gerrit Updater added a comment -

          Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/21459/
          Subject: LU-8429 gnilnd: Option to not reconnect after conn timeout
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: 99bc4ba277637656f6329a67158af6cee7070b48

          gerrit Gerrit Updater added a comment -

          Chris Horn (hornc@cray.com) uploaded a new patch: http://review.whamcloud.com/21459
          Subject: LU-8429 gnilnd: Option to not reconnect after conn timeout
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: e226eff34f1c9331fa73619e97851c57d7808b09

          People

            hornc Chris Horn