LU-13001

check_routers_before_use causes LNet to hang indefinitely if any router is down

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.14.0
    • Affects Version/s: Lustre 2.13.0, Lustre 2.14.0
    • Labels: None
    • Severity: 3

    Description

      Historically, check_routers_before_use caused LNet
      initialization to pause until every router had been pinged once.

      This behavior changed in commit
      fe17e9b8370affe063769b880f02b9190584baaa from LU-11298. LNet now
      waits indefinitely until discovery completes on all routers.
      This is a problem: if even one router is down, discovery on that
      router never completes, so LNet initialization stalls forever.
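
      The failure mode can be illustrated with a small standalone
      simulation. This is only a sketch: sim_router, sim_poll and
      sim_wait_for_routers are hypothetical names invented for this
      example. They are not LNet structures or functions, and the bounded
      wait shown here is not claimed to be the fix that landed for this
      ticket. A router with alive == false models a router that is down,
      so discovery on it never completes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a router entry; NOT an LNet structure. */
struct sim_router {
	bool alive;        /* false models a router that is down */
	int  polls_needed; /* polls until discovery completes, if alive */
	bool discovered;   /* set once simulated discovery completes */
};

/*
 * One polling pass: advance simulated discovery on every live router.
 * Returns true only when every router has been discovered.
 */
static bool sim_poll(struct sim_router *r, size_t n)
{
	bool all_done = true;

	for (size_t i = 0; i < n; i++) {
		if (r[i].alive && !r[i].discovered && --r[i].polls_needed <= 0)
			r[i].discovered = true;
		if (!r[i].discovered)
			all_done = false;
	}
	return all_done;
}

/*
 * Bounded wait: give up after max_polls passes.
 * Returns true if all routers were discovered, false on giving up.
 *
 * The unbounded form, `while (!sim_poll(r, n)) ;`, models the hang
 * described above: it never returns if any router has alive == false.
 */
static bool sim_wait_for_routers(struct sim_router *r, size_t n,
				 int max_polls)
{
	assert(r != NULL || n == 0);

	for (int i = 0; i < max_polls; i++) {
		if (sim_poll(r, n))
			return true;
	}
	return false;
}
```

      With two live routers the bounded wait returns true once both
      complete. With one router down it returns false after max_polls
      passes, whereas the unbounded loop would spin forever.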

          People

            hornc Chris Horn