  1. Lustre
  2. LU-14206

Router ping timeouts don't mark routes down if DD is disabled

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.15.0
    • None
    • None
    • 3

    Description

      Discovery pings are used to determine the health of gateways and
      their associated routes. Ping replies from gateways with dynamic
      discovery (DD) disabled (or when DD is disabled locally) are handled
      in a dedicated routine, lnet_router_discovery_ping_reply(). However,
      this function and the related code do not handle the case where a
      discovery ping hits the response tracker timeout and is unlinked by
      the monitor thread. In that case an UNLINK event is generated and
      lnet_router_discovery_ping_reply() is never called, so the routes
      are not marked down. For gateways with DD enabled (and DD enabled
      locally), this case is handled in lnet_router_discovery_complete():
      if discovery failed, lp_dc_error is set and all routes for the
      gateway are marked down. We can simply extend this logic to
      gateways with DD disabled (or with DD disabled locally).
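
      The proposed change can be illustrated with a small, self-contained
      model. This is only a sketch under stated assumptions, not the LNet
      implementation: the types and functions below (struct sketch_gateway,
      sketch_ping_event(), sketch_discovery_complete()) are hypothetical
      stand-ins, and only the dc_error field is modeled after lp_dc_error.
      It shows a ping timeout (UNLINK event) recording an error that the
      completion path checks regardless of whether DD is enabled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real LNet definitions. */
enum sketch_ping_event { SKETCH_EVENT_REPLY, SKETCH_EVENT_UNLINK };

#define SKETCH_MAX_ROUTES 4

struct sketch_route {
	bool alive;
};

struct sketch_gateway {
	bool dd_enabled;   /* true only if DD is enabled on both sides */
	int dc_error;      /* modeled after lp_dc_error; 0 means no error */
	size_t nroutes;
	struct sketch_route routes[SKETCH_MAX_ROUTES];
};

/* Set the alive flag on every route that uses this gateway. */
static void sketch_set_routes(struct sketch_gateway *gw, bool alive)
{
	for (size_t i = 0; i < gw->nroutes; i++)
		gw->routes[i].alive = alive;
}

/*
 * Record the outcome of a discovery ping. An UNLINK event stands for the
 * response tracker timeout described above; it stores an error instead of
 * being silently dropped.
 */
static void sketch_ping_event(struct sketch_gateway *gw,
			      enum sketch_ping_event ev)
{
	assert(gw != NULL);
	gw->dc_error = (ev == SKETCH_EVENT_UNLINK) ? -ETIMEDOUT : 0;
}

/*
 * Completion path shared by DD-enabled and DD-disabled gateways: the error
 * check does not depend on gw->dd_enabled. Marking routes up on success is
 * a simplification made for this sketch.
 */
static void sketch_discovery_complete(struct sketch_gateway *gw)
{
	assert(gw != NULL);
	sketch_set_routes(gw, gw->dc_error == 0);
}
```

      With this model, a gateway with dd_enabled == false that receives
      SKETCH_EVENT_UNLINK ends up with all of its routes marked down, which
      is the behavior the description asks for.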

          People

            hornc Chris Horn