Lustre / LU-13708

lnet_notify can set route aliveness incorrectly

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None
    • Severity:
      3

      Description

      lnet_notify() modifies route aliveness in two ways:
      1. By setting the lp_alive field of the lnet_peer struct.
      2. By setting the lr_alive field of the lnet_route struct (via a call to lnet_set_route_aliveness()).

      In both cases, the aliveness value assigned is determined by a call
      to lnet_is_peer_ni_alive(), but that value only reflects the aliveness
      of a particular peer NI. A gateway may have multiple peer NIs, so the
      aliveness of a gateway peer (lp_alive) is not necessarily equivalent
      to the aliveness of one of its NIs. Furthermore, the lr_alive field
      is only used to determine route aliveness for path selection if
      discovery is disabled locally or on the gateway (see
      lnet_find_route_locked() and lnet_is_route_alive()).
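
      The mismatch described above can be illustrated with a small,
      self-contained model. This is only a sketch, not LNet code: the
      sketch_* types and functions are hypothetical, only the field name
      lp_alive comes from this ticket, and the rule "a gateway is
      reachable while at least one of its NIs is alive" is an assumption
      made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the LNet structures; only the
 * field name lp_alive is taken from the ticket. */
struct sketch_peer_ni {
	bool alive;                 /* aliveness of this one interface */
};

struct sketch_gateway {
	bool lp_alive;              /* aliveness of the gateway peer as a whole */
	struct sketch_peer_ni *nis; /* all NIs that belong to the gateway */
	size_t nr_nis;
};

/* Assumption for illustration: the gateway is reachable while at least
 * one of its NIs is alive. */
static bool sketch_gateway_reachable(const struct sketch_gateway *gw)
{
	assert(gw != NULL);
	for (size_t i = 0; i < gw->nr_nis; i++)
		if (gw->nis[i].alive)
			return true;
	return false;
}

/* Models the behaviour the ticket describes: the gateway-wide flag is
 * overwritten with the aliveness of the single NI named in the
 * notification. */
static void sketch_notify_current(struct sketch_gateway *gw, size_t ni_idx)
{
	assert(gw != NULL && ni_idx < gw->nr_nis);
	gw->lp_alive = gw->nis[ni_idx].alive;
}

/* Returns true when the stored flag disagrees with the per-NI view. */
static bool sketch_flag_is_stale(const struct sketch_gateway *gw)
{
	return gw->lp_alive != sketch_gateway_reachable(gw);
}
```

      With a two-NI gateway where only the first NI is down, a
      notification about that NI clears lp_alive in this model even
      though the second NI is still alive, so sketch_flag_is_stale()
      reports a disagreement.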

      In general, we should not set lp_alive based on an lnet_notify()
      call, and we should only set lr_alive if discovery is disabled. For
      lr_alive specifically, we should only set it for those routes that
      have the peer NI as a next-hop.

      An exception to the above exists when the reset argument to
      lnet_notify() is set. The gnilnd uses this flag in its calls to
      lnet_notify() because gnilnd receives out-of-band notifications of
      node up and down events. Thus, when gnilnd calls lnet_notify() we
      actually know whether the gateway peer is up or down and we can set
      lp_alive and lr_alive appropriately.
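
      One possible reading of the proposed behaviour, including the
      reset exception, is sketched below. It is not a patch against
      LNet: the demo_* types, the next_hop_ni identifier, and the
      discovery_disabled flag are hypothetical simplifications, and
      updating every route of the gateway in the reset case is an
      assumption about what "appropriately" means here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; only lp_alive and lr_alive are
 * names taken from the ticket. */
struct demo_route {
	int next_hop_ni;         /* id of the peer NI used as the next hop */
	bool lr_alive;
};

struct demo_gateway {
	bool lp_alive;
	bool discovery_disabled; /* disabled locally or on the gateway */
	struct demo_route *routes;
	size_t nr_routes;
};

/* Sketch of the proposed rules for a notification about one peer NI. */
static void demo_notify(struct demo_gateway *gw, int ni_id,
			bool alive, bool reset)
{
	assert(gw != NULL);

	if (reset) {
		/* Out-of-band node up/down event (the gnilnd case): the
		 * state of the whole gateway is known, so both the peer
		 * flag and all of its routes are updated (assumption). */
		gw->lp_alive = alive;
		for (size_t i = 0; i < gw->nr_routes; i++)
			gw->routes[i].lr_alive = alive;
		return;
	}

	/* Ordinary notification: lp_alive is never touched, and lr_alive
	 * is only maintained when discovery is disabled. */
	if (!gw->discovery_disabled)
		return;

	/* Only routes that use the notified NI as their next hop change. */
	for (size_t i = 0; i < gw->nr_routes; i++)
		if (gw->routes[i].next_hop_ni == ni_id)
			gw->routes[i].lr_alive = alive;
}
```

      In this model a plain "NI 1 is down" notification is a no-op while
      discovery is enabled, marks only the routes through NI 1 as down
      once discovery is disabled, and takes down lp_alive and every
      route only when reset is set.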

            People

            • Assignee:
              wc-triage WC Triage
              Reporter:
              hornc Chris Horn
