Lustre / LU-13277

Potential deadlock in lnet_peer_discovery

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.14.0

    Description

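      Potential deadlock in lnet_peer_discovery() when LNet is shutting down. The discovery loop breaks out with LNET_LOCK_EX still held, and the cleanup code that follows then tries to take the same lock again: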

      static int lnet_peer_discovery(void *arg)
      ...
              for (;;) {
      ...
                      lnet_net_lock(LNET_LOCK_EX);
                      if (the_lnet.ln_dc_state == LNET_DC_STATE_STOPPING)
                              break;
      ...
              }
      
              CDEBUG(D_NET, "stopping\n");
              /*
               * Clean up before telling lnet_peer_discovery_stop() that
               * we're done. Use wake_up() below to somewhat reduce the
               * size of the thundering herd if there are multiple threads
               * waiting on discovery of a single peer.
               */
      
              /* Queue cleanup 1: stop all pending pings and pushes. */
              lnet_net_lock(LNET_LOCK_EX); /* <<< Deadlock: lock is still held from the break above */
      ...
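
The pattern in the excerpt above can be reproduced outside the kernel with a small userspace model. This is only a sketch under stated assumptions: model_lock, model_lock_acquire(), model_lock_release(), discovery_buggy() and discovery_unlock_first() are hypothetical names invented for illustration and are not LNet APIs. The model lock reports an error where a real non-recursive exclusive lock would hang, and the shutdown state is passed in as a parameter instead of being read from the_lnet.ln_dc_state. The second function shows one possible way to avoid the double acquisition; it is not necessarily the change that was landed for this ticket.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a non-recursive exclusive lock
 * (the role LNET_LOCK_EX plays in the excerpt). */
struct model_lock {
    bool held;
};

enum model_state { MODEL_RUNNING, MODEL_STOPPING };

/* Returns 0 on success, -1 where a real lock would self-deadlock. */
int model_lock_acquire(struct model_lock *l)
{
    if (l->held)
        return -1;
    l->held = true;
    return 0;
}

void model_lock_release(struct model_lock *l)
{
    l->held = false;
}

/* Same shape as the reported code: break with the lock held,
 * then acquire it again in the cleanup path. */
int discovery_buggy(struct model_lock *l, enum model_state state)
{
    for (;;) {
        if (model_lock_acquire(l) != 0)
            return -1;
        if (state == MODEL_STOPPING)
            break;                  /* leaves the loop with the lock held */
        model_lock_release(l);
        state = MODEL_STOPPING;     /* pretend a shutdown request arrived */
    }

    /* Cleanup: second acquisition of a lock we already hold. */
    if (model_lock_acquire(l) != 0)
        return -1;                  /* the real code would hang here */
    model_lock_release(l);
    return 0;
}

/* One possible remedy (an assumption, not the confirmed fix):
 * drop the lock before leaving the loop. */
int discovery_unlock_first(struct model_lock *l, enum model_state state)
{
    for (;;) {
        if (model_lock_acquire(l) != 0)
            return -1;
        if (state == MODEL_STOPPING) {
            model_lock_release(l);  /* release before break */
            break;
        }
        model_lock_release(l);
        state = MODEL_STOPPING;
    }

    if (model_lock_acquire(l) != 0)
        return -1;
    model_lock_release(l);          /* cleanup done, lock released */
    return 0;
}
```

Calling discovery_buggy() on a fresh model_lock returns -1 at the second acquisition, while discovery_unlock_first() returns 0 and leaves the lock released. An equivalent alternative would be to keep the lock held across the break and skip the second acquisition in the cleanup path.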
      

          People

            hornc Chris Horn