[LU-13277] Potential deadlock in lnet_peer_discovery

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.14.0
    • Severity: 3

    Description

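      Potential deadlock when LNet is shutting down. In the excerpt below, lnet_peer_discovery() breaks out of its main loop with lnet_net_lock(LNET_LOCK_EX) still held, then tries to take the same lock again during cleanup: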

      static int lnet_peer_discovery(void *arg)
      ...
              for (;;) {
      ...
                      lnet_net_lock(LNET_LOCK_EX);
                      if (the_lnet.ln_dc_state == LNET_DC_STATE_STOPPING)
                              break;
      ...
              }
      
              CDEBUG(D_NET, "stopping\n");
              /*
               * Clean up before telling lnet_peer_discovery_stop() that
               * we're done. Use wake_up() below to somewhat reduce the
               * size of the thundering herd if there are multiple threads
               * waiting on discovery of a single peer.
               */
      
              /* Queue cleanup 1: stop all pending pings and pushes. */
              lnet_net_lock(LNET_LOCK_EX); /* <<< Deadlock */
      ...
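
The pattern in the excerpt can be reproduced in user space. The following is a minimal sketch, not the LNet code and not the merged patch: it assumes a POSIX error-checking mutex as a stand-in for lnet_net_lock(LNET_LOCK_EX), and the names net_lock, dc_state, net_lock_init and discovery_shutdown are hypothetical. An error-checking mutex reports EDEADLK on a relock by the owner instead of hanging, which makes the problem observable. The unlock_before_break flag shows one plausible way to avoid the double lock, namely releasing the lock before leaving the loop.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for the discovery state and the exclusive net lock. */
enum dc_state { DC_STATE_RUNNING, DC_STATE_STOPPING };

static pthread_mutex_t net_lock;
static enum dc_state dc_state = DC_STATE_STOPPING;

/* Use an error-checking mutex so a relock by the owner returns EDEADLK. */
static void net_lock_init(void)
{
    pthread_mutexattr_t attr;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&net_lock, &attr);
    assert(rc == 0);
    (void)rc;
    pthread_mutexattr_destroy(&attr);
}

/*
 * Mimics the control flow of the excerpt: lock, check for the stopping
 * state, break, then lock again for cleanup.
 * Returns the result of the second (cleanup) lock attempt.
 */
static int discovery_shutdown(int unlock_before_break)
{
    int rc;

    for (;;) {
        pthread_mutex_lock(&net_lock);
        if (dc_state == DC_STATE_STOPPING) {
            /* Assumed remedy: drop the lock before leaving the loop. */
            if (unlock_before_break)
                pthread_mutex_unlock(&net_lock);
            break;
        }
        pthread_mutex_unlock(&net_lock);
    }

    /* Cleanup path: takes the lock again, as in the excerpt. */
    rc = pthread_mutex_lock(&net_lock);

    /* Either way this thread holds the lock exactly once here. */
    pthread_mutex_unlock(&net_lock);
    return rc;
}
```

After net_lock_init(), discovery_shutdown(0) returns EDEADLK because the cleanup path relocks a mutex the thread already owns, while discovery_shutdown(1) returns 0. With a default (non-error-checking) mutex, the first call would typically block forever instead.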
      

        Activity

          Peter Jones (pjones) added a comment:

          Landed for 2.14

          Gerrit Updater (gerrit) added a comment:

          Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37675/
          Subject: LU-13277 lnet: Discovery thread can deadlock on shutdown
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: 82bb93410fc6f74e32ad74339ece5b4f62dc9967

          Gerrit Updater (gerrit) added a comment:

          Chris Horn (chris.horn@hpe.com) uploaded a new patch: https://review.whamcloud.com/37675
          Subject: LU-13277 lnet: Discovery thread can deadlock on shutdown
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: ea1eb3b2b7aa1f5f05498739a8a55da16bf9f4ac

          People

            Assignee: Chris Horn (hornc)
            Reporter: Chris Horn (hornc)
            Votes: 0
            Watchers: 4

            Dates

              Created:
              Updated:
              Resolved: