Lustre / LU-12293

Memory leak after router checker packet processing

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.12.1
    • None
    • 3

    Description

      If net_monitor_thr is stopped while a router checker packet is waiting for retry,
      the resources for that packet are not released.

      As a workaround, we changed lnet_monitor_thr_stop() to wait for the router checker sends to complete (the timeout is 10 sec x 2) and only then purge the retry packets.
      Could you advise on how this bug should be fixed properly?

      diff --git a/lnet/lnet/lib-move.c b/lnet/lnet/lib-move.c
      index 5e990d9..3b16d89 100644
      --- a/lnet/lnet/lib-move.c
      +++ b/lnet/lnet/lib-move.c
      @@ -3682,6 +3682,14 @@ void lnet_monitor_thr_stop(void)
              /* tell the monitor thread that we're shutting down */
              wake_up(&the_lnet.ln_mt_waitq);
       
      +       /* wait tx completion for router checker */
      +       if (atomic_read(&the_lnet.ln_routers_nsends)) {
      +               set_current_state(TASK_UNINTERRUPTIBLE);
      +               schedule_timeout(cfs_time_seconds(lnet_get_lnd_timeout() * 2));
      +       }
      +       /* purge resend messages */
      +       lnet_clean_resendqs();
      +
              /* block until monitor thread signals that it's done */
              down(&the_lnet.ln_mt_signal);
              LASSERT(the_lnet.ln_mt_state == LNET_MT_STATE_SHUTDOWN);
      @@ -3691,7 +3699,6 @@ void lnet_monitor_thr_stop(void)
              lnet_rsp_tracker_clean();
              lnet_clean_local_ni_recoveryq();
              lnet_clean_peer_ni_recoveryq();
      -       lnet_clean_resendqs();
              rc = LNetEQFree(the_lnet.ln_mt_eqh);
              LASSERT(rc == 0);
              return;
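
      The ordering problem the workaround addresses can be illustrated with a small user-space model. This is only a sketch of one plausible reading of the report, not LNet code: struct model, start_send(), fail_in_flight(), purge_resendq() and the two stop_* functions are hypothetical names invented for illustration. The in_flight list loosely plays the role of the sends counted by ln_routers_nsends, and the model assumes that a send which completes with an error is parked on the resend queue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a message; not an LNet structure. */
struct msg {
	struct msg *next;
};

/* Hypothetical model state; all fields are assumptions for illustration. */
struct model {
	struct msg *in_flight;	/* sends that have not completed yet */
	struct msg *resendq;	/* messages parked for retry */
	int allocated;		/* live allocations, used to detect a leak */
};

/* Allocate a message and mark it as in flight. Returns 0 or -1. */
static int start_send(struct model *m)
{
	struct msg *msg;

	assert(m != NULL);
	msg = malloc(sizeof(*msg));
	if (msg == NULL)
		return -1;
	msg->next = m->in_flight;
	m->in_flight = msg;
	m->allocated++;
	return 0;
}

/* Every in-flight send completes with an error and is queued for retry. */
static void fail_in_flight(struct model *m)
{
	while (m->in_flight != NULL) {
		struct msg *msg = m->in_flight;

		m->in_flight = msg->next;
		msg->next = m->resendq;
		m->resendq = msg;
	}
}

/* Free everything currently parked on the resend queue. */
static void purge_resendq(struct model *m)
{
	while (m->resendq != NULL) {
		struct msg *msg = m->resendq;

		m->resendq = msg->next;
		free(msg);
		m->allocated--;
	}
}

/* Purge runs before the late completion: the message is never freed. */
static int stop_purge_first(struct model *m)
{
	purge_resendq(m);
	fail_in_flight(m);
	return m->allocated;	/* number of leaked messages */
}

/* Wait for completions first, then purge: nothing is left behind. */
static int stop_wait_then_purge(struct model *m)
{
	fail_in_flight(m);
	purge_resendq(m);
	return m->allocated;	/* number of leaked messages */
}
```

      With one send outstanding, stop_purge_first() leaves one message allocated, while stop_wait_then_purge() leaves none. The model waits unconditionally, whereas the patch above bounds the wait with a timeout, so a send that completes after the timeout could presumably still be missed.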
      
      

          People

            ashehata Amir Shehata (Inactive)
            takamura Tatsushi Takamura