Lustre / LU-1239

cascading client evictions


Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.3.0
    • None
    • None
    • 3
    • 4565

    Description

      Recently I found the following scenario, which may lead to cascading client reconnects, lock timeouts, evictions, etc.:

      1. The MDS is overloaded with enqueues; they consume all the threads on the MDS_REQUEST portal.
      2. Some RPC timed out on one client, which led to its reconnection. This client has some locks to cancel, and the MDS is waiting for them.
      3. The client sends MDS_CONNECT, but there is no free thread to handle it.
      4. Other clients are waiting for their enqueue completions and try to ping the MDS to check whether it is still alive. However, PING is also sent to the MDS_REQUEST portal; despite being a high-priority RPC, it has no special handler (srv_hpreq_handler == NULL), and therefore a 2nd thread is not reserved for high-priority RPCs on such services:

      static int ptlrpc_server_allow_normal(struct ptlrpc_service *svc, int force)
      {
      #ifndef __KERNEL__
              if (1) /* always allow to handle normal request for liblustre */
                      return 1;
      #endif
              if (force ||
                  svc->srv_n_active_reqs < svc->srv_threads_running - 2)
                      return 1;
      
              if (svc->srv_n_active_reqs >= svc->srv_threads_running - 1)
                      return 0;
      
              return svc->srv_n_active_hpreq > 0 || svc->srv_hpreq_handler == NULL;
      }
      

      5. There is no thread left to handle PINGs, so other clients' RPCs time out.
      6. Once one LDLM lock times out, the enqueue completes and an MDS_CONNECT may be taken into handling. However, this client is likely to have an enqueue RPC in processing on the MDS, so the connect gets EBUSY and will be retried only after some delay, whereas other clients try to re-connect and consume MDS threads with enqueues again. This is being discussed in LU-7, but it is not the main issue here.
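
      The starvation in steps 1-5 can be reproduced with a standalone model of the allow-normal policy quoted above (struct and field names are copied from the snippet for readability; this is a simplified sketch, not the actual server code):

      ```c
      #include <assert.h>
      #include <stdio.h>

      /* Minimal stand-in for the service state used by the policy above. */
      struct svc {
              int srv_threads_running;  /* threads started on this service  */
              int srv_n_active_reqs;    /* requests currently being handled */
              int srv_n_active_hpreq;   /* high-priority requests in flight */
              int has_hpreq_handler;    /* models srv_hpreq_handler != NULL */
      };

      /* Same decision logic as ptlrpc_server_allow_normal() above. */
      static int allow_normal(struct svc *svc, int force)
      {
              if (force ||
                  svc->srv_n_active_reqs < svc->srv_threads_running - 2)
                      return 1;
              if (svc->srv_n_active_reqs >= svc->srv_threads_running - 1)
                      return 0;
              /* exactly two threads left: give one away unless an HP
               * handler is registered or HP requests are in flight */
              return svc->srv_n_active_hpreq > 0 || !svc->has_hpreq_handler;
      }

      int main(void)
      {
              /* MDS_REQUEST-like service with no HP handler, flooded by
               * enqueues: 62 of 64 threads are already busy. */
              struct svc mds = { .srv_threads_running = 64,
                                 .srv_n_active_reqs   = 62,
                                 .srv_n_active_hpreq  = 0,
                                 .has_hpreq_handler   = 0 };

              /* Only two threads remain, yet the next normal enqueue is
               * still admitted because srv_hpreq_handler == NULL... */
              assert(allow_normal(&mds, 0) == 1);

              /* ...and once it is taken, PINGs (queued as normal requests
               * on such a service) can no longer be picked up. */
              mds.srv_n_active_reqs = 63;
              assert(allow_normal(&mds, 0) == 0);

              printf("PINGs starve behind the enqueue flood\n");
              return 0;
      }
      ```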

      Proposed fixes:
      1) reserve an extra thread on services which expect PINGs to come;
      2) make CONNECTs high-priority RPCs;
      3) LU-7 to address (6).
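
      The intended effect of fixes (1) and (2) can be sketched with the same simplified model (field names from the snippet above; the registered-handler flag is a stand-in, not the real fix):

      ```c
      #include <assert.h>
      #include <stdio.h>

      /* Same simplified service state as in the model above. */
      struct svc {
              int srv_threads_running;
              int srv_n_active_reqs;
              int srv_n_active_hpreq;
              int has_hpreq_handler;   /* models srv_hpreq_handler != NULL */
      };

      static int allow_normal(struct svc *svc, int force)
      {
              if (force ||
                  svc->srv_n_active_reqs < svc->srv_threads_running - 2)
                      return 1;
              if (svc->srv_n_active_reqs >= svc->srv_threads_running - 1)
                      return 0;
              return svc->srv_n_active_hpreq > 0 || !svc->has_hpreq_handler;
      }

      int main(void)
      {
              /* With an HP handler registered on the service, the policy
               * refuses the next normal request once only two threads
               * remain, keeping them free for high-priority traffic. */
              struct svc mds = { .srv_threads_running = 64,
                                 .srv_n_active_reqs   = 62,
                                 .srv_n_active_hpreq  = 0,
                                 .has_hpreq_handler   = 1 };
              assert(allow_normal(&mds, 0) == 0);

              /* So a PING or CONNECT flagged as high priority still finds
               * a free thread instead of timing out behind the enqueues. */
              printf("threads reserved for HP PING/CONNECT traffic\n");
              return 0;
      }
      ```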


            People

              Assignee: WC Triage (wc-triage)
              Reporter: Vitaly Fertman (vitaly_fertman)
              Votes: 0
              Watchers: 6
