  1. Lustre
  2. LU-5319

Support multiple slots per client in last_rcvd file



    Description

      While running the mdtest benchmark, I observed that file creation and unlink operations from a single Lustre client quickly saturate at around 8000 IOPS; the maximum is already reached with 4 parallel tasks.
      When using several Lustre mount points on a single client node, the file creation and unlink rates do scale with the number of tasks, up to the 16 cores of my client node.

      Looking at the code, it appears that most metadata operations are serialized by a mutex in the MDC layer.
      In the mdc_reint() routine, request posting is protected by mdc_get_rpc_lock() and mdc_put_rpc_lock(), where the lock is:
      struct client_obd -> struct mdc_rpc_lock *cl_rpc_lock -> struct mutex rpcl_mutex.
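
      The effect of that lock chain can be illustrated with a small user-space sketch. It is not the Lustre code: the `_sketch` types and functions are hypothetical stand-ins, a pthread mutex replaces the kernel mutex, and only the field names cl_rpc_lock and rpcl_mutex come from the description above. It shows that while one modifying request holds the lock, no second one can be posted from the same client.

```c
#include <pthread.h>

/* Hypothetical stand-ins; only the field names come from the ticket. */
struct mdc_rpc_lock_sketch {
	pthread_mutex_t rpcl_mutex;
};

struct client_obd_sketch {
	struct mdc_rpc_lock_sketch *cl_rpc_lock;
};

/* Taken before a modifying request is posted. */
static void sketch_get_rpc_lock(struct client_obd_sketch *cli)
{
	pthread_mutex_lock(&cli->cl_rpc_lock->rpcl_mutex);
}

/* Dropped once the request has completed. */
static void sketch_put_rpc_lock(struct client_obd_sketch *cli)
{
	pthread_mutex_unlock(&cli->cl_rpc_lock->rpcl_mutex);
}

/* Returns 1 if another modifying request could be posted right now,
 * 0 if one is already in flight and holds the lock. */
static int sketch_can_post(struct client_obd_sketch *cli)
{
	if (pthread_mutex_trylock(&cli->cl_rpc_lock->rpcl_mutex) != 0)
		return 0;
	pthread_mutex_unlock(&cli->cl_rpc_lock->rpcl_mutex);
	return 1;
}
```

      Because there is one such lock per client_obd, every task sharing a mount point queues behind it, which is consistent with the scaling seen when several mount points are used.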

      After an email discussion with Andreas Dilger, it appears that the limitation is actually on the MDS, since it cannot handle more than one filesystem-modifying RPC from a given client at a time. The MDT last_rcvd file has only one slot per client in which to save the reply state in case the reply is lost.

      The aim of this ticket is to implement multiple slots per client in the last_rcvd file so that several filesystem-modifying RPCs can be handled in parallel.
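
      As a rough illustration of the idea, the sketch below models a per-client record holding several reply slots. It is an assumption-laden toy, not the last_rcvd on-disk format or a proposed design: the structure names, the fields (xid, transno, result), the fixed slot limit and the helper functions are all hypothetical. Setting nslots to 1 reproduces the current behaviour, where a second modifying RPC cannot be recorded until the first slot is freed.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_SLOTS 8	/* arbitrary limit for this sketch */

/* Hypothetical saved state for one modifying RPC's reply. */
struct reply_slot_sketch {
	uint64_t xid;		/* request identifier */
	uint64_t transno;	/* transaction number */
	int32_t  result;	/* operation result */
	int      in_use;
};

/* Hypothetical per-client record; nslots == 1 models a single slot. */
struct client_record_sketch {
	size_t nslots;
	struct reply_slot_sketch slots[SKETCH_MAX_SLOTS];
};

/* Find the saved state for a request, e.g. when it is resent. */
static struct reply_slot_sketch *
slot_find(struct client_record_sketch *rec, uint64_t xid)
{
	for (size_t i = 0; i < rec->nslots; i++)
		if (rec->slots[i].in_use && rec->slots[i].xid == xid)
			return &rec->slots[i];
	return NULL;
}

/* Save reply state in a free slot; -1 means every slot is occupied. */
static int slot_store(struct client_record_sketch *rec, uint64_t xid,
		      uint64_t transno, int32_t result)
{
	for (size_t i = 0; i < rec->nslots; i++) {
		struct reply_slot_sketch *s = &rec->slots[i];

		if (!s->in_use) {
			s->xid = xid;
			s->transno = transno;
			s->result = result;
			s->in_use = 1;
			return 0;
		}
	}
	return -1;
}

/* Free the slot once the reply is known to have been received. */
static int slot_release(struct client_record_sketch *rec, uint64_t xid)
{
	struct reply_slot_sketch *s = slot_find(rec, xid);

	if (s == NULL)
		return -1;
	s->in_use = 0;
	return 0;
}
```

      With more than one slot, the number of modifying RPCs a client can have in flight is bounded by the slot count rather than by one, while each reply's state remains available for recovery until its slot is released.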

      Single-client metadata performance should improve significantly while still ensuring a safe recovery mechanism.

            People

              bzzz Alex Zhuravlev
              pichong Gregoire Pichon