[LU-3970] Add procfs interface for clearing lustre caches in parallel

Details

    • Improvement
    • Resolution: Won't Fix
    • Minor

    Description

      Cray experienced slowness clearing the Lustre caches after job termination. This slowness is a result of clearing the caches for each namespace in series. Lustre should provide a high-level procfs interface which clears Lustre caches across namespaces in parallel and returns when all caches are cleared.

      The interface should be at /proc/fs/lustre/ldlm/drop_caches. When written to, it must clear all Lustre caches in parallel and return only when finished.

      It must have the same effect as the following, but in parallel:

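      # Clear the LRU of every OSC namespace, one at a time.
      for LRU in /proc/fs/lustre/ldlm/namespaces/*osc*/lru_size; do
          echo clear > "$LRU"
      done

      # Clear the LRU of every MDC namespace, one at a time.
      for LRU in /proc/fs/lustre/ldlm/namespaces/*mdc*/lru_size; do
          echo clear > "$LRU"
      done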
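
      For comparison, here is a minimal sketch of a user-space parallel equivalent of the loops above, assuming plain POSIX sh. Each write is started in the background and wait blocks until all of them have returned. The function name clear_lru_parallel and its optional directory argument are hypothetical, introduced only so the sketch can be tried against a scratch directory. This is not the requested drop_caches interface, nor the patch submitted for review.

```shell
#!/bin/sh
# Hypothetical helper (not part of Lustre): clear every osc/mdc namespace
# LRU concurrently instead of one after another.
# $1 is an assumed optional override of the namespaces directory.
clear_lru_parallel() {
    ns_root="${1:-/proc/fs/lustre/ldlm/namespaces}"
    for lru in "$ns_root"/*osc*/lru_size "$ns_root"/*mdc*/lru_size; do
        # Skip the literal pattern left behind when a glob matches nothing.
        [ -e "$lru" ] || continue
        # Run each write in the background so slow namespaces overlap.
        echo clear > "$lru" &
    done
    # Return only after every background write has finished.
    wait
}
```

      Calling clear_lru_parallel with no argument targets /proc/fs/lustre/ldlm/namespaces. Note that wait without operands does not report whether an individual write failed, so a caller that needs per-namespace error handling would have to collect the job IDs and wait on each one.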
      

          Activity

            pjones Peter Jones added a comment -

            ok Cory

            spitzcor Cory Spitz added a comment -

            http://review.whamcloud.com/#/c/7783 is abandoned in favor of the approach taken with LU-5134.

            This bug ought to be closed now. Jian, can you please make it so?

            haasken Ryan Haasken added a comment -

            Can somebody please mark this bug related to LU-5134?

            LU-5134 will resolve this issue by allowing lctl set_param to spawn threads in user space when setting lru_size=clear.

            haasken Ryan Haasken added a comment -

            I have submitted a patch to Gerrit: http://review.whamcloud.com/#/c/7783/

            haasken Ryan Haasken added a comment -

            That would work as well. However, Cray's ALPS (Application Level Placement Scheduler) team requested that Lustre provide a higher-level interface which clears all the caches in parallel. Do you feel that this is an appropriate enhancement? I've already tested a patch, but I am still learning how to submit it to Gerrit for review.

            green Oleg Drokin added a comment -

            I wonder why doing the echoes in parallel won't work?

            haasken Ryan Haasken added a comment -

            I am working on uploading a patch to Gerrit.

            People

              yujian Jian Yu
              haasken Ryan Haasken