  1. Lustre
  2. LU-10927

Improve efficiency of OSC LRU reclaim

Details

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Minor

    Description

      We have seen performance results that suggest LRU cache slot reclaim needs improvement.

      In one case, running a specific kind of application with more memory for the Lustre cache (64GB vs. 8GB) made the application's performance worse, which should not happen if cache reclaim works well.

      In some other benchmarks, performance is very good while the cache is not full, but it drops immediately once the cache becomes full. That should not happen either, because the benchmarks only do sequential reads and never read the data back.

      These results show that cache reclaim needs improvement, especially when a lot of memory is used for the cache. One possible cause is that, as memory grows, osc_lru_reclaim() needs more time to scan the whole list to free slots. Note that even if the caller of osc_lru_reclaim() needs only one slot, osc_lru_reclaim() will try to reclaim cl_max_pages_per_rpc slots. The caller of osc_lru_reclaim() is always the I/O thread, which means that reclaiming a batch of slots adds overhead directly to the application. That overhead grows as memory grows.
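
      To illustrate the batch-reclaim cost described above, here is a minimal toy model in C. It is not the Lustre implementation: struct toy_page, toy_lru_reclaim() and toy_lru_fill() are hypothetical names invented for this sketch, the LRU is a plain array, and the assumption that only some entries are reclaimable is made up to make the scan cost visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one cached page slot; not a Lustre structure. */
struct toy_page {
    bool busy;   /* true: page is in use and cannot be reclaimed now */
    bool cached; /* true: page still occupies an LRU slot */
};

/*
 * Walk the toy LRU from its head (index 0) and free up to `want` slots,
 * skipping busy or already-freed entries. Returns the number of slots
 * freed and stores the number of entries examined in *scanned.
 */
static size_t toy_lru_reclaim(struct toy_page *lru, size_t len,
                              size_t want, size_t *scanned)
{
    size_t freed = 0;
    size_t i;

    for (i = 0; i < len && freed < want; i++) {
        if (!lru[i].cached || lru[i].busy)
            continue;
        lru[i].cached = false;
        freed++;
    }
    *scanned = i;
    return freed;
}

/* Fill the toy LRU so that only every `stride`-th entry is reclaimable. */
static void toy_lru_fill(struct toy_page *lru, size_t len, size_t stride)
{
    for (size_t i = 0; i < len; i++) {
        lru[i].cached = true;
        lru[i].busy = (i % stride) != stride - 1;
    }
}
```

      In this toy model, with one entry in four reclaimable, a caller that asks for a single slot examines 4 entries, while a caller that asks for a batch of 256 (an assumed value standing in for cl_max_pages_per_rpc) examines 1024. The whole scan runs in the calling thread, which is the kind of overhead the description attributes to the I/O thread.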

      There may be a different cause, but we are testing a patch, and I am going to push it.

          People

            lixi_wc Li Xi
            lixi Li Xi (Inactive)