"
One thing we have to worry about is delaying writeback to the MDT/OST for too long, as that can increase memory pressure significantly, and we will have wasted tens of seconds not sending RPCs that could have written GBs of dirty data during that time. I think, as much as possible, it makes sense to have a "write early, free late" kind of policy, like the one we have for dirty file data, so that we don't waste bandwidth/IOPS just waiting until we are short of memory.
"
Can we tune the kernel writeback parameters to achieve this goal?
Linux Writeback Settings
Variable | Description
dirty_background_ratio | As a percentage of total memory, the threshold of dirty pages at which the flusher threads begin background writeback of dirty data.
dirty_expire_centisecs | In hundredths of a second, how old dirty data must be before it is written out the next time a flusher thread wakes to perform periodic writeback.
dirty_ratio | As a percentage of total memory, the threshold of dirty pages at which a process generating dirty data must itself begin writing it out.
dirty_writeback_centisecs | In hundredths of a second, how often a flusher thread wakes up to write dirty data back to disk.
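For example, to approximate a "write early" policy, the background and periodic writeback thresholds can be lowered via /etc/sysctl.conf. The values below are illustrative starting points for experimentation, not tested recommendations:

# Start background writeback at 1% dirty pages (kernel default: 10)
vm.dirty_background_ratio = 1
# Throttle writing processes at 10% dirty pages (kernel default: 20)
vm.dirty_ratio = 10
# Treat dirty data as expired after 5s = 500 centisecs (default: 3000)
vm.dirty_expire_centisecs = 500
# Wake the flusher threads every 1s = 100 centisecs (default: 500)
vm.dirty_writeback_centisecs = 100

Lowering dirty_background_ratio and dirty_writeback_centisecs starts writeback sooner and more often without throttling writers, while dirty_ratio remains the hard limit at which a writing process is forced to do writeback itself.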
Moreover, for data IO pages, we can limit the number of cache pages per file in MemFS to control how much data caching MemFS does. If a file exceeds this threshold (e.g. max_pages_per_rpc: 16M? or only 1M, to allow caching many more small files), the client will assimilate the cached pages from MemFS into Lustre. After that, all data IO on this file is directed to the Lustre OSTs via the normal Lustre IO path.
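A minimal sketch of that per-file threshold check, assuming a hypothetical MemFS client structure (memfs_file, MEMFS_MAX_CACHE_PAGES, memfs_assimilate_to_lustre and friends are placeholders for illustration, not actual Lustre symbols):

/*
 * Sketch of the per-file cache limit policy described above.
 * All identifiers are illustrative; real client code would hook
 * this into the MemFS write path.
 */
#include <stdbool.h>

/* Hypothetical per-file cap, e.g. one RPC worth of 4KB pages (16MB). */
#define MEMFS_MAX_CACHE_PAGES 4096

struct memfs_file {
	unsigned long mf_cached_pages; /* pages currently cached in MemFS */
	bool          mf_in_lustre;    /* pages already assimilated? */
};

/* Placeholder: would migrate cached pages into the Lustre page cache. */
static void memfs_assimilate_to_lustre(struct memfs_file *mf)
{
	mf->mf_cached_pages = 0;
}

/* Called on each write; returns true if IO should go to Lustre OSTs. */
static bool memfs_write_check(struct memfs_file *mf, unsigned long new_pages)
{
	if (mf->mf_in_lustre)
		return true; /* already on the normal Lustre IO path */

	if (mf->mf_cached_pages + new_pages > MEMFS_MAX_CACHE_PAGES) {
		/* Threshold exceeded: assimilate cached pages into
		 * Lustre, then direct all further IO on this file
		 * to the OSTs. */
		memfs_assimilate_to_lustre(mf);
		mf->mf_in_lustre = true;
		return true;
	}

	mf->mf_cached_pages += new_pages;
	return false; /* keep caching in MemFS */
}

The key design point is that the switch is one-way: once a file's pages are assimilated, MemFS caching is bypassed for that file entirely.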
Hongchao Zhang (hongchao@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38875
Subject: LU-13563 mdt: ignore quota when creating slave stripe
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: d497a600c487fd62401d776cea7d18644a74d4e2