Details
- Technical task
- Resolution: Fixed
- Minor
- 2829
Description
When running million-file runs of createmany with the '-o' (open) option, we see performance drop from an initial rate of about 500 creates per second to about 150 creates per second. The drop coincides with the ZFS ARC's 'arc_meta_used' reaching and then exceeding its 'arc_meta_limit'. We believe that because Lustre does its own caching and holds a reference to all of its objects, the ARC is unable to keep its cache within 'arc_meta_limit'. The ARC therefore wastes effort repeatedly trying to enforce its limit by dropping objects that it cannot actually drop (because of Lustre's references), and this is what causes the create rate to decrease.
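The failure mode described above can be illustrated with a small userspace sketch (this is not ZFS source; the struct and function names are hypothetical stand-ins): an eviction pass over metadata buffers can only free buffers that no external holder pins, so when Lustre pins everything, the pass does work but reclaims nothing.

```c
#include <stddef.h>

/*
 * Hypothetical sketch of the behaviour described above: once metadata
 * usage exceeds its limit, the cache tries to evict buffers, but any
 * buffer pinned by an external reference (here, Lustre's LU object
 * cache) must be skipped, so the pass can burn CPU without freeing
 * anything.
 */
struct meta_buf {
    size_t size;
    int    refcount;   /* >0 means an external holder pins the buffer */
    int    cached;     /* still resident in the (simulated) cache */
};

/* Walk the cache, evicting what we can until usage drops under the
 * limit; return the number of bytes actually freed. */
static size_t evict_meta(struct meta_buf *bufs, int n,
                         size_t used, size_t limit)
{
    size_t freed = 0;
    for (int i = 0; i < n && used - freed > limit; i++) {
        if (!bufs[i].cached || bufs[i].refcount > 0)
            continue;              /* pinned buffers cannot be dropped */
        bufs[i].cached = 0;
        freed += bufs[i].size;
    }
    return freed;
}
```

With every buffer pinned, `evict_meta()` returns 0 no matter how far over the limit the cache is, which matches the useless-effort loop the ticket describes.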
One way to partially relieve this is to use the ARC prune callback feature that was recently added to the ZFS on Linux project in commit: https://github.com/zfsonlinux/zfs/commit/ab26409db753bb087842ab6f1af943f3386c764f
This would allow the ARC to notify Lustre that it needs to release some of the objects it is holding, so the ARC can free part of its cache and stay within its 'arc_meta_limit'.
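The callback mechanism can be sketched as follows. This is a simplified userspace model, not the actual ZFS or Lustre implementation: the list handling is reduced to a singly linked list, and `osd_arc_prune` is a hypothetical stand-in for a Lustre-side callback that would shrink the LU object cache.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Simplified model of the prune-callback pattern from the ZoL commit
 * above: consumers register a callback, and when the ARC is over its
 * metadata limit it walks the list asking each consumer to release
 * roughly 'bytes' worth of held objects.
 */
typedef void arc_prune_func_t(int64_t bytes, void *private);

typedef struct arc_prune {
    arc_prune_func_t *p_pfunc;
    void             *p_private;
    struct arc_prune *p_next;
} arc_prune_t;

static arc_prune_t *arc_prune_list;

/* Register a prune callback; returns a handle for later removal. */
static arc_prune_t *arc_add_prune_callback(arc_prune_func_t *func,
                                           void *private)
{
    arc_prune_t *p = malloc(sizeof(*p));
    p->p_pfunc = func;
    p->p_private = private;
    p->p_next = arc_prune_list;
    arc_prune_list = p;
    return p;
}

/* Invoked by the ARC when 'arc_meta_used' exceeds 'arc_meta_limit'. */
static void arc_do_user_prune(int64_t bytes)
{
    for (arc_prune_t *p = arc_prune_list; p != NULL; p = p->p_next)
        p->p_pfunc(bytes, p->p_private);
}

/* Hypothetical Lustre-side callback: drop references to unused LU
 * objects so the ARC can actually evict the buffers backing them. */
static int64_t released;
static void osd_arc_prune(int64_t bytes, void *private)
{
    (void)private;
    released += bytes;   /* stand-in for shrinking the LU object cache */
}
```

Once such a callback is registered, the pinned-buffer problem above becomes solvable: the ARC can ask Lustre to release references before (or while) running its eviction pass.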
We have largely confirmed that the current LU object caching approach for ZFS has the side effect of preventing the ARC from caching any of the OIs, which makes FID lookups extremely expensive. The following patch disables the LU object cache for ZFS OSDs, which allows the OIs to be properly cached. I don't have solid performance numbers for specific workloads yet, but I wanted to post the patch to start a discussion on the right way to fix this.
http://review.whamcloud.com/10237