The lvbo methods have to reallocate lu_env every time, which can be quite expensive in terms of CPU cycles at scale.
The layers above can instead pass in their own lu_env so that the existing one is reused.
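A minimal sketch of the idea, not the actual Lustre code: every name below (demo_env, demo_env_init(), demo_lvbo_update_old(), demo_lvbo_update_new(), demo_count_inits()) is a hypothetical stand-in, and the 4096-byte scratch buffer is an arbitrary placeholder for whatever per-environment state is expensive to set up. It only contrasts a callee that builds and tears down its own environment on every call with one that borrows the caller's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an execution environment such as lu_env. */
struct demo_env {
    void *scratch;                  /* per-environment scratch state */
};

/* Counts how many times the (expensive) setup path has run. */
static unsigned long demo_env_inits;

static int demo_env_init(struct demo_env *env)
{
    env->scratch = malloc(4096);    /* placeholder for costly setup */
    if (env->scratch == NULL)
        return -1;
    demo_env_inits++;
    return 0;
}

static void demo_env_fini(struct demo_env *env)
{
    free(env->scratch);
    env->scratch = NULL;
}

/* Old pattern: the callee sets up and tears down an env on every call. */
static int demo_lvbo_update_old(void)
{
    struct demo_env env;

    if (demo_env_init(&env) != 0)
        return -1;
    /* ... the real work would use env here ... */
    demo_env_fini(&env);
    return 0;
}

/* New pattern: the caller passes in an env it already owns. */
static int demo_lvbo_update_new(const struct demo_env *env)
{
    assert(env != NULL && env->scratch != NULL);
    /* ... the real work would use env here ... */
    return 0;
}

/* Run n updates and return how many env setups were needed. */
unsigned long demo_count_inits(int reuse, int n)
{
    struct demo_env env;
    int i;

    demo_env_inits = 0;
    if (reuse) {
        if (demo_env_init(&env) != 0)
            return demo_env_inits;
        for (i = 0; i < n; i++)
            demo_lvbo_update_new(&env);
        demo_env_fini(&env);
    } else {
        for (i = 0; i < n; i++)
            demo_lvbo_update_old();
    }
    return demo_env_inits;
}
```

With this sketch, demo_count_inits(0, n) performs one setup per update, while demo_count_inits(1, n) performs a single setup no matter how many updates run.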
Patrick Farrell (Inactive)
added a comment - Thanks for the detailed benchmark info, Andreas, Ihara. (Sorry, I was not originally following the LU so did not see the earlier comment.)
Andreas Dilger
added a comment - Patrick,
attached is a flame graph (ost_io.svg) showing CPU usage for the OST under a high-throughput random read workload (fake IO was used so that no storage overhead is present, just network and RPC processing). In ost_lvbo_update() the lu_env_init() and lu_env_fini() functions are consuming over 10% of the OSS CPU for basically no benefit. The master-patch-32832.svg flame graph shows the ost_lvbo_update() CPU usage is down to 1.5% when the patch is applied, which resulted in a 6.3% performance improvement for random 4KB reads. Alex, could you please include these results in the commit comment so that it is clearer why we want to land that patch?
The ost_io.svg graph also shows find_or_create_page() using 4.25% of CPU, which drove the creation of patch https://review.whamcloud.com/32875 "LU-11347 osd: do not use pagecache for I/O".
Landed for 2.12. James, perf testing is a standard part of release testing.