[LU-16368] ZFS RPMs without vectorization Created: 06/Dec/22  Updated: 06/Dec/22

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.15.1
Fix Version/s: None

Type: Improvement Priority: Minor
Reporter: Patrick Keller Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: performance, zfs
Environment:

RHEL8.6


Epic/Theme: ZFS

 Description   

The ZFS RPMs provided for RHEL 8.6 Lustre servers are apparently built without vectorization (SIMD) support for raidz and fletcher4.

The available implementations are benchmarked when the module is loaded. The benchmark results can be checked with:

cat /proc/spl/kstat/zfs/fletcher_4_bench
implementation   native         byteswap
scalar           5497759569     4845665644
superscalar      5480937676     4858516672
superscalar4     5488439650     4847575376
fastest          scalar         superscalar

cat /proc/spl/kstat/zfs/vdev_raidz_bench
implementation   gen_p           gen_pq          gen_pqr         rec_p           rec_q           rec_r           rec_pq          rec_pr          rec_qr          rec_pqr
original         254722784       200239743       97275027        1207670510      225639247       35724759        87910053        20776517        20811404        13297656
scalar           1234755379      393704159       159539402       1211544896      385637841       266021529       195891453       153232648       90736440        81828326
fastest          scalar          scalar          scalar          scalar          scalar          scalar          scalar          scalar          scalar          scalar
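
To make this check repeatable across servers, the kstat tables can be parsed with a small script. The following is only a sketch: the function names and the list of name fragments in SIMD_HINTS are assumptions made for illustration, not part of ZFS or Lustre, and the sample text is the fletcher_4_bench output quoted above.

```python
# Assumed name fragments that mark SIMD implementations in the bench tables.
SIMD_HINTS = ("sse", "avx", "neon", "altivec")

# Sample copied from the fletcher_4_bench output of the Lustre-provided RPMs.
SAMPLE = """\
implementation   native         byteswap
scalar           5497759569     4845665644
superscalar      5480937676     4858516672
superscalar4     5488439650     4847575376
fastest          scalar         superscalar
"""


def parse_bench(text):
    """Split a *_bench table into per-implementation rates and the 'fastest' row."""
    rows, fastest, columns = {}, {}, None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "implementation":
            columns = fields[1:]              # column names, e.g. native, byteswap
        elif columns is None:
            continue                          # ignore anything before the table header
        elif fields[0] == "fastest":
            fastest = dict(zip(columns, fields[1:]))
        else:
            rows[fields[0]] = dict(zip(columns, map(int, fields[1:])))
    return rows, fastest


def has_simd(rows):
    """True if any benchmarked implementation name looks like a SIMD variant."""
    return any(hint in name for name in rows for hint in SIMD_HINTS)


rows, fastest = parse_bench(SAMPLE)
print("SIMD implementations present:", has_simd(rows))
print("fastest:", fastest)
```

On a live server the same functions could be fed the contents of /proc/spl/kstat/zfs/fletcher_4_bench or /proc/spl/kstat/zfs/vdev_raidz_bench instead of SAMPLE.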
  

The same hardware with official ZFS RPMs supports further optimizations, e.g.:

cat /proc/spl/kstat/zfs/fletcher_4_bench
implementation   native         byteswap
scalar           6421841866     4605167866
superscalar      9151122207     5740950514
superscalar4     8273988438     6065318192
sse2             10187627568    2365144114
ssse3            10428775432    7506302223
avx2             24481349937    19454027999
avx512f          35018614756    10724551066
avx512bw         36102325545    30780239769
fastest          avx512bw       avx512bw
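
For a rough sense of scale, the native-column rates in the table above can be expressed relative to the scalar implementation. This is a sketch with hypothetical names (official_native, speedup_over); it only restates the microbenchmark numbers already quoted and does not predict end-to-end throughput.

```python
# Native-column rates copied from the official-RPM fletcher_4_bench output above.
official_native = {
    "scalar": 6421841866,
    "superscalar": 9151122207,
    "superscalar4": 8273988438,
    "sse2": 10187627568,
    "ssse3": 10428775432,
    "avx2": 24481349937,
    "avx512f": 35018614756,
    "avx512bw": 36102325545,
}


def speedup_over(rates, baseline="scalar"):
    """Return each implementation's rate divided by the baseline's rate."""
    base = rates[baseline]
    return {name: rate / base for name, rate in rates.items()}


for name, factor in speedup_over(official_native).items():
    print(f"{name:<13} {factor:5.2f}x")
```

In this benchmark the avx512bw implementation comes out at roughly 5.6 times the scalar rate.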

A noticeable performance improvement in high-throughput scenarios was observed after switching to the official ZFS RPMs with proper vectorization support.

