Details
-
Bug
-
Resolution: Won't Fix
-
Minor
-
None
-
Lustre 2.14.0
-
None
-
master, rhel8.1 (4.18.0-147.el8.x86_64)
-
3
Description
t10crc4K/512 algorithm in rhel8.1 kernel is slower than rhel7.7
With the t10crc4K/512 T10PI checksum algorithms, performance on the rhel8.1 kernel is severely degraded.
A client running the rhel8.1 kernel with the t10crc4K/512 checksum enabled is much slower than a client running the rhel7.7 kernel with the same checksum enabled.
Here are the test configuration and results.
Configuration
1 x client: 1 x Platinum 8160, 96GB memory, 1 x IB-EDR
lctl set_param osc.*.max_pages_per_rpc=16M osc.*.max_rpcs_in_flight=16 osc.*.max_dirty_mb=512 llite.*.max_read_ahead_mb=2048 osc.*.checksum_type=t10crc4K
Test result on RHEL7.7 (3.10.0-1062.el7.x86_64)
PPN=1
mpirun --allow-run-as-root -np 1 ior -w -r -t 1m -b 256g -e -F -o /testfs/s/file
Max Write: 1981.81 MiB/sec (2078.07 MB/sec)
Max Read: 2685.01 MiB/sec (2815.44 MB/sec)
PPN=16
mpirun --allow-run-as-root -np 16 ior -w -r -t 1m -b 16g -e -F -o /testfs/file
Max Write: 9887.55 MiB/sec (10367.84 MB/sec)
Max Read: 11212.37 MiB/sec (11757.03 MB/sec)
Test result on RHEL8.1 (4.18.0-147.el8.x86_64)
PPN=1
mpirun --allow-run-as-root -np 1 ior -w -r -t 1m -b 256g -e -F -o /testfs/s/file
Max Write: 1703.20 MiB/sec (1785.94 MB/sec)
Max Read: 758.24 MiB/sec (795.07 MB/sec)
PPN=16
mpirun --allow-run-as-root -np 16 ior -w -r -t 1m -b 16g -e -F -o /testfs/file
Max Write: 6741.36 MiB/sec (7068.83 MB/sec)
Max Read: 5821.17 MiB/sec (6103.94 MB/sec)
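To put the two IOR runs side by side, the summary lines can be parsed and compared. The following is a minimal sketch, not part of the original report: the helper names `parse_ior` and `relative_throughput` are hypothetical, and the only inputs are the "Max Write" / "Max Read" figures quoted above.

```python
import re

# Matches IOR summary fragments such as "Max Write: 1981.81 MiB/sec (2078.07 MB/sec)".
IOR_MAX = re.compile(r"Max (Write|Read):\s+([\d.]+) MiB/sec")

def parse_ior(text):
    """Return {'Write': MiB/s, 'Read': MiB/s} from an IOR summary string."""
    return {op: float(mib) for op, mib in IOR_MAX.findall(text)}

# Results copied from this report, keyed by process count (PPN).
rhel77 = {
    1: parse_ior("Max Write: 1981.81 MiB/sec (2078.07 MB/sec) "
                 "Max Read: 2685.01 MiB/sec (2815.44 MB/sec)"),
    16: parse_ior("Max Write: 9887.55 MiB/sec (10367.84 MB/sec) "
                  "Max Read: 11212.37 MiB/sec (11757.03 MB/sec)"),
}
rhel81 = {
    1: parse_ior("Max Write: 1703.20 MiB/sec (1785.94 MB/sec) "
                 "Max Read: 758.24 MiB/sec (795.07 MB/sec)"),
    16: parse_ior("Max Write: 6741.36 MiB/sec (7068.83 MB/sec) "
                  "Max Read: 5821.17 MiB/sec (6103.94 MB/sec)"),
}

def relative_throughput(old, new):
    """Throughput of `new` as a fraction of `old`, per PPN and operation."""
    return {ppn: {op: new[ppn][op] / old[ppn][op] for op in old[ppn]}
            for ppn in old}

ratios = relative_throughput(rhel77, rhel81)
for ppn, ops in sorted(ratios.items()):
    for op, frac in ops.items():
        print(f"PPN={ppn:<2} {op:<5} rhel8.1 is at {frac:.0%} of rhel7.7")
```

On these figures the single-process read is hit hardest (rhel8.1 reaches a bit under 30% of the rhel7.7 rate), while writes lose less.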
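Even the standalone algorithm performance test shows that the t10crc4K/512 algorithms are much slower on the rhel8.1 kernel than on the rhel7.7 kernel (about 27x slower for t10crc4K and about 7.5x slower for t10crc512; the t10ip algorithms are not affected).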
RHEL7.7 (3.10.0-1062.el7.x86_64)
obd_t10_performance_test() T10 checksum algorithm t10ip512 speed = 13015 MB/s
obd_t10_performance_test() T10 checksum algorithm t10ip4K speed = 16855 MB/s
obd_t10_performance_test() T10 checksum algorithm t10crc512 speed = 2551 MB/s
obd_t10_performance_test() T10 checksum algorithm t10crc4K speed = 9231 MB/s
RHEL8.1 (4.18.0-147.el8.x86_64)
obd_t10_performance_test() T10 checksum algorithm t10ip512 speed = 13395 MB/s
obd_t10_performance_test() T10 checksum algorithm t10ip4K speed = 19267 MB/s
obd_t10_performance_test() T10 checksum algorithm t10crc512 speed = 339 MB/s
obd_t10_performance_test() T10 checksum algorithm t10crc4K speed = 342 MB/s
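The per-algorithm slowdown can be derived directly from the two sets of `obd_t10_performance_test()` lines. This is a small sketch under the assumption that the log lines have exactly the format quoted in this report; `parse_speeds` is a hypothetical helper, not a Lustre tool.

```python
import re

# Matches "... T10 checksum algorithm t10crc4K speed = 9231 MB/s".
SPEED = re.compile(r"algorithm (\S+) speed = (\d+) MB/s")

def parse_speeds(log):
    """Map algorithm name -> speed in MB/s from debug-log text."""
    return {name: int(speed) for name, speed in SPEED.findall(log)}

# Log text copied from this report.
rhel77_log = """
obd_t10_performance_test() T10 checksum algorithm t10ip512 speed = 13015 MB/s
obd_t10_performance_test() T10 checksum algorithm t10ip4K speed = 16855 MB/s
obd_t10_performance_test() T10 checksum algorithm t10crc512 speed = 2551 MB/s
obd_t10_performance_test() T10 checksum algorithm t10crc4K speed = 9231 MB/s
"""
rhel81_log = """
obd_t10_performance_test() T10 checksum algorithm t10ip512 speed = 13395 MB/s
obd_t10_performance_test() T10 checksum algorithm t10ip4K speed = 19267 MB/s
obd_t10_performance_test() T10 checksum algorithm t10crc512 speed = 339 MB/s
obd_t10_performance_test() T10 checksum algorithm t10crc4K speed = 342 MB/s
"""

old = parse_speeds(rhel77_log)
new = parse_speeds(rhel81_log)

# Slowdown > 1 means rhel8.1 is slower than rhel7.7 for that algorithm.
slowdown = {name: old[name] / new[name] for name in old}
for name, factor in slowdown.items():
    print(f"{name:<10} rhel7.7={old[name]:>6} MB/s  "
          f"rhel8.1={new[name]:>6} MB/s  slowdown={factor:.2f}x")
```

Only the two CRC-based algorithms regress; the IP-checksum variants are slightly faster on rhel8.1 in these numbers.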
Issue Links
- is related to LU-13391 no print checksum speed for t10pi checksum in debug messages (Open)