[LU-13905] A single-stream performance regression in the client 4.18.0-193.14.2.el8_2 kernel Created: 12/Aug/20  Updated: 12/Aug/20

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.14.0
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Shuichi Ihara Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None
Environment:

RHEL8.2 (kernel 4.18.0-193.14.2.el8_2)
lustre-commit: c54b6ca (master)


Issue Links:
Related
Severity: 2
Rank (Obsolete): 9223372036854775807

 Description   

There is a single-stream read performance regression with the 4.18.0-193.14.2.el8_2 kernel. Here are the test environment and a reproducer.

1 x client (1 x Gold 5218 CPU @ 2.30GHz, 96GB RAM, 1 x IB-HDR100)
CentOS8.2 (tested kernel versions: 4.18.0-147.el8.x86_64 and 4.18.0-193.14.2.el8_2.x86_64)
OFED-5.0-2.1.8.0

 

[root@ec01 ~]# lctl set_param osc.*.max_pages_per_rpc=16M osc.*.max_rpcs_in_flight=16 llite.*.max_read_ahead_mb=2048 llite.*.max_read_ahead_per_file_mb=N
[root@ec01 ~]# clush -w es400nvx1-vm[1-4],7990e3-vm[1-2],ec01 "echo 3 > /proc/sys/vm/drop_caches"
[root@ec01 ~]# /work/tools/bin/ior -r -t 1m -b 192g -e -o /es400nv/s/file -k  
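
For reference, the effective client tuning can be confirmed before each run with the matching get_param calls (a minimal check using the same parameters set above):

[root@ec01 ~]# lctl get_param osc.*.max_pages_per_rpc osc.*.max_rpcs_in_flight
[root@ec01 ~]# lctl get_param llite.*.max_read_ahead_mb llite.*.max_read_ahead_per_file_mb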

At least the behavior with max_read_ahead_per_file_mb=64 (the default) differs between the two kernel versions, 4.18.0-147.el8.x86_64 and 4.18.0-193.14.2.el8_2.x86_64.
Here are the results measured on an NVMe OST system.

                                  4.18.0-147.el8.x86_64   4.18.0-193.14.2.el8_2.x86_64
max_read_ahead_per_file_mb=64     4252 MiB/s              2943 MiB/s
max_read_ahead_per_file_mb=128    4186 MiB/s              4287 MiB/s

Performance was about 30% lower with max_read_ahead_per_file_mb=64, but when it was increased to 128, both kernels performed about the same.
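
One way to check whether the smaller per-file read-ahead window is what limits the slower kernel is to compare the client read-ahead counters around the run (a sketch; read_ahead_stats is the standard llite counter file, cleared here by writing 0):

[root@ec01 ~]# lctl set_param llite.*.read_ahead_stats=0
[root@ec01 ~]# /work/tools/bin/ior -r -t 1m -b 192g -e -o /es400nv/s/file -k
[root@ec01 ~]# lctl get_param llite.*.read_ahead_stats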

Here is another set of results, tested on HDD-based OSTs.

                                  4.18.0-147.el8.x86_64   4.18.0-193.14.2.el8_2.x86_64
max_read_ahead_per_file_mb=64     1578 MiB/s              1326 MiB/s
max_read_ahead_per_file_mb=128    3396 MiB/s              2827 MiB/s

In this case, there was still a ~16% performance regression on 4.18.0-193.14.2.el8_2.x86_64, regardless of whether max_read_ahead_per_file_mb was set to 64 or 128.
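
It may also be worth comparing the read RPC size distribution on the two kernels, since a shrinking read-ahead window usually shows up as smaller or fewer read RPCs (a sketch using the osc rpc_stats counters, cleared by writing 0):

[root@ec01 ~]# lctl set_param osc.*.rpc_stats=0
[root@ec01 ~]# /work/tools/bin/ior -r -t 1m -b 192g -e -o /es400nv/s/file -k
[root@ec01 ~]# lctl get_param osc.*.rpc_stats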



 Comments   
Comment by Wang Shilong (Inactive) [ 12/Aug/20 ]

The problem is that 4.18.0-147.el8.x86_64 somehow schedules kworker threads more often than 4.18.0-193.14.2.el8_2.x86_64; we might need to investigate what changes were applied to the kernel workqueue code between these minor version updates.
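
One possible way to quantify that difference is to record scheduler activity during the read and compare the kworker numbers on both kernels (a sketch using standard perf tooling, not something run here):

[root@ec01 ~]# perf sched record /work/tools/bin/ior -r -t 1m -b 192g -e -o /es400nv/s/file -k
[root@ec01 ~]# perf sched latency | grep kworker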

Comment by Wang Shilong (Inactive) [ 12/Aug/20 ]

I am not aware of specific workqueue changes, but could this be related to some cpupower frequency changes? cpupower frequency-info would show whether both kernels can reach performance mode.
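
For example (assuming the cpupower package is installed on the client), the governor can be checked and, if needed, pinned on both kernels:

[root@ec01 ~]# cpupower frequency-info
[root@ec01 ~]# cpupower frequency-set -g performance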
