[LU-13288] a performance regression on the single stream write Created: 22/Feb/20  Updated: 22/Oct/20  Resolved: 01/Mar/20

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.14.0

Type: Bug Priority: Minor
Reporter: Shuichi Ihara Assignee: Shaun Tancheff
Resolution: Fixed Votes: 0
Labels: None
Environment:

master(commit:2c0b2b7), CentOS8.1


Issue Links:
Related
is related to LU-13289 cgroup writeback support for Lustre k... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

The latest master (commit:2c0b2b7) shows a performance regression on single-stream writes, as measured below.

test node: Xeon(R) Gold 5218 CPU @ 2.30GHz, 96GB memory, 1 x HDR-100
lustre params: osc.*.max_pages_per_rpc=16M osc.*.max_rpcs_in_flight=16 osc.*.max_dirty_mb=512 llite.*.max_read_ahead_mb=2048 osc.*.checksums=0
# ior -w -t 1m -b 192g -e -o /scratch/testdir/file
master(commit:2c0b2b7)
Max Write: 799.98 MiB/sec (838.84 MB/sec)

2.13.0
Max Write: 2268.02 MiB/sec (2378.19 MB/sec)


 Comments   
Comment by Shuichi Ihara [ 22/Feb/20 ]

'git bisect' shows that the following patch causes the problem.

Author: Shaun Tancheff <stancheff@cray.com>
Date:   Fri Oct 25 15:11:37 2019 -0500

    LU-12904 build: account_page_dirtied is not exported
    
    Linux 5.2 does not export account_page_dirtied
    mm: remove the account_page_dirtied export
    
    Use symbol_get() to access account_page_dirtied for Lustre
    
    kernel-commit: ac1c3e49a9a734150b33297eeca5b43d92fd5be8

Here is the single-thread performance before and after this commit. I've confirmed that the regression started with commit c38ab030d0, at least on a CentOS 8.1 client.

c38ab030d0 LU-12904 build: account_page_dirtied is not exported
# ior -w -t 1m -b 192g -e -o /scratch/testdir/file

Max Write: 802.32 MiB/sec (841.30 MB/sec)

998a494fa9 LU-12861 libcfs: provide an scnprintf and start using it
# ior -w -t 1m -b 192g -e -o /scratch/testdir/file

Max Write: 2253.32 MiB/sec (2362.78 MB/sec)

Comment by Wang Shilong (Inactive) [ 22/Feb/20 ]

Calling symbol_get() every time looks heavy. We could call it once at module init time and then reuse the result every time.
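
To illustrate the idea, here is a minimal userspace sketch, not the actual Lustre patch. A toy table lookup (lookup_symbol(), a hypothetical stand-in for symbol_get()) is either repeated on every call or done once at init and cached in a function pointer. All names below are made up for illustration, and the kernel-side details (symbol_put(), error handling) are left out.

```c
#include <stddef.h>
#include <string.h>

typedef void (*account_fn_t)(void);

/* Hypothetical stand-in for the kernel function being resolved. */
unsigned long dirtied_pages;
void fake_account_page_dirtied(void) { dirtied_pages++; }

/* Toy symbol table; a real kernel symbol lookup costs far more. */
struct sym { const char *name; account_fn_t fn; };
const struct sym symtab[] = {
	{ "other_symbol", NULL },
	{ "account_page_dirtied", fake_account_page_dirtied },
};

unsigned long lookups;	/* number of table lookups performed */

account_fn_t lookup_symbol(const char *name)
{
	size_t i;

	lookups++;
	for (i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++)
		if (strcmp(symtab[i].name, name) == 0)
			return symtab[i].fn;
	return NULL;
}

/* Slow pattern: resolve the symbol again for every dirtied page. */
void account_dirty_lookup_each_time(void)
{
	account_fn_t fn = lookup_symbol("account_page_dirtied");

	if (fn)
		fn();
}

/* Fast pattern: resolve once at init, then call through the pointer. */
account_fn_t cached_account_fn;

int module_init_sketch(void)
{
	cached_account_fn = lookup_symbol("account_page_dirtied");
	return cached_account_fn ? 0 : -1;
}

void account_dirty_cached(void)
{
	if (cached_account_fn)
		cached_account_fn();
}
```

With the cached pointer, the lookup count stays at one no matter how many pages are dirtied, which is the property that matters on a hot write path.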

Comment by Gerrit Updater [ 22/Feb/20 ]

Wang Shilong (wshilong@ddn.com) uploaded a new patch: https://review.whamcloud.com/37682
Subject: LU-13288 llite: avoid symbol_get() in ll_account_page_dirtied()
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 1b95bfcb6e17b3f33b7f94adcae4a3d7c8dbf9d6

Comment by Andreas Dilger [ 22/Feb/20 ]

The function pointer should just be named "account_page_dirtied" and then the function can be used as normal within the code.
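
A rough userspace sketch of this naming suggestion, under the assumption that no conflicting declaration of the same name is visible: the pointer carries the name of the function it replaces, so call sites read like a direct call. The struct and the helpers real_impl(), resolve_once() and mark_dirty() are hypothetical, and the signature is simplified.

```c
#include <stddef.h>

/* Hypothetical, simplified page type for illustration only. */
struct page_sketch { int dirty; };

/* Stand-in for the real implementation found at init time. */
void real_impl(struct page_sketch *pg) { pg->dirty = 1; }

/* The pointer is named after the function it replaces. */
void (*account_page_dirtied)(struct page_sketch *pg);

/* Done once, e.g. from module init; returns 0 on success. */
int resolve_once(void)
{
	account_page_dirtied = real_impl;
	return account_page_dirtied ? 0 : -1;
}

void mark_dirty(struct page_sketch *pg)
{
	/* Call site is unchanged: it looks like a plain function call. */
	account_page_dirtied(pg);
}
```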

Comment by Gerrit Updater [ 23/Feb/20 ]

Shaun Tancheff (shaun.tancheff@hpe.com) uploaded a new patch: https://review.whamcloud.com/37686
Subject: LU-13288 llite: Re-export account_page_dirtied
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 0da283be3407ac50ea1cfaafe5e5510a34ce3773

Comment by Gerrit Updater [ 01/Mar/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37686/
Subject: LU-13288 llite: Find account_page_dirtied on module init
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 788e464a7215e09987e05eeeeac107642e80cea5

Comment by Peter Jones [ 01/Mar/20 ]

Landed for 2.14

Comment by Gerrit Updater [ 22/Oct/20 ]

Jian Yu (yujian@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/40355
Subject: LU-13288 llite: Find account_page_dirtied on module init
Project: fs/lustre-release
Branch: b2_12
Current Patch Set: 1
Commit: c7d0ca8cdba8917de4470d0f478345d6500c5c9e
