[LU-13290] Write performance regression in ldiskfs patches Created: 23/Feb/20  Updated: 05/Mar/20  Resolved: 05/Mar/20

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Shuichi Ihara Assignee: WC Triage
Resolution: Duplicate Votes: 0
Labels: None
Environment:

CentOS7.7, master


Issue Links:
Related
is related to LU-13291 mballoc should not skip uninitialized... Resolved
is related to LU-12988 improve mount time on huge ldiskfs fi... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

The ldiskfs patches in the master branch (commit 2c0b2b7) cause write performance regressions on large OSTs.
There are two major performance regressions, caused by "LU-12988 ldiskfs: mballoc to prefetch groups" and "LU-12988 ldiskfs: skip non-loaded groups at cr=0/1".

As I commented on LU-12988, patch https://review.whamcloud.com/#/c/37619 behaves oddly on large OSTs.
However, even with patch https://review.whamcloud.com/#/c/37619 reverted, there is still a performance regression. After reverting several ldiskfs patches, it seems that patch "LU-12988 ldiskfs: skip non-loaded groups at cr=0/1" causes another performance regression.
Here are the test case and test results.

8 clients (PPN=16), IOR (1MB, FPP) 
$ salloc --nodes=8 --ntasks-per-node=16 mpirun --allow-run-as-root /work/tools/bin/ior -w -t 1m -b 1g -e -F -C -o /scratch/file

cb86073 Revert "LU-12103 ldiskfs: don't search large block range if disk full"
Max Write: 6890.87 MiB/sec (7225.60 MB/sec)

19c4b48 Revert "LU-12988 ldiskfs: skip non-loaded groups at cr=0/1"
Max Write: 6757.56 MiB/sec (7085.81 MB/sec)

5222bf6 Revert "LU-13183 ldiskfs: Drop remove truncate warning patch"
Max Write: 985.87 MiB/sec (1033.76 MB/sec)

7b7c89c Revert "LU-12988 ldiskfs: mballoc to prefetch groups"
Max Write: 991.18 MiB/sec (1039.33 MB/sec)

2c0b2b7 LU-13166 osd-ldiskfs: fix to allow to get system inode
Max Write: 2184.79 MiB/sec (2290.92 MB/sec)
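
To make the size of the regression easier to read, here is a minimal sketch that parses the "Max Write" lines above and reports each build's throughput relative to the best measured result. The commit/result pairing is copied from the list above; the helper names (parse_results, relative_to_best) are hypothetical and not part of IOR or Lustre.

```python
import re

# Results copied from the list above, one "commit + IOR Max Write line" per row.
RESULTS = """\
cb86073 Max Write: 6890.87 MiB/sec (7225.60 MB/sec)
19c4b48 Max Write: 6757.56 MiB/sec (7085.81 MB/sec)
5222bf6 Max Write: 985.87 MiB/sec (1033.76 MB/sec)
7b7c89c Max Write: 991.18 MiB/sec (1039.33 MB/sec)
2c0b2b7 Max Write: 2184.79 MiB/sec (2290.92 MB/sec)
"""

# Matches "<commit> Max Write: <x> MiB/sec (<y> MB/sec)".
LINE_RE = re.compile(r"^(\w+) Max Write: ([\d.]+) MiB/sec \(([\d.]+) MB/sec\)$")


def parse_results(text):
    """Return a list of (commit, MiB/sec, MB/sec) tuples."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            rows.append((m.group(1), float(m.group(2)), float(m.group(3))))
    return rows


def relative_to_best(rows):
    """Map each commit to its MiB/sec as a fraction of the best result."""
    best = max(mib for _, mib, _ in rows)
    return {commit: mib / best for commit, mib, _ in rows}


if __name__ == "__main__":
    for commit, ratio in relative_to_best(parse_results(RESULTS)).items():
        print(f"{commit}: {ratio:.1%} of best")
```

On these numbers, 2c0b2b7 reaches roughly a third of the best measured write rate, and the two builds that still carry "skip non-loaded groups at cr=0/1" (5222bf6, 7b7c89c) reach roughly a seventh.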


 Comments   
Comment by Alex Zhuravlev [ 23/Feb/20 ]

please try with LU-12988 ldiskfs: skip non-loaded groups at cr=0/1 and https://review.whamcloud.com/#/c/37626/

Comment by Shuichi Ihara [ 23/Feb/20 ]

please try with LU-12988 ldiskfs: skip non-loaded groups at cr=0/1 and https://review.whamcloud.com/#/c/37626/

looks better. I've just reverted patch "LU-12988 ldiskfs: mballoc to prefetch groups" and applied https://review.whamcloud.com/#/c/37626/

$ salloc --nodes=8 --ntasks-per-node=16 mpirun --allow-run-as-root /work/tools/bin/ior -w -t 1m -b 1g -e -F -C -o /scratch/file

Max Write: 6844.54 MiB/sec (7177.03 MB/sec)
Comment by Alex Zhuravlev [ 23/Feb/20 ]

thanks. so basically it's back to expected?

Comment by Shuichi Ihara [ 23/Feb/20 ]

Yeah. I think there is still a regression in "LU-12988 ldiskfs: mballoc to prefetch groups", but at least patch https://review.whamcloud.com/#/c/37626/ can fix a regression caused by "LU-12988 ldiskfs: skip non-loaded groups at cr=0/1"

Comment by Alex Zhuravlev [ 23/Feb/20 ]

well, prefetch should actually improve performance on a non-empty fs. can you please try it with https://review.whamcloud.com/#/c/37626/, or does the latter not apply?
(can't check at the moment - OOO).

Comment by Shuichi Ihara [ 24/Feb/20 ]

ok, patch https://review.whamcloud.com/#/c/37626/ worked with LU-12988 and performance seems to be OK.
Anyway, LU-12988 was reverted in master for another reason, but we need a formal patch of https://review.whamcloud.com/#/c/37626/ for LU-13290. We will run other performance tests with LU-12988 later.

Comment by Alex Zhuravlev [ 24/Feb/20 ]

https://review.whamcloud.com/#/c/37687/ to fix this regression
mballoc-prefetch will be refreshed quickly.

Comment by Peter Jones [ 05/Mar/20 ]

Seems to have been fixed under LU-12988 and LU-13291
