[LU-13309] performance optimizations for brw Created: 28/Feb/20  Updated: 07/Feb/22  Resolved: 11/Jan/22

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.15.0

Type: Improvement Priority: Minor
Reporter: Andrew Perepechko Assignee: Andrew Perepechko
Resolution: Fixed Votes: 0
Labels: patch, performance

Issue Links:
Related
is related to LU-13419 Simplify osc_enter_cache_try Open
is related to LU-15532 OOM in osd_device_alloc() Open
is related to LU-13542 osd stats are initialized too late Resolved
is related to LU-12179 allocate continuous pages when disabl... Open
Rank (Obsolete): 9223372036854775807

 Description   

A few trivial patches that reduce the OSS CPU bottleneck with NVMe storage will be uploaded shortly.



 Comments   
Comment by Gerrit Updater [ 28/Feb/20 ]

Andrew Perepechko (c17827@cray.com) uploaded a new patch: https://review.whamcloud.com/37758
Subject: LU-13309 osd-ldiskfs: remove per-page object_get/put in brw
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 8524ffd75fa443d02c9ce74e9e998c02d8e55b72

Comment by Shuichi Ihara [ 01/Mar/20 ]

Andrew, with what particular workload have you seen this high CPU usage?

Comment by Andrew Perepechko [ 02/Mar/20 ]

sihara, it's an IOR IOPS test, i.e. random reads of 4 KiB chunks from a 16 GiB file on a single OST.

OSS configuration:
AMD Rome based CPU, NVMe storage.

Client configuration:
48 clients, 8 processes per node.

Comment by Gerrit Updater [ 03/Mar/20 ]

Andrew Perepechko (c17827@cray.com) uploaded a new patch: https://review.whamcloud.com/37786
Subject: LU-13309 osd-ldiskfs: speedup osd_bufs_get/put
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 7138a24a2362b8fd021e5fc000e671d309816534

Comment by Andrew Perepechko [ 03/Mar/20 ]

One more patch, brw_stats related, will be uploaded soon.

With the three patches, the IOPS test improves from ~590000 IOPS to ~620000 IOPS (about 5%).

With or without these three patches, the CPU is the bottleneck, at 100% CPU load.

The CPU profile shows that the load is spread across various CPU consumers, the most significant of which is memset in the malloc/free paths. Further optimizations are therefore possible by disabling memory poisoning on free (a trivial one-liner) and, more interestingly, by removing memset(0) from the malloc path, especially for short IO.
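
To illustrate the memset(0) point, here is a minimal userspace sketch, not Lustre code. The struct io_buf type and the io_buf_* helpers are hypothetical names invented for this example. It assumes the payload area is always overwritten by the IO before anyone reads it, so only the small header needs initializing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical descriptor: a small header plus a large payload area
 * that the IO itself overwrites. Not a Lustre structure. */
struct io_buf {
	size_t len;          /* number of valid payload bytes */
	unsigned int flags;  /* hypothetical state flags */
	char data[4096];     /* payload, filled by the read or write */
};

/* Baseline: zero the whole structure (header and 4 KiB payload)
 * on every allocation. */
struct io_buf *io_buf_alloc_zeroed(void)
{
	return calloc(1, sizeof(struct io_buf));
}

/* Alternative: skip the blanket zeroing and initialize only the header.
 * The payload stays uninitialized, so callers must never read more
 * than buf->len bytes of it. */
struct io_buf *io_buf_alloc_fast(void)
{
	struct io_buf *buf = malloc(sizeof(*buf));

	if (buf == NULL)
		return NULL;
	buf->len = 0;
	buf->flags = 0;
	return buf;
}

/* Copy up to sizeof(buf->data) bytes into the payload and record
 * how many bytes are now valid. */
void io_buf_fill(struct io_buf *buf, const void *src, size_t n)
{
	assert(buf != NULL);
	if (n > sizeof(buf->data))
		n = sizeof(buf->data);
	memcpy(buf->data, src, n);
	buf->len = n;
}
```

For a short IO, io_buf_alloc_fast() writes a few header bytes where io_buf_alloc_zeroed() clears more than 4 KiB. Whether that is safe depends on every consumer honouring len, which is why this is a sketch under that assumption and not a drop-in change.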

Comment by Gerrit Updater [ 04/Mar/20 ]

Andrew Perepechko (c17827@cray.com) uploaded a new patch: https://review.whamcloud.com/37795
Subject: LU-13309 ofd: optimize the brw codepath
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 8a065689c6fe38154d742f740355e58ab9f8f9d2

Comment by Gerrit Updater [ 05/Mar/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37758/
Subject: LU-13309 osd-ldiskfs: remove per-page object_get/put in brw
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: dc9f28a541aec49b1787a25169f56a78a4924ee4

Comment by Gerrit Updater [ 13/Mar/20 ]

Andrew Perepechko (andrew.perepechko@hpe.com) uploaded a new patch: https://review.whamcloud.com/37915
Subject: LU-13309 osd: use per-cpu counters for brw_stats
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 9f839039830b9aac5bf9038c2124a1e7a2450b0a
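
For readers unfamiliar with the technique named in the subject line, the general idea of per-cpu counters can be sketched in userspace C as below. This is not the code from the patch. NR_SLOTS, CACHE_LINE, struct percpu_stat and the percpu_stat_* helpers are hypothetical names, and the sketch assumes each slot is only ever updated by one thread (in the kernel, the per-CPU API provides that guarantee).

```c
#include <assert.h>
#include <stdint.h>

#define NR_SLOTS   8   /* hypothetical number of CPUs (slots) */
#define CACHE_LINE 64  /* assumed cache line size in bytes */

/* One counter per slot, aligned so that two slots never share a
 * cache line and updates on different CPUs do not bounce it. */
struct slot_counter {
	_Alignas(CACHE_LINE) uint64_t value;
};

struct percpu_stat {
	struct slot_counter slots[NR_SLOTS];
};

/* Hot path: touch only the caller's own slot, with no lock and no
 * shared write. Assumes a single writer per slot. */
void percpu_stat_add(struct percpu_stat *st, unsigned int cpu, uint64_t n)
{
	assert(cpu < NR_SLOTS);
	st->slots[cpu].value += n;
}

/* Slow path: sum all slots when the statistic is read. The result is
 * a snapshot and may miss updates that race with the read. */
uint64_t percpu_stat_sum(const struct percpu_stat *st)
{
	uint64_t total = 0;

	for (unsigned int i = 0; i < NR_SLOTS; i++)
		total += st->slots[i].value;
	return total;
}
```

The trade-off is that updates become cheap and contention-free while reads cost a pass over all slots, which suits statistics that are written on every IO and read only occasionally.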

Comment by Gerrit Updater [ 31/Mar/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37795/
Subject: LU-13309 ofd: optimize the brw codepath
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 3beeb90cfa47b4753083d09760a6bd5ecaf58d76

Comment by Gerrit Updater [ 14/Apr/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37786/
Subject: LU-13309 osd-ldiskfs: speedup osd_bufs_get/put
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: e790df5fa38d4e8928dd28ba9f250fec4c830786

Comment by Gerrit Updater [ 11/Jan/22 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/37915/
Subject: LU-13309 osd: use per-cpu counters for brw_stats
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 787c1884e6451ae764568ade3658e537dcc19097

Comment by Peter Jones [ 11/Jan/22 ]

Landed for 2.15
