[LU-13799] DIO/AIO efficiency improvements Created: 17/Jul/20 Updated: 09/Mar/22 Resolved: 26/Jan/22 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.15.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Patrick Farrell | Assignee: | Patrick Farrell |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Attachments: |
|
||||||||||||||||||||||||||||
| Issue Links: |
|
||||||||||||||||||||||||||||
| Severity: | 3 | ||||||||||||||||||||||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||||||||||||||||||||||
| Description |
|
There is a large degree of inefficiency/wasted time in the DIO/AIO path. This does not normally show up for DIO because of the waiting model, but it shows up easily in AIO. This ticket is to cover a set of improvements to DIO/AIO performance, which will also improve DIO performance once the waiting model is adjusted. (More on this in There is a grab bag of patches to be submitted here, and some further proposals that will probably end up in other tickets. The essence of the improvements is that all pages in a DIO submission are the same, and therefore much of the work done on a per-page basis is irrelevant and can be skipped for DIO. Note that this statement is still compatible with unaligned DIO, if that can be implemented in the future: in the case of unaligned DIO, only the first and last pages are different, and that can still be handled. Patches and benchmarks on each patch are forthcoming. The total effect of the initial set of patches on my testbed is to raise AIO/DIO performance from around 5 GiB/s to around 9 GiB/s. I'll go into more detail on what else can be done shortly. |
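As a rough illustration of the idea in the description (work that is identical for every page of a DIO can be done once per submission instead of once per page), here is a minimal Python sketch. It is not Lustre code: `Page`, `submission_attrs`, `prepare_per_page`, and `prepare_per_submission` are hypothetical names, and the 4 KiB page size is an assumption.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size in bytes


@dataclass
class Page:
    index: int
    attrs: tuple = ()


def submission_attrs():
    """Stand-in for work whose result is identical for every page of one DIO."""
    return ("dio", "no-cache")


def prepare_per_page(npages):
    """Baseline: redo the same work for each page. Returns (pages, calls)."""
    pages = [Page(i, submission_attrs()) for i in range(npages)]
    return pages, npages


def prepare_per_submission(npages):
    """Do the shared work once per DIO and reuse the result for each page."""
    shared = submission_attrs()
    pages = [Page(i, shared) for i in range(npages)]
    return pages, 1


npages = (64 * 1024 * 1024) // PAGE_SIZE  # pages in one 64 MiB DIO
slow_pages, slow_calls = prepare_per_page(npages)
fast_pages, fast_calls = prepare_per_submission(npages)
print(npages, slow_calls, fast_calls)
```

Both variants produce the same pages; only the number of calls to the shared work differs. An unaligned DIO would need separate handling for the first and last page only.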
| Comments |
| Comment by Gerrit Updater [ 17/Jul/20 ] | |
|
Patrick Farrell (farr0186@gmail.com) uploaded a new patch: https://review.whamcloud.com/39437 | |
| Comment by Gerrit Updater [ 17/Jul/20 ] | |
|
Patrick Farrell (farr0186@gmail.com) uploaded a new patch: https://review.whamcloud.com/39438 | |
| Comment by Gerrit Updater [ 17/Jul/20 ] | |
|
Patrick Farrell (farr0186@gmail.com) uploaded a new patch: https://review.whamcloud.com/39439 | |
| Comment by Gerrit Updater [ 17/Jul/20 ] | |
|
Patrick Farrell (farr0186@gmail.com) uploaded a new patch: https://review.whamcloud.com/39440 | |
| Comment by Patrick Farrell [ 17/Jul/20 ] | |
|
There are more patches coming in this series - Will update shortly. Eventual performance result is around 10 GiB/s from one thread, but there's still headroom to go higher. | |
| Comment by Gerrit Updater [ 17/Jul/20 ] | |
|
Patrick Farrell (farr0186@gmail.com) uploaded a new patch: https://review.whamcloud.com/39441 | |
| Comment by Gerrit Updater [ 17/Jul/20 ] | |
|
Patrick Farrell (farr0186@gmail.com) uploaded a new patch: https://review.whamcloud.com/39442 | |
| Comment by Gerrit Updater [ 17/Jul/20 ] | |
|
Patrick Farrell (farr0186@gmail.com) uploaded a new patch: https://review.whamcloud.com/39443 | |
| Comment by Gerrit Updater [ 18/Jul/20 ] | |
|
Patrick Farrell (farr0186@gmail.com) uploaded a new patch: https://review.whamcloud.com/39445 | |
| Comment by Gerrit Updater [ 18/Jul/20 ] | |
|
Patrick Farrell (farr0186@gmail.com) uploaded a new patch: https://review.whamcloud.com/39448 | |
| Comment by Gerrit Updater [ 18/Jul/20 ] | |
|
Patrick Farrell (farr0186@gmail.com) uploaded a new patch: https://review.whamcloud.com/39446 | |
| Comment by Gerrit Updater [ 18/Jul/20 ] | |
|
Patrick Farrell (farr0186@gmail.com) uploaded a new patch: https://review.whamcloud.com/39447 | |
| Comment by Patrick Farrell [ 18/Jul/20 ] | |
|
With all of the patches here, for a 64 MiB DIO, we get: read 8566 MiB/s. For a very large DIO, such as 1 GiB, we get: write 9033 MiB/s, read 9817 MiB/s. | |
| Comment by Patrick Farrell [ 18/Jul/20 ] | |
|
This is getting near the limit of my testbed hardware, at least with one interface. It's possible to use two if you configure with just one CPT though, so... At this point, there is still actually a huge amount of overhead in the DIO/AIO path. If you look at perf with all of these patches in place, cl_page allocation is now over 50% of CPU time. Converting this from single calls to batch calls and removing a few per-page actions that can be done once per DIO, it's possible to get that down significantly. This leads to a DIO performance of something like 15 GiB/s. I don't have a patch ready for that - I'm going to open a separate ticket to discuss/describe what else I think can be done to improve this further. | |
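The step from roughly 10 GiB/s to "something like 15 GiB/s" can be sanity-checked with an Amdahl-style estimate. This is only a sketch under stated assumptions: a single CPU-bound thread, throughput inversely proportional to CPU time per byte, a rounded 10 GiB/s baseline, and cl_page allocation taken as exactly 50% of CPU time (the comment says "over 50%"). The function name and the speedup factor of 3 are illustrative, not measured.

```python
def projected_throughput(baseline_gib_s, fraction, speedup):
    """Amdahl-style estimate.

    baseline_gib_s: current throughput (assumed CPU-bound)
    fraction:       share of CPU time spent in the part being optimized
    speedup:        factor by which that part gets cheaper
    """
    remaining = (1.0 - fraction) + fraction / speedup
    return baseline_gib_s / remaining


# Assumed inputs: ~10 GiB/s today, 50% of CPU time in cl_page allocation.
estimate = projected_throughput(10.0, 0.5, 3.0)      # allocation 3x cheaper
upper_bound = projected_throughput(10.0, 0.5, 1e12)  # allocation nearly free
print(round(estimate, 2), round(upper_bound, 2))
```

Under these assumptions, cutting the allocation cost to about a third gives roughly 15 GiB/s, and even a nearly free allocation path would top out near 20 GiB/s.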
| Comment by Andreas Dilger [ 18/Jul/20 ] | |
|
Patrick, let me state that you rock. That is all. | |
| Comment by Andreas Dilger [ 18/Jul/20 ] | |
|
Patrick, I've also filed | |
| Comment by Patrick Farrell [ 22/Jul/20 ] | |
|
(Moved from | |
| Comment by Gerrit Updater [ 26/May/21 ] | |
|
Patrick Farrell (farr0186@gmail.com) uploaded a new patch: https://review.whamcloud.com/43835 | |
| Comment by Gerrit Updater [ 27/May/21 ] | |
|
Patrick Farrell (farr0186@gmail.com) uploaded a new patch: https://review.whamcloud.com/43838 | |
| Comment by Gerrit Updater [ 30/Jun/21 ] | |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39437/ | |
| Comment by Gerrit Updater [ 30/Jun/21 ] | |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39439/ | |
| Comment by Gerrit Updater [ 30/Jun/21 ] | |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39440/ | |
| Comment by Patrick Farrell [ 30/Jun/21 ] | |
|
Current status on these patches, now that we've started landing some of this. The following patches (in order) are ready, modulo completion of testing (errors appear unrelated) and review: Several of those could use a second review; they are almost all pretty simple. These three patches are not complete: https://review.whamcloud.com/39443 (reworking implementation) | |
| Comment by Patrick Farrell [ 02/Jul/21 ] | |
|
Update on status: I've rebased this series on https://review.whamcloud.com/#/c/44131/ ( It appears, though, that https://review.whamcloud.com/39445 is OK and should be ready for review. So, updated status: The following patches (in order) are ready, modulo completion of testing (errors appear unrelated) and review: These patches are not complete: https://review.whamcloud.com/39443 (reworking implementation) | |
| Comment by Gerrit Updater [ 06/Jul/21 ] | |
|
Patrick Farrell (pfarrell@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/44153 | |
| Comment by Gerrit Updater [ 06/Jul/21 ] | |
|
Patrick Farrell (pfarrell@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/44154 | |
| Comment by Gerrit Updater [ 09/Jul/21 ] | |
|
Patrick Farrell (pfarrell@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/44201 | |
| Comment by Gerrit Updater [ 11/Jul/21 ] | |
|
Patrick Farrell (pfarrell@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/44209 | |
| Comment by Gerrit Updater [ 13/Jul/21 ] | |
|
Patrick Farrell (pfarrell@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/44268 | |
| Comment by Gerrit Updater [ 13/Jul/21 ] | |
|
Patrick Farrell (pfarrell@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/44293 | |
| Comment by Patrick Farrell [ 15/Jul/21 ] | |
|
I wanted to give an update on the overall status of this series and related patches. The patches associated with this ticket should be ready for landing, but they're often failing testing due to bugs which have patches in flight. So here's a summary of how I'm imagining getting things landed. First, several bug fixes (in this order):
Strictly speaking, That fixes all the known bugs, and frees us up to consider the patches under this ticket, That should make it much easier to get those patches through testing. | |
| Comment by Gerrit Updater [ 15/Jul/21 ] | |
|
Patrick Farrell (pfarrell@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/44321 | |
| Comment by Patrick Farrell [ 20/Jul/21 ] | |
|
Current status... There's a series of patches ready for Oleg & landing, starting here: These patches are waiting for review: The following patches also need review & landing (not attached to this ticket, but fix related bugs): There are also a bunch of patches which are reviewed and waiting for Oleg to land. Once these are in, there is a further set, starting here: But I would like to get the other patches landed and that patch & the ones after it rebased, etc, rather than try to include them in one giant drop. | |
| Comment by Gerrit Updater [ 27/Jul/21 ] | |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39442/ | |
| Comment by Gerrit Updater [ 27/Jul/21 ] | |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39441/ | |
| Comment by Gerrit Updater [ 27/Jul/21 ] | |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39446/ | |
| Comment by Gerrit Updater [ 27/Jul/21 ] | |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39447/ | |
| Comment by Gerrit Updater [ 27/Jul/21 ] | |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39448/ | |
| Comment by Gerrit Updater [ 27/Jul/21 ] | |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39482/ | |
| Comment by Patrick Farrell [ 30/Jul/21 ] | |
|
Oleg, thanks for getting those merged in. The latest drop to master covers all of the fixes associated with other tickets that I think are required, so this series should be good to go now. Here's the current status of the patches here... I am still hoping to get everything here into 2.15. There should not be any more patches added to this ticket at this point. This patch needs review again, but hasn't changed much: After that, patches get a little more complicated. I've rebased all the remaining patches onto tip of master + the test 398b improvement, to get them some more testing. (398b improvement: https://review.whamcloud.com/44321/) Here's the set: https://review.whamcloud.com/39443/ https://review.whamcloud.com/39438/ https://review.whamcloud.com/44268/ https://review.whamcloud.com/44293/
| |
| Comment by Patrick Farrell [ 30/Jul/21 ] | |
|
A comment on performance. I have not retested this recently, so definitely take this with some salt. The currently landed set of patches should put us at around 7 GiB/s. The lov caching patch should take that to around 8 GiB/s. The remaining set of patches should push the rest of the way to around 10 GiB/s. Then that's where this ticket stops and the work is picked up in other tickets. | |
| Comment by Shuichi Ihara [ 16/Aug/21 ] | |
|
https://jira.whamcloud.com/secure/attachment/40203/LU-13799.xlsx | |
| Comment by Patrick Farrell [ 17/Aug/21 ] | |
|
sihara: | |
| Comment by Patrick Farrell [ 17/Aug/21 ] | |
|
sihara how many OSTs did you have in your system and how many stripes did you use?
I'm just curious because it's possible some of these were limited by max_rpcs_in_flight. Still, that performance is excellent - significantly better than I expected. | |
| Comment by Patrick Farrell [ 17/Aug/21 ] | |
|
OK, I see in the spreadsheet you used: That is missing: Because I did not put it in the series. If possible, you might want to retest with that patch added. | |
| Comment by Shuichi Ihara [ 17/Aug/21 ] | |
Yeah, I also thought that would help, but it didn't help very much, because it still doesn't reach the max_rpcs_in_flight limit somehow. See the example:

# lctl set_param osc.*.rpc_stats=clear
# lfs setstripe -c 8 -S 1M /ai400x2/ior.out/
# mpirun -np 1 ior -w -r -t 256m -b 8G -o /ai400x2/ior.out/file --posix.odirect -e
# lctl get_param osc.*.rpc_stats
                        read                  write
pages per rpc      rpcs   %  cum %   |   rpcs   %  cum %
1:                    0   0    0     |      0   0    0
2:                    0   0    0     |      0   0    0
4:                    0   0    0     |      0   0    0
8:                    0   0    0     |      0   0    0
16:                   0   0    0     |      0   0    0
32:                   0   0    0     |      0   0    0
64:                   0   0    0     |      0   0    0
128:                  0   0    0     |      0   0    0
256:               1024 100  100     |   1024 100  100

                        read                  write
rpcs in flight     rpcs   %  cum %   |   rpcs   %  cum %
1:                  142  13   13     |    856  83   83
2:                  873  85   99     |    152  14   98
3:                    5   0   99     |      9   0   99
4:                    2   0   99     |      6   0   99
5:                    1   0   99     |      1   0  100
6:                    1   0  100     |      0   0  100

98-99% of RPCs still have 2 or fewer RPCs in flight here. | |
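The numbers in the rpc_stats output above can be cross-checked with a short script. This is a sketch only: the rows are copied by hand from the "rpcs in flight" histogram, the 4 KiB page size is an assumption, and the expected RPC count further assumes that the histogram shown is for a single OSC with the file striped evenly over 8 OSTs. The helper names are hypothetical.

```python
# (rpcs in flight, read rpcs, write rpcs), copied from the histogram above
IN_FLIGHT = [
    (1, 142, 856),
    (2, 873, 152),
    (3, 5, 9),
    (4, 2, 6),
    (5, 1, 1),
    (6, 1, 0),
]
READ, WRITE = 1, 2  # column indexes into each row


def share_at_or_below(rows, column, limit):
    """Fraction of RPCs issued with at most `limit` RPCs in flight."""
    total = sum(r[column] for r in rows)
    return sum(r[column] for r in rows if r[0] <= limit) / total


def mean_in_flight(rows, column):
    """Average 'rpcs in flight' value, weighted by RPC count."""
    total = sum(r[column] for r in rows)
    return sum(r[0] * r[column] for r in rows) / total


# Expected RPCs per OSC: 8 GiB block over 8 stripes, 256 pages of 4 KiB per RPC.
PAGE_SIZE = 4096  # assumed
rpc_bytes = 256 * PAGE_SIZE
expected_rpcs = (8 * 1024**3 // 8) // rpc_bytes

read_share = share_at_or_below(IN_FLIGHT, READ, 2)
write_share = share_at_or_below(IN_FLIGHT, WRITE, 2)
print(expected_rpcs, round(read_share, 3), round(write_share, 3))
print(round(mean_in_flight(IN_FLIGHT, READ), 2),
      round(mean_in_flight(IN_FLIGHT, WRITE), 2))
```

Under those assumptions the expected count matches the 1024 RPCs reported per direction, and the weighted average stays below 2 RPCs in flight for both reads and writes.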
| Comment by Gerrit Updater [ 06/Jan/22 ] | |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/39445/ | |
| Comment by Gerrit Updater [ 11/Jan/22 ] | |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/44154/ | |
| Comment by Gerrit Updater [ 11/Jan/22 ] | |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/44153/ | |
| Comment by Gerrit Updater [ 11/Jan/22 ] | |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/44209/ | |
| Comment by Gerrit Updater [ 14/Jan/22 ] | |
|
"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/46135 | |
| Comment by Gerrit Updater [ 14/Jan/22 ] | |
|
"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/46139 | |
| Comment by Gerrit Updater [ 26/Jan/22 ] | |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/39443/ | |
| Comment by Gerrit Updater [ 26/Jan/22 ] | |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/39438/ | |
| Comment by Gerrit Updater [ 26/Jan/22 ] | |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/44293/ | |
| Comment by Patrick Farrell [ 26/Jan/22 ] | |
|
All of the patches have landed to master for this. Moved the last few test changes to LU-15843. | |
| Comment by Cory Spitz [ 26/Jan/22 ] | |
|
> Moved the last few test changes to LU-15843 | |
| Comment by Peter Jones [ 26/Jan/22 ] | |
|
I think that it's LU-15483 |