[LU-16724] generic page pool Created: 10/Apr/23  Updated: 12/Oct/23

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor
Reporter: Patrick Farrell Assignee: Patrick Farrell
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Related
Rank (Obsolete): 9223372036854775807

 Description   

Looking at the unaligned DIO work in LU-13805, Andreas pointed out we should consider how we're getting pages for our unaligned DIO, the already existing client side encryption, and our future compression feature (LU-10092).

All three features need to allocate individual pages of memory for what are essentially bounce buffers.  The encryption feature uses a page pool, which seems like a good idea for the others.

This ticket is to try to settle on one way to do this and ideally have only a single page pool (if that's our solution).

sebastien , can you talk about why the page pool was chosen?  Was it performance or functionality?

I assume page pool performance is similar to or better than getting pages from the kernel allocator.  My only concern is wasting memory by having too large of a pool re-allocated pages.  In theory, we could need several GiB of bounce buffers if there is several GiB of outstanding IO.

However, it looks like there are shrinkers for the page pools, so if the pages are unused, the pool can be responsive to memory pressure.

So my first guess at a solution is this:
Have a single shared page pool ("lustre bounce buffers" or similar), with some fairly small pre-allocated size, then it can grow and shrink as needed.  Because page pools respond to kernel memory pressure, we shouldn't have the problem of holding on to unneeded memory.

So, we have one page pool for all our bounce buffers, and we don't worry too much about starting size except to keep it small, since the required size will be highly variable.  If a customer is not using any of these features, the total need is zero (so we should keep the cost small in this case).  If they are doing heavy IO while using one or more of these features, they could easily require several GiB at once.  So we pay the cost of allocating the necessary memory the first time it's used, then we don't have to pay that cost again unless the kernel asks us to shrink it.

How does this sound?  sebastien , adilger , ablagodarenko 



 Comments   
Comment by Artem Blagodarenko [ 10/Apr/23 ]

Hi paf0186 , I am changing sec_bulk right now to address client-size compression needs. I am about to finish and going to share the code soon. Can we get this patch as a base for the solution? Thanks.

Comment by Patrick Farrell [ 10/Apr/23 ]

Sure, Artem.

Just to make sure I understand:
You think a single shared page pool like this is a good idea and you'd like a patch so you can base current work on it?  (I am happy to do this patch)  Or do you want to use your patch as a base for this work?

Comment by Artem Blagodarenko [ 10/Apr/23 ]

paf0186  Actually I mean that my patch adds some kind of generalization to the current encryption pool, so probably we can use it as a base for the next optimizations. But if we decided to rewrite the code completely (for any important reason) I am ready to use this code ofcause.

Comment by Patrick Farrell [ 10/Apr/23 ]

OK, that makes sense - I'm glad I asked.  I am happy to wait for your patch, it will be easy to integrate a page pool in to the unaligned DIO code.  It just needs to get pages from somewhere, it doesn't really have any special requirements.

Comment by Patrick Farrell [ 10/Apr/23 ]

So, this is a bit of a separate question, but I wanted to answer it here because it's related and I think it's reasonable to ask:

I do not think we can, realistically, avoid extra buffering for the case where we're doing unaligned DIO and compression or encryption.  Compression and encryption both operate on the data at lower layers (basically OSC/ptlrpc), and they expect aligned memory, as does our RDMA (in general).  So something has to ensure the memory is aligned before it gets to these layers.  For buffered I/O, that's done by copying in to the page cache, for regular direct I/O, it is required in userspace.  For unaligned direct I/O, we have to do the aligning in the DIO code, which requires copying in to (/out of) an aligned buffer.  So we do end up with three copies in that case - userspace, a copy in the DIO layer, and one for compression (or encryption).

Avoiding that would require teaching the encryption and compression code to handle copying to/from unaligned user memory, and this means splitting the copying work up differently in a different spot (we currently split it by stripe).  This is possible, but it don't think it would be fun to code it up and I can see some big stumbling blocks around how we handle the iovec from VFS, etc, since we'd have many threads hitting it at once.

So I think this is worth considering if we end up with IO to compressed files as a major performance concern (for example, if they're often used for 'hot' or hot-ish data, rather than colder data), but I think it should be left for later rather than trying to work it in to the initial versions of any of this.

Comment by Andreas Dilger [ 10/Apr/23 ]

Patrick, my understanding is that the page pools were added for encryption for performance reasons, to reduce the overhead of repeated allocation/free of the bounce buffers. Sebastien discussed this in his LUG2021 Lustre Client Encryption presentation.

The compression page pool has some stronger requirements vs. the encryption or DIO page pool, because it needs multi-page allocations (either higher-order alloc_page() or vmalloc()) in order to provide the compression code with sufficient contiguous address space for the compression code to run (it unfortunately cannot currently work with page-fragmented source and/or target buffers). That should be a superset of the requirements for either of the other two uses.

Doing the unaligned DIO->buffer copying in the same step as compression/encryption would potentially be an optimization worthwhile to do if those use combined use cases are common, but it isn't clear whether they overlap a lot? The unaligned DIO will be useful for poorly-formed IO sized and aligned writes, which doesn't work well for either client-side compression or encryption since they both need to be page sized (or a multiple thereof for compression) and aligned.

Comment by Gerrit Updater [ 11/Sep/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52335
Subject: LU-16724 ptlrc: ptlrpc: extend sec bulk functionality
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 991aeaad4f43b7eec3bb33f7bc420aa5b92bf7ea

Comment by Patrick Farrell [ 11/Oct/23 ]

I'm also going to stick some cleanup patches under here.

Comment by Gerrit Updater [ 11/Oct/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52634
Subject: LU-16724 ptlrpc: rename 'pool' to 'pool_idx'
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 2fdf349f4c082bfee393ae08f0e297cfd580310c

Comment by Gerrit Updater [ 11/Oct/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52635
Subject: LU-16724 ptlrpc: rename 'size_bits' to 'order'
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: d74e47584a975f26e737cfa9a04c739872dd1849

Comment by Gerrit Updater [ 11/Oct/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52636
Subject: LU-16724 ptlrpc: rename get_buf to get_pages
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 8ab90ffeb5e599febf01cee904e994b5745f2ffe

Comment by Gerrit Updater [ 11/Oct/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52637
Subject: LU-16724 ptlrpc: improve usage of PAGES_POOL
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 4f0e0596087054d660fb3c5213b937e807ebe407

Comment by Gerrit Updater [ 11/Oct/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52638
Subject: LU-16724 ptlrpc: start removing 'enc' from pool
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: ca26b5c17bdc8d009f2e4c0b86cdde9bb92905ca

Comment by Gerrit Updater [ 11/Oct/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52639
Subject: LU-16724 ptlrpc: rename 'epp' to 'ppp'
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 740a4b822fb8f5a4d8305559b25481b870a0f849

Comment by Gerrit Updater [ 11/Oct/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52640
Subject: LU-16724 ptlrpc: remove more uses of 'enc'
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 6938f98d595ae091b226423a6b4928fbdc27d002

Comment by Gerrit Updater [ 11/Oct/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52641
Subject: LU-16724 ptlrpc: improve use of 'count'
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 3fec980b7aa8d3884b9af77cb2a664294a8e855c

Comment by Gerrit Updater [ 11/Oct/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52642
Subject: LU-16724 ptlrpc: simplify pools_should_grow
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: b89bbc04cf0800d4a92a7deace7a44e6cebf71d5

Comment by Gerrit Updater [ 11/Oct/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52643
Subject: LU-16724 ptlrpc: correct use of plural 'pools'
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 967aac3b114cd455ffda52d6de263fa6a853b9b7

Comment by Gerrit Updater [ 11/Oct/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52644
Subject: LU-16724 ptlrpc: reduce usage of pool_index
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 68452fa2564a4c6799091f49db5288ee988520e2

Comment by Gerrit Updater [ 11/Oct/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52645
Subject: LU-16724 ptlrpc: stop passing around pool_index
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: be0ede5b6c82e2af11da123afaa181495f71ec8d

Comment by Gerrit Updater [ 11/Oct/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52646
Subject: LU-16724 ptlrpc: convert to void
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: d93b3b9cfe4e3df472e0cae45af816d21345a755

Comment by Gerrit Updater [ 11/Oct/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52647
Subject: LU-16724 ptlrpc: refactor pool growing code
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: faa6fa99a6f5f0f8deba7640c1e4445f9e38feee

Comment by Gerrit Updater [ 11/Oct/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52648
Subject: LU-16724 ptlrpc: replace ELEMENT_SIZE
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: f7fc5ee18ce7e6e166765594a1f9d3f1e4c9a45d

Comment by Gerrit Updater [ 11/Oct/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52649
Subject: LU-16724 ptlrpc: simplify pool arrays
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 6546410d5422108d827961de7cf893f07121b56e

Comment by Gerrit Updater [ 11/Oct/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52650
Subject: LU-16724 ptlrpc: begin renaming pool_index to order
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 70303af1034f0db94b62bc6c147b1426dd78a1e9

Comment by Gerrit Updater [ 11/Oct/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52651
Subject: LU-16724 ptlrpc: rename ppp_index to ppp_order
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 04fce416bb07af06d159752f7fdc3ef5625b9076

Comment by Gerrit Updater [ 11/Oct/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52652
Subject: LU-16724 ptlrpc: rename INDEX macros
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: cb5364cdbef5dcb1bb121d3f91620fa058fff3eb

Comment by Gerrit Updater [ 12/Oct/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52662
Subject: LU-16724 ptlrpc: remove PAGES_POOL macro
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: f2618ea7bc85183fda56e4e31552e483c4592652

Comment by Gerrit Updater [ 12/Oct/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52663
Subject: LU-16724 ptlrpc: remove PAGES_PER_POOL macro
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 459ce6decc1425cf5cc541cf8ce910959a662cff

Comment by Gerrit Updater [ 12/Oct/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52664
Subject: LU-16724 ptlrpc: change "pool" to "page_ptrs"
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 6baf8186dfd1f196999a6593053266a320572c24

Comment by Gerrit Updater [ 12/Oct/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52665
Subject: LU-16724 ptlrpc: rename max_pools to max_ptr_pages
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: ddfbc141aa299ff12ebb6d3de3a369b55692495b

Comment by Gerrit Updater [ 12/Oct/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52666
Subject: LU-16724 ptlrpc: rename npools to nptr_pages
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: a24d26a514c73766eaeb3d43b66b5e4dea0363b4

Comment by Gerrit Updater [ 12/Oct/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52667
Subject: LU-16724 ptlrpc: rename 'pools' to 'ptr_pages'
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 31283ac63a783f82f5469bf8a01bf193d687836b

Generated at Sat Feb 10 03:29:28 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.