Details
- Type: Improvement
- Resolution: Unresolved
- Priority: Major
Description
With the DIO performance improvements in LU-13798 and LU-13799, it becomes interesting to do larger buffered i/o (BIO) using the DIO path, as in LU-13802.
LU-13802 covers the code for switching between the BIO and DIO paths, allowing BIO which meets the requirements for DIO to use the DIO path when appropriate.
The problem is, the requirements for DIO are sometimes hard to meet: the i/o must be both page aligned and size aligned. This ticket is about how to do unaligned DIO, so that any BIO can go through the DIO path.
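The classic alignment requirement can be expressed as a simple predicate. This is a userspace sketch with an assumed 4 KiB page size and a hypothetical helper name, not Lustre's actual check:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096UL	/* assumed page size, for illustration */

/* hypothetical helper: classic DIO is only possible when the file
 * offset, the transfer size, and the user buffer address are all
 * page aligned; unaligned DIO aims to lift this restriction */
static bool dio_aligned(uint64_t offset, size_t count, const void *buf)
{
	return (offset % PAGE_SIZE) == 0 &&
	       (count % PAGE_SIZE) == 0 &&
	       ((uintptr_t)buf % PAGE_SIZE) == 0;
}
```

Any i/o failing this predicate currently has to fall back to the buffered path; the rest of this ticket is about removing that fallback.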
This cannot be done with the existing Lustre i/o path. There are a few minor issues, but the central problem is that if an i/o is unaligned, we no longer have a 1-to-1 mapping between a page on the client and a page in the file/on the server. (Buffered i/o creates this 1-to-1 mapping by copying into an aligned buffer.) This 1-to-1 mapping could possibly be removed, but doing so would require a significant rework of the Lustre i/o path.
So, one option is creating a new DIO path which permits unaligned i/o from userspace all the way to disk.
The other option comes from the following observation:
When doing buffered i/o, about 20% of the time is spent allocating the buffer and doing the memcpy() into that buffer. Of the remaining 80%, something like 70% is page tracking of various kinds.
Because each page in the page cache can be accessed from multiple threads, including being flushed at any time by various threads (memory pressure via kswapd, lock cancellation, writeout, ...), it must be on various lists and hold references on (effectively) the file it belongs to, etc.
This work, not allocation and memcopy, is where most of the time goes.
So if we implement a simple buffering scheme - allocate an aligned buffer, then copy data to (or from) that buffer - and then do a normal DIO write (or read) from (or to) that buffer, this can be hugely faster than buffered i/o.
If we use the normal DIO path (i.e., sync write, and do not keep pages after read), this remains a buffer, not a cache, so the DIO path can stay lockless.
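The write side of the scheme can be sketched in userspace with POSIX APIs; the names are illustrative, Lustre would do this in the kernel, and this ignores how the server trims the transfer back to the original length:

```c
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 4096UL	/* assumed page size, for illustration */

/* round a byte count up to a page multiple */
static size_t dio_round_up(size_t len)
{
	return (len + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
}

/* hypothetical helper: allocate a page-aligned bounce buffer and copy
 * the (possibly unaligned) user data into it; the caller would then
 * submit 'aligned_len' bytes through the lockless DIO path and free
 * the buffer when the sync write completes */
static void *dio_bounce_setup(const void *user_buf, size_t len,
			      size_t *aligned_len)
{
	void *bounce;

	*aligned_len = dio_round_up(len);
	if (posix_memalign(&bounce, PAGE_SIZE, *aligned_len))
		return NULL;
	memcpy(bounce, user_buf, len);
	/* zero the tail so no stale heap data reaches the server */
	memset((char *)bounce + len, 0, *aligned_len - len);
	return bounce;
}
```

The key point is that the bounce buffer is private to this one i/o, so none of the page-cache tracking described above applies to it.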
Also, if we implement this correctly, we have a number of excellent options for speeding this up:
- Move allocation (if buffers are not pre-allocated) and the memcpy from the user thread to the ptlrpcd threads handling RPC submission. This lets these operations run in parallel, which should dramatically improve speed.
- Use pre-allocated buffers.
- Potentially, since we control the entire copying path, we could enable the FPU to use vectorized memcpy. (Various aspects of the buffered i/o path in the kernel mean the FPU has to be turned on and off for each page; that cost outweighs the benefit of vectorized memcpy.)
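The first option above - fanning the copy out to the RPC submission threads - can be sketched with pthreads standing in for the ptlrpcd threads; all names here are illustrative, not Lustre code:

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define NR_WORKERS 4	/* stand-in for the number of ptlrpcd threads */

struct copy_work {
	const char *src;
	char *dst;
	size_t len;
};

/* each worker copies one per-"RPC" chunk independently */
static void *copy_worker(void *arg)
{
	struct copy_work *w = arg;

	memcpy(w->dst, w->src, w->len);
	return NULL;
}

/* split one large copy into chunks and run them in parallel, as the
 * ptlrpcd threads could do while assembling their RPCs */
static void parallel_copy(char *dst, const char *src, size_t len)
{
	pthread_t tids[NR_WORKERS];
	struct copy_work work[NR_WORKERS];
	size_t chunk = len / NR_WORKERS;
	size_t off = 0;
	int i;

	for (i = 0; i < NR_WORKERS; i++) {
		size_t this_len = (i == NR_WORKERS - 1) ? len - off : chunk;

		work[i] = (struct copy_work){ src + off, dst + off, this_len };
		pthread_create(&tids[i], NULL, copy_worker, &work[i]);
		off += this_len;
	}
	for (i = 0; i < NR_WORKERS; i++)
		pthread_join(tids[i], NULL);
}
```

Because each bounce buffer belongs to exactly one i/o, the chunks never overlap and no locking is needed between the workers, which is what makes the offload attractive.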
Issue Links
- is blocked by
  - LU-17597 interop: master/2.15/2.14/2.12 sanity test_56x: migrate failed rc = 22 (Resolved)
- is related to
  - LU-13802 New i/o path: Buffered i/o as DIO (Open)
  - LU-18006 sanity test_119f: crash in ll_dio_user_copy (Resolved)
  - LU-17450 sanity: interop test failures with master+2.15 (Resolved)
  - LU-17525 Unaligned DIO interop with different page sizes fails (Resolved)
  - LU-18284 interop sanity test_119e test_119f: UDIO files differ, bsize 1048575, 2.12 servers crash (Resolved)
  - LU-13799 DIO/AIO efficiency improvements (Resolved)
  - LU-17156 sanityn test_16j: timeout (Resolved)
  - LU-17215 sanity/398q should use $tfile (Resolved)
  - LU-12550 automatic lockahead (Open)
  - LU-17433 async hybrid writes (Open)
  - LU-13798 Improve direct i/o performance with multiple stripes: Submit all stripes of a DIO and then wait (Resolved)
  - LU-17422 unaligned DIO: use page pools (Resolved)
  - LU-16964 I/O Path: Auto switch from BIO to DIO (Closed)
  - LU-17194 parallelize DIO submit (Closed)
  - LU-247 Lustre client slow performance on BG/P IONs: unaligned DIRECT_IO (Resolved)
  - LU-13814 DIO performance: cl_page struct removal for DIO path (Open)