Details
- Bug
- Resolution: Fixed
- Major
- Lustre 2.10.0
- 3
Description
Appending to a PFL file will cause all layout components to be instantiated because it isn't possible to know what the ending offset is at the time the write is started.
It would be better to avoid this, perhaps by locking/instantiating a larger, but not gigantic, range beyond the current EOF and, if that fails, retrying the layout intent. The client must currently be in charge of locking the file during append, so it should know at write time how much of the file to instantiate, and it could retry.
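As a rough illustration of the idea, here is a minimal sketch in Python rather than Lustre code. It assumes a layout is a list of half-open extents, and every name in it (`Component`, `append_window`, `components_to_instantiate`, `needs_retry`, the `slack` parameter) is hypothetical, not an existing Lustre interface. It picks only the components that overlap a bounded window past the current EOF, and checks afterwards whether the window turned out to be too small.

```python
from dataclasses import dataclass

INF = float("inf")  # stands in for the end of an 'eof' component


@dataclass
class Component:
    start: int                  # first byte covered by the component
    end: float                  # one past the last byte, or INF
    instantiated: bool = False


def append_window(eof, write_len, slack):
    """Range to cover: the write itself plus some slack beyond it."""
    return (eof, eof + write_len + slack)


def components_to_instantiate(layout, window):
    """Indices of uninstantiated components overlapping the window."""
    lo, hi = window
    return [i for i, c in enumerate(layout)
            if not c.instantiated and c.start < hi and c.end > lo]


def needs_retry(window, locked_eof, write_len):
    """True if EOF moved so far that the write no longer fits the window."""
    return locked_eof + write_len > window[1]
```

With components at [0, 1 MiB), [1 MiB, 1 GiB) and [1 GiB, eof), a 4 KiB append at EOF 4096 with 1 MiB of slack would select only the first two components. If another writer had pushed EOF past the window by the time the lock is held, `needs_retry` would signal that the layout intent has to be reissued with a new window.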
Issue Links
- is duplicated by
  - LU-10665 DoM: append to file causes OST component initialization (Resolved)
- is related to
  - LU-9479 sanity test 184d 244: don't instantiate PFL component when taking group lock (Open)
  - LU-10176 Data-on-MDT phase II (Open)
  - LU-13420 append to PFL-file without 'eof' component fails (Open)
  - LU-17694 sanity-compr test_184d: last component index number got assigned even if it was not used after layout swap (Open)
  - LU-15727 lod_get_default_lov_striping() misinterprets composite striping for append (Resolved)
  - LU-12738 PFL: append of PFL file should not instantiate full layout (Open)
  - LU-17159 Mark file layouts using append striping (Open)
- is related to
  - LU-10782 Enable tiny write append for singly striped non-composite file (Open)
  - LU-8998 Progressive File Layout (PFL) (Resolved)
Andreas, no, I don't have a size distribution for O_APPEND files, sorry. The users scatter and name their Slurm job output files in various ways, so I would have to scan the whole file system and guess at naming conventions. And even that might not catch other files that happened to be created with O_APPEND.
If you can suggest a way to scan the filesystem for files which are not using their last instantiated extent, I'm happy to try to provide more data.
Capping the size on O_APPEND files is potentially useful, but also violates the principle of least surprise on a POSIX-like filesystem, and would lead to very unhappy and confused users if writes fail unexpectedly. Hence my suggestion of "truncating" the PFL layout to N extents, and keeping the extent end of the last component. Hopefully it would be fairly easy to take the layout that would otherwise be created on O_APPEND and just set the layout to the first N components, modifying the last component end to be the original layout's last component end. No additional pool specification needed, no max size limit surprises, the system "just works" for the users.
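To make the "truncate to N extents" suggestion concrete, here is a minimal sketch, assuming a layout can be modelled as a list of `(start, end)` extents. The function name `truncate_layout` and the tuple representation are made up for illustration and are not Lustre APIs.

```python
INF = float("inf")  # stands in for an 'eof' extent end


def truncate_layout(extents, n):
    """Keep the first n extents; stretch the last kept one to the
    end of the original layout's final extent."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if n >= len(extents):
        return list(extents)          # nothing to truncate
    kept = list(extents[:n])
    last_start, _ = kept[-1]
    kept[-1] = (last_start, extents[-1][1])
    return kept
```

For a three-component layout `[(0, 1 MiB), (1 MiB, 1 GiB), (1 GiB, INF)]` and `n = 2`, this gives `[(0, 1 MiB), (1 MiB, INF)]`: the file can still grow without limit, but an append would instantiate at most two components.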