[LU-16695] switch Lustre to use IOCB_APPEND and IOCB_DIRECT instead of file flags
Created: 31/Mar/23  Updated: 13/Dec/23  Resolved: 13/Dec/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.16.0

Type: Improvement
Priority: Minor
Reporter: Patrick Farrell
Assignee: Patrick Farrell
Resolution: Fixed
Votes: 0
Labels: None

Issue Links:
Related
is related to LU-17116 losetup --direct-io=on trigger LASSER... Resolved

 Description   

Reading O_DIRECT and O_APPEND from file->f_flags to make decisions about IO exposes us to races with fcntl, which can change those flags on an open file.  The upstream kernel fixed this in 2015 by mirroring them into the IOCB flags.  Let's use these.
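To illustrate why the file flags are not a stable input, here is a minimal userspace sketch showing that O_APPEND can be toggled on an already-open descriptor with fcntl(F_SETFL). The helper names (open_scratch, append_is_set, set_append) and the scratch path are hypothetical and not taken from the Lustre patch; the sketch only demonstrates that the flag is mutable, not the race itself.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Open an anonymous scratch file (hypothetical helper for this sketch). */
static int open_scratch(void)
{
	char path[] = "/tmp/lu16695-XXXXXX";
	int fd = mkstemp(path);

	if (fd >= 0)
		unlink(path);	/* keep the fd, drop the name */
	return fd;
}

/* Return 1 if O_APPEND is currently set on fd, 0 if not, -1 on error. */
static int append_is_set(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags < 0)
		return -1;
	return (flags & O_APPEND) != 0;
}

/* Set (on == 1) or clear (on == 0) O_APPEND on an already-open fd. */
static int set_append(int fd, int on)
{
	int flags;

	assert(on == 0 || on == 1);
	flags = fcntl(fd, F_GETFL);
	if (flags < 0)
		return -1;
	flags = on ? (flags | O_APPEND) : (flags & ~O_APPEND);
	return fcntl(fd, F_SETFL, flags);
}
```

Calling set_append(fd, 1) and then append_is_set(fd) on a descriptor from open_scratch() shows the flag changing after open, which is why code that re-reads file->f_flags at several points during one IO may see different answers.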



 Comments   
Comment by Patrick Farrell [ 31/Mar/23 ]

Note: I'm not aware of any actual issues here, but there is a theoretical problem.  This change is also required for switching buffered IO to direct IO inside the kernel, because we need to modify these flags and we can't safely do that on the actual file flags.
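The idea of mirroring the flags into the IOCB can be sketched with a small userspace model. This is not kernel or Lustre code: the model_* structs, the MODEL_* constants (whose values are arbitrary), and the helper names are all assumptions made for illustration. The sketch copies the file flags into per-request flags once at request setup, so a later change to the file flags does not affect a request already in flight.

```c
#include <assert.h>
#include <stddef.h>

/* Model-only flag values; the kernel's real constants differ. */
#define MODEL_O_APPEND		0x1u
#define MODEL_O_DIRECT		0x2u
#define MODEL_IOCB_APPEND	0x1u
#define MODEL_IOCB_DIRECT	0x2u

/* Shared, mutable per-file state (stand-in for struct file). */
struct model_file {
	unsigned int f_flags;
};

/* Per-request state (stand-in for struct kiocb). */
struct model_kiocb {
	struct model_file *ki_filp;
	unsigned int ki_flags;
};

/* Snapshot the file flags into the request once, at setup time. */
static void model_init_kiocb(struct model_kiocb *iocb, struct model_file *file)
{
	assert(iocb != NULL && file != NULL);
	iocb->ki_filp = file;
	iocb->ki_flags = 0;
	if (file->f_flags & MODEL_O_APPEND)
		iocb->ki_flags |= MODEL_IOCB_APPEND;
	if (file->f_flags & MODEL_O_DIRECT)
		iocb->ki_flags |= MODEL_IOCB_DIRECT;
}

/* Old-style check: re-reads the shared file flags on every call. */
static int direct_by_file(const struct model_kiocb *iocb)
{
	return (iocb->ki_filp->f_flags & MODEL_O_DIRECT) != 0;
}

/* New-style check: reads the per-request snapshot only. */
static int direct_by_iocb(const struct model_kiocb *iocb)
{
	return (iocb->ki_flags & MODEL_IOCB_DIRECT) != 0;
}
```

In this model, clearing MODEL_O_DIRECT on the file after model_init_kiocb() flips the result of direct_by_file() but leaves direct_by_iocb() unchanged, so the request sees one consistent answer for its whole lifetime.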

Comment by Gerrit Updater [ 31/Mar/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50493
Subject: LU-16695 llite: switch to ki_flags from f_flags
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 6d057bbfe727f4372e19326ea225e2b7736d23cb

Comment by Guillaume Courrier [ 14/Sep/23 ]

We ran into an issue when using loop devices in direct I/O mode to manage file systems backed by files in Lustre. I opened a ticket before I found this one: https://jira.whamcloud.com/browse/LU-17116

I'm happy to close it if it is considered to be a duplicate. In the worst case, on master the client can crash during a read (see the ticket for more details). We initially ran into the issue on 2.12, where we only got an Input/Output error. On master, the client crashes right after losetup finishes, because a read is triggered on the loop device.

I will try to see if your patch fixes the issue. I think we should at least add a test to check whether losetup --direct-io=on can be used successfully.

Comment by Guillaume Courrier [ 14/Sep/23 ]

The patch seems to fix the issue.

Comment by Patrick Farrell [ 14/Sep/23 ]

Guillaume,

Thanks for your report in LU-17116 - I was aware of this more as a compatibility issue and a small theoretical issue with someone changing those flags.  I wasn't aware we'd ever see IOCB_DIRECT without O_DIRECT.  That's a very good catch.  And I can work on getting this patch actually merged, because previously it was just in the (lengthy) backlog.
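The mismatch described here, a request carrying IOCB_DIRECT on a file that was not opened with O_DIRECT, can be shown with a small userspace model. Again, this is a sketch under assumptions: the model_* types, MODEL_* constants, and path_from_* helpers are hypothetical and do not correspond to actual Lustre or kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Model-only flag values; the kernel's real constants differ. */
#define MODEL_O_DIRECT		0x2u
#define MODEL_IOCB_DIRECT	0x2u

enum model_io_path {
	MODEL_PATH_BUFFERED,
	MODEL_PATH_DIRECT,
};

struct model_file {
	unsigned int f_flags;	/* flags the file was opened with */
};

struct model_kiocb {
	struct model_file *ki_filp;
	unsigned int ki_flags;	/* flags for this one request */
};

/* Path selection based on the file flags (the pre-patch style of check). */
static enum model_io_path path_from_file(const struct model_kiocb *iocb)
{
	assert(iocb != NULL && iocb->ki_filp != NULL);
	return (iocb->ki_filp->f_flags & MODEL_O_DIRECT) ?
		MODEL_PATH_DIRECT : MODEL_PATH_BUFFERED;
}

/* Path selection based on the request flags (the ki_flags style of check). */
static enum model_io_path path_from_iocb(const struct model_kiocb *iocb)
{
	assert(iocb != NULL);
	return (iocb->ki_flags & MODEL_IOCB_DIRECT) ?
		MODEL_PATH_DIRECT : MODEL_PATH_BUFFERED;
}
```

With f_flags left at 0 and ki_flags set to MODEL_IOCB_DIRECT, path_from_file() selects the buffered path while path_from_iocb() selects the direct path. A filesystem that mixes the two checks within one IO would disagree with itself about which path it is on.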

Comment by Patrick Farrell [ 14/Sep/23 ]

Guillaume,

Actually, since you generated a patch in LU-17116, would you be able to rebase and clean up the patch for LU-16695?  It's tough for me to find time for that right now.  I can help if you have questions or issues.

Comment by Guillaume Courrier [ 15/Sep/23 ]

I will look into it.

Comment by Gerrit Updater [ 27/Nov/23 ]

"Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/53248
Subject: LU-16695 llite: switch to ki_flags from f_flags
Project: fs/lustre-release
Branch: b2_15
Current Patch Set: 1
Commit: 9795cdb4f4230ae654990ccae5a2803ad850954e

Comment by Gerrit Updater [ 13/Dec/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50493/
Subject: LU-16695 llite: switch to ki_flags from f_flags
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: dad7079dfd9d1e17af15a2df67e76605db677e84

Comment by Peter Jones [ 13/Dec/23 ]

Landed for 2.16

Generated at Sat Feb 10 03:29:13 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.