[LU-15608] DIO broken for encrypted files Created: 01/Mar/22  Updated: 31/Mar/22  Resolved: 17/Mar/22

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.15.0
Fix Version/s: Lustre 2.15.0

Type: Bug Priority: Blocker
Reporter: Sebastien Buisson Assignee: Sebastien Buisson
Resolution: Fixed Votes: 0
Labels: encryption

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

With direct IO, the page index is not calculated correctly when doing encryption and decryption. Because direct IO pages are not proper page cache pages, we have to work out for ourselves the page mapping and the page index of the page to be encrypted/decrypted. For this we use (struct brw_page *)->off. Unfortunately, this offset is not relative to the whole file; it is only an offset within the object. As a result, the buffered IO and direct IO cases do not use consistent indexes. More specifically, the indexes used in the direct IO case can repeat (when pages at the same offset but in different stripes are accessed), so data written through one IO path may not decrypt correctly through the other.
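
To illustrate the index collision described above, here is a minimal sketch. It is not the Lustre code: it assumes a plain RAID0-style layout and 4 KiB pages, and the names sketch_layout, sketch_obj_to_file_off and sketch_page_index are hypothetical helpers introduced only for this example. It shows that an object-relative offset of 0 gives page index 0 in every stripe, while the file-relative offset gives a distinct index per stripe.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration only: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical layout descriptor: plain RAID0-style striping. */
struct sketch_layout {
	uint64_t stripe_size;  /* bytes per stripe chunk */
	uint32_t stripe_count; /* number of objects the file is striped over */
};

/*
 * Map an offset inside one object (stripe) to the corresponding
 * offset in the whole file, under the assumed RAID0 layout.
 */
static uint64_t sketch_obj_to_file_off(const struct sketch_layout *lo,
				       uint32_t stripe_idx, uint64_t obj_off)
{
	assert(lo->stripe_size > 0 && stripe_idx < lo->stripe_count);

	uint64_t chunk = obj_off / lo->stripe_size;  /* chunk number inside the object */
	uint64_t within = obj_off % lo->stripe_size; /* byte offset inside that chunk */

	return (chunk * lo->stripe_count + stripe_idx) * lo->stripe_size + within;
}

/* Page index of a byte offset. */
static uint64_t sketch_page_index(uint64_t off)
{
	return off >> SKETCH_PAGE_SHIFT;
}
```

With a 1 MiB stripe size and 2 stripes, object offset 0 maps to page index 0 in both objects when used directly, but to file-relative page indexes 0 and 256 once converted with sketch_obj_to_file_off(). The latter are the indexes a buffered IO on the same data would see.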



 Comments   
Comment by Andreas Dilger [ 01/Mar/22 ]

Does it make sense to properly initialize the page->index value on the DIO pages before they are submitted, so that this can be used by the fscrypt code?

Comment by Sebastien Buisson [ 01/Mar/22 ]

As these pages are not page cache pages, I would be afraid to break something by changing their ->index value.

Comment by Gerrit Updater [ 01/Mar/22 ]

"Sebastien Buisson <sbuisson@ddn.com>" uploaded a new patch: https://review.whamcloud.com/46664
Subject: LU-15608 sec: use correct page index for DIO on enc file
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 0b27b8934ab72936baee54b1973e9ea9611b1507

Comment by Gerrit Updater [ 02/Mar/22 ]

"Sebastien Buisson <sbuisson@ddn.com>" uploaded a new patch: https://review.whamcloud.com/46673
Subject: LU-15608 debug: test correctness of DIO for encrypted files
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 4d7427cb21ff967830c9cb749985af5916c06ce4

Comment by Gerrit Updater [ 02/Mar/22 ]

"Sebastien Buisson <sbuisson@ddn.com>" uploaded a new patch: https://review.whamcloud.com/46674
Subject: LU-15608 debug: test correctness of DIO for encrypted files
Project: fs/lustre-release
Branch: b2_14
Current Patch Set: 1
Commit: 04bba737098b551a2faf2cda01085700150f444f

Comment by Andreas Dilger [ 10/Mar/22 ]

I was wondering whether there needs to be a way to turn this fix off, in case there are files that were previously only read/written with O_DIRECT by an old version of the code? It seems somewhat unlikely, but possible, and to userspace it would look as if the upgrade had corrupted all of those files...

I don't think there are a lot of users running 2.14, fewer of those running fscrypt, and even fewer of those that would use O_DIRECT, but something to keep in mind in the future if there is such a problem.

Comment by Sebastien Buisson [ 10/Mar/22 ]

I would rather publish a note explaining that encrypted files created in 2.14 with O_DIRECT are likely to be corrupted when upgrading to 2.15, so they need to be recreated without O_DIRECT ('read with O_DIRECT' > 'write without O_DIRECT') before upgrading to 2.15.

As you say, ending up in this situation is a pretty unlikely combination of factors, and I am not a big fan of adding an on/off switch for a fix.
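
The recreate step mentioned above ('read with O_DIRECT' > 'write without O_DIRECT') could look roughly like the sketch below. This is an illustration only, not a supported tool: copy_dio_to_buffered is a hypothetical helper, and the 1 MiB buffer size and 4096-byte alignment are assumptions chosen to satisfy typical direct IO alignment requirements. It would have to be run on the old client, before upgrading, so that the source file is read back the same way it was written.

```c
#define _GNU_SOURCE /* exposes O_DIRECT in <fcntl.h> on Linux/glibc */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define SKETCH_BUF_SIZE (1 << 20) /* assumption: 1 MiB transfer size */
#define SKETCH_ALIGN    4096      /* assumption: alignment accepted for DIO */

/*
 * Copy src to dst, reading with O_DIRECT and writing through the
 * page cache. Returns 0 on success or a negative errno value.
 */
static int copy_dio_to_buffered(const char *src, const char *dst)
{
	void *buf = NULL;
	int in = -1, out = -1, rc;

	/* Direct IO needs an aligned user buffer. */
	rc = posix_memalign(&buf, SKETCH_ALIGN, SKETCH_BUF_SIZE);
	if (rc != 0)
		return -rc;

	in = open(src, O_RDONLY | O_DIRECT);
	if (in < 0) {
		rc = -errno;
		goto done;
	}
	/* No O_DIRECT here: the new copy is written with buffered IO. */
	out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (out < 0) {
		rc = -errno;
		goto done;
	}

	for (;;) {
		ssize_t n = read(in, buf, SKETCH_BUF_SIZE);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			rc = -errno;
			break;
		}
		if (n == 0) /* end of file */
			break;

		char *p = buf;
		while (n > 0) {
			ssize_t w = write(out, p, (size_t)n);

			if (w < 0) {
				if (errno == EINTR)
					continue;
				rc = -errno;
				goto done;
			}
			p += w;
			n -= w;
		}
	}
	if (rc == 0 && fsync(out) < 0)
		rc = -errno;
done:
	if (in >= 0)
		close(in);
	if (out >= 0)
		close(out);
	free(buf);
	if (rc != 0)
		fprintf(stderr, "copy %s -> %s failed: %s\n", src, dst, strerror(-rc));
	return rc;
}
```

A caller would invoke copy_dio_to_buffered("old_file", "new_file") for each affected file and replace the original only after checking the return value. On filesystems that do not support O_DIRECT, the helper is expected to fail with -EINVAL.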

Comment by Gerrit Updater [ 17/Mar/22 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/46664/
Subject: LU-15608 sec: fix DIO for encrypted files
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 966ca46e4aa2eb39c70e49648ffe6fcaaf475536

Comment by Gerrit Updater [ 17/Mar/22 ]

"Sebastien Buisson <sbuisson@ddn.com>" uploaded a new patch: https://review.whamcloud.com/46855
Subject: LU-15608 sec: fix DIO for encrypted files
Project: fs/lustre-release
Branch: b2_14
Current Patch Set: 1
Commit: faa620a653421e08d17614bea28c093356ad49b2

Comment by Peter Jones [ 17/Mar/22 ]

Landed for 2.15

Generated at Sat Feb 10 03:19:46 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.