[LU-15608] DIO broken for encrypted files Created: 01/Mar/22 Updated: 31/Mar/22 Resolved: 17/Mar/22 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.15.0 |
| Fix Version/s: | Lustre 2.15.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Sebastien Buisson | Assignee: | Sebastien Buisson |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | encryption | ||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
With direct IO, the page index is not calculated correctly during encryption and decryption. Because direct IO pages are not proper page cache pages, we have to retrieve the page mapping and the page index of the page to be encrypted/decrypted ourselves. For this we use (struct brw_page*)->off. Unfortunately, this offset is not relative to the whole file; it is only an offset within the object. As a result, the buffered IO and direct IO cases do not use consistent indexes. More specifically, the indexes used in the direct IO case can repeat (when we access pages at the same offset but in different stripes), which is bad. |
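The collision described above can be illustrated with a small arithmetic sketch. It is not Lustre code: it assumes a plain RAID0-style layout, a 4 KiB page size, and a 1 MiB stripe size, and the helper names `object_to_file_offset` and `page_index` are hypothetical. It only shows that two pages at the same object offset in different stripes share an object-relative index while their file-relative indexes differ.

```python
# Illustrative sketch only; assumes a simple RAID0-style striping layout.
PAGE_SIZE = 4096  # assumed page size in bytes


def object_to_file_offset(obj_off, stripe_idx, stripe_size, stripe_count):
    """Map an offset within one stripe object to an offset in the whole file."""
    stripe_row = obj_off // stripe_size   # how many full stripes precede it in the object
    within = obj_off % stripe_size        # position inside the current stripe chunk
    return (stripe_row * stripe_count + stripe_idx) * stripe_size + within


def page_index(offset):
    """Page index corresponding to a byte offset."""
    return offset // PAGE_SIZE


if __name__ == "__main__":
    stripe_size = 1 << 20  # 1 MiB, assumed
    stripe_count = 2       # assumed
    obj_off = 0            # same object offset in both stripes

    for stripe_idx in range(stripe_count):
        file_off = object_to_file_offset(obj_off, stripe_idx, stripe_size, stripe_count)
        print(
            f"stripe {stripe_idx}: object-relative index {page_index(obj_off)}, "
            f"file-relative index {page_index(file_off)}"
        )
```

Under these assumptions, an index derived from the object offset is identical for both stripes, whereas an index derived from the file offset is unique per page, which is what the buffered IO path relies on.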
| Comments |
| Comment by Andreas Dilger [ 01/Mar/22 ] |
|
Does it make sense to properly initialize the page->index value on the DIO pages before they are submitted, so that this can be used by the fscrypt code? |
| Comment by Sebastien Buisson [ 01/Mar/22 ] |
|
As these pages are not page cache pages, I would be afraid to break something by changing their ->index value. |
| Comment by Gerrit Updater [ 01/Mar/22 ] |
|
"Sebastien Buisson <sbuisson@ddn.com>" uploaded a new patch: https://review.whamcloud.com/46664 |
| Comment by Gerrit Updater [ 02/Mar/22 ] |
|
"Sebastien Buisson <sbuisson@ddn.com>" uploaded a new patch: https://review.whamcloud.com/46673 |
| Comment by Gerrit Updater [ 02/Mar/22 ] |
|
"Sebastien Buisson <sbuisson@ddn.com>" uploaded a new patch: https://review.whamcloud.com/46674 |
| Comment by Andreas Dilger [ 10/Mar/22 ] |
|
I was wondering whether there needs to be a way to turn this fix off, in case there are files that were previously only read/written with O_DIRECT by an old version of the code? It seems somewhat unlikely, but possible, and it would appear to userspace as if the upgrade had corrupted all of those files... I don't think there are a lot of users running 2.14, fewer of those running fscrypt, and even fewer of those that would use O_DIRECT, but it is something to keep in mind in the future if there is such a problem. |
| Comment by Sebastien Buisson [ 10/Mar/22 ] |
|
I would rather publish a note explaining that encrypted files created in 2.14 with O_DIRECT are likely to be corrupted when upgrading to 2.15, so they need to be recreated without O_DIRECT ('read with O_DIRECT', then 'write without O_DIRECT') before upgrading to 2.15. As you say, ending up in this situation requires a pretty unlikely combination of factors, and I am not a big fan of adding an on/off switch for a fix. |
| Comment by Gerrit Updater [ 17/Mar/22 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/46664/ |
| Comment by Gerrit Updater [ 17/Mar/22 ] |
|
"Sebastien Buisson <sbuisson@ddn.com>" uploaded a new patch: https://review.whamcloud.com/46855 |
| Comment by Peter Jones [ 17/Mar/22 ] |
|
Landed for 2.15 |