[LU-14677] lfs migrate/mirror of encrypted files Created: 11/May/21  Updated: 16/Apr/23  Resolved: 23/Dec/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.14.0, Lustre 2.15.0
Fix Version/s: Lustre 2.15.0

Type: Improvement Priority: Minor
Reporter: Andreas Dilger Assignee: Sebastien Buisson
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-16374 Implement backup/restore of encrypted... Open
is related to LU-16091 Set S_ENCRYPTED flag on OST objects f... Resolved
is related to LU-12275 Client-side file data encryption Resolved
is related to LU-13717 Client-side encryption - support file... Resolved
is related to LU-15790 Fix mirror read/write in Test Plan fo... Open
Rank (Obsolete): 9223372036854775807

 Description   

It should be possible to use "lfs migrate" and "lfs mirror extend/resync" of encrypted files, even if the process does not have access to the key. There is no need for these operations to actually understand the file data, only to read the data in whole-chunk units and then write it to a new file copy/mirror before updating the file layout.

Barriers to handling this properly are:

  • need to be able to open() the encrypted file without a key, since this currently returns -ENOKEY to callers, so that they don't accidentally try to read/write unintelligible data. This could be handled with a special open flag (e.g. something similar to O_LOV_DELAY_CREATE for tools that know what they are doing. This should be done with an llapi_file_open_fscrypt() helper to isolate the logic in case it needs to be changed in the future.
  • data tools should always read/write the full 4KB chunk of encrypted data. IMHO, this should imply that the file size is always reported to userspace rounded up to the next 4KB chunk size. That shouldn't affect processes that have the key, and helps somewhat to avoid fingerprinting files based on their size.

For file backup/restore, the tools would also need to be able to read/save/restore the unique salt/nonce stored with the file so that the file can later be decrypted again. Special considerations are needed when restoring an encrypted file, since it is typically not possible to create "unencrypted" files in a directory with an encryption key, and similarly setting an encryption key on a non-empty directory is not possible. As such backup/restore of cyphertext files should be handled in a separate ticket. In the meantime, it would be possible to backup restore encrypted files in plaintext (presumably to a medium that is itself encrypted) if the backup tools have access to a master key added to each fscrypt directory key).



 Comments   
Comment by Sebastien Buisson [ 17/May/21 ]

When you say

IMHO, this should imply that the file size is always reported to userspace rounded up to the next 4KB chunk size.

You mean without the encryption key, right?
I agree with this, and it should make implementation of access without key much easier.

As fscrypt's core principle is to only have clear text pages in the page cache, we must be careful: pages that belong to files opened without the encryption key must be removed from the page cache when files are closed. Alternatively, we could support O_DIRECT access only, in the no encryption key case. lfs mirror uses O_DIRECT, which is fine, but do we need to support other commands that would not proceed with O_DIRECT?

Another aspect to keep in mind is compatibility with the future name encryption feature. Indeed, without this feature, tools such as lfs migrate/mirror can just use the plain text file names to proceed to file migration without the encryption key. But once name encryption feature is landed, the names as seen by the applications will be different with and without the key. In this case, is it still viable to want to migrate files without the encryption key?

Comment by Andreas Dilger [ 18/May/21 ]

It is definitely OK to require O_DIRECT for data movers, even if that is slightly less fast than cached, unless we link in libaio for doing data copies. It is likely that using AIO/DIO is even faster than buffered IO in the end.

As for the filenames, it seems to me that it shouldn't matter if the filename is "test.out" or "wT58-Ry7", so long as opening the file allows both read and write of the file data.

Comment by Sebastien Buisson [ 18/May/21 ]

Regarding file names, I was wondering how migration tools would identify files they want to migrate. If this is from a list built when encryption key was available, then the names in this list cannot be used without the key. But if that list is built without the key, for instance based on some attributes on the files, then it is fine, the names will be encoded cipher text names suitable for use.

Now that I started working on implementing file read/write (in O_DIRECT) without the key, I realize that the question of the file size is pretty tricky. In READ, returning the file size rounded up to the next LUSTRE_ENCRYPTION_UNIT_SIZE is mandatory, otherwise the application would not get valid and entire encrypted blocks. But then in WRITE, having this rounded up file size raises an issue: the object size stored on disk on server side also gets rounded up. So when reading the file with the encryption key later on, the application will see a file size rounded up to the next LUSTRE_ENCRYPTION_UNIT_SIZE, and will get file content padded with trailing zeros...
I have not been able to identify a way through this problem yet.

Comment by Andreas Dilger [ 18/May/21 ]

It would be possible to set the size on the mirror file based on the current size stored on the MDT inode (via fetched via statx() to avoid an OST RPC). However, this may result in the OST truncating the object and losing the end of the page, which gets us no further ahead than a short read. I can't think of any obvious interface that shrinks the size without actually truncating the data. fallocate() is able to allocate blocks beyond EOF, but I think fallocate(PUNCH_HOLE) will also release space at the end of the file.

We already need to set timestamps on mirrors (LU-14508) from the client. I wonder if it would be better to transfer attributes from the source to the mirror directly from the MDS during mirror extend, so that we don't need to manage this in userspace?

Comment by Sebastien Buisson [ 21/May/21 ]

I think I managed to find a way to set the clear text size on the object stored on disk on server side, by using the o_size field of the request's body. This is similar to what was done in patch #36146 to properly handle encrypted object size.
With this, I have a prototype that is capable of supporting lfs mirror resync without the encryption key.

But I found a problem with lfs migrate/extend/split that also affects access with the key, both in master and 2.14.
The operations on encrypted file mirrors that I tested originally, and that are landed in sanity-sec test_52, are of this kind:

lfs mirror create -N -i0 -N -i1 $testfile
dd if=/dev/zero of=$testfile bs=5000 count=1 conv=fsync
lfs mirror resync $testfile
lfs mirror verify $testfile
lfs mirror read -N 2 -o $mirror2 $testfile
lfs mirror write -N 1 -i $tmpfile $testfile

In this case, it works fine on encrypted files, because the mirror layout is created initially, and then it is all about reading and writing in mirrors (with O_DIRECT).

However, using lfs migrate/extend/split is a whole different story. In this case, the original, possibly non-mirror layout is modified into a mirror layout, then data is copied between mirrors, and at the end it is possibly switched back to a non-mirror layout. IIUC, these layout changes are carried out via volatile files. For instance, in the case of lfs migrate, in broad outline a temporary file with the target layout is created as volatile, then data is copied from the source components to the target components, and at the end the layouts are swapped between the original file and the volatile file, and the volatile file is finally removed.
When these operations are carried out on an encrypted file, the volatile file is assigned an encryption context that is different from the original file's (this encryption context is automatically generated at file creation by fscrypt). So when data is written to the new layout components, it gets encrypted with the volatile file's encryption key. Unfortunately, later in the migration process, this key gets destroyed with the volatile file being removed. So after migration, the file does contain encrypted content, but that cannot be decrypted with the file's encryption key.

A solution to fix this would be to set the volatile file's encryption context to be the same as the original file's. However it does not seem possible to pass additional information when creating a new file via the open syscall. Maybe with mknod through the dev parameter, but I am not sure lfs migrate calls mknod. Or maybe the setstripe family could be extended to force an encryption context given as input? Otherwise, we would have to allow creation of the volatile file without an encryption context, and then define a new ioctl to explicitly set an encryption context.
If we manage to have the volatile file with the same encryption context as the original file's, then data encrypted in the new layout components should be valid after the migration procedure.

Anyway, this problem with lfs migrate/extend/split needs to be fixed before we move forward and look at migration/mirroring without the encryption key.

Comment by Andreas Dilger [ 21/May/21 ]

Could we allow an ioctl on a volatile file to copy the encryption context from the source file after it is opened?

Comment by Sebastien Buisson [ 21/May/21 ]

That might be possible. The difficulty is to allow the creation of the volatile file without associating any encryption context. This is why I was thinking about creating the volatile file directly with the proper encryption context, with the "setstripe" ioctl. But I do not know if this makes sense, and if lfs migrate is using, or could be changed to use, the "setstripe" ioctl to create the volatile file.

Comment by Andreas Dilger [ 21/May/21 ]

Since the volatile file is not bound to the namespace, it could even be created outside the encrypted directory tree. The reason that volatile files are created in the same directory as the source are:

  • inherit the default layout from the parent directory
  • create on the same MDT as the source file (but that has to be manually set for striped directories anyway, so is likely no longer an issue)
  • user has permission to create a file there. On the one hand, creating a volatile file could be done anywhere, but it also doesn't seem like a good idea to allow unprivileged users to create files if they have no write permission at all (eg. read-only filesystem, in another encrypted directory where they may inherit the security context of another user, etc.)
Comment by Sebastien Buisson [ 27/May/21 ]

Looking into this further, my understanding is that for lfs migrate and lfs mirror extend, file content is read by the client from the old layout components, and then rewritten to the new layout components. From a client encryption perspective, it means data will be decrypted and then re-encrypted.
For lfs mirror split however, it seems that this is a server side (MDS) only operation on the layouts. When the split mirror is not destroyed, a new file (not volatile) is created and is assigned the split mirror layout components. From a client encryption perspective, it means this new file has its own encryption context, but holds data encrypted with the original file's encryption context. Which is a problem to further get access to plain text data.

Am I correct in my understanding? Is there a way to proceed to mirror splitting on client side (ie like migrate or extend), or is the server side fashion the only way to go?

Comment by Andreas Dilger [ 27/May/21 ]

I would argue that mirror extend/resync may often happen without the key (eg. some remote service node watching a Changelog), and ideally even if the key is available it should not decrypt+encrypt the data, since that is wasted effort.

Comment by Sebastien Buisson [ 27/May/21 ]

Allowing access to cipher text content without the key is one thing, but getting access to cipher text content instead of plain text when the key is available is maybe trampling fscrypt principles underfoot

Comment by Andreas Dilger [ 27/May/21 ]

For mirror resync we already have to do O_DIRECT read/write to specific mirrors, since we don't want the stale data to appear in the page cache

Comment by Sebastien Buisson [ 28/May/21 ]

Having lfs mirror split -d on encrypted files to destroy existing layout components seems to be feasible, with the same code changes as the ones required to have lfs migrate and lfs mirror extend.

But enabling lfs mirror split on encrypted files without the -d option poses a security issue.

  • As this is a server side (MDS) only operation on the layouts, the only possible way to implement it would be to create the new file (the one with the layout of the split mirror) with the same encryption context (ie same per-file encryption key) as the original file's, so that its content can be accessed later on.
  • However, one of the core principles of file system level encryption is that the same plaintext in two files must not map to the same cipher text, or vice versa.

For applications, it would be possible to get the same behavior as lfs mirror split -f by following these steps:

  1. get layout of targeted mirror from original file
  2. create new file with given layout
  3. lfs mirror read -N <targeted mirror> <original file> > <new file>
  4. lfs mirror split --mirror-id <targeted mirror> -d <original file>
Comment by Andreas Dilger [ 28/May/21 ]

I don't know if "lfs mirror split -f" is used very often. My first thought would be - just have it return an error for encrypted files, and see if anyone complains.

Comment by Gerrit Updater [ 31/May/21 ]

Sebastien Buisson (sbuisson@ddn.com) uploaded a new patch: https://review.whamcloud.com/43878
Subject: LU-14677 sec: migrate/extend/split on encrypted file
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: f587ce5b284533497c38f116642b6b2ed4e6ddbf

Comment by Sebastien Buisson [ 31/May/21 ]

Patch #43878 implements support for lfs migrate and lfs mirror extend/split on encrypted files, by making sure the volatile file used to proceed to the operation is assigned the same encryption context as the original file. With the noticeable limitation that lfs mirror split -d (mirror deletion) is the only supported split operation on encrypted files.

Comment by Gerrit Updater [ 17/Jun/21 ]

Sebastien Buisson (sbuisson@ddn.com) uploaded a new patch: https://review.whamcloud.com/44024
Subject: LU-14677 sec: no encryption key migrate/extend/resync/split
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 5b6879499b5973f12e26624ee5945adbac097129

Comment by Gerrit Updater [ 28/Jun/21 ]

Sebastien Buisson (sbuisson@ddn.com) uploaded a new patch: https://review.whamcloud.com/44101
Subject: LU-14677 sec: do not expose security.c to listxattr/getxattr
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: a7c4cdbd42f7ce599ccfe3aa4160f6eb44cc32f1

Comment by Gerrit Updater [ 09/Jul/21 ]

Sebastien Buisson (sbuisson@ddn.com) uploaded a new patch: https://review.whamcloud.com/44198
Subject: LU-14677 llite: move env contexts to ll_inode_info level
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 932007c91333117b7b0905ce5601aafc9b3bdd4e

Comment by Gerrit Updater [ 12/Jul/21 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/43878/
Subject: LU-14677 sec: migrate/extend/split on encrypted file
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 09c558d16f0a80f436522edde89367c088fe2055

Comment by Gerrit Updater [ 16/Sep/21 ]

"Sebastien Buisson <sbuisson@ddn.com>" uploaded a new patch: https://review.whamcloud.com/44957
Subject: LU-14677 sec: change MIGRATION_ flags to LLAPI_MIGRATION_
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: f0b4d394a70eab516c6a2320d4c3a7cacee34ebc

Comment by Gerrit Updater [ 22/Sep/21 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/44198/
Subject: LU-14677 llite: move env contexts to ll_inode_info level
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 957e7de61ec129013ba0df90c3abe64ff024e438

Comment by Gerrit Updater [ 22/Sep/21 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/44101/
Subject: LU-14677 sec: do not expose security.c to listxattr/getxattr
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: efb66de719329ce4d96b40f00ad592cca1e432fd

Comment by Gerrit Updater [ 23/Dec/21 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/44024/
Subject: LU-14677 sec: no encryption key migrate/extend/resync/split
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: fdbf2ffd41fa5660782d5ca8489ec2eb644c8113

Comment by Gerrit Updater [ 23/Dec/21 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/44957/
Subject: LU-14677 sec: remove MIGRATION_ compatibility defines
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: e42d2d67d3a0dcc726d1424d3158b6f649b5abd7

Comment by Peter Jones [ 23/Dec/21 ]

Landed for 2.15

Generated at Sat Feb 10 03:11:50 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.