Details
- New Feature
- Resolution: Unresolved
- Minor
- None
- None
- 3
- 9223372036854775807
Description
Introduction
If files are accidentally deleted from a file system, applications may be interrupted and user data may be permanently lost. A trash can (also called "undelete" or "recycle bin") is a file system feature that holds deleted files temporarily before they are permanently removed, so that they can be restored or retrieved if needed.
Once the trash can feature is enabled, a file that a user deletes is not actually removed but moved to the trash can, where deleted files and directories are stored temporarily. The trash can may be emptied manually; once it is full, the oldest files are removed first. Items can be restored or retrieved for as long as they remain in the trash can.
Trash Can/Undelete Functionalities
The trash can should include the following functionalities:
- List "undeleted" files in the trash can;
- After a file is deleted and moved into the trash can, the quota usage for this file should be accounted for and updated (reduced) accordingly;
- The trash can and all files in it are not visible in the namespace of the file system;
- Restore a file in the trash can. This restores the file to its original path. The corresponding quota accounting should also be updated;
- Delete a file in the trash can. This finally removes the file from the file system and frees the used space. The file is then unrecoverable;
- Empty the trash can. This removes all files in the trash can;
- A user can restore files from the trash can within the specified retention period. In this way, a file is kept "undeleted" for a predefined, configurable grace period;
- Enable/disable the trash can feature on an entire file system;
- An administrator can enable/disable the trash can feature on a specified directory.
Deleted files can no longer be restored from the trash can when:
- A file (or directory) is deleted again from the trash can. In other words, it has been deleted twice: the first deletion only moves the file to the trash can, and the second deletion actually removes the file from the file system.
- The trash can is emptied of all of its contents.
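The behaviour listed above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not Lustre code: every name in it (`tc_file`, `tc_fs`, `tc_delete`, `tc_restore`, `tc_purge`) is hypothetical, timestamps are plain seconds, and "quota reduced" is read here as releasing the file's charge while it sits in the trash can.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the requirements; none of these names exist in Lustre. */
enum tc_state { TC_LIVE, TC_TRASHED, TC_PURGED };

struct tc_file {
	const char   *path;   /* original path, kept so the file can be restored */
	uint64_t      size;   /* bytes charged to quota while the file is live */
	uint64_t      dtime;  /* time the file entered the trash can */
	enum tc_state state;
};

struct tc_fs {
	struct tc_file *files;
	size_t          nfiles;
	uint64_t        quota_used; /* assumption: only live files are charged */
	uint64_t        retention;  /* grace period in seconds */
};

/* True once a trashed file has outlived the retention period. */
bool tc_expired(const struct tc_fs *fs, const struct tc_file *f, uint64_t now)
{
	return now > f->dtime && now - f->dtime > fs->retention;
}

/* First delete moves a live file to the trash can; second delete purges it. */
bool tc_delete(struct tc_fs *fs, struct tc_file *f, uint64_t now)
{
	switch (f->state) {
	case TC_LIVE:
		f->state = TC_TRASHED;
		f->dtime = now;
		fs->quota_used -= f->size;
		return true;
	case TC_TRASHED:
		f->state = TC_PURGED; /* unrecoverable from here on */
		return true;
	default:
		return false;
	}
}

/* Restore succeeds only for trashed files still inside the retention period. */
bool tc_restore(struct tc_fs *fs, struct tc_file *f, uint64_t now)
{
	if (f->state != TC_TRASHED || tc_expired(fs, f, now))
		return false;
	f->state = TC_LIVE;
	fs->quota_used += f->size;
	return true;
}

/*
 * Purge trashed files and return how many were purged.
 * expired_only = false models "empty the trash can";
 * expired_only = true models the retention sweep.
 */
size_t tc_purge(struct tc_fs *fs, uint64_t now, bool expired_only)
{
	size_t purged = 0;

	for (size_t i = 0; i < fs->nfiles; i++) {
		struct tc_file *f = &fs->files[i];

		if (f->state != TC_TRASHED)
			continue;
		if (expired_only && !tc_expired(fs, f, now))
			continue;
		f->state = TC_PURGED;
		purged++;
	}
	return purged;
}
```

In this model a file becomes unrecoverable in exactly the two cases named above (a second `tc_delete`, or `tc_purge` emptying the trash can), plus the retention sweep once the grace period has passed.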
The Trash Can/Undelete HLD (https://wiki.whamcloud.com/pages/viewpage.action?pageId=351437962) contains details of the design and implementation of this feature.
Attachments
Issue Links
- duplicates: LU-18253 implement Trash Can/Undelete (Resolved)
- is related to: LU-13660 ldiskfs integtrated filesystem snapshot (Open)
- is related to: LU-18457 Flashback for Lustre (Open)
- is related to: LU-18917 TCU: implement per-user subdirectories in trash can (Open)
- is related to: LU-7880 add OST/MDT performance statistics to obd_statfs (Open)
- is related to: LU-17648 store JobID of process deleting file into xattr (Open)
- is related to: LU-18800 TCU: implement virtual ".Trash" subdirectory for undelete (Open)
Activity
Epic Link | Original: EX-428 [ 55037 ] |
Link | New: This issue is cloned by LU-18914 [ LU-18914 ] |
Link | New: This issue is related to LU-18913 [ LU-18913 ] |
Description |
Original:
[...] The [Trash Can/Undelete HLD]|https://wiki.whamcloud.com/pages/viewpage.action?pageId=351437962] contains details of the design and implementation of this feature. |
New:
[...] The [Trash Can/Undelete HLD|https://wiki.whamcloud.com/pages/viewpage.action?pageId=351437962] contains details of the design and implementation of this feature. |
Description |
Original:
[...]
h2. Design and Implementation for Trash Can in Lustre
The design of the trash can feature in Lustre is straightforward. On the server side, it implements only the basic functionality: moving "undeleted" files into the trash can and providing an interface to traverse them. On the client side, it implements basic utilities for interacting with the trash can ({{lfs trash list|rm|unrm FILE}}), including:
- Set or clear the recycle flag on a given file or directory;
- List "undeleted" files;
- Permanently delete a file within the trash can;
- Empty the trash can, or a subdirectory of it;
- Restore a file or directory in the trash can.

*Our mechanism only moves regular files into the trash can upon their last unlink, and by default does not preserve hard links to a file.* It borrows many ideas from orphan and volatile files in Lustre (which are stored in the "{{ROOT/PENDING}}" directory on each MDT). During format and setup, each MDT creates a "{{ROOT/TRASH}}" directory as a trash can to store "undeleted" files.

The POSIX API is used to traverse the files in the trash can on a given MDT. First, a client gets the FID of the trash can directory "{{ROOT/TRASH}}" on the MDT. The client then obtains a file handle via {{dir_fd=llapi_open_by_fid_at()}}; after that, the "undeleted" files within the trash can can be traversed via {{readdir()}}. Each entry can be opened via {{openat(dir_fd, dent)}}, and its "undeleted" xattr, which contains the information needed to restore the file, can be read via {{fgetxattr(fd, "trusted.unrm")}}. The client can even read the data or swap the layout of the "undeleted" file in the trash can for restore: {{opendir()/readdir()/openat()/fgetxattr("trusted.unrm")/close()/closedir()}}.

The workflow for the trash can is as follows:
- An administrator can enable/disable the trash can feature on a specified MDT via {{mdd.*.enable_trash_can}};
- An administrator can enable/disable the trash can feature on a specified directory or file via the Lustre-specific file flag {{LUSTRE_TRASH_FL}} (similar to {{LUSTRE_ENCRYPT_FL}}). All files under a directory flagged with {{LUSTRE_TRASH_FL}} inherit this flag.
{code:java}
# lctl recycle set_flag $file|$dir
# lctl recycle clear_flag $file|$dir
{code}
- *Move a deleted file into the trash can.* When a regular file marked with {{LUSTRE_TRASH_FL}} is deleted, upon its last unlink the file is first moved into the trash can directory "{{ROOT/TRASH}}" with its FID as its name. A "{{trusted.unrm}}" xattr is then set on the "undeleted" file in the trash can. The xattr contains the following information:
{code:java}
struct lustre_unrm_xattr {
        __u32 lurm_uid;    /* uid of the deleted file, used for quota accounting */
        __u32 lurm_gid;    /* gid of the deleted file, used for quota accounting */
        __u32 lurm_projid; /* projid of the deleted file, used for quota accounting */
        __u32 lurm_unused; /* unused, for field alignment/future use */
        __u64 lurm_dtime;  /* timestamp at which the file was moved into the trash can */
};
{code}
Where {{lurm_uid/gid}} is the original uid/gid of the deleted file, mainly used for quota accounting in the restore operation; {{lurm_dtime}} is the time at which the file was moved into the trash can. It is used to determine whether the file has outlived the specified retention period and should therefore be finally removed from the trash can.
- List "undeleted" files within a trash can on a given MDT:
{code:java}
# lfs trash [ls|list] [-i|--id UID] [DIR]
uid gid size  delete time  FID                   Fullpath
0   0   4096  Nov 14 08:11 [0x200034021:0x1:0x0] DIR/f1
0   0   32104 Nov 14 08:07 [0x200034021:0x2:0x0] DIR/f2
...
{code}
Where {{DIR}} is an optional directory in a Lustre filesystem, or the current working directory if unspecified. This will list directories under {{MOUNT/.lustre/trash/_UID_/_DIRFID_}}. The pseudo code:
{code:java}
rbin_fid = llapi_trash_fid_get(MNTPT, mdt);
dir_fd = llapi_open_by_fid(MNTPT, rbin_fid);
dir = fdopendir(dir_fd);                /* readdir() needs a DIR *, not an fd */
while ((ent = readdir(dir)) != NULL) {
        fd = openat(dir_fd, ent->d_name, O_RDONLY);
        fgetxattr(fd, "trusted.unrm", xattr_buf, sizeof(xattr_buf));
        print_one(ent->d_name, xattr_buf);
        close(fd);
}
closedir(dir);                          /* also closes dir_fd */
{code}
- Deleting a file in the trash can will remove the temporary file under "{{ROOT/TRASH}}" and permanently free the data space on the Lustre OSTs.
{code:java}
# lfs trash delete [-i|--id UID] DIR/FILE
{code}
The pseudo code:
{code:java}
rbin_fid = llapi_trash_fid_get(MNTPT, mdt);
dir_fd = llapi_open_by_fid(MNTPT, rbin_fid);
unlinkat(dir_fd, "FID", 0);
close(dir_fd);
{code}
- Empty a trash can (recursively delete all files/directories under {{_DIR_}}):
{code:java}
# lfs trash empty [-i|--id UID] DIR
{code}
The pseudo code:
{code:java}
rbin_fid = llapi_trash_fid_get(MNTPT, mdt);
dir_fd = llapi_open_by_fid(MNTPT, rbin_fid);
dir = fdopendir(dir_fd);                /* readdir() needs a DIR *, not an fd */
while ((ent = readdir(dir)) != NULL) {
        unlinkat(dir_fd, ent->d_name, 0);
}
closedir(dir);                          /* also closes dir_fd */
{code}
- Restore a file in the trash can on a given MDT. This restores the file and its content according to the saved full path and then deletes the stub in the trash can.
{code:java}
# lfs trash [unrm|restore] DIR/FILE
{code}
The pseudo code:
{code:java}
rbin_fid = llapi_trash_fid_get(MNTPT, mdt);
dir_fd = llapi_open_by_fid(MNTPT, rbin_fid);
fd = openat(dir_fd, FID, O_RDONLY);
fgetxattr(fd, "trusted.unrm", xattr_buf);
mkdir -p dirname(xattr_buf.path)
{ way 1:
        dst_fd = open(xattr_buf.path, O_CREAT);
        // copy the file data via read()/write() syscall
        copy_data(dst_fd, fd);
        close(dst_fd);
        unlinkat(dir_fd, "FID", 0);
}
{ way 2:
        mknod(xattr_buf.path);
        dst_fid = path2fid(xattr_buf.path);
        swap_layouts(dst_fid, FID);
        unlinkat(dir_fd, "FID", 0);
}
{ way 3:
        parent_fid = path2fid(dirname(xattr_buf.path));
        ioctl(IOCTL_TRASH_RESTORE, parent_fid, FID);
        // in the ioctl(), mv the FID into parent_fid on MDT.
}
close(fd);
close(dir_fd);
{code}
- LFSCK periodically scans the files under the trash can directory "{{ROOT/TRASH}}" and deletes files whose grace time has expired.
- Provide the functionality to manually scan for "undeleted" files on all MDTs whose grace time has expired and delete all of them (essentially just "{{find}}" on the files in {{_DIR_}} in trash for that user).
{code:java}
# lfs trash check [--expire_time|-E time] [-i|--id UID] [DIR]
{code}
- Provide the functionality to restore/delete all files within a given directory. This can be achieved by combining "{{lfs trash list}}" with "{{lfs trash unrm}}" or "{{lfs trash delete}}", filtering on the full path attribute to select the files within a given directory.
- Provide a "{{.trash/MDTxxxx}}" filesystem namespace (where xxxx is the MDT index). In this way, users can list the "undeleted" files in the trash can directory on a given {{MDTxxxx}} with normal userspace tools via the POSIX file system API. However, users cannot read these files while they are in the trash, to prevent abuse of quota limits and to prevent applications from using them.

The following commands can be run from a Lustre namespace (mounted at "{{/mnt/lustre}}") on a client:
{code:java}
# ls /mnt/lustre/.lustre/trash/mdt2/UID
0x200034021:0x1:0x0 0x200034021:0x2:0x0 ...
# lfs trash ls /mnt/lustre/jsmith/project
UID GID size delete_time  FID Fullpath
0   0   4096 Nov 14 08:11 [0x200034021:0x1:0x0]->/mnt/lustre/bob/project/f1
# ls -R /mnt/lustre/.lustre/trash/MDT0002
/mnt/lustre/.lustre/trash/MDT0002/1000/[0x200034021:0x1:0x0]/f1
/mnt/lustre/.lustre/trash/MDT0002/1005/[0x200032140:0x44:0x0]/subdir/file
...
{code} |
New:
[...] The [Trash Can/Undelete HLD]|https://wiki.whamcloud.com/pages/viewpage.action?pageId=351437962] contains details of the design and implementation of this feature. |
"Qian Yingjin <qian@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/58997
Subject: LU-18456 tcu: update LinkEA when move file into Trash Can
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 09a8bbf75d550b5f48282a1e31c323e5b4f8a90a