[LU-14720] OSD should store filenames for system files in xattrs Created: 29/May/21  Updated: 10/May/23

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major
Reporter: Andreas Dilger Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Related

 Description   

For recovery purposes, it would be great to store the filename for all internal system files in xattrs on each inode. This would make OI Scrub able to recover these files easily if the root directory was corrupted and some/all files end up in lost+found. While OI Scrub can repair some of the layout already (e.g. move objects from lost+found back into an object directory O/SEQ/dN/OID), it would be much better to just move the "O/" or "SEQ" or "dN" directory as a whole back to the proper location instead of moving millions of objects separately.
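As a rough illustration of the idea above, the sketch below plans the moves out of lost+found from stored name records, restoring parents before children so that a directory such as "O/" goes back as a single move with its contents still attached. The NameRecord type, the plan_moves() helper, and the string identifiers are hypothetical and used only for illustration; real code would work with FIDs and inodes inside OI Scrub.

```python
from typing import Dict, List, NamedTuple, Tuple


class NameRecord(NamedTuple):
    """Hypothetical contents of the proposed xattr: parent identifier and filename."""
    parent: str
    name: str


def plan_moves(lost_found: List[str],
               records: Dict[str, NameRecord]) -> List[Tuple[str, str, str]]:
    """Return (object, target_parent, name) moves, parents before children.

    Objects without a stored record are skipped and stay in lost+found.
    """
    def depth(obj: str) -> int:
        # Count how many recorded ancestors sit between obj and a known parent.
        steps, seen = 0, set()
        while obj in records and obj not in seen:
            seen.add(obj)
            obj = records[obj].parent
            steps += 1
        return steps

    movable = [obj for obj in lost_found if obj in records]
    movable.sort(key=depth)  # stable sort: shallow entries are restored first
    return [(obj, records[obj].parent, records[obj].name) for obj in movable]


# Example: "O" and one sequence directory both ended up in lost+found.
records = {
    "ino_O": NameRecord(parent="root", name="O"),
    "ino_seq": NameRecord(parent="ino_O", name="0x200000401"),
}
moves = plan_moves(["ino_seq", "ino_O", "ino_unknown"], records)
```

Each directory costs one move regardless of how many objects are still linked below it, which is the saving the description is after.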

Similarly, recovering CONFIGS, mountdata, and last_rcvd automatically would avoid the need for support personnel to recover these files manually before the filesystem can even be mounted by Lustre.

Since mount.lustre checks that the mountdata and last_rcvd files exist (in ldiskfs_is_lustre()) before even trying to mount the filesystem, this check would need to be relaxed to also allow mounting the filesystem if there are files in lost+found that have trusted.lov and trusted.lma xattrs on them (and whatever else we do to identify internal system files). Then, OI Scrub should try to identify critical files/directories in lost+found first, before trying to mount the filesystem. If the recovery of files from lost+found is unsuccessful, and the mountdata file cannot be recovered, the mount should fail with an error as it does today.
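The relaxed check described above could look roughly like the following sketch. It is not the ldiskfs_is_lustre() logic; the function names, the MountDecision enum, and the choice to treat either trusted.lov or trusted.lma as a sufficient hint are assumptions made for illustration.

```python
from enum import Enum
from typing import Dict, Set


class MountDecision(Enum):
    MOUNT = "mount"
    RECOVER_FIRST = "recover_first"
    FAIL = "fail"


# Files the current check expects to find before mounting.
REQUIRED_FILES = {"CONFIGS/mountdata", "last_rcvd"}
# Assumption: either xattr marks a lost+found entry as a Lustre internal file.
LUSTRE_XATTRS = {"trusted.lov", "trusted.lma"}


def precheck(present: Set[str],
             lost_found_xattrs: Dict[str, Set[str]]) -> MountDecision:
    """Decide what to do before mounting, given the files that are present
    and the xattr names found on each lost+found entry."""
    if REQUIRED_FILES <= present:
        return MountDecision.MOUNT
    for xattr_names in lost_found_xattrs.values():
        if xattr_names & LUSTRE_XATTRS:
            return MountDecision.RECOVER_FIRST
    return MountDecision.FAIL


def after_recovery(present: Set[str]) -> MountDecision:
    """After the recovery attempt, mountdata must exist or the mount fails."""
    if "CONFIGS/mountdata" in present:
        return MountDecision.MOUNT
    return MountDecision.FAIL


healthy = precheck({"CONFIGS/mountdata", "last_rcvd"}, {})
damaged = precheck(set(), {"#12345": {"trusted.lma"}})
not_lustre = precheck(set(), {"#12345": {"user.comment"}})
```

Keeping the final mountdata test separate from the precheck preserves today's failure behaviour when nothing useful can be recovered.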



 Comments   
Comment by Andreas Dilger [ 29/May/21 ]

We might consider using "trusted.link", but that may be confused with regular user files. Possibly use the same format, but with a different xattr name.

Files/directories in the filesystem root (e.g. "O/", "CATALOGS", etc.) can also have a parent IGIF FID of [0x00000002:0x0:0x0] for the ext2 root inode.
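A minimal sketch of what such an xattr value might carry, assuming a record made of the parent FID followed by the filename. The byte layout here (big-endian seq/oid/ver followed by the UTF-8 name) is a simplification chosen for illustration and is not the actual trusted.link format; the Fid type and the pack/unpack helpers are hypothetical.

```python
import struct
from typing import NamedTuple, Tuple


class Fid(NamedTuple):
    seq: int  # 64-bit sequence
    oid: int  # 32-bit object id
    ver: int  # 32-bit version


# Hypothetical on-disk layout: 8-byte seq, 4-byte oid, 4-byte ver, big-endian.
_FID_STRUCT = struct.Struct(">QII")

# Parent FID suggested in the comment above for entries in the filesystem root.
ROOT_PARENT = Fid(0x2, 0x0, 0x0)


def pack_name_record(parent: Fid, name: str) -> bytes:
    """Build the xattr value: packed parent FID followed by the filename."""
    return _FID_STRUCT.pack(*parent) + name.encode("utf-8")


def unpack_name_record(blob: bytes) -> Tuple[Fid, str]:
    """Split an xattr value back into (parent FID, filename)."""
    parent = Fid(*_FID_STRUCT.unpack_from(blob))
    return parent, blob[_FID_STRUCT.size:].decode("utf-8")


blob = pack_name_record(ROOT_PARENT, "CATALOGS")
parent, name = unpack_name_record(blob)
```

A record like this takes 16 bytes plus the name length, which fits the point made below about the space available in these inodes.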

We have lots of space in these internal inodes, so storing a couple of xattrs is fine, and Scrub should be able to rebuild the whole structure easily.
