[LU-11437] Recovering files in .lustre/lost+found/MDT0000/* Created: 27/Sep/18 Updated: 05/Oct/18 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | None |
| Type: | Question/Request | Priority: | Minor |
| Reporter: | James A Simmons | Assignee: | Lai Siyao |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Environment: |
ORNL's atlas file system running patched Lustre 2.8.2, with the same version on its clients |
||
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Recently we suffered an outage on some of our servers, and after recovery we started an lfsck run with all the options (namespace, layout, etc.). During the run a large number of files accumulated in /lustre/atlas2/.lustre/lost+found/MDT0000/* and in the corresponding directory on the MDT, to the point that it crashed the MDS, since we don't have large directory support. To prevent this, all zero-size files are being deleted. At the same time, user files that were accessible after the recovery are no longer accessible after the lfsck run; their data ended up in lost+found. We are attempting to recover this data, but no documentation can easily be found on how to do that for lost+found files. I have looked at sanity-lfsck for information on how to do this, but there doesn't seem to be a clear answer on how to determine the original location of the files in lost+found. We then attempted to use ll_decode_linkea, but the parent returned was "lost+found" itself instead of the original directory. As for sanity-lfsck, none of the tests actually determine the original location of such displaced files with tools; they use the path originally given when creating a test file. So we are looking for pointers on how to recover these files.
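(For context, a full run of the kind described would have been launched roughly as follows; the filesystem name "atlas2" is taken from the paths above, and the exact MDT device name is an assumption:)

# start all LFSCK phases (namespace, layout, etc.) on the first MDT
lctl lfsck_start -M atlas2-MDT0000 -t all
# check namespace-phase progress
lctl get_param -n mdd.atlas2-MDT0000.lfsck_namespace
|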
| Comments |
| Comment by Peter Jones [ 28/Sep/18 ] |
|
Lai, could you please advise? Thanks, Peter |
| Comment by Fan Yong [ 29/Sep/18 ] |
|
There are two kinds of "lost+found" in Lustre:

1) One is the backend "lost+found", which is specific to the ldiskfs backend. The e2fsck tool will put backend orphans (inodes with no name entry referencing them) into the backend "/lost+found" directory. The backend lost+found directory and its sub-items are invisible to Lustre clients. You have to mount the server as "ldiskfs" if you want to check the backend "lost+found".

2) The other is the Lustre global "lost+found" directory. It is visible to Lustre clients under $mount_point/.lustre/lost+found. The LFSCK will link Lustre orphans into this directory. There are several kinds of Lustre orphans, distinguished by an infix in the name under the Lustre "lost+found" directory:

"C": Multiple OST-objects claim the same MDT-object and the same slot in the layout EA, so the LFSCK creates new MDT-object(s) to hold the conflicting OST-object(s).
"N": The orphan OST-object does not know which was its real parent MDT-object, so the LFSCK uses a new FID for its parent MDT-object.
"R": The orphan OST-object knows its parent MDT-object FID, but does not know its position (the file name) in the layout.
"D": The MDT-object is a directory; it may know its parent, but because there is no valid linkEA, the LFSCK cannot know where to put it back in the namespace.
"O": The MDT-object has no linkEA, and there is no name entry that references the MDT-object.
"P": The orphan object to be created was a parent directory of some MDT-object whose linkEA shows that the @orphan object is missing.

So please describe in detail what kinds of orphans you hit, then we can analyze how to proceed.
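To report that breakdown, a quick count by infix along these lines should work (mount point taken from the description; adjust for your system):

# count Lustre lost+found entries by orphan type infix (C, N, R, D, O, P)
for t in C N R D O P; do
    n=$(ls /lustre/atlas2/.lustre/lost+found/MDT0000 2>/dev/null | grep -c -- "-${t}-")
    echo "type ${t}: ${n}"
done
|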
| Comment by James A Simmons [ 01/Oct/18 ] |
|
We have 3793 of these. Most of the -N- files are empty, and since there are so many we are deleting them to avoid our MDS crashing. |
| Comment by James A Simmons [ 03/Oct/18 ] |
|
Any advice? |
| Comment by Andreas Dilger [ 03/Oct/18 ] |
|
James, has LFSCK ever been run on this filesystem in the past? What is the default stripe count on the filesystem? I'm wondering if the zero-length files are potentially objects that were part of files smaller than (stripe_count x 1MB), so they were never modified by clients and do not have a parent (MDT) FID stored on them?

Is the MDS still crashing after you have removed the zero-length files? Do you have a stack trace from the crashes? How many files are left after the zero-length ones are removed? Can you please provide a sample of the filenames?

For OST objects to end up in the Lustre lost+found, it would mean that there was corruption of the MDT that resulted in inodes being erased, since they no longer have a LOV EA pointing to them. Separately, there may be inodes in the underlying ext4 lost+found directory that still point to data on the OSTs, but lost their filenames because of directory corruption on the MDT.

I'm guessing you don't have any device-level backups of the MDT? I recommend taking periodic backups of the MDT via "dd" (e.g. daily, or as often as is practical) to allow recovery in cases like this. While doing the backup from an LVM snapshot is preferred, doing a raw-disk backup of the live MDT is still useful in cases like this. It can either be restored directly in case of serious corruption, or potentially used to recover files that were corrupted.
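A minimal sketch of that backup flow (volume group, snapshot size, and paths here are hypothetical) would be:

# snapshot the MDT volume so the image is consistent, then copy it out with dd
lvcreate -s -L 10G -n mdt0_snap /dev/vg_mdt/mdt0
dd if=/dev/vg_mdt/mdt0_snap of=/backup/mdt0-$(date +%Y%m%d).img bs=4M
lvremove -f /dev/vg_mdt/mdt0_snap
|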
| Comment by Fan Yong [ 04/Oct/18 ] |
|
Usually, a Lustre orphan under the global "lost+found" with the name format "*-N-*" is a pre-created OST-object. There are two possible cases: 1) The pre-created OST-object had not been assigned to any MDT-object. In that case, removing the orphan from the global "lost+found" will NOT affect the system. 2) Or, even though it had been assigned to some MDT-object before the corruption, it was never modified after the assignment. That is why it does not know which was its parent MDT-object. In that case, the related MDT-object is not handled during the (first-stage) layout LFSCK scanning on the MDT, and it is quite possible that the related MDT-object is lost or has become an invisible orphan (under the backend "lost+found"). In that case too, removing the empty orphans is safe.
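A cautious way to act on that, deleting only the empty "*-N-*" orphans (mount point as in the description), is to list first and delete only after review:

# list zero-length -N- orphans; add -delete only once the list has been checked
find /lustre/atlas2/.lustre/lost+found/MDT0000 -maxdepth 1 -type f -name '*-N-*' -size 0 -print
|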
| Comment by Jesse Hanley [ 04/Oct/18 ] |
|
Andreas, the default stripe count on the FS is 4; with our large number of small files, your statement makes sense. After removing the zero-length files, the remaining ones fit into 2 categories (1539 of these):
# lfs path2fid '/lustre/atlas2/.lustre/lost+found/MDT0000/[0x20042c912:0x14cd4:0x0]-R-0'
[0x20042c912:0x14cd4:0x0]
# lfs fid2path /lustre/atlas2/ '[0x20042c912:0x14cd4:0x0]'
/lustre/atlas2/.lustre/lost+found/MDT0000/[0x20042c912:0x14cd4:0x0]-R-0
# lfs getstripe '/lustre/atlas2/.lustre/lost+found/MDT0000/[0x20042c912:0x14cd4:0x0]-R-0'
/lustre/atlas2/.lustre/lost+found/MDT0000/[0x20042c912:0x14cd4:0x0]-R-0
lmm_stripe_count:   4
lmm_stripe_size:    1048576
lmm_pattern:        40000001
lmm_layout_gen:     0
lmm_stripe_offset:  0
        obdidx           objid           objid           group
             0               0               0               0
             0               0               0               0
             0               0               0               0
           483        64774796       0x3dc628c               0
# stat '/lustre/atlas2/.lustre/lost+found/MDT0000/[0x20042c912:0x14cd4:0x0]-R-0'
stat: cannot stat ‘/lustre/atlas2/.lustre/lost+found/MDT0000/[0x20042c912:0x14cd4:0x0]-R-0’: No such file or directory
I can then take that object and check it with debugfs against that OST:
O/0/d12/64774796: File not found by ext2_lookup
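(For reference, that debugfs check would be invoked roughly as below; the OST backing device name is hypothetical, and the d12 subdirectory comes from objid % 32, i.e. 64774796 % 32 = 12.)

# read-only lookup of object 64774796 on the OST's ldiskfs backend
debugfs -c -R 'stat O/0/d12/64774796' /dev/ost0483_dev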
# lfs path2fid '/lustre/atlas2/.lustre/lost+found/MDT0000/[0x200380a26:0x12ecf:0x0]-R-0'
[0x200380a26:0x12ecf:0x0]
# lfs fid2path /lustre/atlas2/ [0x200380a26:0x12ecf:0x0]
/lustre/atlas2/.lustre/lost+found/MDT0000/[0x200380a26:0x12ecf:0x0]-R-0
# lfs getstripe '/lustre/atlas2/.lustre/lost+found/MDT0000/[0x200380a26:0x12ecf:0x0]-R-0'
/lustre/atlas2/.lustre/lost+found/MDT0000/[0x200380a26:0x12ecf:0x0]-R-0
lmm_stripe_count:   3
lmm_stripe_size:    1048576
lmm_pattern:        40000001
lmm_layout_gen:     1
lmm_stripe_offset:  284
        obdidx           objid           objid           group
           284        49165778       0x2ee35d2               0
             0               0               0               0
           271        50018306       0x2fb3802               0
# ls -l '/lustre/atlas2/.lustre/lost+found/MDT0000/[0x200380a26:0x12ecf:0x0]-R-0'
-r-------- 1 root root 76 Sep 14 08:36 /lustre/atlas2/.lustre/lost+found/MDT0000/[0x200380a26:0x12ecf:0x0]-R-0
# file '/lustre/atlas2/.lustre/lost+found/MDT0000/[0x200380a26:0x12ecf:0x0]-R-0'
/lustre/atlas2/.lustre/lost+found/MDT0000/[0x200380a26:0x12ecf:0x0]-R-0: ASCII text
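Since entries like this second one still have readable data, a crude but safe interim step could be to copy them out while their original locations are worked out (paths assumed from above):

# preserve readable -R- entries, keeping ownership and timestamps
mkdir -p /lustre/atlas2/recovered
for f in /lustre/atlas2/.lustre/lost+found/MDT0000/*-R-*; do
    [ -s "$f" ] && cp -p "$f" /lustre/atlas2/recovered/
done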
|
| Comment by Dustin Leverman [ 05/Oct/18 ] |
|
Are there any updates on this?
We need to give an update to our users about the status of their files...
Thanks, Dustin |
| Comment by Andreas Dilger [ 05/Oct/18 ] |
|
The files like [0x20042c912:0x14cd4:0x0]-R-0 shown above appear to have some, but not all, of the OST objects. It is entirely possible that the missing objects were among the -N- objects that were removed but were never accessed/modified by clients, in the case of sparse files. It is also possible that some of the -R- files are just abandoned objects that were left over from past issues and are only being discovered now. In terms of moving forward, there are a couple of options:
|