[LU-1569] Many Files missing and others have no info (uid/gid/permissions) Created: 26/Jun/12  Updated: 06/Nov/13  Resolved: 06/Nov/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 1.8.7
Fix Version/s: None

Type: Task Priority: Critical
Reporter: Brian Andrus (Inactive) Assignee: WC Triage
Resolution: Incomplete Votes: 0
Labels: None
Environment:

CentOS release 5.7 (Final)
Linux nas-0-1.local 2.6.18-194.17.1.el5_lustre.1.8.5 #1 SMP Tue Nov 16 17:59:07 MST 2010 x86_64 x86_64 x86_64 GNU/Linux


Attachments: File lustre-log.tgz     Text File lustre.log     File messages.tgz    
Epic: server
Rank (Obsolete): 4002

 Description   

We have had a catastrophic failure of one of our Lustre filesystems. We are not sure of the exact cause, but in our current state, running lfsck on it gives TONS of errors like:
Failed to find fid [0xc900ec:0xd7e33ef4:0x0]: DB_NOTFOUND: No matching key/data pair found

And when we run a find on various users' directories, we see many "No such file" errors:
find: ./mrtoeppe/CFD_run_archive/Turbine2DecDet/mcfd_tec.bin.830: No such file or directory
Such files show up in ls as:

?--------- ? ? ? ? ? mcfd_tec.bin.660
?--------- ? ? ? ? ? mcfd_tec.bin.670
?--------- ? ? ? ? ? mcfd_tec.bin.680



 Comments   
Comment by Cliff White (Inactive) [ 26/Jun/12 ]

Have you successfully run 'fsck -fy' on all devices? Are you using the latest version of e2fsprogs (available at http://downloads.whamcloud.com/public/e2fsprogs/)?
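For context, a pass of that kind over this filesystem's targets would look roughly like the following (device names are illustrative, based on the VG_hamming/work_ost* naming that appears later in this ticket; the MDT device name is an assumption):

# Force a full check-and-repair pass of each backing device with the Lustre-patched e2fsck,
# run with Lustre stopped and the devices unmounted
e2fsck -fy /dev/VG_hamming/work_mdt          # hypothetical MDT device name
for i in $(seq 0 9); do                      # ten OSTs on this filesystem
    e2fsck -fy /dev/VG_hamming/work_ost$i    # assumes the OST devices follow this naming
done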

Comment by Brian Andrus (Inactive) [ 26/Jun/12 ]

Initially, our Lustre filesystem (/work) had one of the OSTs disconnect (there are ten 7.8TB OSTs) and not reconnect. This put /work into read-only mode.
I attempted to reconnect work_ost9, but it failed with "Transport endpoint shutdown" errors (odd for an OST, I think).
I took the entire system down and ran fsck on each OST and the MDT. There were numerous errors on work_ost9, as well as a few errors on the MDT and two other OSTs.
Upon remount, we found that there were almost no files newer than December 12, 2011.
We also found many files were corrupt and did not show proper UID/GID/permissions.
There WERE files accessed and written to by users during this time.
It seemed there may have been issues with the LAST_ID and/or CATALOG on the MDT, so those were removed and the system was brought back online. This made the LAST_ID entries on the MDT match those listed on the OSTs.
Upon remounting (read only now), we found the corrupt entries were still there and there were still no files newer than December.
I took down the filesystem again and ran e2fsck to create the MDT and OST databases. I then brought the filesystem back up and ran lfsck in read only.
This produced many of the "No matching key/data pair found" errors.
I ran lfsck without "-n" with the same result.
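For reference, the database-generation and lfsck steps above are of roughly this form in the 1.8 manual's lfsck procedure (the database paths and device names here are illustrative, not the ones actually used):

# On the MDS, with the MDT unmounted or read-only: build the MDS database
e2fsck -n -v --mdsdb /tmp/mdsdb /dev/VG_hamming/work_mdt          # hypothetical device/paths
# On each OSS: build a per-OST database (the mdsdb file must be available on the OSS)
e2fsck -n -v --mdsdb /tmp/mdsdb --ostdb /tmp/ostdb-ost0 /dev/VG_hamming/work_ost0
# On a client with /work mounted: run lfsck read-only against the collected databases
lfsck -n -v --mdsdb /tmp/mdsdb --ostdb /tmp/ostdb-ost0 [...] /work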

Currently /work is mounted read only so users that do still have data intact can copy it to a clean filesystem.

Comment by Cliff White (Inactive) [ 26/Jun/12 ]

Okay, thanks

Comment by Cliff White (Inactive) [ 26/Jun/12 ]

First, as explained in the Lustre Manual, auto-dumped Lustre debug logs must be pre-processed on site to be useful, so we can't do much with what you attached. What we need in this case are the system logs (typically /var/log/messages) for all OSTs and the MDS/MGS, covering the period from 12 hours before the initial outage through to the present. Please do not filter the logs unless you need to remove IPs for security.

Comment by Cliff White (Inactive) [ 26/Jun/12 ]

Please run lfs getstripe on one of the missing files, get the list of stripe objects and check the OSTs to determine if the data actually exists on the OST disk. Debugfs will work for this.
lfs getstripe will return a list of ostidx (OST index) and object IDs. For an objid of, for example, 818855,
the following debugfs command should tell you if the data is there.

$ debugfs -c -R "stat O/0/d$((818855 % 32))/818855" /dev/<your OST device>

Comment by Brian Andrus (Inactive) [ 26/Jun/12 ]

Here is a quick check on one file that is showing in an ls, but missing info:

[root@nas-0-1 hale]# ls -l|grep gempak$
?--------- ? ? ? ? ? gempak
[root@nas-0-1 hale]# lfs getstripe ./gempak
./gempak
lmm_stripe_count: 1
lmm_stripe_size: 1048576
lmm_stripe_offset: 0
     obdidx        objid        objid        group
          0      5703559     0x570787            0

[root@nas-0-1 hale]# debugfs -c -R "stat O/0/d$((5703559 % 32))/5703559" /dev/VG_hamming/work_ost0
debugfs 1.41.12.2.ora1 (14-Aug-2010)
/dev/VG_hamming/work_ost0: catastrophic mode - not reading inode or group bitmaps
O/0/d7/5703559: File not found by ext2_lookup

Comment by Brian Andrus (Inactive) [ 26/Jun/12 ]

Tar file of /var/log/messages for MGS and OSSes

Comment by Cliff White (Inactive) [ 26/Jun/12 ]

Thanks - did you keep any logs/output from the first fsck you did after the initial failure? Please attach if so.

Comment by Brian Andrus (Inactive) [ 27/Jun/12 ]

The only log I have is the output from lfsck, but it is 7.9GB.
I can gzip it and try to upload it if you think it will help.

Comment by Brian Andrus (Inactive) [ 27/Jun/12 ]

Attached is the output from running lctl df on all the dump logs that were generated (lustre.log).
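For reference, converting the binary dumps to text with lctl's debug_file ("df") command looks roughly like this (the dump file name below is illustrative):

# Convert a binary Lustre debug dump into readable text and append it to lustre.log
lctl df /tmp/lustre-log.1340662223.12345 >> lustre.log    # hypothetical dump file name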

Comment by Cliff White (Inactive) [ 27/Jun/12 ]

We need the fsck data, not the lfsck output.

Comment by Brian Andrus (Inactive) [ 28/Jun/12 ]

That I do not have. I do know there are many files in lost+found on the backing filesystem. I have not examined them yet since it is now mounted as lustre (albeit read-only).

Comment by Cliff White (Inactive) [ 05/Jul/12 ]

Have you run the lost+found recovery script?

Comment by Andreas Dilger [ 11/Jul/12 ]

That would be "ll_recover_lost_found_objs", which should be installed on all the OSTs. You need to mount the OST locally using "-t ldiskfs" instead of as "-t lustre" to run this tool. It will rebuild the corrupted object directories and move all the objects from lost+found back into their proper location.
