Details
- Type: Bug
- Resolution: Duplicate
- Priority: Major
Description
After upgrading the filesystem from 2.12 to 2.15.2, several top-level directories became corrupted.
[root@nbp11-srv1 ~]# ls -l /nobackupp11/
ls: cannot access '/nobackupp11/ylin4': No such file or directory
ls: cannot access '/nobackupp11/mbarad': No such file or directory
ls: cannot access '/nobackupp11/ldgrant': No such file or directory
ls: cannot access '/nobackupp11/kknizhni': No such file or directory
ls: cannot access '/nobackupp11/mzhao4': No such file or directory
ls: cannot access '/nobackupp11/afahad': No such file or directory
ls: cannot access '/nobackupp11/jliu7': No such file or directory
ls: cannot access '/nobackupp11/jswest': No such file or directory
ls: cannot access '/nobackupp11/hsp': No such file or directory
ls: cannot access '/nobackupp11/vjespos1': No such file or directory
ls: cannot access '/nobackupp11/ssepka': No such file or directory
ls: cannot access '/nobackupp11/cjang1': No such file or directory
debugfs: stat ylin4
Inode: 43051102 Type: directory Mode: 0000 Flags: 0x80000
Generation: 503057142 Version: 0x00000000:00000000
User: 0 Group: 0 Project: 0 Size: 4096
File ACL: 0
Links: 2 Blockcount: 8
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x63dd8e2f:22c83c08 -- Fri Feb 3 14:43:59 2023
atime: 0x63dd8e2f:22c83c08 -- Fri Feb 3 14:43:59 2023
mtime: 0x63dd8e2f:22c83c08 -- Fri Feb 3 14:43:59 2023
crtime: 0x63dd8e2f:22c83c08 -- Fri Feb 3 14:43:59 2023
Size of extra inode fields: 32
Extended attributes:
lma: fid=[0x280015902:0x2:0x0] compat=0 incompat=2
EXTENTS:
(0):671099035
Without thinking, I deleted these directories via ldiskfs. The data is still there; how can we recover the directory data?
- lfs quota -u ylin4 /nobackupp11
Disk quotas for usr ylin4 (uid 11560):
  Filesystem        kbytes       quota       limit  grace   files   quota   limit  grace
/nobackupp11  11337707848*  1073741824  2147483648      -  208359  500000  600000      -
Issue Links
- is related to LU-16655 Files not accessible after 2.12 -> 2.14/2.15 upgrade (Resolved)
Mahmoud, do you have any logs from the mount after the upgrade that indicate OI Scrub has been run/completed on the MDTs? It would be worthwhile to check the state of the OI files on the MDTs to confirm that they are correct:
The important information here is that it reports oi_files: 64 and not some other number (which is what the LU-16655 bug broke). The OI files map the Lustre FIDs to local inode numbers, so without them most by-FID lookups will be broken. The OI Scrub can rebuild the OI files from the FID xattr stored in each inode.
If this is showing "oi_files: 1" or 2 or similar, my recommendation would be to mount the MDTs with "-o resetoi" to force a rebuild of the OI files, or alternatively to mount the MDTs with ldiskfs, move the "oi.16.X" files out of the filesystem, and then remount as Lustre, which should rebuild them automatically at mount (this will take a few minutes). Having a small number of OI files will cause scalability/performance issues.
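The oi_files check described above could be scripted roughly as follows. This is only a sketch: it assumes the OI Scrub status text has already been captured from each MDS (for example from something like `lctl get_param -n osd-ldiskfs.*.oi_scrub`, which is an assumption here, since the exact command is not shown in this ticket), and the function name and sample text are hypothetical.

```python
import re

# Value that the comment above says a healthy MDT should report.
EXPECTED_OI_FILES = 64


def check_oi_files(oi_scrub_text, expected=EXPECTED_OI_FILES):
    """Find every 'oi_files: N' line in captured OI Scrub status text.

    Returns (counts, ok): the list of values found, and True only if at
    least one value was found and all of them equal `expected`.
    """
    pattern = r"^\s*oi_files:\s*(\d+)\s*$"
    counts = [int(n) for n in re.findall(pattern, oi_scrub_text, re.MULTILINE)]
    ok = bool(counts) and all(n == expected for n in counts)
    return counts, ok


# Hypothetical, abbreviated status text used only to exercise the parser;
# it is not output taken from the affected filesystem.
SAMPLE = """name: OI_scrub
status: completed
oi_files: 64
"""

counts, ok = check_oi_files(SAMPLE)
print("oi_files values:", counts, "OK" if ok else "needs attention")
```

An MDT reporting any value other than 64 (or no oi_files line at all) comes back with ok set to False, which would be the cue to consider the "-o resetoi" remount described above.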