Hello,
I have a question about directory migration. I hit an ldiskfs inconsistency after a directory migration, and it may be LFSCK-related. I ran this command:
SLOW=yes RUNAS_ID=1000 CLEANUP=cleanupall SETUP=setupall OSTCOUNT=4 MDSCOUNT=4 OSTSIZE=600000 MDTSIZE=300000 ONLY=230h sh /usr/lib/lustre/tests/sanity.sh
and ran fsck right after the test finished.
Fsck shows:
Inode 25043 ref count is 5, should be 4. Fix? no
Pass 5: Checking group summary information
Free blocks count wrong (32862, counted=32853).
Fix? no
Free inodes count wrong (99721, counted=99718).
Fix? no
lustre-MDT0000: ********** WARNING: Filesystem still has errors **********
The inode with the wrong link count is "/ROOT":
# debugfs lustre-mdt1
debugfs 1.42.13.wc6 (05-Feb-2017)
debugfs: ncheck 25043
Inode Pathname
25043
Inode has 5 links
# debugfs -R "stat <25043>" lustre-mdt1
debugfs 1.42.13.wc6 (05-Feb-2017)
Inode: 25043 Type: directory Mode: 0755 Flags: 0x0
Generation: 3666376863 Version: 0x00000001:00000008
User: 0 Group: 0 Project: 0 Size: 4096
File ACL: 0 Directory ACL: 0
Links: 5 Blockcount: 8
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x5ba2dcf7:00000000 -- Wed Sep 19 19:34:15 2018
atime: 0x5ba2dcbc:00000000 -- Wed Sep 19 19:33:16 2018
mtime: 0x5ba2dcf7:00000000 -- Wed Sep 19 19:34:15 2018
crtime: 0x5ba2dcbc:b0d6b4b8 -- Wed Sep 19 19:33:16 2018
Size of extra inode fields: 32
Extended attributes stored in inode body:
lma = "00 00 00 00 00 00 00 00 07 00 00 00 02 00 00 00 01 00 00 00 00 00 00 00 " (24)
lma: fid=[0x200000007:0x1:0x0] compat=0 incompat=0
BLOCKS:
(0):12623
TOTAL: 1
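As a side check that inode 25043 really is the filesystem root, the raw lma bytes can be decoded by hand. This is only a sketch: it assumes the on-disk LMA layout is little-endian compat (32 bits), incompat (32 bits), then the self FID as seq (64 bits), oid (32 bits), ver (32 bits). The helper name decode_lma is made up for illustration.

```python
import struct

# Raw "lma" xattr bytes as printed by debugfs stat above.
LMA_HEX = "00 00 00 00 00 00 00 00 07 00 00 00 02 00 00 00 01 00 00 00 00 00 00 00"

def decode_lma(hex_bytes):
    """Decode an LMA xattr dump into (compat, incompat, fid_string).

    Assumed layout (little-endian): u32 compat, u32 incompat,
    u64 fid.seq, u32 fid.oid, u32 fid.ver.
    """
    raw = bytes.fromhex(hex_bytes.strip())
    compat, incompat, seq, oid, ver = struct.unpack("<IIQII", raw[:24])
    fid = "[{:#x}:{:#x}:{:#x}]".format(seq, oid, ver)
    return compat, incompat, fid

print(decode_lma(LMA_HEX))
```

Under that assumption the result agrees with the "lma: fid=[0x200000007:0x1:0x0] compat=0 incompat=0" line that debugfs prints.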
The link count should be 4, not 5, judging by the directory contents:
debugfs(mdt1): ls -l /ROOT
25043 40755 (2) 0 0 4096 19-Sep-2018 19:34 .
2 40755 (2) 0 0 4096 19-Sep-2018 19:33 ..
25044 40755 (18) 0 0 4096 19-Sep-2018 19:33 .lustre
25049 40000 (18) 0 0 4096 19-Sep-2018 19:34 d230h.sanity
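For reference, the expected value of 4 follows from the usual ext-style rule that a directory's link count is 2 (its entry in the parent plus its own ".") plus one for the ".." of each subdirectory. A minimal sketch of that arithmetic over the listing above; the helper name expected_dir_links is hypothetical, and it assumes the mode 40000 entry for d230h.sanity counts as a subdirectory like any other:

```python
import stat

# Output of "ls -l /ROOT" from debugfs, as pasted above.
LISTING = """\
25043 40755 (2) 0 0 4096 19-Sep-2018 19:34 .
2 40755 (2) 0 0 4096 19-Sep-2018 19:33 ..
25044 40755 (18) 0 0 4096 19-Sep-2018 19:33 .lustre
25049 40000 (18) 0 0 4096 19-Sep-2018 19:34 d230h.sanity
"""

def expected_dir_links(listing):
    """Return 2 + number of subdirectory entries in a debugfs 'ls -l' listing."""
    subdirs = 0
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 9:
            continue  # not a directory entry line
        mode = int(fields[1], 8)      # mode column is octal
        name = " ".join(fields[8:])   # name is the last column
        if name in (".", ".."):
            continue
        if stat.S_ISDIR(mode):
            subdirs += 1
    return 2 + subdirs

print(expected_dir_links(LISTING))
```

With two subdirectory entries (.lustre and d230h.sanity) this gives 2 + 2 = 4, which is the value e2fsck says the inode should have.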
Test 230h in sanity migrates a subdirectory of the root to mdt2. The main operation is:
$LFS migrate -m1 $DIR/$tdir/migrate_dir/.. ||
error "migrating $tdir fail"
After this operation, $DIR/$tdir and $DIR/$tdir/migrate_dir/ are moved to mdt2.
I checked the image, and these directories really were moved there:
debugfs(mdt2): ls -l /REMOTE_PARENT_DIR
25001 40755 (2) 0 0 4096 19-Sep-2018 19:34 .
2 40755 (2) 0 0 4096 19-Sep-2018 19:33 ..
25046 40755 (2) 0 0 4096 19-Sep-2018 19:34 0x240000404:0x1:0x0
# lfs fid2path /mnt/lustre 0x240000404:0x1:0x0
/mnt/lustre/d230h.sanity
Do you have any idea why the inode has a link count of 5 when it should be 4?
Thanks.
I've also seen a few cases where e2fsck complains about hard-linked directories in REMOTE_PARENT_DIR:
It looks like e2fsck fixes this by removing the entry from REMOTE_PARENT_DIR, but LFSCK shouldn't get into this situation in the first place.