Details
Bug
Resolution: Fixed
Major
Lustre 2.12.0
3
9223372036854775807
Description
A cross-rename (a rename across MDTs) of a symlink creates an empty local agent inode with i_size = 0. e2fsck complains about such inodes:
Symlink /REMOTE_PARENT_DIR/0x30004e816:0x17942:0x0/12 (inode #97469960) is invalid. Clear? no
The issue can be easily reproduced:
1. start DNE system:
[root@vm1 tests]# MDSCOUNT=4 REFORMAT=no sh llmount.sh
Stopping clients: vm1.localdomain /mnt/lustre (opts:-f)
Stopping clients: vm1.localdomain /mnt/lustre2 (opts:-f)
Loading modules from /home/zam/git/lustre-wc-rel/lustre/tests/..
detected 2 online CPUs by sysfs
Force libcfs to create 2 CPU partitions
../libcfs/libcfs/libcfs options: 'cpu_npartitions=2'
gss/krb5 is not supported
quota/lquota options: 'hash_lqs_cur_bits=3'
Formatting mgs, mds, osts
Format mds1: /tmp/lustre-mdt1
Format mds2: /tmp/lustre-mdt2
Format mds3: /tmp/lustre-mdt3
Format mds4: /tmp/lustre-mdt4
Format ost1: /tmp/lustre-ost1
Format ost2: /tmp/lustre-ost2
Checking servers environments
Checking clients vm1.localdomain environments
Loading modules from /home/zam/git/lustre-wc-rel/lustre/tests/..
detected 2 online CPUs by sysfs
Force libcfs to create 2 CPU partitions
gss/krb5 is not supported
Setup mgs, mdt, osts
Starting mds1: -o loop /tmp/lustre-mdt1 /mnt/lustre-mds1
Commit the device label on /tmp/lustre-mdt1
Started lustre-MDT0000
Starting mds2: -o loop /tmp/lustre-mdt2 /mnt/lustre-mds2
Commit the device label on /tmp/lustre-mdt2
Started lustre-MDT0001
Starting mds3: -o loop /tmp/lustre-mdt3 /mnt/lustre-mds3
Commit the device label on /tmp/lustre-mdt3
Started lustre-MDT0002
Starting mds4: -o loop /tmp/lustre-mdt4 /mnt/lustre-mds4
Commit the device label on /tmp/lustre-mdt4
Started lustre-MDT0003
Starting ost1: -o loop /tmp/lustre-ost1 /mnt/lustre-ost1
Commit the device label on /tmp/lustre-ost1
Started lustre-OST0000
Starting ost2: -o loop /tmp/lustre-ost2 /mnt/lustre-ost2
Commit the device label on /tmp/lustre-ost2
Started lustre-OST0001
Starting client: vm1.localdomain: -o user_xattr,flock vm1.localdomain@tcp:/lustre /mnt/lustre
UUID                 1K-blocks   Used  Available  Use%  Mounted on
lustre-MDT0000_UUID     125368   1956     112176    2%  /mnt/lustre[MDT:0]
lustre-MDT0001_UUID     125368   1760     112372    2%  /mnt/lustre[MDT:1]
lustre-MDT0002_UUID     125368   1764     112368    2%  /mnt/lustre[MDT:2]
lustre-MDT0003_UUID     125368   1768     112364    2%  /mnt/lustre[MDT:3]
lustre-OST0000_UUID     325368  13924     284284    5%  /mnt/lustre[OST:0]
lustre-OST0001_UUID     325368  13380     284828    4%  /mnt/lustre[OST:1]
filesystem_summary:     650736  27304     569112    5%  /mnt/lustre
Using TIMEOUT=20
seting jobstats to procname_uid
Setting lustre.sys.jobid_var from disable to procname_uid
Waiting 90 secs for update
Updated after 4s: wanted 'procname_uid' got 'procname_uid'
disable quota as required
2. create directories on other MDTs:
[root@vm1 tests]# for x in 1 2 3; do lfs mkdir -i $x /mnt/lustre/mdt$x-dir; done
3. create a symlink on MDT0:
[root@vm1 tests]# ln -s "foo" /mnt/lustre/bar-symlink
4. move the symlink to mdt1:
[root@vm1 tests]# mv /mnt/lustre/foo /mnt/lustre/mdt1-dir/
mv: cannot stat ‘/mnt/lustre/foo’: No such file or directory
[root@vm1 tests]# mv /mnt/lustre/bar-symlink /mnt/lustre/mdt1-dir/
5. check that the fs images are updated with the MDT objects. Note that there are two MDT objects for "bar-symlink", one on MDT0 and one on MDT1. Both objects are of symlink type, but only one (on MDT0) has a symlink body and a Link EA.
[root@vm1 tests]# sync
[root@vm1 tests]# debugfs /tmp/lustre-mdt2
debugfs 1.42.13.wc6 (05-Feb-2017)
debugfs: ls REMOTE_PARENT_DIR
 25001 (12) .    2 (12) ..    25039 (4072) 0x240000404:0x1:0x0
debugfs: ls <25039>
 25039 (12) .    25001 (28) ..    149 (4056) bar-symlink
debugfs: stat <149>
Inode: 149   Type: symlink   Mode: 0000   Flags: 0x0
Generation: 1006356438   Version: 0x00000000:00000000
User: 0   Group: 0   Project: 0   Size: 0
File ACL: 0   Directory ACL: 0
Links: 1   Blockcount: 0
Fragment: Address: 0   Number: 0   Size: 0
ctime: 0x5b3aad37:85bc158c -- Tue Jul 3 01:54:47 2018
atime: 0x5b3aad37:85bc158c -- Tue Jul 3 01:54:47 2018
mtime: 0x5b3aad37:85bc158c -- Tue Jul 3 01:54:47 2018
crtime: 0x5b3aad37:85bc158c -- Tue Jul 3 01:54:47 2018
Size of extra inode fields: 32
Extended attributes stored in inode body:
  lma = "00 00 00 00 02 00 00 00 04 04 00 00 02 00 00 00 01 00 00 00 00 00 00 00 " (24)
  lma: fid=[0x200000404:0x1:0x0] compat=0 incompat=2
Fast_link_dest:
debugfs:
[root@vm1 tests]# debugfs /tmp/lustre-mdt1
debugfs 1.42.13.wc6 (05-Feb-2017)
debugfs: ls ROOT
 25043 (12) .    2 (12) ..    25044 (36) .lustre    25049 (36) mdt1-dir    25050 (36) mdt2-dir    25051 (3964) mdt3-dir
debugfs: ls REMOTE_PARENT_DIR
 25001 (12) .    2 (12) ..    165 (4072) 0x200000404:0x1:0x0
debugfs: ls <165>
<165>: Ext2 inode is not a directory
debugfs: stat <165>
Inode: 165   Type: symlink   Mode: 0777   Flags: 0x0
Generation: 2421202347   Version: 0x00000001:00000010
User: 0   Group: 0   Project: 0   Size: 3
File ACL: 0   Directory ACL: 0
Links: 1   Blockcount: 0
Fragment: Address: 0   Number: 0   Size: 0
ctime: 0x5b3aad37:00000000 -- Tue Jul 3 01:54:47 2018
atime: 0x5b3aad21:4f6ef7f0 -- Tue Jul 3 01:54:25 2018
mtime: 0x5b3aad21:4f6ef7f0 -- Tue Jul 3 01:54:25 2018
crtime: 0x5b3aad21:4f6ef7f0 -- Tue Jul 3 01:54:25 2018
Size of extra inode fields: 32
Extended attributes stored in inode body:
  lma = "00 00 00 00 04 00 00 00 04 04 00 00 02 00 00 00 01 00 00 00 00 00 00 00 " (24)
  lma: fid=[0x200000404:0x1:0x0] compat=0 incompat=4
  selinux = "unconfined_u:object_r:unlabeled_t:s0\000" (37)
  link = "df f1 ea 11 01 00 00 00 35 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 1d 00 00 00 02 40 00 04 04 00 00 00 01 00 00 00 00 62 61 72 2d 73 79 6d 6c 69 6e 6b " (53)
Fast_link_dest: foo
debugfs:
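The raw lma bytes in the two stat outputs above can be decoded by hand to confirm that both inodes carry the same FID and differ only in the incompat flags. The following is a minimal sketch, assuming the on-disk layout is a little-endian 32-bit compat, a 32-bit incompat, then a FID made of a 64-bit sequence, a 32-bit object id and a 32-bit version (an assumption that agrees with the decoded "lma:" lines printed by debugfs). The helper names le_field and lma_decode are hypothetical and exist only for this illustration.

```shell
# Hypothetical helper: combine LENGTH bytes starting at OFFSET as a
# little-endian integer. Usage: le_field OFFSET LENGTH BYTE...
# Intended to be called inside $(...), so its variables do not leak.
le_field() {
    off=$1; len=$2; shift 2
    shift "$off"
    v=0; bits=0
    while [ "$len" -gt 0 ]; do
        v=$(( v | (0x$1 << bits) ))
        bits=$(( bits + 8 ))
        len=$(( len - 1 ))
        shift
    done
    printf '%d\n' "$v"
}

# Hypothetical helper: decode the hex byte string that debugfs prints for
# the lma xattr. Assumed layout: compat(4) incompat(4) seq(8) oid(4) ver(4).
lma_decode() {
    # Word splitting is intentional: one positional parameter per byte.
    set -- $1
    if [ "$#" -lt 24 ]; then
        echo "lma_decode: expected 24 bytes, got $#" >&2
        return 1
    fi
    printf 'fid=[0x%x:0x%x:0x%x] compat=%d incompat=%d\n' \
        "$(le_field 8 8 "$@")" "$(le_field 16 4 "$@")" "$(le_field 20 4 "$@")" \
        "$(le_field 0 4 "$@")" "$(le_field 4 4 "$@")"
}

# Bytes copied from the stat <149> output on /tmp/lustre-mdt2.
lma_decode "00 00 00 00 02 00 00 00 04 04 00 00 02 00 00 00 01 00 00 00 00 00 00 00 "
```

For the bytes of inode 149 this prints fid=[0x200000404:0x1:0x0] compat=0 incompat=2, matching the line debugfs shows; feeding it the bytes of inode 165 gives the same FID with incompat=4.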
6. run e2fsck on the images. e2fsck complains about an invalid symlink object on MDT1 (the one that does not contain a symlink body):
[root@vm1 tests]# e2fsck -fn /tmp/lustre-mdt1
e2fsck 1.42.13.wc6 (05-Feb-2017)
Warning! /tmp/lustre-mdt1 is mounted.
Warning: skipping journal recovery because doing a read-only filesystem check.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong (33293, counted=32838).
Fix? no
Free inodes count wrong (99987, counted=99718).
Fix? no
lustre-MDT0000: 13/100000 files (46.2% non-contiguous), 29207/62500 blocks
[root@vm1 tests]# e2fsck -fn /tmp/lustre-mdt2
e2fsck 1.42.13.wc6 (05-Feb-2017)
Warning! /tmp/lustre-mdt2 is mounted.
Warning: skipping journal recovery because doing a read-only filesystem check.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Symlink /REMOTE_PARENT_DIR/0x240000404:0x1:0x0/bar-symlink (inode #149) is invalid.
Clear? no
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong (33293, counted=32924).
Fix? no
Free inodes count wrong (99987, counted=99745).
Fix? no
lustre-MDT0001: ********** WARNING: Filesystem still has errors **********
lustre-MDT0001: 13/100000 files (7.7% non-contiguous), 29207/62500 blocks
[root@vm1 tests]#
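When checking several MDT images, the relevant complaint is easy to miss among the free block and inode count messages. The sketch below filters e2fsck output down to the invalid-symlink lines and prints the inode number followed by the path. The function name invalid_symlinks is hypothetical, and the pattern assumes the message wording shown in the output above.

```shell
# Hypothetical helper: read e2fsck output on stdin and print
# "<inode> <path>" for every "Symlink ... is invalid." message.
invalid_symlinks() {
    sed -n 's/^Symlink \(.*\) (inode #\([0-9][0-9]*\)) is invalid\..*$/\2 \1/p'
}

# Sample input taken from the e2fsck run on /tmp/lustre-mdt2.
printf '%s\n' \
    'Pass 2: Checking directory structure' \
    'Symlink /REMOTE_PARENT_DIR/0x240000404:0x1:0x0/bar-symlink (inode #149) is invalid.' \
    'Clear? no' \
    'Pass 3: Checking directory connectivity' | invalid_symlinks
```

Against a real image it could be used as `e2fsck -fn /tmp/lustre-mdt2 | invalid_symlinks`, and the reported inode number can then be inspected with `debugfs -R 'stat <149>' /tmp/lustre-mdt2`.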