Lustre / LU-11130

cross-target rename creates invalid symlink inodes

    Description

      A cross-target rename of a symlink creates an empty local agent inode with i_size = 0. e2fsck complains about such inodes:

      Symlink /REMOTE_PARENT_DIR/0x30004e816:0x17942:0x0/12 (inode #97469960) is invalid.
      Clear? no
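
To pick such reports out of a longer e2fsck log, a small filter can help. This is a minimal sketch that assumes the message format shown above; `list_invalid_symlinks` is a hypothetical helper name, not part of e2fsprogs or Lustre.

```shell
#!/bin/sh
# list_invalid_symlinks: read e2fsck output on stdin and print
# "<inode> <path>" for every line of the form
#   Symlink <path> (inode #<n>) is invalid.
# All other lines are ignored.
list_invalid_symlinks() {
    sed -n 's/^ *Symlink \(.*\) (inode #\([0-9][0-9]*\)) is invalid\.$/\2 \1/p'
}
```

Possible usage: `e2fsck -fn /tmp/lustre-mdt2 | list_invalid_symlinks`. The inode numbers printed this way can then be inspected with `stat <inode>` in debugfs.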
      

      The issue can be easily reproduced:

      1. start DNE system:

      [root@vm1 tests]# MDSCOUNT=4 REFORMAT=no sh llmount.sh
      Stopping clients: vm1.localdomain /mnt/lustre (opts:-f)
      Stopping clients: vm1.localdomain /mnt/lustre2 (opts:-f)
      Loading modules from /home/zam/git/lustre-wc-rel/lustre/tests/..
      detected 2 online CPUs by sysfs
      Force libcfs to create 2 CPU partitions
      ../libcfs/libcfs/libcfs options: 'cpu_npartitions=2'
      gss/krb5 is not supported
      quota/lquota options: 'hash_lqs_cur_bits=3'
      Formatting mgs, mds, osts
      Format mds1: /tmp/lustre-mdt1
      Format mds2: /tmp/lustre-mdt2
      Format mds3: /tmp/lustre-mdt3
      Format mds4: /tmp/lustre-mdt4
      Format ost1: /tmp/lustre-ost1
      Format ost2: /tmp/lustre-ost2
      Checking servers environments
      Checking clients vm1.localdomain environments
      Loading modules from /home/zam/git/lustre-wc-rel/lustre/tests/..
      detected 2 online CPUs by sysfs
      Force libcfs to create 2 CPU partitions
      gss/krb5 is not supported
      Setup mgs, mdt, osts
      Starting mds1:   -o loop /tmp/lustre-mdt1 /mnt/lustre-mds1
      Commit the device label on /tmp/lustre-mdt1
      Started lustre-MDT0000
      Starting mds2:   -o loop /tmp/lustre-mdt2 /mnt/lustre-mds2
      Commit the device label on /tmp/lustre-mdt2
      Started lustre-MDT0001
      Starting mds3:   -o loop /tmp/lustre-mdt3 /mnt/lustre-mds3
      Commit the device label on /tmp/lustre-mdt3
      Started lustre-MDT0002
      Starting mds4:   -o loop /tmp/lustre-mdt4 /mnt/lustre-mds4
      Commit the device label on /tmp/lustre-mdt4
      Started lustre-MDT0003
      Starting ost1:   -o loop /tmp/lustre-ost1 /mnt/lustre-ost1
      Commit the device label on /tmp/lustre-ost1
      Started lustre-OST0000
      Starting ost2:   -o loop /tmp/lustre-ost2 /mnt/lustre-ost2
      Commit the device label on /tmp/lustre-ost2
      Started lustre-OST0001
      Starting client: vm1.localdomain:  -o user_xattr,flock vm1.localdomain@tcp:/lustre /mnt/lustre
      UUID                   1K-blocks        Used   Available Use% Mounted on
      lustre-MDT0000_UUID       125368        1956      112176   2% /mnt/lustre[MDT:0]
      lustre-MDT0001_UUID       125368        1760      112372   2% /mnt/lustre[MDT:1]
      lustre-MDT0002_UUID       125368        1764      112368   2% /mnt/lustre[MDT:2]
      lustre-MDT0003_UUID       125368        1768      112364   2% /mnt/lustre[MDT:3]
      lustre-OST0000_UUID       325368       13924      284284   5% /mnt/lustre[OST:0]
      lustre-OST0001_UUID       325368       13380      284828   4% /mnt/lustre[OST:1]
      
      filesystem_summary:       650736       27304      569112   5% /mnt/lustre
      
      Using TIMEOUT=20
      seting jobstats to procname_uid
      Setting lustre.sys.jobid_var from disable to procname_uid
      Waiting 90 secs for update
      Updated after 4s: wanted 'procname_uid' got 'procname_uid'
      disable quota as required
      

      2. create directories on other MDTs:

      [root@vm1 tests]# for x in 1 2 3; do lfs mkdir -i $x /mnt/lustre/mdt$x-dir; done
      

      3. create a symlink on MDT0:

      [root@vm1 tests]# ln -s "foo" /mnt/lustre/bar-symlink
      

      4. move the symlink to mdt1:

      [root@vm1 tests]# mv /mnt/lustre/foo /mnt/lustre/mdt1-dir/
      mv: cannot stat ‘/mnt/lustre/foo’: No such file or directory
      [root@vm1 tests]# mv /mnt/lustre/bar-symlink /mnt/lustre/mdt1-dir/
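
Steps 2 to 4 can be collected into one shell function for repeated runs. This is a sketch only: the function name is hypothetical, the mount point defaults to the /mnt/lustre used above, and only the MDT1 directory is created because that is the only one the rename in step 4 uses.

```shell
#!/bin/sh
# repro_cross_mdt_symlink_rename [MOUNTPOINT]
# Creates a directory on MDT index 1, creates a symlink in the file system
# root (on MDT0 in the setup above), then renames the symlink into the
# MDT1 directory.
repro_cross_mdt_symlink_rename() {
    mnt=${1:-/mnt/lustre}

    lfs mkdir -i 1 "$mnt/mdt1-dir" || return 1          # step 2 (MDT1 only)
    ln -s foo "$mnt/bar-symlink" || return 1            # step 3
    mv "$mnt/bar-symlink" "$mnt/mdt1-dir/" || return 1  # step 4
    sync                                                # flush before inspecting the images
}
```

After `repro_cross_mdt_symlink_rename /mnt/lustre` the MDT images can be examined as in the next step.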
      

      5. check that the fs images are updated with MDT objects (/tmp/lustre-mdt1 holds MDT0, /tmp/lustre-mdt2 holds MDT1). Note that there are two MDT objects for "bar-symlink", one on MDT0 and one on MDT1. Both objects are of symlink type, but only one (on MDT0) has a symlink body and a Link EA.

      [root@vm1 tests]# sync
      [root@vm1 tests]# debugfs /tmp/lustre-mdt2
      debugfs 1.42.13.wc6 (05-Feb-2017)
      debugfs:  ls REMOTE_PARENT_DIR
       25001  (12) .    2  (12) ..    25039  (4072) 0x240000404:0x1:0x0
      debugfs:  ls <25039>
       25039  (12) .    25001  (28) ..    149  (4056) bar-symlink
      debugfs:  stat <149>
      Inode: 149   Type: symlink    Mode:  0000   Flags: 0x0
      Generation: 1006356438    Version: 0x00000000:00000000
      User:     0   Group:     0   Project:     0   Size: 0
      File ACL: 0    Directory ACL: 0
      Links: 1   Blockcount: 0
      Fragment:  Address: 0    Number: 0    Size: 0
       ctime: 0x5b3aad37:85bc158c -- Tue Jul  3 01:54:47 2018
       atime: 0x5b3aad37:85bc158c -- Tue Jul  3 01:54:47 2018
       mtime: 0x5b3aad37:85bc158c -- Tue Jul  3 01:54:47 2018
      crtime: 0x5b3aad37:85bc158c -- Tue Jul  3 01:54:47 2018
      Size of extra inode fields: 32
      Extended attributes stored in inode body:
        lma = "00 00 00 00 02 00 00 00 04 04 00 00 02 00 00 00 01 00 00 00 00 00 00 00 " (24)
        lma: fid=[0x200000404:0x1:0x0] compat=0 incompat=2
      Fast_link_dest:
      debugfs:  [root@vm1 tests]# debugfs /tmp/lustre-mdt1
      debugfs 1.42.13.wc6 (05-Feb-2017)
      debugfs:  ls ROOT
       25043  (12) .    2  (12) ..    25044  (36) .lustre    25049  (36) mdt1-dir
       25050  (36) mdt2-dir    25051  (3964) mdt3-dir
      debugfs:  ls REMOTE_PARENT_DIR
       25001  (12) .    2  (12) ..    165  (4072) 0x200000404:0x1:0x0
      debugfs:  ls <165>
      
      <165>: Ext2 inode is not a directory
      debugfs:  stat <165>
      Inode: 165   Type: symlink    Mode:  0777   Flags: 0x0
      Generation: 2421202347    Version: 0x00000001:00000010
      User:     0   Group:     0   Project:     0   Size: 3
      File ACL: 0    Directory ACL: 0
      Links: 1   Blockcount: 0
      Fragment:  Address: 0    Number: 0    Size: 0
       ctime: 0x5b3aad37:00000000 -- Tue Jul  3 01:54:47 2018
       atime: 0x5b3aad21:4f6ef7f0 -- Tue Jul  3 01:54:25 2018
       mtime: 0x5b3aad21:4f6ef7f0 -- Tue Jul  3 01:54:25 2018
      crtime: 0x5b3aad21:4f6ef7f0 -- Tue Jul  3 01:54:25 2018
      Size of extra inode fields: 32
      Extended attributes stored in inode body:
        lma = "00 00 00 00 04 00 00 00 04 04 00 00 02 00 00 00 01 00 00 00 00 00 00 00 " (24)
        lma: fid=[0x200000404:0x1:0x0] compat=0 incompat=4
        selinux = "unconfined_u:object_r:unlabeled_t:s0\000" (37)
        link = "df f1 ea 11 01 00 00 00 35 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 1d 00 00 00 02 40 00 04 04 00 00 00 01 00 00 00 00 62 61 72 2d 73 79 6d 6c 69 6e 6b " (53)
      Fast_link_dest: foo
      debugfs:  
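
The raw xattr bytes in the two `stat` outputs can be decoded by hand. The sketch below does this in POSIX sh. The field layouts are assumptions inferred from the bytes and from the `lma:` line debugfs prints, not taken from the Lustre headers: the 24-byte lma value is read as little-endian compat, incompat and FID (seq, oid, ver), and the link value as a 24-byte header followed by a record with a 2-byte big-endian length, a 16-byte big-endian parent FID and the entry name. All function names are hypothetical.

```shell
#!/bin/sh
# hex_num BYTE...: print hex bytes, most significant first, as a decimal number.
hex_num() {
    hn_acc=""
    for hn_b in "$@"; do hn_acc="$hn_acc$hn_b"; done
    echo "$((0x$hn_acc))"
}

# decode_lma "HEX BYTES": decode a 24-byte lma value (assumed little-endian).
decode_lma() {
    set -- $1                      # one positional parameter per byte
    [ "$#" -eq 24 ] || { echo "decode_lma: expected 24 bytes, got $#" >&2; return 1; }
    # Little-endian fields: pass each field's bytes to hex_num in reverse order.
    printf 'fid=[0x%x:0x%x:0x%x] compat=%d incompat=%d\n' \
        "$(hex_num "${16}" "${15}" "${14}" "${13}" "${12}" "${11}" "${10}" "$9")" \
        "$(hex_num "${20}" "${19}" "${18}" "${17}")" \
        "$(hex_num "${24}" "${23}" "${22}" "${21}")" \
        "$(hex_num "$4" "$3" "$2" "$1")" \
        "$(hex_num "$8" "$7" "$6" "$5")"
}

# decode_linkea "HEX BYTES": decode the first record of a link value
# (assumed layout: 24-byte header, then big-endian length, parent FID, name).
decode_linkea() {
    set -- $1
    [ "$#" -ge 42 ] || { echo "decode_linkea: value too short" >&2; return 1; }
    shift 24                       # skip the header
    lk_reclen=$(hex_num "$1" "$2")
    lk_seq=$(hex_num "$3" "$4" "$5" "$6" "$7" "$8" "$9" "${10}")
    lk_oid=$(hex_num "${11}" "${12}" "${13}" "${14}")
    lk_ver=$(hex_num "${15}" "${16}" "${17}" "${18}")
    shift 18
    lk_left=$((lk_reclen - 18))    # name bytes remaining in this record
    lk_name=""
    while [ "$lk_left" -gt 0 ] && [ "$#" -gt 0 ]; do
        # Convert one hex byte to its character through an octal printf escape.
        lk_name="$lk_name$(printf "\\$(printf '%03o' "$((0x$1))")")"
        shift
        lk_left=$((lk_left - 1))
    done
    printf 'parent=[0x%x:0x%x:0x%x] name=%s\n' "$lk_seq" "$lk_oid" "$lk_ver" "$lk_name"
}
```

Passing the quoted lma bytes of inode 149 to `decode_lma` reproduces the `lma:` line debugfs prints for it. Passing the link bytes of inode 165 to `decode_linkea` yields a parent FID of [0x240000404:0x1:0x0], the same FID that names the REMOTE_PARENT_DIR entry on /tmp/lustre-mdt2, together with the name bar-symlink.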
      

      6. run e2fsck on the images. e2fsck complains about an invalid symlink object on MDT1 (the one that has no symlink body):

      [root@vm1 tests]# e2fsck -fn /tmp/lustre-mdt1
      e2fsck 1.42.13.wc6 (05-Feb-2017)
      Warning!  /tmp/lustre-mdt1 is mounted.
      Warning: skipping journal recovery because doing a read-only filesystem check.
      Pass 1: Checking inodes, blocks, and sizes
      Pass 2: Checking directory structure
      Pass 3: Checking directory connectivity
      Pass 4: Checking reference counts
      Pass 5: Checking group summary information
      Free blocks count wrong (33293, counted=32838).
      Fix? no
      
      Free inodes count wrong (99987, counted=99718).
      Fix? no
      
      lustre-MDT0000: 13/100000 files (46.2% non-contiguous), 29207/62500 blocks
      [root@vm1 tests]# e2fsck -fn /tmp/lustre-mdt2
      e2fsck 1.42.13.wc6 (05-Feb-2017)
      Warning!  /tmp/lustre-mdt2 is mounted.
      Warning: skipping journal recovery because doing a read-only filesystem check.
      Pass 1: Checking inodes, blocks, and sizes
      Pass 2: Checking directory structure
      Symlink /REMOTE_PARENT_DIR/0x240000404:0x1:0x0/bar-symlink (inode #149) is invalid.
      Clear? no
      
      Pass 3: Checking directory connectivity
      Pass 4: Checking reference counts
      Pass 5: Checking group summary information
      Free blocks count wrong (33293, counted=32924).
      Fix? no
      
      Free inodes count wrong (99987, counted=99745).
      Fix? no
      
      
      lustre-MDT0001: ********** WARNING: Filesystem still has errors **********
      
      lustre-MDT0001: 13/100000 files (7.7% non-contiguous), 29207/62500 blocks
      [root@vm1 tests]#
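
When this check is scripted, the e2fsck exit status is easier to test than the text output. The sketch below maps the status bits documented in the e2fsck(8) manual page to short descriptions; `describe_e2fsck_status` is a hypothetical helper name, and no claim is made here about which status the runs above returned.

```shell
#!/bin/sh
# describe_e2fsck_status RC: print one line per bit set in an e2fsck exit
# status, following the exit code table in e2fsck(8).
describe_e2fsck_status() {
    rc=$1
    if [ "$rc" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    for bit in 1 2 4 8 16 32 128; do
        [ $(($rc & $bit)) -ne 0 ] || continue
        case $bit in
            1)   echo "file system errors corrected" ;;
            2)   echo "file system errors corrected, system should be rebooted" ;;
            4)   echo "file system errors left uncorrected" ;;
            8)   echo "operational error" ;;
            16)  echo "usage or syntax error" ;;
            32)  echo "e2fsck canceled by user request" ;;
            128) echo "shared library error" ;;
        esac
    done
    return 0
}
```

Possible usage: `e2fsck -fn /tmp/lustre-mdt2; describe_e2fsck_status $?`.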
      

            People

              zam Alexander Zarochentsev