Lustre / LU-11549

Unattached inodes after 3 min racer run.

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.13.0
    • Affects Version/s: Lustre 2.11.0, Lustre 2.12.0
    • Labels: None
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

    Description

      Ran racer.sh on a DNE system with RPMs built from the wc master branch:

      [root@cslmodev100 racer]# sh racer.sh  -t 180 -T 7 -f 20 -c -d /mnt/testfs/racer-dir/
      Directory:     /mnt/testfs/racer-dir/
      Time Limit:    180
      Lustre Tests:  1
      Max Files:     20
      Threads:       7
      Running Tests: lustre_file_create dir_create file_rm file_rename file_link file_symlink file_list file_concat file_exec dir_remote
      MDS Count:     3
      Running racer.sh for 180 seconds. CTRL-C to exit
      file_create: FILE=/mnt/testfs/racer-dir//10 SIZE=207136
      file_create: FILE=/mnt/testfs/racer-dir//17 SIZE=4800
      file_create: FILE=/mnt/testfs/racer-dir//5 SIZE=73416
      file_create: FILE=/mnt/testfs/racer-dir//19 SIZE=116024
      ...
      file_create: FILE=/mnt/testfs/racer-dir//0 SIZE=234400
      file_create: FILE=/mnt/testfs/racer-dir//2 SIZE=136432
      file_create: FILE=/mnt/testfs/racer-dir//8 SIZE=53296
      file_create: FILE=/mnt/testfs/racer-dir//12 SIZE=233528
      racer cleanup
      sleeping 5 sec ...
      lustre_file_create.sh: no process found
      dir_create.sh: no process found
      file_rm.sh: no process found
      file_rename.sh: no process found
      file_link.sh: no process found
      file_symlink.sh: no process found
      file_list.sh: no process found
      file_concat.sh: no process found
      file_exec.sh: no process found
      dir_remote.sh: no process found
      there should be NO racer processes:
      root     201964  0.0  0.0 112660   988 pts/24   S+   11:03   0:00 grep -E lustre_file_create|dir_create|file_rm|file_rename|file_link|file_symlink|file_list|file_concat|file_exec|dir_remote
      Filesystem                                   1K-blocks    Used    Available Use% Mounted on
      172.18.1.3@o2ib1,172.18.1.4@o2ib1:/testfs 240559470792 4289196 238129624412   1% /mnt/testfs
      We survived racer.sh for 180 seconds.
      

       
      e2fsck on the MDT0 device:

      [root@cslmodev103 ~]# umount /dev/md66
      [root@cslmodev103 ~]# e2fsck -fvn /dev/md66
      e2fsck 1.42.13.x6 (01-Mar-2018)
      Pass 1: Checking inodes, blocks, and sizes
      Pass 2: Checking directory structure
      Pass 3: Checking directory connectivity
      Pass 4: Checking reference counts
      Inode 2061541938 ref count is 3, should be 2.  Fix? no
      
      Unattached inode 2061541970
      Connect to /lost+found? no
      
      Inode 2061541986 ref count is 2, should be 1.  Fix? no
      
      Unattached inode 2061542181
      Connect to /lost+found? no
      
      Inode 2061542575 ref count is 10, should be 9.  Fix? no
      
      Inode 2061542583 ref count is 6, should be 5.  Fix? no
      
      Pass 5: Checking group summary information
      [QUOTA WARNING] Usage inconsistent for ID 0:actual (1248931840, 1295) != expected (1248919552, 1295)
      Update quota info for quota type 0? no
      
      [QUOTA WARNING] Usage inconsistent for ID 0:actual (1248931840, 1295) != expected (1248919552, 1295)
      Update quota info for quota type 1? no
      
      
      testfs-MDT0000: ********** WARNING: Filesystem still has errors **********
      
      
              1304 inodes used (0.00%, out of 3042005760)
                85 non-contiguous files (6.5%)
                 2 non-contiguous directories (0.2%)
                   # of inodes with ind/dind/tind blocks: 72/64/0
         381833025 blocks used (25.10%, out of 1520996090)
                 0 bad blocks
                 2 large files
      
               555 regular files
               491 directories
                 0 character device files
                 0 block device files
                 0 fifos
               107 links
               249 symbolic links (249 fast symbolic links)
                 0 sockets
      ------------
              1400 files
      [root@cslmodev103 ~]#
      
      [root@cslmodev103 ~]# dumpe2fs -h /dev/md66 | grep -i state
      dumpe2fs 1.42.13.x6 (01-Mar-2018)
      Filesystem state:         clean
      [root@cslmodev103 ~]#
      
      

      Invalid symlink inodes are due to LU-11130 and wrong nlinks are due to LU-11446;
      the unattached inodes are what this ticket is about.
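
To keep the two kinds of pass 4 messages apart when triaging, the e2fsck output can be split mechanically. The following is only a minimal sketch, not part of the original report: the function name `classify_e2fsck` and the regular expressions are illustrative assumptions based solely on the message wording shown in the output above, and the sample text is copied from that output.

```python
import re

# Patterns matching two of the pass 4 messages seen in the e2fsck output above.
# They are assumptions derived from that output, not an exhaustive e2fsck grammar.
UNATTACHED = re.compile(r"^Unattached inode (\d+)")
REF_COUNT = re.compile(r"^Inode (\d+) ref count is (\d+), should be (\d+)\.")


def classify_e2fsck(output):
    """Split e2fsck messages into unattached inodes and ref count mismatches."""
    unattached = []
    ref_mismatch = {}
    for line in output.splitlines():
        line = line.strip()
        m = UNATTACHED.match(line)
        if m:
            unattached.append(int(m.group(1)))
            continue
        m = REF_COUNT.match(line)
        if m:
            # inode -> (count found on disk, count e2fsck expects)
            ref_mismatch[int(m.group(1))] = (int(m.group(2)), int(m.group(3)))
    return unattached, ref_mismatch


SAMPLE = """\
Inode 2061541938 ref count is 3, should be 2.  Fix? no

Unattached inode 2061541970
Connect to /lost+found? no

Inode 2061541986 ref count is 2, should be 1.  Fix? no

Unattached inode 2061542181
Connect to /lost+found? no

Inode 2061542575 ref count is 10, should be 9.  Fix? no

Inode 2061542583 ref count is 6, should be 5.  Fix? no
"""

unattached, ref_mismatch = classify_e2fsck(SAMPLE)
print("unattached inodes (this ticket):", unattached)
print("ref count mismatches (LU-11446):", ref_mismatch)
```

On the sample above this yields two unattached inodes and four ref count mismatches, each of the latter being one higher than e2fsck expects.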

      The racer test script doesn't use migrate or striped dirs, only
      "lustre_file_create dir_create file_rm file_rename file_link file_symlink file_list file_concat file_exec dir_remote". There were also no failovers.

      Lustre is built from the tip of the wc master branch:

      $ git log --oneline wc/master
      fe7c13bd48 (wc/master) LU-11329 utils: create tests maintainers list
      70a01a6c9c LU-11276 ldlm: don't apply ELC to converting and DOM locks
      72372486a5 LU-11347 osd: do not use pagecache for I/O
      8b9105d828 LU-11199 mdt: Attempt lookup lock on open
      697e8fe6f3 LU-11473 doc: add lfs-getsom man page
      ed0c19d250 LU-1095 misc: quiet console messages at startup
      ....
      
      [root@cslmodev103 ~]# rpm -q lustre_ib
      lustre_ib-2.11.56_16_gfe7c13b-1.el7.centos.x86_64
      [root@cslmodev103 ~]#
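
The installed package can be cross-checked against the git log above. This is a hedged sketch that assumes the RPM version string follows a git-describe-like layout of `<base version>_<commits since tag>_g<short hash>`; the helper name `build_commit` is hypothetical and the two input strings are copied from the transcript.

```python
import re


def build_commit(rpm_name):
    """Return (base_version, commits_since_tag, short_hash), or None if no match.

    Assumes a git-describe-like version field: <base>_<n>_g<hash>.
    """
    m = re.search(r"-(\d+(?:\.\d+)*)_(\d+)_g([0-9a-f]+)-", rpm_name)
    if not m:
        return None
    return m.group(1), int(m.group(2)), m.group(3)


rpm = "lustre_ib-2.11.56_16_gfe7c13b-1.el7.centos.x86_64"
tip = "fe7c13bd48"  # tip of wc/master from the git log above

base, ahead, short_hash = build_commit(rpm)
print("base version:", base)
print("commits since tag:", ahead)
print("hash matches wc/master tip:", tip.startswith(short_hash))
```

Under that assumption, the hash embedded in the package name is a prefix of the wc/master tip commit, which is consistent with the statement that the build comes from the tip of the branch.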
      

      Attachments

        1. racer-mod.tar.gz
          4 kB
          Alexander Zarochentsev
        2. unconnected.txt.gz
          7.35 MB
          Alexander Zarochentsev

            People

              Assignee: Alexander Zarochentsev (zam)
              Reporter: Alexander Zarochentsev (zam)
              Votes: 0
              Watchers: 9
