  1. Lustre
  2. LU-11481

corrupt directory

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.10.7
    • Lustre 2.10.4
    • 2

    Description

      A directory has an entry for subdirectory "2fe", but the object ID stored for that entry does not exist:

      alias ll="ls -l"
      [root@catalyst101:~]# ll /p/lustre3/videousr/YLI/mmcommons/data/images_v1
      ls: cannot access /p/lustre3/videousr/YLI/mmcommons/data/images_v1/2fe: No such file or directory
      total 0
      d????????? ? ? ? ?            ? 2fe
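
      The symptom above (a name is returned by readdir, but stat on it fails with ENOENT) can be scanned for from a client. The following is only a minimal sketch, not a tool used in this ticket: the function name find_dangling_entries and its lstat parameter are hypothetical, and the parameter exists only so the check can be exercised without a corrupted file system.

```python
import errno
import os


def find_dangling_entries(directory, lstat=os.lstat):
    """Return names that are listed in `directory` but fail lstat with ENOENT.

    `lstat` is an injectable hook (hypothetical, for testing only); by default
    the real os.lstat is used, so broken symlinks are not reported.
    """
    dangling = []
    for name in sorted(os.listdir(directory)):
        try:
            lstat(os.path.join(directory, name))
        except OSError as exc:
            if exc.errno == errno.ENOENT:
                # The directory lists the name, but the object behind it is gone.
                dangling.append(name)
            else:
                raise
    return dangling
```

      Under these assumptions, calling find_dangling_entries on the images_v1 directory would be expected to report 2fe, matching the "d?????????" line printed by ls.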
      

      Examining images_v1 with zdb on the MDT shows that 2fe refers to an object ID that does not exist:

      [root@porter81:snap]# zdb -ddddd porter81/mdt0 533741247
      Dataset porter81/mdt0 [ZPL], ID 148, cr_txg 98, 910G, 61852198 objects, rootbp DVA[0]=<4:88d9c400:200> DVA[1]=<5:25ca03c200:200> [L0 DMU objset] fletcher4 lz4 LE contiguous unique double size=800L/200P birth=1214040L/1214040P fill=61852198 cksum=139cf672b7:5dc8d6146f6:f8e6add4f57c:1e27e38477f5c0                                                                                
      
          Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
       533741247    2   128K    16K   231K     512   528K  100.00  ZFS directory
                                                     192   bonus  System attributes
              dnode flags: USED_BYTES USERUSED_ACCOUNTED USEROBJUSED_ACCOUNTED SPILL_BLKPTR
              dnode maxblkid: 32                                                           
              path    ???<object#533741247>                                                
              uid     0                                                                    
              gid     2093                                                                 
              atime   Mon Oct  8 11:01:28 2018                                             
              mtime   Wed Oct  3 15:53:08 2018                                             
              ctime   Wed Oct  3 15:53:08 2018                                             
              crtime  Mon Oct  1 20:53:54 2018                                             
              gen     1090081                                                              
              mode    42700                                                                
              size    2                                                                    
              parent  533740502                                                            
              links   3                                                                    
              pflags  0                                                                    
              rdev    0x0000000000000000                                                   
              SA xattrs: 204 bytes, 3 entries                                              
      
                      trusted.lma = \000\000\000\000\000\000\000\0002@\000\000\002\000\000\000\245\037\001\000\000\000\000\000                                                                    
                      trusted.link = \337\361\352\021\001\000\000\0003\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\033\000\000\000\002\000\000@F\000\0001\213\000\000\000\000images_v1                                                                                      
                      trusted.version = \022\231\236+\011\000\000\000                               
              Fat ZAP stats:                                                                        
                      Pointer table:                                                                
                              1024 elements                                                         
                              zt_blk: 0                                                             
                              zt_numblks: 0                                                         
                              zt_shift: 10                                                          
                              zt_blks_copied: 0                                                     
                              zt_nextblk: 0                                                         
                      ZAP entries: 1                                                                
                      Leaf blocks: 32                                                               
                      Total blocks: 33                                                              
                      zap_block_type: 0x8000000000000001                                            
                      zap_magic: 0x2f52ab2ab                                                        
                      zap_salt: 0x3e3cbee7f                                                         
                      Leafs with 2^n pointers:                                                      
                                5:     32 ********************************                          
                      Blocks with n*5 entries:                                                      
                                0:     32 ********************************                          
                      Blocks n/10 full:                                                             
                                1:     32 ********************************                          
                      Entries with n chunks:                                                        
                                4:      1 *                                                         
                      Buckets with n entries:                                                       
                                0:  16383 ****************************************                  
                                1:      1 *                                                         
      
                      2fe = 533742980 (type: Directory)
      Indirect blocks:
                     0 L1  6:1a0095d000:a00 20000L/a00P F=33 B=1133009/1133009
                     0  L0 4:d99372200:200 4000L/200P F=1 B=1133009/1133009
                  4000  L0 4:2b78affa00:e00 4000L/e00P F=1 B=1132989/1132989
                  8000  L0 4:1a409fa00:e00 4000L/e00P F=1 B=1133008/1133008
                  c000  L0 4:dbecc8800:e00 4000L/e00P F=1 B=1133003/1133003
                 10000  L0 4:2d07544a00:e00 4000L/e00P F=1 B=1132997/1132997
                 14000  L0 5:11130c9600:e00 4000L/e00P F=1 B=1133005/1133005
                 18000  L0 5:1053a11c00:e00 4000L/e00P F=1 B=1132991/1132991
                 1c000  L0 4:2d07545800:e00 4000L/e00P F=1 B=1132997/1132997
                 20000  L0 6:1a41dd7c00:e00 4000L/e00P F=1 B=1133002/1133002
                 24000  L0 5:112ca4cc00:e00 4000L/e00P F=1 B=1133007/1133007
                 28000  L0 5:559e31000:e00 4000L/e00P F=1 B=1133000/1133000
                 2c000  L0 4:d91a7e000:e00 4000L/e00P F=1 B=1133004/1133004
                 30000  L0 4:d99372400:e00 4000L/e00P F=1 B=1133009/1133009
                 34000  L0 4:265bf62800:e00 4000L/e00P F=1 B=1132993/1132993
                 38000  L0 6:134c5fcc00:e00 4000L/e00P F=1 B=1132992/1132992
                 3c000  L0 5:559e31e00:e00 4000L/e00P F=1 B=1133000/1133000
                 40000  L0 5:11130ca400:e00 4000L/e00P F=1 B=1133005/1133005
                 44000  L0 4:dbeccac00:e00 4000L/e00P F=1 B=1133003/1133003
                 48000  L0 4:2b78b02200:e00 4000L/e00P F=1 B=1132989/1132989
                 4c000  L0 6:134c5ff400:e00 4000L/e00P F=1 B=1132992/1132992
                 50000  L0 4:1a40a2400:e00 4000L/e00P F=1 B=1133008/1133008
                 54000  L0 5:11130cb200:e00 4000L/e00P F=1 B=1133005/1133005
                 58000  L0 6:19f0f10c00:e00 4000L/e00P F=1 B=1132991/1132991
                 5c000  L0 4:1a40a3200:e00 4000L/e00P F=1 B=1133008/1133008
                 60000  L0 7:b97b6aa00:e00 4000L/e00P F=1 B=1133004/1133004
                 64000  L0 5:112ca4f400:e00 4000L/e00P F=1 B=1133007/1133007
                 68000  L0 4:17f825800:e00 4000L/e00P F=1 B=1132999/1132999
                 6c000  L0 6:1a2429de00:e00 4000L/e00P F=1 B=1132995/1132995
                 70000  L0 6:1a41dd9a00:e00 4000L/e00P F=1 B=1133002/1133002
                 74000  L0 7:129d29e800:e00 4000L/e00P F=1 B=1133007/1133007
                 78000  L0 4:dbeccca00:e00 4000L/e00P F=1 B=1133003/1133003
                 7c000  L0 4:17f826600:e00 4000L/e00P F=1 B=1132999/1132999
                 80000  L0 5:569fa5000:e00 4000L/e00P F=1 B=1132994/1132994
      
                      segment [0000000000000000, 0000000000084000) size  528K
      
      [root@porter81:snap]# zdb -ddddd porter81/mdt0 533742980
      Dataset porter81/mdt0 [ZPL], ID 148, cr_txg 98, 910G, 61852198 objects, rootbp DVA[0]=<4:88d9c400:200> DVA[1]=<5:25ca03c200:200> [L0 DMU objset] fletcher4 lz4 LE contiguous unique double size=800L/200P birth=1214040L/1214040P fill=61852198 cksum=139cf672b7:5dc8d6146f6:f8e6add4f57c:1e27e38477f5c0
      
          Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
      zdb: dmu_bonus_hold(533742980) failed, errno 2
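
      The trusted.lma value in the first zdb dump can be decoded to recover the Lustre FID of images_v1. The sketch below is illustrative only and rests on two assumptions: that zdb prints non-printable bytes as three-digit octal escapes and printable bytes literally, and that the xattr follows the struct lustre_mdt_attrs layout (32-bit compat, 32-bit incompat, then a FID of 64-bit sequence, 32-bit object ID, and 32-bit version, all little-endian). The helper names unescape_zdb and decode_lma are hypothetical.

```python
import re
import struct

# Assumption: zdb renders non-printable bytes as exactly three octal digits.
_OCTAL_ESCAPE = re.compile(r"\\([0-3][0-7]{2})")


def unescape_zdb(text):
    """Turn zdb's escaped xattr text back into raw bytes."""
    pieces = []
    pos = 0
    for match in _OCTAL_ESCAPE.finditer(text):
        # Literal (printable) characters between escapes are single bytes.
        pieces.append(text[pos:match.start()].encode("latin-1"))
        pieces.append(bytes([int(match.group(1), 8)]))
        pos = match.end()
    pieces.append(text[pos:].encode("latin-1"))
    return b"".join(pieces)


def decode_lma(raw):
    """Decode a trusted.lma value, assuming the lustre_mdt_attrs layout."""
    compat, incompat, seq, oid, ver = struct.unpack_from("<IIQII", raw)
    return {
        "compat": compat,
        "incompat": incompat,
        "fid": "[0x%x:0x%x:0x%x]" % (seq, oid, ver),
    }


# Value copied from the zdb output for object 533741247.
LMA_TEXT = (r"\000\000\000\000\000\000\000\0002@\000\000\002\000\000\000"
            r"\245\037\001\000\000\000\000\000")

if __name__ == "__main__":
    print(decode_lma(unescape_zdb(LMA_TEXT)))
```

      If those layout assumptions hold, the value decodes to FID [0x200004032:0x11fa5:0x0] with compat and incompat flags both zero. The errno 2 reported by dmu_bonus_hold is ENOENT, consistent with the ls error on the client.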
      
      

      This is on a new file system that has not yet been used by end users, but to which we attempted to copy data. More specifically:
      1. We copied about 500 million files/dirs to it.
      2. We tried to use lfs migrate -M to move some large subtrees from one MDT to another, but that failed due to a Lustre 2.8 bug with lfs migrate.
      3. We deleted most of the files/dirs.

      • As far as I can recall, the servers did not crash while we were performing the copy and delete operations, but I cannot be certain of that.
      • We inspected the console logs on the servers and clients but found nothing indicating that object creation or destruction had failed.

          People

            laisiyao Lai Siyao
            ofaaland Olaf Faaland