LU-11549: Unattached inodes after 3 min racer run.

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.13.0
    • Affects Version/s: Lustre 2.11.0, Lustre 2.12.0
    • Labels: None
    • Severity: 3

    Description

      An attempt to run racer.sh on a DNE system with RPMs built from the wc master branch:

      [root@cslmodev100 racer]# sh racer.sh  -t 180 -T 7 -f 20 -c -d /mnt/testfs/racer-dir/
      Directory:     /mnt/testfs/racer-dir/
      Time Limit:    180
      Lustre Tests:  1
      Max Files:     20
      Threads:       7
      Running Tests: lustre_file_create dir_create file_rm file_rename file_link file_symlink file_list file_concat file_exec dir_remote
      MDS Count:     3
      Running racer.sh for 180 seconds. CTRL-C to exit
      file_create: FILE=/mnt/testfs/racer-dir//10 SIZE=207136
      file_create: FILE=/mnt/testfs/racer-dir//17 SIZE=4800
      file_create: FILE=/mnt/testfs/racer-dir//5 SIZE=73416
      file_create: FILE=/mnt/testfs/racer-dir//19 SIZE=116024
      ...
      file_create: FILE=/mnt/testfs/racer-dir//0 SIZE=234400
      file_create: FILE=/mnt/testfs/racer-dir//2 SIZE=136432
      file_create: FILE=/mnt/testfs/racer-dir//8 SIZE=53296
      file_create: FILE=/mnt/testfs/racer-dir//12 SIZE=233528
      racer cleanup
      sleeping 5 sec ...
      lustre_file_create.sh: no process found
      dir_create.sh: no process found
      file_rm.sh: no process found
      file_rename.sh: no process found
      file_link.sh: no process found
      file_symlink.sh: no process found
      file_list.sh: no process found
      file_concat.sh: no process found
      file_exec.sh: no process found
      dir_remote.sh: no process found
      there should be NO racer processes:
      root     201964  0.0  0.0 112660   988 pts/24   S+   11:03   0:00 grep -E lustre_file_create|dir_create|file_rm|file_rename|file_link|file_symlink|file_list|file_concat|file_exec|dir_remote
      Filesystem                                   1K-blocks    Used    Available Use% Mounted on
      172.18.1.3@o2ib1,172.18.1.4@o2ib1:/testfs 240559470792 4289196 238129624412   1% /mnt/testfs
      We survived racer.sh for 180 seconds.
      

       
      e2fsck on the MDT0 device:

      [root@cslmodev103 ~]# umount /dev/md66
      [root@cslmodev103 ~]# e2fsck -fvn /dev/md66
      e2fsck 1.42.13.x6 (01-Mar-2018)
      Pass 1: Checking inodes, blocks, and sizes
      Pass 2: Checking directory structure
      Pass 3: Checking directory connectivity
      Pass 4: Checking reference counts
      Inode 2061541938 ref count is 3, should be 2.  Fix? no
      
      Unattached inode 2061541970
      Connect to /lost+found? no
      
      Inode 2061541986 ref count is 2, should be 1.  Fix? no
      
      Unattached inode 2061542181
      Connect to /lost+found? no
      
      Inode 2061542575 ref count is 10, should be 9.  Fix? no
      
      Inode 2061542583 ref count is 6, should be 5.  Fix? no
      
      Pass 5: Checking group summary information
      [QUOTA WARNING] Usage inconsistent for ID 0:actual (1248931840, 1295) != expected (1248919552, 1295)
      Update quota info for quota type 0? no
      
      [QUOTA WARNING] Usage inconsistent for ID 0:actual (1248931840, 1295) != expected (1248919552, 1295)
      Update quota info for quota type 1? no
      
      
      testfs-MDT0000: ********** WARNING: Filesystem still has errors **********
      
      
              1304 inodes used (0.00%, out of 3042005760)
                85 non-contiguous files (6.5%)
                 2 non-contiguous directories (0.2%)
                   # of inodes with ind/dind/tind blocks: 72/64/0
         381833025 blocks used (25.10%, out of 1520996090)
                 0 bad blocks
                 2 large files
      
               555 regular files
               491 directories
                 0 character device files
                 0 block device files
                 0 fifos
               107 links
               249 symbolic links (249 fast symbolic links)
                 0 sockets
      ------------
              1400 files
      [root@cslmodev103 ~]#
      
      [root@cslmodev103 ~]# dumpe2fs -h /dev/md66 | grep -i state
      dumpe2fs 1.42.13.x6 (01-Mar-2018)
      Filesystem state:         clean
      [root@cslmodev103 ~]#
      
      

      Invalid symlink inodes are due to LU-11130 and wrong nlink counts are due to LU-11446;
      the unattached inodes are what this ticket is about.

      The racer test script doesn't use migration or striped directories,
      just "lustre_file_create dir_create file_rm file_rename file_link file_symlink file_list file_concat file_exec dir_remote". There are also no failovers.

      Lustre is built from the tip of the wc master branch:

      $ git log --oneline wc/master
      fe7c13bd48 (wc/master) LU-11329 utils: create tests maintainers list
      70a01a6c9c LU-11276 ldlm: don't apply ELC to converting and DOM locks
      72372486a5 LU-11347 osd: do not use pagecache for I/O
      8b9105d828 LU-11199 mdt: Attempt lookup lock on open
      697e8fe6f3 LU-11473 doc: add lfs-getsom man page
      ed0c19d250 LU-1095 misc: quiet console messages at startup
      ....
      
      [root@cslmodev103 ~]# rpm -q lustre_ib
      lustre_ib-2.11.56_16_gfe7c13b-1.el7.centos.x86_64
      [root@cslmodev103 ~]#
      

      Attachments

      Issue Links

      Activity

            [LU-11549] Unattached inodes after 3 min racer run.

            Alexander Zarochentsev (c17826@cray.com) uploaded a new patch: https://review.whamcloud.com/35991
            Subject: LU-11549 tests: link succeded to an ophan remote object
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 3e6574997f82407c98b90318448400bb3a1ca5e0

            gerrit Gerrit Updater added a comment

            The above patch https://review.whamcloud.com/35776 fixes the issue; my reproducer no longer shows any fs corruption. Unfortunately, it still reports an incorrect link creation even with the fix, because the test did not anticipate that the client retries the link() operation after getting an ESTALE error, so the second attempt to create the link succeeds.

            The fix is a one-liner:

            diff --git a/lustre/mdd/mdd_dir.c b/lustre/mdd/mdd_dir.c
            index 17613d6..260dd7a 100644
            --- a/lustre/mdd/mdd_dir.c
            +++ b/lustre/mdd/mdd_dir.c
            @@ -1456,7 +1456,7 @@ static int mdd_mark_orphan_object(const struct lu_env *env,
                    struct lu_attr *attr = MDD_ENV_VAR(env, la_for_start);
                    int rc;
             
            -       if (!S_ISDIR(mdd_object_type(obj)))
            +       if (S_ISDIR(mdd_object_type(obj)))
                            return 0;
             
                    attr->la_valid = LA_FLAGS;
            
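            For context, here is a minimal sketch of the function with the one-liner applied. The parameter list and everything after the quoted hunk (notably the la_flags assignment) are reconstructed assumptions based on the patch subject "set LUSTRE_ORPHAN_FL for non-dirs", not a verbatim quote of mdd_dir.c:

            static int mdd_mark_orphan_object(const struct lu_env *env,
                                              struct mdd_object *obj,
                                              struct thandle *handle, bool declare)
            {
                    struct lu_attr *attr = MDD_ENV_VAR(env, la_for_start);
                    int rc;

                    /* with the fix, directories skip the marking and non-directories
                     * (regular files unlinked while still open) get the orphan flag;
                     * the old inverted test marked only directories */
                    if (S_ISDIR(mdd_object_type(obj)))
                            return 0;

                    attr->la_valid = LA_FLAGS;
                    attr->la_flags = LUSTRE_ORPHAN_FL;   /* assumed from the patch subject */

                    if (declare)
                            rc = mdo_declare_attr_set(env, obj, attr, handle);
                    else
                            rc = mdo_attr_set(env, obj, attr, handle);

                    return rc;
            }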
            zam Alexander Zarochentsev added a comment

            Alexander Zarochentsev (c17826@cray.com) uploaded a new patch: https://review.whamcloud.com/35776
            Subject: LU-11549 mdd: set LUSTRE_ORPHAN_FL for non-dirs
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: fd759510de938a3556eddf3b5798ce95135e1dbc

            gerrit Gerrit Updater added a comment
            zam Alexander Zarochentsev added a comment - edited

            Reproducer: https://review.whamcloud.com/35683 :

            == sanityn test 104: Link to an orphan remote object should not succeed ============================== 22:38:41 (1564774721)
            fail_loc=0x8000018a
            sanityn.sh: line 4738: 23639 Terminated              $MULTIOP $DIR2/$tdir/mdt0dir/foodir/file2 Ow4096_c
             sanityn test_104: @@@@@@ FAIL: Link to an orphan object succeeded 
              Trace dump:
              = ./../tests/test-framework.sh:6114:error()
              = sanityn.sh:4739:test_104()
              = ./../tests/test-framework.sh:6416:run_one()
              = ./../tests/test-framework.sh:6455:run_one_logged()
              = ./../tests/test-framework.sh:6286:run_test()
              = sanityn.sh:4741:main()
            Dumping lctl log to /tmp/test_logs/1564774698/sanityn.test_104.*.1564774724.log
            Dumping logs only on local client.
            Resetting fail_loc on all nodes...done.
            FAIL 104 (3s)
            cleanup: ======================================================
            sh: fuser: command not found
            sh: fuser: command not found
            == sanityn test complete, duration 26 sec ============================================================ 22:38:44 (1564774724)
            sanityn: FAIL: test_104 Link to an orphan object succeeded
            rm: cannot remove '/mnt/lustre/d104.sanityn/mdt1dir': Directory not empty
            

            The wrong link causes cross-MDT link corruption:

            [root@devvm1 tests]# stat /mnt/lustre/d104.sanityn/mdt1dir/file2x 
            stat: cannot stat ‘/mnt/lustre/d104.sanityn/mdt1dir/file2x’: No such file or directory
            [root@devvm1 tests]# 
            

            file2x is a name that cannot be deleted; it points to a remote object that has already been removed from the OI, but its (unconnected) inode still exists.


            Alexander Zarochentsev (c17826@cray.com) uploaded a new patch: https://review.whamcloud.com/35683
            Subject: LU-11549 tests: link succeded to an ophan remote object
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 14ab193c6aa78ca95b85fa5c940af62fbb815abb

            gerrit Gerrit Updater added a comment

            Root cause update:

            0. Locks are OK.

            1. The mdt_reint_link operation on a remote object cannot perform a reliable check of that object. We want to check the mdd object for the DEAD_OBJ flag but cannot, because DEAD_OBJ is an mdd object flag: it exists on the remote MDT and is not visible through the OSP/OUT interface, where out_attr_get() checks only the dt object state. As a result, mdd_link_sanity_check() passes and the illegal link operation goes through (see the sketch after this list).

            2. A possible solution is to move the DEAD_OBJ flag to the lu_object level.
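
            A rough illustration of why the check cannot work today; the helper below approximates mdd_is_dead_obj() from lustre/mdd/mdd_internal.h, so treat its exact form as an assumption:

            /* DEAD_OBJ lives in the in-memory mod_flags of the MDD object.
             * For a remote object the authoritative mdd_object exists on the
             * other MDT; the local MDT only sees it through OSP/OUT, and
             * out_attr_get() returns plain dt-level lu_attr fields, which do
             * not carry this flag.  A check like the one below therefore
             * always sees the flag as clear for remote objects. */
            static inline int mdd_is_dead_obj(struct mdd_object *obj)
            {
                    return obj && obj->mod_flags & DEAD_OBJ;
            }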

            zam Alexander Zarochentsev added a comment

            Peter Jones - any chance we can get someone to review this?

            cfaber Colin Faber [X] (Inactive) added a comment

            Alexander Zarochentsev (c17826@cray.com) uploaded a new patch: https://review.whamcloud.com/35021
            Subject: LU-11549 mdt: take UPDATE lock for rename target
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 4ff2700af396e63fd9ac90d5f51f1129465c8df7

            gerrit Gerrit Updater added a comment
            zam Alexander Zarochentsev added a comment - edited

            More debugging showed the following fs corruption scenario:

            1. A client opens file [0x20001500b:0x185f4:0x0]/9, FID=[0x200020b8a:0x1c3d:0x0], and keeps the fd open:

            00000004:00000002:10.0:1544282653.825766:0:105088:0:(mdt_open.c:1319:mdt_reint_open()) I am going to open [0x20001500b:0x185f4:0x0]/(9->[0x200020b8a:0x1c3d:0x0]) cr_flag=02102 mode=0100
            666 msg_flag=0x0
            

            2. A rename of [0x20001500b:0x185f4:0x0]/4 to [0x20001500b:0x185f4:0x0]/9 unlinks the target file:

            0000004:00000002:16.0:1544282653.939918:0:105058:0:(mdt_reint.c:1880:mdt_reint_rename_internal()) @@@ rename [0x20001500b:0x185f4:0x0]/4 to [0x20001500b:0x185f4:0x0]/9  req@ffff880cf46
            
            00000004:00000001:16.0:1544282653.942415:0:105058:0:(mdd_dir.c:3039:mdd_rename()) Process entered
            

            and puts the object onto the orphan list because the file is still open:

            00000004:00080000:16.0:1544282653.942707:0:105058:0:(mdd_dir.c:1663:mdd_finish_unlink()) Object [0x200020b8a:0x1c3d:0x0] is inserted into orphan list, open count = 1
            

            3. Somehow, in parallel with (2), a cross-MDT transaction adds a reference to [0x200020b8a:0x1c3d:0x0] (increases the nlink count):

            00080000:00000002:17.0:1544282653.996839:0:105655:0:(osd_handler.c:3882:osd_ref_add()) [0x200020b8a:0x1c3d:0x0] increase nlink 1
            

            It is part of a compound "out" update:

            00000020:00001000:17.0:1544282653.996777:0:105655:0:(out_handler.c:997:out_handle()) testfs-MDT0000: out update request, 1 buffers
            ...
            00000020:00001000:17.0:1544282653.996782:0:105655:0:(out_handler.c:1085:out_handle()) ... testfs-MDT0000: buf 0, update 0, opc = 3 [0x200020b8a:0x1c3d:0x0]
            ...
            00000020:00001000:17.0:1544282653.996789:0:105655:0:(out_handler.c:1085:out_handle()) ... testfs-MDT0000: buf 0, update 1, opc = 7 [0x200020b8a:0x1c3d:0x0]
            ...
            00000020:00001000:17.0:1544282653.996792:0:105655:0:(out_handler.c:1085:out_handle()) ... testfs-MDT0000: buf 0, update 2, opc = 12 [0x20001fbeb:0x4:0x0]
            ...
            

            I think the cross-target update is a link() operation, creating a file name on another MDT.
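
            Decoding those opcodes supports that reading. This is a sketch assuming the enum update_type numbering from lustre_idl.h, so it is worth re-checking against the tree in use:

            /* opc = 3  -> OUT_REF_ADD   : nlink++ on [0x200020b8a:0x1c3d:0x0]
             * opc = 7  -> OUT_XATTR_SET : rewrite trusted.link on the same FID
             * opc = 12 -> OUT_WRITE     : write to [0x20001fbeb:0x4:0x0]
             *                             (presumably the update log object)
             * i.e. exactly the per-object updates a cross-MDT link() generates. */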

            4. Finally, the file close removes the object from the orphan list and from the OI (because it has the ORPHAN_OBJ and DEAD_OBJ flags set), creating the fs corruption that e2fsck reports as an "unattached inode".
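
            As an illustration of step 4, the close path conceptually does something like the following once the last opener goes away (function names are approximations, not verbatim Lustre code):

            /* last close of a file that was unlinked while still open */
            if (open_count == 0 && (obj->mod_flags & (ORPHAN_OBJ | DEAD_OBJ))) {
                    mdd_orphan_delete(env, obj, handle);  /* drop from PENDING/orphan dir */
                    mdo_destroy(env, obj, handle);        /* frees the inode and, via the
                                                           * OSD, removes the OI entry
                                                           * (osd_oi_delete()) */
            }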

            I do not know exactly what is broken yet; I suspect the cross-target link does not properly check for the ORPHAN_OBJ | DEAD_OBJ flags.

             


            Attaching unconnected.txt.gz – the Lustre log from MDT0 showing nlink dropping to zero and the file being destroyed.

            zam Alexander Zarochentsev added a comment
            zam Alexander Zarochentsev added a comment - edited

            More info from another run, where I managed to get Lustre logs with almost full debug (trace, inode, info, other) and added debug code to the OUT code:

            The unattached inode FID is 0x20001ec49:0xab67:0x0 this time. Below is a grep for 0x20001ec49:0xab67:0x0 through the Lustre log:

            00000004:00000002:16.0:1541058276.560428:0:61806:0:(mdt_reint.c:72:mdt_obj_version_get()) FID [0x20001ec49:0xab67:0x0] version is 0x37000634a7
            00080000:00000002:16.0:1541058276.560531:0:61806:0:(osd_handler.c:312:osd_idc_find_and_init()) testfs-MDT0000: FID [0x20001ec49:0xab67:0x0] not in the id map cache
            00080000:00000002:16.0:1541058276.560677:0:61806:0:(osd_handler.c:3971:osd_ref_del()) [0x20001ec49:0xab67:0x0] decrease nlink 1
            00080000:00000002:16.0:1541058276.560691:0:61806:0:(osd_handler.c:5864:osd_index_ea_insert()) parent 1985033217 insert 0x20001ec49:0xab67:0x0:2061542000 rc = 0
            00080000:00000002:16.0:1541058276.560694:0:61806:0:(osd_handler.c:3882:osd_ref_add()) [0x20001ec49:0xab67:0x0] increase nlink 0
            00000004:00000002:16.0:1541058276.560734:0:61806:0:(mdt_handler.c:736:mdt_pack_attr2body()) [0x20001ec49:0xab67:0x0]: nlink=1, mode=100644, valid=0x100000000002f8f
            00010000:00000040:34.0:1541058276.561254:0:26965:0:(ldlm_resource.c:1701:ldlm_resource_dump()) --- Resource: [0x20001ec49:0xab67:0x0].0x0 (ffff8810196f4f00) refcount = 1
            00000020:00001000:6.0:1541058276.561752:0:61820:0:(out_handler.c:1085:out_handle()) ... testfs-MDT0000: buf 0, update 0, opc = 6 [0x20001ec49:0xab67:0x0]
            00010000:00000040:13.0:1541058276.570670:0:26966:0:(ldlm_resource.c:1701:ldlm_resource_dump()) --- Resource: [0x20001ec49:0xab67:0x0].0x0 (ffff8810196f4f00) refcount = 2
            00000020:00001000:6.0:1541058276.571050:0:61820:0:(out_handler.c:1085:out_handle()) ... testfs-MDT0000: buf 0, update 0, opc = 8 [0x20001ec49:0xab67:0x0]
            00000020:00000040:6.0:1541058276.571056:0:61820:0:(out_handler.c:295:out_xattr_get()) testfs-MDT0000: [0x20001ec49:0xab67:0x0] get xattr trusted.link len 43: rc = 43
            00000020:00001000:6.0:1541058276.571722:0:61820:0:(out_handler.c:1085:out_handle()) ... testfs-MDT0000: buf 0, update 0, opc = 3 [0x20001ec49:0xab67:0x0]
            00080000:00000002:6.0:1541058276.571726:0:61820:0:(osd_handler.c:312:osd_idc_find_and_init()) testfs-MDT0000: FID [0x20001ec49:0xab67:0x0] not in the id map cache
            00000020:00001000:6.0:1541058276.571728:0:61820:0:(out_handler.c:1085:out_handle()) ... testfs-MDT0000: buf 0, update 1, opc = 7 [0x20001ec49:0xab67:0x0]
            00080000:00000002:6.0:1541058276.571774:0:61820:0:(osd_handler.c:3882:osd_ref_add()) [0x20001ec49:0xab67:0x0] increase nlink 1
            00080000:00000002:6.0:1541058276.571782:0:61820:0:(osd_handler.c:4382:osd_xattr_set()) [0x20001ec49:0xab67:0x0] set xattr 'trusted.link' with size 62
            00080000:00000002:6.0:1541058276.571791:0:61820:0:(osd_compat.c:264:osd_add_to_remote_parent()) md66: add agent entry [0x20001ec49:0xab67:0x0]
            00080000:00000002:6.0:1541058276.571799:0:61820:0:(osd_compat.c:296:osd_add_to_remote_parent()) testfs-MDT0000: create agent entry for 0x20001ec49:0xab67:0x0: rc = 0
            00000004:00000002:13.0:1541058276.615091:0:61817:0:(mdt_open.c:2214:mdt_mfd_close()) testfs-MDT0000: close file [0x20001ec49:0xab67:0x0] with intent: 0
            00000004:00000002:13.0:1541058276.615095:0:61817:0:(mdd_object.c:1155:mdd_attr_set()) testfs-MDD0000: no valid attribute on [0x20001ec49:0xab67:0x0], previousvalid is 0x1
            00080000:00000002:13.0:1541058276.615139:0:61817:0:(osd_handler.c:312:osd_idc_find_and_init()) testfs-MDT0000: FID [0x20001ec49:0xab67:0x0] not in the id map cache
            00080000:00000002:13.0:1541058276.615239:0:61817:0:(osd_handler.c:3971:osd_ref_del()) [0x20001ec49:0xab67:0x0] decrease nlink 2
            00080000:00000002:13.0:1541058276.615246:0:61817:0:(osd_compat.c:318:osd_delete_from_remote_parent()) md66: remove agent direntry [0x20001ec49:0xab67:0x0]
            00080000:00000002:13.0:1541058276.615253:0:61817:0:(osd_compat.c:340:osd_delete_from_remote_parent()) testfs-MDT0000: remove agent entry for 0x20001ec49:0xab67:0x0: rc = 0
            00080000:00000002:13.0:1541058276.615255:0:61817:0:(osd_oi.c:817:osd_oi_delete()) md66: delete [0x20001ec49:0xab67:0x0]
            00010000:00000040:13.0:1541058276.615329:0:61817:0:(ldlm_resource.c:1701:ldlm_resource_dump()) --- Resource: [0x20001ec49:0xab67:0x0].0x0 (ffff8810196f4f00) refcount = 2
            00010000:00000040:13.0:1541058276.615332:0:61817:0:(ldlm_resource.c:1707:ldlm_resource_dump()) ### ### ns: mdt-testfs-MDT0000_UUID lock: ffff88078a251680/0x94311686d2f0ef67 lrc: 2/0,0 mode: EX/EX res: [0x20001ec49:0xab67:0x0].0x0 bits 0x2/0x0 rrc: 3 type: IBT flags: 0x40000001000000 nid: 172.18.1.8@o2ib1 remote: 0x54fc01ada1ff65c1 expref: 9 pid: 61820 timeout: 0 lvb_type: 0
            00000004:00000040:13.0:1541058276.615373:0:61817:0:(mdt_handler.c:5461:mdt_object_free()) object free, fid = [0x20001ec49:0xab67:0x0]
            

            For a moment, the nlink counter dropped to zero:

            00080000:00000002:16.0:1541058276.560694:0:61806:0:(osd_handler.c:3882:osd_ref_add()) [0x20001ec49:0xab67:0x0] increase nlink 0
            

            I think this is not a valid situation; VFS warns about such cases, but Lustre avoids increasing nlink from zero (LU-4416, http://review.whamcloud.com/11571).
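
            For reference, the VFS-side warning mentioned above comes from inc_nlink(); a simplified excerpt (recent kernels, fs/inode.c; details vary by kernel version):

            void inc_nlink(struct inode *inode)
            {
                    if (unlikely(inode->i_nlink == 0)) {
                            /* resurrecting a zero-nlink inode is only expected for
                             * O_TMPFILE-style I_LINKABLE inodes; otherwise warn */
                            WARN_ON(!(inode->i_state & I_LINKABLE));
                            atomic_long_dec(&inode->i_sb->s_remove_count);
                    }
                    inode->__i_nlink++;
            }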

            Almost at the same time, thread 61817 closes the file; it sees nlink = 0 (my guess) and continues with osd_delete_from_remote_parent() and osd_oi_delete(), which causes the corruption explained in the previous comment.


            People

              Assignee: zam Alexander Zarochentsev
              Reporter: zam Alexander Zarochentsev
              Votes: 0
              Watchers: 9
