[LU-3190] Interop 2.3.0<->2.4 Failed on lustre-rsync-test test 3b: ASSERTION( lio->lis_lsm != ((void *)0) ) failed

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.4.0
    • Affects Version/s: Lustre 2.3.0, Lustre 2.4.0
    • Environment: server: lustre-master tag-2.3.64 build #1411
      client: 2.3.0
    • 3
    • 7783

    Description

      Hit the following LBUG when running lustre-rsync-test test 3b.
      The same test passed in tag-2.3.62.

      Lustre: DEBUG MARKER: == lustre-rsync-test test 3b: Replicate files created by writemany == 17:57:47 (1366246667)
      LustreError: 6661:0:(lmv_obd.c:850:lmv_iocontrol()) error: iocontrol MDC lustre-MDT0000_UUID on MDTidx 0 cmd c0086696: err = -2
      LustreError: 6661:0:(lmv_obd.c:850:lmv_iocontrol()) Skipped 1415 previous similar messages
      Lustre: DEBUG MARKER: == lustre-rsync-test test 3c: Replicate files created by createmany/unlinkmany == 17:59:17 (1366246757)
      Lustre: DEBUG MARKER: == lustre-rsync-test test 4: Replicate files created by iozone == 17:59:33 (1366246773)
      LustreError: 7489:0:(lcommon_cl.c:1210:cl_file_inode_init()) Failure to initialize cl object [0x20001d0f0:0x340d:0x0]: -95
      LustreError: 7489:0:(lcommon_cl.c:1210:cl_file_inode_init()) Failure to initialize cl object [0x20001d0f0:0x340d:0x0]: -95
      LustreError: 7489:0:(lov_io.c:311:lov_io_slice_init()) ASSERTION( lio->lis_lsm != ((void *)0) ) failed: 
      LustreError: 7489:0:(lov_io.c:311:lov_io_slice_init()) LBUG
      Pid: 7489, comm: lustre_rsync
      
      Call Trace:
       [<ffffffffa0996905>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
       [<ffffffffa0996f17>] lbug_with_loc+0x47/0xb0 [libcfs]
       [<ffffffffa06b5088>] lov_io_init_raid0+0x6d8/0x810 [lov]
       [<ffffffffa06ac037>] lov_io_init+0x97/0x160 [lov]
       [<ffffffffa0dd1578>] cl_io_init0+0x98/0x160 [obdclass]
       [<ffffffffa0dd4464>] cl_io_init+0x64/0x100 [obdclass]
       [<ffffffffa07e6fed>] cl_glimpse_size0+0x7d/0x190 [lustre]
       [<ffffffffa07a3f32>] ll_inode_revalidate_it+0xf2/0x1c0 [lustre]
       [<ffffffffa07a4049>] ll_getattr_it+0x49/0x170 [lustre]
       [<ffffffffa07a41a7>] ll_getattr+0x37/0x40 [lustre]
       [<ffffffff81214343>] ? security_inode_getattr+0x23/0x30
       [<ffffffff81180571>] vfs_getattr+0x51/0x80
       [<ffffffffa09a2088>] ? libcfs_log_return+0x28/0x40 [libcfs]
       [<ffffffff8118082f>] vfs_fstat+0x3f/0x60
       [<ffffffff81180874>] sys_newfstat+0x24/0x40
       [<ffffffff810d6b12>] ? audit_syscall_entry+0x272/0x2a0
       [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b

      Kernel panic - not syncing: LBUG
      Pid: 7489, comm: lustre_rsync Not tainted 2.6.32-279.5.1.el6.x86_64 #1
      Call Trace:
       [<ffffffff814fd24a>] ? panic+0xa0/0x168
       [<ffffffffa0996f6b>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
       [<ffffffffa06b5088>] ? lov_io_init_raid0+0x6d8/0x810 [lov]
       [<ffffffffa06ac037>] ? lov_io_init+0x97/0x160 [lov]
       [<ffffffffa0dd1578>] ? cl_io_init0+0x98/0x160 [obdclass]
       [<ffffffffa0dd4464>] ? cl_io_init+0x64/0x100 [obdclass]
       [<ffffffffa07e6fed>] ? cl_glimpse_size0+0x7d/0x190 [lustre]
       [<ffffffffa07a3f32>] ? ll_inode_revalidate_it+0xf2/0x1c0 [lustre]
       [<ffffffffa07a4049>] ? ll_getattr_it+0x49/0x170 [lustre]
       [<ffffffffa07a41a7>] ? ll_getattr+0x37/0x40 [lustre]
       [<ffffffff81214343>] ? security_inode_getattr+0x23/0x30
       [<ffffffff81180571>] ? vfs_getattr+0x51/0x80
       [<ffffffffa09a2088>] ? libcfs_log_return+0x28/0x40 [libcfs]
       [<ffffffff8118082f>] ? vfs_fstat+0x3f/0x60
       [<ffffffff81180874>] ? sys_newfstat+0x24/0x40
       [<ffffffff810d6b12>] ? audit_syscall_entry+0x272/0x2a0
       [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b

      Message from syslogd@client-5 at Apr 17 18:00:04 ...
       kernel:LustreError: 7489:0:(lov_io.c:311:lov_io_slice_init()) ASSERTION( lio->lis_lsm != ((void *)0) ) failed: 

      Message from syslogd@client-5 at Apr 17 18:00:04 ...
       kernel:LustreError: 7489:0:(lov_io.c:311:lov_io_slice_init()) LBUG

      Message from syslogd@client-5 at Apr 17 18:00:04 ...
       kernel:Kernel panic - not syncing: LBUG
      Initializing cgroup subsys cpuset
      

          Activity

            jlevi Jodi Levi (Inactive) added a comment -

            Lower priority as patch landed to master.
            But http://review.whamcloud.com/#change,6167 still needs to land.

            sarah Sarah Liu added a comment -

            Patch set 7 didn't hit the LBUG but still failed test_7, as in LU-3279.

            di.wang Di Wang added a comment -

            Hmm, we probably should not put OSD_OII_NOGEN into the oi_cache, since that makes us unable to compare the generation at all. I just updated the patch; please have a look.


            yong.fan nasf (Inactive) added a comment -

            In fact, the per-thread "FID => ino#/gen" mapping should be fixed: if someone removed the inode and caused the OI mapping to be deleted from the OI file, the related "inode::i_nlink" should be zero; or, if the inode is reused by others, then "inode::i_generation" will have changed. So an old "FID => ino/gen" mapping cached by other threads becomes invalid automatically, because nobody can use the old "ino/gen" to find the inode. One special case: if the cached generation is "OSD_OII_NOGEN", then we need to verify against the inode's LMA.

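The validity rule described in the comment above can be sketched in plain userspace C. This is only an illustration under stated assumptions: the `demo_*` types are simplified stand-ins for the real Lustre structures, `DEMO_OII_NOGEN` uses a placeholder value that is not necessarily the value of `OSD_OII_NOGEN`, and `lma_fid` stands in for the FID stored in the inode's LMA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder marker for "no generation cached"; hypothetical value. */
#define DEMO_OII_NOGEN 0u

/* Simplified stand-in for a Lustre FID. */
struct demo_fid {
    uint64_t seq;
    uint32_t oid;
    uint32_t ver;
};

/* Simplified stand-in for an on-disk inode plus the FID kept in its LMA. */
struct demo_inode {
    uint32_t ino;
    uint32_t nlink;
    uint32_t generation;
    struct demo_fid lma_fid;
};

/* Cached "ino/gen" half of a "FID => ino/gen" mapping. */
struct demo_cached_id {
    uint32_t ino;
    uint32_t gen;
};

bool demo_fid_eq(const struct demo_fid *a, const struct demo_fid *b)
{
    return a->seq == b->seq && a->oid == b->oid && a->ver == b->ver;
}

/* Decide whether a cached mapping may still be used for this inode. */
bool demo_mapping_valid(const struct demo_fid *fid,
                        const struct demo_cached_id *id,
                        const struct demo_inode *inode)
{
    assert(fid != NULL && id != NULL && inode != NULL);

    if (inode->ino != id->ino)
        return false;
    /* A removed inode has a zero link count. */
    if (inode->nlink == 0)
        return false;
    /* No generation to compare: fall back to the FID stored in the LMA. */
    if (id->gen == DEMO_OII_NOGEN)
        return demo_fid_eq(fid, &inode->lma_fid);
    /* A reused inode carries a different generation. */
    return inode->generation == id->gen;
}
```

With the placeholder value chosen here, a real generation of 0 would be indistinguishable from the "no generation" marker; the sketch ignores that case.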
            di.wang Di Wang added a comment -

            Sigh, removing the oi_cache entry only in the current thread info does not seem to be enough; we should remove this OI from the cache of all thread infos.

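To make the idea in the comment above concrete, here is a minimal userspace sketch of clearing one FID from every per-thread cache rather than only the current one. It assumes the caches can be reached as a plain array, which is a simplification made for this example; the `demo_*` names are hypothetical, and the locking a real multi-threaded implementation would need is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a Lustre FID. */
struct demo_fid {
    uint64_t seq;
    uint32_t oid;
    uint32_t ver;
};

/* One single-entry "FID => ino/gen" cache, as kept per thread. */
struct demo_idmap_cache {
    struct demo_fid fid;
    uint32_t ino;
    uint32_t gen;
};

bool demo_fid_eq(const struct demo_fid *a, const struct demo_fid *b)
{
    return a->seq == b->seq && a->oid == b->oid && a->ver == b->ver;
}

/*
 * Clear every cache entry that refers to fid.
 * Returns the number of caches that were cleared.
 */
size_t demo_invalidate_all(struct demo_idmap_cache *caches, size_t n,
                           const struct demo_fid *fid)
{
    size_t cleared = 0;
    size_t i;

    assert(caches != NULL && fid != NULL);

    for (i = 0; i < n; i++) {
        if (demo_fid_eq(fid, &caches[i].fid)) {
            memset(&caches[i], 0, sizeof(caches[i]));
            cleared++;
        }
    }
    return cleared;
}
```

Calling `demo_invalidate_all(caches, n, &fid)` from the delete path would drop the stale entry everywhere, at the cost of touching every thread's cache on each delete.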
            sarah Sarah Liu added a comment -

            After running patch set 5 three times, I cannot reproduce this issue; the bug is usually hit before sub test_5. All runs failed on sub test_7; please refer to LU-3279.

            https://maloo.whamcloud.com/test_sets/5584e96e-b5de-11e2-9d08-52540035b04c

            di.wang Di Wang added a comment -

            Yes, I totally agree we should not return LinkEA for the object once it is removed from the namespace.


            adilger Andreas Dilger added a comment -

            Jinshan,
            I don't agree with your comment that it is OK to have two inodes with the same pathname at the same time. That isn't possible in the namespace, just like you cannot have two files in the same directory with the same filename at the same time. I'm not objecting to two inodes that had the same name at different times (which seems to be the case here), but only the new one should resolve to the pathname with fid2path(). The old file can return any other pathnames that it still has, or return an ENOENT error if it is unlinked.

            Di,
            it still seems to make sense to not return any pathnames in the case of an open-unlinked inode. In this case, even if the oti_cache was pointing to the unlinked inode, there shouldn't have been any entries in the "link" xattr to return, so either the inode didn't get written to disk after it was unlinked, or the "link" entries are not being written for unlinked inodes. I still think that case needs to be handled properly, and checking the "DEAD" and "ORPHAN" flags is probably the right way to go.

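The behaviour argued for in the comment above (no pathname for an open-unlinked object, ENOENT instead) can be sketched as follows. This is an illustration only: `demo_fid2path`, the `demo_object` type, and the two flag bit values are hypothetical and are not the Lustre definitions of the "DEAD" and "ORPHAN" flags.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits standing in for the DEAD and ORPHAN markers. */
#define DEMO_FLAG_DEAD   0x1u
#define DEMO_FLAG_ORPHAN 0x2u

/* Simplified object: its state flags and one name from the "link" xattr. */
struct demo_object {
    uint32_t flags;
    const char *link_path; /* NULL when the "link" xattr has no entries */
};

/*
 * Resolve an object to a pathname.
 * Returns 0 and sets *path on success, or -ENOENT when the object is
 * unlinked (dead or orphan) or has no remaining names.
 */
int demo_fid2path(const struct demo_object *obj, const char **path)
{
    assert(obj != NULL && path != NULL);

    /* Never report a stale name for an unlinked object. */
    if (obj->flags & (DEMO_FLAG_DEAD | DEMO_FLAG_ORPHAN))
        return -ENOENT;
    if (obj->link_path == NULL)
        return -ENOENT;

    *path = obj->link_path;
    return 0;
}
```

The flag check comes first so that a leftover "link" entry on an orphan is ignored rather than returned.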
            di.wang Di Wang added a comment -

            It turns out to be an OI cache problem. As Jinshan said, the object FID is different from the lmm_oi, so I added this patch to osd_xattr_get to dump some information.

            -       return __osd_xattr_get(inode, dentry, name, buf->lb_buf, buf->lb_len);
            +       rc = __osd_xattr_get(inode, dentry, name, buf->lb_buf, buf->lb_len);
            +       if (strcmp(name, XATTR_NAME_LOV) == 0 && rc > 0) {
            +               struct lov_mds_md *md = (struct lov_mds_md *)buf->lb_buf;
            +               if (unlikely(fid_oid(lu_object_fid(&dt->do_lu)) != md->lmm_oi.oi.oi_id)) {
            +                       struct lu_fid fid = {0};
            +                       struct osd_object *lmm_oi_obj;
            +                       struct osd_inode_id id1 = {0}, id2 = {0}, id3 = {0};
            +                       struct osd_device *osd = osd_obj2dev(obj);
            +                       int rc1,rc2, rc0;
            +
            +                       fid.f_oid = md->lmm_oi.oi.oi_id;
            +                       fid.f_seq = md->lmm_oi.oi.oi_seq;
            +                       CERROR("Uncorrect layout for "DFID" FID is "DFID" "DFID" %u:%u\n",
            +                              PFID(lu_object_fid(&dt->do_lu)), PFID(&fid), PFID(&info->oti_cache.oic_fid), info->oti_cache.oic_lid.oii_ino, info->oti_cache.oic_lid.oii_gen);
            +
            +                       rc0 = osd_oii_lookup(osd, lu_object_fid(&dt->do_lu), &id3);
            +                       rc1 = osd_oi_lookup(info, osd, &fid, &id1, false);
            +                       rc2 = osd_oi_lookup(info, osd, lu_object_fid(&dt->do_lu), &id2, false);
            +
            +                       lmm_oi_obj = osd_object_find(env, dt, &fid);
            +                       if(IS_ERR(lmm_oi_obj)) {
            +                               CERROR("can not find obj by "DFID"\n", PFID(&fid));
            +                       } else {
            +                               CERROR("object inode %p %p %lu %u:%u rc %d dt obj %p %p %lu %u:%u Rc %d rc0 %d id3 %u:%u\n", lmm_oi_obj->oo_inode, lmm_oi_obj->oo_inode, lmm_oi_obj->oo_inode->i_ino, id1.oii_ino, id1.oii_gen, rc1,
            +                                       obj, obj->oo_inode, obj->oo_inode->i_ino, id2.oii_ino, id2.oii_gen, rc2, rc0, id3.oii_ino, id3.oii_gen); 
            +                       }
            +
            +                       LBUG();
            +               }
            +       }
            +       return rc;
             }
            

            When the error happens, the output is:

            Lustre: DEBUG MARKER: == lustre-rsync-test test 4: Replicate files created by iozone == 01:15:24 (1367655324)
            LustreError: 3606:0:(osd_handler.c:2647:osd_xattr_get()) Uncorrect layout for [0x200000400:0x586f:0x0] FID is [0x200000400:0x5871:0x0] [0x200000400:0x586f:0x0] 161:818259173 (oic_cache)
            LustreError: 3606:0:(osd_handler.c:2658:osd_xattr_get()) object inode ffff88007b5a7ba0 ffff88007b5a7ba0 161 161:818259173 rc 0 dt obj ffff880065a8e818 ffff88007b5a7ba0 161 0:0 Rc -2 rc0 -2 id3 0:0
            LustreError: 3606:0:(osd_handler.c:2661:osd_xattr_get()) LBUG
            Pid: 3606, comm: mdt00_002
            

            So we can see that both [0x200000400:0x586f:0x0] (the unlinked one) and [0x200000400:0x5871:0x0] (the correct one) point to inode ffff88007b5a7ba0 (161), but only the correct one can be found by osd_oi_lookup, which means the OI table is correct. It seems the unlinked pair ([0x200000400:0x586f:0x0], 161) is left in the OI cache, so we can still get inode 161 from osd_fid_lookup. Unfortunately, inode 161 has been reused by the new object ([0x200000400:0x5871:0x0]), which is why we see a different lmm_oi here. It seems this patch is good enough to fix the problem:

            diff --git a/lustre/osd-ldiskfs/osd_oi.c b/lustre/osd-ldiskfs/osd_oi.c
            index 574b706..dc7f5e8 100644
            --- a/lustre/osd-ldiskfs/osd_oi.c
            +++ b/lustre/osd-ldiskfs/osd_oi.c
            @@ -674,6 +674,10 @@ int osd_oi_delete(struct osd_thread_info *info,
             {
                    struct lu_fid *oi_fid = &info->oti_fid2;
             
            +       /* clear idmap cache */
            +       if (lu_fid_eq(fid, &info->oti_cache.oic_fid))
            +               memset(&info->oti_cache, 0, sizeof(struct osd_idmap_cache));
            +
                    if (fid_is_last_id(fid))
                            return 0;
            
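The stale-cache scenario and the effect of the osd_oi_delete change above can be reproduced in a small userspace model. This is a sketch under stated assumptions, not the Lustre code: the `demo_*` types are simplified stand-ins for `lu_fid` and `osd_idmap_cache`, and the cache is modelled as a single entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a Lustre FID. */
struct demo_fid {
    uint64_t seq;
    uint32_t oid;
    uint32_t ver;
};

/* Single-entry per-thread cache of the last "FID => ino/gen" mapping. */
struct demo_idmap_cache {
    struct demo_fid fid;
    uint32_t ino;
    uint32_t gen;
};

bool demo_fid_eq(const struct demo_fid *a, const struct demo_fid *b)
{
    return a->seq == b->seq && a->oid == b->oid && a->ver == b->ver;
}

/* Remember the mapping most recently resolved by this thread. */
void demo_cache_store(struct demo_idmap_cache *c, const struct demo_fid *fid,
                      uint32_t ino, uint32_t gen)
{
    assert(c != NULL && fid != NULL);
    c->fid = *fid;
    c->ino = ino;
    c->gen = gen;
}

/* Return true and report the inode number if fid is cached. */
bool demo_cache_lookup(const struct demo_idmap_cache *c,
                       const struct demo_fid *fid, uint32_t *ino)
{
    assert(c != NULL && fid != NULL && ino != NULL);
    if (!demo_fid_eq(&c->fid, fid))
        return false;
    *ino = c->ino;
    return true;
}

/* Same idea as the patch: forget the mapping when its FID is deleted. */
void demo_oi_delete(struct demo_idmap_cache *c, const struct demo_fid *fid)
{
    assert(c != NULL && fid != NULL);
    if (demo_fid_eq(fid, &c->fid))
        memset(c, 0, sizeof(*c));
}
```

Without the `memset` in `demo_oi_delete`, a lookup of the deleted FID would keep returning inode 161 even after that inode had been reused by another object, which is the mismatch shown in the log above.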

            jay Jinshan Xiong (Inactive) added a comment -

            I suddenly realized that the current linkea is okay. There is nothing wrong with two FIDs pointing to the same file path; they are actually two separate files. Sorry for the misleading comment.

            So the problem boils down to an inconsistent layout being returned through OBF, as I mentioned in my comment of "30/Apr/13 3:39 PM". Is there any CLI tool to map a FID to the inode # in the OSD?

            di.wang Di Wang added a comment - http://review.whamcloud.com/#change,6252

            People

              Assignee: bobijam Zhenyu Xu
              Reporter: sarah Sarah Liu