OSD ID mapping cache is not safe to use.

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.8.0
    • Affects Version/s: Lustre 2.7.0
    • Labels: None
    • Severity: 3
    • Rank: 9223372036854775807

    Description

      It seems osd_id_map is not safe to use right now. The [FID, OID] pair is added to the cache after lookup, but if another thread deletes the object, it invalidates the cache only in its own thread info.

      int osd_oi_delete(struct osd_thread_info *info,
                        struct osd_device *osd, const struct lu_fid *fid,
                        handle_t *th, enum oi_check_flags flags)
      {
              struct lu_fid *oi_fid = &info->oti_fid2;
      
              /* clear idmap cache */
              if (lu_fid_eq(fid, &info->oti_cache.oic_fid))
                      fid_zero(&info->oti_cache.oic_fid);
            ..............
      
      

      Other threads can still get the OID from their own caches, and if the inode has since been reused by another object, we will see a bunch of messages like:

      Lustre: lustre-MDT0003-osd: FID [0x2c0000404:0x1f34:0x0] != self_fid [0x2c0000404:0x281c:0x0]
      

      Unfortunately, this also triggers osd-scrub. This is what I saw in the DNE2 failover test.
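      The sequence described above can be modeled outside the kernel. The following is a minimal sketch, not Lustre code: every `model_*` name is hypothetical, the per-thread cache is reduced to a single [FID, inode number] slot, and the two "threads" are simply two structs used in turn. It only illustrates why clearing the deleting thread's cache leaves a stale entry in the other one.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the real Lustre structures. */
struct model_fid {
	unsigned long long seq;
	unsigned int oid;
	unsigned int ver;
};

/* One cached [FID, inode number] pair per thread, like oti_cache. */
struct model_thread_info {
	struct model_fid cached_fid;
	unsigned int cached_ino;
};

#define MODEL_NR_INODES 8
/* The self FID stored in each inode. */
static struct model_fid model_self_fid[MODEL_NR_INODES];

static bool model_fid_eq(const struct model_fid *a, const struct model_fid *b)
{
	return a->seq == b->seq && a->oid == b->oid && a->ver == b->ver;
}

/* A successful lookup fills the calling thread's cache. */
static void model_lookup(struct model_thread_info *ti,
			 const struct model_fid *fid, unsigned int ino)
{
	ti->cached_fid = *fid;
	ti->cached_ino = ino;
}

/* Like the osd_oi_delete() excerpt: only the caller's cache is cleared. */
static void model_oi_delete(struct model_thread_info *ti,
			    const struct model_fid *fid)
{
	if (model_fid_eq(fid, &ti->cached_fid))
		memset(&ti->cached_fid, 0, sizeof(ti->cached_fid));
}

static bool model_cache_hit(const struct model_thread_info *ti,
			    const struct model_fid *fid)
{
	return model_fid_eq(fid, &ti->cached_fid);
}

/* Returns 1 if thread B is left with a mapping to a reused inode. */
static int model_stale_cache_demo(void)
{
	const struct model_fid old_fid = { 0x2c0000404ULL, 0x1f34, 0 };
	const struct model_fid new_fid = { 0x2c0000404ULL, 0x281c, 0 };
	const unsigned int ino = 5;
	struct model_thread_info a, b;

	memset(&a, 0, sizeof(a));
	memset(&b, 0, sizeof(b));

	model_self_fid[ino] = old_fid;
	model_lookup(&a, &old_fid, ino);	/* both threads cache the pair */
	model_lookup(&b, &old_fid, ino);

	model_oi_delete(&a, &old_fid);		/* thread A deletes the object */
	model_self_fid[ino] = new_fid;		/* inode reused by a new object */

	if (model_cache_hit(&a, &old_fid))
		return 0;			/* A's cache was cleared */

	/* B still hits, but the inode's self FID no longer matches. */
	return model_cache_hit(&b, &old_fid) &&
	       !model_fid_eq(&old_fid, &model_self_fid[b.cached_ino]);
}
```

      In this model `model_stale_cache_demo()` returns 1: thread B's cached entry still maps the old FID to an inode whose self FID is now different, which is the mismatch reported in the log line above.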

        Activity

          [LU-6465] OSD ID mapping cache is not safe to use.

          gerrit Gerrit Updater added a comment -

          Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/14499/
          Subject: LU-6465 osd: NO OI scrub because of cached invalid OI mapping
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: dac584c6946d15e1ca9e6feeb26b164768041c40

          gerrit Gerrit Updater added a comment -

          Fan Yong (fan.yong@intel.com) uploaded a new patch: http://review.whamcloud.com/14499
          Subject: LU-6465 osd: NO OI scrub because of cached invalid OI mapping
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: 06534d926e58b3908302d389aac4a39d73e6d2b8

          di.wang Di Wang added a comment -

          Disabling the oic_cache does make the problem go away.

          [root@testnode lustre-release_new]# git diff
          diff --git a/lustre/osd-ldiskfs/osd_handler.c b/lustre/osd-ldiskfs/osd_handler.c
          index 7ab83f3..cce5dfa 100644
          --- a/lustre/osd-ldiskfs/osd_handler.c
          +++ b/lustre/osd-ldiskfs/osd_handler.c
          @@ -607,13 +607,15 @@ static int osd_fid_lookup(const struct lu_env *env, struct osd_object *obj,
                  if (conf != NULL && conf->loc_flags & LOC_F_NEW)
                          GOTO(out, result = 0);
           
          +#if 0
          +       /* Disable OIC cache until LU-6465 is resolved */
                  /* Search order: 1. per-thread cache. */
                  if (lu_fid_eq(fid, &oic->oic_fid) &&
                      likely(oic->oic_dev == dev)) {
                          id = &oic->oic_lid;
                          goto iget;
                  }
          -
          +#endif
                  id = &info->oti_id;
                  if (!list_empty(&scrub->os_inconsistent_items)) {
                          /* Search order: 2. OI scrub pending list. */
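          Compiling the cache out is a workaround. Another way to think about it is to treat a cache hit only as a hint and check it against the self FID stored in the inode, falling back to the authoritative lookup on a mismatch. The following is a minimal sketch of that idea under stated assumptions: it is not the merged patch, all `sketch_*` names are hypothetical, and the OI is modeled as a linear scan over a small self-FID table.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the real Lustre structures. */
struct sketch_fid {
	unsigned long long seq;
	unsigned int oid;
	unsigned int ver;
};

struct sketch_cache {
	struct sketch_fid fid;
	unsigned int ino;	/* 0 means "no inode" */
};

#define SKETCH_NR_INODES 8
/* Self FID per inode; also serves as the authoritative mapping here. */
static struct sketch_fid sketch_self_fid[SKETCH_NR_INODES];

static bool sketch_fid_eq(const struct sketch_fid *a,
			  const struct sketch_fid *b)
{
	return a->seq == b->seq && a->oid == b->oid && a->ver == b->ver;
}

/* Authoritative lookup: returns the inode number, or 0 if not found. */
static unsigned int sketch_oi_lookup(const struct sketch_fid *fid)
{
	unsigned int ino;

	for (ino = 1; ino < SKETCH_NR_INODES; ino++)
		if (sketch_fid_eq(fid, &sketch_self_fid[ino]))
			return ino;
	return 0;
}

/*
 * Trust the cache only if the cached inode still carries the same FID.
 * Otherwise fall back to the authoritative lookup and refresh the cache.
 */
static unsigned int sketch_fid_to_ino(struct sketch_cache *c,
				      const struct sketch_fid *fid,
				      bool *from_cache)
{
	unsigned int ino;

	if (sketch_fid_eq(fid, &c->fid) &&
	    c->ino != 0 && c->ino < SKETCH_NR_INODES &&
	    sketch_fid_eq(fid, &sketch_self_fid[c->ino])) {
		*from_cache = true;
		return c->ino;
	}

	*from_cache = false;
	ino = sketch_oi_lookup(fid);
	c->fid = *fid;
	c->ino = ino;	/* ino == 0 is never trusted by the check above */
	return ino;
}

/* Returns 1 if the stale entry is detected and never returned. */
static int sketch_demo(void)
{
	const struct sketch_fid old_fid = { 0x2c0000404ULL, 0x1f34, 0 };
	const struct sketch_fid new_fid = { 0x2c0000404ULL, 0x281c, 0 };
	struct sketch_cache cache;
	bool from_cache = false;

	sketch_self_fid[5] = old_fid;
	cache.fid = old_fid;
	cache.ino = 5;

	/* Valid cache entry: served from the cache. */
	if (sketch_fid_to_ino(&cache, &old_fid, &from_cache) != 5 || !from_cache)
		return 0;

	/* Object deleted elsewhere and the inode reused by a new object. */
	sketch_self_fid[5] = new_fid;

	/* Stale entry is rejected; the old FID no longer resolves. */
	if (sketch_fid_to_ino(&cache, &old_fid, &from_cache) != 0 || from_cache)
		return 0;

	/* The new FID resolves through the authoritative lookup. */
	if (sketch_fid_to_ino(&cache, &new_fid, &from_cache) != 5 || from_cache)
		return 0;

	return 1;
}
```

          In this model the stale hit costs one extra comparison and one fallback lookup, and a mismatch that originates from the cache is never reported as an inconsistency. Whether the real fix takes this shape should be checked against the patch itself.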
          

          People

            yong.fan nasf (Inactive)
            di.wang Di Wang