Details

    • Bug
    • Resolution: Not a Bug
    • Major
    • None
    • Lustre 2.5.3
    • None
    • 3
    • 9223372036854775807

    Description

      We are encountering an issue on our lustre-2.5.4+ servers where we are seeing an error destroying objects with RC115, then the OI scrubber is starting. We suspect that the OI scrubber may be causing some jobs to run over wallclock.

      I have attached some kernel logs, and below is the output from the OI scrubber:

      [root@f1-oss1d8 f1-OST00bf]# cat /proc/fs/lustre/osd-ldiskfs/f1-OST00bf/oi_scrub
      name: OI_scrub
      magic: 0x4c5fd252
      oi_files: 64
      status: scanning
      flags: inconsistent,auto
      param:
      time_since_last_completed: 316 seconds
      time_since_latest_start: 17 seconds
      time_since_last_checkpoint: 17 seconds
      latest_start_position: 12
      last_checkpoint_position: N/A
      first_failure_position: 562883
      checked: 1900449
      updated: 0
      failed: 8
      prior_updated: 0
      noscrub: 119
      igif: 1
      success_count: 13873
      run_time: 18 seconds
      average_speed: 105580 objects/sec
      real-time_speed: 107345 objects/sec
      current_position: 2082404
      lf_scanned: 0
      lf_reparied: 0
      lf_failed: 0
      [root@f1-oss1d8 f1-OST00bf]#

      Thanks,
      Dustin

      Attachments

        Issue Links

          Activity

            [LU-7378] Error destroying object with RC115
            yujian Jian Yu added a comment -

            Thank you, Dustin. Let me close this ticket.

            yujian Jian Yu added a comment - Thank you, Dustin. Let me close this ticket.

            Nasf,
            We took a downtime this morning and removed the object you identified above. We are now getting new duplicate FID errors in the logs. It looks like 15 messages were being suppressed so we have more objects to destroy. The one that you helped with above is no longer generating messages. We will disable log suppression and identify the objects that need to be destroyed using the procedure we followed above.

            Thank you for your help, I think it is okay to close this ticket.

            Thanks,
            Dustin

            dustb100 Dustin Leverman added a comment - Nasf, We took a downtime this morning and removed the object you identified above. We are now getting new duplicate FID errors in the logs. It looks like 15 messages were being suppressed so we have more objects to destroy. The one that you helped with above is no longer generating messages. We will disable log suppression and identify the objects that need to be destroyed using the procedure we followed above. Thank you for your help, I think it is okay to close this ticket. Thanks, Dustin

            Thank you for the information Nasf. This will be our plan, but we need to schedule an outage with the customer. We will let you know the result.

            Thanks,
            Dustin

            dustb100 Dustin Leverman added a comment - Thank you for the information Nasf. This will be our plan, but we need to schedule an outage with the customer. We will let you know the result. Thanks, Dustin

            [root@lfs-mgmt01.ncrc.gov ~]# lfs fid2path /lustre/f1 '[0x20002f6bc006ea8:0x0:0x0]'
            ioctl err -22: Invalid argument (22)
            fid2path: error on FID [0x20002f6bc006ea8:0x0:0x0]: Invalid argument

            If the "/lustre/f1" is the right mount point, then that means the FID "[0x20002f6bc006ea8:0x0:0x0]" is an invalid one. The first valid lu_fid::f_oid for MDT-object's FID is 1, not 0. So the PFID EA for the OST-object is wrong. So it is normal that you cannot locate related MDT-object via such invalid FID.

            On the other hand, your case happened during destroying OST-object. That means the MDT-object has been removed already. So even thought the PFID EA was right, you still could not locate the removed MDT-object.

            [root@f1-oss1d8 ~]# debugfs -c -R "stat /O/0/d4/54354404" /dev/mapper/f1-ddn1d-l53 | grep fid
            debugfs 1.42.12.wc1 (15-Sep-2014)
            /dev/mapper/f1-ddn1d-l53: catastrophic mode - not reading inode or group bitmaps
            lma: fid=[0x100000000:0x2fef612:0x0] compat=8 incompat=0
            fid = "a8 6e 00 bc f6 02 00 02 00 00 00 00 02 00 00 00 " (16)
            fid: parent=[0x20002f6bc006ea8:0x0:0x0] stripe=2
            [root@f1-oss1d8 ~]# debugfs -c -R "stat /O/0/d18/50263570" /dev/mapper/f1-ddn1d-l53 | grep fid
            debugfs 1.42.12.wc1 (15-Sep-2014)
            /dev/mapper/f1-ddn1d-l53: catastrophic mode - not reading inode or group bitmaps
            lma: fid=[0x100000000:0x2fef612:0x0] compat=8 incompat=0
            fid = "a8 6e 00 bc f6 02 00 02 00 00 00 00 02 00 00 00 " (16)
            fid: parent=[0x20002f6bc006ea8:0x0:0x0] stripe=2

            Searched the log, only the FID [0x100000000:0x2fef612:0x0] was reported as conflict FID mapping. Such FID should be mapped to /O/0/d18/50263570 (6037283/2343154970). But for some unknown reason, the /O/0/d4/54354404 (562883/2343154970) also claims to be mapped to the FID [0x100000000:0x2fef612:0x0]. The latter one is wrong.

            In fact, the OI entry /O/0/d4/54354404 is corresponding to the FID [0x100000000:0x33d61e4:0x0]. Since we are destroying the OST-object that is corresponding to the FID [0x100000000:0x33d61e4:0x0]. The most simple solution is that mount the OST as ldiskfs mode, then remove /O/0/d4/54354404 directly.

            yong.fan nasf (Inactive) added a comment - [root@lfs-mgmt01.ncrc.gov ~] # lfs fid2path /lustre/f1 ' [0x20002f6bc006ea8:0x0:0x0] ' ioctl err -22: Invalid argument (22) fid2path: error on FID [0x20002f6bc006ea8:0x0:0x0] : Invalid argument If the "/lustre/f1" is the right mount point, then that means the FID " [0x20002f6bc006ea8:0x0:0x0] " is an invalid one. The first valid lu_fid::f_oid for MDT-object's FID is 1, not 0. So the PFID EA for the OST-object is wrong. So it is normal that you cannot locate related MDT-object via such invalid FID. On the other hand, your case happened during destroying OST-object. That means the MDT-object has been removed already. So even thought the PFID EA was right, you still could not locate the removed MDT-object. [root@f1-oss1d8 ~] # debugfs -c -R "stat /O/0/d4/54354404" /dev/mapper/f1-ddn1d-l53 | grep fid debugfs 1.42.12.wc1 (15-Sep-2014) /dev/mapper/f1-ddn1d-l53: catastrophic mode - not reading inode or group bitmaps lma: fid= [0x100000000:0x2fef612:0x0] compat=8 incompat=0 fid = "a8 6e 00 bc f6 02 00 02 00 00 00 00 02 00 00 00 " (16) fid: parent= [0x20002f6bc006ea8:0x0:0x0] stripe=2 [root@f1-oss1d8 ~] # debugfs -c -R "stat /O/0/d18/50263570" /dev/mapper/f1-ddn1d-l53 | grep fid debugfs 1.42.12.wc1 (15-Sep-2014) /dev/mapper/f1-ddn1d-l53: catastrophic mode - not reading inode or group bitmaps lma: fid= [0x100000000:0x2fef612:0x0] compat=8 incompat=0 fid = "a8 6e 00 bc f6 02 00 02 00 00 00 00 02 00 00 00 " (16) fid: parent= [0x20002f6bc006ea8:0x0:0x0] stripe=2 Searched the log, only the FID [0x100000000:0x2fef612:0x0] was reported as conflict FID mapping. Such FID should be mapped to /O/0/d18/50263570 (6037283/2343154970). But for some unknown reason, the /O/0/d4/54354404 (562883/2343154970) also claims to be mapped to the FID [0x100000000:0x2fef612:0x0] . The latter one is wrong. In fact, the OI entry /O/0/d4/54354404 is corresponding to the FID [0x100000000:0x33d61e4:0x0] . Since we are destroying the OST-object that is corresponding to the FID [0x100000000:0x33d61e4:0x0] . The most simple solution is that mount the OST as ldiskfs mode, then remove /O/0/d4/54354404 directly.
            yujian Jian Yu added a comment -

            Yes, Matt. Let me ask for help.

            Hi Nasf,

            Could you please advise?

            yujian Jian Yu added a comment - Yes, Matt. Let me ask for help. Hi Nasf, Could you please advise?
            ezell Matt Ezell added a comment -

            Hi Jian-

            This file system was formatted with Lustre 2.4, so we shouldn't have any IGIF files. According to step 2, fid2path should have been able to find it, correct?

            ~Matt

            ezell Matt Ezell added a comment - Hi Jian- This file system was formatted with Lustre 2.4, so we shouldn't have any IGIF files. According to step 2, fid2path should have been able to find it, correct? ~Matt
            yujian Jian Yu added a comment - Hi Matt, Could you please refer to the steps in https://build.hpdd.intel.com/job/lustre-manual/lastSuccessfulBuild/artifact/lustre_manual.html#dbdoclet.50438194_30872 ?
            ezell Matt Ezell added a comment -

            Hi Jian-

            I think those are OST inode numbers, so I ran the command there:

            [root@f1-oss1d8 ~]# debugfs -c -R "ncheck 6037283 562883" /dev/mapper/f1-ddn1d-l53
            debugfs 1.42.12.wc1 (15-Sep-2014)
            /dev/mapper/f1-ddn1d-l53: catastrophic mode - not reading inode or group bitmaps
            Inode   Pathname
            562883  /O/0/d4/54354404
            6037283 /O/0/d18/50263570
            
            [root@f1-oss1d8 ~]# debugfs -c -R "stat /O/0/d4/54354404" /dev/mapper/f1-ddn1d-l53 | grep fid
            debugfs 1.42.12.wc1 (15-Sep-2014)
            /dev/mapper/f1-ddn1d-l53: catastrophic mode - not reading inode or group bitmaps
              lma: fid=[0x100000000:0x2fef612:0x0] compat=8 incompat=0
              fid = "a8 6e 00 bc f6 02 00 02 00 00 00 00 02 00 00 00 " (16)
              fid: parent=[0x20002f6bc006ea8:0x0:0x0] stripe=2
            [root@f1-oss1d8 ~]# debugfs -c -R "stat /O/0/d18/50263570" /dev/mapper/f1-ddn1d-l53 | grep fid
            debugfs 1.42.12.wc1 (15-Sep-2014)
            /dev/mapper/f1-ddn1d-l53: catastrophic mode - not reading inode or group bitmaps
              lma: fid=[0x100000000:0x2fef612:0x0] compat=8 incompat=0
              fid = "a8 6e 00 bc f6 02 00 02 00 00 00 00 02 00 00 00 " (16)
              fid: parent=[0x20002f6bc006ea8:0x0:0x0] stripe=2
            

            That's where I hit a dead end:

            [root@lfs-mgmt01.ncrc.gov ~]# lfs fid2path /lustre/f1 '[0x100000000:0x2fef612:0x0]'
            ioctl err -22: Invalid argument (22)
            fid2path: error on FID [0x100000000:0x2fef612:0x0]: Invalid argument
            [root@lfs-mgmt01.ncrc.gov ~]# lfs fid2path /lustre/f1 '[0x20002f6bc006ea8:0x0:0x0]'
            ioctl err -22: Invalid argument (22)
            fid2path: error on FID [0x20002f6bc006ea8:0x0:0x0]: Invalid argument
            
            ezell Matt Ezell added a comment - Hi Jian- I think those are OST inode numbers, so I ran the command there: [root@f1-oss1d8 ~]# debugfs -c -R "ncheck 6037283 562883" /dev/mapper/f1-ddn1d-l53 debugfs 1.42.12.wc1 (15-Sep-2014) /dev/mapper/f1-ddn1d-l53: catastrophic mode - not reading inode or group bitmaps Inode Pathname 562883 /O/0/d4/54354404 6037283 /O/0/d18/50263570 [root@f1-oss1d8 ~]# debugfs -c -R "stat /O/0/d4/54354404" /dev/mapper/f1-ddn1d-l53 | grep fid debugfs 1.42.12.wc1 (15-Sep-2014) /dev/mapper/f1-ddn1d-l53: catastrophic mode - not reading inode or group bitmaps lma: fid=[0x100000000:0x2fef612:0x0] compat=8 incompat=0 fid = "a8 6e 00 bc f6 02 00 02 00 00 00 00 02 00 00 00 " (16) fid: parent=[0x20002f6bc006ea8:0x0:0x0] stripe=2 [root@f1-oss1d8 ~]# debugfs -c -R "stat /O/0/d18/50263570" /dev/mapper/f1-ddn1d-l53 | grep fid debugfs 1.42.12.wc1 (15-Sep-2014) /dev/mapper/f1-ddn1d-l53: catastrophic mode - not reading inode or group bitmaps lma: fid=[0x100000000:0x2fef612:0x0] compat=8 incompat=0 fid = "a8 6e 00 bc f6 02 00 02 00 00 00 00 02 00 00 00 " (16) fid: parent=[0x20002f6bc006ea8:0x0:0x0] stripe=2 That's where I hit a dead end: [root@lfs-mgmt01.ncrc.gov ~]# lfs fid2path /lustre/f1 '[0x100000000:0x2fef612:0x0]' ioctl err -22: Invalid argument (22) fid2path: error on FID [0x100000000:0x2fef612:0x0]: Invalid argument [root@lfs-mgmt01.ncrc.gov ~]# lfs fid2path /lustre/f1 '[0x20002f6bc006ea8:0x0:0x0]' ioctl err -22: Invalid argument (22) fid2path: error on FID [0x20002f6bc006ea8:0x0:0x0]: Invalid argument
            yujian Jian Yu added a comment -

            Hi Dustin,

            Could you please run the following command on MDT device to see the pathnames of the two inodes that used the same FID?

            debugfs -c -R "ncheck 6037283 562883" /dev/{mdtdev}
            
            yujian Jian Yu added a comment - Hi Dustin, Could you please run the following command on MDT device to see the pathnames of the two inodes that used the same FID? debugfs -c -R "ncheck 6037283 562883" /dev/{mdtdev}
            yujian Jian Yu added a comment - - edited

            In kernel logs, the following error messages kept occurring:

            Oct 14 16:30:12 f1-oss1d8 kernel: [2604418.085966] LustreError: 10030:0:(ost_handler.c:1777:ost_blocking_ast()) Error -2 syncing data on lock cancel
            Oct 14 16:33:07 f1-oss1d8 kernel: [2604593.041236] Lustre: f1-OST00bf-osd: trigger OI scrub by RPC for [0x100000000:0x33d61e5:0x0], rc = 0 [1]
            Oct 14 16:33:07 f1-oss1d8 kernel: [2604593.065422] Lustre: Skipped 3 previous similar messages
            Oct 14 16:33:07 f1-oss1d8 kernel: [2604593.076384] LustreError: 99512:0:(ofd_obd.c:1096:ofd_destroy()) f1-OST00bf: error destroying object [0x100000000:0x33d61e5:0x0]: -115
            Oct 14 16:33:07 f1-oss1d8 kernel: [2604593.125308] LustreError: 99512:0:(ofd_obd.c:1096:ofd_destroy()) Skipped 11 previous similar messages
            Oct 14 16:33:11 f1-oss1d8 kernel: [2604597.299596] LustreError: 38769:0:(osd_compat.c:598:osd_obj_update_entry()) f1-OST00bf-osd: the FID [0x100000000:0x2fef612:0x0] is used by two objects: 6037283/2343154970 562883/2343154970
            Oct 14 16:33:11 f1-oss1d8 kernel: [2604597.348347] LustreError: 38769:0:(osd_compat.c:598:osd_obj_update_entry()) Skipped 15 previous similar messages
            

            For "-115" (-EINPROGRESS) error, it came from ofd_destroy -> ofd_destroy_by_fid -> ofd_object_find -> … -> osd_object_init -> osd_fid_lookup():

                                    triggered = true;
                                    if (thread_is_running(&scrub->os_thread)) {
                                            result = -EINPROGRESS;
                                    } else if (!dev->od_noscrub) {
                                            result = osd_scrub_start(dev);
                                            LCONSOLE_WARN("%.16s: trigger OI scrub by RPC "
                                                          "for "DFID", rc = %d [1]\n",
                                                          osd_name(dev), PFID(fid), result);
                                            if (result == 0 || result == -EALREADY)
                                                    result = -EINPROGRESS;
                                            else
                                                    result = -EREMCHG;
                                    }
            

            For "FID used by two objects" error, it came from osd_obj_update_entry():

                    if (lu_fid_eq(fid, oi_fid)) {
                            CERROR("%s: the FID "DFID" is used by two objects: "
                                   "%u/%u %u/%u\n", osd_name(osd), PFID(fid),
                                   oi_id->oii_ino, oi_id->oii_gen,
                                   id->oii_ino, id->oii_gen);
                            GOTO(out, rc = -EEXIST);
                    }
            
            yujian Jian Yu added a comment - - edited In kernel logs, the following error messages kept occurring: Oct 14 16:30:12 f1-oss1d8 kernel: [2604418.085966] LustreError: 10030:0:(ost_handler.c:1777:ost_blocking_ast()) Error -2 syncing data on lock cancel Oct 14 16:33:07 f1-oss1d8 kernel: [2604593.041236] Lustre: f1-OST00bf-osd: trigger OI scrub by RPC for [0x100000000:0x33d61e5:0x0], rc = 0 [1] Oct 14 16:33:07 f1-oss1d8 kernel: [2604593.065422] Lustre: Skipped 3 previous similar messages Oct 14 16:33:07 f1-oss1d8 kernel: [2604593.076384] LustreError: 99512:0:(ofd_obd.c:1096:ofd_destroy()) f1-OST00bf: error destroying object [0x100000000:0x33d61e5:0x0]: -115 Oct 14 16:33:07 f1-oss1d8 kernel: [2604593.125308] LustreError: 99512:0:(ofd_obd.c:1096:ofd_destroy()) Skipped 11 previous similar messages Oct 14 16:33:11 f1-oss1d8 kernel: [2604597.299596] LustreError: 38769:0:(osd_compat.c:598:osd_obj_update_entry()) f1-OST00bf-osd: the FID [0x100000000:0x2fef612:0x0] is used by two objects: 6037283/2343154970 562883/2343154970 Oct 14 16:33:11 f1-oss1d8 kernel: [2604597.348347] LustreError: 38769:0:(osd_compat.c:598:osd_obj_update_entry()) Skipped 15 previous similar messages For "-115" (-EINPROGRESS) error, it came from ofd_destroy -> ofd_destroy_by_fid -> ofd_object_find -> … -> osd_object_init -> osd_fid_lookup(): triggered = true ; if (thread_is_running(&scrub->os_thread)) { result = -EINPROGRESS; } else if (!dev->od_noscrub) { result = osd_scrub_start(dev); LCONSOLE_WARN( "%.16s: trigger OI scrub by RPC " " for " DFID ", rc = %d [1]\n" , osd_name(dev), PFID(fid), result); if (result == 0 || result == -EALREADY) result = -EINPROGRESS; else result = -EREMCHG; } For "FID used by two objects" error, it came from osd_obj_update_entry(): if (lu_fid_eq(fid, oi_fid)) { CERROR( "%s: the FID " DFID " is used by two objects: " "%u/%u %u/%u\n" , osd_name(osd), PFID(fid), oi_id->oii_ino, oi_id->oii_gen, id->oii_ino, id->oii_gen); GOTO(out, rc = -EEXIST); }

            People

              yong.fan nasf (Inactive)
              dustb100 Dustin Leverman
              Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: