Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-3569

Use real OST index as ostid_to_fid() parameter instead of always "0"

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.6.0, Lustre 2.8.0
    • Lustre 2.5.0
    • None
    • 3
    • 8988

    Description

      Currently, the callers for ostid_to_fid() always use "0" as the "ost_idx" parameter. There are potential issues for that, and may cause FID inconsistency as to crash the system.

      Attachments

        Issue Links

          Activity

            [LU-3569] Use real OST index as ostid_to_fid() parameter instead of always "0"

            Peter, I have refreshed the patch http://review.whamcloud.com/#/c/16477/

            yong.fan nasf (Inactive) added a comment - Peter, I have refreshed the patch http://review.whamcloud.com/#/c/16477/
            pjones Peter Jones added a comment -

            Fan Yong

            We have been discussing this issue on the triage call and it looks like an fsck issue. Could you please look at the latest feedback in the patch and see what needs to be done?

            Thanks

            Peter

            pjones Peter Jones added a comment - Fan Yong We have been discussing this issue on the triage call and it looks like an fsck issue. Could you please look at the latest feedback in the patch and see what needs to be done? Thanks Peter

            Lai Siyao (lai.siyao@intel.com) uploaded a new patch: http://review.whamcloud.com/16477
            Subject: LU-3569 utils: remove ll_recover_lost_found_obj
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: ef14f2f062ee31302d55a0a29ecee94233f47aa8

            gerrit Gerrit Updater added a comment - Lai Siyao (lai.siyao@intel.com) uploaded a new patch: http://review.whamcloud.com/16477 Subject: LU-3569 utils: remove ll_recover_lost_found_obj Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: ef14f2f062ee31302d55a0a29ecee94233f47aa8
            pjones Peter Jones added a comment -

            As per discussion on the triage call, let's just delete the tool for 2.8

            pjones Peter Jones added a comment - As per discussion on the triage call, let's just delete the tool for 2.8

            Lai, I'd be grateful if you could revive my http://review.whamcloud.com/7880 patch to fix ll_recover_lost_found_objs, or possibly better is to make a different patch to delete ll_recover_lost_found_objs.c entirely because LFSCK should handle that now (sanity-scrub test_14 tests this).

            adilger Andreas Dilger added a comment - Lai, I'd be grateful if you could revive my http://review.whamcloud.com/7880 patch to fix ll_recover_lost_found_objs, or possibly better is to make a different patch to delete ll_recover_lost_found_objs.c entirely because LFSCK should handle that now (sanity-scrub test_14 tests this).
            laisiyao Lai Siyao added a comment -

            Andreas, is there anything needed for this ticket?

            laisiyao Lai Siyao added a comment - Andreas, is there anything needed for this ticket?

            Looks like I am mistaken. I thought that it was a bug for ll_recover_lost_found_objs checking fid_seq_is_idif() and creating the object path LPX64, since this would move objects to "O/0x0/d*" instead of "O/0/d*". However, I didn't notice until now that the old code is actually using LPX64i instead of LPX64, and LPX64i prints zero as "0" instead of "0x0".

            Even so, I cleaned up the code a bit to consolidate where the seq_name is being generated into one place, and IDIF objects now use LPU64 for the seq_name to make the code more clear.

            http://review.whamcloud.com/7880

            adilger Andreas Dilger added a comment - Looks like I am mistaken. I thought that it was a bug for ll_recover_lost_found_objs checking fid_seq_is_idif() and creating the object path LPX64, since this would move objects to "O/0x0/d*" instead of "O/0/d*". However, I didn't notice until now that the old code is actually using LPX64i instead of LPX64, and LPX64i prints zero as "0" instead of "0x0". Even so, I cleaned up the code a bit to consolidate where the seq_name is being generated into one place, and IDIF objects now use LPU64 for the seq_name to make the code more clear. http://review.whamcloud.com/7880

            Even if we don't store the OST index into the on-disk LMA FID, we need to fix a couple of bugs in the ll_lost_found_objs.

            adilger Andreas Dilger added a comment - Even if we don't store the OST index into the on-disk LMA FID, we need to fix a couple of bugs in the ll_lost_found_objs.

            I am thinking whether we will have the requirement to change OST-index in the future. If we have, then storing OST-index inside the IDIF will be trouble. Because for normal FID, OST-index is meaningless, the client can get the OST-index by checking FLD with the given FID. So as long as we can adjust the FLD correctly, then changing OST-index without breaking the system is easy.

            Currently, because all the IDIF use 0-index, the OI scrub does not care about the OST-index. But if the up layer change the IDIF logic, means both LOD and OFD use the real OST-index, then OI scrub needs to the OST-index to rebuild the LMA and the /O tree. And before such conversion is completed, the system is un-accessable, it is not like OI scrub for MDT file-level backup/restore. Under the later one case, the system is still accessible during the OI scrub rebuilding the OI files. I am not sure whether OI scrub or the OSD should know the 'index' information or not.

            yong.fan nasf (Inactive) added a comment - I am thinking whether we will have the requirement to change OST-index in the future. If we have, then storing OST-index inside the IDIF will be trouble. Because for normal FID, OST-index is meaningless, the client can get the OST-index by checking FLD with the given FID. So as long as we can adjust the FLD correctly, then changing OST-index without breaking the system is easy. Currently, because all the IDIF use 0-index, the OI scrub does not care about the OST-index. But if the up layer change the IDIF logic, means both LOD and OFD use the real OST-index, then OI scrub needs to the OST-index to rebuild the LMA and the /O tree. And before such conversion is completed, the system is un-accessable, it is not like OI scrub for MDT file-level backup/restore. Under the later one case, the system is still accessible during the OI scrub rebuilding the OI files. I am not sure whether OI scrub or the OSD should know the 'index' information or not.
            di.wang Di Wang added a comment -

            Oh, unfortunately, we always store FID_SEQ_IDIF(without ost index) for all of OST objects since 2.4.0. So LFSCK have to fix it. Nasf?

            di.wang Di Wang added a comment - Oh, unfortunately, we always store FID_SEQ_IDIF(without ost index) for all of OST objects since 2.4.0. So LFSCK have to fix it. Nasf?

            I see that we are already storing the "index == 0" FID into the OST objects with the current master:

            # debugfs /dev/vg_sookie/lvost2
            debugfs 1.42.7.wc2 (12-Apr-2013)
            debugfs:  stats
            Filesystem volume name:   testfs-OST0001
            :
            :
            debugfs:  stat O/0/d0/1856
            Inode: 172   Type: regular    Mode:  0666   Flags: 0x80000
            :
            :
            Extended attributes stored in inode body: 
              lma = "08 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 40 07 00 00 00 00 00 00 " (24)
              lma: fid=[0x100000000:0x740:0x0] compat=8 incompat=0
              fid = "00 04 00 00 02 00 00 00 f8 75 00 00 00 00 00 00 " (16)
              fid: parent=[0x200000400:0x75f8:0x0] stripe=0
            

            So this is always storing FID_SEQ_IDIF = 0x100000000 into the on-disk OST object LMA, instead of (FID_SEQ_NORMAL + ost_idx << 16) = 0x100010000 in this case. If this has been around since 2.4.x days then I guess LFSCK will have to fix it up either way and there is no requirement to fix it immediately, but if it is new in 2.5 then I'd prefer that this be fixed before 2.5.0.

            adilger Andreas Dilger added a comment - I see that we are already storing the "index == 0" FID into the OST objects with the current master: # debugfs /dev/vg_sookie/lvost2 debugfs 1.42.7.wc2 (12-Apr-2013) debugfs: stats Filesystem volume name: testfs-OST0001 : : debugfs: stat O/0/d0/1856 Inode: 172 Type: regular Mode: 0666 Flags: 0x80000 : : Extended attributes stored in inode body: lma = "08 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 40 07 00 00 00 00 00 00 " (24) lma: fid=[0x100000000:0x740:0x0] compat=8 incompat=0 fid = "00 04 00 00 02 00 00 00 f8 75 00 00 00 00 00 00 " (16) fid: parent=[0x200000400:0x75f8:0x0] stripe=0 So this is always storing FID_SEQ_IDIF = 0x100000000 into the on-disk OST object LMA, instead of (FID_SEQ_NORMAL + ost_idx << 16) = 0x100010000 in this case. If this has been around since 2.4.x days then I guess LFSCK will have to fix it up either way and there is no requirement to fix it immediately, but if it is new in 2.5 then I'd prefer that this be fixed before 2.5.0.

            People

              yong.fan nasf (Inactive)
              yong.fan nasf (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: