Corruption of MDT “..” entry in non-HTree ldiskfs directories

Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • Lustre 2.7.0, Lustre 2.5.4
    • Lustre 2.7.0, Lustre 2.4.3, Lustre 2.5.3
    • Lustre 2.4 or newer, file system upgraded from 1.8
    • 3
    • 15741

    Description

      LU-2638 reported directory entry corruption related to FID-in-dirent and the “..” entry in HTree directories.
      We have since discovered an identical problem in non-HTree directories.

      This is essentially the same problem, but it manifests slightly differently in non-HTree directories. The “..” entry must remain the second entry in the directory block (FSCK demands this), and when a directory created under 1.8 (now on a 2.4+ server with dirdata enabled) is moved to a new parent, the “..” entry is updated. Exactly as happened in LU-2638, the FID is added to the “..” entry without regard to whether there is sufficient space in the second position in the directory block.
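To make the space problem concrete, here is a minimal Python sketch of the record-length arithmetic. It is a simplified model, not the ldiskfs code. The 8-byte header and 4-byte rounding follow the standard ext4 directory entry layout; the size of the FID payload (a NUL terminator, one length byte, and a 16-byte FID) is an assumption made for illustration, and the names `dir_rec_len` and `dotdot_fits_in_place` are hypothetical.

```python
# Simplified model of ext4/ldiskfs directory record sizing (not kernel code).
DIRENT_HEADER = 8   # inode (4) + rec_len (2) + name_len (1) + file_type (1)
DIR_ROUND = 3       # records are padded to a 4-byte boundary

# Assumption: the dirdata payload for a FID is a NUL terminator after the
# name, one length byte, and a 16-byte FID.
FID_DIRDATA_LEN = 1 + 1 + 16


def dir_rec_len(name_len, extra=0):
    """Minimum on-disk record length for a name plus optional trailing data."""
    return (DIRENT_HEADER + name_len + extra + DIR_ROUND) & ~DIR_ROUND


def dotdot_fits_in_place(current_rec_len):
    """True if a '..' record of current_rec_len can also hold the FID."""
    return current_rec_len >= dir_rec_len(len(".."), FID_DIRDATA_LEN)


plain = dir_rec_len(len(".."))                       # record size without a FID
with_fid = dir_rec_len(len(".."), FID_DIRDATA_LEN)   # record size with a FID
```

Under these assumptions a bare “..” record needs 12 bytes, while one carrying a FID needs more, so a “..” entry that was packed tightly by a 1.8 server has no room to grow in place.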

      In the lucky case where space is already available in the second entry in a directory, the “..” entry is recreated in the same place with the FID attached. If not, it is created in the next available space of sufficient size. This causes complaints from FSCK, and when FSCK repairs this, it places the updated “..” immediately after “.” again, which causes it to overlap the next entry in the directory block. This entry, which is for a real user-created file and not “.” or “..”, is moved to lost+found.
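The overlap described above can be illustrated with a small Python model. The block layout, the file name `userfile`, and the 28-byte size of the enlarged “..” record are all hypothetical values chosen for illustration; the helper `overlapped_entries` is not part of e2fsck or ldiskfs.

```python
def overlapped_entries(entries, offset, new_rec_len):
    """Names of other entries whose bytes fall inside [offset, offset + new_rec_len)."""
    end = offset + new_rec_len
    return [
        name
        for name, off, rec_len in entries
        if off != offset and off < end and off + rec_len > offset
    ]


# Hypothetical 4096-byte block from a 1.8-era directory: (name, offset, rec_len).
block = [(".", 0, 12), ("..", 12, 12), ("userfile", 24, 4072)]

# Assumption: the repaired ".." record with a FID needs 28 bytes at offset 12.
clobbered = overlapped_entries(block, offset=12, new_rec_len=28)
```

In this model the enlarged “..” record extends past offset 24, so the entry that starts there is the one that ends up displaced.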

      This is because add_dirent_to_buf (used when not in a dx directory) has the same bug as “ldiskfs_update_dotdot”, which was fixed in LU-2638. Because the structure of add_dirent_to_buf is a bit different, the fix looks different as well.
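As a rough illustration of why a generic insertion path can place “..” somewhere other than the second slot, the following Python sketch models a first-fit search over a directory block. It is a simplified stand-in for the kind of scan a non-HTree insert performs, not the actual add_dirent_to_buf code; it ignores unused (inode 0) entries, and the function names and the 28-byte requirement are assumptions.

```python
DIRENT_HEADER = 8   # inode (4) + rec_len (2) + name_len (1) + file_type (1)
DIR_ROUND = 3       # records are padded to a 4-byte boundary


def dir_rec_len(name_len, extra=0):
    """Minimum on-disk record length for a name plus optional trailing data."""
    return (DIRENT_HEADER + name_len + extra + DIR_ROUND) & ~DIR_ROUND


def first_fit(entries, needed):
    """Index of the first entry with at least `needed` bytes of slack, else None.

    Each entry is (name, rec_len, extra_data_len). Slack is the part of
    rec_len not used by the entry's own header, name, and data.
    """
    for i, (name, rec_len, extra) in enumerate(entries):
        used = dir_rec_len(len(name), extra)
        if rec_len - used >= needed:
            return i
    return None


# Hypothetical tightly packed block: "." and ".." have no slack at all.
block = [(".", 12, 0), ("..", 12, 0), ("userfile", 4072, 0)]

# Assumption: a ".." record with a FID needs 28 bytes.
slot = first_fit(block, needed=28)
```

Here the only slack is at the tail of the third record, so a first-fit insert would carve the new “..” out of that space rather than keeping it directly after “.”.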

      I don’t have time at the moment to commit & update the new ldiskfs patch file to Gerrit, but I will do so shortly. In the meantime, I’m attaching the new patch file & the resulting namei.c to this bug.

      The patch is a bit ugly and could probably use improvement, but in my testing, it does fix the bug.

      I'll share replication details in a comment.

      One ‘technical debt’ problem with this patch:
      This patch, and the one for LU-2638, do not simply avoid writing the FID into the “..” entry. In fact, they avoid writing the entire data section onto the “..” entry, so if a pre-existing “..” entry carried anything in its data section other than the FID, that would be lost on directory moves. Currently, it appears that FID-in-dirent is the only user of this extra section.

      Attachments

        1. ext4-data-in-dirent-dotdot-fixes.patch
          2 kB
        2. ll_fix_mdt_lost_found.sh
          1 kB
        3. namei.c
          94 kB
        4. namei.c
          92 kB

          Activity

            [LU-5626] Corruption of MDT “..” entry in non-HTree ldiskfs directories

            adilger Andreas Dilger added a comment -

            Attached is a script that can be used to recover the files from lost+found in this case, when the output from e2fsck is available. The script itself is not ready for distribution use because it hard-codes pathnames to log files and such, but was useful in recovering problematic files.

            This could theoretically also be handled within e2fsck by moving the second entry (if otherwise valid) to another spot in the directory before clobbering it with "..", but I haven't looked at e2fsck to know of the complexity of this (it would have to know where to put the entry to avoid making it unreachable in htree directories, or if the directory doesn't have enough space).

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/12927/
            Subject: LU-5626 ldiskfs: update non-htree dotdot in rename
            Project: fs/lustre-release
            Branch: b2_5
            Current Patch Set:
            Commit: 639aca79b2c87aae2adf16463d50b9318f7429e5

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/12926/
            Subject: LU-5626 ldiskfs: update non-htree dotdot in rename
            Project: fs/lustre-release
            Branch: b2_5
            Current Patch Set:
            Commit: 76a7bae58006e4f6d1c13216df8cedda85e5e911

            gerrit Gerrit Updater added a comment -

            Jian Yu (jian.yu@intel.com) uploaded a new patch: http://review.whamcloud.com/12927
            Subject: LU-5626 ldiskfs: update non-htree dotdot in rename
            Project: fs/lustre-release
            Branch: b2_5
            Current Patch Set: 1
            Commit: 2b4f7f9612742ce7846646f6f50276080fdf398c

            gerrit Gerrit Updater added a comment -

            Jian Yu (jian.yu@intel.com) uploaded a new patch: http://review.whamcloud.com/12926
            Subject: LU-5626 ldiskfs: update non-htree dotdot in rename
            Project: fs/lustre-release
            Branch: b2_5
            Current Patch Set: 1
            Commit: 2caf4d3a14df0974e1d96716f0f376f94be3a4ab

            simmonsja James A Simmons added a comment -

            We should wait until 2.8 to land the ldiskfs series for RHEL7.

            pjones Peter Jones added a comment -

            ok then it sounds like the recent landing to master of the SLES fix means that this can be marked as resolved for 2.7

            bogl Bob Glossman (Inactive) added a comment -

            Already talked to Yang Sheng about el7. The latest refresh of http://review.whamcloud.com/#/c/10249 includes changes that are supposed to address the issue. Hopefully the el7 ldiskfs patches will be landing soon.

            adilger Andreas Dilger added a comment -

            The SLES 11 SP3 patch has landed; one is still needed for the RHEL 7 ldiskfs series in LU-5022. I think it is fine to land the current ldiskfs series and then add this patch separately.

            bogl Bob Glossman (Inactive) added a comment -

            Patched namei.c from sles11sp3, as requested by a comment in http://review.whamcloud.com/#/c/12585. To be absolutely clear, this is the ldiskfs/namei.c after the full, updated patch series is applied, not just the one refreshed data-in-dirent patch.

            People

              bogl Bob Glossman (Inactive)
              paf Patrick Farrell (Inactive)