[LU-5626] Corruption of MDT “..” entry in non-HTree ldiskfs directories - Whamcloud Community JIRA

Details

Type: Bug
Resolution: Fixed
Priority: Blocker
Fix Version/s: Lustre 2.7.0, Lustre 2.5.4
Affects Version/s: Lustre 2.7.0, Lustre 2.4.3, Lustre 2.5.3
Labels:
- e2fsck
- e2fsprogs
- patch
Environment:
Lustre 2.4 or newer, file system upgraded from 1.8

Severity:
3
Rank (Obsolete):
15741

Description

~~LU-2638~~ reported directory entry corruption related to FID-in-dirent and the “..” entry in HTree directories.
We have since discovered an identical problem in non-HTree directories.

This is essentially the same problem, but it manifests slightly differently in non-HTree directories. The “..” entry must remain the second entry in the directory block (FSCK demands this). When a directory created under 1.8 (now on a 2.4+ server with dirdata enabled) is moved to a new parent, the “..” entry is updated. Exactly as happened in ~~LU-2638~~, the FID is added to the “..” entry without regard to whether there is sufficient space in the second position in the directory block.
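The space problem can be sketched with the classic ext4 directory-entry arithmetic: an 8-byte header plus the name, rounded up to a multiple of 4. This is only an illustration, not the ldiskfs code. The helper names are hypothetical, and the 18-byte dirdata size (a NUL after the name, a length byte, and a 16-byte FID) is an assumption that should be checked against the data-in-dirent patch.

```python
# Sketch only: models the size of a directory entry, not the kernel code.
DIRENT_HEADER = 8  # inode (4) + rec_len (2) + name_len (1) + file_type (1)
ALIGN = 4          # rec_len must be a multiple of 4

# ASSUMPTION: dirdata for a FID = 1 NUL after the name + 1 length byte
# + 16-byte FID. Verify against the ldiskfs data-in-dirent patch.
ASSUMED_FID_DATA_LEN = 18


def min_rec_len(name_len: int, data_len: int = 0) -> int:
    """Smallest aligned record that holds the name plus optional dirdata."""
    raw = DIRENT_HEADER + name_len + data_len
    return (raw + ALIGN - 1) // ALIGN * ALIGN


def dotdot_fits(slot_rec_len: int, data_len: int) -> bool:
    """True if a '..' entry with data_len bytes of dirdata fits in the slot."""
    return min_rec_len(len(".."), data_len) <= slot_rec_len


plain_dotdot = min_rec_len(len(".."))
fid_dotdot = min_rec_len(len(".."), ASSUMED_FID_DATA_LEN)
print("plain '..' needs", plain_dotdot, "bytes")
print("'..' with FID needs", fid_dotdot, "bytes (under the assumption above)")
print("fits in a minimal slot:", dotdot_fits(plain_dotdot, ASSUMED_FID_DATA_LEN))
```

Under these assumptions, a “..” slot that was trimmed to its minimum size when the next entry was added cannot hold the FID, which is the case the unpatched code did not check for.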

In the lucky case where space is already available in the second entry in the directory, the “..” entry is recreated in the same place with the FID attached. If not, it is created in the next available space of sufficient size. This causes complaints from FSCK, and when FSCK repairs this, it places the updated “..” immediately after “.” again, which causes it to overlap the next entry in the directory block. That entry, which is for a real user-created file rather than “.” or “..”, is moved to lost+found.
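To make the layout rule concrete, here is a small sketch that walks a directory block and checks whether “..” is the second entry. It assumes the standard ext4 dirent header (little-endian inode, rec_len, name_len, file_type) and uses a toy 64-byte block. The function names, inode numbers, and block contents are made up for illustration, and this is not how e2fsck itself is implemented.

```python
import struct

# Standard ext4 dirent header: inode (le32), rec_len (le16),
# name_len (u8), file_type (u8). The name follows the header.
DIRENT_HDR = struct.Struct("<IHBB")


def pack_entry(inode: int, rec_len: int, name: bytes, file_type: int = 2) -> bytes:
    """Build one directory entry padded with zeros out to rec_len bytes."""
    body = DIRENT_HDR.pack(inode, rec_len, len(name), file_type) + name
    return body.ljust(rec_len, b"\0")


def walk_block(block: bytes):
    """Yield (offset, inode, rec_len, name) for each entry in the block."""
    offset = 0
    while offset + DIRENT_HDR.size <= len(block):
        inode, rec_len, name_len, _ftype = DIRENT_HDR.unpack_from(block, offset)
        if rec_len < DIRENT_HDR.size or rec_len % 4 != 0:
            raise ValueError(f"bad rec_len {rec_len} at offset {offset}")
        start = offset + DIRENT_HDR.size
        yield offset, inode, rec_len, block[start:start + name_len]
        offset += rec_len


def dotdot_is_second(block: bytes) -> bool:
    """True if the first two entries are '.' and '..', in that order."""
    names = [name for _off, _ino, _len, name in walk_block(block)]
    return names[:2] == [b".", b".."]


# Hypothetical 64-byte blocks. 'good' has '..' directly after '.';
# 'bad' has '..' re-created after a user file, as described above.
good = pack_entry(11, 12, b".") + pack_entry(2, 12, b"..") + pack_entry(12, 40, b"file_a", 1)
bad = pack_entry(11, 24, b".") + pack_entry(12, 16, b"file_a", 1) + pack_entry(2, 24, b"..")

print(dotdot_is_second(good))
print(dotdot_is_second(bad))
```

In the `bad` block the entries are still individually well-formed, so the damage is only visible to a check that cares about the position of “..”.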

This is because add_dirent_to_buf (used when not in a dx directory) has the same bug as “ldiskfs_update_dotdot”, which was fixed in ~~LU-2638~~. Because the structure of add_dirent_to_buf is a bit different, the fix looks different as well.

I don’t have time at the moment to commit & update the new ldiskfs patch file to Gerrit, but I will do so shortly. In the meantime, I’m attaching the new patch file & the resulting namei.c to this bug.

The patch is a bit ugly and could probably use improvement, but in my testing, it does fix the bug.

I'll share replication details in a comment.

One ‘technical debt’ problem with this patch:
This patch, and the one for ~~LU-2638~~, do not simply avoid writing the FID into the “..” entry. In fact, they avoid writing the entire data section into the “..” entry, so if a pre-existing “..” entry carried anything in its data section other than the FID, that would be lost on directory moves. Currently, it appears that FID-in-dirent is the only user of this extra section.

Attachments

- ext4-data-in-dirent-dotdot-fixes.patch (2 kB, 15/Sep/14 9:27 PM)
- ll_fix_mdt_lost_found.sh (1 kB, 04/Nov/15 2:53 AM)
- namei.c (94 kB, 06/Nov/14 5:37 PM)
- namei.c (92 kB, 15/Sep/14 9:27 PM)

Issue Links

is related to

LU-7399 e2fsck repair of ".." entries from LU-5626 should be improved

Open

is related to

LU-5022 support for 3.10 rhel7 linux kernel

Resolved

Activity

Bob Glossman (Inactive) added a comment - 07/Nov/14 7:03 PM

Already talked to Yang Sheng about el7. The latest refresh of http://review.whamcloud.com/#/c/10249 includes changes that are supposed to address the issue. Hopefully the el7 ldiskfs patches will be landing soon.

Andreas Dilger added a comment - 07/Nov/14 6:59 PM

SLES 11 SP3 patch landed, need one for RHEL 7 ldiskfs series in ~~LU-5022~~. I think it is fine to land the current ldiskfs series and then add this patch separately.

Bob Glossman (Inactive) added a comment - 06/Nov/14 5:37 PM

Patched namei.c from sles11sp3, as requested by a comment in http://review.whamcloud.com/#/c/12585. To be absolutely clear, this is the ldiskfs/namei.c after the full, updated patch series is applied, not just the one refreshed data-in-dirent patch.

Bob Glossman (Inactive) added a comment - 05/Nov/14 6:17 PM

for sles11sp3:
http://review.whamcloud.com/12585

James A Simmons added a comment - 03/Nov/14 5:29 PM

Looks like we will need this for SLES11 SP3 as well.

James A Simmons added a comment - 31/Oct/14 5:40 PM

To clarify, RHEL7 and SLES12 server-side support are both 2.8 features, so this ticket is safe to close. We just need to integrate this work into those distros for 2.8.

James A Simmons added a comment - 30/Oct/14 10:23 AM

Now these changes need to be integrated into the upcoming SLES12 and RHEL7 ldiskfs work.

nasf (Inactive) added a comment - 30/Oct/14 3:10 AM

James, do you mean that ldiskfs/kernel_patches/patches/sles11sp2/ext4-data-in-dirent.patch should be updated? Or anything else?

James A Simmons added a comment - 22/Oct/14 4:39 PM

Once this lands the ldiskfs patches for SLES12 and RHEL7 will need to be updated.

Patrick Farrell (Inactive) added a comment - 17/Sep/14 10:05 PM - edited

We also saw file systems taking errors and getting remounted read-only, and we were initially unable to figure out why. It turns out that when a damaged non-HTree directory (i.e., one with the ".." entry in the wrong place) is converted to an HTree directory, the conversion goes wrong and the resulting directory is badly corrupted.

With the patch for this bug, it's no longer possible to get into the bad state. I thought I'd share the errors here so others who hit this bug have a better chance of finding this JIRA ticket.

Here's what the resulting errors look like - The key thing is "rec_len=2049", which we've always seen in this situation:
LDISKFS-fs error (device sdi): ldiskfs_dx_find_entry: bad entry in directory #32789: rec_len % 4 != 0 - block=16755offset=24(24), inode=0, rec_len=2049, name_len=0
Aborting journal on device sdi-8.
LDISKFS-fs (sdi): Remounting filesystem read-only
LDISKFS-fs error (device sdi): ldiskfs_dx_find_entry: bad entry in directory #32789: rec_len % 4 != 0 - block=16755offset=24(24), inode=0, rec_len=2049, name_len=0
Lustre: 2208:0:(mdd_dir.c:2926:mdd_rename()) cent5602-MDD0000: sp obj dotdot delete error: rc = -2
Lustre: 2208:0:(mdd_dir.c:2933:mdd_rename()) cent5602-MDD0000: sp obj dotdot insert error: rc = -30
LDISKFS-fs error (device sdi) in add_dirent_to_buf: Journal has aborted
Lustre: 2208:0:(mdd_dir.c:2942:mdd_rename()) sp obj fix error: rc = -30
LustreError: 2208:0:(osd_io.c:1595:osd_ldiskfs_write_record()) journal_get_write_access() returned error -30
LustreError: 2208:0:(osd_handler.c:1126:osd_trans_stop()) Failure in transaction hook: -30
LustreError: 2208:0:(osd_handler.c:1135:osd_trans_stop()) Failure to stop transaction: -30
LustreError: 2205:0:(osd_handler.c:910:osd_trans_commit_cb()) transaction @0xffff88022e004c00 commit error: 2
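For anyone scanning console logs for this signature, the following is a rough sketch of a parser for the "bad entry in directory" line shown above. The regular expression is written against that one sample line only, the function names are hypothetical, and other kernel versions may format the message differently. The errno lookup uses the host's errno table, which matches the kernel's numbering only when run on Linux.

```python
import errno
import re

# Written against the sample line in this comment; treat as a starting point.
BAD_ENTRY = re.compile(
    r"LDISKFS-fs error \(device (?P<device>[^)]+)\): (?P<function>\w+): "
    r"bad entry in directory #(?P<dir_inode>\d+): (?P<reason>.*?) - "
    r".*?rec_len=(?P<rec_len>\d+), name_len=(?P<name_len>\d+)"
)


def parse_bad_entry(line: str):
    """Return the fields of a 'bad entry in directory' line, or None."""
    match = BAD_ENTRY.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("dir_inode", "rec_len", "name_len"):
        fields[key] = int(fields[key])
    return fields


def errno_name(rc: int) -> str:
    """Map a return code such as -30 to its symbolic name on this host."""
    return errno.errorcode.get(abs(rc), "UNKNOWN")


sample = (
    "LDISKFS-fs error (device sdi): ldiskfs_dx_find_entry: bad entry in "
    "directory #32789: rec_len % 4 != 0 - block=16755offset=24(24), "
    "inode=0, rec_len=2049, name_len=0"
)
entry = parse_bad_entry(sample)
print(entry["device"], entry["dir_inode"], entry["rec_len"])
print(errno_name(-30), errno_name(-2))
```

The directory inode number from the parsed line identifies which directory to inspect on the MDT.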

Patrick Farrell (Inactive) added a comment - 16/Sep/14 3:45 PM

Patch is here:
http://review.whamcloud.com/11939

Local testing suggests this resolves the issue.

People

Assignee: Bob Glossman (Inactive)

Reporter: Patrick Farrell (Inactive)

Votes: 0

Watchers: 12

Dates

Created: 15/Sep/14 9:20 PM

Updated: 02/Dec/15 1:51 PM

Resolved: 07/Nov/14 7:55 PM