[LU-7261] EA list corruption Created: 07/Oct/15  Updated: 01/Jul/16  Resolved: 11/Nov/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.0, Lustre 2.6.0, Lustre 2.7.0, Lustre 2.8.0
Fix Version/s: Lustre 2.8.0

Type: Bug Priority: Minor
Reporter: Alexey Lyashkov Assignee: Andreas Dilger
Resolution: Fixed Votes: 0
Labels: None
Environment:

RHEL6 + lustre/master


Issue Links:
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

After some customer requests I reviewed the large EA patch and found that it leaves a corrupted EA list on disk. It is easy to reproduce: leave the test file on disk and run e2fsck / debugfs over it.
The following patch can be used to keep the test file:

bash-3.2$ git diff
diff --git a/lustre/tests/sanity.sh b/lustre/tests/sanity.sh
index 824ba8f..620a626e 100644
--- a/lustre/tests/sanity.sh
+++ b/lustre/tests/sanity.sh
@@ -6810,7 +6810,7 @@ grow_xattr() {
        [[ "$new" != "$orig" ]] && error "$xbig different after growing $xsml"
        log "$xbig still valid after growing $xsml"
 
-       rm -f $file
+#      rm -f $file
 }
 
 test_102h() { # bug 15777
diff --git a/lustre/tests/test-framework.sh b/lustre/tests/test-framework.sh
index be6d1ec..d1e91fb 100755
--- a/lustre/tests/test-framework.sh
+++ b/lustre/tests/test-framework.sh
@@ -4325,8 +4325,8 @@ check_and_cleanup_lustre() {
     fi
 
        if is_mounted $MOUNT; then
-               [ -n "$DIR" ] && rm -rf $DIR/[Rdfs][0-9]* ||
-                       error "remove sub-test dirs failed"
+#              [ -n "$DIR" ] && rm -rf $DIR/[Rdfs][0-9]* ||
+#                      error "remove sub-test dirs failed"
                [ "$ENABLE_QUOTA" ] && restore_quota || true
        fi

debugfs / e2fsck output:

]# /Users/shadow/work/lustre/work/WorkQ/CLSTR-4851/e2fsprogs/debugfs/debugfs -R "stat ROOT/f102ha.sanity" /tmp/lustre-mdt1
debugfs 1.42.12.x1 (03-Apr-2015)
Inode: 133   Type: regular    Mode:  0644   Flags: 0x0
Generation: 1408916363    Version: 0x00000001:00000010
User:     0   Group:     0   Size: 0
File ACL: 0    Directory ACL: 0
Links: 1   Blockcount: 0
Fragment:  Address: 0    Number: 0    Size: 0
 ctime: 0x5614a1b5:00000000 -- Wed Oct  7 07:38:13 2015
 atime: 0x5614a1a8:00000000 -- Wed Oct  7 07:38:00 2015
 mtime: 0x5614a1a8:00000000 -- Wed Oct  7 07:38:00 2015
crtime: 0x5614a1a8:def3d250 -- Wed Oct  7 07:38:00 2015
Size of extra inode fields: 28
Extended attributes stored in inode body: 
  lma = "00 00 00 00 00 00 00 00 01 04 00 00 02 00 00 00 03 00 00 00 00 00 00 00 " (24)
  lma: fid=[0x200000401:0x3:0x0] compat=0 incompat=0
  lov = "d0 0b d1 0b 01 00 00 00 03 00 00 00 00 00 00 00 01 04 00 00 02 00 00 00 00 00 10 00 02 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 " (80)
  link = "df f1 ea 11 01 00 00 00 37 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 1f 00 00 00 02 00 00 00 07 00 00 00 01 00 00 00 00 66 31 30 32 68 61 2e 73 61 6e 69 74 79 " (55)
1. invalid EA entry in inode -> big
BLOCKS:
e2fsck 1.42.12.x1 (03-Apr-2015)
Pass 1: Checking inodes, blocks, and sizes
Extended attribute in inode 133 has a value size (65536) which is invalid
Clear? yes

Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Unattached inode 134
Connect to /lost+found? yes

Unattached inode 135
Connect to /lost+found? yes

Pass 5: Checking group summary information

One note: e2fsck 1.42.3.wc3 (15-Aug-2012) does not detect this EA corruption.

Root cause of the bug: when the data offsets are updated after an EA record changes, entries holding a large EA (e_value_inum != 0) are not skipped.
The fix is simple:

-@@ -605,13 +883,17 @@ ext4_xattr_set_entry(struct ext4_xattr_i
+@@ -606,13 +884,18 @@ ext4_xattr_set_entry(struct ext4_xattr_i
                        last = s->first;
                        while (!IS_LAST_ENTRY(last)) {
                                size_t o = le16_to_cpu(last->e_value_offs);
 -                              if (!last->e_value_block &&
 -                                  last->e_value_size && o < offs)
-+                              if (last->e_value_size > 0 && o < offs)
++                              if ((last->e_value_size > 0 && o < offs) 
++                                   && last->e_value_inum == 0)
                                        last->e_value_offs =
                                                cpu_to_le16(o + size);
                                last = EXT4_XATTR_NEXT(last);

However, I am not able to submit it because I have no Gerrit login after the OAuth changes.
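
To illustrate the offset update that the fix touches, here is a minimal standalone sketch in C. It is not the ldiskfs code: struct ea_entry, shift_value_offsets() and the skip_ea_inode flag are hypothetical simplifications (host-endian fields, a flat array instead of the on-disk entry list), and the premise that a value kept in a separate EA inode has no in-place offset to adjust is an assumption taken from the e_value_inum check in the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct ext4_xattr_entry. */
struct ea_entry {
	uint16_t value_offs;	/* offset of an in-place value in the EA area */
	uint32_t value_inum;	/* nonzero: value is kept in a separate EA inode */
	uint32_t value_size;	/* value length in bytes */
};

/*
 * After `size` bytes of value data at offset `offs` are removed, the
 * values stored below `offs` move up by `size`, so their recorded
 * offsets must be increased to match.
 *
 * With skip_ea_inode == true, entries whose value is in a separate
 * inode are left alone (the behaviour of the proposed fix).  With
 * skip_ea_inode == false, they are shifted as well, which models the
 * corruption described in this ticket.
 */
static void shift_value_offsets(struct ea_entry *e, size_t n,
				uint16_t offs, uint16_t size,
				bool skip_ea_inode)
{
	assert(e != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (skip_ea_inode && e[i].value_inum != 0)
			continue;
		if (e[i].value_size > 0 && e[i].value_offs < offs)
			e[i].value_offs = (uint16_t)(e[i].value_offs + size);
	}
}
```

Calling shift_value_offsets() with skip_ea_inode set to false on an entry that has value_inum != 0 and value_offs == 0 changes its offset, while the same call with true leaves it untouched.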



 Comments   
Comment by Gerrit Updater [ 09/Oct/15 ]

Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: http://review.whamcloud.com/16777
Subject: LU-7261 ldiskfs: fix large_xattr overwrite
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 892aa251a50d7777eb759f73520bc19de4ae51fb

Comment by Gerrit Updater [ 09/Oct/15 ]

Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: http://review.whamcloud.com/16778
Subject: LU-7261 ldiskfs: clean up code style for large_xattr
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: e7083690d7b292de287ed579bfa9f05d18ec6bd2

Comment by Gerrit Updater [ 21/Oct/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/16777/
Subject: LU-7261 ldiskfs: fix large_xattr overwrite
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 66ca2bc59135b00cd20a4e5095a23cf54cdfa2eb

Comment by Andreas Dilger [ 22/Oct/15 ]

Have reduced priority after main fix patch has landed. Remaining patch is cleanup and no longer a blocker, but bug shouldn't be closed until it is landed (hopefully still for 2.8.0 to avoid having to fork the bug just to track that patch).

Comment by Gerrit Updater [ 11/Nov/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/16778/
Subject: LU-7261 ldiskfs: clean up code style for large_xattr
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: ab07d78bec9bc127d1a2240b7e8c16e52f117b41

Comment by Joseph Gmitter (Inactive) [ 11/Nov/15 ]

Landed for 2.8
