[LU-6757] conf-sanity test_61: trusted.big different after saving Created: 23/Jun/15  Updated: 25/Feb/16  Resolved: 26/Aug/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: Lustre 2.8.0

Type: Bug Priority: Critical
Reporter: Maloo Assignee: Jian Yu
Resolution: Fixed Votes: 0
Labels: None
Environment:

server: lustre-master build # 3071 EL7 ldiskfs
client: lustre-master build #3071 EL6


Issue Links:
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

This issue was created by maloo for sarah_lw <wei3.liu@intel.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/74ee0e8e-13d8-11e5-b4b0-5254006e85c2.

The sub-test test_61 failed with the following error:

trusted.big different after saving

MDS dmesg

[10669.620927] Lustre: DEBUG MARKER: /usr/sbin/lctl mark save large xattr trusted.big on \/mnt\/lustre\/f61.conf-sanity
[10669.768885] Lustre: DEBUG MARKER: save large xattr trusted.big on /mnt/lustre/f61.conf-sanity
[10669.796354] ------------[ cut here ]------------
[10669.801490] WARNING: at /var/lib/jenkins/workspace/lustre-master/arch/x86_64/build_type/server/distro/el7/ib_stack/inkernel/BUILD/BUILD/lustre-2.7.55/ldiskfs/ldiskfs.h:2631 ldiskfs_xattr_set_entry+0x7c4/0x7f0 [ldiskfs]()
[10669.809597] Modules linked in: osp(OF) mdd(OF) lod(OF) mdt(OF) lfsck(OF) mgs(OF) mgc(OF) osd_ldiskfs(OF) lquota(OF) fid(OF) fld(OF) ksocklnd(OF) ptlrpc(OF) obdclass(OF) lnet(OF) libcfs(OF) ldiskfs(OF) sha512_generic dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd fscache xprtrdma sunrpc ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ppdev parport_pc virtio_balloon serio_raw pcspkr parport i2c_piix4 ib_core ib_addr ext4 mbcache jbd2 ata_generic pata_acpi virtio_blk cirrus syscopyarea sysfillrect ata_piix sysimgblt drm_kms_helper virtio_pci virtio_ring virtio ttm 8139too drm 8139cp i2c_core mii libata floppy [last unloaded: libcfs]
[10669.827111] CPU: 0 PID: 7739 Comm: mdt00_002 Tainted: GF          O--------------   3.10.0-229.4.2.el7_lustre.x86_64 #1
[10669.829588] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[10669.831727]  0000000000000000 00000000510e8244 ffff88006cd67710 ffffffff816050da
[10669.834182]  ffff88006cd67748 ffffffff8106e34b ffff8800655e9e88 ffff88007ac63cf0
[10669.836691]  ffff88006cd67898 0000000000010000 ffff88006571c068 ffff88006cd67758
[10669.839122] Call Trace:
[10669.841102]  [<ffffffff816050da>] dump_stack+0x19/0x1b
[10669.843216]  [<ffffffff8106e34b>] warn_slowpath_common+0x6b/0xb0
[10669.845424]  [<ffffffff8106e49a>] warn_slowpath_null+0x1a/0x20
[10669.847584]  [<ffffffffa05db334>] ldiskfs_xattr_set_entry+0x7c4/0x7f0 [ldiskfs]
[10669.849899]  [<ffffffffa05dbdd5>] ldiskfs_xattr_ibody_set+0x35/0x90 [ldiskfs]
[10669.852135]  [<ffffffffa05dc1c4>] ldiskfs_xattr_set_handle+0x1a4/0x500 [ldiskfs]
[10669.854442]  [<ffffffffa05dc666>] ldiskfs_xattr_set+0x146/0x1c0 [ldiskfs]
[10669.856627]  [<ffffffffa0607f0d>] ldiskfs_xattr_trusted_set+0x2d/0x30 [ldiskfs]
[10669.858871]  [<ffffffff811eb4b2>] generic_setxattr+0x62/0x80
[10669.860919]  [<ffffffffa0ba62cc>] osd_xattr_set+0x3ec/0x700 [osd_ldiskfs]
[10669.863083]  [<ffffffffa0e29187>] lod_sub_object_xattr_set+0x227/0x460 [lod]
[10669.865182]  [<ffffffffa0e150d9>] lod_xattr_set_internal+0xb9/0x2d0 [lod]
[10669.867329]  [<ffffffffa0e20fb6>] lod_xattr_set+0x136/0x3f0 [lod]
[10669.869422]  [<ffffffffa0e83673>] mdd_xattr_set+0x483/0x720 [mdd]
[10669.872412]  [<ffffffffa0d453c7>] ? mdt_version_save+0x67/0x120 [mdt]
[10669.875322]  [<ffffffffa0d51b57>] mdt_reint_setxattr+0x967/0xf90 [mdt]
[10669.878297]  [<ffffffffa0d4f1d0>] mdt_reint_rec+0x80/0x210 [mdt]
[10669.880537]  [<ffffffffa0d32e6c>] mdt_reint_internal+0x58c/0x780 [mdt]
[10669.882681]  [<ffffffffa0d3c167>] mdt_reint+0x67/0x140 [mdt]
[10669.884691]  [<ffffffffa0a047e5>] tgt_request_handle+0x6d5/0x1060 [ptlrpc]
[10669.886831]  [<ffffffffa09b472b>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[10669.888956]  [<ffffffffa09b22a8>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[10669.891045]  [<ffffffffa0561967>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
[10669.893039]  [<ffffffffa09b8918>] ptlrpc_main+0xaf8/0x1ea0 [ptlrpc]
[10669.895022]  [<ffffffff810ad8b6>] ? __dequeue_entity+0x26/0x40
[10669.896880]  [<ffffffffa09b7e20>] ? ptlrpc_register_service+0xf00/0xf00 [ptlrpc]
[10669.898876]  [<ffffffff8109739f>] kthread+0xcf/0xe0
[10669.900601]  [<ffffffff810972d0>] ? kthread_create_on_node+0x140/0x140
[10669.902499]  [<ffffffff81614f7c>] ret_from_fork+0x7c/0xb0
[10669.904251]  [<ffffffff810972d0>] ? kthread_create_on_node+0x140/0x140
[10669.906128] ---[ end trace 28efa5c9df746bee ]---
[10669.911949] LustreError: 7739:0:(mdt_handler.c:891:mdt_getattr_internal()) lustre-MDT0000: getattr error for [0x200000401:0x1:0x0]: rc = -5
[10670.039079] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  conf-sanity test_61: @@@@@@ FAIL: trusted.big different after saving 
[10670.184684] Lustre: DEBUG MARKER: conf-sanity test_61: @@@@@@ FAIL: trusted.big different after saving


 Comments   
Comment by Peter Jones [ 03/Jul/15 ]

Jian

Could you please look into this issue?

Thanks

Peter

Comment by Jian Yu [ 12/Aug/15 ]

On client:

getfattr: /mnt/lustre/f61.conf-sanity: Input/output error
00000002:00000001:0.0:1439363436.401810:0:5372:0:(mdc_locks.c:571:mdc_finish_enqueue()) Process entered
00000002:00100000:0.0:1439363436.401812:0:5372:0:(mdc_locks.c:625:mdc_finish_enqueue()) @@@ op: 8 disposition: b, status: -5  req@ffff8800796ed800 x1509281914421388/t0(0) o101->lustre-MDT0000-mdc-ffff880078fbf000@10.100.4.86@tcp:12/10 lens 584/536 e 0 to 0 dl 1439363443 ref 1 fl Complete:R/0/0 rc 301/301
<~snip~>
00000002:00000001:0.0:1439363436.401820:0:5372:0:(mdc_locks.c:955:mdc_finish_intent_lock()) Process entered
00000002:00000001:0.0:1439363436.401821:0:5372:0:(mdc_locks.c:979:mdc_finish_intent_lock()) Process leaving (rc=18446744073709551611 : -5 : fffffffffffffffb)
00000002:00000001:0.0:1439363436.401822:0:5372:0:(mdc_locks.c:1182:mdc_intent_lock()) Process leaving (rc=18446744073709551611 : -5 : fffffffffffffffb)
00800000:00000001:0.0:1439363436.401823:0:5372:0:(obd_class.h:1535:md_intent_lock()) Process leaving (rc=18446744073709551611 : -5 : fffffffffffffffb)
00800000:00000001:0.0:1439363436.401824:0:5372:0:(lmv_intent.c:410:lmv_intent_lookup()) Process leaving (rc=18446744073709551611 : -5 : fffffffffffffffb)
00800000:00000001:0.0:1439363436.401825:0:5372:0:(lmv_intent.c:530:lmv_intent_lock()) Process leaving (rc=18446744073709551611 : -5 : fffffffffffffffb)
00000080:00000001:0.0:1439363436.401826:0:5372:0:(obd_class.h:1535:md_intent_lock()) Process leaving (rc=18446744073709551611 : -5 : fffffffffffffffb)
00000080:00000001:0.0:1439363436.401827:0:5372:0:(namei.c:590:ll_lookup_it()) Process leaving via out (rc=18446744073709551611 : -5 : 0xfffffffffffffffb)

On MDS:

00000004:00000001:0.0:1439363436.853113:0:26612:0:(lod_object.c:1355:lod_xattr_get()) Process entered
00000004:00000001:0.0:1439363436.853119:0:26612:0:(lod_object.c:1395:lod_xattr_get()) Process leaving (rc=18446744073709551611 : -5 : fffffffffffffffb)
00000004:00000001:0.0:1439363436.853121:0:26612:0:(mdd_object.c:244:mdd_xattr_get()) Process leaving (rc=18446744073709551611 : -5 : fffffffffffffffb)
00000004:00000001:0.0:1439363436.853122:0:26612:0:(mdt_handler.c:802:mdt_attr_get_complex()) Process leaving via out (rc=18446744073709551611 : -5 : 0xfffffffffffffffb)
00000004:00000002:0.0:1439363436.853123:0:26612:0:(mdt_handler.c:861:mdt_attr_get_complex()) after getattr rc = -5, ma_valid = 0x1 ma_lmm=ffff88006982a340
00000004:00000001:0.0:1439363436.853124:0:26612:0:(mdt_handler.c:862:mdt_attr_get_complex()) Process leaving (rc=18446744073709551611 : -5 : fffffffffffffffb)
00000004:00020000:0.0:1439363436.853125:0:26612:0:(mdt_handler.c:951:mdt_getattr_internal()) lustre-MDT0000: getattr error for [0x200000400:0x1:0x0]: rc = -5
00000004:00000001:0.0:1439363436.856913:0:26612:0:(mdt_handler.c:952:mdt_getattr_internal()) Process leaving (rc=18446744073709551611 : -5 : fffffffffffffffb)

Still investigate.

Comment by Jian Yu [ 15/Aug/15 ]
# Get the maximum xattr size supported by the filesystem.
max_xattr_size() {      
    local size  
        
    if large_xattr_enabled; then
        # include/linux/limits.h: #define XATTR_SIZE_MAX 65536
        size=65536
    else
        local mds_dev=$(mdsdevname ${SINGLEMDS//mds/})
        local block_size=$(get_block_size $SINGLEMDS $mds_dev)

        # maximum xattr size = size of block - size of header -
        #                      size of 1 entry - 4 null bytes
        size=$((block_size - 32 - 32 - 4))
    fi
    
    echo $size
}

Testing showed that after large_xattr was enabled, the test would fail on RHEL 7.1 while the maximum xattr size exceeded $((block_size - 32 - 32 - 4)). So, large xattr did not work.

I'm looking into ldiskfs_xattr_set_entry().

Comment by Jian Yu [ 18/Aug/15 ]

The problematic source code is:

# gdb "$(modinfo -n ldiskfs)"
<~snip~>
Reading symbols from /usr/lib/modules/3.10.0-229.4.2.el7_lustre.x86_64/extra/kernel/fs/lustre/ldiskfs.ko...Reading symbols from /usr/lib/debug/lib/modules/3.10.0-229.4.2.el7_lustre.x86_64/extra/kernel/fs/lustre/ldiskfs.ko.debug...done.
done.
(gdb) list *(ldiskfs_xattr_set_entry+0x7c4)
0x1e244 is in ldiskfs_xattr_set_entry (/usr/src/debug/lustre-2.7.0/ldiskfs/ldiskfs.h:2637).
2632    #endif
2633
2634    /* Update i_disksize. Requires i_mutex to avoid races with truncate */
2635    static inline void ldiskfs_update_i_disksize(struct inode *inode, loff_t newsize)
2636    {
2637            WARN_ON_ONCE(S_ISREG(inode->i_mode) &&
2638                         !mutex_is_locked(&inode->i_mutex));
2639            down_write(&LDISKFS_I(inode)->i_data_sem);
2640            if (newsize > LDISKFS_I(inode)->i_disksize)
2641                    LDISKFS_I(inode)->i_disksize = newsize;
Comment by Jian Yu [ 18/Aug/15 ]

The i_mutex lock needs to be acquired before calling ext4_update_i_disksize() in xattr.c.
I'm creating the patch.

Comment by Gerrit Updater [ 18/Aug/15 ]

Jian Yu (jian.yu@intel.com) uploaded a new patch: http://review.whamcloud.com/16012
Subject: LU-6757 ldiskfs: acquire i_mutex lock for ext4_update_i_disksize
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 2dd614101cc3e7c5d0b8c5cc6d88e1c07fa55141

Comment by Jian Yu [ 24/Aug/15 ]

The root cause of lod_xattr_get() returning "-5" (-EIO) error is related to the following changes in ext4_xattr_check_names() for the RHEL 7.1 kernel version 3.10.0:

author	Darrick J. Wong <darrick.wong@oracle.com>	2014-09-16 18:34:59 (GMT)
committer	Theodore Ts'o <tytso@mit.edu>	2014-09-16 18:34:59 (GMT)
commit	a0626e75954078cfacddb00a4545dde821170bc5 (patch)
tree	61aeb6747c5fc7016ac54cc704fb6d9b1833674d
parent	684de5748660e16e185754697ac0afa9e18297f6 (diff)
ext4: check EA value offset when loading
When loading extended attributes, check each entry's value offset to
make sure it doesn't collide with the entries.

Without this check it is easy to crash the kernel by mounting a
malicious FS containing a file with an EA wherein e_value_offs = 0 and
e_value_size > 0 and then deleting the EA, which corrupts the name
list.

(See the f_ea_value_crash test's FS image in e2fsprogs for an example.)

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index da4df70..42823ab 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -190,14 +190,28 @@ ext4_listxattr(struct dentry *dentry, char *buffer, size_t size)
 }
 
 static int
-ext4_xattr_check_names(struct ext4_xattr_entry *entry, void *end)
+ext4_xattr_check_names(struct ext4_xattr_entry *entry, void *end,
+		       void *value_start)
 {
-	while (!IS_LAST_ENTRY(entry)) {
-		struct ext4_xattr_entry *next = EXT4_XATTR_NEXT(entry);
+	struct ext4_xattr_entry *e = entry;
+
+	while (!IS_LAST_ENTRY(e)) {
+		struct ext4_xattr_entry *next = EXT4_XATTR_NEXT(e);
 		if ((void *)next >= end)
 			return -EIO;
-		entry = next;
+		e = next;
 	}
+
+	while (!IS_LAST_ENTRY(entry)) {
+		if (entry->e_value_size != 0 &&
+		    (value_start + le16_to_cpu(entry->e_value_offs) <
+		     (void *)e + sizeof(__u32) ||
+		     value_start + le16_to_cpu(entry->e_value_offs) +
+		    le32_to_cpu(entry->e_value_size) > end))
+			return -EIO;{color}                                               ------------> here
+		entry = EXT4_XATTR_NEXT(entry);
+	}
+
 	return 0;
 }

While getting xattr, in ext4_xattr_ibody_get():

        end = (void *)raw_inode + EXT4_SB(inode->i_sb)->s_inode_size;
        error = ext4_xattr_check_names(entry, end, entry);

For large EA, value_start + le16_to_cpu(entry->e_value_offs) + le32_to_cpu(entry->e_value_size) > end will fail, which causes ext4_xattr_check_names() return "-EIO".

I fixed the above failure by checking e_value_inum to see whether the large EA value is saved in an external EA inode or not. Testing showed that conf-sanity test 61 passed on RHEL 7.1 distro.

Comment by Gerrit Updater [ 26/Aug/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/16012/
Subject: LU-6757 ldiskfs: acquire i_mutex lock for ext4_update_i_disksize
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 7417966b6a2ebf56a49dee327f1561f63bf9224b

Comment by Joseph Gmitter (Inactive) [ 26/Aug/15 ]

Landed for 2.8.

Comment by Gerrit Updater [ 15/Feb/16 ]

Bob Glossman (bob.glossman@intel.com) uploaded a new patch: http://review.whamcloud.com/18449
Subject: LU-6757 ldiskfs: large EA support
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 813580ebe4a08dddf34eaa057c16832d4be46682

Comment by Gerrit Updater [ 25/Feb/16 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/18449/
Subject: LU-6757 ldiskfs: large EA support
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 721e5ca9b7c1e00dae1d0adc49bcbe403d913a10

Generated at Sat Feb 10 02:02:59 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.