[LU-4287] Kernel update [RHEL6.5 2.6.32-431.3.1.el6] Created: 21/Nov/13  Updated: 14/Feb/14  Resolved: 10/Feb/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.6.0, Lustre 2.5.1

Type: Improvement Priority: Minor
Reporter: Yang Sheng Assignee: Yang Sheng
Resolution: Fixed Votes: 1
Labels: None

Issue Links:
Related
Rank (Obsolete): 11766

 Description   

This update fixes the following security issues:

  • A flaw was found in the way the Linux kernel's IPv6 implementation
    handled certain UDP packets when the UDP Fragmentation Offload (UFO)
    feature was enabled. A remote attacker could use this flaw to crash the
    system or, potentially, escalate their privileges on the system.
    (CVE-2013-4387, Important)
  • A flaw was found in the way the Linux kernel handled the creation of
    temporary IPv6 addresses. If the IPv6 privacy extension was enabled
    (/proc/sys/net/ipv6/conf/eth0/use_tempaddr set to '2'), an attacker on the
    local network could disable IPv6 temporary address generation, leading to a
    potential information disclosure. (CVE-2013-0343, Moderate)
  • A flaw was found in the way the Linux kernel handled HID (Human Interface
    Device) reports with an out-of-bounds Report ID. An attacker with physical
    access to the system could use this flaw to crash the system or,
    potentially, escalate their privileges on the system. (CVE-2013-2888,
    Moderate)
  • An off-by-one flaw was found in the way the ANSI CPRNG implementation in
    the Linux kernel processed non-block size aligned requests. This could lead
    to random numbers being generated with fewer bits of entropy than expected
    when ANSI CPRNG was used. (CVE-2013-4345, Moderate)
  • It was found that the fix for CVE-2012-2375 released via RHSA-2012:1580
    accidentally removed a check for small-sized result buffers. A local,
    unprivileged user with access to an NFSv4 mount with ACL support could use
    this flaw to crash the system or, potentially, escalate their privileges on
    the system. (CVE-2013-4591, Moderate)
  • A flaw was found in the way IOMMU memory mappings were handled when
    moving memory slots. A malicious user on a KVM host who has the ability to
    assign a device to a guest could use this flaw to crash the host.
    (CVE-2013-4592, Moderate)
  • Heap-based buffer overflow flaws were found in the way the Zeroplus and
    Pantherlord/GreenAsia game controllers handled HID reports. An attacker
    with physical access to the system could use these flaws to crash the
    system or, potentially, escalate their privileges on the system.
    (CVE-2013-2889, CVE-2013-2892, Moderate)
  • Two information leak flaws were found in the logical link control (LLC)
    implementation in the Linux kernel. A local, unprivileged user could use
    these flaws to leak kernel stack memory to user space. (CVE-2012-6542,
    CVE-2013-3231, Low)
  • A heap-based buffer overflow in the way the tg3 Ethernet driver parsed
    the vital product data (VPD) of devices could allow an attacker with
    physical access to a system to cause a denial of service or, potentially,
    escalate their privileges. (CVE-2013-1929, Low)
  • Information leak flaws in the Linux kernel could allow a privileged,
    local user to leak kernel memory to user space. (CVE-2012-6545,
    CVE-2013-1928, CVE-2013-2164, CVE-2013-2234, Low)
  • A format string flaw was found in the Linux kernel's block layer.
    A privileged, local user could potentially use this flaw to escalate their
    privileges to kernel level (ring0). (CVE-2013-2851, Low)

Red Hat would like to thank Stephan Mueller for reporting CVE-2013-4345,
and Kees Cook for reporting CVE-2013-2851.

This update also fixes several hundred bugs and adds enhancements. Refer to
the Red Hat Enterprise Linux 6.5 Release Notes for information on the most
significant of these changes, and the Technical Notes for further
information, both linked to in the References.

All Red Hat Enterprise Linux 6 users are advised to install these updated
packages, which correct these issues, and fix the bugs and add the
enhancements noted in the Red Hat Enterprise Linux 6.5 Release Notes and
Technical Notes. The system must be rebooted for this update to take
effect.

Bugs fixed (https://bugzilla.redhat.com/):

627128 - kernel spec: devel_post macro: hardlink fc typo
734728 - cifs: asynchronous readpages support
796364 - sbc_fitpc2_wdt NULL pointer dereference
815908 - NFSv4 server support for numeric IDs
831158 - dm-crypt: Fix possible mempool deadlock
834919 - JBD: Spotted dirty metadata buffer
851269 - kernel-debug: enable CONFIG_JBD_DEBUG
856764 - RHEL 6.5 Common Network Backports Tracker
859562 - DM RAID: 'sync' table argument is ineffective.
873659 - virt: Clocksource tsc unstable (delta = 474712882 ns). Enable clocksource failover by adding clocksource_failover kernel parameter.
876528 - Set-group-ID (SGID) bit not inherited on XFS file system with ACLs on directory
889973 - "kernel: device-mapper: table: 253:3: snapshot-origin: unknown target type"
903297 - FCoE target: backport drivers/target from upstream
908093 - gfs2: withdraw does not wait for gfs_controld
913660 - nfs client crashes during open
914664 - CVE-2013-0343 kernel: handling of IPv6 temporary addresses
918239 - kernel-2.6.32-358.0.1 doesn't boot at virtual machine on Xen Cloud Platform
920752 - cannot open device nodes for writing on RO filesystems
922322 - CVE-2012-6542 Kernel: llc: information leak via getsockname
922404 - CVE-2012-6545 Kernel: Bluetooth: RFCOMM - information leak
928207 - transfer data using two port from guest to host,guest hang and call trace
949567 - CVE-2013-1928 Kernel: information leak in fs/compat_ioctl.c VIDEO_SET_SPU_PALETTE
949932 - CVE-2013-1929 Kernel: tg3: buffer overflow in VPD firmware parsing
953097 - virtio-rng, boot the guest with two rng device, cat /dev/hwrng in guest, guest will call trace
956094 - CVE-2013-3231 Kernel: llc: Fix missing msg_namelen update in llc_ui_recvmsg
969515 - CVE-2013-2851 kernel: block: passing disk names as format strings
973100 - CVE-2013-2164 Kernel: information leak in cdrom driver
980995 - CVE-2013-2234 Kernel: net: information leak in AF_KEY notify
990806 - BUG: soft lockup - CPU#0 stuck for 63s! [killall5:7385]
999890 - CVE-2013-2889 Kernel: HID: zeroplus: heap overflow flaw
1000429 - CVE-2013-2892 Kernel: HID: pantherlord: heap overflow flaw
1000451 - CVE-2013-2888 Kernel: HID: memory corruption flaw
1007690 - CVE-2013-4345 kernel: ansi_cprng: off by one error in non-block size request
1011927 - CVE-2013-4387 Kernel: net: IPv6: panic when UFO=On for an interface
1014867 - xfssyncd and flush device threads hang in xlog_grant_head_wait
1031678 - CVE-2013-4591 kernel: nfs: missing check for buffer length in __nfs4_get_acl_uncached
1031702 - CVE-2013-4592 kernel: kvm: memory leak when memory slot is moved with assigned device



 Comments   
Comment by Fredrik Nyström [ 27/Nov/13 ]

The definitions of getname() and putname() in /usr/src/kernels/2.6.32-431.el6.x86_64/include/linux/fs.h have changed.

I was able to build the 1.8 client by introducing local getname() and putname() in lustre/llite/dir.c, the same way as was done here:
Change If44cd9f9: LU-2800 llite: introduce local getname()
http://review.whamcloud.com/#/c/5781/

I suspect this will also be needed for 2.1.

Regards / Fredrik

Comment by Bob Glossman (Inactive) [ 04/Dec/13 ]

Seeing build failures in 6.5. Even a simple client build now fails. Example:

  CC [M]  /home/bogl/lustre-release/libcfs/libcfs/linux/linux-tracefile.o
In file included from /home/bogl/lustre-release/libcfs/include/libcfs/libcfs.h:304,
                 from /home/bogl/lustre-release/libcfs/libcfs/linux/linux-tracefile.c:40:
/home/bogl/lustre-release/libcfs/include/libcfs/params_tree.h:99: error: conflicting types for ‘PDE’
/usr/src/kernels/2.6.32-431.el6.x86_64/include/linux/proc_fs.h:323: note: previous definition of ‘PDE’ was here
make[6]: *** [/home/bogl/lustre-release/libcfs/libcfs/linux/linux-tracefile.o] Error 1
make[5]: *** [/home/bogl/lustre-release/libcfs/libcfs] Error 2
make[4]: *** [/home/bogl/lustre-release/libcfs] Error 2
make[3]: *** [_module_/home/bogl/lustre-release] Error 2
make[3]: Leaving directory `/usr/src/kernels/2.6.32-431.el6.x86_64'
make[2]: *** [modules] Error 2
make[2]: Leaving directory `/home/bogl/lustre-release'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/bogl/lustre-release'
make: *** [all] Error 2

I suspect this is due to the recent patch for LU-3319 that landed on master. I think autoconf is misidentifying the 2.6.32 kernel in CentOS/RHEL 6.5 as

#define HAVE_ONLY_PROCFS_SEQ 1

I believe this leads to the build problems.

I did some client-only builds against the RHEL 6.5 kernel before the recent patch and didn't have this problem.

Comment by James A Simmons [ 04/Dec/13 ]

Try this patch - http://review.whamcloud.com/#/c/8482

Comment by Bob Glossman (Inactive) [ 04/Dec/13 ]

Tried it in CentOS 6.5. Works for me.

Comment by Karsten Weiss [ 06/Dec/13 ]

To which Lustre version does this issue apply?

I was able to build Lustre client 2.5.0 on RHEL 6.5's kernel 2.6.32-431.el6 but we ran into LU-3889 during our initial tests (with Lustre 2.1.6 servers). I also tried to build Lustre client 2.4.0/2.4.1 packages for 2.6.32-431.el6 but the compilation fails in ll_dir_ioctl() because of the putname() API changes.

Is there a patch to build Lustre client 2.4.x on 2.6.32-431.el6?

I also don't see this issue on the issue list for Lustre 2.4.2.

Comment by James A Simmons [ 06/Dec/13 ]

The build issue only exists on master (the 2.6 branch) for the RHEL 6.5 build.

Comment by Fredrik Nyström [ 06/Dec/13 ]

"Is there a patch to build Lustre client 2.4.x on 2.6.32-431.el6?"

I was able to build Lustre client 2.4.x on 2.6.32-431.el6 after applying the following patch:

diff --git a/lustre/llite/dir.c b/lustre/llite/dir.c
index febf6ea..484d177 100644
--- a/lustre/llite/dir.c
+++ b/lustre/llite/dir.c
@@ -1228,6 +1228,30 @@ out:
         RETURN(rc);
 }
 
+static char *
+ll_getname(const char __user *filename)
+{
+	int ret = 0, len;
+	char *tmp = __getname();
+
+	if (!tmp)
+		return ERR_PTR(-ENOMEM);
+
+	len = strncpy_from_user(tmp, filename, PATH_MAX);
+	if (len == 0)
+		ret = -ENOENT;
+	else if (len > PATH_MAX)
+		ret = -ENAMETOOLONG;
+
+	if (ret) {
+		__putname(tmp);
+		tmp =  ERR_PTR(ret);
+	}
+	return tmp;
+}
+
+#define ll_putname(filename) __putname(filename)
+
 static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 {
         struct inode *inode = file->f_dentry->d_inode;
@@ -1430,7 +1454,7 @@ free_lmv:
 		if (!(exp_connect_flags(sbi->ll_md_exp) & OBD_CONNECT_LVB_TYPE))
 			return -ENOTSUPP;
 
-		filename = getname((const char *)arg);
+		filename = ll_getname((const char *)arg);
 		if (IS_ERR(filename))
 			RETURN(PTR_ERR(filename));
 
@@ -1441,7 +1465,7 @@ free_lmv:
 		rc = ll_rmdir_entry(inode, filename, namelen);
 out_rmdir:
                 if (filename)
-                        putname(filename);
+                        ll_putname(filename);
 		RETURN(rc);
 	}
 	case LL_IOC_LOV_SWAP_LAYOUTS:
@@ -1461,7 +1485,7 @@ out_rmdir:
 
                 if (cmd == IOC_MDC_GETFILEINFO ||
                     cmd == IOC_MDC_GETFILESTRIPE) {
-                        filename = getname((const char *)arg);
+                        filename = ll_getname((const char *)arg);
                         if (IS_ERR(filename))
                                 RETURN(PTR_ERR(filename));
 
@@ -1528,7 +1552,7 @@ out_rmdir:
         out_req:
                 ptlrpc_req_finished(request);
                 if (filename)
-                        putname(filename);
+                        ll_putname(filename);
                 return rc;
         }
         case IOC_LOV_GETINFO: {

Similar issues exist with all releases < 2.5.

Comment by Bob Glossman (Inactive) [ 06/Dec/13 ]

I think that's http://review.whamcloud.com/5781, in master and b2_5. It is planned to be added to b2_4 before we support CentOS/RHEL 6.5 there.

Comment by Karsten Weiss [ 06/Dec/13 ]

Thanks Fredrik, with your patch it finally compiles. (I already tried a similar patch yesterday but probably made a mistake...)

Comment by Yang Sheng [ 11/Dec/13 ]

Patch committed to: http://review.whamcloud.com/#/c/8549/

Comment by Bob Glossman (Inactive) [ 12/Dec/13 ]

Adding discussion about http://review.whamcloud.com/#/c/8549 here, as I don't want to fill up the review in gerrit with what may become irrelevant comments.

I notice you have carefully forked the ldiskfs portions of the mod so we can still build on earlier el6 as well as 6.5. However, the same wasn't done for the base kernel. For example, lustre/kernel_patches/patches/raid5-mmp-unplug-dev-rhel6.patch was altered in such a way that it now applies only to 6.5, instead of a new version of this patch being made for 6.5. No extensions to lbuild to select between 6.4 and 6.5 were made. If this strategy is OK and we are agreed to abandon patching and building earlier kernels, then this is probably right. If it's not OK, then it isn't right.

BTW, do the revisions for ldiskfs support here imply that some similar changes will be needed in http://review.whamcloud.com/7263 ?

Just as an aside, how did you find the issue needing a change in the sanity.sh test? I have been through the release notes for 6.5 and didn't notice anything about it.

Comment by Yang Sheng [ 12/Dec/13 ]

As far as I know, we only keep earlier versions for the ldiskfs patches, not for the base kernel patches.

Yes, 3.11 will also work this way. I'll update it.

As for the sanity test_17g failure, it looks like Red Hat brought in a patch that did not come from upstream. I think there is an obvious issue in the function do_getname(), so we need to skip the test for now.

Comment by Christopher Morrone [ 12/Dec/13 ]

"I notice you have carefully forked the ldiskfs portions of the mod so we can still build on earlier el6 as well as 6.5. However the same wasn't done for the base kernel."

I can explain at least the history there. The core lustre folks have historically never cared about making the transition from one kernel to the next easy. Lustre would just randomly one day stop working with your kernel and only work with some newer version, with no consideration given towards the need for a transition period where it can still compile against both kernels.

In the past year I worked (with others like James) to get ldiskfs set up to support multiple versions of kernels supported at the same time. At LLNL we have the lustre tree apply the ldiskfs patches, but we maintain our own kernel independent of Lustre, so we don't let Lustre apply the kernel patches. Therefore I was not particularly motivated to look at improving the patching of the kernel. Since I was driving the ldiskfs changes, they only happened to ldiskfs.

Further, since we hope to eliminate the necessity for patching one's kernel soon, it was seen as less important to make that process cleaner. ldiskfs patches, on the other hand, will exist for quite a long time.

Comment by Yang Sheng [ 13/Dec/13 ]

  • A flaw was found in the way the Linux kernel's TCP/IP protocol suite
    implementation handled sending of certain UDP packets over sockets that
    used the UDP_CORK option when the UDP Fragmentation Offload (UFO) feature
    was enabled on the output device. A local, unprivileged user could use this
    flaw to cause a denial of service or, potentially, escalate their
    privileges on the system. (CVE-2013-4470, Important)
  • A divide-by-zero flaw was found in the apic_get_tmcct() function in KVM's
    Local Advanced Programmable Interrupt Controller (LAPIC) implementation.
    A privileged guest user could use this flaw to crash the host.
    (CVE-2013-6367, Important)
  • A memory corruption flaw was discovered in the way KVM handled virtual
    APIC accesses that crossed a page boundary. A local, unprivileged user
    could use this flaw to crash the system or, potentially, escalate their
    privileges on the system. (CVE-2013-6368, Important)
  • An information leak flaw in the Linux kernel could allow a local,
    unprivileged user to leak kernel memory to user space. (CVE-2013-2141, Low)

Bugs fixed (https://bugzilla.redhat.com/):

970873 - CVE-2013-2141 Kernel: signal: information leak in tkill/tgkill
1023477 - CVE-2013-4470 Kernel: net: memory corruption with UDP_CORK and UFO
1032207 - CVE-2013-6367 kvm: division by zero in apic_get_tmcct()
1032210 - CVE-2013-6368 kvm: cross page vapic_addr access

Comment by Bob Glossman (Inactive) [ 19/Dec/13 ]

The following are needed for client builds on 6.5:

in b2_4: http://review.whamcloud.com/8581
in b1_8: http://review.whamcloud.com/8607

Comment by Christopher Morrone [ 02/Jan/14 ]

Yang Sheng,

In patch http://review.whamcloud.com/8549 it is still not clear to me why you think it best to copy the ext4_ext_walk_space() function into osd_io.c. That function originates from ext4, so it would seem to me that ldiskfs would be the more appropriate place to reinsert that function.

If you add it to ldiskfs for just the RHEL6.5 kernel, you do not need to change all of the other kernels' patch sets. Also, I suspect that longer term the maintenance will be less difficult, because we won't need to worry about having a function in Lustre that needs to be fully compatible with multiple kernels' ext4 implementations. The function can be tweaked as needed for only the kernels that lack that function natively.

Comment by Yang Sheng [ 03/Jan/14 ]

Hi, Christopher,

I think it should be moved to osd, since then we can use one interface for I/O mapping and don't need to consider different cases for different distros. That will reduce the maintenance effort and the number of ldiskfs patches. We can also modify walk_space as needed. Anyway, I don't think this is the main issue for the patch. I would like us to decide which interface will be used: map_blocks or walk_space. I am running some tests to reveal the performance difference, which I hope will give us a basis for the decision.

Comment by Bob Glossman (Inactive) [ 06/Jan/14 ]

The kernel version in 6.5 has been updated to 2.6.32-431.3.1 over the weekend. Since we haven't landed 6.5 support yet I suggest we just change our target to the new version, not submit a separate bug.

Comment by John DeSantis [ 16/Jan/14 ]

Bob,

I can confirm that the patch offered via the URL http://review.whamcloud.com/#/c/8607/ has functioned without an issue on RHEL 6.x with the new kernel. Thank you for posting that link.

John DeSantis

Comment by Yang Sheng [ 28/Jan/14 ]

I am sure that the racer issue Oleg mentioned is related only to RHEL 6.5 itself, but it is still not very clear why it happens. What I can say is that mnt_count isn't released, so umount can never succeed. The other thing is that 'ln' is the culprit. Further investigation is needed.

Btw: WangDi, your patch looks like it fixes the 'mdc_read_page' error.

Comment by Oleg Drokin [ 31/Jan/14 ]

OK, I traced the issue back to the RHEL 6.5 patch adding ESTALE-retry logic. In linkat() they leak nameidata in case of an ESTALE return, which Lustre does during racer.

Patch that fixes the issue for me is:

--- fs/namei.c-orig	2014-01-30 19:53:32.885946633 -0500
+++ fs/namei.c	2014-01-30 21:10:31.880946625 -0500
@@ -2897,6 +2897,7 @@ out_release:
 	path_put(&nd.path);
 	putname(to);
 	if (retry_estale(error, how)) {
+		path_put(&old_path);
 		how |= LOOKUP_REVAL;
 		goto retry;
 	}

Comment by Oleg Drokin [ 31/Jan/14 ]

RH ticket filed for this: https://bugzilla.redhat.com/show_bug.cgi?id=1059943

Comment by Bob Glossman (Inactive) [ 31/Jan/14 ]

Seems like we're stuck until the upstream fix happens. Even if we added a kernel patch for 6.5, it would only apply in server builds. We would still hit the problem in clients that we build and run on unpatched, pristine kernels. Is there some obvious workaround I'm missing?

Comment by Yang Sheng [ 31/Jan/14 ]

Why can't I access the RH ticket? Does it need some permission?

Comment by Oleg Drokin [ 03/Feb/14 ]

The RH ticket is restricted to some Intel group for some reason I am not sure of.

Anyway, the bug is also present in upstream kernels, so I also sent a fix there and it was already accepted.

See here if you are interested in all the details: http://comments.gmane.org/gmane.linux.kernel/1638580

Comment by Peter Jones [ 10/Feb/14 ]

Landed for 2.6
