[LU-4563] Oopses on bad address writes to proc Created: 29/Jan/14  Updated: 06/Jan/17  Resolved: 09/Dec/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.6.0
Fix Version/s: Lustre 2.6.0

Type: Bug Priority: Major
Reporter: Oleg Drokin Assignee: WC Triage
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
Severity: 3
Rank (Obsolete): 12454

 Description   

While doing some unrelated research, I discovered that we still have plenty of files that do not verify that buffers passed from userspace are valid.

e.g.

[72053.884332] BUG: unable to handle kernel paging request at 0000000004096000
[72053.884718] IP: [<ffffffffa0f35d27>] ll_rw_extents_stats_pp_seq_write+0xe7/0x130 [lustre]
[72053.885133] PGD 8c48b067 PUD 8c60c067 PMD 0 
[72053.885503] Oops: 0000 [#2] SMP DEBUG_PAGEALLOC
[72053.885852] last sysfs file: /sys/devices/system/cpu/possible
[72053.886075] CPU 4 
[72053.886138] Modules linked in: lustre ofd osp lod ost mdt mdd mgs nodemap osd_ldiskfs ldiskfs exportfs lquota lfsck jbd obdecho mgc lov osc mdc lmv fid fld ptlrpc obdclass ksocklnd lnet sha512_generic sha256_generic libcfs ext4 jbd2 mbcache ppdev parport_pc parport virtio_console virtio_balloon i2c_piix4 i2c_core virtio_blk virtio_net virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache auth_rpcgss nfs_acl sunrpc be2iscsi bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: speedstep_lib]
[72053.888085] 
[72053.888085] Pid: 4120, comm: out Tainted: G      D    ---------------    2.6.32-rhe6.4-debug2 #1 Bochs Bochs
[72053.888085] RIP: 0010:[<ffffffffa0f35d27>]  [<ffffffffa0f35d27>] ll_rw_extents_stats_pp_seq_write+0xe7/0x130 [lustre]
[72053.888085] RSP: 0018:ffff8800aef71e48  EFLAGS: 00010282
[72053.888085] RAX: 00000000fffffff2 RBX: ffff88009530c000 RCX: 0000000000000009
[72053.888085] RDX: 0000000000000000 RSI: 0000000004096000 RDI: ffffffffa0f6b3a0
[72053.888085] RBP: ffff8800aef71e98 R08: ffffffff811eae00 R09: 00007f58b4e449e0
[72053.888085] R10: 00007fff1886c500 R11: 0000000000000246 R12: 0000000000000005
[72053.888085] R13: 0000000004096000 R14: 0000000004096000 R15: 0000000000000005
[72053.888085] FS:  00007f58b504c700(0000) GS:ffff880006300000(0000) knlGS:0000000000000000
[72053.888085] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[72053.888085] CR2: 0000000004096000 CR3: 00000000afbdc000 CR4: 00000000000006e0
[72053.888085] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[72053.888085] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[72053.888085] Process out (pid: 4120, threadinfo ffff8800aef70000, task ffff88008cb1c380)
[72053.888085] Stack:
[72053.888085]  ffff8800aef71ea8 ffffffff8116691e ffff8800b4c34000 00000001b7578ef0
[72053.888085] <d> ffff880097cf7f08 ffff8800b7578ef0 ffff880097cf7f08 ffffffffa0f35c40
[72053.888085] <d> 0000000004096000 0000000000000005 ffff8800aef71ee8 ffffffff811eaf55
[72053.888085] Call Trace:
[72053.888085]  [<ffffffff8116691e>] ? cache_free_debugcheck+0x2ae/0x360
[72053.888085]  [<ffffffffa0f35c40>] ? ll_rw_extents_stats_pp_seq_write+0x0/0x130 [lustre]
[72053.888085]  [<ffffffff811eaf55>] proc_reg_write+0x85/0xc0
[72053.888085]  [<ffffffff811818a8>] vfs_write+0xb8/0x1a0
[72053.888085]  [<ffffffff81182111>] sys_write+0x51/0x90
[72053.888085]  [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
[72053.888085] Code: e8 4f 8a 5c e0 48 83 c4 28 4c 89 e0 5b 41 5c 41 5d 41 5e 41 5f c9 c3 0f 1f 44 00 00 b9 09 00 00 00 48 c7 c7 a0 b3 f6 a0 4c 89 ee <f3> a6 75 1d c7 45 cc 00 00 00 00 c7 83 0c 36 00 00 00 00 00 00 
[72053.888085] RIP  [<ffffffffa0f35d27>] ll_rw_extents_stats_pp_seq_write+0xe7/0x130 [lustre]
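
One way to read the register dump above, offered as an interpretation rather than a confirmed diagnosis: the faulting bytes <f3> a6 decode to repz cmpsb, RSI and CR2 both hold the user-supplied address 0x4096000, and the low 32 bits of RAX are 0xfffffff2. As a signed int that value is -14, which is -EFAULT on Linux. This is consistent with a handler that compares bytes directly through the user pointer. The small sketch below only works through that arithmetic; the names low32_as_int and check_oops_rax are hypothetical and not part of Lustre.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Reinterpret the low 32 bits of a 64-bit register value as a signed int. */
int low32_as_int(uint64_t reg)
{
	uint32_t low = (uint32_t)reg;

	/* Decode two's complement in 64-bit arithmetic so nothing overflows. */
	if (low & 0x80000000u)
		return (int)((int64_t)low - 4294967296LL);
	return (int)low;
}

/* RAX from the oops above: 00000000fffffff2 (EFAULT is 14 on Linux). */
void check_oops_rax(void)
{
	assert(low32_as_int(0x00000000fffffff2ULL) == -EFAULT);
}
```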

testcase:

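#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/* An address that is not expected to be mapped in this process. */
#define BAD_ADDR ((void *)0x4096000)

int main(int argc, char **argv)
{
	int fd;

	if (argc < 2) {
		fprintf(stderr, "usage: %s <file>\n", argv[0]);
		return 1;
	}

	fd = open(argv[1], O_WRONLY);
	if (fd == -1) {
		perror(argv[1]);
		return 1;
	}

	/* Write from a bad buffer; report errno only if the call failed. */
	if (write(fd, BAD_ADDR, 5) == -1)
		perror("write");

	close(fd);

	fd = open(argv[1], O_RDONLY);
	if (fd == -1) {
		perror(argv[1]);
		return 1;
	}

	/* Read into a bad buffer; report errno only if the call failed. */
	if (read(fd, BAD_ADDR, 5) == -1)
		perror("read");

	close(fd);

	return 0;
}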

Run it as:

find /proc/sys/lustre -type f -not -name force_lbug -exec $PATH_TO_BINARY {} \;
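
For comparison, here is a minimal userspace sketch of what the testcase should observe against a handler that validates its buffer: the call fails with EFAULT instead of oopsing. It uses a pipe as a stand-in for a well-behaved file and a PROT_NONE mapping as a buffer that is guaranteed to be inaccessible. The function name errno_for_bad_write is hypothetical, and the sketch assumes Linux.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Write 5 bytes from an inaccessible page into a pipe and return the
 * errno reported by the kernel. Returns 0 if the write unexpectedly
 * succeeds and -1 if the probe itself could not be set up.
 */
int errno_for_bad_write(void)
{
	long pagesz = sysconf(_SC_PAGESIZE);
	int pipefd[2];
	int err = 0;
	void *bad;

	if (pagesz <= 0)
		return -1;

	/* PROT_NONE: the page exists but can be neither read nor written. */
	bad = mmap(NULL, (size_t)pagesz, PROT_NONE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (bad == MAP_FAILED)
		return -1;

	if (pipe(pipefd) == -1) {
		munmap(bad, (size_t)pagesz);
		return -1;
	}

	/* The kernel cannot copy from the buffer, so the write must fail. */
	if (write(pipefd[1], bad, 5) == -1)
		err = errno;

	close(pipefd[0]);
	close(pipefd[1]);
	munmap(bad, (size_t)pagesz);
	return err;
}
```

A caller would expect errno_for_bad_write() to return EFAULT; any proc file that oopses instead of producing that error is missing the check.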

A patch is forthcoming.



 Comments   
Comment by Oleg Drokin [ 29/Jan/14 ]

The patch is at http://review.whamcloud.com/#/c/9059

Comment by Oleg Drokin [ 29/Jan/14 ]

With the patch in place, only one warning remains. It points to unsafe buffer handling in the NID hash printing code, which will hopefully be addressed once we fully migrate to seq_file:

[  255.184748] ------------[ cut here ]------------
[  255.184994] WARNING: at lib/vsprintf.c:1216 vsnprintf+0x5c6/0x5e0() (Not tainted)
[  255.185380] Hardware name: Bochs
[  255.185604] Modules linked in: lustre ofd osp lod ost mdt mdd mgs nodemap osd_ldiskfs ldiskfs exportfs lquota lfsck jbd obdecho mgc lov osc mdc lmv fid fld ptlrpc obdclass ksocklnd lnet sha512_generic sha256_generic libcfs ext4 jbd2 mbcache ppdev parport_pc parport virtio_balloon virtio_console i2c_piix4 i2c_core virtio_blk virtio_net virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache auth_rpcgss nfs_acl sunrpc be2iscsi bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: speedstep_lib]
[  255.190438] Pid: 3762, comm: out Not tainted 2.6.32-rhe6.4-debug2 #1
[  255.190677] Call Trace:
[  255.190864]  [<ffffffff8106bc57>] ? warn_slowpath_common+0x87/0xc0
[  255.191119]  [<ffffffffa058aef0>] ? lprocfs_exp_print_hash+0x0/0xa0 [obdclass]
[  255.191482]  [<ffffffff8106bcaa>] ? warn_slowpath_null+0x1a/0x20
[  255.191717]  [<ffffffff8127fad6>] ? vsnprintf+0x5c6/0x5e0
[  255.191952]  [<ffffffffa058aef0>] ? lprocfs_exp_print_hash+0x0/0xa0 [obdclass]
[  255.192329]  [<ffffffff8127fb94>] ? snprintf+0x34/0x40
[  255.192562]  [<ffffffffa042c640>] ? cfs_hash_debug_str+0xd0/0x400 [libcfs]
[  255.192818]  [<ffffffffa058aef0>] ? lprocfs_exp_print_hash+0x0/0xa0 [obdclass]
[  255.193190]  [<ffffffffa058af49>] ? lprocfs_exp_print_hash+0x59/0xa0 [obdclass]
[  255.193597]  [<ffffffffa042d8f2>] ? cfs_hash_for_each_key+0xa2/0xe0 [libcfs]
[  255.193842]  [<ffffffff8117ed44>] ? nameidata_to_filp+0x54/0x70
[  255.194088]  [<ffffffffa058c12f>] ? lprocfs_exp_rd_hash+0x4f/0x60 [obdclass]
[  255.194335]  [<ffffffff811f16d1>] ? proc_file_read+0x1f1/0x370
[  255.194568]  [<ffffffff811f14e0>] ? proc_file_read+0x0/0x370
[  255.194795]  [<ffffffff811eb015>] ? proc_reg_read+0x85/0xc0
[  255.195021]  [<ffffffff81181f45>] ? vfs_read+0xb5/0x1a0
[  255.195242]  [<ffffffff81182081>] ? sys_read+0x51/0x90
[  255.195463]  [<ffffffff8100b0b2>] ? system_call_fastpath+0x16/0x1b
[  255.195699] ---[ end trace 011600448007c7ae ]---

Comment by Oleg Drokin [ 29/Jan/14 ]

Also of note: places like lprocfs_wr_nosquash_nids() blindly allocate a buffer of whatever size userspace passes in, which could easily lead to OOM.
We probably need another testcase and patch for this.
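
The size concern above can be sketched as a userspace model, assuming a fixed upper bound is acceptable for the input in question. This is not the actual fix: the name copy_input_bounded and the MAX_INPUT_LEN cap are hypothetical, and in kernel code the memcpy() step would be a copy_from_user() call with its return value checked, and the allocation would use a kernel allocator.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cap on how much input a single write may supply. */
#define MAX_INPUT_LEN 4096

/*
 * Copy `count` bytes of caller-supplied input into a freshly allocated,
 * NUL-terminated buffer. The length is checked against the cap before
 * anything is allocated, so an oversized request is rejected cheaply.
 * Returns 0 on success or a negative errno value.
 */
int copy_input_bounded(const char *src, size_t count, char **out)
{
	char *buf;

	assert(out != NULL);
	*out = NULL;

	/* Reject empty and oversized requests before allocating. */
	if (count == 0 || count > MAX_INPUT_LEN)
		return -EINVAL;

	buf = malloc(count + 1);
	if (buf == NULL)
		return -ENOMEM;

	memcpy(buf, src, count);
	buf[count] = '\0';
	*out = buf;
	return 0;
}
```

With this shape, a request for an absurd length fails with -EINVAL without touching the allocator, while a normal request returns a buffer that the caller must free.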

Generated at Sat Feb 10 01:43:51 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.