[LU-4563] Oopses on bad address writes to proc Created: 29/Jan/14 Updated: 06/Jan/17 Resolved: 09/Dec/16 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.6.0 |
| Fix Version/s: | Lustre 2.6.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Oleg Drokin | Assignee: | WC Triage |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||
| Severity: | 3 | ||||
| Rank (Obsolete): | 12454 | ||||
| Description |
|
While doing some unrelated research, I discovered that we still have plenty of proc files whose handlers do not verify that userspace-passed buffers are valid, e.g.:
[72053.884332] BUG: unable to handle kernel paging request at 0000000004096000
[72053.884718] IP: [<ffffffffa0f35d27>] ll_rw_extents_stats_pp_seq_write+0xe7/0x130 [lustre]
[72053.885133] PGD 8c48b067 PUD 8c60c067 PMD 0
[72053.885503] Oops: 0000 [#2] SMP DEBUG_PAGEALLOC
[72053.885852] last sysfs file: /sys/devices/system/cpu/possible
[72053.886075] CPU 4
[72053.886138] Modules linked in: lustre ofd osp lod ost mdt mdd mgs nodemap osd_ldiskfs ldiskfs exportfs lquota lfsck jbd obdecho mgc lov osc mdc lmv fid fld ptlrpc obdclass ksocklnd lnet sha512_generic sha256_generic libcfs ext4 jbd2 mbcache ppdev parport_pc parport virtio_console virtio_balloon i2c_piix4 i2c_core virtio_blk virtio_net virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache auth_rpcgss nfs_acl sunrpc be2iscsi bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: speedstep_lib]
[72053.888085]
[72053.888085] Pid: 4120, comm: out Tainted: G D --------------- 2.6.32-rhe6.4-debug2 #1 Bochs Bochs
[72053.888085] RIP: 0010:[<ffffffffa0f35d27>] [<ffffffffa0f35d27>] ll_rw_extents_stats_pp_seq_write+0xe7/0x130 [lustre]
[72053.888085] RSP: 0018:ffff8800aef71e48 EFLAGS: 00010282
[72053.888085] RAX: 00000000fffffff2 RBX: ffff88009530c000 RCX: 0000000000000009
[72053.888085] RDX: 0000000000000000 RSI: 0000000004096000 RDI: ffffffffa0f6b3a0
[72053.888085] RBP: ffff8800aef71e98 R08: ffffffff811eae00 R09: 00007f58b4e449e0
[72053.888085] R10: 00007fff1886c500 R11: 0000000000000246 R12: 0000000000000005
[72053.888085] R13: 0000000004096000 R14: 0000000004096000 R15: 0000000000000005
[72053.888085] FS: 00007f58b504c700(0000) GS:ffff880006300000(0000) knlGS:0000000000000000
[72053.888085] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[72053.888085] CR2: 0000000004096000 CR3: 00000000afbdc000 CR4: 00000000000006e0
[72053.888085] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[72053.888085] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[72053.888085] Process out (pid: 4120, threadinfo ffff8800aef70000, task ffff88008cb1c380)
[72053.888085] Stack:
[72053.888085] ffff8800aef71ea8 ffffffff8116691e ffff8800b4c34000 00000001b7578ef0
[72053.888085] <d> ffff880097cf7f08 ffff8800b7578ef0 ffff880097cf7f08 ffffffffa0f35c40
[72053.888085] <d> 0000000004096000 0000000000000005 ffff8800aef71ee8 ffffffff811eaf55
[72053.888085] Call Trace:
[72053.888085] [<ffffffff8116691e>] ? cache_free_debugcheck+0x2ae/0x360
[72053.888085] [<ffffffffa0f35c40>] ? ll_rw_extents_stats_pp_seq_write+0x0/0x130 [lustre]
[72053.888085] [<ffffffff811eaf55>] proc_reg_write+0x85/0xc0
[72053.888085] [<ffffffff811818a8>] vfs_write+0xb8/0x1a0
[72053.888085] [<ffffffff81182111>] sys_write+0x51/0x90
[72053.888085] [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
[72053.888085] Code: e8 4f 8a 5c e0 48 83 c4 28 4c 89 e0 5b 41 5c 41 5d 41 5e 41 5f c9 c3 0f 1f 44 00 00 b9 09 00 00 00 48 c7 c7 a0 b3 f6 a0 4c 89 ee <f3> a6 75 1d c7 45 cc 00 00 00 00 c7 83 0c 36 00 00 00 00 00 00
[72053.888085] RIP [<ffffffffa0f35d27>] ll_rw_extents_stats_pp_seq_write+0xe7/0x130 [lustre]
testcase:
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
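int main(int argc, char **argv)
{
	int fd;

	if (argc < 2) {
		fprintf(stderr, "usage: %s <proc file>\n", argv[0]);
		return 1;
	}

	/* Write from an address assumed to be unmapped in this process. */
	fd = open(argv[1], O_WRONLY);
	if (fd == -1) {
		perror(argv[1]);
		return 1;
	}
	write(fd, (void *)0x4096000, 5);
	perror("write");
	close(fd);

	/* Read into the same address. */
	fd = open(argv[1], O_RDONLY);
	if (fd == -1) {
		perror(argv[1]);
		return 1;
	}
	read(fd, (void *)0x4096000, 5);
	perror("read");
	close(fd);
	return 0;
}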
Run it as: find /proc/sys/lustre -type f -not -name force_lbug -exec $PATH_TO_BINARY {} \;
A patch is forthcoming. |
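For comparison, a handler that validates its buffer should fail such a call with EFAULT instead of oopsing. The following is a minimal userspace sketch of a variant of the test case above. Rather than relying on the hard-coded address 0x4096000 being unmapped, it uses a PROT_NONE page, which is guaranteed to be inaccessible. The helper names `bad_buffer`, `probe_write` and `probe_read` are hypothetical and not part of Lustre.

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: return a mapped but inaccessible page, so any
 * kernel copy to or from it must fault. Returns NULL on failure. */
static void *bad_buffer(void)
{
	long page = sysconf(_SC_PAGESIZE);
	void *p;

	assert(page > 0);
	p = mmap(NULL, (size_t)page, PROT_NONE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	return p == MAP_FAILED ? NULL : p;
}

/* Write len bytes from buf to path. Returns 0 if write() succeeded,
 * otherwise the errno of the failing call. */
static int probe_write(const char *path, const void *buf, size_t len)
{
	int fd = open(path, O_WRONLY);
	int err = 0;

	if (fd == -1)
		return errno;
	if (write(fd, buf, len) == -1)
		err = errno;
	close(fd);
	return err;
}

/* Read len bytes from path into buf. Returns 0 if read() succeeded,
 * otherwise the errno of the failing call. */
static int probe_read(const char *path, void *buf, size_t len)
{
	int fd = open(path, O_RDONLY);
	int err = 0;

	if (fd == -1)
		return errno;
	if (read(fd, buf, len) == -1)
		err = errno;
	close(fd);
	return err;
}
```

On an ordinary regular file both probes are expected to return EFAULT when given the page from `bad_buffer()`. A proc file that returns 0 here, or oopses, has accepted the bad pointer without checking it.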
| Comments |
| Comment by Oleg Drokin [ 29/Jan/14 ] |
|
The patch is at http://review.whamcloud.com/#/c/9059 |
| Comment by Oleg Drokin [ 29/Jan/14 ] |
|
With the patch in place, only one "warning" remains. It points to unsafe buffer handling in the nid hash printing code, which will hopefully be addressed once we fully migrate to seq_file:
[ 255.184748] ------------[ cut here ]------------
[ 255.184994] WARNING: at lib/vsprintf.c:1216 vsnprintf+0x5c6/0x5e0() (Not tainted)
[ 255.185380] Hardware name: Bochs
[ 255.185604] Modules linked in: lustre ofd osp lod ost mdt mdd mgs nodemap osd_ldiskfs ldiskfs exportfs lquota lfsck jbd obdecho mgc lov osc mdc lmv fid fld ptlrpc obdclass ksocklnd lnet sha512_generic sha256_generic libcfs ext4 jbd2 mbcache ppdev parport_pc parport virtio_balloon virtio_console i2c_piix4 i2c_core virtio_blk virtio_net virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache auth_rpcgss nfs_acl sunrpc be2iscsi bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: speedstep_lib]
[ 255.190438] Pid: 3762, comm: out Not tainted 2.6.32-rhe6.4-debug2 #1
[ 255.190677] Call Trace:
[ 255.190864] [<ffffffff8106bc57>] ? warn_slowpath_common+0x87/0xc0
[ 255.191119] [<ffffffffa058aef0>] ? lprocfs_exp_print_hash+0x0/0xa0 [obdclass]
[ 255.191482] [<ffffffff8106bcaa>] ? warn_slowpath_null+0x1a/0x20
[ 255.191717] [<ffffffff8127fad6>] ? vsnprintf+0x5c6/0x5e0
[ 255.191952] [<ffffffffa058aef0>] ? lprocfs_exp_print_hash+0x0/0xa0 [obdclass]
[ 255.192329] [<ffffffff8127fb94>] ? snprintf+0x34/0x40
[ 255.192562] [<ffffffffa042c640>] ? cfs_hash_debug_str+0xd0/0x400 [libcfs]
[ 255.192818] [<ffffffffa058aef0>] ? lprocfs_exp_print_hash+0x0/0xa0 [obdclass]
[ 255.193190] [<ffffffffa058af49>] ? lprocfs_exp_print_hash+0x59/0xa0 [obdclass]
[ 255.193597] [<ffffffffa042d8f2>] ? cfs_hash_for_each_key+0xa2/0xe0 [libcfs]
[ 255.193842] [<ffffffff8117ed44>] ? nameidata_to_filp+0x54/0x70
[ 255.194088] [<ffffffffa058c12f>] ? lprocfs_exp_rd_hash+0x4f/0x60 [obdclass]
[ 255.194335] [<ffffffff811f16d1>] ? proc_file_read+0x1f1/0x370
[ 255.194568] [<ffffffff811f14e0>] ? proc_file_read+0x0/0x370
[ 255.194795] [<ffffffff811eb015>] ? proc_reg_read+0x85/0xc0
[ 255.195021] [<ffffffff81181f45>] ? vfs_read+0xb5/0x1a0
[ 255.195242] [<ffffffff81182081>] ? sys_read+0x51/0x90
[ 255.195463] [<ffffffff8100b0b2>] ? system_call_fastpath+0x16/0x1b
[ 255.195699] ---[ end trace 011600448007c7ae ]--- |
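The trace shows the warning firing in vsnprintf(), reached via snprintf() from cfs_hash_debug_str(). One common way to get there, assumed here rather than confirmed from the Lustre source, is appending with `snprintf(buf + used, size - used, ...)` after `used` has already reached or passed `size`, so the remaining length underflows. Below is a minimal userspace sketch of an append helper that clamps instead. `struct outbuf` and `outbuf_appendf` are hypothetical names. seq_file does comparable bookkeeping internally, which is why the migration should remove this class of problem.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical bounded output buffer; invariant: used < size. */
struct outbuf {
	char   *data;
	size_t  size;   /* total capacity, including the trailing NUL */
	size_t  used;   /* bytes written so far, excluding the NUL */
};

/* Append formatted text. Returns 0 on success, -1 if the output was
 * truncated or formatting failed. Never writes past data[size - 1]. */
static int outbuf_appendf(struct outbuf *ob, const char *fmt, ...)
{
	va_list ap;
	size_t room;
	int n;

	assert(ob->size > 0 && ob->used < ob->size);
	room = ob->size - ob->used;     /* at least 1, cannot underflow */

	va_start(ap, fmt);
	n = vsnprintf(ob->data + ob->used, room, fmt, ap);
	va_end(ap);

	if (n < 0)
		return -1;
	if ((size_t)n >= room) {        /* truncated: clamp to the last byte */
		ob->used = ob->size - 1;
		return -1;
	}
	ob->used += (size_t)n;
	return 0;
}
```

Because `used` is clamped to `size - 1` on truncation, later calls still pass a positive length to vsnprintf() and simply report truncation again.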
| Comment by Oleg Drokin [ 29/Jan/14 ] |
|
Also of note: places like lprocfs_wr_nosquash_nids() blindly allocate a buffer of whatever size userspace passes in, which could easily lead to OOM. |
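One way to avoid this is to reject oversized requests before allocating anything. The following is a minimal userspace sketch of that check. The limit `MAX_WRITE_LEN` and the function `alloc_bounded` are hypothetical, chosen only for illustration, and the actual fix in Lustre may look different.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical upper bound on what a single write may ask us to buffer. */
#define MAX_WRITE_LEN 4096

/* Allocate a zeroed, NUL-terminated buffer for 'count' caller-supplied
 * bytes. Returns NULL with errno set to EINVAL if count is zero or above
 * the bound, or to ENOMEM if the allocation itself fails. */
static char *alloc_bounded(size_t count)
{
	char *buf;

	if (count == 0 || count > MAX_WRITE_LEN) {
		errno = EINVAL;
		return NULL;
	}
	buf = calloc(count + 1, 1);
	if (buf == NULL)
		errno = ENOMEM;
	return buf;
}
```

The size check runs before the allocation, so a caller passing a huge length gets an error without the allocator ever being asked for that much memory.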