Details
-
Bug
-
Resolution: Fixed
-
Minor
-
Lustre 2.16.0
Description
I am trying to add RHEL 9.3 support to janitor, and right from the start I am hitting a problem flagged by KASAN:
[ 111.603361] BUG: KASAN: slab-out-of-bounds in unix_find_other+0x41e/0x630
[ 111.603367] Write of size 1 at addr ffff88810fc70e6e by task insmod/2783
[ 111.603369]
[ 111.603371] CPU: 2 PID: 2783 Comm: insmod Kdump: loaded Tainted: G OE ------- --- 5.14.0rocky93-debug #4
[ 111.603375] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014
[ 111.603376] Call Trace:
[ 111.603378] <TASK>
[ 111.603380] ? unix_find_other+0x41e/0x630
[ 111.603383] dump_stack_lvl+0x57/0x7d
[ 111.603388] print_address_description.constprop.0+0x1f/0x1e0
[ 111.603394] ? unix_find_other+0x41e/0x630
[ 111.603396] print_report.cold+0x55/0x240
[ 111.603401] kasan_report+0xc8/0x200
[ 111.603405] ? unix_find_other+0x41e/0x630
[ 111.603409] unix_find_other+0x41e/0x630
[ 111.603411] ? unix_create1+0x5e0/0x870
[ 111.603414] ? unix_stream_sendpage+0xac0/0xac0
[ 111.603416] ? do_raw_spin_unlock+0x149/0x1f0
[ 111.603421] ? skb_set_owner_w+0x1d2/0x300
[ 111.603427] unix_stream_connect+0x26b/0x11b0
[ 111.603434] check_gssd_socket+0x292/0x41b [ptlrpc_gss]
[ 111.603455] ? ctx_init_pack_request.cold+0x22/0x22 [ptlrpc_gss]
[ 111.603479] gss_init_svc_upcall+0xc6/0x129 [ptlrpc_gss]
[ 111.603497] sptlrpc_gss_init+0x7d/0x1ec [ptlrpc_gss]
[ 111.603515] ? 0xffffffffc0b70000
[ 111.603529] ? 0xffffffffc0b70000
[ 111.603531] do_one_initcall+0xf9/0x550
[ 111.603535] ? perf_trace_initcall_level+0x3f0/0x3f0
[ 111.603540] ? rcu_read_lock_sched_held+0x3f/0x70
[ 111.603544] ? trace_kmalloc+0x38/0x100
[ 111.603546] ? kmem_cache_alloc_trace+0x221/0x430
[ 111.603549] ? kasan_unpoison+0x23/0x50
[ 111.603553] do_init_module+0x1c8/0x7a0
[ 111.603558] load_module+0x1ac4/0x1ee0
[ 111.603563] ? post_relocation+0x390/0x390
[ 111.603565] ? __lock_release+0x4bd/0x9f0
[ 111.603570] ? kernel_read_file_from_fd+0x86/0xe0
[ 111.603575] __do_sys_finit_module+0x110/0x1a0
[ 111.603577] ? __ia32_sys_init_module+0xa0/0xa0
[ 111.603581] ? vm_mmap_pgoff+0x188/0x210
[ 111.603589] ? lockdep_hardirqs_on_prepare.part.0+0x18c/0x370
[ 111.603592] ? syscall_enter_from_user_mode+0x22/0xb0
[ 111.603601] ? lockdep_hardirqs_on+0x79/0x100
[ 111.603605] do_syscall_64+0x56/0x80
[ 111.603609] ? asm_exc_page_fault+0x22/0x30
[ 111.603613] ? lockdep_hardirqs_on+0x79/0x100
[ 111.603616] entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 111.603619] RIP: 0033:0x7fc9cea3ee5d
[ 111.603623] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 93 af 1b 00 f7 d8 64 89 01 48
[ 111.603625] RSP: 002b:00007fffa893b188 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 111.603629] RAX: ffffffffffffffda RBX: 0000560b2baab810 RCX: 00007fc9cea3ee5d
[ 111.603631] RDX: 0000000000000000 RSI: 0000560b29f0a962 RDI: 0000000000000003
[ 111.603632] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 111.603633] R10: 0000000000000003 R11: 0000000000000246 R12: 0000560b29f0a962
[ 111.603635] R13: 0000560b2baab7c0 R14: 0000560b29f09550 R15: 0000560b2baab920
[ 111.603641] </TASK>
[ 111.603642]
[ 111.603643] Allocated by task 2783:
[ 111.603644] kasan_save_stack+0x1e/0x40
[ 111.603647] __kasan_kmalloc+0x81/0xa0
[ 111.603649] check_gssd_socket+0x167/0x41b [ptlrpc_gss]
[ 111.603668] gss_init_svc_upcall+0xc6/0x129 [ptlrpc_gss]
[ 111.603685] sptlrpc_gss_init+0x7d/0x1ec [ptlrpc_gss]
[ 111.603701] do_one_initcall+0xf9/0x550
[ 111.603703] do_init_module+0x1c8/0x7a0
[ 111.603705] load_module+0x1ac4/0x1ee0
[ 111.603706] __do_sys_finit_module+0x110/0x1a0
[ 111.603708] do_syscall_64+0x56/0x80
[ 111.603710] entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 111.603712]
[ 111.603713] The buggy address belongs to the object at ffff88810fc70e00
[ 111.603713] which belongs to the cache kmalloc-128 of size 128
[ 111.603715] The buggy address is located 110 bytes inside of
[ 111.603715] 128-byte region [ffff88810fc70e00, ffff88810fc70e80)
[ 111.603717]
[ 111.603717] The buggy address belongs to the physical page:
[ 111.603719] page:ffffea00043f1c00 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88810fc70380 pfn:0x10fc70
[ 111.603721] head:ffffea00043f1c00 order:1 compound_mapcount:0 compound_pincount:0
[ 111.603723] flags: 0x17ffffc0010200(slab|head|node=0|zone=2|lastcpupid=0x1fffff)
[ 111.603728] raw: 0017ffffc0010200 ffffea0001fc6988 ffff888100040b50 ffff888100042c80
[ 111.603730] raw: ffff88810fc70380 000000000015000d 00000001ffffffff 0000000000000000
[ 111.603731] page dumped because: kasan: bad access detected
[ 111.603732]
[ 111.603732] Memory state around the buggy address:
[ 111.603734] ffff88810fc70d00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 111.603735] ffff88810fc70d80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 111.603736] >ffff88810fc70e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 06 fc fc
[ 111.603737] ^
[ 111.603738] ffff88810fc70e80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 111.603740] ffff88810fc70f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
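For anyone reading the memory-state dump: in generic KASAN each shadow byte covers 8 bytes of memory, where 00 means all 8 bytes are accessible, 01 to 07 mean only the first N bytes are accessible, and other values (such as fc) mark poisoned or redzone memory. The row for ffff88810fc70e00 therefore describes 13 * 8 + 6 = 110 accessible bytes, and the 1-byte write at offset 110 is exactly one byte past the end of the object. The helper below is a small userspace sketch of that arithmetic; `kasan_accessible_bytes` is a hypothetical name for illustration, not a kernel function.

```c
#include <stddef.h>
#include <stdint.h>

/* Generic KASAN: one shadow byte describes 8 bytes of real memory. */
#define KASAN_GRANULE 8

/*
 * Hypothetical helper (not a kernel API): given the shadow bytes that
 * start at the beginning of an object, return how many bytes of the
 * object are accessible before the first poisoned byte.
 */
static size_t kasan_accessible_bytes(const uint8_t *shadow, size_t n)
{
	size_t total = 0;

	for (size_t i = 0; i < n; i++) {
		uint8_t s = shadow[i];

		if (s == 0) {
			/* whole 8-byte granule is accessible */
			total += KASAN_GRANULE;
		} else if (s < KASAN_GRANULE) {
			/* only the first s bytes of this granule are accessible */
			total += s;
			break;
		} else {
			/* redzone or other poison value, e.g. 0xfc */
			break;
		}
	}
	return total;
}
```

Feeding it the 16 shadow bytes from the marked row (thirteen 00, then 06, fc, fc) gives the object size that KASAN is enforcing.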
So the actual hit is in this code:
static void unix_mkname_bsd(struct sockaddr_un *sunaddr, int addr_len)
{
	/* This may look like an off by one error but it is a bit more
	 * subtle. 108 is the longest valid AF_UNIX path for a binding.
	 * sun_path[108] doesn't as such exist. However in kernel space
	 * we are guaranteed that it is a valid memory location in our
	 * kernel address buffer because syscall functions always pass
	 * a pointer of struct sockaddr_storage which has a bigger buffer
	 * than 108.
	 */
	((char *)sunaddr)[addr_len] = 0;
}
The Lustre part is this:
static int check_gssd_socket(void)
{
	struct sockaddr_un *sun;
	...
	OBD_ALLOC(sun, sizeof(*sun));
	strncpy(sun->sun_path, GSS_SOCKET_PATH, sizeof(sun->sun_path));

	/* Try to connect to the socket */
	while (tries++ < 6) {
		err = kernel_connect(sock, (struct sockaddr *)sun, sizeof(*sun), 0);
So based on the comment in unix_mkname_bsd, it sounds like we might need to allocate a struct sockaddr_storage here instead of a struct sockaddr_un?
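To make the idea concrete, here is a minimal userspace sketch of the sockaddr_storage approach. It is not the patch that landed and not kernel code: `model_mkname_bsd` is a hypothetical stand-in for the kernel's terminator write, `fill_unix_addr` is a hypothetical helper, and the socket path used in the usage note is a placeholder rather than the real GSS_SOCKET_PATH. The point is only that when the buffer is a struct sockaddr_storage, the byte at offset sizeof(struct sockaddr_un) is still inside the allocation.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical model of what unix_mkname_bsd() does: write a NUL at
 * offset addr_len, i.e. one byte past a sockaddr_un when addr_len is
 * sizeof(struct sockaddr_un).
 */
static void model_mkname_bsd(void *addr, int addr_len)
{
	((char *)addr)[addr_len] = 0;
}

/*
 * Hypothetical helper: build an AF_UNIX address inside a
 * sockaddr_storage so the extra terminator byte stays in bounds.
 * Returns the address length to pass to connect(), or -1 if the
 * path does not fit in sun_path with its terminating NUL.
 */
static int fill_unix_addr(struct sockaddr_storage *ss, const char *path)
{
	struct sockaddr_un *sun = (struct sockaddr_un *)ss;

	if (strlen(path) >= sizeof(sun->sun_path))
		return -1;

	memset(ss, 0, sizeof(*ss));
	sun->sun_family = AF_UNIX;
	strcpy(sun->sun_path, path);	/* length was checked above */

	/* Safe here: sizeof(*sun) is smaller than sizeof(*ss). */
	model_mkname_bsd(ss, (int)sizeof(*sun));

	return (int)sizeof(*sun);
}
```

A caller would declare `struct sockaddr_storage ss;`, call `fill_unix_addr(&ss, "/tmp/example.sock")`, and pass `(struct sockaddr *)&ss` with the returned length to the connect call. In the kernel the equivalent would presumably be allocating sizeof(struct sockaddr_storage) while still passing sizeof(*sun) as the address length.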
Issue Links
- is related to LU-16807 Resolve newer debug kernel warnings (Reopened)