[LU-15899] BUG: KASAN: slab-out-of-bounds in mdt_hsm_release Created: 27/May/22  Updated: 27/May/22

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: John Hammond Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

Description

Seen on v2_15_50-13-gc524079f4f. There is a slab-out-of-bounds write in mdt_hsm_release():

        if (!(ma->ma_valid & MA_LOV)) {
                /* Even empty file are released */
                memset(ma->ma_lmm, 0, sizeof(*ma->ma_lmm)); /* HERE */
                ma->ma_lmm->lmm_magic = cpu_to_le32(LOV_MAGIC_V1_DEFINED);

ma->ma_lmm is the reply buffer obtained via req_capsule_server_get(info->mti_pill, &RMF_MDT_MD) in mdt_close(). We should check that this buffer is at least sizeof(*ma->ma_lmm) bytes, or use an alternate buffer.
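
A minimal, untested sketch of the first option, assuming the reserved reply length can be read back with req_capsule_get_size(); the bare return -ERANGE is for illustration only, since a real fix would go through mdt_hsm_release()'s existing error/cleanup path:

        /* Sketch: refuse to proceed if mdt_close() reserved less than a
         * full lov_mds_md_v1 in the RMF_MDT_MD reply buffer, rather than
         * letting the memset() shown above write past the end of it. */
        if (req_capsule_get_size(info->mti_pill, &RMF_MDT_MD, RCL_SERVER) <
            sizeof(*ma->ma_lmm))
                return -ERANGE; /* illustrative; use the existing cleanup path */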

[80779.264740] Lustre: DEBUG MARKER: == sanity-hsm test 21: Simple release tests ============== 07:57:08 (1653656228)
[80782.701558] ==================================================================
[80782.702354] BUG: KASAN: slab-out-of-bounds in mdt_hsm_release+0xae7/0x3b60 [mdt]
[80782.702990] Write of size 32 at addr ffff88811fe6f310 by task mdt_rdpg00_002/821178
[80782.703578]
[80782.703705] CPU: 1 PID: 821178 Comm: mdt_rdpg00_002 Kdump: loaded Tainted: G        W  OE    --------- -  - 4.18.0-348.7.1.el8.x86_64+debug #1
[80782.704726] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[80782.705459] Call Trace:
[80782.705664]  dump_stack+0x8e/0xd0
[80782.705955]  ? mdt_hsm_release+0xae7/0x3b60 [mdt]
[80782.706376]  print_address_description.constprop.5+0x1e/0x230
[80782.706844]  ? kmsg_dump_rewind_nolock+0xd9/0xd9
[80782.707252]  ? mdt_hsm_release+0xae7/0x3b60 [mdt]
[80782.707637]  ? mdt_hsm_release+0xae7/0x3b60 [mdt]
[80782.708051]  ? mdt_hsm_release+0xae7/0x3b60 [mdt]
[80782.708441]  __kasan_report.cold.7+0x37/0x86
[80782.708804]  ? mdt_hsm_release+0xae7/0x3b60 [mdt]
[80782.709201]  kasan_report+0x37/0x50
[80782.709492]  check_memory_region+0x198/0x200
[80782.709839]  memset+0x1f/0x40
[80782.710119]  mdt_hsm_release+0xae7/0x3b60 [mdt]
[80782.710514]  mdt_mfd_close+0x4b5/0x2970 [mdt]
[80782.710897]  mdt_close_internal+0x29b/0x7c0 [mdt]
[80782.711314]  mdt_close+0x586/0x1510 [mdt]
[80782.711913]  tgt_request_handle+0x1c82/0x4250 [ptlrpc]
[80782.712400]  ? tgt_brw_write+0x4c80/0x4c80 [ptlrpc]
[80782.712849]  ? libcfs_id2str+0x104/0x190 [lnet]
[80782.713248]  ptlrpc_server_handle_request+0xa5e/0x1ff0 [ptlrpc]
[80782.713783]  ptlrpc_main+0x1aa6/0x2e60 [ptlrpc]
[80782.714165]  ? __kthread_parkme+0xc4/0x190
[80782.714555]  ? ptlrpc_wait_event+0x1230/0x1230 [ptlrpc]
[80782.714990]  kthread+0x344/0x410
[80782.715290]  ? kthread_insert_work_sanity_check+0xd0/0xd0
[80782.715743]  ret_from_fork+0x24/0x50
[80782.716048]
[80782.716176] Allocated by task 821178:
[80782.716475]  kasan_save_stack+0x19/0x80
[80782.716773]  __kasan_kmalloc.constprop.9+0xc1/0xd0
[80782.717170]  __kmalloc+0x143/0x260
[80782.717487]  null_alloc_rs+0x1d6/0x7c0 [ptlrpc]
[80782.717903]  sptlrpc_svc_alloc_rs+0x19c/0x850 [ptlrpc]
[80782.718429]  lustre_pack_reply_v2+0x14c/0x8b0 [ptlrpc]
[80782.718881]  lustre_pack_reply_flags+0x126/0x380 [ptlrpc]
[80782.719367]  req_capsule_server_pack+0xa7/0x1f0 [ptlrpc]
[80782.719822]  mdt_close+0x377/0x1510 [mdt]
[80782.720205]  tgt_request_handle+0x1c82/0x4250 [ptlrpc]
[80782.720675]  ptlrpc_server_handle_request+0xa5e/0x1ff0 [ptlrpc]
[80782.721198]  ptlrpc_main+0x1aa6/0x2e60 [ptlrpc]
[80782.721607]  kthread+0x344/0x410
[80782.721888]  ret_from_fork+0x24/0x50
[80782.722196]
[80782.722319] Last call_rcu():
[80782.722570]  kasan_save_stack+0x19/0x80
[80782.722896]  kasan_record_aux_stack+0x9e/0xb0
[80782.723289]  call_rcu+0x1a3/0x1020
[80782.723557]  queue_rcu_work+0x52/0x70
[80782.723859]  process_one_work+0x8f0/0x1770
[80782.724205]  worker_thread+0x87/0xb40
[80782.724507]  kthread+0x344/0x410
[80782.724774]  ret_from_fork+0x24/0x50
[80782.725069]
[80782.725205] Second to last call_rcu():
[80782.725511]  kasan_save_stack+0x19/0x80
[80782.725831]  kasan_record_aux_stack+0x9e/0xb0
[80782.726212]  call_rcu+0x1a3/0x1020
[80782.726500]  __percpu_ref_switch_mode+0x2ad/0x6c0
[80782.726890]  percpu_ref_kill_and_confirm+0x82/0x2ed
[80782.727290]  cgroup_destroy_locked+0x246/0x5e0
[80782.727633]  cgroup_rmdir+0x2f/0x2c0
[80782.727917]  kernfs_iop_rmdir+0x131/0x1b0
[80782.728260]  vfs_rmdir+0x142/0x3c0
[80782.728545]  do_rmdir+0x2b2/0x340
[80782.728822]  do_syscall_64+0xa5/0x430
[80782.729123]  entry_SYSCALL_64_after_hwframe+0x6a/0xdf
[80782.729509]
[80782.729635] The buggy address belongs to the object at ffff88811fe6f000
[80782.729635]  which belongs to the cache kmalloc-1k of size 1024
[80782.730641] The buggy address is located 784 bytes inside of
[80782.730641]  1024-byte region [ffff88811fe6f000, ffff88811fe6f400)
[80782.731622] The buggy address belongs to the page:
[80782.731971] page:ffffea00047f9a00 refcount:1 mapcount:0 mapping:00000000f4753386 index:0xffff88811fe69000 head:ffffea00047f9a00 order:3 compound_mapcount:0 compound_pincount:0
[80782.733075] flags: 0x17ffffc0008100(slab|head)
[80782.733396] raw: 0017ffffc0008100 ffffea0004d2a608 ffff888100001150 ffff88810000e140
[80782.733939] raw: ffff88811fe69000 00000000000a0009 00000001ffffffff 0000000000000000
[80782.734561] page dumped because: kasan: bad access detected
[80782.735053]
[80782.735186] Memory state around the buggy address:
[80782.735551]  ffff88811fe6f200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[80782.736096]  ffff88811fe6f280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[80782.736653] >ffff88811fe6f300: 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[80782.737258]                          ^
[80782.737635]  ffff88811fe6f380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[80782.738244]  ffff88811fe6f400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[80782.738849] ==================================================================
[80783.859016] Lustre: DEBUG MARKER: == sanity-hsm test 22: Could not swap a release file ===== 07:57:13 (1653656233)
