Details
Bug
Resolution: Fixed
Minor
Description
OBD_ALLOC_LARGE falls back to vmalloc if the kmalloc allocation fails. In practice, the VM tries so hard to satisfy the kmalloc request that the system spends a significant amount of time in try_to_free_pages() allocation loops instead of falling back to vmalloc().
#define OBD_ALLOC_LARGE(ptr, size) \
do { \
	/* LU-8196 - force large allocations to use vmalloc, not kmalloc */ \
	if ((size) > KMALLOC_MAX_SIZE) \
		ptr = NULL; \
	else \
		OBD_ALLOC_GFP(ptr, size, GFP_NOFS | __GFP_NOWARN); \
	if (ptr == NULL) \
		OBD_VMALLOC(ptr, size); \
} while (0)
The in-kernel (linux-4.18) implementation of kvmalloc() is smarter:
	/*
	 * We want to attempt a large physically contiguous block first because
	 * it is less likely to fragment multiple larger blocks and therefore
	 * contribute to a long term fragmentation less than vmalloc fallback.
	 * However make sure that larger requests are not too disruptive - no
	 * OOM killer and no allocation failure warnings as we have a fallback.
	 */
	if (size > PAGE_SIZE) {
		kmalloc_flags |= __GFP_NOWARN;

		if (!(kmalloc_flags & __GFP_RETRY_MAYFAIL))
			kmalloc_flags |= __GFP_NORETRY;
	}

	ret = kmalloc_node(size, kmalloc_flags, node);
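__GFP_NORETRY can be used in OBD_ALLOC_LARGE() in the same way and for the same purpose: the kmalloc attempt gives up early, and the macro falls back to vmalloc() without prolonged reclaim.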
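The flag adjustment quoted from kvmalloc() above can be modeled in user space to make the decision explicit. This is only a sketch: the MOCK_* constants and the large_alloc_flags() helper are hypothetical stand-ins, their values do not match the real kernel GFP bits, and this is not the actual Lustre patch.

```c
#include <stddef.h>

/* Hypothetical stand-ins for kernel GFP bits; values are illustrative only. */
#define MOCK_GFP_NOFS          0x01u
#define MOCK_GFP_NOWARN        0x02u
#define MOCK_GFP_NORETRY       0x04u
#define MOCK_GFP_RETRY_MAYFAIL 0x08u
#define MOCK_PAGE_SIZE         4096u

/*
 * Model of the kvmalloc() flag selection: for requests larger than one
 * page, suppress failure warnings and, unless the caller explicitly asked
 * for RETRY_MAYFAIL, ask the allocator not to retry, so that the caller
 * can fall back to vmalloc quickly.
 */
static unsigned int large_alloc_flags(size_t size, unsigned int flags)
{
	if (size > MOCK_PAGE_SIZE) {
		flags |= MOCK_GFP_NOWARN;
		if (!(flags & MOCK_GFP_RETRY_MAYFAIL))
			flags |= MOCK_GFP_NORETRY;
	}
	return flags;
}
```

In this model, a caller in the position of OBD_ALLOC_LARGE() would pass the flags it already uses (GFP_NOFS in the macro above) and hand the result to the kmalloc attempt before trying vmalloc.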
Here is an example of failed memory allocations on a heavily loaded system (CAST-31591):
2022-11-17 17:32:44 [974463.901303] Pid: 31862, comm: ll_ost_io00_744 3.10.0-957.1.3957.1.3.x3.5.46.x86_64 #1 SMP Thu Jan 20 13:08:08 CST 2022
2022-11-17 17:32:44 [974463.913928] Call Trace:
2022-11-17 17:32:44 [974463.918313] [<0>] __cond_resched+0x26/0x30
2022-11-17 17:32:44 [974463.924347] [<0>] shrink_page_list+0x97/0xc30
2022-11-17 17:32:44 [974463.930623] [<0>] shrink_inactive_list+0x1c6/0x5d0
2022-11-17 17:32:44 [974463.937296] [<0>] shrink_lruvec+0x385/0x730
2022-11-17 17:32:44 [974463.943307] [<0>] shrink_zone+0x76/0x1a0
2022-11-17 17:32:44 [974463.949020] [<0>] do_try_to_free_pages+0xf0/0x4e0
2022-11-17 17:32:44 [974463.955486] [<0>] try_to_free_pages+0xfc/0x180
2022-11-17 17:32:44 [974463.961645] [<0>] __alloc_pages_slowpath+0x457/0x724
2022-11-17 17:32:44 [974463.968328] [<0>] __alloc_pages_nodemask+0x405/0x420
2022-11-17 17:32:44 [974463.974996] [<0>] alloc_pages_current+0x98/0x110
2022-11-17 17:32:44 [974463.981328] [<0>] __get_free_pages+0xe/0x40
2022-11-17 17:32:44 [974463.987227] [<0>] kmalloc_order_trace+0x2e/0xa0
2022-11-17 17:32:44 [974463.993478] [<0>] __kmalloc+0x211/0x230
2022-11-17 17:32:44 [974463.999085] [<0>] ptlrpc_new_bulk+0x13a/0x870 [ptlrpc]