Details
-
Bug
-
Resolution: Unresolved
-
Minor
Description
If mdt.*.hsm.max_requests is set to a very large number, e.g. 2^64-1 (the maximum unsigned 64-bit value), the system crashes when it receives the first HSM request. This is related to the following code in mdt_coordinator(), which copies cdt_max_requests into an int before computing the allocation size:
CDEBUG(D_HSM, "coordinator starts reading llog\n");

if (hsd.hsd_request_len != cdt->cdt_max_requests) {
	/* cdt_max_requests has changed,
	 * we need to allocate a new buffer */
	struct hsm_scan_request *tmp = NULL;
	int max_requests = cdt->cdt_max_requests;

	OBD_ALLOC_LARGE(tmp, max_requests *
			sizeof(struct hsm_scan_request));
	if (!tmp) {
		CERROR("Failed to resize request buffer, "
		       "keeping it at %d\n",
		       hsd.hsd_request_len);
	} else {
		....
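One possible way to guard the size computation in the snippet above is to validate the requested count in its full 64-bit width before multiplying, instead of narrowing it to an int first. The following is only a userspace sketch under stated assumptions, not the actual Lustre fix: `struct scan_request_stub`, `MAX_REQUESTS_CAP`, and `request_buf_size()` are hypothetical names introduced for illustration, and the cap value is arbitrary.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in; the real struct hsm_scan_request lives in Lustre. */
struct scan_request_stub {
	uint64_t fields[4];
};

/* Hypothetical upper bound, chosen for illustration only. */
#define MAX_REQUESTS_CAP ((uint64_t)1 << 20)

/*
 * Compute count * elem_size without first truncating count to int.
 * Returns 0 and stores the byte size in *out on success.
 * Returns -1 if count is zero, exceeds the cap, or the product
 * would not fit in size_t.
 */
static int request_buf_size(uint64_t count, size_t elem_size, size_t *out)
{
	if (count == 0 || count > MAX_REQUESTS_CAP)
		return -1;
	if (elem_size != 0 && count > SIZE_MAX / elem_size)
		return -1;
	*out = (size_t)count * elem_size;
	return 0;
}
```

With a check of this kind, a caller could reject a value such as 2^64-1 and keep the existing buffer before any allocation is attempted. Whether the limit belongs here or where the parameter is written is a design choice this sketch does not settle.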
The system logs showed:
kernel: LustreError: 6312:0:(mdt_coordinator.c:714:mdt_coordinator()) vmalloc of 'tmp' (0 bytes) failed
...
kernel: LustreError: 6312:0:(mdt_coordinator.c:718:mdt_coordinator()) Failed to resize request buffer, keeping it at 1048576
...
kernel: hsm_cdtr: vmalloc: allocation failure: 17179869184 bytes, mode:0x608042(GFP_NOFS|__GFP_HIGHMEM|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0
...
kernel: CPU: 2 PID: 6312 Comm: hsm_cdtr Kdump: loaded Tainted: G OE -------- - - 4.18.0-553.27.1.el8.aarch64 #1
kernel: Hardware name: VMware, Inc. VMware20,1/VBSA, BIOS VMW201.00V.24006586.BA64.2406042154 06/04/2024
kernel: Call trace:
kernel: dump_backtrace+0x0/0x178
kernel: show_stack+0x28/0x38
kernel: dump_stack+0x68/0x8c
kernel: warn_alloc+0x10c/0x190
kernel: __vmalloc_node_range+0x218/0x2e0
kernel: __vmalloc+0x84/0xa8
kernel: mdt_coordinator+0x1010/0x1a68 [mdt]
kernel: kthread+0x150/0x160
kernel: ret_from_fork+0x10/0x18
kernel: Mem-Info:
...