[LU-18924] Super big mdt.*.hsm.max_requests will cause system crash. - Whamcloud Community JIRA

Details

Type: Bug
Resolution: Fixed
Priority: Minor
Fix Version/s: None
Affects Version/s: None
Labels: None

Severity: 3

Description

If mdt.*.hsm.max_requests is set to a huge number, e.g. 2^64-1 (the maximum unsigned 64-bit integer), the system will crash when it receives the first HSM request. This is related to the following code in the function mdt_coordinator():

                CDEBUG(D_HSM, "coordinator starts reading llog\n");

                if (hsd.hsd_request_len != cdt->cdt_max_requests) {
                        /* cdt_max_requests has changed,
                         * we need to allocate a new buffer
                         */
                        struct hsm_scan_request *tmp = NULL;
                        int max_requests = cdt->cdt_max_requests;
                        OBD_ALLOC_LARGE(tmp, max_requests *
                                        sizeof(struct hsm_scan_request));
                        if (!tmp) {
                                CERROR("Failed to resize request buffer, "
                                       "keeping it at %d\n",
                                       hsd.hsd_request_len);
                        } else {
                                 ....
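
To illustrate the hazard in the excerpt above: the configured value is copied into an int and then multiplied by the element size with no range check, so a very large setting can be truncated or produce a nonsensical allocation size. The following is a minimal sketch of one possible guard, not the fix that was actually merged for this ticket. It assumes the configured value is a 64-bit quantity (as the 2^64-1 example suggests). HSM_MAX_REQUESTS_LIMIT, struct scan_request_stub and request_buffer_size() are hypothetical names introduced only for this example and do not exist in Lustre.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical upper bound for illustration; not a value taken from Lustre. */
#define HSM_MAX_REQUESTS_LIMIT 65536ULL

/* The accepted range must fit in an int, since the excerpt stores it in one. */
static_assert(HSM_MAX_REQUESTS_LIMIT <= INT_MAX, "limit must fit in an int");

/* Stand-in for struct hsm_scan_request; the real layout differs. */
struct scan_request_stub {
	void *hal;
	int used_sz;
	int count;
	int max_sz;
};

/*
 * Compute the byte size of a buffer holding `requested` entries.
 * Returns false, leaving *bytes untouched, when the count is zero,
 * exceeds the limit, or the multiplication would overflow size_t.
 */
static bool request_buffer_size(uint64_t requested, size_t *bytes)
{
	if (requested == 0 || requested > HSM_MAX_REQUESTS_LIMIT)
		return false;
	if (requested > SIZE_MAX / sizeof(struct scan_request_stub))
		return false;
	*bytes = (size_t)requested * sizeof(struct scan_request_stub);
	return true;
}
```

With a check of this kind, a caller would attempt the allocation only when the function returns true, and would otherwise keep the existing buffer. Rejecting out-of-range values when the parameter is written, rather than in the coordinator loop, would be an equally plausible design.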

The system logs showed:

kernel: LustreError: 6312:0:(mdt_coordinator.c:714:mdt_coordinator()) vmalloc of 'tmp' (0 bytes) failed
...
kernel: LustreError: 6312:0:(mdt_coordinator.c:718:mdt_coordinator()) Failed to resize request buffer, keeping it at 1048576
...
kernel: hsm_cdtr: vmalloc: allocation failure: 17179869184 bytes, mode:0x608042(GFP_NOFS|__GFP_HIGHMEM|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0
...
kernel: CPU: 2 PID: 6312 Comm: hsm_cdtr Kdump: loaded Tainted: G           OE     -------- -  - 4.18.0-553.27.1.el8.aarch64 #1
kernel: Hardware name: VMware, Inc. VMware20,1/VBSA, BIOS VMW201.00V.24006586.BA64.2406042154 06/04/2024
kernel: Call trace:
kernel: dump_backtrace+0x0/0x178
kernel: show_stack+0x28/0x38
kernel: dump_stack+0x68/0x8c
kernel: warn_alloc+0x10c/0x190
kernel: __vmalloc_node_range+0x218/0x2e0
kernel: __vmalloc+0x84/0xa8
kernel: mdt_coordinator+0x1010/0x1a68 [mdt]
kernel: kthread+0x150/0x160
kernel: ret_from_fork+0x10/0x18
kernel: Mem-Info:
...

People

Assignee: Emoly Liu

Reporter: Emoly Liu

Votes: 0

Watchers: 6

Dates

Created: 14/Apr/25 3:24 AM

Updated: 3 days ago 3:35 PM

Resolved: 3 days ago 2:05 PM