Type: Bug
Resolution: Unresolved
Priority: Medium
Fix Version/s: None
Affects Version/s: None
Labels:
None

Severity:
3
Rank (Obsolete):
9223372036854775807

Environment:

Lustre version: v2_17_52
Kernel: Rocky 9.7: 5.14.0-611.27.1.el9_7.x86_64

While testing the addition of large numbers of NIDs to a server, I hit a crash triggered by 'lnetctl net show' once the number of NIDs reached a certain threshold.

On analyzing this crash, I was able to consistently reproduce this behaviour using the attached script which simply creates a number of LNET NIDs on a server and calls 'lnetctl net show -v 4'.

[servers] [root@server1 ~]# ./reproducer.sh --nids 150 --verbosity 4
=================================================
 Reproducer                                      
 Max NIDs:  150                                
 Verbosity: 4                           
=================================================
[Pre-flight] Checking modules and limits...
[Test] Beginning incremental interface addition...
  - 1 NIDs : PASS (Binary: 9176 bytes | YAML: 14681 bytes)
  - 2 NIDs : PASS (Binary: 9648 bytes | YAML: 16173 bytes)
  - 3 NIDs : PASS (Binary: 10120 bytes | YAML: 17665 bytes)
...
  - 127 NIDs : PASS (Binary: 65524 bytes | YAML: 191374 bytes)   
--> *CRASH*

I have attached the script and crashdump to the ticket.

I also include the following analysis from Claude of what it determines the root cause to be. The crash appears to be triggered when the netlink payload generated by 'lnetctl net show' exceeds the 64 KiB limit of a single buffer.

There is a flaw in lnet_net_show_dump() (lnet/lnet/api-ni.c) that reliably panics the kernel or triggers an infinite loop when the number of configured NIs causes the netlink dump payload to exceed the 64 KiB skb buffer limit. The reproducer confirms that when the binary netlink payload crosses 65,535 bytes (e.g., ~128 dummy NIDs at verbosity -v 4), the kernel attempts to fall back to the multi-skb chunking logic but fails due to missing return checks and broken state-resumption logic.
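The missing return check can be illustrated with a small userspace model. This is only a sketch: `model_buf`, `model_attr`, `model_nest_start()`, `model_nest_end()`, `model_emit_ni()` and `model_fill()` are hypothetical stand-ins written for this ticket, not the kernel netlink API and not the Lustre code. The model assumes a fixed-size buffer whose nest-start helper returns NULL when full, and shows the checked pattern: test every return value and roll back the partially written NI before reporting "no space" (in the kernel, nla_nest_cancel() plays a similar rollback role).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; NOT the kernel netlink API. */
#define MODEL_SLOTS 5                   /* assumed tiny buffer capacity */

struct model_attr {
	uint16_t len;                   /* 16-bit length, like nla_len */
	uint16_t type;
};

struct model_buf {
	struct model_attr slots[MODEL_SLOTS];
	size_t used;
};

/* Reserve one attribute header; return NULL when the buffer is full. */
static struct model_attr *model_nest_start(struct model_buf *b, uint16_t type)
{
	struct model_attr *a;

	if (b->used >= MODEL_SLOTS)
		return NULL;
	a = &b->slots[b->used++];
	a->type = type;
	a->len = 0;
	return a;
}

/*
 * Close a nest by writing its length. There is no NULL guard here,
 * so the caller must have checked the pointer from model_nest_start().
 */
static void model_nest_end(struct model_buf *b, struct model_attr *start)
{
	size_t first = (size_t)(start - b->slots);

	start->len = (uint16_t)((b->used - first) * sizeof(*start));
}

/* Emit one NI as an outer nest holding one inner nest; 0 on success,
 * -1 (standing in for -EMSGSIZE) when the buffer is full. */
static int model_emit_ni(struct model_buf *b, uint16_t ni_type)
{
	size_t mark = b->used;          /* rollback point */
	struct model_attr *outer, *inner;

	outer = model_nest_start(b, ni_type);
	if (!outer)
		goto full;
	inner = model_nest_start(b, (uint16_t)(ni_type + 100));
	if (!inner)
		goto full;
	model_nest_end(b, inner);
	model_nest_end(b, outer);
	return 0;
full:
	b->used = mark;                 /* drop the partially written NI */
	return -1;
}

/* Emit up to count NIs; return how many fit completely. */
static int model_fill(struct model_buf *b, int count)
{
	int i;

	for (i = 0; i < count; i++)
		if (model_emit_ni(b, (uint16_t)i) != 0)
			break;
	return i;
}
```

With the assumed capacity of 5 slots and 2 slots per NI, model_fill() on an empty buffer stops after two complete NIs and leaves no half-written third NI behind, instead of passing a NULL pointer to model_nest_end().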

The primary issue causing the kernel panic is that the function makes approximately 18 sequential calls to nla_nest_start() without checking for a NULL return. When the skb buffer fills up, nla_nest_start() returns NULL. Because this is ignored, the subsequent nla_nest_end(msg, NULL) attempts to write the 16-bit nla_len field to address 0, instantly seizing the node with a NULL pointer dereference.

Furthermore, even if the buffer overflow perfectly aligns with a network boundary (avoiding the mid-NI panic), the multi-skb resumption path is functionally broken. When genlmsg_put fails due to lack of space, the code executes GOTO(net_unlock, rc = -EMSGSIZE), completely bypassing the nlist->lngl_idx = idx state-saving assignment. Compounding this, the function initializes idx = nlist->lngl_idx instead of 0, meaning the if (idx++ < nlist->lngl_idx) continue; check never actually skips previously processed NIs. This throws the kernel into an infinite loop, returning the exact same chunk of networks repeatedly until lnetctl crashes in userspace with a realloc(): invalid next size heap corruption.
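The resumption problem described above can be reproduced in a userspace model. This is a sketch under stated assumptions, not the Lustre implementation: `model_dump_state`, `model_dump_pass()`, `model_count_passes()`, and the constants are hypothetical names and values chosen for illustration, with `saved_idx` playing the role of nlist->lngl_idx. With `fixed == 0` the model follows the behaviour as described in this ticket (idx starts at the saved index, and the saved index is not updated on the "no space" path). With `fixed != 0` it shows one possible correction (idx starts at 0, and the index is saved before returning).

```c
/* Hypothetical model values; not taken from the Lustre source. */
#define MODEL_TOTAL_NIS 10      /* NIs to dump in total */
#define MODEL_PER_CHUNK 4       /* NIs that fit in one chunk */

struct model_dump_state {
	int saved_idx;          /* plays the role of nlist->lngl_idx */
};

/*
 * Run one dump pass. Returns the number of NIs emitted in this chunk
 * and stores the first NI emitted in *first (-1 if none was emitted).
 */
static int model_dump_pass(struct model_dump_state *st, int fixed, int *first)
{
	int idx = fixed ? 0 : st->saved_idx;
	int emitted = 0;
	int ni;

	*first = -1;
	for (ni = 0; ni < MODEL_TOTAL_NIS; ni++) {
		/* Intended to skip NIs already sent in earlier chunks. */
		if (idx++ < st->saved_idx)
			continue;
		if (emitted == MODEL_PER_CHUNK) {
			/* Chunk is full: stands in for the -EMSGSIZE path. */
			if (fixed)
				st->saved_idx = ni;     /* resume here next pass */
			return emitted;
		}
		if (*first < 0)
			*first = ni;
		emitted++;
	}
	st->saved_idx = MODEL_TOTAL_NIS;        /* everything has been sent */
	return emitted;
}

/*
 * Count the passes that emit data before an empty pass ends the dump.
 * Returns max_passes if the dump never terminates within that cap.
 */
static int model_count_passes(int fixed, int max_passes)
{
	struct model_dump_state st = { 0 };
	int first;
	int passes;

	for (passes = 0; passes < max_passes; passes++)
		if (model_dump_pass(&st, fixed, &first) == 0)
			return passes;
	return max_passes;
}
```

In this model the corrected variant finishes after three data-carrying passes (NIs 0-3, 4-7, 8-9), while the variant modelled on the ticket's description emits NIs 0-3 on every pass and only stops at the caller's cap, which matches the repeated-chunk behaviour reported above.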


Attachments:

reproducer.sh (3 kB, 05/May/26 10:49 AM)
vmcore (88.37 MB, 05/May/26 10:59 AM)
vmcore-dmesg.txt (113 kB, 05/May/26 10:47 AM)

Is related to:

LU-6130 Number of LNET NI's limited (Open)
LU-14391 Large network routes (Resolved)
LU-17451 `lctl dl` with Netlink/YAML fails with large numer of devices (Resolved)
LU-18417 Finish IPv6 support (Open)

Assignee: Malkeet Singh

Reporter: Matt Rásó-Barnett

Votes: 0

Watchers: 5

Created: 05/May/26 11:02 AM

Updated: 2 days ago 3:37 PM
