Details
Type: Bug
Resolution: Unresolved
Priority: Medium
Description
When the number of NIs created exceeds lnet_interfaces_max, an LBUG is triggered:
[ 461.968115] LNet: Added LNI 10.1.2.51@tcp2 [8/256/0/180]
[ 461.968266] LNetError: 68600:0:(api-ni.c:2171:lnet_ping_target_install_locked()) ASSERTION( !rc ) failed: Invalid ping target: -34
[ 461.968425] LNetError: 68600:0:(api-ni.c:2171:lnet_ping_target_install_locked()) LBUG
[ 461.968501] CPU: 1 PID: 68600 Comm: lnetctl Kdump: loaded Tainted: G OE --------- - - 4.18.0-477.27.1.el8_lustre.ddn17.x86_64 #1
[ 461.968649] Hardware name: Red Hat KVM, BIOS 1.13.0-2.module_el8.5.0+746+bbd5d70c 04/01/2014
[ 461.968740] Call Trace:
[ 461.968829] dump_stack+0x41/0x60
[ 461.968919] lbug_with_loc.cold.6+0x5/0x43 [libcfs]
[ 461.969009] lnet_ping_target_update+0x87e/0x8a0 [lnet]
[ 461.969167] lnet_add_net_common+0x290/0x4b0 [lnet]
[ 461.969269] lnet_dyn_add_ni+0x12b/0x1d0 [lnet]
[ 461.969376] lnet_genl_parse_local_ni.isra.59+0x30a/0x1a50 [lnet]
[ 461.969483] ? libcfs_str2net_internal+0xee/0x150 [lnet]
[ 461.969586] lnet_net_cmd+0x53d/0x950 [lnet]
[ 461.969693] genl_family_rcv_msg_doit.isra.17+0x113/0x150
[ 461.969796] genl_family_rcv_msg+0xb7/0x170
[ 461.969888] ? lnet_dyn_del_net+0x220/0x220 [lnet]
[ 461.969993] ? lnet_udsp_info_send+0x460/0x460 [lnet]
[ 461.970098] ? lnet_parse_peer_nis.constprop.61+0x690/0x690 [lnet]
[ 461.970206] ? lnet_ping_event_handler+0x130/0x130 [lnet]
[ 461.970315] genl_rcv_msg+0x47/0xa0
[ 461.970410] ? genl_family_rcv_msg+0x170/0x170
[ 461.970507] netlink_rcv_skb+0x4c/0x130
[ 461.970603] genl_rcv+0x24/0x40
[ 461.970731] netlink_unicast+0x19a/0x230
[ 461.970830] netlink_sendmsg+0x204/0x3d0
[ 461.970928] sock_sendmsg+0x50/0x60
[ 461.971032] ____sys_sendmsg+0x22a/0x250
[ 461.971134] ? copy_msghdr_from_user+0x5c/0x90
[ 461.971237] ___sys_sendmsg+0x7c/0xc0
[ 461.971342] ? __raw_spin_unlock+0x5/0x10
[ 461.971448] ? handle_pte_fault+0x770/0x880
[ 461.971553] ? __handle_mm_fault+0x453/0x6c0
[ 461.971659] __sys_sendmsg+0x57/0xa0
[ 461.971769] do_syscall_64+0x5b/0x1b0
[ 461.971877] entry_SYSCALL_64_after_hwframe+0x61/0xc6
[ 461.971995] RIP: 0033:0x7fc82f30fa98
This case should be handled gracefully: return an error to the caller instead of asserting and crashing the node.
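As a rough illustration of the requested behaviour, the sketch below checks the NI count against a limit before adding an interface and returns the error to the caller rather than asserting. It is a user-space model only, not the actual LNet code: `sketch_ni_table`, `sketch_check_ni_limit()`, `sketch_add_ni()` and `SKETCH_INTERFACES_MAX` are hypothetical names, and the exact comparison against lnet_interfaces_max is an assumption. The -34 in the log matches -ERANGE on Linux, so the sketch returns that value.

```c
#include <errno.h>

/* Hypothetical stand-in for the lnet_interfaces_max tunable. */
#define SKETCH_INTERFACES_MAX 4

/* Hypothetical, simplified model of the local NI table. */
struct sketch_ni_table {
	int nt_count;	/* number of NIs currently configured */
};

/*
 * Return 0 if one more NI fits, or -ERANGE (-34 on Linux, the value
 * reported in the assertion message) if the limit is already reached.
 */
static int sketch_check_ni_limit(const struct sketch_ni_table *tbl)
{
	if (tbl->nt_count >= SKETCH_INTERFACES_MAX)
		return -ERANGE;
	return 0;
}

/*
 * Add an NI, propagating the failure to the caller instead of
 * asserting that rc is zero.
 */
static int sketch_add_ni(struct sketch_ni_table *tbl)
{
	int rc = sketch_check_ni_limit(tbl);

	if (rc)
		return rc;	/* caller can report this to user space */

	tbl->nt_count++;
	return 0;
}
```

With this shape, the request that exceeds the limit fails with an error code that the configuration tool can report, and the NIs that were already added stay untouched.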