socklnd needs improved interface selection and configuration (LU-14064)

[LU-13625] socklnd can select the wrong IP address to bind socket to Created: 02/Jun/20  Updated: 09/May/21

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Technical task Priority: Minor
Reporter: Amir Shehata (Inactive) Assignee: Amir Shehata (Inactive)
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Related
is related to LU-10391 LNET: Support IPv6 Reopened

Description

choose_ipv4_src() treats the choice of IP address as unimportant, as long as the socket is bound to the interface. The reasoning is that binding to the interface is the only thing that matters, because it makes sure traffic goes out the NIC you expect.
However, I see a drawback with not binding to a specific IP address. Two tests I ran show the drawback. The first is an on-the-nose test:
TEST ONE
1. I configure two IP addresses on one interface.
2. I add a REJECT rule on the second IP address, so that all traffic using the second IP address is dropped.
3. I configure LNet to use that interface.
4. All LNet traffic binds to the second address, and all of it is dropped.
TEST TWO
1. I configure routing and rules so that anything using the first IP address on each interface egresses the desired NIC.
2. I run an LNet self-test and monitor traffic via "netstat -i".
3. Since the socket is bound to the second IP address on each interface, the rules are not hit and traffic is not distributed as expected.

The stats below show that only eth0 is being used for TX:

Kernel Interface table
Iface      MTU    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0      1500   115139      0   1840 0       1025644      0      0      0 BMRU
eth1      1500   115627      0   3597 0            24      0      0      0 BMRU
eth2      1500   179972      0    132 0            23      0      0      0 BMRU
lo       65536      372      0      0 0           372      0      0      0 LRU
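
Both tests come down to which local address the socket is bound to before it connects. As a minimal user-space sketch (Python rather than the kernel socklnd code, and not the change in the patch), binding to an explicit address before connect() pins the source IP that the peer sees, and therefore the address that any source-based routing or firewall rule matches. The helper names connect_from and demo are hypothetical, and the demo uses loopback only so that it runs on any host:

```python
import socket


def connect_from(src_ip, dst_ip, dst_port):
    """Open a TCP connection whose source address is pinned to src_ip."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 lets the kernel pick an ephemeral port; the address is fixed.
        sock.bind((src_ip, 0))
        sock.connect((dst_ip, dst_port))
    except OSError:
        sock.close()
        raise
    return sock


def demo(src_ip="127.0.0.1"):
    """Connect to a local listener and return the source IP the peer sees."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        port = listener.getsockname()[1]
        client = connect_from(src_ip, "127.0.0.1", port)
        try:
            conn, peer = listener.accept()
            conn.close()
        finally:
            client.close()
    finally:
        listener.close()
    # peer is an (address, port) tuple; the address is the bound source IP.
    return peer[0]
```

On a host with two addresses on one interface, passing the first or the second address as src_ip would decide which rules the connection matches, which is the behavior both tests depend on.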

LNet is a virtual networking layer on top of the physical network. The admin should have control over not only the interface LNet uses, but also the IP address. Admittedly, our user space tools currently only bind to the interface (although ip2nets allows specifying IP ranges). However, especially as Ethernet becomes more widely used on sites, I see situations where admins might want to configure multiple IP addresses, route packets differently depending on the IP address, and have Lustre go over one specific route while applications using the second IP address are handled differently.
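
One way to picture that kind of control is a selection step that honors an admin-supplied address or range and falls back to a default only when none is given. The following is a sketch under assumptions: choose_src_addr and its wanted parameter are hypothetical, and they do not correspond to an existing LNet option or to the logic in socklnd.

```python
import ipaddress


def choose_src_addr(iface_addrs, wanted=None):
    """Pick the local address to bind to from the addresses on one interface.

    iface_addrs: address strings in the order the interface reports them.
    wanted: optional single address or CIDR range chosen by the admin
            (a hypothetical knob, used here only for illustration).
    """
    if not iface_addrs:
        raise ValueError("interface has no addresses")
    if wanted is None:
        # No preference given: fall back to the first address.
        return iface_addrs[0]
    # A bare address such as "10.0.0.2" is treated as a /32 network.
    net = ipaddress.ip_network(wanted, strict=False)
    for addr in iface_addrs:
        if ipaddress.ip_address(addr) in net:
            return addr
    raise LookupError(f"no address on the interface matches {wanted}")
```

For example, with addresses ["10.0.0.1", "10.0.0.2"] on one interface, choose_src_addr(addrs, "10.0.0.2") selects the second address, while omitting wanted selects the first. Raising an error when nothing matches, rather than silently using another address, is a design choice in this sketch so that a misconfiguration is visible.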



Comments
Comment by Gerrit Updater [ 02/Jun/20 ]

Amir Shehata (ashehata@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38812
Subject: LU-13625 socklnd: bind socket to correct IP addr
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 54f92a4d0228d8b8eea6e30585a033872dafc575
