[LU-6130] Number of LNET NI's limited Created: 15/Jan/15  Updated: 15/Oct/20

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Blake Caldwell Assignee: Amir Shehata (Inactive)
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Duplicate
is duplicated by LU-7562 Increase LNET_MAX_INTERFACES Closed
Related
is related to LU-6901 LNet: Support dynamic number of NIDs ... Open
Severity: 3
Rank (Obsolete): 17076

 Description   

While investigating the number of maximum possible NI's that can be added with DLC, I found that the structure lnet_ping_info_t is only expected to have a maximum of 16 entries in pi_ni.

./lnet/include/lnet/lib-types.h:460

typedef struct {
__u32 pi_magic;
__u32 pi_features;
lnet_pid_t pi_pid;
__u32 pi_nnis;
lnet_ni_status_t pi_ni[0];
} WIRE_ATTR lnet_ping_info_t;

/* router checker data, per router */
#define LNET_MAX_RTR_NIS 16
#define LNET_PINGINFO_SIZE offsetof(lnet_ping_info_t, pi_ni[LNET_MAX_RTR_NIS])

Following the call chain from adding a new NI with DLC....

./lnet/utils/lnetconfig/liblnetconfig.c:
l_ioctl(LNET_DEV_ID, IOC_LIBCFS_ADD_NET, &data)

./lnet/lnet/module.c:
l_ioctl() -> lnet_dyn_configure() -> lnet_dyn_add_ni()

lnet/lnet/api-ni.c:
lnet_dyn_add_ni() -> lnet_ping_info_setup() -> lnet_ping_info_create()

This is where the pi_ni array mentioned above is used:

lnet_ping_info_create(int num_ni)
...
infosz = offsetof(lnet_ping_info_t, pi_ni[num_ni]);

This limit would appear designed to apply to routers, but what about an OSS with an arbitrary number (N) of NI's (e.g. o2ib0…o2ibN)?



 Comments   
Comment by Jian Yu [ 15/Jan/15 ]

Hi Isaac,

I saw that "#define LNET_MAX_RTR_NIS 16" was created by you in commit 5e1e6a6756d3b4ca19a0d7e0defcf974dbfed13c. Could you please comment on this ticket?

Comment by Amir Shehata (Inactive) [ 29/Jan/15 ]

When creating/adjusting the ping info when adding an NI, there is no limit on the size of the ping info. So it can expand to include all the NIs on the node.

However, when pinging a node the current code is only interested in the first 16 NIDs of the router. This could be an issue in the following scenario:

Router A has 17 NIs, 0 to 16
Node B has a route to Router A using the NID-16.
Node B is configured to ping routers to check if they are alive
Node B pings Router A NID-16
Router A responds with NIDs 0-15
Node B iterates all Router A's NIDs but doesn't find NID-16, so it considers that Router dead.

This is a corner scenario, since it requires a router with > 16 NIDs and other nodes have routes to the router using NIDs > 16 as a route.

This also affect lctl ping, but this is not critical, since lctl ping is only used by sysadmins to ping nodes. I don't think it's a problem since ping response will still be displayed, but only the first 16 NIDs will be displayed.

The reason for the number 16, is because we have to allocate a specific size to receive ping info. When we ping a node we don't know how many NIDs it has so we need to select a size for the buffer to receive the ping info. 16 NIDs seemed like a reasonable number to go with.

I'm not sure about your concern with regards to the OSS number of NIs. Could you please explain your concern further.

Comment by Jian Yu [ 23/Apr/15 ]

Hi Blake,

Did Amir answer your question? If not, could you please explain your concern further?

Comment by James A Simmons [ 23/Apr/15 ]

Today's corner cases will be tomorrows default settings. Perhaps we should look at fixing this. Besides LNET_MAX_RTR_NIS also LNET_PINGINFO_SIZE would have to be handled dynamically.

Comment by Blake Caldwell [ 23/Apr/15 ]

Yes, that was a clear explanation. Thank you, Amir. I understand now. So the number of LNET NI's should not be limited, which strictly speaking, was the original concern of my ticket. Creating more than 16 "flat" NI's is desirable. However, as James alluded to, this limitation could be a problem as soon as a router is needed in this configuration (o2ib to tcp, or segmented IB networks).

Comment by Amir Shehata (Inactive) [ 23/Apr/15 ]

I'll create a ticket to track this enhancement. I do agree that we should be stepping away from hard coded static values.

Comment by John Hammond [ 15/Oct/20 ]

More discussion from Andreas on LU-7562.

Generated at Sat Feb 10 01:57:28 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.