Lustre / LU-7562

Increase LNET_MAX_INTERFACES

Details

    • Type: Improvement
    • Resolution: Duplicate
    • Priority: Minor

    Description

      The current LNET_MAX_INTERFACES value, the maximum number of NIDs that a single client can have, is hard-coded to 16, since it is used in a few places in the LNet protocol and user interface. It would be useful on some large SSI systems to allow up to 128 NIs, as long as this does not impose a memory burden on systems that are not configured that large.

      Besides internal data structures and checks, things that need to be changed include:

      • struct lnet_ioctl_net_config hard-codes the number of interfaces in the user-space interface
      • ksocklnd assumes that struct ksock_hello_msg_t has at most LNET_MAX_INTERFACES entries in kshm_ips[], even though kshm_nips could specify more

      A quick look through the code shows that only struct lnet_ioctl_net_config encodes LNET_MAX_INTERFACES directly; most of the other uses are for internal sanity checks and structures. It might be worthwhile to separate these constants so that e.g. LNET_MAX_IB_INTERFACES is separate from (or at most #defined in terms of) LNET_MAX_INTERFACES, to simplify changing this in the future.

      If this is changed in the future, it would be better to define the structure to store the count of interfaces first, followed by the array of interfaces, rather than having a static array of LNET_MAX_INTERFACES entries that is sent regardless of how many are actually in use. That avoids any built-in limits, especially in the external interfaces, so that the maximum becomes at most an internal implementation detail.
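
      A minimal sketch of the count-first layout described above, using a C99 flexible array member. All names here (struct ex_net_config, EX_IF_NAME_LEN, the ex_* helpers) are hypothetical and chosen for illustration; this is not the existing struct lnet_ioctl_net_config, and the name length is a placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define EX_IF_NAME_LEN 16	/* hypothetical per-name length, not an LNet value */

/* Hypothetical layout: the count comes first, then only the entries in use. */
struct ex_net_config {
	uint32_t nc_nintf;			/* number of entries that follow */
	char	 nc_intf[][EX_IF_NAME_LEN];	/* flexible array member */
};

/* Bytes needed to carry nintf interface names. */
static size_t ex_net_config_size(uint32_t nintf)
{
	return offsetof(struct ex_net_config, nc_intf) +
	       (size_t)nintf * EX_IF_NAME_LEN;
}

/* Allocate a zeroed config for nintf names; the caller releases it with free(). */
static struct ex_net_config *ex_net_config_alloc(uint32_t nintf)
{
	struct ex_net_config *cfg = calloc(1, ex_net_config_size(nintf));

	if (cfg != NULL)
		cfg->nc_nintf = nintf;
	return cfg;
}

/* Copy a name into slot idx; returns 0 on success, -1 if it does not fit. */
static int ex_net_config_set(struct ex_net_config *cfg, uint32_t idx,
			     const char *name)
{
	assert(cfg != NULL && name != NULL);
	if (idx >= cfg->nc_nintf || strlen(name) >= EX_IF_NAME_LEN)
		return -1;
	strcpy(cfg->nc_intf[idx], name);
	return 0;
}
```

      With this shape, the size that crosses the interface depends only on the number of interfaces actually configured, so a compile-time maximum need not appear in the structure definition at all.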

      A related question is how many interfaces could be handled within the limits of the current LIBCFS_IOC_DATA_MAX.
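
      The arithmetic behind that question can be sketched as below. The helper ex_max_entries() is hypothetical, and the caller has to supply the real values from the headers (the ioctl data limit, the fixed part of the request, and the per-interface entry size); none of those numbers are taken from the source tree here.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Largest n such that hdr_bytes + n * entry_bytes <= max_bytes.
 * All three inputs are supplied by the caller; nothing here is an LNet value.
 */
static size_t ex_max_entries(size_t max_bytes, size_t hdr_bytes,
			     size_t entry_bytes)
{
	assert(entry_bytes > 0);
	if (hdr_bytes > max_bytes)
		return 0;
	return (max_bytes - hdr_bytes) / entry_bytes;
}
```

      As a purely illustrative calculation with placeholder inputs, a 131072-byte limit with 256 bytes of fixed header and 128 bytes per entry gives ex_max_entries(131072, 256, 128) == 1022.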

      I do see LNET_MAX_RTR_NIS is also 16, which should almost certainly be tied to LNET_MAX_INTERFACES, but the protocol itself doesn't appear to depend on this limit (it sends LNET_PINGINFO_SIZE packets, but these contain pi_nnis that should be used to determine the number of NIs instead of the packet size).
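
      A sketch of a receiver that takes the NI count from pi_nnis and only uses the received length as a bound, rather than inferring the count from a fixed packet size. The structures below are hypothetical stand-ins that borrow the pi_nnis field name from the text; they are not the real LNet wire layout, and byte-swapping between peers of different endianness is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the ping reply; not the real LNet definitions. */
struct ex_ni_status {
	uint64_t ns_nid;
	uint32_t ns_status;
	uint32_t ns_unused;
};

struct ex_ping_info {
	uint32_t pi_magic;
	uint32_t pi_features;
	uint32_t pi_pid;
	uint32_t pi_nnis;		/* number of entries in pi_ni[] */
	struct ex_ni_status pi_ni[];	/* flexible array member */
};

/*
 * Return how many NI entries in a received reply can safely be used:
 * the count the sender declared in pi_nnis, clamped to the entries that
 * actually arrived (nob bytes) and to what the receiver will track
 * (local_max). Returns -1 if the buffer cannot even hold the header.
 */
static int ex_ping_usable_nis(const void *buf, size_t nob, uint32_t local_max)
{
	const size_t hdr = offsetof(struct ex_ping_info, pi_ni);
	uint32_t nnis;
	size_t avail;

	assert(buf != NULL);
	if (nob < hdr)
		return -1;

	/* Copy the field out instead of casting the raw receive buffer. */
	memcpy(&nnis, (const unsigned char *)buf +
	       offsetof(struct ex_ping_info, pi_nnis), sizeof(nnis));

	avail = (nob - hdr) / sizeof(struct ex_ni_status);
	if (nnis > avail)
		nnis = (uint32_t)avail;	/* reply was truncated */
	if (nnis > local_max)
		nnis = local_max;	/* sender has more NIs than we track */
	return (int)nnis;
}
```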

      To avoid unnecessary protocol incompatibility, what we've done in the past at the Lustre level for issues like this (obd_connect_data) is to allocate a reply buffer somewhat larger than the currently known data structure, on the assumption that the struct size will grow slowly over releases.

      It would be possible to allocate a reply buffer size for the PING that allowed (current_max+16) NIs to fit into the reply, and once enough time/releases had passed (or you know all clients and servers are running a suitable version if there is a specific system that needs over 16 NIs) it would be possible to increase the limit on the sender without requiring a flag day. Older clients will be able to handle this higher limit, and it sets us up to increase the limit again in the future if needed.
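
      The headroom idea can be sketched as two separate limits, one for what a node sends and a larger one for what it is prepared to receive. The send limit of 16 and the headroom of 16 come from the text above; the header and entry sizes and every EX_*/ex_* name are hypothetical placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_MAX_NIS_SEND		16	/* current sender limit, per the ticket */
#define EX_NIS_HEADROOM		16	/* extra entries accepted on receive */
#define EX_MAX_NIS_RECV		(EX_MAX_NIS_SEND + EX_NIS_HEADROOM)

#define EX_PING_HDR_BYTES	16	/* placeholder header size */
#define EX_NI_ENTRY_BYTES	16	/* placeholder per-NI entry size */

static_assert(EX_MAX_NIS_RECV >= EX_MAX_NIS_SEND,
	      "receive limit must cover everything this release can send");

/* Bytes occupied by a ping reply describing nnis interfaces. */
static size_t ex_ping_bytes(uint32_t nnis)
{
	return EX_PING_HDR_BYTES + (size_t)nnis * EX_NI_ENTRY_BYTES;
}

/* Reply buffer size: sized for the receive limit, not the send limit. */
static size_t ex_ping_reply_buf_bytes(void)
{
	return ex_ping_bytes(EX_MAX_NIS_RECV);
}

/* Number of NIs to advertise: never more than the current send limit. */
static uint32_t ex_ping_nis_to_send(uint32_t configured)
{
	return configured < EX_MAX_NIS_SEND ? configured : EX_MAX_NIS_SEND;
}

/* Would a reply from a peer advertising peer_nnis fit in our buffer? */
static int ex_ping_reply_fits(uint32_t peer_nnis)
{
	return ex_ping_bytes(peer_nnis) <= ex_ping_reply_buf_bytes();
}
```

      Raising EX_MAX_NIS_SEND in a later release then stays compatible with any peer that already ships the larger receive buffer.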

      That kind of "be graceful in what you accept" patch is also easily backported to older releases, since it doesn't cause any compatibility problems.

      Ideally, we could also do something to handle feature negotiation better at the LNet level at connect time, but I don't know enough about the protocol details to say whether this can be done in a compatible way.

            People

              Assignee: WC Triage (wc-triage)
              Reporter: Andreas Dilger (adilger)