Lustre / LU-17515

dynamically tune 'conns_per_peer' as needed


Details

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.14.0, Lustre 2.16.0

    Description

      If conns_per_peer does not match between a client and server (e.g. because of different Ethernet link speeds across switches, or other reasons discussed below), then each side will try to establish a different number of TCP sockets for the peer. LU-17258 handles this by "giving up" on establishing more peer connections, as long as at least one connection of each type could be established.

      When this happens, the client should save the conn_count as the new (in-memory, until the next unmount/remount) conns_per_peer value for the remote peer NID, so that it doesn't keep trying to establish more connections whenever there is a problem.
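The client-side capping could be sketched roughly as below. This is purely illustrative: struct peer_tunables, peer_cap_conns(), and the field names are hypothetical, not the actual socklnd data structures.

```c
#include <stddef.h>

/*
 * Hypothetical sketch (not actual socklnd code): per-peer tunables where
 * effective_cpp starts at the configured conns_per_peer and is lowered
 * in memory when the remote side won't accept that many connections.
 * It is reset only on unmount/remount (or explicit userspace tuning).
 */
struct peer_tunables {
	int conns_per_peer;	/* configured value (module parameter) */
	int effective_cpp;	/* in-memory cap for this peer NID */
};

/*
 * Called when connection setup gives up (per LU-17258) after only
 * 'established' sockets could be set up for the peer.  Record that
 * count as the effective conns_per_peer so we stop retrying
 * connections the far side keeps rejecting.
 */
static void peer_cap_conns(struct peer_tunables *pt, int established)
{
	if (established >= 1 && established < pt->effective_cpp)
		pt->effective_cpp = established;
}
```

The cap only ever shrinks here; a later successful round with fewer sockets lowers it further, but a round that establishes more than the cap never raises it back, matching the "never increase until restart" behavior described below.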

      Otherwise, the server will have to handle and reject these connections on a regular basis, which may look like a DDoS when 10000 clients are all trying to (re-)establish thousands of connections at mount, during recovery, or after a network hiccup. This makes the configuration more "hands off", with no need to tune conns_per_peer explicitly (and in coordination) across all nodes.

      It is likely that the servers also need to dynamically shrink conns_per_peer when they have a large number of connected peers, to avoid the need to tune this explicitly for large clusters (and to avoid us getting involved to fix the system after it breaks). This will (eventually) cause the remote peers to also shrink their connection counts over time, due to their backoff of failed connections. I'm thinking of something simple, like shrinking conns_per_peer by 1 as the number of established peer connections grows past 20000, and again at 40000 (if it hadn't already started shrinking the number of connections when passing 20000). It could never be reduced below 1.

      It could print a console message when this is done, suggesting to "set 'options socklnd conns_per_peer=N' in /etc/modprobe.d/lustre.conf to avoid this in the future", but at least the system would continue to work.
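The shrinking and the console hint could look something like the sketch below. The thresholds, function names, and message text are illustrative assumptions taken from the description above, not actual socklnd code.

```c
#include <stdio.h>

/*
 * Hypothetical sketch (not actual socklnd code): compute the effective
 * conns_per_peer from the configured value and the total number of
 * established peer connections.  Shrink by 1 past 20000 connections,
 * by another 1 past 40000, but never below 1.
 */
static int effective_cpp(int configured, int total_conns)
{
	int cpp = configured;

	if (total_conns > 20000)
		cpp--;
	if (total_conns > 40000)
		cpp--;
	if (cpp < 1)
		cpp = 1;
	return cpp;
}

/*
 * Apply the shrink relative to the *configured* value (so repeated calls
 * are idempotent) and print the suggested remedy once when the value is
 * actually reduced.  Returns the new current conns_per_peer.
 */
static int maybe_shrink_cpp(int configured, int current, int total_conns)
{
	int new_cpp = effective_cpp(configured, total_conns);

	if (new_cpp < current) {
		printf("socklnd: reducing conns_per_peer from %d to %d; "
		       "set 'options socklnd conns_per_peer=%d' in "
		       "/etc/modprobe.d/lustre.conf to avoid this in the future\n",
		       current, new_cpp, new_cpp);
		return new_cpp;
	}
	return current;
}
```

Computing against the configured value rather than the current one keeps the adjustment idempotent: re-evaluating at the same connection count doesn't ratchet the value all the way down to 1.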

      I don't know if the server would need to actively disconnect client connections > conns_per_peer, but that might be needed if the number of connections continues to grow (e.g. > 50000).

      conns_per_peer would never be increased again until the system is restarted, or possibly if it is explicitly set from userspace because the admin really thinks they know better.

      I've also filed LU-17514 for tracking an "expected_clients" tunable that can be used to set a ballpark figure for the number of clients, so that various runtime parameters like conns_per_peer could be set appropriately early in the cluster mount process.

            People

              Assignee: wc-triage (WC Triage)
              Reporter: adilger (Andreas Dilger)