Details


    Description

      For high-bandwidth Ethernet interfaces (e.g. 100GigE), it would be useful to create multiple TCP connections per interface for bulk transfers in order to maximize performance (i.e. a conns_per_peer=4 setting for socklnd, in addition to o2iblnd). Socklnd already creates three separate TCP connections per peer: read, write, and small message.

      For large clusters this may be problematic because of the number of TCP connections to a server, but for smaller configurations this could be very useful.
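
      As a sketch of how such a tunable might be configured (assuming the conns_per_peer module parameter that the patches in the activity below add; the file path and value here are illustrative, so verify the parameter name against your installed ksocklnd before relying on it):

```shell
# Hypothetical sketch: ask socklnd for 4 bulk TCP connections per peer
# at module load time. conns_per_peer comes from the LU-12815 patches;
# confirm it exists in your ksocklnd version (modinfo ksocklnd).
cat > /etc/modprobe.d/ksocklnd.conf <<'EOF'
options ksocklnd conns_per_peer=4
EOF
```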

    Activity

            [LU-12815] Create multiple TCP sockets per SockLND
            pjones Peter Jones added a comment -

            Looks like everything has landed for 2.15


            gerrit Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/44417/
            Subject: LU-12815 socklnd: set conns_per_peer based on link speed
            Project: fs/lustre-release
            Branch: master
            Commit: c44afcfb72a1c2fd8392bfab3143c3835b146be6

            gerrit Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/41463/
            Subject: LU-12815 socklnd: allow dynamic setting of conns_per_peer
            Project: fs/lustre-release
            Branch: master
            Commit: a5cbe7883db6d77b82fbd83ad4c662499421d229
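
            The patch subject above suggests conns_per_peer became tunable at runtime. A hedged sketch of what that might look like with lnetctl follows; the exact option name is an assumption drawn from the patch title, so check `lnetctl net set --help` on your system first:

```shell
# Assumed syntax based on the LU-12815 patch subject; verify against
# your installed lnetctl before use.
# Raise the number of bulk connections per peer on the tcp network:
lnetctl net set --net tcp --conns-per-peer 4
# Inspect the current configuration to confirm the change:
lnetctl net show --net tcp -v
```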

            gerrit Gerrit Updater added a comment -

            Serguei Smirnov (ssmirnov@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/44417
            Subject: LU-12815 socklnd: set conns_per_peer based on link speed
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: ca2f4fed6d85d2e4506958fcc2e1c6c98eb2d020

            adilger Andreas Dilger added a comment -

            There may still be work needed to distribute RPCs from a single client to multiple CPTs on the server, in order to get the best performance for real IO workloads.

            Otherwise, a client with a single interface (NID) will have all of its RPCs handled by cores in a single CPT, which is not quite the same as having multiple real interfaces on the client. Discussion is ongoing in LU-14676.

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/41056/
            Subject: LU-12815 socklnd: add conns_per_peer parameter
            Project: fs/lustre-release
            Branch: master
            Commit: 71b2476e4ddb95aa42f4a0ea3f23b1826017bfa5

            adilger Andreas Dilger added a comment -

            Does the multiple TCP sockets feature apply only to bulk I/O sockets?

            Correct, this is only for bulk sockets.

            degremoa Aurelien Degremont (Inactive) added a comment -

            That's correct. I wanted to warn about this often-unknown use case (I was genuinely surprised when I discovered it) and about any potential impact this feature could have on it.

            Agreed that some sites may prefer risking evictions over changing firewall rules. However, this is surprising behavior, and sites tend to minimize evictions as much as possible; a "normal case" where evictions are expected simply because of unknown traffic rules is not the best option, and could generate additional JIRA tickets. That said, this problem already exists and is independent of this ticket; I just wanted to avoid making it worse. Does the multiple TCP sockets feature apply only to bulk I/O sockets?

            adilger Andreas Dilger added a comment -

            There is still this rare behavior where a TCP socket is broken for some reason and the first node to need it is the server, trying to send an LDLM callback. When the server detects that the TCP connection for the reverse import is broken, it re-establishes it itself, creating a server->client socket.

            That is true, but in this case I still believe that the server will use target port 988 on the client (not sure of the source port), and the client will need to allow new connections on port 988 for the most reliable behavior. In many cases it is possible for the client to function properly without allowing any incoming connections, but as you write, there may be rare cases where the server needs to initiate a connection, and without that the client may occasionally be evicted. For some sites that may be preferable to having an open port in the firewall. IIRC there may even be a parameter to disable server->client connections, but I don't recall the details.
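
            For sites that do want to allow the rare server-initiated connection discussed above, a minimal firewall sketch follows. It assumes iptables and the standard LNet TCP port 988 mentioned in the comment; the source network range is hypothetical, so adapt the rule to your site's policy:

```shell
# Illustrative only: accept inbound LNet/socklnd connections on TCP
# port 988 from a hypothetical server network 10.0.0.0/24, so that a
# server-initiated reverse-import reconnection is not dropped.
iptables -A INPUT -p tcp --dport 988 -s 10.0.0.0/24 -j ACCEPT
```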

            gerrit Gerrit Updater added a comment -

            Serguei Smirnov (ssmirnov@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/41463
            Subject: LU-12815 socklnd: allow dynamic setting of conns_per_peer
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: fe3db0979f34afc5139fdc1b6b9ab6eace5cfde4

            degremoa Aurelien Degremont (Inactive) added a comment - edited

            which it is for all new connections, and the actually-assigned port is mostly irrelevant. This is not any different from multiple clients connecting separately.

            If I remember correctly, this is not entirely true. There is still this rare behavior where a TCP socket is broken for some reason and the first node to need it is the server, trying to send an LDLM callback. When the server detects that the TCP connection for the reverse import is broken, it re-establishes it itself, creating a server->client socket.

            (If the client needs this connection first, e.g. for an obd_ping, it will re-establish it normally and the server will use that connection as usual.) This likely impacts the metadata socket, not the bulk I/O sockets.

            People

              ashehata Amir Shehata (Inactive)
              adilger Andreas Dilger
              Votes: 2
              Watchers: 19
