socklnd needs improved interface selection and configuration
(LU-14064)
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.14.0, Lustre 2.12.4, Lustre 2.15.0 |
| Fix Version/s: | Lustre 2.16.0, Lustre 2.15.0 |
| Type: | Technical task | Priority: | Minor |
| Reporter: | Amir Shehata (Inactive) | Assignee: | Serguei Smirnov |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | lnet | ||
| Description |
|
TCP bonding in socklnd over-complicates the code, and there is no evidence it is being used anywhere. With LNet Multi-Rail (MR), the use_tcp_bonding option has become obsolete. Add a deprecation message for earlier releases and remove the option in the 2.15 release.
The Multi-Rail feature doesn't need to be explicitly enabled. To use MR instead of the use_tcp_bonding configuration option, group the interfaces on the same network using the lnetctl utility:
lnetctl net add --net tcp --if eth0,eth1
or via the /etc/modprobe.d/lnet.conf or /etc/modprobe.d/lustre.conf configuration file:
options lnet networks="tcp(eth0,eth1)"
and make sure dynamic discovery is enabled:
lnetctl set discovery 1
MR will aggregate the throughput of all available networks/interfaces shared between peer nodes. See LNet Software Multi-Rail Configuration in the Lustre Operations Manual for more details. |
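The two configuration forms in the description carry the same information: a network name plus a comma-separated interface list. As a minimal sketch of that correspondence, the helpers below print both forms from one list of interfaces. The function names (join_ifaces, lnetctl_cmd, modprobe_line) are hypothetical and not part of Lustre; the script only prints text and never calls lnetctl or touches LNet.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only: they print configuration
# text and do not modify LNet. eth0/eth1 are the ticket's example names.

# Join all arguments with commas, e.g. "eth0 eth1" becomes "eth0,eth1".
join_ifaces() {
    out=""
    for i in "$@"; do
        if [ -z "$out" ]; then out="$i"; else out="$out,$i"; fi
    done
    printf '%s\n' "$out"
}

# Print the lnetctl command that groups the interfaces on one network.
lnetctl_cmd() {
    net="$1"; shift
    printf 'lnetctl net add --net %s --if %s\n' "$net" "$(join_ifaces "$@")"
}

# Print the equivalent modprobe options line for lnet.conf or lustre.conf.
modprobe_line() {
    net="$1"; shift
    printf 'options lnet networks="%s(%s)"\n' "$net" "$(join_ifaces "$@")"
}

lnetctl_cmd tcp eth0 eth1
modprobe_line tcp eth0 eth1
```

Either printed line can be reviewed before being applied by hand. Dynamic discovery still has to be enabled separately, as the description notes.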
| Comments |
| Comment by Gerrit Updater [ 23/Sep/20 ] |
|
Serguei Smirnov (ssmirnov@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/40000 |
| Comment by Gerrit Updater [ 27/Nov/20 ] |
|
Serguei Smirnov (ssmirnov@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/40774 |
| Comment by Gerrit Updater [ 24/Dec/20 ] |
|
Serguei Smirnov (ssmirnov@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/41088 |
| Comment by Gerrit Updater [ 29/Dec/20 ] |
|
Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/41102 |
| Comment by Cory Spitz [ 04/Jan/21 ] |
|
> TCP bonding in socklnd over-complicates the code and there is no evidence it's being used anywhere

It may not be widely used, but doesn't native TCP bonding outperform MR in various RAS situations? I suspect that there are real-world tests that show that TCP bonding seamlessly rides through failures whereas MR would need to re-try/re-transmit. Is this a wrong assumption? Is it proven that MR is better than bonding in any & all scenarios? If not, do you still want to deprecate bonding? |
| Comment by Serguei Smirnov [ 04/Jan/21 ] |
|
Hi Cory, The wording is a bit confusing, so I'll clarify: this ticket deals only with a socklnd feature, so one would still be able to use native TCP bonding in Linux with MR. It is the "socklnd bonding" that's being deprecated. The introduction of "socklnd bonding" allowed treating multiple interfaces as one in the socklnd layer; the introduction of MR brought the same concept to the LNet layer. I don't believe there's a difference in performance. ashehata can correct me if my understanding is wrong. |
| Comment by Cory Spitz [ 04/Jan/21 ] |
|
Ah, thanks for the clarification. That makes sense. |
| Comment by Gerrit Updater [ 05/Jan/21 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/41088/ |
| Comment by Peter Jones [ 05/Jan/21 ] |
|
The deprecation warning has landed in 2.14. The removal itself is deferred to 2.15. |
| Comment by Gerrit Updater [ 08/Jan/21 ] |
|
James Simmons (jsimmons@infradead.org) uploaded a new patch: https://review.whamcloud.com/41179 |
| Comment by Gerrit Updater [ 08/Jan/21 ] |
|
James Simmons (jsimmons@infradead.org) uploaded a new patch: https://review.whamcloud.com/41180 |
| Comment by Gerrit Updater [ 04/Mar/21 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/41102/ |
| Comment by Gerrit Updater [ 30/Mar/21 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/40000/ |
| Comment by Gerrit Updater [ 10/Apr/21 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/40774/ |
| Comment by Gerrit Updater [ 16/Sep/22 ] |
|
"Neil Brown <neilb@suse.de>" uploaded a new patch: https://review.whamcloud.com/48568 |
| Comment by Gerrit Updater [ 10/Oct/22 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/48568/ |