The design document is very useful, thanks.
I do have one concern: the code walks lists of routes while holding a spinlock with interrupts disabled on the local CPU (spin_lock_irqsave() and friends). This becomes a problem if these lists grow large, because a system becomes unstable when one or more CPU cores run for a long time with interrupts disabled.
Trying to figure out how large these lists can become: for a cluster with N clients, M MDS, O OSS, ignoring routers and assuming just one interface per system, I get something like this:
- on a client: M + O
- on an MDS: N + M + O
- on an OSS: N + M
This shouldn't be much of a problem in a small cluster, but in a large cluster it would be the MDS and OSS in particular that have large lists. So my concern is that there is a scaling problem that will render MDS and OSS unstable in large clusters, but will be invisible in the small clusters typically used for testing.
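To put rough numbers on the estimate above, here is a minimal sketch that evaluates the three formulas for two cluster sizes. The function name and both sets of cluster sizes are hypothetical, picked only for illustration. It counts list entries and says nothing about how long a traversal actually takes.

```python
def route_list_sizes(n_clients, n_mds, n_oss):
    """Estimated list length per node type, using the formulas above.

    Assumes one interface per system and ignores routers.
    """
    return {
        "client": n_mds + n_oss,
        "mds": n_clients + n_mds + n_oss,
        "oss": n_clients + n_mds,
    }


# Hypothetical cluster sizes, for illustration only.
small = route_list_sizes(n_clients=8, n_mds=2, n_oss=4)
large = route_list_sizes(n_clients=20000, n_mds=8, n_oss=200)

for label, sizes in (("small", small), ("large", large)):
    for node_type, length in sizes.items():
        print(f"{label:5s} {node_type:6s} {length}")
```

The client-side list stays in the low hundreds even in the large case, while the MDS and OSS lists grow with the number of clients, which is where the time spent with interrupts disabled would grow too.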
Closing this ticket, as LNet multi-rail support landed in 2.10.