[LU-11978] Increase maximum number of configured NIDs per device Created: 20/Feb/19 Updated: 14/Oct/20 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.10.5 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major |
| Reporter: | Diego Moreno | Assignee: | Amir Shehata (Inactive) |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Environment: | Multi-tenancy, several filesets per node |
| Issue Links: | |
| Epic/Theme: | mgs | ||||
| Rank (Obsolete): | 9223372036854775807 | ||||
| Description |
|
We're hitting a limitation in our particular use case for Lustre, where we export Lustre to many different VLANs with many filesets. Because the VLANs are directly connected to the servers (the servers' interfaces are connected to trunk ports on the switches), we need to specify, for each VLAN, a different set of NIDs for each device. If we have, for instance, 10 VLANs, that means 10 NIDs for each server.

In addition, if we export several filesets to the same client, we need several LNets on the same client (e.g. tcp10 for fileset1, tcp11 for fileset2, tcp12 for fileset3). That means 3 NIDs per VLAN, so in the hypothetical configuration above we would need to define 30 NIDs for each server and, with failover NIDs, at least 60 NIDs for each device.

Currently Lustre supports around 20 NIDs per device. Above this number the configuration fails silently, and only after unsuccessfully trying to mount a Lustre client do we realize that the configuration stored on the device is truncated after around 20 NIDs.

Lustre routers can help to scale with problem 1 (multiple VLANs), but they cannot help if we have multiple filesets per client (a maximum of 20 filesets per client in a routed environment). It would be a good improvement if this limit could be increased now that new Lustre use cases are in play. |
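To make the arithmetic above concrete, here is a minimal sketch of how the NIDs multiply in such a setup; the interface names, network numbers and addresses are invented for illustration and are not taken from this ticket:

# Server side (hypothetical): one LNet per VLAN/fileset combination, declared in
# /etc/modprobe.d/lustre.conf, so every extra VLAN or fileset adds one more NID to this node
options lnet networks="tcp1001(eth0.1001),tcp1111(eth0.1111),tcp1112(eth0.1112),tcp1113(eth0.1113)"

# Client side (hypothetical): one subdirectory (fileset) mount per tenant LNet;
# 10.212.0.100 stands in for an MGS NID reachable on each of those nets
mount -t lustre 10.212.0.100@tcp1111:/fs1/fileset1 /mnt/fileset1
mount -t lustre 10.212.0.100@tcp1112:/fs1/fileset2 /mnt/fileset2
mount -t lustre 10.212.0.100@tcp1113:/fs1/fileset3 /mnt/fileset3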
| Comments |
| Comment by Amir Shehata (Inactive) [ 20/Feb/19 ] |
|
Can you please share the configuration that you're using? |
| Comment by Diego Moreno [ 26/Feb/19 ] |
|
This is the current configuration on one OST, where we have 22 NIDs per server, so with just one failover server a total of 44 NIDs are defined:

[root@le-oss01 ]# tunefs.lustre /dev/sfa0003
checking for existing Lustre data: found
Reading CONFIGS/mountdata

   Read previous values:
Target:     fs1-OST0002
Index:      2
Lustre FS:  fs1
Mount type: ldiskfs
Flags:      0x1002
            (OST no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters: failover.node=10.207.208.121@tcp1001,10.207.64.121@o2ib,10.204.64.121@tcp1101,10.204.64.121@tcp1102,10.212.0.121@tcp1111,10.212.0.121@tcp1112,10.212.0.121@tcp1113,10.212.0.121@tcp1114,10.212.0.121@tcp1115,10.212.16.121@tcp1121,10.212.16.121@tcp1122,10.212.16.121@tcp1123,10.212.32.121@tcp1131,10.212.32.121@tcp1132,10.212.32.121@tcp1133,10.212.48.121@tcp1141,10.212.48.121@tcp1142,10.212.64.121@tcp1151,10.212.64.121@tcp1152,10.212.80.121@tcp1161,10.212.96.121@tcp1171,10.212.112.121@tcp1181:10.207.208.122@tcp1001,10.207.64.122@o2ib,10.204.64.122@tcp1101,10.204.64.122@tcp1102,10.212.0.122@tcp1111,10.212.0.122@tcp1112,10.212.0.122@tcp1113,10.212.0.122@tcp1114,10.212.0.122@tcp1115,10.212.16.122@tcp1121,10.212.16.122@tcp1122,10.212.16.122@tcp1123,10.212.32.122@tcp1131,10.212.32.122@tcp1132,10.212.32.122@tcp1133,10.212.48.122@tcp1141,10.212.48.122@tcp1142,10.212.64.122@tcp1151,10.212.64.122@tcp1152,10.212.80.122@tcp1161,10.212.96.122@tcp1171,10.212.112.122@tcp1181 mgsnode=10.207.208.102@tcp1001:10.207.208.101@tcp1001

When I tried to find the limiting number of NIDs, it turned out to be around 45: the 46th NID was truncated in the MGS, and when the client tried to mount the filesystem it failed because the 46th NID was basically trash. This configuration is used, for instance, as follows:
This is just an example, but you can find a more detailed explanation in this LAD'18 presentation: Multi-tenancy at ETHZ.

In practical terms, the more complex the configuration gets, the more we need Lustre routers to help simplify it. That's also what we'll do in the future as soon as we have more tenants. But even with Lustre routers there is still the need for one LNet per fileset and per client <-> router <-> server mapping. That means that in a configuration with 4 failover OSSes there would be a hard limit of 11 filesets that can be mounted at the same time by a tenant (this comes from the 45-NID limitation). It seems extreme, OK, but I guess our current configuration was also considered "extreme" at the time the limitation on the number of NIDs was introduced...

Another alternative could be to manage the MGS configuration in a different way and not hard-code the NIDs on the OSTs/MDTs. I don't know if there's any feature request/development in that regard. |
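For reference, the basic routed-LNet pattern this refers to can be sketched as follows; all addresses, interfaces and network numbers here are hypothetical. The point made above is that this whole pattern has to be repeated once per fileset, so each additional fileset still consumes one LNet and one more NID per servicenode in every target's configuration record:

# On the tenant router (hypothetical): a NID on the tenant net and on the server net, forwarding enabled
lnetctl net add --net tcp1111 --if eth1.1111
lnetctl net add --net o2ib --if ib0
lnetctl set routing 1

# On the tenant client (hypothetical): reach the server net through that router;
# the pattern is repeated for tcp1112, tcp1113, ... as more filesets are added
lnetctl route add --net o2ib --gateway 10.212.0.1@tcp1111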
| Comment by Sebastien Buisson [ 11/Mar/19 ] |
|
Hi Diego, After discussing with Amir, it seems there are two options to help with your configuration:
HTH, |
| Comment by Diego Moreno [ 14/Mar/19 ] |
|
Sebastien,

Thanks for the very detailed answer. I agree with you that both the feature and the patch help to solve this complicated multi-tenancy use case. Together with filesets and dynamic LNet configuration it should be possible to do all we need in that context.
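By "dynamic LNet configuration" this presumably means configuring networks at runtime with lnetctl rather than through static module options. A minimal, hypothetical example of bringing up one more tenant LNet on a running node:

lnetctl lnet configure                        # make sure LNet is up
lnetctl net add --net tcp1112 --if eth0.1112  # hypothetical tenant net on a VLAN interface
lnetctl net show --net tcp1112                # verify the new NID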
Thanks for the help! |