Details
Improvement
Resolution: Unresolved
Minor
None
Lustre 2.17.0
None
3
9223372036854775807
Description
It would be useful to have a variable that can be set early in startup (e.g. a libcfs module parameter that is also exposed as a tunable parameter) to give the system a hint about how many clients will be mounting the filesystem.
While we try to tune filesystem parameters dynamically, that is sometimes complex to get right from the start: by the time 1000 or 10000 clients have connected, the values chosen when 10 or 100 clients had mounted are no longer optimal. For example, conns_per_peer at the LNet level could be 4 for 100 clients but should be 1 for 10000 clients, otherwise the servers can run out of TCP ports and have too many open sockets. Similarly, at_min should be low (5s) on smaller clusters for faster recovery, but larger (15s+) with 10000 clients (LU-12064). Because it is only set at mount time, there is no easy way to update it on remote clients afterward.
Having a simple parameter set in /etc/modprobe.d/lustre.conf, like:
options libcfs expected_clients=10000
gives us a ballpark figure to work with. That doesn't obviate the need for dynamic tuning of parameters at runtime, but establishes some expectations for what is not obvious when the first 10 (of 10000) clients are mounting the filesystem.
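As a rough illustration of how such a hint could seed initial values, here is a minimal userspace sketch in C. It is not Lustre code: struct tuning_defaults and defaults_for() are hypothetical names, and only the 100-client and 10000-client endpoints come from the description above. The intermediate tier, and treating an unset hint (0) like a small cluster, are assumptions.

```c
/* Hypothetical container for the two tunables discussed above. */
struct tuning_defaults {
	unsigned int conns_per_peer;	/* LNet connections per peer */
	unsigned int at_min;		/* adaptive timeout minimum, in seconds */
};

/*
 * Pick startup defaults from the expected_clients hint.
 * Endpoints (4 conns / 5s at 100 clients, 1 conn / 15s at 10000 clients)
 * follow the description; the middle tier is an assumed interpolation.
 */
static struct tuning_defaults defaults_for(unsigned int expected_clients)
{
	struct tuning_defaults d;

	if (expected_clients <= 100) {
		/* Small cluster (or hint unset): favor fast recovery. */
		d.conns_per_peer = 4;
		d.at_min = 5;
	} else if (expected_clients < 10000) {
		/* ASSUMPTION: middle tier, not specified in the ticket. */
		d.conns_per_peer = 2;
		d.at_min = 10;
	} else {
		/* Large cluster: conserve TCP ports and sockets. */
		d.conns_per_peer = 1;
		d.at_min = 15;
	}
	return d;
}
```

These would only be starting points. Dynamic tuning at runtime would still be free to adjust them once the real client count is known.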
Even better would be for each target to save the maximum number of connected clients locally (e.g. updating it every 128 client connections), so that the value could be read back from the target when it is first mounted. That would avoid requiring the admin to specify expected_clients, though it would be useful to have both.
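The bookkeeping for a saved maximum could look roughly like the following userspace sketch in C. All names here (struct client_tracker, tracker_connect(), and so on) are hypothetical. It reads "every 128 client connections" as "whenever the high-water mark has grown by 128 since the last save", and it assumes the saved value and the admin hint are combined by taking the larger. Neither choice is specified in the description.

```c
#include <stdbool.h>

#define SAVE_INTERVAL 128	/* from the description: save every 128 connections */

struct client_tracker {
	unsigned int connected;	/* currently connected clients */
	unsigned int max_seen;	/* in-memory high-water mark */
	unsigned int max_saved;	/* last value written to the target */
};

/*
 * Record a new client connection. Returns true when max_saved was
 * advanced, i.e. when the caller would need to write it to the target.
 */
static bool tracker_connect(struct client_tracker *t)
{
	t->connected++;
	if (t->connected > t->max_seen)
		t->max_seen = t->connected;

	/* ASSUMPTION: persist only after the maximum grows by SAVE_INTERVAL. */
	if (t->max_seen >= t->max_saved + SAVE_INTERVAL) {
		t->max_saved = t->max_seen;
		return true;
	}
	return false;
}

/* Record a disconnect; the high-water mark is never lowered. */
static void tracker_disconnect(struct client_tracker *t)
{
	if (t->connected > 0)
		t->connected--;
}

/* ASSUMPTION: with both hints available, use the larger one. */
static unsigned int effective_expected_clients(unsigned int saved_max,
					       unsigned int admin_hint)
{
	return saved_max > admin_hint ? saved_max : admin_hint;
}
```

Saving in steps keeps writes to the target infrequent. The cost is that the stored maximum can lag the true one by up to 127 clients, which should be harmless for a ballpark figure.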