Details
-
Improvement
-
Resolution: Unresolved
-
Medium
Description
NRS TBF: Support per-rule scheduling classes for heterogeneous rate limiting
Background and Current Behavior
The Lustre NRS TBF (Token Bucket Filter) scheduler provides rate limiting for incoming PTLRPC requests on the server side. Currently, administrators define rules (e.g., by NID, JobID, or nodemap) that group requests into "classes", and a token bucket is attached to each class to enforce bandwidth or request-rate limits.
In the current implementation:
- A TBF scheduler instance is configured with one global classification type (e.g., NID-based TBF or Nodemap-based TBF).
- This classification type applies to all rules within that TBF scheduler instance.
Constraints of this approach:
- Administrators cannot mix per-NID and per-nodemap rules in the same scheduler.
- It is impossible to choose different grouping granularities for different traffic classes on the same service.
- Different traffic types often require different scopes (e.g., some need per-client limits, others need per-tenant/nodemap limits).
Design Goal
Introduce per-rule scheduling classes (classification types) for NRS TBF to achieve the following:
- Allow each TBF rule to independently choose how its matching requests are grouped into token buckets (e.g., per-NID, per-nodemap, or a combination of NID, nodemap, opcode, JobID, UID, GID, etc.).
- Enable a single TBF scheduler instance to contain a heterogeneous mix of rule types.
- Increase the expressiveness of the configuration model while preserving existing global defaults.
- Maintain full backward compatibility with current NID-only or nodemap-only setups.
Requirements
Functional Requirements
Per-rule scheduling class
- Every TBF rule must have a configurable scheduling class (classification type).
- Supported types should include nid, nodemap, JobID, UID, GID, ProjID, Opcode and their combinations.
Rule-to-bucket mapping
- The TBF scheduler must identify the matching rule for a given incoming request.
- The bucket key must be computed according to that specific rule's scheduling class.
- Token accounting must be applied against the selected bucket.
Rule heterogeneity
- Multiple rules with different scheduling classes (TBF types) must coexist within one TBF scheduler instance (e.g., Rule A is per-NID, Rule B is per-nodemap, Rule C is the combination of NID+Opcode).
Scheduler-level default
- The existing scheduler-level type shall remain as the default scheduling class for rules that do not explicitly specify one.
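The functional requirements above can be modelled in a few lines. The following is only an illustrative sketch in Python, not Lustre kernel code: the `Request` and `Rule` types, the first-match rule ordering, and the tuple-shaped bucket key are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical request model; field names are illustrative, not Lustre structures.
@dataclass(frozen=True)
class Request:
    nid: str
    nodemap: str
    opcode: str
    jobid: str = ""
    uid: int = 0
    gid: int = 0

@dataclass
class Rule:
    name: str
    matches: Callable[[Request], bool]          # rule match condition
    rate: int                                   # tokens per second
    sched_class: Optional[Tuple[str, ...]] = None  # None = use scheduler default

def find_rule(rules, req):
    """Return the first matching rule (list order is the priority in this sketch)."""
    for rule in rules:
        if rule.matches(req):
            return rule
    return None

def bucket_key(rule, req, default_class):
    """Build the bucket key from the fields named by the rule's scheduling class."""
    fields = rule.sched_class or default_class
    return (rule.name,) + tuple(getattr(req, f) for f in fields)

# Heterogeneous rules inside one scheduler instance.
rules = [
    Rule("A", lambda r: r.nid.startswith("192.168.23."), 100, ("nid",)),
    Rule("B", lambda r: r.nodemap == "storage_group", 50, ("nodemap",)),
    Rule("C", lambda r: r.opcode == "ost_read", 30, ("nid", "opcode")),
    Rule("D", lambda r: True, 10),  # no class given: scheduler default applies
]
```

With a scheduler default of `("nid",)`, a request matching rule C is keyed by both its NID and its opcode, while a request that only matches rule D is keyed by NID alone.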
Non-Functional Requirements
Backward compatibility
- Existing configurations specifying only "NID TBF" or "Nodemap TBF" at the scheduler level must continue to behave exactly as before.
Low overhead
- Request classification and bucket lookup must remain O(1) or near-O(1) per request (using rhashtable).
- Additional logic for per-rule classes must be minimal to avoid performance degradation.
Operational clarity
- The configuration interface must clearly express rule match conditions, the scheduling class, and the effective rate limit scope.
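To illustrate the low-overhead requirement, here is a minimal token-bucket sketch in Python in which a plain dict stands in for the kernel rhashtable. The class names, the lazy bucket creation, and the depth of 3 are assumptions chosen for the example. The sketch only reports whether a token is available; it does not model how a real scheduler would queue a request until one is.

```python
class TokenBucket:
    """Classic token bucket: refills at `rate` tokens/second, capped at `depth`."""

    def __init__(self, rate, depth, now=0.0):
        self.rate = rate
        self.depth = depth
        self.tokens = float(depth)   # start full
        self.last = now              # time of the last refill

    def try_consume(self, now, cost=1.0):
        # Refill according to elapsed time, then try to take `cost` tokens.
        elapsed = now - self.last
        self.tokens = min(self.depth, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class BucketTable:
    """Hash table of buckets keyed by bucket key; average O(1) lookup."""

    def __init__(self):
        self.buckets = {}

    def admit(self, key, rate, now, depth=3):
        bucket = self.buckets.get(key)
        if bucket is None:
            # Create the bucket lazily on the first request for this key.
            bucket = self.buckets[key] = TokenBucket(rate, depth, now)
        return bucket.try_consume(now)
```

Each distinct key gets its own bucket, so two clients matching the same per-NID rule are limited independently, whereas a per-nodemap key would make them share one bucket.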
High-Level Design
Rule-Level Scheduling Class
Each TBF rule is extended to carry a scheduling class type attribute. If a rule omits this field, the rule type defaults to the scheduler-level default class type.
Example syntax:
Per-NID rule:
ost.OSS.ost_io.nrs_tbf_rule="start rule1 nid={192.168.23.10} rate=100 type=nid"
Per-Nodemap rule:
ost.OSS.ost_io.nrs_tbf_rule="start rule2 nodemap={storage_group} rate=50 type=nodemap"
Combined NID and Opcode rule:
ost.OSS.ost_io.nrs_tbf_rule="start rule3 nid={192.168.23.19}&opcode={ost_read} rate=30 type=nid+opcode"
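As a rough illustration of how the proposed syntax decomposes, the sketch below parses the value of a rule command into its parts. It is written in Python and is not the Lustre parser. `parse_rule` is a hypothetical helper that handles only the `start` form shown above, and it assumes conditions contain no whitespace and are joined only with `&`.

```python
import re

def parse_rule(cmd):
    """Parse 'start <name> <conditions> rate=<n> [type=<class>]' into a dict."""
    tokens = cmd.split()
    if len(tokens) < 4 or tokens[0] != "start":
        raise ValueError("expected: start <name> <conditions> rate=<n> [type=<class>]")
    name, cond_expr = tokens[1], tokens[2]

    # Conditions look like key={value}, joined by '&'.
    conditions = {}
    for part in cond_expr.split("&"):
        m = re.fullmatch(r"(\w+)=\{([^{}]*)\}", part)
        if m is None:
            raise ValueError(f"bad condition: {part!r}")
        conditions[m.group(1)] = m.group(2)

    # Remaining tokens are key=value options such as rate= and type=.
    options = dict(tok.split("=", 1) for tok in tokens[3:])
    if "rate" not in options:
        raise ValueError("missing rate=")

    # type=nid+opcode becomes ("nid", "opcode"); an absent type stays None.
    sched_class = tuple(options["type"].split("+")) if "type" in options else None
    return {
        "name": name,
        "conditions": conditions,
        "rate": int(options["rate"]),
        "sched_class": sched_class,
    }
```

A rule without `type=` yields `sched_class` of `None`, leaving the caller to apply the scheduler-level default.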
Alternatively, the class type of a rule can be detected automatically from its matching conditions when the detected type differs from the global default of the NRS TBF scheduler. For example:
nrs_tbf_rule="start ruleauto nid={10.0.0.1}&opcode={ost_write}&uid={100} rate=10"
Here the rule's class type is automatically detected as the combination nid+opcode+uid.
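One possible way to resolve a rule's effective class is sketched below in Python. The helper `effective_class` and the `KNOWN_FIELDS` set are hypothetical, and the precedence (explicit type first, then the class detected from the conditions, then the scheduler default) is an assumption of this sketch rather than something the proposal fixes.

```python
# Field names accepted in this sketch, taken from the supported types listed
# in the requirements; the lowercase spelling is an assumption.
KNOWN_FIELDS = {"nid", "nodemap", "jobid", "uid", "gid", "projid", "opcode"}

def effective_class(conditions, explicit=None, default=("nid",)):
    """Pick the scheduling class for a rule.

    conditions: mapping of match field -> match expression, in rule order
    explicit:   class given via type=..., or None
    default:    scheduler-level default class
    """
    if explicit:
        return tuple(explicit)
    unknown = [f for f in conditions if f not in KNOWN_FIELDS]
    if unknown:
        raise ValueError(f"unknown match fields: {unknown}")
    # Detected class = the fields used in the match conditions, in rule order.
    detected = tuple(conditions)
    return detected if detected else tuple(default)
```

For the `ruleauto` example, the conditions name `nid`, `opcode`, and `uid`, so the detected class is their combination.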
Backwards Compatibility
To avoid breaking existing deployments:
Scheduler-level type preservation
- Legacy settings (NID TBF / Nodemap TBF / JobID TBF) remain and serve as the default scheduling class for all rules in that scheduler.
Legacy Rule Behavior
- Rules installed without specifying a scheduling class type will yield identical behavior to current versions (same grouping and rate semantics).
Opt-in Mixed Mode
- Heterogeneous scheduling classes are only utilized when the operator explicitly specifies the type parameter in new or updated rules.