Details
Improvement
Resolution: Unresolved
Minor
None
Lustre 2.5.0, Lustre 2.6.0, Lustre 2.5.1
OpenSFS cluster with RHEL6 with combined MGS/MDS, single OSS with two OSTs, four Lustre 2.5 (Build #2) clients; one HSM Agent + client, one with robinhood/db running and two Lustre clients.
11283
Description
Currently, when HSM is enabled, the default HSM parameters are:
# lctl get_param mdt.scratch-MDT0000.hsm_control
mdt.scratch-MDT0000.hsm_control=enabled
# lctl get_param mdt.scratch-MDT0000.hsm.actions
# lctl get_param mdt.scratch-MDT0000.hsm.agents
mdt.scratch-MDT0000.hsm.agents=
uuid=181c8885-24a1-2d51-0b7e-3b986fff7a93 archive_id=1 requests=[current:0 ok:6917 errors:0]
# lctl get_param mdt.scratch-MDT0000.hsm.default_archive_id
mdt.scratch-MDT0000.hsm.default_archive_id=3
# lctl get_param mdt.scratch-MDT0000.hsm.grace_delay
mdt.scratch-MDT0000.hsm.grace_delay=60
# lctl get_param mdt.scratch-MDT0000.hsm.loop_period
mdt.scratch-MDT0000.hsm.loop_period=10
# lctl get_param mdt.scratch-MDT0000.hsm.max_requests
mdt.scratch-MDT0000.hsm.max_requests=3
# lctl get_param mdt.scratch-MDT0000.hsm.policy
mdt.scratch-MDT0000.hsm.policy=NonBlockingRestore [NoRetryAction]
# lctl get_param mdt.scratch-MDT0000.hsm.active_requests
# lctl get_param mdt.scratch-MDT0000.hsm.active_request_timeout
mdt.scratch-MDT0000.hsm.active_request_timeout=3600
The hsm.max_requests default of 3 differs from the value documented in the HSM test plan, which is "10 x Agent count". The default value of hsm.max_requests should therefore be changed.
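As a rough illustration of the "10 x Agent count" rule, the sketch below derives a candidate max_requests value from hsm.agents output. The function name recommended_max_requests is hypothetical, and the sketch assumes that each registered agent is identified by a distinct uuid= field, as in the output shown above. It only prints a number; applying the value is left as a commented-out manual step.

```shell
#!/bin/sh
# Hypothetical helper: read hsm.agents output on stdin, count the distinct
# agent UUIDs, and print 10 x that count (the test plan's rule).
recommended_max_requests() {
    # Assumption: every registered agent appears with a "uuid=<id>" field.
    agents=$(grep -o 'uuid=[^ ]*' | sort -u | wc -l)
    echo $(( ${agents} * 10 ))
}

# Possible use on the MDS (not run here; parameter names as in this ticket):
#   lctl get_param -n mdt.scratch-MDT0000.hsm.agents | recommended_max_requests
#   lctl set_param mdt.scratch-MDT0000.hsm.max_requests=<printed value>
```

With the single agent listed in this ticket, the helper would print 10.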
According to the documentation, max_requests is a per-coordinator value. If there are two coordinators (MDTs), each agent will never have to handle more than 2 x max_requests requests.
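The per-agent bound described above is simply the number of coordinators multiplied by max_requests. A minimal sketch of that arithmetic follows; the function name max_requests_per_agent is hypothetical and is not part of Lustre.

```shell
#!/bin/sh
# Hypothetical helper: upper bound on concurrent requests a single agent
# may see, given the coordinator (MDT) count and the per-coordinator
# max_requests setting.
max_requests_per_agent() {
    coordinators=$1
    max_requests=$2
    echo $(( coordinators * max_requests ))
}

# Two MDTs with the current default of 3:
max_requests_per_agent 2 3    # prints 6
```

With the same two coordinators and max_requests set to 10, the bound would be 20 requests per agent.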