[LU-14363] Prioritize HSM cancel request Created: 25/Jan/21  Updated: 26/Jan/21

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major
Reporter: Etienne Aujames Assignee: Etienne Aujames
Resolution: Unresolved Votes: 0
Labels: CEA, HSM

Issue Links:
Duplicate
duplicates LU-8324 HSM: prioritize HSM requests Open
Rank (Obsolete): 9223372036854775807

 Description   

This ticket is a duplicate of LU-8324. I created it to describe the specific need to prioritize HSM CANCEL requests over all other HSM actions.

The coordinator processes actions in FIFO order (except for RESTORE). Canceling an action therefore requires that the action itself is sent to the copytool before its CANCEL request. If the action and its CANCEL are both still on the waiting list, we could discard the action directly instead of sending it.

Moreover, if the action is already running on a copytool, we could send the CANCEL first. This would quickly release resources on the copytool.

 

Example (for RESTORE action):

When a user requests the restore of a large number of files by mistake, the requests fill the HSM's llog waiting list and the coordinator.

If we then want to cancel the RESTORE requests, in most cases the CANCEL requests will be processed after the RESTORE requests.
Each RESTORE has to be sent to the copytool first, and only then its CANCEL. Moreover, RESTORE actions are prioritized (see LU-8324), so CANCEL requests are less likely to be sent.
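
The ordering problem can be illustrated with a minimal Python sketch. It is a toy model, not the coordinator's code: the function name dispatch_order_current and the (action, file) tuples are hypothetical, and the model simply assumes that RESTORE requests are sent first while every other request, including CANCEL, follows in arrival order.

```python
def dispatch_order_current(waiting):
    """Return the send order for a list of (action, file) requests.

    Assumed model, for illustration only: RESTORE requests are
    prioritized, everything else (CANCEL included) stays in FIFO order.
    """
    restores = [req for req in waiting if req[0] == "RESTORE"]
    others = [req for req in waiting if req[0] != "RESTORE"]
    return restores + others


# Hypothetical waiting list: the user cancels file1 while more
# RESTORE requests are still arriving.
waiting = [
    ("RESTORE", "file1"),
    ("RESTORE", "file2"),
    ("CANCEL", "file1"),
    ("RESTORE", "file3"),
]

order = dispatch_order_current(waiting)
for action, name in order:
    print(action, name)
```

In this model the CANCEL for file1 comes out last, after every RESTORE (including the one it targets) has already been sent.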

As a workaround, we manually run the command below to cancel all actions in the llog list:

lctl set_param mdt.$FSNAME-MDT0000.hsm_control=purge

 

The main goals of this "improvement":

  • if the action has not been processed yet, the coordinator should process the cancel and remove the action directly from the llog list without sending it to the copytool.
  • if the action has already been processed, the coordinator should send the cancel request to the copytool with priority.
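
The two goals above can be sketched as a small Python model. This is only an illustration under stated assumptions, not a proposed patch: the function plan_dispatch, the (action, file) tuples and the "running" set are hypothetical names, and the sketch assumes that a CANCEL whose target is neither waiting nor running is simply dropped.

```python
def plan_dispatch(waiting, running):
    """Return the requests to send to the copytool, in order.

    waiting: list of (action, file) tuples in arrival order.
    running: set of files whose action is already on a copytool.
    Both structures are assumptions made for this sketch.
    """
    cancel_targets = {name for action, name in waiting if action == "CANCEL"}

    # Goal 2: the target is already running, so its CANCEL goes first.
    urgent = [
        ("CANCEL", name)
        for action, name in waiting
        if action == "CANCEL" and name in running
    ]

    # Goal 1: a waiting action that has a pending CANCEL is discarded
    # without being sent; the remaining actions keep their FIFO order.
    rest = [
        (action, name)
        for action, name in waiting
        if action != "CANCEL" and name not in cancel_targets
    ]
    return urgent + rest


waiting = [
    ("RESTORE", "f1"),
    ("RESTORE", "f2"),
    ("CANCEL", "f1"),   # f1 is still waiting: drop it, send nothing
    ("CANCEL", "f9"),   # f9 is running: send this CANCEL first
    ("ARCHIVE", "f3"),
]
plan = plan_dispatch(waiting, running={"f9"})
print(plan)
```

Here the CANCEL for f9 is sent first, the RESTORE of f1 never reaches the copytool, and the remaining actions keep their original order.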

Generated at Sat Feb 10 03:09:05 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.