Description
This ticket is a duplicate of LU-8324. I created it to describe the specific need to prioritize HSM CANCEL requests over the other HSM actions.
The coordinator processes actions in FIFO order (except for RESTORE). As a result, an action queued ahead of its CANCEL request is sent to the copytool before the CANCEL is. If both the action and its CANCEL are still on the waiting list, the coordinator could discard the action directly instead of sending it.
Moreover, if the action is already running on a copytool, the coordinator could send the CANCEL first, ahead of the other queued requests. This would quickly release resources on the copytool.
Example (for RESTORE action):
When a user requests the restore of many files by mistake, the requests fill the HSM llog waiting list and the coordinator.
Then, if we want to cancel the RESTORE requests, in most cases the CANCEL requests will be processed after the RESTORE requests.
The coordinator has to send the RESTORE and then the CANCEL to the copytool. Moreover, RESTORE actions are prioritized (see LU-8324), so CANCEL requests are less likely to be sent.
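The ordering problem described above can be illustrated with a toy model. This is only a sketch, not the coordinator's code: the function name `dispatch_order`, the tuple layout, and the FID strings are hypothetical, and the model assumes nothing more than "RESTORE first, everything else in FIFO order".

```python
def dispatch_order(requests):
    """Order in which requests reach the copytool in this simplified model.

    `requests` is a list of (action, fid) tuples in arrival order.
    RESTORE requests are sent first; everything else keeps its FIFO order.
    """
    restores = [r for r in requests if r[0] == "RESTORE"]
    others = [r for r in requests if r[0] != "RESTORE"]
    return restores + others


# A user restores three files by mistake and then cancels them.
# An unrelated ARCHIVE request arrives in between.
queue = [
    ("RESTORE", "fid1"),
    ("RESTORE", "fid2"),
    ("ARCHIVE", "fid9"),
    ("RESTORE", "fid3"),
    ("CANCEL", "fid1"),
    ("CANCEL", "fid2"),
    ("CANCEL", "fid3"),
]
order = dispatch_order(queue)
for action, fid in order:
    print(action, fid)
```

In this model every RESTORE is dispatched before the first CANCEL, so the copytool starts work that is already known to be unwanted.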
As a workaround, we manually run the command below to cancel all actions in the llog list:
lctl set_param mdt.$FSNAME-MDT0000.hsm_control=purge
The main goals of this "improvement" are:
- if the action has not been processed yet, the coordinator should process the CANCEL and remove the action directly from the llog list without sending it to the copytool.
- if the action has already been processed, the coordinator should send the CANCEL request to the copytool with priority over other requests.
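
The two cases above could be sketched as follows. This is a minimal illustration of the requested behavior, not an implementation proposal for the coordinator: the `Action` class, the `WAITING`/`STARTED` state names, `handle_cancel`, and the `send_to_copytool` callback are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Assumed state names, for illustration only.
WAITING = "WAITING"   # queued, not yet sent to a copytool
STARTED = "STARTED"   # already sent to a copytool


@dataclass
class Action:
    kind: str             # e.g. "ARCHIVE" or "RESTORE"
    fid: str
    state: str = WAITING


def handle_cancel(fid, action_list, send_to_copytool):
    """Handle a CANCEL for `fid` ahead of any other pending work.

    Returns "discarded" if a waiting action was dropped without involving
    the copytool, "sent" if the CANCEL was forwarded for a running action,
    and "noop" if no matching action was found.
    """
    for action in action_list:
        if action.fid != fid:
            continue
        if action.state == WAITING:
            # Case 1: never sent, so remove it from the list directly.
            action_list.remove(action)
            return "discarded"
        if action.state == STARTED:
            # Case 2: already running, so forward the CANCEL immediately.
            send_to_copytool(("CANCEL", fid))
            return "sent"
    return "noop"


sent = []
actions = [
    Action("RESTORE", "fid1", STARTED),
    Action("RESTORE", "fid2"),
    Action("RESTORE", "fid3"),
]
results = [handle_cancel(f, actions, sent.append) for f in ("fid1", "fid2", "fid4")]
print(results)                    # ['sent', 'discarded', 'noop']
print(sent)                       # [('CANCEL', 'fid1')]
print([a.fid for a in actions])   # ['fid1', 'fid3']
```

Only the running action (`fid1`) costs the copytool a message; the waiting one (`fid2`) disappears from the list without ever being sent.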