[LU-10968] add coordinator bypass upcalls for HSM archive and remove - Whamcloud Community JIRA

Details

Type: Improvement
Resolution: Unresolved
Priority: Minor
Fix Version/s: None
Affects Version/s: None
Labels: None

Description

HSM restore performance is often degraded by too many archive requests in the CDT llog or CT pipeline. Offer upcalls for archive and remove, invoked on the MDT, that bypass the coordinator and allow better scheduling of archives and removes.

From the commit message on https://review.whamcloud.com/32212:

This change provides an HSM upcall facility to optionally bypass the
HSM coordinator (CDT) for archive and remove requests. (Release
already bypasses the CDT and restore bypass is not supported by this
change.)

Requires updated MDT and a worker client. OSTs, compute nodes, and
copytool nodes need not be updated.

lctl set_param mdt.*.hsm.upcall_mask='ARCHIVE' # or 'ARCHIVE REMOVE', 'REMOVE', ''
lctl set_param mdt.*.hsm.upcall_path=/.../lhsm_mdt_upcall # Full path.

HSM requests whose action is set in the upcall_mask parameter will be
diverted from the coordinator and handled by the executable specified
by upcall_path. By default upcall_mask is empty which gives the normal
HSM coordinator handling behavior.

The upcall (to be supplied by the site) will be invoked by the MDT RPC
handler (runs on MDT as a root privileged process with an empty
environment). Invocation will be of the form:

  /.../lhsm_mdt_upcall ACTION FSNAME ARCHIVE_ID FLAGS DATA FID...

with one or more FIDs each as a separate argument. The upcall_path
parameter can be set to the path of an arbitrary (site supplied)
executable as long as it DTRT. The RPC handler will block until the
upcall completes. So for safety/liveness the upcall should really not
access Lustre. Instead the upcall should put the request in an
off-Lustre persistent queue or database and then exit. The actions
could be submitted to a job scheduler but care must be taken to ensure
that this does not entail any Lustre operations. See comments in
mdt_hsm_upcall().
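
As an illustration of the "queue it and exit" pattern described above, here is a minimal sketch of a site-supplied upcall in Python. It is not part of the patch. The queue location (QUEUE_DB), the hsm_queue table, and the function names are all hypothetical, and the sketch assumes that a nonzero exit status is how the upcall reports failure. Only the argument order (ACTION FSNAME ARCHIVE_ID FLAGS DATA FID...) comes from the invocation form shown above.

```python
"""Hypothetical site-supplied upcall: queue the request off Lustre, then return."""
import sqlite3
import sys

# Assumption: a path on local (non-Lustre) storage on the MDS.
QUEUE_DB = "/var/spool/lhsm/queue.db"


def parse_upcall_args(argv):
    """Split ACTION FSNAME ARCHIVE_ID FLAGS DATA FID... into a dict."""
    if len(argv) < 6:
        raise ValueError("expected ACTION FSNAME ARCHIVE_ID FLAGS DATA FID...")
    action, fsname, archive_id, flags, data = argv[:5]
    return {
        "action": action,
        "fsname": fsname,
        "archive_id": archive_id,
        "flags": flags,
        "data": data,
        "fids": list(argv[5:]),
    }


def enqueue(db_path, request):
    """Insert one row per FID in a single transaction; return rows added."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS hsm_queue ("
                "id INTEGER PRIMARY KEY AUTOINCREMENT, "
                "action TEXT, fsname TEXT, archive_id TEXT, "
                "flags TEXT, data TEXT, fid TEXT, "
                "state TEXT DEFAULT 'pending')"
            )
            conn.executemany(
                "INSERT INTO hsm_queue "
                "(action, fsname, archive_id, flags, data, fid) "
                "VALUES (?, ?, ?, ?, ?, ?)",
                [
                    (request["action"], request["fsname"],
                     request["archive_id"], request["flags"],
                     request["data"], fid)
                    for fid in request["fids"]
                ],
            )
        return len(request["fids"])
    finally:
        conn.close()


def main(argv):
    """Return 0 once the request is queued, 1 on any failure (assumed convention)."""
    try:
        enqueue(QUEUE_DB, parse_upcall_args(argv))
    except Exception as exc:
        print(f"upcall failed: {exc}", file=sys.stderr)
        return 1
    return 0
```

A real script would finish with sys.exit(main(sys.argv[1:])). Nothing here opens a file on Lustre, so the blocked RPC handler waits only for a local database write.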

A separate process (called a "worker" and also to be supplied by the
site) should read from that persistent queue and perform the
actions. The worker process does what a copytool does but instead of
listening on a KUC pipe for actions it reads from the queue. Like
existing copytools it must interact with Lustre and with the
archive. The main difference (on the Lustre side) is that it uses
slightly modified ioctls to handle the upcalled requests. To make it
easier I added a new command ('lfs hsm_upcall') that manages the
Lustre half of an upcalled action and a sample script
lustre/utils/lhsm_worker_posix that handles the archive side (assuming
a lhsmtool_posix archive layout). The idea is that 'lfs hsm_upcall'
knows about Lustre and lhsm_worker_posix knows about the
archive. Running

  lfs hsm_upcall lhsm_worker_posix ARCHIVE FSNAME ARCHIVE_ID FLAGS DATA FID...

will do the following for each FID:
  1. Open the Lustre file to be archived specified by FID.
  2. Send an RPC (which bypasses the CDT) to the MDT to say that ARCHIVE is starting.
  3. Invoke

       lhsm_worker_posix ACTION FSNAME ARCHIVE_ID FLAGS DATA FID

     with stdin opened to the file to be archived.
  4. Wait for lhsm_worker_posix and send an ARCHIVE completion RPC
     (with the exit status of lhsm_worker_posix) to the MDT.
  5. Close the file to be archived.

Remove is handled similarly but without the open or close.
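
Step 3 above hands the file to the archive-side script on stdin and uses its exit status as the result. The following Python sketch shows what such a script's core could look like. It is not the lhsm_worker_posix script from the patch. The flat one-file-per-FID layout in archive_path() is a hypothetical stand-in and is not the lhsmtool_posix archive layout. All function names are illustrative.

```python
import os
import shutil
import tempfile


def archive_path(archive_root, fid):
    """Hypothetical flat layout: one file per FID (not the lhsmtool_posix layout)."""
    return os.path.join(archive_root, fid.strip("[]"))


def archive_from_stream(stream, archive_root, fid):
    """Copy a binary stream into the archive; return 0 on success, 1 on failure."""
    dest = archive_path(archive_root, fid)
    tmp_path = None
    try:
        # Write to a temporary file first so a partial copy never
        # appears under the final name.
        fd, tmp_path = tempfile.mkstemp(dir=archive_root)
        with os.fdopen(fd, "wb") as tmp:
            shutil.copyfileobj(stream, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, dest)
        return 0
    except OSError:
        if tmp_path and os.path.exists(tmp_path):
            os.unlink(tmp_path)
        return 1


def remove_from_archive(archive_root, fid):
    """Delete the archived copy; a missing copy counts as success in this sketch."""
    try:
        os.unlink(archive_path(archive_root, fid))
        return 0
    except FileNotFoundError:
        return 0
    except OSError:
        return 1
```

For an ARCHIVE action a script built on this would call archive_from_stream(sys.stdin.buffer, root, fid) and exit with the returned status. For REMOVE it would call remove_from_archive(root, fid) and never touch stdin.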

See comments in lustre/utils/lhsm_worker_posix and lfs_hsm_upcall().
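
The worker side can be sketched in the same spirit: read pending requests from the off-Lustre queue and run the 'lfs hsm_upcall' command line shown above, one FID per invocation. This is an illustration only. It assumes a hypothetical SQLite table hsm_queue with columns id, action, fsname, archive_id, flags, data, fid and state, and it assumes an lfs binary built with this patch, since 'lfs hsm_upcall' is introduced by the change under review. The function names are made up for the example.

```python
import sqlite3
import subprocess


def build_command(row, worker="lhsm_worker_posix"):
    """argv for: lfs hsm_upcall WORKER ACTION FSNAME ARCHIVE_ID FLAGS DATA FID."""
    return ["lfs", "hsm_upcall", worker, row["action"], row["fsname"],
            row["archive_id"], row["flags"], row["data"], row["fid"]]


def drain_queue(db_path, run=subprocess.call):
    """Run each pending request once; return a list of (fid, exit_status)."""
    conn = sqlite3.connect(db_path)
    conn.row_factory = sqlite3.Row  # allows row["column"] access
    results = []
    try:
        pending = conn.execute(
            "SELECT * FROM hsm_queue WHERE state = 'pending' ORDER BY id"
        ).fetchall()
        for row in pending:
            status = run(build_command(row))
            new_state = "done" if status == 0 else "failed"
            with conn:  # commit each state change separately
                conn.execute(
                    "UPDATE hsm_queue SET state = ? WHERE id = ?",
                    (new_state, row["id"]),
                )
            results.append((row["fid"], status))
    finally:
        conn.close()
    return results
```

The run parameter defaults to subprocess.call so the command executor can be swapped for a stub in tests or for a job-scheduler submission. The order and concurrency in which pending rows are processed is where a site could apply its own scheduling policy.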

This may seem like a lot of moving parts but internally HSM has a lot
of parts and this was the cleanest way to decompose it that would
offer the flexibility needed.

Issue Links

is related to

LU-13384 HSM copytool API for external coordinator (Open)
LU-8324 HSM: prioritize HSM requests (Open)
LU-9680 Improve the user land to kernel space interface for lustre (In Progress)
LU-7659 Replace KUC by more standard mechanisms (Reopened)

is related to

LU-6081 hsm: add file migrate support (Open)

Activity

Ben Evans (Inactive) added a comment - 22/Mar/22 7:49 PM

yep, given some improvements, coordinatool looks like a really good solution.

James A Simmons added a comment - 22/Mar/22 5:42 PM

Why even bother with kernel space at all then? If you want a pure user land solution then look at http://github.com/cea-hpc/coordinatool.

This work is looking to improve what we are already using without creating a new interface.

Ben Evans (Inactive) added a comment - 22/Mar/22 5:23 PM

I think 99% of this can be skipped by LU-13384 and using purely non-kernel calls. lfs hsm ... calls just all get routed to the external coordinator (of whatever form).

The only issue is the calls that perform imperative restore on file access. I believe those can be easily added using a smaller chunk of the infrastructure in this PR.

James A Simmons added a comment - 22/Mar/22 3:25 PM

Both HPE and Microsoft are interested in this work.

Cory Spitz added a comment - 09/Jun/21 11:36 PM

There are still two patches pending in Gerrit for this ticket. It is probably best not to abandon them. Granted it isn't a true dependency, but we've all been waiting for the netlink changes to finish refreshing and landing the HSM/data movement patches that would be impacted. It looks like https://review.whamcloud.com/#/c/34230 finally has all the necessary +1s and it is even in master-next now, so perhaps we can reopen and resume this work soon.

John Hammond added a comment - 09/Jun/21 5:30 PM

Closing since this isn't being worked on.

Cory Spitz added a comment - 19/Feb/20 2:53 AM

beevans, this is assigned to your old persona.

Gerrit Updater added a comment - 18/Oct/19 7:02 PM

Ben Evans (bevans@cray.com) uploaded a new patch: https://review.whamcloud.com/36492
Subject: LU-10968 hsm: encapsulate copyaction_private
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: dfe56daacdb3cbf85b45eb2eccf98038a665c63c

Gerrit Updater added a comment - 18/Sep/19 9:27 PM

Ben Evans (bevans@cray.com) uploaded a new patch: https://review.whamcloud.com/36235
Subject: LU-10968 hsm: create external HSM queue interface
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: ee8c7d8d925a3b21d29a5170524781ae9375618c

James A Simmons added a comment - 13/Feb/19 3:35 PM

To let you know, I'm going to push another LU-9680 update. I have been talking to Amir about its application for LNet UDSP as well as using this for lnet selftest, so I might move the netlink handling into liblnetconfig. I will rebase the LU-7659 patch on top of LU-9680 as well as push an early stats patch I developed which is not finished.

People

Assignee: Nikitas Angelinas

Reporter: John Hammond

Votes: 1

Watchers: 23

Dates

Created: 30/Apr/18 2:56 PM

Updated: 28/Feb/25 1:03 AM