[LU-4640] Last unlink should trigger HSM remove by default Created: 17/Feb/14  Updated: 27/Jun/17  Resolved: 07/Jun/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.0
Fix Version/s: Lustre 2.10.0

Type: Bug Priority: Minor
Reporter: Aurelien Degremont (Inactive) Assignee: Bruno Faccini (Inactive)
Resolution: Fixed Votes: 0
Labels: HSM, patch

Issue Links:
Related
is related to LU-9654 fix problem of Remove Archive on Last... Resolved
is related to LU-7881 sanity-hsm test_26b: @@@@@@ FAIL: Co... Resolved
Severity: 3
Rank (Obsolete): 12691

 Description   

In the current code, when an archived file is removed from the file system, no HSM request is triggered. We rely on Changelogs, and on RobinHood reading them, to send the corresponding hsm_remove requests that clean up those orphans in the HSM backend. This is done on purpose, to provide a way to implement "soft unlink": RobinHood removes the file from the backend after a grace period, during which admins can still restore the file from the HSM if they want to (using the import feature).

Requiring a RobinHood setup to handle this cleaning is too big a limitation. We should consider modifying this behaviour.

By default, Lustre should automatically add an HSM_REMOVE request for any last unlink. This way, no file will be leaked in the archive.
A tunable should be added to disable this behaviour (should we add this to hsm_policy?) and go back to a mode where an external component (RobinHood) is responsible for tracking UNLINK changelog records and adding hsm_remove requests when needed (the current behaviour).



 Comments   
Comment by Andreas Dilger [ 17/Feb/14 ]

Not sure if this is possible, but what about tracking the nlink count in the unlink record? That way, on the last unlink it is clear to the ChangeLog reader, without the need to write a separate record each time.

I haven't looked at the code yet, just an idea as I read this bug.

Comment by Aurelien Degremont (Inactive) [ 18/Feb/14 ]

The issue is not about which record needs to be added. When I say "add an HSM_REMOVE request", I do not mean a Changelog record, but a real HSM request to be processed by the coordinator and so on.

The problem here is that we need an external tool to remove files from the backend when they are unlinked in Lustre. For a simple setup, the policy engine should be optional.

My idea was to track nlink == 0; when this is true, call something like:

 if (nlink == 0 && (hsm_flags & HS_EXISTS)) {
   if (!(policy & soft_remove)) {
     mdt_hsm_request(HUA_REMOVE);
   }
 }
Comment by Peter Jones [ 18/Feb/14 ]

Bruno

Could you please look into this one?

Thanks

Peter

Comment by Malcolm Cowe (Inactive) [ 18/Feb/14 ]

While this is just my opinion, I would prefer that the default is not to remove the file from the backend, in order to keep behaviour consistent with the current case.

Comment by Aurelien Degremont (Inactive) [ 19/Feb/14 ]

We can set this policy off by default, in order to be coherent with the 2.5.0 behaviour. However, I'm wondering what admins expect when they set up a simple Lustre/HSM system. Do they expect files to be removed from the backend when they unlink them in Lustre?

I raised this issue following many discussions with pioneer users of Lustre 2.5.0 and their surprise at this behaviour.

Comment by Andreas Dilger [ 19/Feb/14 ]

I agree that tracking nlink == 0 (via a flag in the unlink record?) is probably enough to signal to the policy engine when the last unlink happens.

That said, I think it should be up to the policy engine and the system policy to decide when the archive copy is deleted. In some sites files may never be deleted, or the length of time that a file is kept may depend on the owner of the file or what directory it is in. Those are all decisions perfectly suited to the policy engine, and not inside Lustre.

This means there would need to be some mechanism in the policy engine or archive to decide how long to keep files after they are deleted from Lustre, and to schedule the deletion afterward.

Comment by Aurelien Degremont (Inactive) [ 21/Feb/14 ]

Andreas, all these features are already possible using RobinHood. The idea was to offer the possibility to use Lustre/HSM simply, without a policy engine, and still be able to remove files from the backend when needed.
There is an important point here: files in the backend are archived using their Lustre FID. Once the file is removed from Lustre, its FID is no longer available. If admins want to remove the file from the backend afterwards, they have to back up the FID list somewhere before the file is removed and use it later to delete the file. RobinHood does this, but if you have a simple system without it, you have a problem.
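
To illustrate that last point, here is a minimal userspace sketch (not from this ticket's patches; the FID string handling and the default archive index are assumptions) of issuing an HSM remove for a FID that was saved before the unlink, using the public llapi HSM request API:

/*
 * Hypothetical sketch only: remove an archived copy by a FID saved
 * before the Lustre file was unlinked.
 */
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <lustre/lustreapi.h>

static int hsm_remove_by_fid(const char *mnt, const struct lu_fid *fid)
{
        struct hsm_user_request *hur;
        int rc;

        hur = llapi_hsm_user_request_alloc(1, 0);      /* 1 item, no data */
        if (hur == NULL)
                return -ENOMEM;

        hur->hur_request.hr_action = HUA_REMOVE;
        hur->hur_request.hr_archive_id = 0;            /* default archive */
        hur->hur_request.hr_flags = 0;
        hur->hur_request.hr_itemcount = 1;
        hur->hur_request.hr_data_len = 0;
        hur->hur_user_item[0].hui_fid = *fid;
        hur->hur_user_item[0].hui_extent.offset = 0;
        hur->hur_user_item[0].hui_extent.length = -1;

        rc = llapi_hsm_request(mnt, hur);
        free(hur);
        return rc;
}

int main(int argc, char **argv)
{
        unsigned long long seq;
        unsigned int oid, ver;
        struct lu_fid fid;

        /* FID as printed by "lfs path2fid", e.g. [0x200000401:0x1:0x0] */
        if (argc != 3 ||
            sscanf(argv[2], "[0x%llx:0x%x:0x%x]", &seq, &oid, &ver) != 3) {
                fprintf(stderr, "usage: %s <mountpoint> '[seq:oid:ver]'\n",
                        argv[0]);
                return 1;
        }
        fid.f_seq = seq;
        fid.f_oid = oid;
        fid.f_ver = ver;

        return hsm_remove_by_fid(argv[1], &fid) == 0 ? 0 : 1;
}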

Comment by Malcolm Cowe (Inactive) [ 23/Feb/14 ]

From the original hpdd_discuss thread, I thought that there was a limitation: that the archive_id of the unlinked file is not known by RH, so that a deferred remove does not know which archive to remove the file from.

So, it's not enough to know the FID of the file that has been deleted from Lustre; one also needs the archive_id in order to locate the remote copy.

Comment by Aurelien Degremont (Inactive) [ 24/Feb/14 ]

It's true that RBH does not know this archive_id, but most of the time it will use the default_archive_id for that.

Anyway, this is not related to this ticket. This ticket tries to manage archive removal in a setup WITHOUT RobinHood.

Comment by Andreas Dilger [ 25/Feb/14 ]

Sorry, I didn't understand the details before. In the case of HSM without RobinHood, the archived copy should be deleted immediately when the last link is dropped. It appears that this is (or should be) already handled with CLF_UNLINK_LAST in the CL_UNLINK record. If there is a simple ChangeLog processing tool (not RH), then it makes sense to have it turn CLF_UNLINK_LAST directly into HUA_REMOVE.

I agree with Malcolm that this behaviour should be optional (default to off), since the archive may have its own retention policy.
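
As a rough illustration of what such a minimal tool could look like (this is only a hedged sketch, not RobinHood nor any patch from this ticket; the mount point and MDT name are made-up, and a real tool would also register a changelog user and clear consumed records), assuming the public llapi changelog and HSM APIs:

#include <stdlib.h>
#include <errno.h>
#include <lustre/lustreapi.h>

/* Queue an HSM remove for one FID in the default archive. */
static int send_hsm_remove(const char *mnt, const struct lu_fid *fid)
{
        struct hsm_user_request *hur;
        int rc;

        hur = llapi_hsm_user_request_alloc(1, 0);
        if (hur == NULL)
                return -ENOMEM;
        hur->hur_request.hr_action = HUA_REMOVE;
        hur->hur_request.hr_archive_id = 0;
        hur->hur_request.hr_itemcount = 1;
        hur->hur_user_item[0].hui_fid = *fid;
        hur->hur_user_item[0].hui_extent.length = -1;
        rc = llapi_hsm_request(mnt, hur);
        free(hur);
        return rc;
}

int main(void)
{
        const char *mnt = "/mnt/lustre";        /* illustrative */
        const char *mdt = "lustre-MDT0000";     /* illustrative */
        struct changelog_rec *rec;
        void *reader;

        if (llapi_changelog_start(&reader, CHANGELOG_FLAG_BLOCK, mdt, 0) < 0)
                return 1;

        while (llapi_changelog_recv(reader, &rec) == 0) {
                /* Only the last unlink of an archived file needs cleanup. */
                if (rec->cr_type == CL_UNLINK &&
                    (rec->cr_flags & CLF_UNLINK_LAST) &&
                    (rec->cr_flags & CLF_UNLINK_HSM_EXISTS))
                        send_hsm_remove(mnt, &rec->cr_tfid);
                llapi_changelog_free(&rec);
        }
        llapi_changelog_fini(&reader);
        return 0;
}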

Comment by Bruno Faccini (Inactive) [ 12/Mar/14 ]

Actually, in the master code there is even a new CLF_UNLINK_HSM_EXISTS flag to indicate that the unlinked file is archived.
Also, is the fact that there is a ChangeLog reader enough to determine that a policy engine (RH?!) is running?
What about the "soft-rm" feature and detecting its setting??
Last, what would be the best way to start an HSM remove request: from/in an external thread/context, or by artificially creating one?

Comment by Aurelien Degremont (Inactive) [ 12/Mar/14 ]

It seems I have been very unclear; I do not know where I went so wrong when explaining my view.

Forget about Changelogs for this ticket! Do not consider Changelogs for this ticket.

My whole point holds even with Changelogs off.

> Last, what would be the best way to start a HSM-Remove request, from/in an external thread/context or by artificially create one ?

My idea is to call mdt_hsm_add_actions( HSM_REMOVE: FID ) when the MDT UNLINK RPC handler detects that the file was successfully removed and nlink == 0. This does not require any userspace program nor any Changelog record. This behaviour should be controlled by a tunable.

I need to check the code to be more precise about where to put this, but somewhere on the mdt_unlink() code path, a call to mdt_hsm_request().

Comment by Robert Read (Inactive) [ 12/Mar/14 ]

It would be convenient if the HSM_REMOVE also included the xattrs (or perhaps just the value of a specific user xattr). This would allow copytools that store their own key as an xattr to clean up after deletion as well.

Comment by Bruno Faccini (Inactive) [ 12/Mar/14 ]

Aurelien,
I should have given more detail too!! My comment about CLF_UNLINK_HSM_EXISTS did not mean that I link what is to be done here to ChangeLogs, but only that in mdd_unlink(), when nlink becomes 0, there is already a verification that an archived version of the file exists (via mdd_hsm_archive_exists()), and then the flag is set to CLF_UNLINK_HSM_EXISTS for the associated ChangeLog record. But maybe it is the wrong place to create the HSM_REMOVE request??

Comment by Aurelien Degremont (Inactive) [ 17/Mar/14 ]

> But maybe it is the wrong place to create the HSM_REMOVE request ??

MDS layering was a pain when designing the Coordinator. Now most of the coordinator is implemented in the MDT layer; as a consequence, unfortunately, you cannot create a new request in mdd_unlink().

After looking at the code, here is a quick proposal.

mdt_reint_unlink() seems to be a good place to add this call.

        if (rc == 0 && !lu_object_is_dying(&mc->mot_header))
                /* Also read HSM data here: ma_need |= MA_HSM */
                rc = mdt_attr_get_complex(info, mc, ma);
        if (rc == 0)
                /* In this function, call mdt_hsm_add_actions() to add a new request */
                mdt_handle_last_unlink(info, mc, ma);

Atomicity and error management could be tricky since we are in the MDT layer, though. But we do not have any other option.

Comment by Bruno Faccini (Inactive) [ 09/Mar/15 ]

After more discussions it appears that this new behaviour should not be enabled by default, but only enabled via a new tunable setting.

Comment by Gerrit Updater [ 07/Apr/15 ]

Faccini Bruno (bruno.faccini@intel.com) uploaded a new patch: http://review.whamcloud.com/14384
Subject: LU-4640 mdt: implement Remove Archive on Last Unlink policy
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 5c285dee336a29c2e1a55835004446fbd89c9c63

Comment by Bruno Faccini (Inactive) [ 09/Apr/15 ]

Patch-set #1 testing has shown that requesting HSM data was unnecessary, since it is already done earlier for a fully unlinked file via the

mdt_reint_unlink()->mdo_unlink()->mdd_unlink()->mdd_hsm_archive_exists()

call sequence.
Also, mdt_hsm_add_actions() cannot be used because it tries to look up the object again (via mdt_object_find()) but then hangs/waits in lu_object_find[_at]() (due to -EAGAIN being returned for a dying object), since the object has already been marked as dying via the

mdt_reint_unlink()->mdo_unlink()->mdd_unlink()->mdd_finish_unlink()->mdo_destroy()->...

call sequence. I use mdt_agent_record_add() directly instead.

Comment by Bruno Faccini (Inactive) [ 23/Apr/15 ]

Now, using mdt_agent_record_add() directly raises a new question: should we wake up the CDT (by calling mdt_hsm_cdt_wakeup()) for each of these hsm_remove requests being created, or simply "batch" them in the hsm_actions llog, waiting for the CDT to wake up upon some other event or at the end of the loop_period timer? My feeling is that batching them is a good idea since there is no urgency to handle them...

Comment by Aurelien Degremont (Inactive) [ 28/Apr/15 ]

I agree with you that we do not need to wake up the CDT for each of them. It will make tests more difficult to write, however.

Comment by Gerrit Updater [ 14/Mar/16 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/14384/
Subject: LU-4640 mdt: implement Remove Archive on Last Unlink policy
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 74d92933108dc64b110a843352cf3336dca249d0

Comment by Bruno Faccini (Inactive) [ 14/Mar/16 ]

Hello Aurelien,
Now that my patch to implement RAoLU (Remove Archive on Last Unlink) has finally landed, do you agree that we can close this long-standing ticket??

Comment by Gerrit Updater [ 15/Mar/16 ]

Oleg Drokin (oleg.drokin@intel.com) uploaded a new patch: http://review.whamcloud.com/18925
Subject: Revert "LU-4640 mdt: implement Remove Archive on Last Unlink policy"
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 97b0e7059a9755e918e0760fd21b33186395cf5b

Comment by Gerrit Updater [ 15/Mar/16 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/18925/
Subject: Revert "LU-4640 mdt: implement Remove Archive on Last Unlink policy"
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: d9f95aa201341d972eeb610471e3c45f1ba12202

Comment by Oleg Drokin [ 15/Mar/16 ]

This patch was reverted due to the testing problems discussed in LU-7881, which were causing major disruption to other patches.

Comment by Bruno Faccini (Inactive) [ 15/Mar/16 ]

Well, the problem comes from how long this patch waited to be landed: in the meantime the patch for LU-7136 landed and changed the sanity-hsm framework, causing this issue...

I will push a new version/patch-set of http://review.whamcloud.com/14384/ that will integrate my additional change for LU-7881 (http://review.whamcloud.com/#/c/18919/, "use new functions to kill and verify CT death").

Comment by Bruno Faccini (Inactive) [ 15/Mar/16 ]

Hmm, but what is the best way to do so? I first tried to rebase my patch on top of master, which already contains it and its revert, but nothing changed.
This could be a side effect of the revert, which does not really undo the first merge but only applies a patch reverting the original changes...
So, do I need to create a new patch by manually back-porting the changes of the original patch on top of master??

Comment by Bruno Faccini (Inactive) [ 16/Mar/16 ]

Basically, after more thought about this, it seems that the best way is to revert the revert and then add the additional changes (LU-7881). This should lead to a new commit for almost the same changes.

Comment by Gerrit Updater [ 16/Mar/16 ]

Faccini Bruno (bruno.faccini@intel.com) uploaded a new patch: http://review.whamcloud.com/18946
Subject: LU-4640 mdt: implement Remove Archive on Last Unlink policy
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 2ed64a4994a866ce653f10af1c110abe6d506ecc

Comment by Li Xi (Inactive) [ 08/May/17 ]

Hi, any plan to merge this patch in the near future? I am asking because I realized this is a critical problem for the policy engine that I am working on (Lustre Integrated Policy Engine). This policy engine doesn't require Lustre Changelogs, so it cannot be notified by the Changelog about the unlinking of files.

Comment by Gerrit Updater [ 08/May/17 ]

Li Xi (lixi@ddn.com) uploaded a new patch: https://review.whamcloud.com/26980
Subject: LU-4640 mdt: implement Remove Archive on Last Unlink policy
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 4d76c26ec74c3b0748c6b0a3bf10ff1eebfab216

Comment by Peter Jones [ 08/May/17 ]

Li Xi

It is feasible that we could land a fix for this change early in the 2.11 dev cycle

Peter

Comment by Bruno Faccini (Inactive) [ 09/May/17 ]

OK, I will try to work on this patch again if there is an urgent need now.
The reason my original patch did not work when it was merged was some other changes (in the test framework, and also ones causing earlier object removal) that had been merged between its final successful testing/review and the time it landed.

LU-7881 has addressed the changes required for the new sanity-hsm test_26[a,b,c,d] to comply with the new test framework changes.

But I still had work to do in order to address the new condition caused by early object removal.
I have just pushed a new patch-set #7 for change #18946 to try to fix this now.

By the way Li, why did you create a new Change-Id (#26980) instead of working on top of mine/the existing one??

Comment by Li Xi (Inactive) [ 09/May/17 ]

Hi Bruno,

Thanks for helping. Yes, I created a new patch based on your latest patch.

I first improved tests 26c and 26d. After that, the test results immediately showed that mdt_attr_get_complex() returns -2 (-ENOENT) because the object is already deleted. That means MA_HSM and MA_INODE need to be read before the unlink happens. That is why I split mdt_handle_last_unlink() into mdt_handle_last_unlink_prepare() and mdt_handle_last_unlink_commit().

Let's see what the test result is this time.
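
For readers following the discussion, a very rough illustration of the prepare/commit split described above (the actual function signatures, placement and locking in change #18946 may differ; this is only a sketch in the spirit of the earlier proposal in this ticket):

        /* Prepare step (illustrative): while the object is still alive, read
         * the HSM and inode attributes so the later step does not hit
         * -ENOENT (-2). */
        ma->ma_need |= MA_INODE | MA_HSM;
        rc = mdt_attr_get_complex(info, mc, ma);

        /* ... mdo_unlink()/destroy of the last link happens in between ... */

        /* Commit step (illustrative signature): if this really was the last
         * link and an archived copy exists, queue an HSM remove record for
         * the coordinator, e.g. via mdt_agent_record_add(). */
        if (rc == 0 && ma->ma_attr.la_nlink == 0 &&
            (ma->ma_hsm.mh_flags & HS_EXISTS))
                mdt_handle_last_unlink_commit(info, mc, ma);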

Comment by Li Xi (Inactive) [ 09/May/17 ]

I could have pushed a new patch using the same Change-Id as #18946, but I am still not sure whether my approach is correct or not.

Comment by Li Xi (Inactive) [ 09/May/17 ]

Patch #26980 was on top of version 6 of patch #18946, and version 8 of patch #18946 looks better than #26980 now.

Comment by Gerrit Updater [ 07/Jun/17 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/18946/
Subject: LU-4640 mdt: implement Remove Archive on Last Unlink policy
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: dd4b034540d7dda499ebbb8c465d3435ad46b82a

Comment by Peter Jones [ 07/Jun/17 ]

Landed for 2.10
