
Last unlink should trigger HSM remove by default

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.10.0
    • Lustre 2.5.0
    • 3
    • 12691

    Description

      In the current code, removing an archived file from the file system does not trigger any HSM request. We rely on RobinHood reading the Changelogs and sending the corresponding hsm_remove requests to clean up the orphans in the HSM backend. This is done on purpose, to provide a way to implement "soft unlink": RobinHood removes the file from the backend after a grace period, and during that time admins can restore the file from the HSM if they want to (using the import feature).

      Requiring a RobinHood setup to handle this cleanup is too big a limitation. We should consider changing this behaviour.

      By default, Lustre should automatically add an HSM_REMOVE request on any last unlink. This way, no file is leaked in the archive.
      A tunable should be added to disable this behaviour (should we add this to hsm_policy?) and return to the current behaviour, where an external component (RobinHood) is responsible for tracking UNLINK changelog records and adding hsm_remove requests when needed.
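The proposed behaviour can be modelled roughly as follows. This is a sketch in Python, not Lustre code: `MdtModel`, `FileEntry`, the `remove_archive_on_last_unlink` flag, and the request tuples are hypothetical names chosen for illustration, and do not reflect the tunable name or data structures of any actual patch.

```python
from dataclasses import dataclass, field


@dataclass
class FileEntry:
    fid: str
    nlink: int = 1
    archived: bool = False  # True once a copy exists in the HSM backend


@dataclass
class MdtModel:
    # Hypothetical tunable: when True, the last unlink of an archived
    # file queues an HSM_REMOVE request automatically.
    remove_archive_on_last_unlink: bool = True
    hsm_requests: list = field(default_factory=list)
    changelog: list = field(default_factory=list)

    def unlink(self, entry: FileEntry) -> None:
        entry.nlink -= 1
        # In this model an UNLINK record is always written, so an
        # external consumer can still react when the tunable is off.
        self.changelog.append(("UNLINK", entry.fid))
        if entry.nlink > 0:
            return  # not the last link: nothing to clean up yet
        if entry.archived and self.remove_archive_on_last_unlink:
            self.hsm_requests.append(("HSM_REMOVE", entry.fid))
```

With the flag set, only the final unlink of an archived file produces a request; with it cleared, the model falls back to leaving cleanup to whoever reads the changelog.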


          Activity

            [LU-4640] Last unlink should trigger HSM remove by default
            pjones Peter Jones added a comment -

            Landed for 2.10


            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/18946/
            Subject: LU-4640 mdt: implement Remove Archive on Last Unlink policy
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: dd4b034540d7dda499ebbb8c465d3435ad46b82a

            lixi Li Xi (Inactive) added a comment -

            Patch #26980 was based on version 6 of patch #18946. Version 8 of patch #18946 now looks better than #26980.

            lixi Li Xi (Inactive) added a comment -

            I could have pushed a new patch using the same Change-Id as #18946, but I am still not sure whether my reasoning is correct.

            lixi Li Xi (Inactive) added a comment -

            Hi Bruno,

            Thanks for helping. Yes, I created a new patch based on your latest patch.

            I first improved tests 26c and 26d. After that, the test results immediately showed that mdt_attr_get_complex returns -2 (-ENOENT) because the object has already been deleted. That means the MA_HSM and MA_INODE attributes need to be read before the unlink happens. That is why I split mdt_handle_last_unlink into mdt_handle_last_unlink_prepare and mdt_handle_last_unlink_commit.

            Let's see what the test results are this time.
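The prepare/commit split described above can be illustrated with a small model. This is a sketch in Python, not the MDT code: `Store`, `ObjectGone`, and the three functions are hypothetical stand-ins, with `ObjectGone` playing the role of the -2 (-ENOENT) returned once the object is gone. The point it shows is only the ordering: attributes are captured while the object still exists, and the decision to queue a remove is made afterwards from the saved copy.

```python
class ObjectGone(Exception):
    """Stands in for the -ENOENT (-2) seen once the object is deleted."""


class Store:
    def __init__(self):
        self.attrs = {}         # fid -> attribute dict, e.g. {"archived": True}
        self.hsm_requests = []  # queued (action, fid) tuples

    def get_attrs(self, fid):
        if fid not in self.attrs:
            raise ObjectGone(fid)
        return dict(self.attrs[fid])

    def destroy(self, fid):
        del self.attrs[fid]


def last_unlink_prepare(store, fid):
    # Read the HSM state while the object still exists.
    return store.get_attrs(fid)


def last_unlink_commit(store, fid, saved):
    # Act on the saved attributes; the object itself is already gone.
    if saved.get("archived"):
        store.hsm_requests.append(("HSM_REMOVE", fid))


def unlink_last(store, fid):
    saved = last_unlink_prepare(store, fid)
    store.destroy(fid)
    last_unlink_commit(store, fid, saved)
```

Calling `store.get_attrs(fid)` after `store.destroy(fid)` raises `ObjectGone`, which is the failure mode a single post-unlink handler would run into in this model.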

            bfaccini Bruno Faccini (Inactive) added a comment -

            OK, I will try to work on this patch again if there is an urgent need now.
            My original patch did not work when merged because other changes (in the test framework, and one causing earlier object removal) had been merged between its final successful testing/review and the time it landed.

            LU-7881 has addressed the changes required for the new sanity-hsm/test_26[a,b,c,d] to comply with the new test framework.

            But I still had work to do to address the new condition caused by early object removal.
            I just pushed a new patch set #7 for change #18946 to try to fix this now.

            By the way Li, why did you create a new Change-Id (#26980) instead of working on top of my existing one?
            pjones Peter Jones added a comment -

            Li Xi

            It is feasible that we could land a fix for this change early in the 2.11 dev cycle

            Peter

            gerrit Gerrit Updater added a comment -

            Li Xi (lixi@ddn.com) uploaded a new patch: https://review.whamcloud.com/26980
            Subject: LU-4640 mdt: implement Remove Archive on Last Unlink policy
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 4d76c26ec74c3b0748c6b0a3bf10ff1eebfab216

            lixi Li Xi (Inactive) added a comment -

            Hi, is there any plan to merge this patch in the near future? I am asking because I realized this is a critical problem for the policy engine that I am working on (Lustre Integrated Policy Engine). This policy engine does not use the Lustre Changelog, so it cannot be notified through the Changelog when files are unlinked.

            gerrit Gerrit Updater added a comment -

            Faccini Bruno (bruno.faccini@intel.com) uploaded a new patch: http://review.whamcloud.com/18946
            Subject: LU-4640 mdt: implement Remove Archive on Last Unlink policy
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 2ed64a4994a866ce653f10af1c110abe6d506ecc

            People

              bfaccini Bruno Faccini (Inactive)
              adegremont Aurelien Degremont (Inactive)
