[LU-10530] Scanning tool for HSM storage of POSIX copytool Created: 18/Jan/18  Updated: 18/Mar/21

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: New Feature
Priority: Minor
Reporter: Li Xi (Inactive)
Assignee: Li Xi
Resolution: Unresolved
Votes: 0
Labels: None

Issue Links:
Blocker
Related
is related to LU-14359 support a flat HSM archive format Open
is related to LU-10602 Add file heat support for Persistent ... Open
Rank (Obsolete): 9223372036854775807

 Description   

The current HSM POSIX copytool works well as a demo of the copytool implementation. However, it creates a lot of directory entries under the HSM storage and never cleans them up. Also, if the Remove Archive on Last Unlink policy (LU-4640) is not enabled and no policy engine such as Robinhood is used, a lot of orphan files will be left in the HSM storage. Thus, we want to develop a tool that scans the directory tree of the HSM storage and cleans it up.
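To illustrate the directory cleanup part only, here is a minimal sketch of a bottom-up scan that finds directories containing no files. It is not the proposed tool: the function name `prune_empty_dirs` and the `dry_run` flag are hypothetical, and it treats the archive as an ordinary POSIX tree without knowing anything about the copytool's layout or about orphan detection.

```python
import os


def prune_empty_dirs(root, dry_run=True):
    """Find directories under `root` that hold no files (hypothetical helper).

    Walks bottom-up so that a parent whose children were all pruned is
    itself reported. Directories are only removed when dry_run is False.
    The root itself is never removed.
    """
    pruned = []
    pruned_set = set()
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        if dirpath == root or filenames:
            continue
        # Subdirectories (or symlinks to directories) that were not pruned
        # keep this directory alive.
        remaining = [d for d in dirnames
                     if os.path.join(dirpath, d) not in pruned_set]
        if remaining:
            continue
        if not dry_run:
            os.rmdir(dirpath)
        pruned_set.add(dirpath)
        pruned.append(dirpath)
    return pruned
```

Running it first with `dry_run=True` gives a list to review before anything is deleted, which seems prudent on archive storage.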

Since PCC (LU-10092, LU-10499) uses the same HSM directory structure, this tool could also be used for PCC to:

1) scan the whole PCC storage and detach all the cached files when caching them on PCC is no longer appropriate, e.g. when one job stops and another job starts to use PCC.

2) apply some kind of policy (e.g. based on UID/GID/JOBID/ProjID, etc) for PCC cache management.

These use cases need a full scan of the HSM storage directory, and that is what we want to implement.
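As a rough sketch of the policy use case, the following walks a storage directory and yields the files whose owner matches a rule. The names `OwnerPolicy` and `select_files` are hypothetical, and only UID and GID are covered because those are available from a plain `lstat()`; JOBID and ProjID would need Lustre-specific metadata that this sketch does not attempt to read. What to do with each selected file (for example detaching it from PCC) is left to the caller.

```python
import os
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class OwnerPolicy:
    """Hypothetical rule: match files by owner; None means 'any'."""
    uid: Optional[int] = None
    gid: Optional[int] = None

    def matches(self, st: os.stat_result) -> bool:
        if self.uid is not None and st.st_uid != self.uid:
            return False
        if self.gid is not None and st.st_gid != self.gid:
            return False
        return True


def select_files(root: str, policy: OwnerPolicy) -> Iterator[str]:
    """Yield paths of files under `root` that satisfy `policy`."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except FileNotFoundError:
                # The file disappeared between listing and stat; skip it.
                continue
            if policy.matches(st):
                yield path
```

For example, `list(select_files("/mnt/pcc", OwnerPolicy(uid=1000)))` would list the files owned by UID 1000; both the mount point and the UID here are placeholders.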



 Comments   
Comment by Andreas Dilger [ 18/Jan/18 ]

My preference would be to fix the POSIX copytool to avoid creating such a poor directory structure. It should be able to read the old directory structure for compatibility reasons, but it would use a new directory structure for new files.

We've also discussed moving the RBH POSIX copytool into Lustre in place of the current copytool to leverage the common functionality.

Comment by Li Xi (Inactive) [ 18/Jan/18 ]

Hi Andreas,

I agree that the current POSIX copytool needs to be updated, and replacing it with the RBH POSIX copytool may be a good idea.

The main purpose of the scanning tool here is to apply policy to files cached on PCC. We will start with this use case.

Comment by Gerrit Updater [ 29/Jan/18 ]

Yingjin Qian (qian@ddn.com) uploaded a new patch: https://review.whamcloud.com/31070
Subject: LU-10530 utils: Scanning tool for HSM POSIX storage
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 4379f3a8bc547bf83b295c6865d66d1461a43088

Comment by Andreas Dilger [ 18/Mar/21 ]

LU-14359 is fixing lhsmtool_posix to support a flatter directory tree for the HSM objects. That avoids creating many needless subdirectories in the backing archive storage.

Generated at Sat Feb 10 02:35:55 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.