[LU-3833] HSM POSIX copytool sparse file handling Created: 23/Aug/13  Updated: 26/Jan/22

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.0
Fix Version/s: None

Type: Improvement Priority: Major
Reporter: John Hammond Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: HSM

Issue Links:
Related
is related to LU-13397 lfs migrate/mirror extend/resync does... Resolved
is related to LU-6848 The Incorrect block size value of the... Resolved
Severity: 3
Rank (Obsolete): 9911

 Description   

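The POSIX copytool does not check whether a file is sparse. Hence truncating a file up to a very large size creates a potential DoS if that file is archived, because the archive copy is driven by the file size rather than by the space actually allocated.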
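The gap between apparent size and allocated space can be seen with a few lines of Python. This is only an illustrative sketch, not copytool code: `apparent_and_allocated` is a hypothetical helper, and it assumes a POSIX filesystem that supports holes and reports `st_blocks` in 512-byte units.

```python
import os
import tempfile


def apparent_and_allocated(path):
    """Return (apparent size, allocated bytes) for path.

    Assumes st_blocks counts 512-byte units, as it does on Linux.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        # Extend the empty file to 1 GiB without writing any data.
        os.truncate(f.name, 1 << 30)
        size, allocated = apparent_and_allocated(f.name)
        print(f"apparent={size} allocated={allocated}")
```

A tool that reads such a file from offset 0 to `st_size` transfers the full apparent size, however little is allocated.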



 Comments   
Comment by Andreas Dilger [ 24/Oct/13 ]

This could probably be avoided to some extent by having the policy engine make decisions based on the number of allocated blocks rather than on the size of the file. It may also make sense for RobinHood to allow the policy to skip sparse files, or to reduce their priority (possibly as a factor of the "sparseness"), if the archive does not handle sparse files well. Otherwise, some file formats like HDF5 may have very large sparse address spaces that would fill both the archive and Lustre if handled poorly.
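A rough sketch of what such a block-based decision could look like follows. It is not RobinHood configuration syntax or code; `sparseness` and `archive_priority` are hypothetical names, the linear scaling is just one possible reading of "as a factor of the sparseness", and `blocks` is assumed to be in 512-byte units as returned by stat().

```python
def sparseness(size, blocks, block_unit=512):
    """Fraction of the apparent size not backed by allocated blocks (0.0 to 1.0)."""
    if size <= 0:
        return 0.0
    # Allocation can exceed the size (block rounding), so clamp it.
    allocated = min(blocks * block_unit, size)
    return 1.0 - allocated / size


def archive_priority(base, size, blocks, archive_handles_sparse=False):
    """Scale a base priority down by sparseness for archives that store holes as data."""
    if archive_handles_sparse:
        return base
    return base * (1.0 - sparseness(size, blocks))


if __name__ == "__main__":
    # 1 GiB apparent size with only 1 MiB allocated (2048 blocks of 512 bytes).
    print(sparseness(1 << 30, 2048))
    print(archive_priority(10.0, 1 << 30, 2048))
```

With this scaling, a fully allocated file keeps its base priority and a file with no allocated blocks drops to zero, which amounts to skipping it.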

A second-level enhancement would be to use FIEMAP to copy the file sparsely to/from the archive (assuming the archive itself has efficient sparse file handling).
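As an illustration of a sparse-preserving copy, here is a minimal sketch. Note that it does not use FIEMAP: it substitutes `lseek()` with `SEEK_DATA`/`SEEK_HOLE`, which is simpler to call from Python, and it assumes a Linux system. `copy_sparse` is a hypothetical helper and is not taken from lhsmtool_posix or from any patch on this ticket.

```python
import errno
import os


def copy_sparse(src_path, dst_path, chunk=1 << 20):
    """Copy only the data extents of src_path, leaving holes unwritten in dst_path.

    Returns the number of data bytes actually written.
    """
    copied = 0
    src_fd = os.open(src_path, os.O_RDONLY)
    try:
        dst_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        try:
            size = os.fstat(src_fd).st_size
            offset = 0
            while offset < size:
                try:
                    data_start = os.lseek(src_fd, offset, os.SEEK_DATA)
                except OSError as e:
                    if e.errno == errno.ENXIO:
                        break  # no more data: the rest of the file is a hole
                    raise
                hole_start = os.lseek(src_fd, data_start, os.SEEK_HOLE)
                pos = data_start
                while pos < hole_start:
                    buf = os.pread(src_fd, min(chunk, hole_start - pos), pos)
                    if not buf:
                        break  # source shrank while copying
                    written = 0
                    while written < len(buf):
                        written += os.pwrite(dst_fd, buf[written:], pos + written)
                    pos += len(buf)
                    copied += len(buf)
                offset = hole_start
            # Restore the apparent size so a trailing hole is preserved.
            os.ftruncate(dst_fd, size)
        finally:
            os.close(dst_fd)
    finally:
        os.close(src_fd)
    return copied
```

On a filesystem that does not report holes, `SEEK_DATA`/`SEEK_HOLE` typically describe the whole file as a single data extent, so the sketch degrades to a plain full copy. Whether the destination stays sparse still depends on the archive filesystem.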

Comment by Bruno Faccini (Inactive) [ 12/Mar/14 ]

John, Andreas, sorry for the late update on this, but since this looks more like an improvement/feature ...
So, back to your proposals and ideas, I will try to work in both directions:

_ optimize RobinHood handling of sparse files, with associated policy configuration parameters, so that it is based on block usage rather than file size (to be checked with help from T. Leibovici whether this still needs to be done).

_ enhance the copytool to use FIEMAP if the archive filesystem supports it. BTW, I am not sure whether the current HSM restore feature already uses FIEMAP based on the original+released file layout.

I will try to come up with some basic changes, and maybe more detailed thoughts, soon.

Comment by Andreas Dilger [ 26/Jan/22 ]

Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/40530
Subject: LU-3833 hsm: lhsmtool to handle sparse files
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: ae180a1080dc1cb3990d8f53caee95e11a160248
