Details
-
Improvement
-
Resolution: Unresolved
-
Minor
Description
To speed up file purge, HSM, tiered storage management, and other tasks that depend on finding old files efficiently (old access time, possibly also modification time), it would be useful to store tables of inodes ordered by atime. As new files are created, each file's FID would be inserted into the currently open atime table. When a file's atime is updated, its FID would first be removed from the table it is in (unless that is already the current table) and then inserted into the current table.
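The insert and move behaviour could be modelled roughly as follows. This is only an in-memory sketch, not Lustre code: the `AtimeIndex` class, its method names, and in particular the `location` reverse map (how the server would know which table currently holds a FID) are assumptions made for illustration.

```python
class AtimeIndex:
    """Toy in-memory model of per-period atime tables keyed by FID."""

    def __init__(self, first_table: str):
        self.current = first_table           # name of the currently open table
        self.tables = {first_table: set()}   # table name -> FIDs it holds
        self.location = {}                   # FID -> table name (hypothetical reverse map)

    def touch(self, fid: str) -> None:
        """Record a create or atime update of `fid`."""
        old = self.location.get(fid)
        if old == self.current:
            return                           # already in the current table
        if old is not None:
            self.tables[old].discard(fid)    # remove from the older table first
        self.tables[self.current].add(fid)
        self.location[fid] = self.current

    def rotate(self, new_table: str) -> None:
        """Close the current table and open a new one."""
        self.tables[new_table] = set()
        self.current = new_table
```

With this model, a file that is touched again after a rotation leaves its old table, so closed tables only ever shrink.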
Periodically (say daily, or after every 1M files created, or on server restart) a new atime table would be created and the old one would be closed. For an MDT with 1B inodes, rotating every 1M files would result in about 1k atime tables of ~64MB each (approximately the same size as the OI files), while with daily rotation a 5-year-old filesystem might have about 1800 tables. Tables could be named with the timestamp of their creation, so that sorting the tables by name orders their contents by time.
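As a rough check of these numbers, and to illustrate one possible naming and rotation rule, consider the sketch below. The `atime-` prefix, the zero-padded epoch format, and the choice to rotate on whichever trigger fires first are assumptions; rotation on server restart is not modelled.

```python
SECONDS_PER_DAY = 24 * 60 * 60
FILES_PER_TABLE = 1_000_000

def table_name(created_at: int) -> str:
    """Zero-padded epoch seconds, so sorting names sorts by creation time."""
    return f"atime-{created_at:012d}"

def should_rotate(now: int, opened_at: int, fids_added: int) -> bool:
    """Rotate after a day or after 1M inserts, whichever comes first."""
    return now - opened_at >= SECONDS_PER_DAY or fids_added >= FILES_PER_TABLE

# Back-of-the-envelope sizing from the figures in the description.
tables_by_count = 1_000_000_000 // FILES_PER_TABLE   # 1B inodes, 1M FIDs per table
bytes_per_entry = 64 * 2**20 / FILES_PER_TABLE       # ~64MB per full table
tables_by_age = 5 * 365                              # daily rotation for 5 years
```

This gives 1000 tables for count-based rotation, 1825 tables for five years of daily rotation, and an implied budget of roughly 67 bytes per FID entry.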
When looking for "old" files, it would be trivial to iterate over the tables by name to find the right table(s) by age, and then iterate linearly over their contents to find the FIDs still in them, rather than having to scan the whole filesystem to find those files.
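The lookup could be sketched as below. For brevity the tables are keyed directly by creation time (epoch seconds) rather than by name, and the function name `old_fids` is hypothetical. The sketch assumes a table is closed exactly when the next one is created, so a table is known to hold only entries older than the cutoff once its successor's creation time is at or before the cutoff.

```python
from typing import Dict, Iterator, Set

def old_fids(tables: Dict[int, Set[str]], cutoff: int) -> Iterator[str]:
    """Yield FIDs from tables that were closed at or before `cutoff`.

    `tables` maps each table's creation time (epoch seconds) to its FIDs.
    The newest table has no successor and is treated as still open.
    """
    starts = sorted(tables)
    for start, next_start in zip(starts, starts[1:]):
        if next_start > cutoff:
            break                      # this and all later tables may hold newer FIDs
        yield from sorted(tables[start])
```

A purge tool built on this would presumably still check each file's real atime before acting, since the tables only bound the age to the granularity of one rotation period.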
As older files are removed from the filesystem (e.g. by file purge), the old tables would naturally become empty and be removed. It would also be possible to do periodic garbage collection on older tables to collapse them and release unused space.
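One possible reading of "collapse" is merging sparse neighbouring tables, which the following sketch illustrates. The function name, the `min_fill` threshold, and the choice to merge into the older neighbour are all assumptions, not something the proposal specifies. Tables are again keyed by creation time for brevity.

```python
from typing import Dict, Set

def collect_garbage(tables: Dict[int, Set[str]], current: int, min_fill: int) -> None:
    """Drop empty closed tables and merge sparse ones into their older neighbour.

    `tables` maps creation time to FIDs; `current` is the open table's key and
    is never touched. `min_fill` is a hypothetical sparseness threshold.
    """
    closed = sorted(k for k in tables if k != current)
    keep = None                               # most recent surviving closed table
    for key in closed:
        fids = tables[key]
        if not fids:
            del tables[key]                   # empty: remove outright
        elif keep is not None and len(fids) + len(tables[keep]) <= min_fill:
            tables[keep] |= fids              # collapse into the older neighbour
            del tables[key]
        else:
            keep = key
```

Merging into the older table keeps that table's creation time as a lower bound for everything it holds, while the next surviving table's creation time remains the upper bound. A real implementation would also have to update whatever record maps a FID to its table.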