Details
Bug
Resolution: Fixed
Minor
Lustre 2.12.0
Description
There is an issue when releasing a file striped with DoM after an hsm_restore.
To reproduce:
1) create a file with a 1st component on MDT:
lfs setstripe -E 1M -L mdt -E -1 -S 4M -c -1 /mnt/lustre/domfile
2) archive and release the file (requires HSM set up)
lfs hsm_archive /mnt/lustre/domfile
# (wait for archive to complete)
lfs hsm_release
3) restore the file
lfs hsm_restore /mnt/lustre/domfile # or cat /mnt/lustre/domfile
4) release the file => FAILS
lfs hsm_release /mnt/lustre/domfile
Cannot send HSM request (use of /mnt/lustre/domfile): Device or resource busy
The cause may be a wrong data version stored in the HSM extended attribute (EA).
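The steps above can be collected into a small script. This is only a sketch: it assumes a client mount with the HSM coordinator and a copytool already running, and the names `LFS`, `HSM_POLL_INTERVAL`, `hsm_wait` and `repro` are introduced here for illustration. It restores by reading the file (the `cat` variant of step 3) because that blocks until the restore finishes, and it prints `lfs data_version` after the archive and after the restore so the two values can be compared by eye.

```shell
#!/bin/sh
# Sketch of the reproducer; assumes HSM (coordinator + copytool) is set up.
# LFS, HSM_POLL_INTERVAL, hsm_wait and repro are illustrative names.
LFS=${LFS:-lfs}

# Poll "lfs hsm_state" until its output contains the given flag word
# (for example "archived"), or give up after a number of tries.
hsm_wait() {
    file=$1 flag=$2 tries=${3:-60}
    while [ "$tries" -gt 0 ]; do
        if "$LFS" hsm_state "$file" | grep -qw "$flag"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "${HSM_POLL_INTERVAL:-1}"
    done
    echo "timed out waiting for '$flag' on $file" >&2
    return 1
}

repro() {
    file=$1
    # 1) first component on the MDT (DoM), remainder striped over all OSTs
    "$LFS" setstripe -E 1M -L mdt -E -1 -S 4M -c -1 "$file" || return 1
    # 2) archive, wait for the copytool to finish, then release
    "$LFS" hsm_archive "$file" || return 1
    hsm_wait "$file" archived || return 1
    echo "data version after archive: $("$LFS" data_version "$file")"
    "$LFS" hsm_release "$file" || return 1
    # 3) restore: reading a released file triggers a blocking restore
    cat "$file" > /dev/null || return 1
    echo "data version after restore: $("$LFS" data_version "$file")"
    # 4) second release: the step reported to fail with "resource busy"
    if "$LFS" hsm_release "$file"; then
        echo "second release succeeded"
    else
        echo "second release failed"
        return 2
    fi
}
```

With the functions sourced, `repro /mnt/lustre/domfile` runs the whole sequence; a return code of 2 means that only the final release failed, which matches the behaviour described in this ticket.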
Mike, in case it is helpful to you, newer ext4 code has a "swap data" operation that is meant to allow swapping a "volatile" file into the boot loader inode. This could be used to swap data between two DoM files if needed.
That said, your recent comments indicate that the main obstacle is not the DoM data swap but the ordering problem of the data version. IMHO, a content-based hash is probably still too expensive if the data version is used regularly. It would turn inode operations that need 1KB/inode into data operations that need (possibly) 1MB/inode, or at least 64KB/inode. There was some discussion recently on whether the data version should be used for NFS file modification tracking, so doing a DoM checksum on every file access would be punishing. Storing a separate xattr would be much more efficient.
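To put rough numbers on that cost argument, here is a back-of-the-envelope sketch. The per-inode sizes (1KB, 64KB, 1MB) come from the comment above and are treated as KiB/MiB; the file count `nfiles` and the helper name `total_mib` are arbitrary choices made for illustration.

```shell
#!/bin/sh
# Rough I/O volume for one pass over a set of DoM files.
# nfiles is an arbitrary illustrative figure, not taken from the ticket.
nfiles=1000000

# total_mib KIB_PER_INODE: MiB read if every file costs that many KiB
# (integer division, so the result is rounded down).
total_mib() {
    echo $(( nfiles * $1 / 1024 ))
}

echo "inode/xattr read, 1 KiB each: $(total_mib 1) MiB"
echo "hash of 64 KiB of DoM data:   $(total_mib 64) MiB"
echo "hash of 1 MiB of DoM data:    $(total_mib 1024) MiB"
```

The gap between the first line and the other two is a factor of 64 to 1024 regardless of the file count, which is the overhead a content hash would add each time the data version is queried.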
Maybe I'm missing something, but is it not possible to store the "original" object version in the swapped MDT inode? This might mess with recovery, but if the volatile file is gone it would be pretty clear that the layout swap could not be replayed in any case. We could also special-case the replay operation for layout swap to take this into consideration.