Description
Archiving, releasing, then migrating a file fails with a "data version changed during migration" error:
    # cd /mnt/lustre
    # cp /usr/bin/zip .
    # lfs getstripe zip
    zip
    lmm_stripe_count:   1
    lmm_stripe_size:    1048576
    lmm_pattern:        1
    lmm_layout_gen:     0
    lmm_stripe_offset:  1
        obdidx    objid    objid    group
             1        2      0x2        0
    # lfs hsm_archive zip
    # lfs hsm_release zip
    # lfs hsm_state zip
    zip: (0x0000000d) released exists archived, archive_id:1
    # lfs getstripe zip
    zip
    lmm_stripe_count:   1
    lmm_stripe_size:    1048576
    lmm_pattern:        80000001
    lmm_layout_gen:     1
    lmm_stripe_offset:  0
    # lfs migrate -o 0 zip
    /root/lustre-cleanup/lustre/utils/lfs: zip: data version changed during migration
    error: migrate: migrate stripe file 'zip' failed
I think the file is restored first, then migrated, but its data version is not updated. This leads to the following questions:
- is it correct to force a restore of an archived file when asking for a migrate operation?
- couldn't the file be restored directly to the proper OST/stripe size, ...?
- although an error is reported, the file is present and complete, so the operation actually completed properly. What if a different kind of error had occurred? Could that lead to data corruption?
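For reference, the released state is visible in the second getstripe output of the transcript above: lmm_pattern changes from 1 to 80000001. A minimal sketch of decoding that value follows, assuming the pattern is printed in hex without a 0x prefix as it is in the transcript. The helper name pattern_released is hypothetical; the mask is LOV_PATTERN_F_RELEASED (0x80000000) from the Lustre headers.

```shell
#!/bin/bash
# LOV_PATTERN_F_RELEASED as defined in the Lustre headers.
LOV_PATTERN_F_RELEASED=$((0x80000000))

# pattern_released PATTERN  (hypothetical helper)
# PATTERN is the lmm_pattern value as printed in the transcript,
# i.e. hexadecimal digits without a leading 0x.
# Returns 0 if the released flag is set in the layout pattern.
pattern_released() {
    [ $(( 0x$1 & LOV_PATTERN_F_RELEASED )) -ne 0 ]
}

# Usage with the two values from the transcript:
#   pattern_released 1        -> false (plain RAID0 layout)
#   pattern_released 80000001 -> true  (RAID0 layout, released)
```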
Issue Links
- is related to: LU-5896 lfs: can't release a migrated file (Open)
I've made lfs return an error if the file has been released. I believe it's better to fail than to mislead the user.
However, this patch doesn't completely cover the bad case. For instance, a release can race with a migrate: if the check for the released bit passes and the file is then released before the migration starts, the migrate operation still runs, the "data version changed during migration" message is displayed, and the file ends up restored. I don't see a solution for that, but we might be able to live with it.
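To make the race window concrete, here is a rough user-side sketch of the same check-then-migrate pattern. It is not the actual patch. The function names (hsm_line_flags, line_is_released, safe_migrate) are hypothetical, and the parsing assumes the hsm_state output format shown in the description. The 0x4 mask is HS_RELEASED from lustre_user.h.

```shell
#!/bin/bash
# Extract the hex flags from an "lfs hsm_state" line such as:
#   zip: (0x0000000d) released exists archived, archive_id:1
hsm_line_flags() {
    printf '%s\n' "$1" | sed -n 's/^.*: (\(0x[0-9a-fA-F]*\)).*$/\1/p'
}

# Returns 0 if the HS_RELEASED bit (0x4) is set in the given state line.
line_is_released() {
    flags=$(hsm_line_flags "$1")
    [ -n "$flags" ] && [ $(( $flags & 0x4 )) -ne 0 ]
}

# safe_migrate FILE OST_INDEX  (hypothetical wrapper)
# Refuses to migrate a file that is currently released.
safe_migrate() {
    state=$(lfs hsm_state "$1") || return 1
    if line_is_released "$state"; then
        echo "$1: file is released, refusing to migrate" >&2
        return 1
    fi
    # Race window: the file can still be released between the check
    # above and the migrate below, so this guard is best effort only.
    lfs migrate -o "$2" "$1"
}

# Usage: safe_migrate zip 0
```

Nothing in this sketch closes the window between the check and the migrate; it only moves the common case to a clear failure.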