[LU-5895] lfs: data version changed during migration Created: 10/Nov/14 Updated: 29/May/19 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.7.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Frank Zago (Inactive) | Assignee: | Ben Evans (Inactive) |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | patch | ||
| Environment: |
head of tree + |
||
| Severity: | 3 |
| Rank (Obsolete): | 16474 |
| Description |
|
Archiving, releasing, then migrating a file leads to a "data version changed during migration" error:

# cd /mnt/lustre
# cp /usr/bin/zip .
# lfs getstripe zip
zip
lmm_stripe_count: 1
lmm_stripe_size: 1048576
lmm_pattern: 1
lmm_layout_gen: 0
lmm_stripe_offset: 1
obdidx objid objid group
1 2 0x2 0
# lfs hsm_archive zip
# lfs hsm_release zip
# lfs hsm_state zip
zip: (0x0000000d) released exists archived, archive_id:1
# lfs getstripe zip
zip
lmm_stripe_count: 1
lmm_stripe_size: 1048576
lmm_pattern: 80000001
lmm_layout_gen: 1
lmm_stripe_offset: 0
# lfs migrate -o 0 zip
/root/lustre-cleanup/lustre/utils/lfs: zip: data version changed during migration
error: migrate: migrate stripe file 'zip' failed

I think the file is restored first, then migrated, but its data version is not updated. This leads to the following questions:
|
| Comments |
| Comment by Frank Zago (Inactive) [ 10/Nov/14 ] |
|
After the "failed" migration, getstripe returns this:

# lfs getstripe zip
zip
lmm_stripe_count: 1
lmm_stripe_size: 1048576
lmm_pattern: 1
lmm_layout_gen: 2
lmm_stripe_offset: 0
obdidx objid objid group
0 3 0x3 0

So the file has indeed been migrated, and is not the original one simply restored. |
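The lmm_pattern values in the getstripe outputs (80000001 while released, 1 after the migration) can be read as a bit mask. The following is a minimal sketch of that decoding, assuming the value is printed in hexadecimal without a prefix, as it is above. The two constants are quoted from memory of Lustre's user header and should be treated as assumptions; the function name is hypothetical.

```python
# Assumed values, recalled from Lustre's user-space header; verify before relying on them.
LOV_PATTERN_RAID0 = 0x001
LOV_PATTERN_F_RELEASED = 0x80000000


def decode_lmm_pattern(text):
    """Decode an lmm_pattern field as printed by `lfs getstripe` (hex, no 0x prefix)."""
    value = int(text, 16)
    return {
        "raid0": bool(value & LOV_PATTERN_RAID0),
        "released": bool(value & LOV_PATTERN_F_RELEASED),
    }


# Values taken from the transcripts in this ticket.
print(decode_lmm_pattern("80000001"))  # layout while the file is released
print(decode_lmm_pattern("1"))         # layout after the "failed" migration
```

Under these assumptions, the released bit is set before the migration and cleared afterwards, which matches the observation that the file ended up with real objects again.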
| Comment by Jinshan Xiong (Inactive) [ 10/Nov/14 ] |
|
HSM may not allocate the original OST to restore the file, thus your example can't verify that the migration has completed. Please try to migrate the released file to have 2 stripes and see how it goes. |
| Comment by Frank Zago (Inactive) [ 10/Nov/14 ] |
|
When I archive/restore a file, the objid stays the same. It's not the case here. I'll try with 2 stripes. |
| Comment by Frank Zago (Inactive) [ 10/Nov/14 ] |
|
When I tried migrating to 2 stripes, the file was restored to only one stripe. So that part looks ok actually. That third question is now moot. The first 2 remain. |
| Comment by Jinshan Xiong (Inactive) [ 10/Nov/14 ] |
|
I think the 3rd question is a BUG. I didn't look into the code, but I guess the root cause of this problem is that a zero data version was returned for the released file, and later, after the file was restored, a different data version was seen. The 2nd question is a good one. The question can be refined to support a setstripe style of restore operation; in other words, the command `lfs hsm_restore` should be able to override the original stripe pattern. |
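The guess above can be illustrated with a toy model. This is not Lustre code: the class, its fields, and the implicit-restore step are all hypothetical, and the model only encodes the stated hypothesis (a released file reports data version 0, and a restore leaves it with a non-zero version).

```python
from dataclasses import dataclass


@dataclass
class ToyFile:
    """Hypothetical stand-in for a file with an HSM released state."""
    released: bool
    data_version: int  # non-zero once the file has real data objects

    def read_data_version(self):
        # Assumption from the comment above: a released file reports 0.
        return 0 if self.released else self.data_version

    def restore(self):
        # A restore writes the data back, so the version differs from 0 afterwards.
        self.released = False
        self.data_version += 1


def toy_migrate(f):
    """Compare the data version sampled before and after the copy step."""
    before = f.read_data_version()
    if f.released:
        f.restore()  # reading the data for the copy triggers an implicit restore
    after = f.read_data_version()
    if before != after:
        return "data version changed during migration"
    return "ok"


print(toy_migrate(ToyFile(released=True, data_version=5)))
print(toy_migrate(ToyFile(released=False, data_version=5)))
```

In this model the released file always trips the version check, while a regular file does not, which is consistent with the hypothesis but does not confirm it.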
| Comment by Andreas Dilger [ 02/Dec/14 ] |
|
I have to question whether it even makes sense to migrate a released file? Maybe this should just become a no-op? |
| Comment by Robert Read (Inactive) [ 02/Dec/14 ] |
|
Yes, it does make sense to skip migration for a released file, since there is no data to migrate as far as Lustre is concerned. |
| Comment by Andreas Dilger [ 18/Dec/14 ] |
|
Frank, would you be able to make a patch to just return 0 (do nothing) if trying to migrate a file that is already released? |
| Comment by Frank Zago (Inactive) [ 10/Jan/15 ] |
|
I missed that message. I'll send a patch. |
| Comment by Gerrit Updater [ 12/Jan/15 ] |
|
frank zago (fzago@cray.com) uploaded a new patch: http://review.whamcloud.com/13356 |
| Comment by Frank Zago (Inactive) [ 12/Jan/15 ] |
|
I've made lfs return an error if the file has been released. I believe it's better to fail than to mislead the user.

# ./lfs migrate /mnt/lustre/zip
/mnt/lustre/zip: can't migrate a file released by HSM
error: migrate: migrate stripe file '/mnt/lustre/zip' failed

But this patch doesn't completely cover the bad case. For instance, what happens if there is a race between a release and a migrate operation? If the check for the release bit succeeds and the file is released right afterwards, the migrate operation will still happen, the "data version changed during migration" message will be displayed, and the file will be restored. I don't see a solution for that, but we might be able to live with it. |
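The same kind of check can be approximated from a script by inspecting `lfs hsm_state` output before calling `lfs migrate`. The sketch below is only an illustration: it assumes the output format shown in the description (`zip: (0x0000000d) released exists archived, archive_id:1`), the function names are hypothetical, and it is not how the patch itself performs the check. It is also subject to the same race described above, since the file can be released between the check and the migration.

```python
import re

# Matches "<path>: (<hex mask>) <flag names>, <other fields>" as seen in this ticket.
_HSM_STATE_RE = re.compile(r"^(?P<path>.+): \((?P<mask>0x[0-9a-fA-F]+)\)(?P<rest>.*)$")


def hsm_flags(line):
    """Return the set of flag names found on one `lfs hsm_state` output line."""
    match = _HSM_STATE_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognized hsm_state line: %r" % line)
    # Flag names come before the first comma; archive_id and the like follow it.
    return set(match.group("rest").split(",")[0].split())


def may_migrate(line):
    """Refuse to migrate when the file is reported as released."""
    return "released" not in hsm_flags(line)


print(may_migrate("zip: (0x0000000d) released exists archived, archive_id:1"))
```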
| Comment by Frank Zago (Inactive) [ 26/Jan/15 ] |
|
I will update that patch according to Henri's suggestion:

Wouldn't it make sense to migrate released files like this:
- Open volatile file with O_LOV_DELAY_CREAT
- Assign it the desired stripe
- Swap layouts with the released file
This is totally untested, but I believe it could work and avoid having to restore released files for the sole purpose of migrating them.

This however depends on "LU-6081 user: Introducing llapi_create_volatile_param()", which itself will need " |
| Comment by Frank Zago (Inactive) [ 27/Jan/15 ] |
|
I wrote the code for it and a basic test (not pushed). 2 problems:
|