[LU-6866] MDT file migration is incompatible with HSM Created: 17/Jul/15 Updated: 17/Apr/19 Resolved: 18/Dec/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.8.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | John Hammond | Assignee: | WC Triage |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | HSM, migration | ||
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Migrating a file between MDTs changes the FID of the file. The file FID is used to identify the file in the HSM archive (unfortunately). So, for example, if a released file is migrated from one MDT to another, it cannot be restored. |
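The failure mode can be illustrated with a toy model. This is a minimal sketch, not Lustre code: `ToyArchive`, `migrate`, and the FID strings are hypothetical. The only behavior taken from the description is that the archive is keyed by FID and that migration gives the file a new FID.

```python
class ToyArchive:
    """Toy HSM backend that indexes archived data by Lustre FID."""

    def __init__(self):
        self._objects = {}

    def archive(self, fid, data):
        self._objects[fid] = data

    def restore(self, fid):
        # Returns None when the archive holds no object for this FID.
        return self._objects.get(fid)


def migrate(fid, new_seq):
    """Hypothetical stand-in for MDT migration: the file ends up with a new FID."""
    _seq, oid, ver = fid.strip("[]").split(":")
    return f"[{new_seq}:{oid}:{ver}]"


archive = ToyArchive()
old_fid = "[0x200000400:0x1:0x0]"          # made-up FID on the source MDT
archive.archive(old_fid, b"file contents")

new_fid = migrate(old_fid, "0x240000400")  # made-up FID on the target MDT
restored_before = archive.restore(old_fid)  # found under the old FID
restored_after = archive.restore(new_fid)   # None: the archive never saw the new FID
```

After the migration the file is only reachable in Lustre under `new_fid`, so a restore keyed by that FID finds nothing in the archive.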
| Comments |
| Comment by Andreas Dilger [ 21/Aug/15 ] |
|
I think the path forward here is to generate a ChangeLog record for each migrated file (per Some archives, such as S3, cannot handle rename of files in the archive since the S3 objects are immutable. However, the use of FIDs in the immutable archive also causes problems with HSM import (unrelated to DNE migrate), which currently requires the ability to change the file FIDs in the archive after import. There have therefore been discussions about long-term changes to Lustre HSM to store an archive-supplied UUID to persistently identify objects in Lustre, rather than using the Lustre FID to track files in the archive. That would also solve this problem, but it is a rather significant change, and would likely make sense to implement when HSM is integrated with PFL, storing the archive UUID in a new "archive layout" type. |
| Comment by Andreas Dilger [ 22/Sep/15 ] |
|
Just watching the CINES HSM and DDN WOS HSM presentations at LAD, and it seems to me that both of these implementations could benefit from storing the UUID in an HSM file layout. The WOS object UUID stored in the Lustre inode layout could be used to identify objects in the archive, instead of (or in addition to) using the Lustre FID. This could also solve the problem that CINES reported with not being able to undelete files directly from the archive back into the Lustre filesystem, because the deleted files are identified by the old FID in the archive and there is no way to recreate that FID in Lustre today. As Robert suggested previously, storing the archive UUID in the Lustre HSM layout would allow creating a new stub file or migrating the inode (along with the archive UUID stored in the layout xattr) across MDTs, and then nothing needs to be done in the archive since the UUID stays the same. The HSM archive probably doesn't even need to care about what the Lustre FID is, since it may change over time and that shouldn't cause the file to be re-archived. Development of the PFL composite layouts is starting, but as yet there isn't a plan to implement the HSM layout (i.e. LOV_MAGIC_HSM containing essentially struct hsm_attrs + struct uuid, assuming the standard 128-bit/16-byte UUID is enough for S3/WOS and other archive identifiers). I don't think there is anyone at Intel who will be working on the HSM-in-layout currently. |
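As a rough picture of what such a layout might carry, the sketch below packs an HSM attribute block followed by a 16-byte UUID. Everything in it is an assumption made for illustration: the comment says no such layout is planned yet, the magic value is a placeholder, and the four attribute fields (compat, flags, archive ID, archive version) merely stand in for whatever `struct hsm_attrs` holds.

```python
import struct
import uuid

PLACEHOLDER_MAGIC = 0xABCD0001  # not a real Lustre magic number

# Little-endian, no padding: u32 magic, u32 compat, u32 flags,
# u64 archive id, u64 archive version, 16-byte UUID.
LAYOUT_FMT = "<IIIQQ16s"


def pack_hsm_layout(compat, flags, arch_id, arch_ver, archive_uuid):
    """Serialize the hypothetical HSM layout into bytes."""
    return struct.pack(LAYOUT_FMT, PLACEHOLDER_MAGIC, compat, flags,
                       arch_id, arch_ver, archive_uuid.bytes)


def unpack_hsm_layout(blob):
    """Parse bytes produced by pack_hsm_layout back into a dict."""
    magic, compat, flags, arch_id, arch_ver, raw = struct.unpack(LAYOUT_FMT, blob)
    if magic != PLACEHOLDER_MAGIC:
        raise ValueError("not a (hypothetical) HSM layout")
    return {
        "compat": compat,
        "flags": flags,
        "arch_id": arch_id,
        "arch_ver": arch_ver,
        "uuid": uuid.UUID(bytes=raw),
    }


example_uuid = uuid.UUID("12345678-1234-5678-1234-567812345678")
blob = pack_hsm_layout(0, 0x1, 2, 7, example_uuid)
parsed = unpack_hsm_layout(blob)
```

Because the UUID travels with the layout, a migrated inode would keep the same archive identifier even though its FID changes, which is the property the comment is after.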
| Comment by Robert Read (Inactive) [ 22/Sep/15 ] |
|
That's right, HSM archives should never see the FID, and should only use the UUID or whatever identifier makes sense for that particular archive. Currently I store the UUID as an xattr to achieve the same result. |
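The xattr approach mentioned here could look roughly like the following. This is a sketch under stated assumptions: the attribute name `user.hsm_archive_uuid` is invented (the comment does not say which xattr is used), and the helpers simply report failure on platforms or filesystems without user xattr support.

```python
import os
import tempfile
import uuid

# Hypothetical attribute name; the thread does not name the real one.
XATTR_NAME = "user.hsm_archive_uuid"


def store_archive_uuid(path, archive_uuid):
    """Attach the archive identifier to the file; False if xattrs are unavailable."""
    try:
        os.setxattr(path, XATTR_NAME, archive_uuid.bytes)
    except (AttributeError, OSError):
        return False
    return True


def load_archive_uuid(path):
    """Read the archive identifier back; None if it is missing or unsupported."""
    try:
        raw = os.getxattr(path, XATTR_NAME)
    except (AttributeError, OSError):
        return None
    return uuid.UUID(bytes=raw)


with tempfile.NamedTemporaryFile() as tmp:
    wanted = uuid.uuid4()
    stored = store_archive_uuid(tmp.name, wanted)
    found = load_archive_uuid(tmp.name)
```

A copytool written this way would look up the archive object by the stored identifier and never hand the FID to the backend.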
| Comment by Peter Jones [ 23/Sep/15 ] |
|
Di, could you please look into this one? Thanks, Peter |
| Comment by Di Wang [ 23/Sep/15 ] |
|
If I understand correctly, there are two sub-tasks here. 1. Add a changelog record for migration, which should be easy, and can be done under |
| Comment by Robert Read (Inactive) [ 23/Sep/15 ] |
|
HSM actions contain FIDs, and the HSM tool retrieves the UUID using getxattr(). I don't see where we would need to look up a FID by UUID, so we don't need any lookup table for that. A copytool/policy engine might maintain a mapping between UUID and pathname(s), but that would be done outside of Lustre. |
| Comment by Robert Read (Inactive) [ 23/Sep/15 ] |
|
Actually, the delete action needs the UUID, since at that point the FID and its xattrs have already been deleted. Another option might be to include the UUID in the unlink changelog record, and the policy engine can pass that on to the backends associated with that file. |
| Comment by Di Wang [ 23/Sep/15 ] |
|
Hmm, I thought the problem is that migration changes the FID, so the FIDs in the archive are not valid anymore? So we instead use a UUID to represent the object, which will never change? I must be missing something here. |
| Comment by Robert Read (Inactive) [ 23/Sep/15 ] |
|
The actual bug here is that FIDs are used in the archive. The archive needs to use UUIDs (or anything that is unique), and not FIDs. |
| Comment by Di Wang [ 23/Sep/15 ] |
|
Ah, it is the copytool that takes care of the mapping from this unique identifier to the archive file. I had the wrong idea before. Then we should indeed not use the FID as the identifier. |
| Comment by Di Wang [ 24/Sep/15 ] |
|
OK, I will add a migration record to the changelog. But it seems to be a temporary solution, i.e. once we have the UUID, this can be removed. So should I add a migration record, or will someone work on this HSM layout thing? (Sorry, I am not familiar with the HSM release/restore thing.) |
| Comment by Shuichi Ihara (Inactive) [ 24/Sep/15 ] |
|
If my understanding is correct, archive UUID means we can define a unique ID that is stored in an EA? |
| Comment by Robert Read (Inactive) [ 24/Sep/15 ] |
|
I believe currently MDT migration triggers unlink and create records in the source MDT and target MDT changelogs respectively, so one option is to just add a MIGRATE flag to those records so we can differentiate the migrate unlink from a real unlink. Or do you have a better solution for this already? Would it be possible to standardize the EA field we use for the UUID instead of creating an HSM layout? As Ihara mentioned, we need to support the ability to assign UUIDs to a file, so it is useful to use the normal EA interface for this instead of adding a new API. Also, some backends might use different kinds of identifiers, so it might not always be an actual UUID. If we standardize this, then we can add this extra data to the delete changelog record, and that solves the main problem we have today with using external IDs. |
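The proposal in this comment can be sketched from the policy engine's side. This is an illustrative model only: `ChangelogRecord`, the `MIGRATE` bit, and the `external_id` field are hypothetical names, not the real Lustre changelog format.

```python
from dataclasses import dataclass
from typing import Optional

MIGRATE = 0x1  # hypothetical flag bit marking records caused by MDT migration


@dataclass
class ChangelogRecord:
    rec_type: str                      # "UNLINK" or "CREATE" in this toy model
    fid: str
    flags: int = 0
    external_id: Optional[str] = None  # archive identifier copied from the EA


def archive_deletions(records):
    """Return the external IDs whose archive copies should be removed."""
    to_delete = []
    for rec in records:
        if rec.rec_type != "UNLINK":
            continue
        if rec.flags & MIGRATE:
            continue  # the file lives on under a new FID; keep the archive copy
        if rec.external_id is not None:
            to_delete.append(rec.external_id)
    return to_delete


records = [
    ChangelogRecord("UNLINK", "[0x200000400:0x1:0x0]", MIGRATE, "obj-a"),
    ChangelogRecord("CREATE", "[0x240000400:0x1:0x0]", MIGRATE, "obj-a"),
    ChangelogRecord("UNLINK", "[0x200000400:0x2:0x0]", 0, "obj-b"),
]
deletions = archive_deletions(records)
```

Carrying the external ID in the unlink record matters because, by the time the record is consumed, the file and its xattrs are already gone and the ID cannot be read from the file any more.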
| Comment by Di Wang [ 24/Sep/15 ] |
"I believe currently MDT migration triggers unlink and create records in the source MDT and target MDT changelogs respectively, so one option is to just add a MIGRATE flag to those records so we can differentiate the migrate unlink from a real unlink. Or do you have a better solution for this already?" The whole migration is implemented in the MDD layer, where changelogs are generated, and no changelog record is created for migration. So I can just add one for this ticket. I will push the patch to |
| Comment by Robert Read (Inactive) [ 24/Sep/15 ] |
|
As long as there is no unlink changelog generated, then that's fine. A migration changelog is useful for the PE to know that a file has changed MDTs, but this doesn't really impact the archive. I'll open a new ticket for adding the UUID to the delete changelog, which is much more important for HSM. |
| Comment by Robert Read (Inactive) [ 24/Sep/15 ] |
|
Please see LU-7207 for the proposed format for the external ID EA and including it in the delete changelog. |
| Comment by Di Wang [ 24/Sep/15 ] |
|
Thanks Robert. Then I will close this one, and push the migration record patch to |
| Comment by John Hammond [ 08/Dec/15 ] |
|
Reopening since lhsmtool_posix does not implement the discussed functionality. |
| Comment by Gerrit Updater [ 08/Dec/15 ] |
|
John L. Hammond (john.hammond@intel.com) uploaded a new patch: http://review.whamcloud.com/17511 |
| Comment by Gerrit Updater [ 18/Dec/15 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/17511/ |
| Comment by Joseph Gmitter (Inactive) [ 18/Dec/15 ] |
|
Landed for 2.8 |