Data-on-MDT phase II
(LU-10176)
|
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.13.0 |
| Type: | Technical task | Priority: | Major |
| Reporter: | Andreas Dilger | Assignee: | Mikhail Pershin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | DoM2 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Implement migration of DoM files with the lfs command. This does not work out of the box for Data-on-MDT files because changing the layout is not enough; the data must be moved as well. Both OST-to-MDT and MDT-to-MDT migration should be supported. Note that MDT-to-MDT migration might just be "cp + rename", since the result will be the same. |
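For illustration only, the "cp + rename" idea mentioned in the description can be sketched in userspace as below. This is a generic copy-then-rename pattern, not the mechanism the patch implements. The function name `copy_then_rename` and the `.migrate.` prefix are hypothetical, and the sketch assumes the new file picks up whatever layout the parent directory assigns to newly created files. It does not preserve ownership or hard links and does not protect against concurrent writers.

```python
import os
import shutil
import tempfile


def copy_then_rename(path: str) -> None:
    """Rewrite `path` by copying it to a temporary file and renaming it back.

    Hypothetical helper: it only illustrates the "cp + rename" pattern.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the original so that the final
    # rename stays inside one directory and one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".migrate.")
    os.close(fd)
    try:
        shutil.copyfile(path, tmp_path)  # copy the data
        shutil.copystat(path, tmp_path)  # copy permissions and timestamps
        os.replace(tmp_path, path)       # atomically rename over the original
    except BaseException:
        # Remove the temporary copy if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

After the call, `path` refers to a newly created file with the same contents, which is why this approach moves the data rather than only changing a layout.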
| Comments |
| Comment by Gerrit Updater [ 28/Jun/19 ] |
|
Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/35359 |
| Comment by Mikhail Pershin [ 30/Jul/19 ] |
|
Adding some explanation here of how OST-to-DoM migration works: OST layout + data on OSTs -> DoM layout + data on MDT
It is possible to have a DoM component with the same size in different mirrors. It is not 'mirrored' in that case, but since we assume the MDT inode always exists, that is not a problem, I think. Technically the sizes can even differ: the MDT inode will store the largest DoM stripe in that case, but more work will be needed to return the MDT stripe size to the client correctly. Currently it is just the inode size, but it should be limited by the DoM component size of the chosen mirror layout. |
| Comment by Mikhail Pershin [ 30/Jul/19 ] |
|
Another thing to think about - what sort of OST-striped files should NOT be migrated to DOM files:
|
| Comment by Andreas Dilger [ 30/Jul/19 ] |
This explanation should all be included in the patch commit message. Note that we can't exclude PFL files just on principle, because a filesystem may have a default PFL layout (maybe from before the MDT had enough space for DoM, with new large MDTs added to the filesystem later), so all files are PFL. In general, while it is good to have smart behavior by default if no other input is given, I think the kernel should try to avoid overriding decisions made by userspace. |
| Comment by Mikhail Pershin [ 31/Jul/19 ] |
|
Yes, I think that PFL files should be processed like all others. If the file size is bigger than the DoM component size of the new layout, then lfs migrate should exit with a warning about that and propose using the -f parameter if the user really wants to do that. |
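A minimal sketch of the check proposed in the comment above, written as standalone Python rather than as part of lfs. The function name `check_dom_migration` and its return shape are hypothetical, and `force` stands in for the -f parameter mentioned in the comment. This models the proposal only, not the behaviour that was eventually merged.

```python
from typing import Tuple


def check_dom_migration(file_size: int, dom_component_size: int,
                        force: bool = False) -> Tuple[bool, str]:
    """Decide whether a migration to a DoM layout should proceed.

    Hypothetical helper: returns (proceed, message).
    """
    if file_size > dom_component_size and not force:
        # The file does not fit into the DoM component: warn and stop,
        # pointing the user at the override.
        return (False,
                "file size %d exceeds DoM component size %d; "
                "use -f to migrate anyway" % (file_size, dom_component_size))
    return (True, "")
```

For example, a 2 MiB file checked against a 1 MiB DoM component is rejected unless `force=True` is passed.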
| Comment by Andreas Dilger [ 31/Jul/19 ] |
"Yes, I think that PFL files should be processed like all others. If the file size is bigger than the DoM component size of the new layout, then lfs migrate should exit with a warning about that and propose using the -f parameter if the user really wants to do that."
Well, it isn't clear that there is a need/benefit to return an error when the user asks for this. There are reasons for having a DoM component at the start of a file, e.g. for files that have an embedded index/icon/header at the start, so my first choice would be to allow what the user asked for instead of trying to second-guess their request. That said, if the user has not requested a DoM component (e.g. generic "lfs migrate" command) then I'm perfectly happy to drop the DoM component, and PFL in general, and use a plain layout with the stripe_count, stripe_size, and pool from the last instantiated component of the file. As an aside, I generally dislike using " |
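The fallback described in the comment above (a plain layout taken from the last instantiated component) could be sketched as follows. The `Component` and `PlainLayout` types and the function `plain_layout_from` are hypothetical stand-ins for illustration, not Lustre data structures. The sketch also assumes that a DoM component is never used as the template; the comment does not say what should happen when only the DoM component is instantiated, so the sketch raises an error in that case.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Component:
    """Hypothetical, simplified view of one PFL component."""
    start: int               # extent start offset in bytes
    stripe_count: int
    stripe_size: int
    pool: Optional[str]
    instantiated: bool       # True once objects have been allocated
    is_dom: bool = False     # True for the DoM (mdt) component


@dataclass
class PlainLayout:
    stripe_count: int
    stripe_size: int
    pool: Optional[str]


def plain_layout_from(components: List[Component]) -> PlainLayout:
    """Build a plain layout from the last instantiated non-DoM component."""
    candidates = [c for c in components if c.instantiated and not c.is_dom]
    if not candidates:
        raise ValueError("no instantiated OST component to take a layout from")
    # "Last" means the instantiated component with the highest start offset.
    last = max(candidates, key=lambda c: c.start)
    return PlainLayout(last.stripe_count, last.stripe_size, last.pool)
```

With a file whose DoM component and first OST component are instantiated but whose final component is not, the result copies the stripe count, stripe size, and pool of the first OST component.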
| Comment by Andreas Dilger [ 31/Jul/19 ] |
|
Is it also possible to do DoM-to-OST mirroring to drop just the mdt component from a large file? That would essentially need to write the DoM data to the first OST object (second component) in the background, and then add a new xattr operation to drop the mdt component and change the start of the second component to offset 0. |
| Comment by Mikhail Pershin [ 31/Jul/19 ] |
|
Andreas, yes, I tend to agree. Although several DoM optimisations are lost for files with both DoM and OST objects instantiated, some benefits remain, e.g. small random accesses move from the OST to the MDT, and the MDT can have faster storage in general. So it is even simpler for me not to add extra checks to lfs migrate. As for DoM component removal, I think that has similar benefits for any PFL file: as the file grows, the first, smaller component could be merged into the next one and dropped. I am not sure whether we have such a feature now, though. Meanwhile, mirroring also makes it possible to increase the DoM component size of DoM files, which is not possible via layout swap. |
| Comment by Gerrit Updater [ 16/Sep/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35359/ |
| Comment by Peter Jones [ 16/Sep/19 ] |
|
Landed for 2.13 |
| Comment by Stephane Thiell [ 10/Mar/20 ] |
|
Is it planned to backport this patch to b2_12? I'm asking because we have MDTs that are almost full in terms of inodes (mainly due to the DoM ldiskfs space requirement, we ran out of inodes even though each MDT is 18TB). Many DoM files remain on these full MDTs, so we cannot easily migrate directories them to other MDTs ( |