[LU-13902] Merging 2 filesystems into 1. Created: 11/Aug/20 Updated: 17/Feb/21 Resolved: 13/Sep/20 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Question/Request | Priority: | Minor |
| Reporter: | Mahmoud Hanafi | Assignee: | Peter Jones |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
We want to combine two filesystems into a single larger one. We can copy the data off, add the empty OSTs to the first, and then copy the data back. But is there a special way to merge the OSTs and MDTs without copying the data?
|
| Comments |
| Comment by Andreas Dilger [ 12/Aug/20 ] |
|
Mahmoud,
Presumably you want to minimize data movement and system downtime, so it makes sense to keep the larger filesystem (target) and migrate files from the smaller filesystem (source). Having users delete unnecessary files from the source and target filesystems in advance would minimize the data copied from the source filesystem, and would provide the maximum amount of free space on the target filesystem so that more data can be moved at once. Ideally, you would make a full backup of the source filesystem's data as a precaution, but that may not always be practical at large scales.
Of course, if you had enough space to just copy everything at once, that would be easiest. However, it sounds like you need to migrate the data between filesystems incrementally over at least several days (depending on their size and interconnect bandwidth), in order to have enough space to store the data and to minimize system downtime.
I've written a proposed high-level process that copies user data incrementally from the source filesystem to the target (at a rate of your choosing to minimize system impact), and uses the OST removal process described in http://doc.lustre.org/lustre_manual.xhtml#lustremaint.remove_ost to remove OSTs incrementally from the source filesystem before they are added to the target filesystem to provide more space.
This process assumes that the target filesystem has enough free MDT space to hold all of the files in the source filesystem. Incremental MDT migration would be quite complex, unless the filesystem only used DNE1 remote MDTs and the migration was ordered to empty whole MDTs of their directories first. If that is necessary, then additional steps (not listed here) would be needed.
You would need to select the granularity of the directory trees in the source filesystem to be moved (e.g. move one user, project, or subdirectory at a time), depending on how much free space is available in the target filesystem and how much space is used by OSTs in the source filesystem.
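As a rough illustration of choosing the granularity for one pass, the Python sketch below picks directory trees whose combined size fits into the target's free space while leaving a reserve untouched. The function name `pick_batch`, the paths, and the sizes are hypothetical and not part of any Lustre tool; in practice the sizes would come from per-user or per-project usage figures, and the largest-first greedy choice is only one possible policy.

```python
from typing import Dict, List


def pick_batch(tree_sizes: Dict[str, float], target_free: float,
               reserve: float) -> List[str]:
    """Choose directory trees to copy in one pass.

    tree_sizes  -- hypothetical map of source directory -> used space
    target_free -- free space currently available on the target filesystem
    reserve     -- free space to leave on the target for normal usage
    All values must use the same unit (e.g. TiB).
    """
    budget = target_free - reserve
    chosen: List[str] = []
    # Largest trees first, skipping any tree that no longer fits.
    for path, size in sorted(tree_sizes.items(),
                             key=lambda kv: kv[1], reverse=True):
        if size <= budget:
            chosen.append(path)
            budget -= size
    return chosen


# Example with made-up numbers (TiB).
sizes = {"/src/alice": 40.0, "/src/bob": 25.0, "/src/proj": 70.0}
batch = pick_batch(sizes, target_free=100.0, reserve=20.0)
print(batch)
```

A smallest-first policy would move more trees per pass but free less space per tree; which is preferable depends on how the source directories map onto the OSTs to be removed.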
There would need to be at least one OST's worth of used space moved from the source to the target in each full iteration, in order to be able to remove the OST(s) from the source filesystem and add them to the target filesystem. It would be better to migrate multiple OSTs at one time (e.g. a whole OSS failover pair's worth) to avoid the need to make physical changes to the OSS nodes (e.g. recabling OSTs from source OSTs to target OSTs). This would also minimize the imbalance in the target filesystem, as otherwise all new data would be copied to the same new/empty OST added to the target. You would also need to ensure that enough space is available in both the source and the target filesystems so that they do not fill up completely during normal usage, since the migration is incremental. The high-level (untested) process would be something like:
Repeat the following process until all OSTs in the source filesystem have been removed:
Repeat the following process until enough space has been freed on the source filesystem to remove the deactivated OSS (pair), with some margin of free space:
Once enough space is available to remove the OSTs of an OSS (pair), migrate files off of them:
Verify the source OSTs are unused and remove them from the source filesystem:
The MDT(s) should be empty at this point and could be reformatted and added to the target filesystem |
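The "enough space, with some margin" condition in the steps above can be sketched as a simple check, shown here in Python. This is a minimal illustration under stated assumptions, not a tested procedure: `can_drain` and `drain_commands` are hypothetical helper names, the numbers are made up, and the two command strings follow the pattern of the OST removal section of the Lustre manual (stop new object creation on the MDS, then migrate files off the OST from a client). The `osp.<ost>*` glob and the `_UUID` suffix are assumptions that should be checked against the manual for the installed release.

```python
from typing import Dict, List, Set


def can_drain(used: Dict[str, float], free: Dict[str, float],
              to_remove: Set[str], margin: float) -> bool:
    """Return True if the data on `to_remove` fits on the remaining
    source OSTs while still leaving `margin` free (same unit throughout)."""
    need = sum(used[ost] for ost in to_remove)
    room = sum(f for ost, f in free.items() if ost not in to_remove)
    return need + margin <= room


def drain_commands(fsname: str, ost_index: int, mountpoint: str) -> List[str]:
    """Build (but do not run) the commands for draining one OST.
    OST names use a 4-digit hexadecimal index, e.g. fsname-OST000a."""
    ost = f"{fsname}-OST{ost_index:04x}"
    return [
        # On the MDS: stop allocating new objects on this OST (assumed glob).
        f"lctl set_param osp.{ost}*.max_create_count=0",
        # On a client: move existing files off this OST.
        f"lfs find --ost {ost}_UUID {mountpoint} | lfs_migrate -y",
    ]


# Example with made-up per-OST usage (TiB) for one OSS pair.
used = {"src-OST0000": 30.0, "src-OST0001": 28.0,
        "src-OST0002": 10.0, "src-OST0003": 12.0}
free = {"src-OST0000": 20.0, "src-OST0001": 22.0,
        "src-OST0002": 40.0, "src-OST0003": 38.0}
pair = {"src-OST0000", "src-OST0001"}

if can_drain(used, free, pair, margin=10.0):
    for index in (0, 1):
        for cmd in drain_commands("src", index, "/mnt/src"):
            print(cmd)
```

The helpers only print commands so that each one can be reviewed before being run by hand on the appropriate node.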
| Comment by Mahmoud Hanafi [ 12/Aug/20 ] |
|
Thank you for the detailed explanation. |
| Comment by Gerrit Updater [ 28/Aug/20 ] |
|
Not sure why the following patch was attributed to this ticket, but it belongs in
|
| Comment by Gerrit Updater [ 12/Sep/20 ] |
|
|