Details
- Improvement
- Resolution: Fixed
- Critical
- Lustre 2.5.0
- None
- 24,128
- 7701
Description
Hot replace:
1 - Deactivate the OST on the MDT (lctl deactivate)
2 - Empty the OST
3 - Back up the magic files (last_rcvd, LAST_ID, CONFIGS/*)
4 - Deactivate the OST on all clients as well
5 - Unmount the OST
6 - Replace the OST and reformat it using the same index
7 - Restore the backed-up magic files
8 - Restart the OST
9 - Activate the OST everywhere
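Steps 3 and 7 of the procedure above (saving and restoring the magic files) can be sketched roughly as follows. This is only an illustration: it assumes the OST backing filesystem is mounted locally as a regular directory tree, and the helper name `copy_matching` and the pattern list are hypothetical, not part of any Lustre tool.

```python
import glob
import os
import shutil


def copy_matching(src_root, dst_root, patterns):
    """Copy files matching glob patterns (relative to src_root) into dst_root.

    Relative paths are preserved, so the same call can be used for the
    backup (OST -> backup dir) and the restore (backup dir -> new OST).
    Returns the sorted list of relative paths that were copied.
    """
    copied = []
    for pattern in patterns:
        # Escape the root so only the pattern part is treated as a glob.
        full_pattern = os.path.join(glob.escape(src_root), pattern)
        for src in glob.glob(full_pattern):
            if not os.path.isfile(src):
                continue  # skip directories and other non-regular entries
            rel = os.path.relpath(src, src_root)
            dst = os.path.join(dst_root, rel)
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also preserves timestamps and mode
            copied.append(rel)
    return sorted(copied)


# Assumed pattern list; adjust it to where these files live on your OST.
MAGIC_FILES = ["last_rcvd", "LAST_ID", "CONFIGS/*"]
```

A backup would then be `copy_matching(old_ost_mount, backup_dir, MAGIC_FILES)`, and the restore after reformatting would be `copy_matching(backup_dir, new_ost_mount, MAGIC_FILES)`, where the three directory names are whatever paths the administrator uses.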
It should be possible to have a new OST gracefully replace an old one, if that is what the administrator wants. Some "special" action would need to be taken on the OST and/or MDT to confirm that this is what the admin intends, instead of e.g. accidentally inserting some other OST with the same index and corrupting the filesystem because of duplicate object IDs, or not being able to access existing objects on the "real" OST at that index.
- the new OST would be best off starting to allocate objects at the LAST_ID of the old OST, so that there is no risk of confusion between objects
- the MDT contains the old LAST_ID in its lov_objids file, and it sends this to the OST at connection time, so this is no problem
- currently the new OST will refuse to allow the MDT to connect, because it detects that the old LAST_ID value from the MDT is inconsistent with its own value
- it would be relatively straightforward to have the OST detect if the local LAST_ID value was "new" and use the MDT value instead
- the danger is if the LAST_ID file was lost for some reason (e.g. corruption causes e2fsck to erase it). In that case, the OST startup code should be smart enough to regenerate LAST_ID by walking the object directories, which would also avoid the need to do this in e2fsck/lfsck (which can only run offline)
- in cases where the on-disk LAST_ID is much lower than the MDT-supplied value, the OST should skip precreation of all the intermediate objects and just start using the new MDT value
- the only other thing is to avoid the case where a "new" OST is accidentally assigned the same index, when that isn't what is wanted. There needs to be some way to "prime" the new OST (that is NOT the default for a newly formatted OST), or conversely to tell the MDT that it should signal the new OST to take the place of the old one, so that there are not any mistakes
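
The LAST_ID handling proposed in the list above can be modelled with a small sketch. It is not Lustre code: the names `choose_last_id`, `regenerate_last_id`, `FRESH_LAST_ID` and `MAX_PRECREATE_GAP`, the threshold value, and the assumption that object files are named by their decimal object ID are all hypothetical, chosen only to make the decision rules concrete.

```python
import os
from dataclasses import dataclass

FRESH_LAST_ID = 0          # assumed LAST_ID value of a newly formatted OST
MAX_PRECREATE_GAP = 10000  # assumed limit above which precreation is skipped


class ReplaceNotConfirmed(Exception):
    """A fresh OST showed up at a used index without being primed for it."""


@dataclass
class Decision:
    last_id: int    # value the OST continues allocating from
    precreate: int  # number of intermediate objects to precreate


def regenerate_last_id(objects_root):
    """Rebuild LAST_ID as the highest numeric file name under objects_root."""
    highest = 0
    for _dirpath, _dirnames, filenames in os.walk(objects_root):
        for name in filenames:
            if name.isascii() and name.isdigit():
                highest = max(highest, int(name))
    return highest


def choose_last_id(mdt_last_id, disk_last_id, replace_primed, objects_root=None):
    """Reconcile the MDT-supplied LAST_ID with the OST's on-disk value.

    disk_last_id is None when the LAST_ID file is missing; it is then
    regenerated from the object directories under objects_root.
    """
    if disk_last_id is None:
        disk_last_id = regenerate_last_id(objects_root)
    elif disk_last_id == FRESH_LAST_ID and mdt_last_id > FRESH_LAST_ID:
        # A "new" OST at an index the MDT already knows about: only accept
        # it if the administrator explicitly asked for a replacement.
        if not replace_primed:
            raise ReplaceNotConfirmed("fresh OST at an index already in use")
        return Decision(last_id=mdt_last_id, precreate=0)

    if disk_last_id >= mdt_last_id:
        return Decision(last_id=disk_last_id, precreate=0)

    gap = mdt_last_id - disk_last_id
    if gap > MAX_PRECREATE_GAP:
        # Far behind the MDT: jump ahead without creating the gap objects.
        return Decision(last_id=mdt_last_id, precreate=0)
    return Decision(last_id=mdt_last_id, precreate=gap)
```

For example, `choose_last_id(5000, 0, replace_primed=True)` adopts the MDT value directly, while the same call with `replace_primed=False` raises `ReplaceNotConfirmed`, which corresponds to refusing the connection for an OST that was given the same index by accident.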
Issue Links
- is related to
  - LU-3458 OST not able to register at MGS with predefined index. (Open)
  - LU-2018 Questions about using lfsck (Resolved)
  - LU-3668 ldiskfs_check_descriptors: Block bitmap for group not in group (Resolved)
  - LU-3575 'mkfs.lustre --writeconf' not working anymore with Lustre 2.4 (Resolved)
  - LU-5722 memory allocation deadlock under lu_cache_shrink() (Resolved)
  - LU-4204 typo in new conf-sanity subtest (Resolved)
  - LU-266 Need a better, automated way to recover from failures that require LAST_ID recovery (Resolved)
- is related to
  - LU-4246 Test failure on test suite conf-sanity, subtest test_72 (Closed)
  - LU-1267 LFSCK II: MDT-OST consistency check/repair (Resolved)