Lustre / LU-14

live replacement of OST

    Description

      Hot replace:
      1 - Deactivate the OST on the MDT (lctl deactivate)
      2 - Empty the OST
      3 - Back up the magic files (last_rcvd, LAST_ID, CONFIGS/*)
      4 - Deactivate the OST on all clients as well
      5 - Unmount the OST
      6 - Replace the OST and reformat it using the same index
      7 - Restore the backed-up magic files
      8 - Restart the OST
      9 - Activate the OST everywhere
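
Steps 3 and 7 above can be sketched as a single copy helper that is run once in each direction. This is only a minimal illustration: the relative paths in `MAGIC_PATTERNS` assume an OST mounted as a plain local filesystem with the usual ldiskfs layout, and the helper name and mount points are hypothetical, not part of any Lustre tool.

```python
import glob
import os
import shutil

# Assumed relative locations of the "magic" files on a locally mounted OST.
# Adjust these to the actual on-disk layout before relying on them.
MAGIC_PATTERNS = ("last_rcvd", "O/0/LAST_ID", "CONFIGS/*")


def copy_magic_files(src_root, dst_root, patterns=MAGIC_PATTERNS):
    """Copy every regular file matching `patterns` from src_root to dst_root.

    Relative paths are preserved. Returns the sorted list of relative
    paths that were copied, so the caller can check nothing is missing.
    """
    copied = []
    for pattern in patterns:
        for src in glob.glob(os.path.join(src_root, pattern)):
            if not os.path.isfile(src):
                continue  # skip directories matched by a wildcard
            rel = os.path.relpath(src, src_root)
            dst = os.path.join(dst_root, rel)
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also preserves timestamps and mode
            copied.append(rel)
    return sorted(copied)
```

With the old OST mounted at a hypothetical `/mnt/ost_old`, the backup is `copy_magic_files("/mnt/ost_old", "/root/ost_backup")`; after reformatting, the restore is the same call with the backup directory as the source and the new OST mount as the destination.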

      It should be possible to have a new OST gracefully replace an old one, if that is what the administrator wants. Some "special" action would need to be taken on the OST and/or MDT to confirm that intent. Otherwise, accidentally inserting some other OST with the same index could corrupt the filesystem through duplicate object IDs, or make existing objects on the "real" OST at that index inaccessible.

      • the new OST should start allocating objects at the LAST_ID of the old
        OST, so that there is no risk of confusing old and new objects
      • the MDT keeps the old LAST_ID in its lov_objids file and sends it to the
        OST at connection time, so obtaining this value is no problem
      • currently the new OST will refuse to allow the MDT to connect, because it
        detects that the old LAST_ID value from the MDT is inconsistent with its
        own value
      • it would be relatively straightforward to have the OST detect if the local
        LAST_ID value was "new" and use the MDT value instead
      • the danger is if the LAST_ID file was lost for some reason (e.g. corruption
        causes e2fsck to erase it). In that case, the OST startup code should be
        smart enough to regenerate LAST_ID by walking the object directories,
        which would also avoid the need to do this in e2fsck/lfsck (which can only
        run offline)
      • in cases where the on-disk LAST_ID is much lower than the MDT-supplied
        value, the OST should skip precreation of all the intermediate objects
        and start using the MDT value directly
      • the remaining concern is to avoid the case where a "new" OST is accidentally
        assigned the same index when that isn't what is wanted. There needs to be
        some way to "prime" the new OST (which must NOT be the default for a newly
        formatted OST), or conversely to tell the MDT that it should signal the new
        OST to take the place of the old one, so that no mistakes are made
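
The decision points in the list above can be sketched as one function. This is a rough model of the proposal, not existing Lustre code: the `Decision` type, the action names, the `freshly_formatted` and `primed_for_replacement` flags, and the `MAX_PRECREATE_GAP` cutoff are all hypothetical, and the case where the local value is already at or above the MDT value is not covered by the description, so the sketch simply keeps the local value there.

```python
import os
from dataclasses import dataclass
from typing import Optional

# Hypothetical cutoff for "much lower"; the description gives no number.
MAX_PRECREATE_GAP = 10_000


@dataclass
class Decision:
    action: str                # refuse, adopt_mdt, keep_local, skip_to_mdt, precreate_gap
    last_id: Optional[int]     # ID to continue from, None when refusing
    regenerated: bool = False  # True if LAST_ID was rebuilt from the object dirs


def scan_max_object_id(object_root: str) -> int:
    """Walk object_root and return the largest all-digit file name, or 0."""
    highest = 0
    for _dirpath, _dirnames, filenames in os.walk(object_root):
        for name in filenames:
            if name.isascii() and name.isdigit():
                highest = max(highest, int(name))
    return highest


def reconcile_last_id(mdt_last_id: int,
                      disk_last_id: Optional[int],
                      object_root: str,
                      freshly_formatted: bool,
                      primed_for_replacement: bool) -> Decision:
    """Decide how an OST should treat the LAST_ID sent by the MDT on connect."""
    if freshly_formatted:
        # A new OST may take over the index only if the admin asked for it.
        if not primed_for_replacement:
            return Decision("refuse", None)
        return Decision("adopt_mdt", mdt_last_id)

    regenerated = False
    if disk_last_id is None:
        # LAST_ID file lost: rebuild it from the objects actually on disk.
        disk_last_id = scan_max_object_id(object_root)
        regenerated = True

    if disk_last_id >= mdt_last_id:
        # Assumption of this sketch only: keep the local value.
        return Decision("keep_local", disk_last_id, regenerated)
    if mdt_last_id - disk_last_id > MAX_PRECREATE_GAP:
        # Large gap: jump to the MDT value without precreating objects.
        return Decision("skip_to_mdt", mdt_last_id, regenerated)
    # Small gap: precreate the intermediate objects up to the MDT value.
    return Decision("precreate_gap", mdt_last_id, regenerated)
```

Checking the freshly-formatted case first keeps the "accidental same index" guard ahead of every other rule, so an unprimed new OST is refused regardless of the ID values involved.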

            People

              yong.fan nasf (Inactive)
              laisiyao Lai Siyao