Details

    • Question/Request
    • Resolution: Unresolved
    • Minor

    Description

      The lustre operations manual currently says this:

      Lustre software release 2.x.y release (minor) upgrade:
      •   All servers must be upgraded at the same time, while some or all clients may be upgraded.
      •   Rolling upgrades are supported for minor releases allowing individual servers and clients to be upgraded
      without stopping the Lustre file system.
      

      The first sentence sounds like another way of saying "all the servers must be the same". The second sentence sounds like another way of saying "the servers can be different versions, but only temporarily".

      Assume that:

      • the "2.x" part of the version stays the same
      • the "y" part changes to "y+1" or "y+2"
      • both versions are actual tagged releases

      Under those conditions, can we update the targets within a single file system one at a time? If so, I'll propose a patch with what I think is better wording.

        Activity

          [LUDOC-446] Lustre server rolling upgrades

          Hi Andreas,
          Thanks in arrears. That makes sense, and I haven't gotten around to my proposed wording, but I will.

          Peter,
          I can't remove the topllnl label in this project, it appears, and can't assign this to myself. Can you make those changes, or make it so I can (either is fine with me)?

          ofaaland Olaf Faaland added a comment

          Olaf, indeed it is possible to upgrade servers one-at-a-time, but I can't imagine why you'd want to? If one server is down temporarily for an upgrade, that means files on that server are temporarily inaccessible and clients will block until the server is restarted and recovery completes. The client recovery time will be pretty much the same whether a single server is upgraded or half/all of the servers are upgraded, but you would have many more recovery periods if you only upgrade one OSS at a time.

          The recommended process for rolling upgrades is to fail over half of the MDT+OST targets to their backup nodes, upgrade the now-idle MDS+OSS nodes to the new release, then fail over all of the targets to the just-upgraded nodes, upgrade the other half of the MDS+OSS nodes, and fail back half of the targets to their original nodes. This involves 3 recovery periods, or correspondingly more if you have e.g. N-way HA failover clusters and are failing over 1/N of the targets at a time.
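
The sequence above can be sketched as a small planning script. This is only an illustration of the ordering and of where the recovery periods fall, assuming 2-way HA pairs. The `HAPair` type, the node names, and the helper functions are hypothetical; nothing here runs Lustre commands. The one-at-a-time count additionally assumes one recovery period per server restart.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HAPair:
    """Hypothetical model of a 2-way HA pair: each node can host its partner's targets."""
    node_a: str
    node_b: str


def plan_rolling_upgrade(pairs):
    """Return ordered (description, causes_recovery) steps for the half-and-half process."""
    first_half = [p.node_a for p in pairs]
    second_half = [p.node_b for p in pairs]
    return [
        # Half of the targets move to their backup nodes; first_half becomes idle.
        (f"fail over targets from {first_half} to {second_half}", True),
        (f"upgrade idle nodes {first_half}", False),
        # All targets move to the just-upgraded nodes; second_half becomes idle.
        (f"fail over all targets from {second_half} to {first_half}", True),
        (f"upgrade idle nodes {second_half}", False),
        # Targets that originally lived on second_half return home.
        (f"fail back half of the targets from {first_half} to {second_half}", True),
    ]


def recovery_periods(steps):
    """Count the steps that make clients go through recovery."""
    return sum(1 for _, causes_recovery in steps if causes_recovery)


def one_at_a_time_recovery_periods(pairs):
    """Assumption: each server is restarted on its own, one recovery period per server."""
    return 2 * len(pairs)


if __name__ == "__main__":
    pairs = [HAPair("mds1", "mds2"), HAPair("oss1", "oss2"), HAPair("oss3", "oss4")]
    steps = plan_rolling_upgrade(pairs)
    for number, (description, causes_recovery) in enumerate(steps, start=1):
        marker = " (recovery period)" if causes_recovery else ""
        print(f"{number}. {description}{marker}")
    print("half-and-half recovery periods:", recovery_periods(steps))
    print("one-at-a-time recovery periods:", one_at_a_time_recovery_periods(pairs))
```

With the three example pairs, the half-and-half plan counts 3 recovery periods regardless of how many pairs there are, while the one-at-a-time count grows with the number of servers.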

          With FLR mirror/EC we will eventually be able to do this without any outage, but that would need all of the files (or at least all of the in-use files) to be redundant in some way. Conceivably, we could mirror all of the files using OSTs on a particular OSS before doing the upgrade, then un-mirror them afterward, but that would be relatively slow.

          In any case, I'm all for improving the readability of the manual, so feel free to suggest better wording.

          adilger Andreas Dilger added a comment

          People

            ofaaland Olaf Faaland
            ofaaland Olaf Faaland