[LUDOC-446] Lustre server rolling upgrades Created: 20/Jun/19  Updated: 26/Aug/19

Status: Open
Project: Lustre Documentation
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Question/Request Priority: Minor
Reporter: Olaf Faaland Assignee: Olaf Faaland
Resolution: Unresolved Votes: 0
Labels: llnl


 Description   

The Lustre Operations Manual currently says this:

Lustre software release 2.x.y release (minor) upgrade:
•   All servers must be upgraded at the same time, while some or all clients may be upgraded.
•   Rolling upgrades are supported for minor releases allowing individual servers and clients to be upgraded without stopping the Lustre file system.

The first sentence sounds like another way of saying "all the servers must be the same". The second sentence sounds like another way of saying "the servers can be different versions, but only temporarily".

As long as:

  • "2.x" part of the version is staying the same
  • "y" part is changing to "y+1" or "y+2"
  • assuming an actual tagged releases

Can we update the targets within a single file system one at a time? If so, I'll propose a patch with what I think is better wording.



 Comments   
Comment by Andreas Dilger [ 25/Jun/19 ]

Olaf, indeed it is possible to upgrade servers one at a time, but I can't imagine why you'd want to. If one server is down temporarily for an upgrade, the files on that server are temporarily inaccessible and clients will block until the server is restarted and recovery completes. The client recovery time will be pretty much the same whether a single server is upgraded or half/all of the servers are upgraded, but you would have many more recovery periods if you only upgrade one OSS at a time.

The recommended process for rolling upgrades is to fail over half of the MDT+OST targets to their backup nodes, upgrade the now-idle MDS+OSS nodes to the new release, then fail over all of the targets to the just-upgraded nodes, upgrade the other half of the MDS+OSS nodes, and fail back half of the targets to their original nodes. This involves 3 recovery periods, or correspondingly more if you have e.g. N-way HA failover clusters and are failing over 1/N of the targets at a time.
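
For illustration only, here is a minimal sketch of how that failover sequence could be driven for a single HA pair. The node names, devices, mount points, and the package-upgrade command are all placeholders and not anything taken from the manual; the only Lustre-specific assumption is that targets are stopped with umount and started on the peer node with mount -t lustre.

    # Hypothetical sketch of the "fail over, upgrade the idle node, repeat"
    # flow described above.  Node names, devices, mount points, and the
    # package-upgrade command are placeholders for a real HA configuration.
    import subprocess

    # One HA pair: two server nodes, each normally serving its own targets.
    NODE_A, NODE_B = "oss01", "oss02"
    TARGETS_A = [("/dev/mapper/ost0000", "/mnt/lustre/ost0000")]
    TARGETS_B = [("/dev/mapper/ost0001", "/mnt/lustre/ost0001")]
    UPGRADE_CMD = "yum -y update 'lustre*' 'kmod-lustre*'"  # placeholder

    def run(node, cmd):
        """Run a shell command on a server node over ssh, failing loudly."""
        subprocess.run(["ssh", node, cmd], check=True)

    def move_targets(src, dst, targets):
        """Fail targets over from src to dst: umount, then mount -t lustre."""
        for device, mountpoint in targets:
            run(src, f"umount {mountpoint}")
            run(dst, f"mount -t lustre {device} {mountpoint}")
            # Clients reconnect to dst and go through recovery at this point.

    move_targets(NODE_A, NODE_B, TARGETS_A)              # recovery period 1
    run(NODE_A, UPGRADE_CMD)                             # NODE_A is now idle
    move_targets(NODE_B, NODE_A, TARGETS_A + TARGETS_B)  # recovery period 2
    run(NODE_B, UPGRADE_CMD)                             # NODE_B is now idle
    move_targets(NODE_A, NODE_B, TARGETS_B)              # recovery period 3: fail back

In a real deployment the mount/umount moves would normally be driven by the HA software (e.g. Pacemaker) rather than bare ssh calls, but the sequence and the three recovery periods are the same.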

With FLR mirroring/EC we will eventually be able to do this without any outage, but that would need all of the files (or at least all of the in-use files) to be redundant in some way. Conceivably, we could mirror all of the files with objects on the OSTs of a particular OSS before doing the upgrade, then un-mirror them afterward, but that would be relatively slow.
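
Purely to illustrate that idea (it is not something the manual describes today), a rough sketch using lfs mirror extend/split might look like the following. The mount point, OST index, pool name, and mirror ID are invented for the example, and a real script would verify that the extra replicas are in sync before relying on them.

    # Hypothetical sketch of the FLR idea above: replicate files that have
    # objects on the OSS being upgraded, then drop the extra mirror afterward.
    # Mount point, OST index, pool name, and mirror ID are invented examples.
    import subprocess

    MOUNT = "/mnt/lustre"
    OST_INDEX = "0"           # an OST served by the OSS to be upgraded
    SAFE_POOL = "stay_up"     # pool of OSTs on servers not being upgraded

    def lfs(*args):
        return subprocess.run(["lfs", *args], check=True,
                              capture_output=True, text=True).stdout

    # Files with at least one object on the OST in question.
    files = lfs("find", MOUNT, "--type", "f", "--ost", OST_INDEX).splitlines()

    # Before the upgrade: add one extra mirror on OSTs that stay available.
    for path in files:
        lfs("mirror", "extend", "-N1", "-p", SAFE_POOL, path)

    # ... upgrade and restart the OSS here ...

    # Afterward: remove the temporary mirror again (mirror ID 2 assumed; a
    # real script would read the IDs back with 'lfs getstripe').
    for path in files:
        lfs("mirror", "split", "--mirror-id", "2", "-d", path)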

In any case, I'm all for improving the readability of the manual, so feel free to suggest better wording.

Comment by Olaf Faaland [ 26/Aug/19 ]

Hi Andreas,
Belated thanks; that makes sense. I haven't yet gotten around to my proposed wording, but I will.

Peter,
It appears I can't remove the topllnl label in this project, and I can't assign this issue to myself. Can you make those changes, or make it so I can (either is fine with me)?
