[LU-9788] upgrading ldiskfs on-disk format from 2.4.3 lustre version to 2.8.0 Created: 20/Jul/17 Updated: 10/Mar/18 Resolved: 10/Mar/18 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Question/Request | Priority: | Major |
| Reporter: | James A Simmons | Assignee: | Andreas Dilger |
| Resolution: | Done | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Description |
|
ORNL's main production file system was formatted during the 2.4.3 lustre time frame. Since then we have moved to 2.5 and now to lustre 2.8.0 without updating the ldiskfs on-disk format. This ticket is a request for information about what has changed and the impact of those changes. Lastly, we need to ensure the upgrade is correct when done. |
| Comments |
| Comment by Peter Jones [ 20/Jul/17 ] |
|
Andreas, could you please advise? Thanks, Peter |
| Comment by Andreas Dilger [ 20/Jul/17 ] |
|
We work hard to maintain upgrade and downgrade compatibility between Lustre releases for the on-disk format. New features that change the on-disk format in a way that prevents a downgrade to the previous Lustre version typically require explicit action from the administrator to enable, so the system can be upgraded without affecting the disk format, and the new feature enabled only once the new Lustre release is known to be stable in your environment.

It would be good to get the output of "dumpe2fs -h" from the MDT and one OST (assuming they are the same) to see which ldiskfs features are currently enabled, and to check whether performance improvements may be possible after the upgrade. Secondly, in addition to upgrading the servers, will you also be upgrading the clients, or will you be running with different client versions? I believe you may already be running 2.8 clients on your system.

There are two issues that I'm aware of that would affect upgrade+downgrade to a 2.4 MDS. One is that the client multiple metadata request feature ("multi-slot last_rcvd") sets a flag on the MDT for the new recovery file format that prevents mounting on an unsupported version of Lustre. The second is related to LFSCK.

If you want to be prudent, it makes sense to create a backup of the MDT prior to the upgrade. This can be done with "dd" of the raw MDT filesystem to a backup device before installing the new Lustre release. DDN has also been testing the use of "e2image" to make copies of the ldiskfs metadata, which has the advantage of only backing up the in-use parts of the device and storing them more compactly. It would be possible to make an e2image backup of the OSTs as well, since the actual space used would be relatively small. |
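A minimal sketch of the two backup approaches described above, assuming the MDT is unmounted (failed over) first; the device path /dev/mapper/mdt0 and the /backup destination are placeholders, not taken from this ticket:

```sh
# Full raw image of the MDT with dd; the device must not be mounted.
# conv=sparse skips writing all-zero blocks to save space on the copy.
dd if=/dev/mapper/mdt0 of=/backup/mdt0.img bs=4M conv=sparse

# Metadata-only backup with e2image; -Q writes a compact QCOW2-format
# image containing only the in-use ldiskfs metadata blocks.
e2image -Q /dev/mapper/mdt0 /backup/mdt0.qcow2
```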
| Comment by Brad Hoagland (Inactive) [ 22/Aug/17 ] |
|
Hi simmonsja, Does this answer your question(s)? Regards, Brad |
| Comment by James A Simmons [ 31/Aug/17 ] |
|
I need to talk to Andreas in detail about this. |
| Comment by James A Simmons [ 07/Sep/17 ] |
|
Both our clients and server back end are running lustre 2.8.1. It's just the ldiskfs format that hasn't been upgraded since our 2.5 days. Okay, here is the dumpe2fs output from our MDS server:

[root@atlas1-mds1 ~]# dumpe2fs -h /dev/mapper/atlas1-mdt1

and here is the output from one of our OSS servers:

[root@atlas-oss1a1 ~]# dumpe2fs -h /dev/mapper/atlas-ddn1a-l0 |
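The part of that header output that matters for the upgrade question is the feature list; a small sketch using the same device paths (the grep pattern is an assumption about the stock dumpe2fs header format):

```sh
# dumpe2fs -h prints a "Filesystem features:" line listing enabled
# flags (e.g. has_journal, flex_bg, dirdata, huge_file, ...).
dumpe2fs -h /dev/mapper/atlas1-mdt1 2>/dev/null | grep -i 'filesystem features'
dumpe2fs -h /dev/mapper/atlas-ddn1a-l0 2>/dev/null | grep -i 'filesystem features'
```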
| Comment by Andreas Dilger [ 08/Sep/17 ] |
|
If you are already running Lustre 2.8.x on the MDS/OSS then there isn't a huge amount to be done. You already have the dirdata feature (which has been available since 2.1) and flex_bg, which are the major performance gains compared to upgraded 1.8-formatted MDT filesystems. The above-referenced issues were only relevant in the case of a downgrade, but since you are already running 2.8, presumably without problems that would make you want to downgrade, I don't think they are relevant.

The only other possible issue that would come up going forward is the size of the inodes on the MDT and OST. With Lustre 2.10+ we have bumped the default MDT inode size to 1024 bytes (from 512) and the default OST inode size to 512 (from 256) to facilitate usage of PFL in the future. If you are going to make PFL layouts the default on an ldiskfs MDT then you might consider doing a backup/restore (the inode size can only be changed at format time), but this is a non-issue for ZFS (which has dynamic dnode sizing as of 0.7.x). |
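For reference, a sketch of how the larger inode sizes would be requested at format time, since they cannot be changed afterwards; the fsname, MGS nid, and device paths are hypothetical, and the -I values mirror the 2.10 defaults mentioned above:

```sh
# Format an MDT with 1024-byte inodes (the Lustre 2.10+ default),
# passing the inode size through to mke2fs via --mkfsoptions.
mkfs.lustre --mdt --fsname=testfs --index=0 \
    --mgsnode=mgs@tcp --mkfsoptions="-I 1024" /dev/mapper/mdt0

# Format an OST with 512-byte inodes (the Lustre 2.10+ default).
mkfs.lustre --ost --fsname=testfs --index=0 \
    --mgsnode=mgs@tcp --mkfsoptions="-I 512" /dev/mapper/ost0
```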
| Comment by James A Simmons [ 19/Sep/17 ] |
|
Yes, our production system is running lustre 2.8 clients and lustre 2.8 servers. We have no plans with the current center-wide file system to move to lustre 2.10. When the file system was created it was formatted at a lustre 2.5 version, so we are looking to see what needs to be enabled to get to the 2.8 lustre support level. The big issue we have hit is when users create 20+ million files per directory, which crashes our MDS server. I believe the large directory hash work in newer lustre versions fixes this. |
| Comment by Andreas Dilger [ 19/Sep/17 ] |
|
James, we've never supported more than ~10M files per directory with ldiskfs (https://build.hpdd.intel.com/job/lustre-manual/lastSuccessfulBuild/artifact/lustre_manual.xhtml#settinguplustresystem.tab2), except with DNE2 directories striped over multiple MDTs (each with < 10M entries). The per-directory limit depends on the length of the filenames being used. Hitting this limit definitely shouldn't crash the MDS; that should be filed as a separate ticket with stack traces, etc.

The large_dir feature hasn't been tested in production yet, as it has only landed in the development e2fsprogs release upstream, and the patches for e2fsprogs-wc need to be updated to include fixes made to those upstream patches. This definitely isn't something included as part of 2.8. |
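Where DNE2 is available, the striped-directory workaround mentioned above is created with lfs setdirstripe; a sketch with a hypothetical mount point and stripe count (requires multiple MDTs and a DNE2-capable client):

```sh
# Create a directory striped across 4 MDTs, so each MDT holds
# roughly a quarter of the directory entries.
lfs setdirstripe -c 4 /lustre/atlas/bigdir

# Confirm the stripe layout of the new directory.
lfs getdirstripe /lustre/atlas/bigdir
```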
| Comment by James A Simmons [ 26/Sep/17 ] |
|
So even though ldiskfs has the code to support large_dir, it is off by default and has never been really tested. You can't even set large_dir with the current ldiskfs version of e2fsprogs? |
| Comment by Andreas Dilger [ 27/Sep/17 ] |
|
Correct. Until recently, there was no e2fsck support for the large_dir feature, so it has not been safe to enable. |
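Once an e2fsprogs release with large_dir support (including e2fsck) is installed, enabling the feature on an existing filesystem would look roughly like the sketch below; this is an assumption based on the standard ext4 feature workflow, not something tested on this system, and the device path is a placeholder:

```sh
# Enable the large_dir feature flag (requires an e2fsprogs that knows
# it; an older tune2fs will refuse the unrecognized feature name).
tune2fs -O large_dir /dev/mapper/mdt0

# Verify the flag is set and the filesystem is still consistent.
dumpe2fs -h /dev/mapper/mdt0 | grep -i 'filesystem features'
e2fsck -fn /dev/mapper/mdt0   # read-only check, makes no changes
```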
| Comment by Andreas Dilger [ 10/Mar/18 ] |
|
Closing this issue; I don't think there is anything here to be done. |