Details
-
Bug
-
Resolution: Unresolved
-
Critical
-
None
-
Lustre 2.7.0, Lustre 2.8.0
-
3
Description
LU-5276 (commit 995921a5c1e53b4d717de21dd3210406f6833689) introduced a change in the ldiskfs series selection logic. Ostensibly the reason for this was:
LU-5276 build: handle RHEL ldiskfs series more accurated

Since RHEL7 change RHEL_RELEASE macro format. So we need
a unified way to handle it. Change using RHEL_MAJOR &
RHEL_MINOR to decided which ldiskfs series to be choose.
The mentioned change in the RHEL_RELEASE macro was, to the best of my knowledge, simply that they started putting quotes around the number. So for instance it went from something like this:
#define RHEL_RELEASE 200
to something like this:
#define RHEL_RELEASE "300"
It would have been a pretty simple matter to handle either quotes or no quotes. In fact I made such a change locally. I'm not sure why this required a complete change of the logic.
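As a minimal sketch of how both forms could be tolerated, assuming the RHEL_RELEASE value has already been extracted from the kernel headers into a shell variable (the function name normalize_rhel_release is hypothetical and is not part of the Lustre build system):

```shell
# Hypothetical helper: strip optional double quotes from a RHEL_RELEASE value.
normalize_rhel_release() {
    # tr -d removes every double-quote character, so both the old form
    # (200) and the new form ("300") come out as a bare number.
    printf '%s\n' "$1" | tr -d '"'
}

normalize_rhel_release '200'     # prints 200
normalize_rhel_release '"300"'   # prints 300
```

With the value normalized like this, whatever comparison logic consumed the old unquoted number could stay unchanged.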
But the new logic introduced a few problems, and I would argue that it should be reverted.
First of all, the commit message subject claims that the new method is more accurate than the previous. This is obviously incorrect. The new method allows significantly less accuracy than the previous method.
In the old method, we could target specific RHEL kernel versions, but in the new method we can only have a single series file per RHEL distro version. If there are kernel updates within a RHEL update, there is no longer any way to specify a different series file for each kernel. The previous method allowed that.
Next, the new method absolutely guarantees that Lustre can never build for any new RHEL version that comes out without developer intervention. The old method supported kernel versions and more importantly, ranges of kernel versions. The last range was always X version AND NEWER. So the most recent series file could always be tried with newer kernels. If there was not too much change in the kernel, the series file can be applied cleanly and no developer intervention is needed.
I would much prefer that we try to apply the latest series to newer kernels and fail at patch application time, rather than fail to even try to apply any series file at all. (Currently it fails to even try.)
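The "X version AND NEWER" behavior can be illustrated with a rough sketch. This is not the original Lustre code: the function name, the numeric thresholds, and the series file names below are all made up for illustration, and the input is assumed to be a plain integer release value.

```shell
# Hypothetical sketch: pick a series file from a numeric release value.
# Thresholds and series file names are invented for this example.
select_series() {
    release=$1
    if [ "$release" -ge 500 ]; then
        # Open-ended top range: this release AND NEWER. An unknown,
        # newer kernel still gets the most recent series file.
        echo "example-newest.series"
    elif [ "$release" -ge 400 ]; then
        echo "example-middle.series"
    else
        echo "example-oldest.series"
    fi
}

select_series 450   # prints example-middle.series
select_series 999   # prints example-newest.series
```

Because the top range has no upper bound, a release that did not exist when the script was written still selects a series file, and any incompatibility shows up later, when the patches are applied.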
And it is also worth noting that the patch introduced this bug:
+ 6[0-3]) LDISKFS_SERIES="2.6-rhel6.series" ;;
This range does not work. Square brackets are used as quotes in autoconf's m4 system, so these square brackets disappear. The 2.6-rhel6.series will never be used as written.
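The effect can be reproduced in plain shell. The sketch below simulates by hand what the generated configure script would contain once m4 has stripped the brackets. The function names are hypothetical, and the input 62 is only an assumed example of a value the pattern was meant to match.

```shell
# As written in the .m4 source:          6[0-3]) ...
# As it reaches the shell after m4 has
# removed one level of quoting:          60-3) ...
match_as_generated() {
    case $1 in
        60-3) echo "2.6-rhel6.series" ;;   # matches only the literal string 60-3
        *)    echo "no series" ;;
    esac
}

# The pattern the author intended, with the bracket range intact.
match_as_intended() {
    case $1 in
        6[0-3]) echo "2.6-rhel6.series" ;;
        *)      echo "no series" ;;
    esac
}

match_as_generated 62   # prints no series
match_as_intended 62    # prints 2.6-rhel6.series
```

In autoconf input, literal brackets are normally preserved by writing the quadrigraphs @<:@ and @:>@ in place of [ and ], or, depending on the quoting depth at that point, by doubling the brackets.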
In summary, the logic change introduced these problems:
- Less kernel version specificity
- Guaranteed failure to build with every minor RHEL update
- Broken use of square brackets
I would very much like to see this fixed before 2.9 comes out.
No. I want you to put it back the way it was.