[LU-3173] Incorrect detection of ldiskfs series to use Created: 15/Apr/13  Updated: 04/Nov/13  Resolved: 11/Jul/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.5.0, Lustre 2.4.2

Type: Bug Priority: Minor
Reporter: Dmitry Eremin (Inactive) Assignee: Dmitry Eremin (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-3516 manage compiling the lustre kernel bu... Closed
Severity: 3
Rank (Obsolete): 7733

 Description   

The problem is in AS_VERSION_COMPARE([$LINUXRELEASE],[2.6.32-343] in ldiskfs-build.m4. The result of string comparison of "2.6.32.lustremaster" and "2.6.32-343" give a great and series for rhel6.4 are chosen.

The value of EXTRAVERSION in kernel Makefile should be chosen careful for subsequent string comparison or we need to change the code of series selection.



 Comments   
Comment by Dmitry Eremin (Inactive) [ 15/Apr/13 ]

The fix for selection of ldiskfs series to use is http://review.whamcloud.com/6056

Comment by Christopher Morrone [ 22/Apr/13 ]

I sounds like the Lustre build of the Linux kernel is currently converting the upstream kernel string from 2.6.32-343 to 2.6.32.lustremaster. That new version string is kind of horrible. How about you just fix that?

Comment by Dmitry Eremin (Inactive) [ 23/Apr/13 ]

According https://wiki.hpdd.intel.com/display/PUB/Walk-thru-+Build+Lustre+MASTER+on+RHEL+6.3+from+Whamcloud+git we recommend to edit kernel sources Makefile and modify line 4:

EXTRAVERSION = .lustremaster

Therefore, the kernel version produced from those sources will be:

#define UTS_RELEASE "2.6.32.lustremaster"

So, we try to use this version as upstream kernel version. I don't know how we produce pre-built RPMs packages but WiKi recommendation about building Lustre master from git sources is not correct.

Anyway we try to rely on string representation which is not a kernel version only. The kernel version is "2.6.32". The suffix "-358.2.1.el6" is a release.

$ rpm -qi kernel
Name        : kernel       Relocations: (not relocatable)
Version     : 2.6.32       Vendor: CentOS
Release     : 358.2.1.el6  Build Date: Wed 13 Mar 2013 05:35:54 AM MSK

Now we use "AS_VERSION_COMPARE([$LINUXRELEASE],[2.6.32-343]," which is not robust as I mentioned above.

In the current patch I try to use RHEL_RELEASE_CODE which is just encoding of RHEL_MAJOR and RHEL_MINOR. Probably we could use RHEL_RELEASE which is more precise. But anyway I think we should not rely on string which user can easily change with his custom version.

Comment by Christopher Morrone [ 23/Apr/13 ]

According https://wiki.hpdd.intel.com/display/PUB/Walk-thru-+Build+Lustre+MASTER+on+RHEL+6.3+from+Whamcloud+git we recommend to edit kernel sources Makefile and modify line 4

I would strongly recommend changing those instructions. Replacing the very useful release string from upstream with the very vague string ".lustremaster" is just bad advice.

Case in point: you guys aren't following that advice! Check your kernel packages.

LLNL certainly doesn't choose to rename our kernels that way, and never will.

In summary, I think that using the version+release field is plenty robust, if we simply stop giving users bad advice about trashing the release field.

The Version+Release string is simple and straight forward to understand. It is easy for a human to match up with rpm names.

The Version+Release also more uniquely identifies a kernel. Multiple differing kernels can have the same RHEL_RELEASE_CODE, and that could be a problem for properly idenifying which patches are needed.

Honestly, if a user is building their own kernel, they can quite easily change any of the numbers. Lets not worry about all of the crazy things people might do, and focus on the things we normally do.

Comment by Dmitry Eremin (Inactive) [ 23/Apr/13 ]

I don't propose to change this approach. I just would like to make this approach more stable regardless user changes.

My patch #3 (http://review.whamcloud.com/6056) just extract correct upstream kernel release for RHEL kernels and form it into $(VERSION).$(PATCHLEVEL).$(SUBLEVEL)-$(RHEL_RELEASE). It's exactly Version+Release (2.6.32-358) format like you propose to follow without any additional info and it not depends from EXTRAVERSION which user can change.

If EXTRAVERSION is not empty kernel script remove all RHEL release information from kernel version. So, user should carefully write the same RHEL release version as he see in original kernel package. It's not convenient I think. For example:

$ rpm -qi kernel
Name        : kernel       Relocations: (not relocatable)
Version     : 2.6.32       Vendor: CentOS
Release     : 358.2.1.el6  Build Date: Wed 13 Mar 2013 05:35:54 AM MSK

# Therefore write to Makefile
EXTRAVERSION = -358.2.1.el6_lustre

Moreover the directory name with produced kernel will be ~/rpmbuild/BUILD/kernel-2.6.32358.2.1.el6_lustre.x86_64. It looks confusing at least for me but user should use it in Lustre configure script.

So, what is wrong with my #3 patch? Are you insist on WiKi change only?

Comment by Christopher Morrone [ 23/Apr/13 ]

Yes, I think that a Wiki-only change is the appropriate fix for this problem. I don't see much value in the additional complexity that the patch introduces.

Comment by Yang Sheng [ 23/Apr/13 ]

Hi, Christopher, I think this change is needed. I have received many guys encountered issue when they are build kernel and lustre on local box. The version come from utsrelease.h is depend on environment. But RHEL_RELEASE is stable always. So i think we use it more reasonable.

Comment by Christopher Morrone [ 23/Apr/13 ]

Isn't that because you guys are giving them bad instructions right now? With proper instructions, doesn't that problem go away?

Comment by Yang Sheng [ 24/Apr/13 ]

It obviously not. Not everyone build kernel from spec file. It also not necessary step. We haven't this problem in past, at least we should restore the behavior before.

Comment by Dmitry Eremin (Inactive) [ 24/May/13 ]

Additional drawback of approach to play with EXTRAVERSION only is incorrect specification of the kernel version for dependencies. For example, if I set the following in Makefile:

VERSION = 2
PATCHLEVEL = 6
SUBLEVEL = 32
EXTRAVERSION = -358.6.2.lum

I will got a kernel RPM with following information:

Name        : kernel                       Relocations: (not relocatable)
Version     : 2.6.32358.6.2.lum            Vendor: The Linux Community
Release     : 1                            Build Date: Tue 21 May 2013 04:24:59 PM MSK

But a lustre-modules will required dependency from "kernel = 2.6.32-358.6.2.lum". So, additional thing we need to fix is "krequires" macro value from lustre.spec file.

Comment by Christopher Morrone [ 24/May/13 ]

You know, more and more I think this is just completely the wrong approach to the problem. We are trying to do two things that are conceptually in conflict:

  • Decide which patch series to apply based mainly on the kernel versions string (also looking at distro name).
  • Allow the user to change the version string in ways that throw out needed version resolution.

How about this instead: If a user wishes to change the kernel version number, they need to use a configure option to specify which patch set they wish to use. They chose to throw away information, so make them tell ldiskfs which patch set to use. Adding complexity to try to guess when they are making it hard for us seems counterproductive to me.

Generated at Sat Feb 10 01:31:36 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.