kmods need to be limited to EL minor release kernel

Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • Lustre 2.10.1, Lustre 2.11.0
    • Lustre 2.10.0
    • None
    • 3

    Description

      Now that kmods are being produced, they need to be limited to the kernel of the RHEL minor release they were created for.

      This is because RHEL kernels have a kABI "whitelist": only a subset of kernel interfaces is guaranteed to be stable by RHEL's kABI, and only those interfaces are put into a kernel's list of "kernel(...) = ..." Provides: and a kmod's Requires:. As a result, a kmod produced for RHEL 7.3 will look compatible with a RHEL 7.4 kernel, because the whitelisted kABI will not have changed across those releases. It is not actually compatible, because the Lustre kmods use interfaces that are not on the whitelist, and those can change from one minor release to another.
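
To illustrate the gap described above, here is a minimal Python sketch. The symbol names, checksums, and whitelist contents are invented for illustration; only the mechanism (RPM-level dependencies carry whitelisted symbols only) comes from the description.

```python
# Hypothetical data: which symbols are on the kABI whitelist.
WHITELIST = {"stable_alloc", "stable_free"}

# Hypothetical exported symbols (name -> checksum) for two kernels.
KERNEL_7_3 = {"stable_alloc": "0x1111", "stable_free": "0x2222", "internal_helper": "0x3333"}
KERNEL_7_4 = {"stable_alloc": "0x1111", "stable_free": "0x2222", "internal_helper": "0x9999"}

# Hypothetical symbols used by a kmod built against the 7.3 kernel.
KMOD_BUILT_ON_7_3 = {"stable_alloc": "0x1111", "internal_helper": "0x3333"}


def whitelisted(symbols):
    """Keep only the (name, checksum) pairs that RPM dependencies would carry."""
    return {(name, crc) for name, crc in symbols.items() if name in WHITELIST}


def looks_compatible(kmod, kernel):
    """RPM-level view: the kmod's Requires: are a subset of the kernel's Provides:."""
    return whitelisted(kmod) <= whitelisted(kernel)


def is_compatible(kmod, kernel):
    """Actual view: every symbol the kmod uses must match, whitelisted or not."""
    return all(kernel.get(name) == crc for name, crc in kmod.items())


if __name__ == "__main__":
    print(looks_compatible(KMOD_BUILT_ON_7_3, KERNEL_7_4))  # dependency check passes
    print(is_compatible(KMOD_BUILT_ON_7_3, KERNEL_7_4))     # real symbol check fails
```

In this toy model the kmod passes the dependency check against the 7.4 kernel while failing the full symbol check, which is the situation the ticket describes.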

      While Red Hat guarantees that these non-whitelisted interfaces will not change within a minor release (i.e. across the kernel updates for 7.3), there is no such guarantee across minor releases (i.e. 7.3 to 7.4), and in practice they probably almost always change across minor releases. A kmod using non-whitelisted interfaces therefore needs to limit itself to the kernel provided in a RHEL minor release.

      For a kmod produced on a RHEL 7.3 kernel that means adding a Requires: kernel >= 3.10.0-514, kernel <= 3.10.0-514 to the kmod RPM.
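
As a rough sketch of how such a line could be derived from the build kernel, the following Python function reproduces the Requires: text quoted above. The function names and the input format are assumptions for illustration; this is not the code from the merged patch, which may express the bounds differently.

```python
import re


def minor_release_base(kernel_vr):
    """Return 'version-N', where N is the leading number of the release field.

    Example input (assumed format): '3.10.0-514.el7.x86_64'.
    """
    match = re.match(r"^(\d+(?:\.\d+)*)-(\d+)", kernel_vr)
    if match is None:
        raise ValueError(f"unrecognised kernel version: {kernel_vr!r}")
    return f"{match.group(1)}-{match.group(2)}"


def requires_line(kernel_vr):
    """Build the Requires: line in the form used in the ticket description."""
    base = minor_release_base(kernel_vr)
    return f"Requires: kernel >= {base}, kernel <= {base}"


if __name__ == "__main__":
    print(requires_line("3.10.0-514.el7.x86_64"))
```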

      If this is not done, the kmod will install onto a RHEL 7.4 machine, which has an incompatible kernel by default, and a compatible kernel (kernel-3.10.0-514*) will not be installed alongside it, even if one is available in a Yum repo.

          Activity

            bogl Bob Glossman (Inactive) added a comment:

            Interesting that the sles12sp2 test run didn't fail to install the client RPMs because of this.

            Unless called for in Test-Parameters, I don't think any SLES test runs are done routinely. For sure not in review tests.

            While I think this additional mod will fix the problem, I wonder if it might be better structured to push the lines that add the extra Requires into a function in lbuild-rhel and then have lbuild call that function if it exists. That way the extra Requires would be in all RHEL builds, not just RHEL7, and would still be left out of non-RHEL builds.

            gerrit Gerrit Updater added a comment:

            Brian J. Murrell (brian.murrell@intel.com) uploaded a new patch: https://review.whamcloud.com/28202
            Subject: LU-9731 Limit work-around to EL7 only
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 83952af84f928fb0289744c61fbe215e56bf32d6

            brian Brian Murrell (Inactive) added a comment:

            Interesting that the sles12sp2 test run didn't fail to install the client RPMs because of this.

            Given that this patch is specifically to work around a bug in RHEL's kmod building macro, it should probably be limited to RHEL kmod building only. I'll push a patch for that.
            bogl Bob Glossman (Inactive) added a comment (edited):

            I think this change mangles the Requires strings on SLES. For example, in a build of current master for sles12sp2, where the kernel version is 4.4.59-92.17, the built lustre-client-kmp-default package has these Requires:

            kernel >= 4.4.59-0
            kernel < 4.4.59-1

            while the kernel-default package for the pristine unpatched upstream kernel has a Provides of:

            kernel = 4.4.59-92.17

            I don't see how these can properly match for the purposes of install dependencies.
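
The mismatch in the comment above can be checked with a small Python sketch. The comparison below is a simplified, numeric-only approximation of how RPM orders version-release strings (it ignores epochs and alphabetic segments), and the function names are made up for illustration.

```python
import re


def _numbers(text):
    """Extract the numeric segments of a version or release string."""
    return [int(n) for n in re.findall(r"\d+", text)]


def compare(a, b):
    """Return -1, 0 or 1 for two 'version-release' strings (numeric parts only)."""
    ver_a, rel_a = a.split("-", 1)
    ver_b, rel_b = b.split("-", 1)
    for left, right in ((_numbers(ver_a), _numbers(ver_b)),
                        (_numbers(rel_a), _numbers(rel_b))):
        if left != right:
            return -1 if left < right else 1
    return 0


def satisfies(provided, op, required):
    """Check whether a provided version-release meets one versioned requirement."""
    result = compare(provided, required)
    checks = {
        ">=": result >= 0,
        ">": result > 0,
        "=": result == 0,
        "<=": result <= 0,
        "<": result < 0,
    }
    return checks[op]


if __name__ == "__main__":
    provided = "4.4.59-92.17"  # Provides of kernel-default, from the comment
    print(satisfies(provided, ">=", "4.4.59-0"))
    print(satisfies(provided, "<", "4.4.59-1"))
```

Under this approximation the lower bound is met but the upper bound is not, because the release 92.17 sorts after 1, so the pair of Requires cannot be satisfied by that kernel.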

            gerrit Gerrit Updater added a comment:

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/28066/
            Subject: LU-9731 kmods need to be limited to EL minor release kernel
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 527b2cd8e593b52326519d13418daf34b6b53b0e

            gerrit Gerrit Updater added a comment:

            Brian J. Murrell (brian.murrell@intel.com) uploaded a new patch: https://review.whamcloud.com/28066
            Subject: LU-9731 kmods need to be limited to EL minor release kernel
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: fdf2aca83eed194b218bdb1505f168873c88c8ad

            brian Brian Murrell (Inactive) added a comment:

            dmiter: You need to use an @intel.com account on RH's Bugzilla in order to see bugs filed by Intel folks.

            If you look at comment #23 in the ticket I have opened in their Bugzilla, you can see that the behaviour I've described in this ticket is the actual behaviour they intend for kmods, and that it's a bug in their kmod packaging scripts that not enough "Requires:" are added to a kmod for a matching kernel to be found and installed with the kmod.

            dmiter Dmitry Eremin (Inactive) added a comment:

            A customer can install our kernel modules on the assumption that they are compatible with the latest kernel currently installed. We should not restrict users to the particular kernel version we built for. This was the main reason for introducing weak symbols support.

            P.S. What account can I use to see the Red Hat ticket you mentioned?

            brian Brian Murrell (Inactive) added a comment:

            Why would I install a kernel modules package that I didn't want to use?

            If I install kmod-lustre-client, surely I want to use the Lustre client on that node, and thus I need a kernel that can use the modules.

            dmiter Dmitry Eremin (Inactive) added a comment:

            I don't see the issue with being able to install the package on any system, even one without an appropriate kernel installed. The package will be installed but not used. Why is this an issue? As I mentioned before, the script /sbin/weak-modules is responsible for propagating those modules into compatible kernels only. So how will this package affect an incompatible kernel if those modules are never loaded into that kernel at all?
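
The selection step described in this comment can be modelled roughly as follows. This is a simplified Python model of the decision, not the actual /sbin/weak-modules implementation; the kernel names, symbol names, and checksums are invented for illustration.

```python
def compatible_kernels(module_symbols, kernels):
    """Return the names of installed kernels whose exported symbols match
    every (name, checksum) pair the module depends on."""
    return [
        name
        for name, exported in kernels.items()
        if all(exported.get(sym) == crc for sym, crc in module_symbols.items())
    ]


# Hypothetical module built against a 7.3 kernel.
MODULE = {"stable_alloc": "0x1111", "internal_helper": "0x3333"}

# Hypothetical system that has only a 7.4 kernel installed.
ONLY_7_4 = {
    "3.10.0-693.el7.x86_64": {"stable_alloc": "0x1111", "internal_helper": "0x9999"},
}

# The same system with a 7.3 kernel installed as well.
WITH_7_3 = dict(ONLY_7_4)
WITH_7_3["3.10.0-514.el7.x86_64"] = {"stable_alloc": "0x1111", "internal_helper": "0x3333"}


if __name__ == "__main__":
    print(compatible_kernels(MODULE, ONLY_7_4))
    print(compatible_kernels(MODULE, WITH_7_3))
```

In the first case the list is empty: the package installs, but no installed kernel receives the modules.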

            brian Brian Murrell (Inactive) added a comment:

            dmiter: The reason this is needed is that if you install the kmod-lustre-client RPM on a system where there is no compatible kernel at all, not even one usable via weak-updates (e.g. a RHEL 7.4 system), then you end up with a set of modules that have no matching kernel, so there is no way to even boot a kernel that will use them.

            So, installing the kmod should result in a compatible kernel being present. If that's the kernel that is already installed, then all is fine, but if there is no matching kernel already installed, the Yum transaction doing the kmod installation should install an appropriate kernel also.

            The problem is that currently the kmod-lustre-client RPM doesn't have enough information in it to make sure a compatible kernel is installed. This may be a bug in the Red Hat kmod building tools, as is being explored in a Red Hat ticket, or it may not be. That is still to be determined. Even if it is a bug, it will likely be some time before we get a fix, and we need to work around this issue in the meantime.

            Of course, the other (probably not at all short-term) resolution, as has also been discussed in this ticket, is to get the client kABI compatible. But that is also not likely to happen in the time frame in which this issue needs to be either resolved or worked around.
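
The behaviour asked for in this comment (install a matching kernel only when none is present) can be sketched as a toy resolver in Python. The function names and the package lists are hypothetical, and matching on the leading release number is a simplification of what Yum would do with real versioned Requires.

```python
import re


def series(kernel_vr):
    """Return 'version-N', where N is the leading number of the release field."""
    match = re.match(r"^(\d+(?:\.\d+)*)-(\d+)", kernel_vr)
    if match is None:
        raise ValueError(f"unrecognised kernel version: {kernel_vr!r}")
    return f"{match.group(1)}-{match.group(2)}"


def release_key(kernel_vr):
    """Sort key: the numeric segments of the release field."""
    release = kernel_vr.split("-", 1)[1]
    return [int(n) for n in re.findall(r"\d+", release)]


def kernels_to_install(kmod_series, installed, repo):
    """Return the kernels a transaction would need to add for the kmod."""
    if any(series(k) == kmod_series for k in installed):
        return []  # a compatible kernel is already present
    candidates = [k for k in repo if series(k) == kmod_series]
    if not candidates:
        raise LookupError(f"no kernel in series {kmod_series} available")
    return [max(candidates, key=release_key)]


if __name__ == "__main__":
    installed = ["3.10.0-693.el7"]  # example: a RHEL 7.4 machine
    repo = ["3.10.0-514.el7", "3.10.0-514.26.2.el7", "3.10.0-693.el7"]
    print(kernels_to_install("3.10.0-514", installed, repo))
```

With only the 693 kernel installed, the sketch selects the newest 514-series kernel from the repo; with a 514-series kernel already installed, it adds nothing.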

            People

              brian Brian Murrell (Inactive)
              brian Brian Murrell (Inactive)
              Votes: 0
              Watchers: 11
