Details

    • Technical task
    • Resolution: Fixed
    • Critical
    • Lustre 2.9.0
    • None

    Description

      Right now when RPM packages are built, we insert into Lustre's release field the version string from the kernel against which Lustre was built. For instance:

      $ rpm -qpi lustre-2.7.0-2.6.32_504.8.1.el6_lustre.x86_64.x86_64.rpm 
      Name        : lustre
      Version     : 2.7.0
      Release     : 2.6.32_504.8.1.el6_lustre.x86_64
      

      Side note: Sysadmins are going to think (and have in the past thought) that we messed up because of the ".x86_64.x86_64" in the file name. The reason for it is that the first one is part of the Linux kernel version string, as we can see in the Release field above. The second .x86_64 is Lustre's.
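The doubled architecture suffix is easier to see when the file name is taken apart from the right. The following is only a rough sketch, assuming the conventional name-version-release.arch.rpm file naming; the function and variable names are hypothetical and not part of any Lustre or rpm tooling.

```shell
#!/bin/bash
# Hypothetical helper: split an RPM file name of the form
# name-version-release.arch.rpm into its four parts.
# It works from the right because the release may itself contain dots.
split_rpm_filename() {
    local file=${1##*/}       # drop any leading directory
    local base=${file%.rpm}   # drop the .rpm suffix
    rpm_arch=${base##*.}      # text after the last dot
    local nvr=${base%.*}      # name-version-release
    rpm_release=${nvr##*-}    # text after the last hyphen
    local nv=${nvr%-*}        # name-version
    rpm_version=${nv##*-}
    rpm_name=${nv%-*}
}

split_rpm_filename "lustre-2.7.0-2.6.32_504.8.1.el6_lustre.x86_64.x86_64.rpm"
echo "name=$rpm_name version=$rpm_version release=$rpm_release arch=$rpm_arch"
```

Split this way, the first ".x86_64" stays inside the release (it came from the kernel version string) and only the last one is the package architecture.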

      The reason for including the kernel's version string in Lustre's Release field is that Lustre has traditionally been packaged to work with one, and only one, specific version of a kernel. If you have two very slightly different kernel versions "2.6.32_504.8.1.el6" and "2.6.32_504.8.2.el6", for instance, then you currently need to compile lustre against both kernels individually. While "rpm -q --requires" should also list the specific required version number, there are so many very closely compatible kernels for which we need to juggle lustre builds that it was simpler for sysadmins and developers alike to add the kernel's version string into Lustre's release field.
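To make the one-build-per-kernel coupling concrete, here is a small shell sketch that checks whether a package's Release field corresponds to a given kernel. It assumes the Release field is the kernel release string (as printed by `uname -r`) with each hyphen replaced by an underscore. That mapping is inferred from the example above and is not spelled out in this ticket. The function names and the second kernel string are hypothetical.

```shell
#!/bin/bash
# ASSUMPTION: the Release field is the kernel release string with every
# "-" replaced by "_" (RPM release fields cannot contain hyphens).
kernel_to_release_field() {
    printf '%s\n' "${1//-/_}"
}

# Succeeds when the package release ($1) was derived from the kernel ($2).
package_matches_kernel() {
    [ "$(kernel_to_release_field "$2")" = "$1" ]
}

pkg_release="2.6.32_504.8.1.el6_lustre.x86_64"
for kernel in "2.6.32-504.8.1.el6_lustre.x86_64" "2.6.32-504.8.2.el6_lustre.x86_64"; do
    if package_matches_kernel "$pkg_release" "$kernel"; then
        echo "$kernel: built for this kernel"
    else
        echo "$kernel: needs a separate Lustre build"
    fi
done
```

On a live system the second argument could be "$(uname -r)". Even a minor kernel update fails the comparison, which is the juggling described above.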

      But fortunately, this need to build lustre for every specific kernel is a self-imposed restriction, and work is under way to lift that restriction in LU-5614.

      For many years, it has been possible to compile kernel modules once and then use them with any kernel that is ABI compatible. The Linux distro mechanism that allows this is often called "weak modules". LU-5614 should bring Lustre into the year 2006 and get it working with weak modules.

      Once that is done, we can finally drop the kernel version string.

      This is especially fortuitous for anyone using koji as a build system, because koji makes this sort of abuse of standard packaging practice pretty close to impossible. koji is used by Fedora and its cousins, and it has also been adopted by LLNL for its RHEL-based TOSS distribution.

          Activity

            [LU-7643] Remove kernel version string from Lustre release field

            morrone Christopher Morrone (Inactive) added a comment - edited

            "Not sure how that is properly carried thru with the Provides exported by osd-ldiskfs or the Requires in other lustre rpms."

            I'm not entirely sure what you mean, but I'll take a stab at explaining. If the various other parts are working now, they will continue to work. The "lustre-osd" requirement is currently supplied by either the osd-zfs or osd-ldiskfs packages. One or both of them must be installed in order to install the main "lustre" package. With the proposed new specific kernel requirement in the osd-ldiskfs package, if the osd-zfs package is selected, everything will install fine with any kernel that supplies the required versions of various symbols. If osd-ldiskfs is selected, it can only be installed if the correct kernel is installed.

            Of course, multiple kernels can be installed at the same time, so there is no reason that the admin needs to boot the required kernel. Only that it be installed. But that can already happen now with the packages that contain the kernel version string.

            If you think that is too much of a problem, you are basically arguing that weak modules can't ever be used with lustre.

            bogl Bob Glossman (Inactive) added a comment

            I think it would be an acceptable solution if only the osd-ldiskfs rpm has a Requires for the particular and specific kernel version it was built on. That would directly enforce and tie it to the upstream ext4 version source it was built from. This is only my opinion. I think we need buy-in from all concerned. Would really like to see comment from Minh, Andreas, Dmitry, or other experts.

            Not sure how that is properly carried thru with the Provides exported by osd-ldiskfs or the Requires in other lustre rpms.

            morrone Christopher Morrone (Inactive) added a comment - edited

            Compatibility problems that don't change the ABI won't necessarily be addressed by freshly applying patches and recompiling either. We have had problems in the past where ext4 internal semantics changed without changing the API and without breaking ldiskfs patch application. If you care that much, and since those problems have actually hit, you should probably stop using the ldiskfs-as-patches approach altogether.

            At least with the ldiskfs module fully compiled in the past, we eliminate the problem of overlooking ext4 internal semantic changes for a single version of the packages. The ldiskfs module is going to be in a known good frozen state. It will only be when larger semantic changes happen between the larger OS and filesystems that a recompile will be needed. And hopefully those types of changes are as rare as the issues inherent to the ldiskfs-as-patches approach within a stable OS kernel release series.

            But if folks still insist on ldiskfs being tied to a single kernel, we can certainly do that while also removing the kernel string from the lustre packages. The kernel string does not belong in the lustre package name. That is incorrect packaging and needs to stop.

            The proper way would be to add a Requires to only the osd-ldiskfs subpackage. Is that going to be required to land this patch? It will probably cause us some trouble, but I'm willing to compromise and try adding that.
            bogl Bob Glossman (Inactive) added a comment - edited

            I strongly disagree. LU-684 doesn't eliminate the need for a lustre build with ldiskfs to be tightly tied to a specific Linux kernel version. As long as we build ldiskfs by patching a particular upstream ext4 on the fly during a lustre server build, we are subject to variations in the upstream ext4 that aren't represented by just having a compatible advertised kernel ABI that is constant. Upstream ext4 changes occur unpredictably and in an unscheduled way in RHEL and SLES kernel updates.

            morrone Christopher Morrone (Inactive) added a comment

            Lustre server rpms do not require a patched kernel.

            The only thing that needs a patched kernel at this point is ldiskfs testing. That should be fixed soon in LU-684. If for some reason that fails, we can take the approach suggested by James Simmons in LU-20.
            mdiep Minh Diep added a comment

            Chris,

            While it's fine for the client rpm, where we can move to a different kernel version, it's difficult for the lustre server rpm to move to a different kernel version since it requires a patched kernel. Without the kernel version string in the name, it's not easy to find out the kernel it's built on.

            Thanks
            -Minh

            morrone Christopher Morrone (Inactive) added a comment

            LU-5614 is done. Patch 19954 for this ticket was already based on that ticket's patch, so no rebase should be necessary. We just need to get the review process under way on this one.

            morrone Christopher Morrone (Inactive) added a comment

            With LU-5614 close to landing, I rebased this issue's patch. It is ready to go through the normal review process.

            morrone Christopher Morrone (Inactive) added a comment

            I decided to push change 19954 after all. I just won't assign reviewers until it looks like LU-5614 change 12063 is closer to landing.

            gerrit Gerrit Updater added a comment

            Christopher J. Morrone (morrone2@llnl.gov) uploaded a new patch: http://review.whamcloud.com/19954
            Subject: LU-7643 build: Remove Linux version string from RPM release field
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: b14c0b91f0623f301ea00fc7b2a053912e93c42d

            morrone Christopher Morrone (Inactive) added a comment

            I have the patch for this waiting in the wings. I'm not going to bother pushing it to gerrit yet because it will need frequent refreshes until the LU-5614 patch lands.

            People

              mdiep Minh Diep
              morrone Christopher Morrone (Inactive)
              Votes: 0
              Watchers: 14
