  1. Lustre
  2. LU-17881

Unable to build client in SLES container on RHEL host


Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.16.0
    • None
    • Rocky Linux 8.6, openSUSE Leap 15.5 container

    Description

      Summary

      Building the Lustre client (specifically from lustre.spec) in a SLES 15 container running on a RHEL 8 host fails with kernel module package errors.

      Example environment:

      openSUSE Leap 15.5 Docker container running on a Rocky Linux 8.6 host:

      01d46cba5043:/work # cat /etc/os-release
      NAME="openSUSE Leap"
      VERSION="15.5"
      ID="opensuse-leap"
      ID_LIKE="suse opensuse"
      VERSION_ID="15.5"
      PRETTY_NAME="openSUSE Leap 15.5"
      ANSI_COLOR="0;32"
      CPE_NAME="cpe:/o:opensuse:leap:15.5"
      BUG_REPORT_URL="https://bugs.opensuse.org"
      HOME_URL="https://www.opensuse.org/"
      DOCUMENTATION_URL="https://en.opensuse.org/Portal:Leap"
      LOGO="distributor-logo-Leap"
      
      01d46cba5043:/work # uname -r
      4.18.0-372.32.1.el8_6.x86_64
      
      [root@mawenzi-04 ~]# cat /etc/os-release
      NAME="Rocky Linux"
      VERSION="8.6 (Green Obsidian)"
      ID="rocky"
      ID_LIKE="rhel centos fedora"
      VERSION_ID="8.6"
      PLATFORM_ID="platform:el8"
      PRETTY_NAME="Rocky Linux 8.6 (Green Obsidian)"
      ANSI_COLOR="0;32"
      CPE_NAME="cpe:/o:rocky:rocky:8:GA"
      HOME_URL="https://rockylinux.org/"
      BUG_REPORT_URL="https://bugs.rockylinux.org/"
      ROCKY_SUPPORT_PRODUCT="Rocky Linux"
      ROCKY_SUPPORT_PRODUCT_VERSION="8"
      REDHAT_SUPPORT_PRODUCT="Rocky Linux"
      REDHAT_SUPPORT_PRODUCT_VERSION="8"
      
      [root@mawenzi-04 ~]# uname -r
      4.18.0-372.32.1.el8_6.x86_64

      Here mawenzi-04 is the host and 01d46cba5043 is the container. Both report 4.18.0-372.32.1.el8_6.x86_64 from uname -r because the container shares the host's kernel.
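      Since uname -r inside the container describes the host kernel rather than the kernel sources installed in the container, a quick sanity check before building can flag the mismatch. The following is a minimal sketch, assuming the /usr/src/linux-<version>-obj layout that appears in the rpmbuild attempt later in this report. The helper names list_devel_kernels and running_kernel_has_devel_tree are hypothetical and not part of Lustre or any SUSE package.

```shell
#!/bin/sh
# Hypothetical helper: list kernel versions that have a build (obj) tree
# under the given source root (default: /usr/src, as used in this report).
list_devel_kernels() {
    src_root="${1:-/usr/src}"
    for d in "$src_root"/linux-*-obj; do
        [ -d "$d" ] || continue      # also skips the unexpanded glob
        name="${d##*/}"              # linux-5.14.21-150500.53-obj
        name="${name#linux-}"        # 5.14.21-150500.53-obj
        echo "${name%-obj}"          # 5.14.21-150500.53
    done
}

# Hypothetical helper: succeed only if the given running kernel version
# equals an installed devel version or extends it with a "-<flavor>" suffix.
running_kernel_has_devel_tree() {
    running="$1"
    src_root="${2:-/usr/src}"
    for k in $(list_devel_kernels "$src_root"); do
        case "$running" in
            "$k"|"$k"-*) return 0 ;;
        esac
    done
    return 1
}

if running_kernel_has_devel_tree "$(uname -r)"; then
    echo "running kernel matches an installed devel tree"
else
    echo "running kernel $(uname -r) has no matching devel tree; set the kernel version explicitly"
fi
```

      In the failing setups described here, the check would take the second branch: the running kernel is the el8 host kernel, while the only devel tree in the container is for 5.14.21-150500.53.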

      Problem Description

      Building the Lustre client with ./configure ... && make rpms fails during the make step with the following errors:

      #17 3361.0 RPM build errors:
      #17 3361.0     Macro %flavors_to_build needs whitespace before body
      #17 3361.0     Macro %_suse_kernel_module_subpackage defined but not used within scope
      #17 3361.0     Installed (but unpackaged) file(s) found:
      #17 3361.0    /lib/modules/5.14.21-150500.53-default/updates/lustre-client/fs/fid.ko
      #17 3361.0    /lib/modules/5.14.21-150500.53-default/updates/lustre-client/fs/fld.ko
      #17 3361.0    /lib/modules/5.14.21-150500.53-default/updates/lustre-client/fs/lmv.ko
      #17 3361.0    /lib/modules/5.14.21-150500.53-default/updates/lustre-client/fs/lov.ko
      #17 3361.0    /lib/modules/5.14.21-150500.53-default/updates/lustre-client/fs/lustre.ko
      #17 3361.0    /lib/modules/5.14.21-150500.53-default/updates/lustre-client/fs/mdc.ko
      #17 3361.0    /lib/modules/5.14.21-150500.53-default/updates/lustre-client/fs/mgc.ko
      #17 3361.0    /lib/modules/5.14.21-150500.53-default/updates/lustre-client/fs/obdclass.ko
      #17 3361.0    /lib/modules/5.14.21-150500.53-default/updates/lustre-client/fs/obdecho.ko
      #17 3361.0    /lib/modules/5.14.21-150500.53-default/updates/lustre-client/fs/osc.ko
      #17 3361.0    /lib/modules/5.14.21-150500.53-default/updates/lustre-client/fs/ptlrpc.ko
      #17 3361.0    /lib/modules/5.14.21-150500.53-default/updates/lustre-client/net/ko2iblnd.ko
      #17 3361.0    /lib/modules/5.14.21-150500.53-default/updates/lustre-client/net/ksocklnd.ko
      #17 3361.0    /lib/modules/5.14.21-150500.53-default/updates/lustre-client/net/libcfs.ko
      #17 3361.0    /lib/modules/5.14.21-150500.53-default/updates/lustre-client/net/lnet.ko
      #17 3361.0    /lib/modules/5.14.21-150500.53-default/updates/lustre-client/net/lnet_selftest.ko 

      I've tested this with several host/container combinations:

      • openSUSE Leap 15.3 host OS, building in openSUSE Leap 15.5 container
        • No problems. Everything builds fine using ./configure ... && make rpms.
      • AlmaLinux 8.5 host OS, building in openSUSE Leap 15.5 container
        • Kernel module package macros produced RPM build errors and failed the build.
      • Rocky Linux 8.6 host OS, building in openSUSE Leap 15.5 container
        • Kernel module package macros produced RPM build errors and failed the build.

      I also tried running rpmbuild on lustre.spec directly, instead of ./configure ... && make rpms, so that I could define variables such as kver, kdir, and kobjdir explicitly:

      KERNEL_VERSION="5.14.21-150500.53"
      LINUX_DIR=$(ls -d /usr/src/linux-$KERNEL_VERSION)
      LINUX_OBJ_DIR=$(ls -d /usr/src/linux-$KERNEL_VERSION-obj/x86_64/default)
      RPMBUILD_DIR="/work/rpmbuild"
      
      rpmbuild \
        --without mpi \
        --without servers \
        --define "_topdir $RPMBUILD_DIR" \
        --define "kobjdir $LINUX_OBJ_DIR" \
        --define "kver $KERNEL_VERSION" \
        --define "kversion $KERNEL_VERSION" \
        --define "kdir $LINUX_DIR" \
        --define "_with_lnet_dlc lnet_dlc" \
        --define "configure_args $CONFIGURE_ARGS" \
        -ba lustre.spec 

      This also failed in the same way.

      The issue appears to stem from _flavor being set incorrectly and then passed to the kernel_module_package macro (defined in /usr/lib/rpm/macros.d/macros.kernel-source). kernel_module_package is responsible for defining flavors_to_build and for emitting the lustre-client-kmp-<flavor> %package section for each flavor being built. When it works, the output looks something like this:

      %package -n lustre-client-kmp-default
      Version: %version_k5.14.21_150500.53
      Release: %release
      Summary: %summary
      Provides: lustre-client-kmp = %version_k5.14.21_150500.53
      Provides: lustre-client-kmp = %version
      Provides: multiversion(kernel)
      Provides: lustre-client-kmp-default-k5.14.21_150500.53
      Requires: coreutils grep
      Requires(pre):  suse-kernel-rpm-scriptlets
      Requires(post): suse-kernel-rpm-scriptlets
      Requires:       suse-kernel-rpm-scriptlets
      Requires(preun): suse-kernel-rpm-scriptlets
      Requires(postun): suse-kernel-rpm-scriptlets 

      However, when _flavor does not name a directory that exists under /usr/src/linux-obj/<arch>, the macro fails. For example, passing "sandwich" as a flavor to build produces these errors:

      b8f2b51e2233:~ # rpm --eval '%kernel_module_package -n lustre-client -p /work/rpmbuild/SOURCES/kmp-lustre.preamble  -f /work/rpmbuild/SOURCES/kmp-lustre.files sandwich' --define 'flavors_to_build sandwich default' 
      warning: Macro %flavors_to_build needs whitespace before body
      warning: Macro %_suse_kernel_module_subpackage defined but not used within scope
      
      	
      %internal_kmp_error
      %package -n lustre-client-kmp-_dummy_
      Version: %version
      Summary: %summary
      Group: %group
      %description -n lustre-client-kmp-_dummy_
      b8f2b51e2233:~ #  
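      The condition described above (the flavor must exist under /usr/src/linux-obj/<arch>) can be checked directly before invoking the macro. This is a minimal sketch of such a pre-check and only tests for the directory; it does not reproduce the internal logic of kernel_module_package. The function name flavor_exists and its argument order are hypothetical.

```shell
#!/bin/sh
# Hypothetical pre-check: does a flavor directory exist under <obj_root>/<arch>?
# obj_root defaults to /usr/src/linux-obj, the location named in this report;
# arch defaults to the machine architecture reported by uname -m.
flavor_exists() {
    flavor="$1"
    arch="${2:-$(uname -m)}"
    obj_root="${3:-/usr/src/linux-obj}"
    [ -n "$flavor" ] && [ -d "$obj_root/$arch/$flavor" ]
}

for f in default sandwich 150500.53; do
    if flavor_exists "$f"; then
        echo "$f: flavor directory found"
    else
        echo "$f: no flavor directory"
    fi
done
```

      Running a check like this against the value that ends up in _flavor would have pointed at 150500.53 immediately, before the macro degraded to the _dummy_ package.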

      This brings us to how _flavor was being defined in lustre.spec:

      %if 0%{?suse_version} >= 1310 && %{defined _take_kobj}
          %global _flavor %(echo %{_kver} | sed -e 's/^.*-//')
      %else
          %global _flavor default
      %endif

      With debugging print statements added to the %build section, we can see that _flavor was set to the release component of the kernel version instead of a flavor name:

      # CALEB_DEBUG
      echo "CALEB_DEBUG"
      echo "_flavor: %{_flavor}"
      echo "_kver: %{_kver}"

      This prints:

      CALEB_DEBUG
      _flavor: 150500.53
      _kver: 5.14.21-150500.53-default

      150500.53 is not a valid flavor.

      _flavor has only one computed definition in the file, so given the _kver value printed above it should have evaluated as:

      echo %{_kver} | sed -e 's/^.*-//'

      which produces:

      [root@mawenzi-04 ~]# echo "5.14.21-150500.53-default" | sed -e 's/^.*-//'
      default 

      This means that _kver did not yet carry the "-default" suffix when _flavor was defined. Indeed, _kver only receives its "-default" suffix in a block that comes after the _flavor definition:

      %if %{defined _take_kver}
      # as an alternative to this implementation we could simply "make -C $kdir kernelversion"
      %global kver %(files="include/generated/utsrelease.h include/linux/utsrelease.h include/linux/version.h"; for f in $files; do if test -r %{kobjdir}/$f && grep UTS_RELEASE %{kobjdir}/$f >/dev/null; then sed -ne '/^#define UTS_RELEASE/s/.*"\\(.*\\)"$/\\1/p' %{kobjdir}/$f; break; fi; done)
      %define _kver %kver
      %endif

      After this block runs, _kver is 5.14.21-150500.53-default.
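      The ordering problem can be reproduced outside rpmbuild. The sketch below reuses the two sed expressions from the spec excerpts in this report, assuming the doubled backslashes in the spec are RPM escaping for what the shell sees as single backslashes. The function names read_uts_release and derive_flavor are hypothetical, and the header is a throwaway file rather than a real kernel tree.

```shell
#!/bin/sh
# Same sed expression as the spec's "%global kver %(...)" shell-out,
# written with single backslashes for direct use in a shell.
read_uts_release() {
    sed -ne '/^#define UTS_RELEASE/s/.*"\(.*\)"$/\1/p' "$1"
}

# Same expression the spec uses to compute _flavor from _kver:
# strip everything up to and including the last hyphen.
derive_flavor() {
    echo "$1" | sed -e 's/^.*-//'
}

# Throwaway header whose content mirrors the final _kver shown in this report.
hdr=$(mktemp)
echo '#define UTS_RELEASE "5.14.21-150500.53-default"' > "$hdr"

early_kver="5.14.21-150500.53"         # _kver before utsrelease.h is read
late_kver=$(read_uts_release "$hdr")   # _kver after the %global kver block

echo "flavor from early _kver: $(derive_flavor "$early_kver")"   # 150500.53
echo "flavor from late _kver:  $(derive_flavor "$late_kver")"    # default
rm -f "$hdr"
```

      The first line reproduces the bad value from the debug output, and the second gives the expected flavor, which matches the explanation that only the evaluation order is wrong, not the expression itself.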

      Solution

      The fix I've tested is to define _flavor only after _kver has been fully defined:

      %if %{defined _take_kver}
      # as an alternative to this implementation we could simply "make -C $kdir kernelversion"
      %global kver %(files="include/generated/utsrelease.h include/linux/utsrelease.h include/linux/version.h"; for f in $files; do if test -r %{kobjdir}/$f && grep UTS_RELEASE %{kobjdir}/$f >/dev/null; then sed -ne '/^#define UTS_RELEASE/s/.*"\\(.*\\)"$/\\1/p' %{kobjdir}/$f; break; fi; done)
      %define _kver %kver
      %endif
      
      %if 0%{?suse_version} >= 1310 && %{defined _take_kobj}
              %global _flavor %(echo %{_kver} | sed -e 's/^.*-//')
      %else
              %global _flavor default
      %endif 

      With this change the build completes without errors and produces the RPMs correctly.
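      Because the fix depends purely on the order of two blocks in the spec, a small textual check could guard against the ordering regressing. This is only a sketch: it matches the literal lines "%define _kver %kver" and "%global _flavor %(" as they appear in the excerpts in this report, and would need adjusting if the real spec words them differently. The function name flavor_defined_after_kver is hypothetical.

```shell
#!/bin/sh
# Hypothetical lint: succeed only if the computed _flavor definition appears
# after the line that assigns _kver from kver in the given spec file.
flavor_defined_after_kver() {
    spec_file="$1"
    kver_line=$(grep -n '^[[:space:]]*%define _kver %kver' "$spec_file" | head -n 1 | cut -d: -f1)
    flavor_line=$(grep -n '^[[:space:]]*%global _flavor %(' "$spec_file" | head -n 1 | cut -d: -f1)
    [ -n "$kver_line" ] && [ -n "$flavor_line" ] && [ "$flavor_line" -gt "$kver_line" ]
}

# Demo on a throwaway two-line spec fragment in the corrected order.
demo_spec=$(mktemp)
cat > "$demo_spec" <<'EOF'
%define _kver %kver
%global _flavor %(echo %{_kver} | sed -e 's/^.*-//')
EOF

if flavor_defined_after_kver "$demo_spec"; then
    echo "ordering OK"
else
    echo "_flavor is computed before _kver is final"
fi
rm -f "$demo_spec"
```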

      I'll be uploading a patch with this change shortly.

          People

            carlson Caleb Carlson