Details
-
Bug
-
Resolution: Fixed
-
Minor
-
None
-
Rocky Linux 8.6, openSUSE Leap 15.5 container
Description
Summary
Building the Lustre client (specifically from lustre.spec) in a SLES 15 container running on a RHEL 8 host fails with kernel module package errors.
Example environment:
openSUSE Leap 15.5 Docker container running on a Rocky Linux 8.6 host:
01d46cba5043:/work # cat /etc/os-release
NAME="openSUSE Leap"
VERSION="15.5"
ID="opensuse-leap"
ID_LIKE="suse opensuse"
VERSION_ID="15.5"
PRETTY_NAME="openSUSE Leap 15.5"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:opensuse:leap:15.5"
BUG_REPORT_URL="https://bugs.opensuse.org"
HOME_URL="https://www.opensuse.org/"
DOCUMENTATION_URL="https://en.opensuse.org/Portal:Leap"
LOGO="distributor-logo-Leap"
01d46cba5043:/work # uname -r
4.18.0-372.32.1.el8_6.x86_64

[root@mawenzi-04 ~]# cat /etc/os-release
NAME="Rocky Linux"
VERSION="8.6 (Green Obsidian)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="8.6"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Rocky Linux 8.6 (Green Obsidian)"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:rocky:rocky:8:GA"
HOME_URL="https://rockylinux.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
ROCKY_SUPPORT_PRODUCT="Rocky Linux"
ROCKY_SUPPORT_PRODUCT_VERSION="8"
REDHAT_SUPPORT_PRODUCT="Rocky Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8"
[root@mawenzi-04 ~]# uname -r
4.18.0-372.32.1.el8_6.x86_64
In this case, mawenzi-04 is the host and 01d46cba5043 is the container; both share the 4.18.0-372.32.1.el8_6.x86_64 kernel.
Problem Description
Building the Lustre client with ./configure ... && make rpms fails with the following errors during the make step:
#17 3361.0 RPM build errors:
#17 3361.0 Macro %flavors_to_build needs whitespace before body
#17 3361.0 Macro %_suse_kernel_module_subpackage defined but not used within scope
#17 3361.0 Installed (but unpackaged) file(s) found:
#17 3361.0 /lib/modules/5.14.21-150500.53-default/updates/lustre-client/fs/fid.ko
#17 3361.0 /lib/modules/5.14.21-150500.53-default/updates/lustre-client/fs/fld.ko
#17 3361.0 /lib/modules/5.14.21-150500.53-default/updates/lustre-client/fs/lmv.ko
#17 3361.0 /lib/modules/5.14.21-150500.53-default/updates/lustre-client/fs/lov.ko
#17 3361.0 /lib/modules/5.14.21-150500.53-default/updates/lustre-client/fs/lustre.ko
#17 3361.0 /lib/modules/5.14.21-150500.53-default/updates/lustre-client/fs/mdc.ko
#17 3361.0 /lib/modules/5.14.21-150500.53-default/updates/lustre-client/fs/mgc.ko
#17 3361.0 /lib/modules/5.14.21-150500.53-default/updates/lustre-client/fs/obdclass.ko
#17 3361.0 /lib/modules/5.14.21-150500.53-default/updates/lustre-client/fs/obdecho.ko
#17 3361.0 /lib/modules/5.14.21-150500.53-default/updates/lustre-client/fs/osc.ko
#17 3361.0 /lib/modules/5.14.21-150500.53-default/updates/lustre-client/fs/ptlrpc.ko
#17 3361.0 /lib/modules/5.14.21-150500.53-default/updates/lustre-client/net/ko2iblnd.ko
#17 3361.0 /lib/modules/5.14.21-150500.53-default/updates/lustre-client/net/ksocklnd.ko
#17 3361.0 /lib/modules/5.14.21-150500.53-default/updates/lustre-client/net/libcfs.ko
#17 3361.0 /lib/modules/5.14.21-150500.53-default/updates/lustre-client/net/lnet.ko
#17 3361.0 /lib/modules/5.14.21-150500.53-default/updates/lustre-client/net/lnet_selftest.ko
I've tested this in multiple host/container combinations:
- openSUSE Leap 15.3 host OS, building in openSUSE Leap 15.5 container
  - There were no problems here. Everything builds fine using ./configure ... && make rpms.
- AlmaLinux 8.5 host OS, building in openSUSE Leap 15.5 container
  - Kernel module package macros produced RPM build errors and failed the build.
- Rocky Linux 8.6 host OS, building in openSUSE Leap 15.5 container
  - Kernel module package macros produced RPM build errors and failed the build.
I tried replacing ./configure ... && make rpms with a direct rpmbuild of lustre.spec, so that I could define variables like kver, kdir, and kobjdir:
KERNEL_VERSION="5.14.21-150500.53"
LINUX_DIR=$(ls -d /usr/src/linux-$KERNEL_VERSION)
LINUX_OBJ_DIR=$(ls -d /usr/src/linux-$KERNEL_VERSION-obj/x86_64/default)
RPMBUILD_DIR="/work/rpmbuild"
rpmbuild \
    --without mpi \
    --without servers \
    --define "_topdir $RPMBUILD_DIR" \
    --define "kobjdir $LINUX_OBJ_DIR" \
    --define "kver $KERNEL_VERSION" \
    --define "kversion $KERNEL_VERSION" \
    --define "kdir $LINUX_DIR" \
    --define "_with_lnet_dlc lnet_dlc" \
    --define "configure_args $CONFIGURE_ARGS" \
    -ba lustre.spec
This also failed in the same way.
The issue seems to stem from _flavor not being set properly before it is passed to the kernel_module_package macro (residing in /usr/lib/rpm/macros.d/macros.kernel-source). kernel_module_package is responsible for defining flavors_to_build and for emitting the lustre-client-kmp-<flavor> %package section for each flavor being built. When it works, the output looks something like this:
%package -n lustre-client-kmp-default
Version: %version_k5.14.21_150500.53
Release: %release
Summary: %summary
Provides: lustre-client-kmp = %version_k5.14.21_150500.53
Provides: lustre-client-kmp = %version
Provides: multiversion(kernel)
Provides: lustre-client-kmp-default-k5.14.21_150500.53
Requires: coreutils grep
Requires(pre): suse-kernel-rpm-scriptlets
Requires(post): suse-kernel-rpm-scriptlets
Requires: suse-kernel-rpm-scriptlets
Requires(preun): suse-kernel-rpm-scriptlets
Requires(postun): suse-kernel-rpm-scriptlets
However, when _flavor is not something that exists under /usr/src/linux-obj/<arch>, the macro fails. For example, if we pass in "sandwich" as a flavor to build, we get these errors:
b8f2b51e2233:~ # rpm --eval '%kernel_module_package -n lustre-client -p /work/rpmbuild/SOURCES/kmp-lustre.preamble -f /work/rpmbuild/SOURCES/kmp-lustre.files sandwich' --define 'flavors_to_build sandwich default'
warning: Macro %flavors_to_build needs whitespace before body
warning: Macro %_suse_kernel_module_subpackage defined but not used within scope
%internal_kmp_error
%package -n lustre-client-kmp-_dummy_
Version: %version
Summary: %summary
Group: %group
%description -n lustre-client-kmp-_dummy_
b8f2b51e2233:~ #
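As a quick sanity check before invoking the macro, one could test whether a candidate flavor has a directory under the kernel object tree. The following is only a sketch: flavor_exists is a hypothetical helper, not part of lustre.spec or the SUSE macros, and it assumes that the <arch> directory name matches the output of uname -m.

```shell
#!/bin/sh
# Hypothetical helper (not part of lustre.spec or the SUSE macros):
# succeeds if <objroot>/<arch>/<flavor> is a directory.
# Assumption: the <arch> directory name matches `uname -m`.
flavor_exists() {
    flavor=$1
    arch=${2:-$(uname -m)}
    objroot=${3:-/usr/src/linux-obj}
    [ -d "$objroot/$arch/$flavor" ]
}

# Report on one plausible flavor and one bogus flavor.
for f in default sandwich; do
    if flavor_exists "$f"; then
        echo "$f: found"
    else
        echo "$f: not found"
    fi
done
```

The second and third arguments are optional so that the same check can be pointed at a different architecture or a different object tree.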
This brings us to how _flavor was being defined in lustre.spec:
%if 0%{?suse_version} >= 1310 && %{defined _take_kobj}
    %global _flavor %(echo %{_kver} | sed -e 's/^.*-//')
%else
    %global _flavor default
%endif
After adding debugging print statements to the %build section, we see that _flavor was being set to the kernel patch version:
# CALEB_DEBUG
echo "CALEB_DEBUG"
echo "_flavor: %{_flavor}"
echo "_kver: %{_kver}"
...ends up printing:
CALEB_DEBUG
_flavor: 150500.53
_kver: 5.14.21-150500.53-default
Obviously, 150500.53 is not a valid flavor.
By its only definition in the file, we can see that it should have been:
echo %{_kver} | sed -e 's/^.*-//'
which produces:
[root@mawenzi-04 ~]# echo "5.14.21-150500.53-default" | sed -e 's/^.*-//'
default
This means that _kver did not yet have the "-default" suffix at the time _flavor was defined. It is only after _flavor gets defined that _kver receives its "-default" suffix:
%if %{defined _take_kver}
# as an alternative to this implementation we could simply "make -C $kdir kernelversion"
%global kver %(files="include/generated/utsrelease.h include/linux/utsrelease.h include/linux/version.h"; for f in $files; do if test -r %{kobjdir}/$f && grep UTS_RELEASE %{kobjdir}/$f >/dev/null; then sed -ne '/^#define UTS_RELEASE/s/.*"\\(.*\\)"$/\\1/p' %{kobjdir}/$f; break; fi; done)
%define _kver %kver
%endif
After which, _kver == 5.14.21-150500.53-default.
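The effect of the ordering can be reproduced outside of rpmbuild by running the same sed expression on both values of the kernel version. This is a minimal sketch; flavor_from_kver is a hypothetical wrapper introduced only for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper around the sed expression that lustre.spec
# uses to derive _flavor: strip everything up to the last hyphen.
flavor_from_kver() {
    echo "$1" | sed -e 's/^.*-//'
}

# _kver as it stood when _flavor was evaluated (no flavor suffix yet)
flavor_from_kver "5.14.21-150500.53"          # prints 150500.53

# _kver after the _take_kver block has run
flavor_from_kver "5.14.21-150500.53-default"  # prints default
```

Because the pattern is greedy, it always keeps the text after the last hyphen, so the result is only a flavor name when the suffix is already present.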
Solution
The solution that I've tested is to simply wait to define _flavor until after _kver has been fully defined:
%if %{defined _take_kver}
# as an alternative to this implementation we could simply "make -C $kdir kernelversion"
%global kver %(files="include/generated/utsrelease.h include/linux/utsrelease.h include/linux/version.h"; for f in $files; do if test -r %{kobjdir}/$f && grep UTS_RELEASE %{kobjdir}/$f >/dev/null; then sed -ne '/^#define UTS_RELEASE/s/.*"\\(.*\\)"$/\\1/p' %{kobjdir}/$f; break; fi; done)
%define _kver %kver
%endif

%if 0%{?suse_version} >= 1310 && %{defined _take_kobj}
    %global _flavor %(echo %{_kver} | sed -e 's/^.*-//')
%else
    %global _flavor default
%endif
With this change, the build completes without errors and writes the RPMs correctly.
I'll be uploading a patch with this change shortly.