[LU-16050] ofed_info does not show mlnx-ofed-kernel-modules Created: 27/Jul/22 Updated: 26/Oct/22 Resolved: 17/Sep/22 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.16.0, Lustre 2.15.1 |
| Fix Version/s: | Lustre 2.16.0, Lustre 2.15.2 |
| Type: | Bug | Priority: | Major |
| Reporter: | Jian Yu | Assignee: | Jian Yu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
After installing MLNX_OFED by running the mlnxofedinstall command, I found that the mlnx-ofed-kernel-modules package was not listed in the output of ofed_info:
# tar xzf MLNX_OFED_LINUX-5.6-2.0.9.0-ubuntu22.04-x86_64.tgz
# cd MLNX_OFED_LINUX-5.6-2.0.9.0-ubuntu22.04-x86_64/
# ./mlnxofedinstall --add-kernel-support --all --force
# /etc/init.d/openibd restart
# dpkg -S /usr/src/ofa_kernel/x86_64/5.15.0-41-generic/
mlnx-ofed-kernel-modules: /usr/src/ofa_kernel/x86_64/5.15.0-41-generic
# ofed_info | awk '{print $2}' | grep mlnx-ofed
mlnx-ofed-kernel-utils
There is no mlnx-ofed-kernel-modules in the output, which caused Lustre configure to hit the following error:
checking whether to use Compat RDMA... /usr/bin/ofed_info
dpkg-query: error: --listfiles needs at least one package name argument
The relevant code is in lnet/autoconf/lustre-lnet.m4:
case $with_o2ib in
yes)
    AS_IF([which ofed_info 2>/dev/null], [
        AS_IF([test x$uses_dpkg = xyes], [
            OFED_INFO="ofed_info | awk '{print \[$]2}'"
            LSPKG="dpkg --listfiles"
        ], [
            OFED_INFO="ofed_info"
            LSPKG="rpm -ql"
        ])
        O2IBPATHS=$(eval $OFED_INFO | egrep -w 'mlnx-ofed-kernel-dkms|mlnx-ofa_kernel-devel|compat-rdma-devel|kernel-ib-devel|ofa_kernel-devel' | xargs $LSPKG | grep -v 'ofa_kernel-' | grep rdma_cm.h | sed 's/\/include\/rdma\/rdma_cm.h//') |
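The failure mode described above can be reproduced without an OFED installation. The sketch below is an illustration only: the package lists are stand-ins built from the two names that appear in this report, and the extended pattern is a hypothetical variation, not necessarily what the patch for this ticket does. It shows that the alternation from lustre-lnet.m4 matches nothing in the reported ofed_info output, so the pipeline hands `xargs` an empty list, and that GNU xargs still runs the command once in that case, which is consistent with the "needs at least one package name argument" error.

```shell
#!/bin/sh
# Second column of `ofed_info` as shown in the report: only the utils
# package is listed, the modules package is missing.
ofed_pkgs='mlnx-ofed-kernel-utils'

# Hypothetical stand-in for the installed package list; both names are
# taken from the report (dpkg -S shows the modules package is installed).
installed_pkgs='mlnx-ofed-kernel-utils
mlnx-ofed-kernel-modules'

# Package-name alternation copied from lnet/autoconf/lustre-lnet.m4.
pattern='mlnx-ofed-kernel-dkms|mlnx-ofa_kernel-devel|compat-rdma-devel|kernel-ib-devel|ofa_kernel-devel'

# Current lookup: no candidate package survives the filter.
old_match=$(printf '%s\n' "$ofed_pkgs" | grep -Ew "$pattern" || true)
echo "current pattern matched: '${old_match}'"

# GNU xargs (as on Ubuntu) runs its command once even on empty input
# unless -r is given, so $LSPKG would be called with no package name.
printf '%s' "$old_match" | xargs echo "LSPKG would be invoked with:"

# Assumption for illustration: also accept mlnx-ofed-kernel-modules and
# search a list that actually contains it.
new_match=$(printf '%s\n' "$installed_pkgs" |
    grep -Ew "$pattern|mlnx-ofed-kernel-modules" || true)
echo "extended pattern matched: '${new_match}'"
```

Passing `-r` to GNU xargs would suppress the empty invocation, but configure would still end up with an empty O2IBPATHS, so the package lookup itself is what needs to find the headers.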
| Comments |
| Comment by Gerrit Updater [ 27/Jul/22 ] |
|
"Jian Yu <yujian@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/48047 |
| Comment by Jian Yu [ 27/Jul/22 ] |
|
Hi nathand,
The fix needs to be made in debian/dkms.conf.in. I will look into the details to see why the author created that file with hard-coded params. |
| Comment by Nathan Dauchy [ 31/Jul/22 ] |
|
Jian,
Patch set 2 does correctly find the IB headers without needing to specify "--with-o2ib", and it works both for the initial ./configure and for "make dkms-debs -j". This was tested with the tarball Patrick provided in NVDA-149, more or less master.
./configure --disable-dependency-tracking --with-linux=/usr/src/linux-headers-$(uname -r) --disable-snmp --enable-quota --disable-server --without-zfs --disable-ldiskfs --disable-gss --disable-crypto
checking whether to use Compat RDMA... /usr/bin/ofed_info
yes
checking whether to use any OFED backport headers... no
checking whether to enable OpenIB gen2 support... yes
configure: adding /usr/src/ofa_kernel/x86_64/5.15.0-40-generic/Module.symvers to Symbol Path
When installing the resulting packages and triggering the DKMS build, everything seemed to finish compiling fine, modules loaded, and o2ib lnet pings worked. Looks good! Thanks, |
| Comment by Jian Yu [ 01/Aug/22 ] |
|
Thank you for verifying, Nathan. |
| Comment by Gerrit Updater [ 17/Sep/22 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/48047/ |
| Comment by Peter Jones [ 17/Sep/22 ] |
|
Landed for 2.16 |
| Comment by Gerrit Updater [ 19/Sep/22 ] |
|
"Jian Yu <yujian@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/48592 |
| Comment by Gerrit Updater [ 26/Oct/22 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/48592/ |