ofed_info does not show mlnx-ofed-kernel-modules

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.16.0, Lustre 2.15.2
    • Affects Version/s: Lustre 2.16.0, Lustre 2.15.1
    • Labels: None
    • Severity: 3

    Description

      After installing MLNX_OFED with the mlnxofedinstall command, I found that the mlnx-ofed-kernel-modules package was not listed in the output of ofed_info:

      # tar xzf MLNX_OFED_LINUX-5.6-2.0.9.0-ubuntu22.04-x86_64.tgz 
      # cd MLNX_OFED_LINUX-5.6-2.0.9.0-ubuntu22.04-x86_64/
      # ./mlnxofedinstall --add-kernel-support --all --force
      # /etc/init.d/openibd restart
      
      # dpkg -S /usr/src/ofa_kernel/x86_64/5.15.0-41-generic/
      mlnx-ofed-kernel-modules: /usr/src/ofa_kernel/x86_64/5.15.0-41-generic
      
      # ofed_info | awk '{print $2}' | grep mlnx-ofed
      mlnx-ofed-kernel-utils
      

      There is no mlnx-ofed-kernel-modules in the output, which causes the Lustre configure script to hit the following error:

      checking whether to use Compat RDMA... /usr/bin/ofed_info
      dpkg-query: error: --listfiles needs at least one package name argument
      

      The relevant code is in lnet/autoconf/lustre-lnet.m4:

      case $with_o2ib in
              yes)    AS_IF([which ofed_info 2>/dev/null], [
                              AS_IF([test x$uses_dpkg = xyes], [
                                      OFED_INFO="ofed_info | awk '{print \[$]2}'"
                                      LSPKG="dpkg --listfiles"
                              ], [
                                      OFED_INFO="ofed_info"
                                      LSPKG="rpm -ql"
                              ])
                              O2IBPATHS=$(eval $OFED_INFO |
                                          egrep -w 'mlnx-ofed-kernel-dkms|mlnx-ofa_kernel-devel|compat-rdma-devel|kernel-ib-devel|ofa_kernel-devel' |
                                          xargs $LSPKG | grep -v 'ofa_kernel-' | grep rdma_cm.h | sed 's/\/include\/rdma\/rdma_cm.h//')
      
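A minimal sketch of why the pipeline in the excerpt ends in that dpkg-query error, assuming GNU xargs: when the egrep filter matches nothing, xargs still runs its command once with no arguments unless it is given -r (--no-run-if-empty). The helper names `lspkg_stub`, `unguarded` and `guarded` are hypothetical, and the stub stands in for `dpkg --listfiles` so the sketch runs on a machine without dpkg.

```shell
#!/bin/sh
# Hypothetical stand-in for "dpkg --listfiles": prints how many package
# names it received. Extra arguments (such as -r) are passed to xargs.
lspkg_stub() {
    xargs "$@" sh -c 'echo "packages=$#"' sh
}

# Package list as in the report: only mlnx-ofed-kernel-utils is listed.
pkgs='mlnx-ofed-kernel-utils'

# Unguarded, as in the excerpt: if the filter ($1) matches nothing,
# GNU xargs still invokes the command once, with zero package names.
unguarded() {
    printf '%s\n' "$pkgs" | grep -E -w "$1" | lspkg_stub
}

# Guarded: -r tells xargs to skip the command when its input is empty.
guarded() {
    printf '%s\n' "$pkgs" | grep -E -w "$1" | lspkg_stub -r
}
```

Calling `unguarded 'mlnx-ofed-kernel-dkms|mlnx-ofa_kernel-devel'` reproduces the empty invocation on GNU xargs, while the `guarded` variant prints nothing. This only illustrates the failure mode; it is not the change that was merged for this ticket.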

        Activity

          [LU-16050] ofed_info does not show mlnx-ofed-kernel-modules

          gerrit Gerrit Updater added a comment -

          "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/48592/
          Subject: LU-16050 build: replace ofed_info with dpkg/rpm
          Project: fs/lustre-release
          Branch: b2_15
          Current Patch Set:
          Commit: 3c8812e6d364829c4faa78fe02feda755c83164a

          gerrit Gerrit Updater added a comment -

          "Jian Yu <yujian@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/48592
          Subject: LU-16050 build: replace ofed_info with dpkg/rpm
          Project: fs/lustre-release
          Branch: b2_15
          Current Patch Set: 1
          Commit: 1d7be9d8e70ca84b92cb59480b62eb2cc0ce0424

          pjones Peter Jones added a comment -

          Landed for 2.16

          gerrit Gerrit Updater added a comment -

          "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/48047/
          Subject: LU-16050 build: replace ofed_info with dpkg/rpm
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: 3a7930e63c15b0fbe51ac73db81a1186939115bb

          yujian Jian Yu added a comment -

          Thank you for verifying, Nathan.

          nathand Nathan Dauchy added a comment - edited

          Jian,

          Patch set 2 does correctly find the IB headers without needing to specify "--with-o2ib", and it works both for the initial ./configure and for "make dkms-debs -j".  This was tested with the tarball Patrick provided in NVDA-149, more or less master.

          ./configure --disable-dependency-tracking --with-linux=/usr/src/linux-headers-$(uname -r) --disable-snmp --enable-quota --disable-server --without-zfs --disable-ldiskfs --disable-gss --disable-crypto
          checking whether to use Compat RDMA... /usr/bin/ofed_info
          yes
          checking whether to use any OFED backport headers... no
          checking whether to enable OpenIB gen2 support... yes
          configure: adding /usr/src/ofa_kernel/x86_64/5.15.0-40-generic/Module.symvers to Symbol Path
          

          When installing the resulting packages and triggering the DKMS build, everything seemed to finish compiling fine, modules loaded, and o2ib lnet pings worked. Looks good!

          Thanks,
          Nathan

          yujian Jian Yu added a comment -

          Hi nathand,
          Could you please try the latest patch set 2 of https://review.whamcloud.com/48047?
          It works on my node with mlnx-ofed-kernel-dkms installed.
          The path can be detected now and there is no need to specify it with "--with-o2ib".

          Additionally, another fix would be to have the "make dkms-debs" actually honor the original "./configure" params.

          That fix needs to be made in debian/dkms.conf.in. I will look into why the author created that file with hard-coded params.
          Until I make those changes, please try editing that file to adjust the configure params for the dkms package.
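As a rough illustration of the approach named in the patch subject (asking dpkg or rpm directly instead of parsing ofed_info), the sketch below lists package names from the package manager and keeps the OFED kernel packages. It is an assumption-laden sketch, not the merged change: the function names `list_installed_pkgs` and `filter_o2ib_pkgs` are hypothetical, and adding mlnx-ofed-kernel-modules to the name list is a guess based on the description in this ticket.

```shell
#!/bin/sh
# Candidate package names: the list from the m4 excerpt plus
# mlnx-ofed-kernel-modules (an assumption; the real patch may differ).
O2IB_PKG_RE='^(mlnx-ofed-kernel-dkms|mlnx-ofed-kernel-modules|mlnx-ofa_kernel-devel|compat-rdma-devel|kernel-ib-devel|ofa_kernel-devel)$'

# Print package names known to the package manager, one per line.
# Prints nothing if neither dpkg-query nor rpm is available.
list_installed_pkgs() {
    if command -v dpkg-query >/dev/null 2>&1; then
        dpkg-query -W -f='${Package}\n'
    elif command -v rpm >/dev/null 2>&1; then
        rpm -qa --qf '%{NAME}\n'
    fi
}

# Keep only the OFED kernel packages from a name list on stdin.
# "|| true" keeps the exit status clean when nothing matches.
filter_o2ib_pkgs() {
    grep -E "$O2IB_PKG_RE" || true
}
```

A typical call would be `list_installed_pkgs | filter_o2ib_pkgs`; an empty result then means no candidate package was found, which a configure script can test for before it tries to list any package files.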

          gerrit Gerrit Updater added a comment -

          "Jian Yu <yujian@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/48047
          Subject: LU-16050 build: replace ofed_info with dpkg/rpm
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: 171b16923c83c5c68f500788ae040d3016dd5df2

          People

            Assignee: yujian Jian Yu
            Reporter: yujian Jian Yu