Lustre / LU-19813

(Durham University) Improve OFA detection to handle DKMS on EL

Details

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Medium
    • Fix Version/s: Lustre 2.18.0
    • Affects Version/s: Lustre 2.15.8
    • Labels: None
    • Environment: Server 2.12.x, mixture of clients: RHEL 8.10 (ppc64le), Rocky 9.7 (aarch64+64k)
    • Severity: 3
    • Rank: 9223372036854775807

    Description

      Hello!

      NVIDIA now releases its out-of-tree InfiniBand drivers in a package called DOCA instead of MLNX_OFED. On EL they have renamed some RPM packages and, more significantly, they now seem to manage the kernel OFA source with DKMS, so the RPM database does not know about it. As a result, the existing detection logic cannot work. For example, on one of our 2.15.8 clients:

      sh autogen.sh
      ./configure --with-linux=/usr/src/kernels/$(uname -r)
      
      ...
      
      checking whether to use Compat RDMA... /usr/bin/ofed_info
      rpm: no arguments given for query
      configure: error: 
      You seem to have an OFED installed but have not installed it's devel package.
      If you still want to build Lustre for your OFED I/B stack, you need to install its devel headers RPM.
      Instead, if you want to build Lustre for your kernel's built-in I/B stack rather than your installed OFED stack, either remove the OFED package(s) or use --with-o2ib=no.
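The "rpm: no arguments given for query" line in the output above suggests that the detection step ran an RPM query with an empty package list, which is what one would expect if no devel RPM owns the DKMS-managed source. As a rough illustration only (this is neither the attached patch nor Lustre's actual configure logic), a fallback could guard the RPM query and then probe the filesystem directly. The helper names `find_ofa_src` and `query_devel_rpms`, and the candidate directories under `/usr/src/ofa_kernel`, are assumptions made for this sketch and should be checked against the target system.

```shell
#!/bin/sh
# Sketch only: helper names and candidate paths are illustrative assumptions.

# Print the first candidate directory that looks like a built OFA kernel
# source tree (i.e. contains Module.symvers). Returns non-zero if none exists.
#   $1: root directory to search (assumed default: /usr/src)
#   $2: kernel release (default: the running kernel)
find_ofa_src() {
    ofa_root="${1:-/usr/src}"
    ofa_kver="${2:-$(uname -r)}"
    ofa_arch="$(uname -m)"
    for ofa_dir in \
        "$ofa_root/ofa_kernel/$ofa_arch/$ofa_kver" \
        "$ofa_root/ofa_kernel/default"
    do
        if [ -f "$ofa_dir/Module.symvers" ]; then
            printf '%s\n' "$ofa_dir"
            return 0
        fi
    done
    return 1
}

# Query RPM only when there is at least one package name to ask about.
# Running "rpm -q" with no names prints "no arguments given for query".
#   $1: whitespace-separated package names (may be empty)
query_devel_rpms() {
    # shellcheck disable=SC2086  # word splitting is intended here
    set -- $1
    [ "$#" -gt 0 ] || return 1
    rpm -q "$@"
}
```

With helpers like these, a detection step could first call `query_devel_rpms` with whatever package names it found and, if that returns non-zero, fall back to `find_ofa_src` before giving up with the "devel package" error.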

      Can this be resolved, please?

      I've attached a very naive fix that works for me, but I cannot vouch that it handles enough use cases.

      I'd previously tried porting LU-18002 to 2.15.8, but it didn't resolve this.

      Thanks,

      Mark

          People

            Assignee: Jian Yu (yujian)
            Reporter: Mark Dixon (bodgerer)
            Votes: 0
            Watchers: 7
