  1. Lustre
  2. LU-16719

ib_xxx symbol mismatch between in-kernel and mlx OFED

Details

    • Bug
    • Resolution: Duplicate
    • Blocker
    • None
    • Lustre 2.16.0
    • None
    • 3

    Description

      There is a regression in commit 321a533b86 (LU-16662 autoconf: fix configure test compile for CONFIG_KEYS) that causes an ib_xxx symbol version mismatch: ko2iblnd, built against the Mellanox OFED version shown below, can no longer be loaded. Before 321a533b86, it worked fine.

      [root@ec01 ~]# uname -r
      4.18.0-425.13.1.el8_7.x86_64
      [root@ec01 ~]# ofed_info -n
      5.8-1.1.2.1
      
      [root@ec01 lustre-release]# git clean -d -x -f; sh ./autogen.sh; ./configure --with-o2ib=/usr/src/ofa_kernel/default; make rpms
      [root@ec01 lustre-release]# modprobe lustre
      modprobe: ERROR: could not insert 'lustre': Invalid argument
      [root@ec01 lustre-release]# lctl get_param version
      version=2.15.54_159_g321a533
      
      Apr  6 22:18:21 ec01 kernel: libcfs: HW NUMA nodes: 1, HW CPU cores: 32, npartitions: 8
      Apr  6 22:18:21 ec01 kernel: alg: No test for adler32 (adler32-zlib)
      Apr  6 22:18:21 ec01 kernel: Key type ._llcrypt registered
      Apr  6 22:18:21 ec01 kernel: Key type .llcrypt registered
      Apr  6 22:18:21 ec01 kernel: Lustre: Lustre: Build Version: 2.15.54_159_g321a533
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol __ib_alloc_pd
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol __ib_alloc_pd (err -22)
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol rdma_resolve_addr
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol rdma_resolve_addr (err -22)
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol ib_dereg_mr_user
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol ib_dereg_mr_user (err -22)
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol rdma_reject
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol rdma_reject (err -22)
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol rdma_disconnect
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol rdma_disconnect (err -22)
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol __rdma_create_kernel_id
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol __rdma_create_kernel_id (err -22)
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol ib_register_event_handler
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol ib_register_event_handler (err -22)
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol rdma_resolve_route
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol rdma_resolve_route (err -22)
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol ib_unregister_event_handler
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol ib_unregister_event_handler (err -22)
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol rdma_bind_addr
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol rdma_bind_addr (err -22)
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol rdma_create_qp
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol rdma_create_qp (err -22)
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol ib_map_mr_sg
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol ib_map_mr_sg (err -22)
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol ib_query_port
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol ib_query_port (err -22)
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol rdma_notify
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol rdma_notify (err -22)
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol rdma_listen
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol rdma_listen (err -22)
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol rdma_destroy_qp
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol rdma_destroy_qp (err -22)
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol __ib_create_cq
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol __ib_create_cq (err -22)
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol ib_alloc_mr
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol ib_alloc_mr (err -22)
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol rdma_connect_locked
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol rdma_connect_locked (err -22)
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol rdma_set_reuseaddr
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol rdma_set_reuseaddr (err -22)
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol ib_destroy_cq_user
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol ib_destroy_cq_user (err -22)
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol ib_modify_qp
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol ib_modify_qp (err -22)
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol ib_dma_virt_map_sg
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol ib_dma_virt_map_sg (err -22)
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol rdma_destroy_id
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol rdma_destroy_id (err -22)
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol rdma_accept
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol rdma_accept (err -22)
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: disagrees about version of symbol ib_dealloc_pd_user
      Apr  6 22:18:22 ec01 kernel: ko2iblnd: Unknown symbol ib_dealloc_pd_user (err -22)
      Apr  6 22:18:22 ec01 kernel: LNetError: 2824746:0:(api-ni.c:2639:lnet_load_lnd()) Can't load LND o2ib, module ko2iblnd, rc=256
      Apr  6 22:18:22 ec01 kernel: LustreError: 2824746:0:(events.c:642:ptlrpc_init_portals()) network initialisation failed
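
A "disagrees about version of symbol" message means the CRC recorded in the module for that symbol differs from the CRC exported by the running provider of that symbol. The sketch below is one way to narrow this down, not part of the original report: it extracts the rejected symbols from the kernel log and looks up a symbol's CRC in a Module.symvers-style file. The helper names (list_mismatched_symbols, symvers_crc) are hypothetical, and the Module.symvers and ko2iblnd.ko locations in the usage comments are assumptions that depend on how the kernel-devel and OFED packages are installed.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a symbol-version mismatch.

# Read kernel log text on stdin and print each symbol that the given
# module (e.g. ko2iblnd) "disagrees about", one per line, de-duplicated.
list_mismatched_symbols() {
    sed -n "s/.*$1: disagrees about version of symbol \([A-Za-z0-9_]*\).*/\1/p" |
        LC_ALL=C sort -u
}

# Print the CRC recorded for a symbol in a file whose first column is the
# CRC and whose second column is the symbol name. Both Module.symvers and
# the output of `modprobe --dump-modversions` use that column order.
symvers_crc() {
    awk -v sym="$2" '$2 == sym { print $1 }' "$1"
}

# Example usage (paths are assumptions, adjust to the local install):
#   dmesg | list_mismatched_symbols ko2iblnd
#   modprobe --dump-modversions /path/to/ko2iblnd.ko > /tmp/ko2iblnd.crc
#   symvers_crc /tmp/ko2iblnd.crc rdma_listen
#   symvers_crc /usr/src/ofa_kernel/default/Module.symvers rdma_listen
#   symvers_crc "/usr/src/kernels/$(uname -r)/Module.symvers" rdma_listen
```

If the CRC stored in ko2iblnd.ko matches the in-kernel Module.symvers rather than the OFED one, the build most likely picked up the in-kernel InfiniBand symbols instead of those under the --with-o2ib path, which would be consistent with the mismatch reported here.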
      

            People

              xinliang Xinliang Liu
              sihara Shuichi Ihara