Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.9.0
    • None
    • Ubuntu 14.04
      3.13.0-32-generic
    • 3
    • 16928

    Description

      I compiled the Lustre client on Ubuntu 14.04. It works over TCP/IP, but
      not over InfiniBand (IB).

      I downloaded the IB driver for Ubuntu 14.04 (kernel 3.13.0-32-generic) from:
      http://www.mellanox.com/page/mlnx_ofed_eula?mtag=linux_sw_drivers&mrequest=downloads&mtype=ofed&mver=MLNX_OFED-2.3-2.0.0&mname=MLNX_OFED_LINUX-2.3-2.0.0-ubuntu14.04-x86_64.iso

      The attached config.log is from the failed configure run. The first failure is:
      "
      /usr/src/mlnx-ofed-kernel-2.3/include/linux/compat-2.6.h:17:35: fatal error: linux/compat_autoconf.h: No such file or directory
      #include <linux/compat_autoconf.h>
      "
      If I skip this error by removing that #include from the source file, I still hit the following error:

      "
      configure: error: an external source tree was specified for o2iblnd however I could not find a /usr/src/mlnx-ofed-kernel-2.3/Module.symvers there
      "
      If I touch a Module.symvers there (a small hack), the build finishes. After installing the resulting debs, running modprobe lustre with IB produces the following messages:

      [422278.843073] ko2iblnd: Unknown symbol rdma_create_qp (err -22)
      [422278.843080] ko2iblnd: disagrees about version of symbol ib_destroy_cq
      [422278.843081] ko2iblnd: Unknown symbol ib_destroy_cq (err -22)
      [422278.843084] ko2iblnd: disagrees about version of symbol rdma_create_id
      [422278.843085] ko2iblnd: Unknown symbol rdma_create_id (err -22)
      [422278.843101] ko2iblnd: disagrees about version of symbol rdma_listen
      [422278.843103] ko2iblnd: Unknown symbol rdma_listen (err -22)
      [422278.843105] ko2iblnd: disagrees about version of symbol rdma_destroy_qp
      [422278.843107] ko2iblnd: Unknown symbol rdma_destroy_qp (err -22)
      [422278.843113] ko2iblnd: disagrees about version of symbol ib_query_device
      [422278.843115] ko2iblnd: Unknown symbol ib_query_device (err -22)
      [422278.843119] ko2iblnd: disagrees about version of symbol ib_get_dma_mr
      [422278.843120] ko2iblnd: Unknown symbol ib_get_dma_mr (err -22)
      [422278.843131] ko2iblnd: disagrees about version of symbol ib_alloc_pd
      [422278.843132] ko2iblnd: Unknown symbol ib_alloc_pd (err -22)
      [422278.843143] ko2iblnd: disagrees about version of symbol rdma_set_reuseaddr
      [422278.843144] ko2iblnd: Unknown symbol rdma_set_reuseaddr (err -22)
      [422278.843148] ko2iblnd: disagrees about version of symbol rdma_connect
      [422278.843149] ko2iblnd: Unknown symbol rdma_connect (err -22)
      [422278.843154] ko2iblnd: disagrees about version of symbol ib_modify_qp
      [422278.843156] ko2iblnd: Unknown symbol ib_modify_qp (err -22)
      [422278.843168] ko2iblnd: disagrees about version of symbol rdma_destroy_id
      [422278.843169] ko2iblnd: Unknown symbol rdma_destroy_id (err -22)
      [422278.843174] ko2iblnd: disagrees about version of symbol rdma_accept
      [422278.843176] ko2iblnd: Unknown symbol rdma_accept (err -22)
      [422278.843189] ko2iblnd: disagrees about version of symbol ib_dealloc_pd
      [422278.843190] ko2iblnd: Unknown symbol ib_dealloc_pd (err -22)
      [422278.843195] ko2iblnd: disagrees about version of symbol ib_fmr_pool_map_phys
      [422278.843196] ko2iblnd: Unknown symbol ib_fmr_pool_map_phys (err -22)
      [422278.843452] LNetError: 29038:0:(api-ni.c:1515:lnet_startup_lndnis()) Can't load LND o2ib, module ko2iblnd, rc=256
      [422278.853606] LustreError: 29038:0:(events.c:629:ptlrpc_init_portals()) network initialisation failed

      Could you take a look at this issue?
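
To see at a glance which symbols the module loader rejected, the log above can be reduced to a list of names. This is a minimal sketch, assuming the kernel log text is piped in on stdin; `failed_symbols` is a hypothetical helper name, not a Lustre or OFED tool.

```shell
#!/bin/sh
# failed_symbols (hypothetical helper): read kernel log text on stdin and
# print the unique symbol names that ko2iblnd could not resolve.
failed_symbols() {
    # Keep only the "Unknown symbol NAME (err ...)" lines and extract NAME.
    sed -n 's/.*ko2iblnd: Unknown symbol \([A-Za-z0-9_]*\) (err.*/\1/p' | sort -u
}
```

It could be used as `dmesg | failed_symbols`.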

          Activity

            [LU-6083] IB with Ubuntu 14.04 client

            Patch landed to master

            utopiabound Nathaniel Clark added a comment

            This issue is handled by patch http://review.whamcloud.com/20523 linked to LU-5953

            utopiabound Nathaniel Clark added a comment
            simmonsja James A Simmons added a comment (edited)

            What is left for this work besides adding documentation? Everything works well for me.


            Modified Build Instructions without need for ahlabenadam fix:

            Install MLNX OFED as normal (must use 3.13 kernel for MLNX 2.4-1.0.4)
            As root:

            cd /usr/src/ofa_kernel
            ./ofed_scripts/gen-compat-autoconf.sh include/linux/compat-3.13.h > include/linux/compat_autoconf.h
            export MODULES_DIR=/lib/modules/$(uname -r)/updates/dkms/./
            ./ofed_scripts/create_Module.symvers.sh
            

            In lustre-release:

            ./configure --with-o2ib=/usr/src/ofa_kernel --disable-server --enable-quota
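
Before running configure, it may help to confirm that the two generated files exist, since their absence caused the errors in the description. This is a minimal sketch, assuming the OFED tree layout shown in the steps above; `check_ofed_tree` is a hypothetical helper name.

```shell
#!/bin/sh
# check_ofed_tree (hypothetical helper): return 0 only if the OFED kernel
# source tree has compat_autoconf.h and a non-empty Module.symvers.
check_ofed_tree() {
    tree=$1
    status=0
    if [ ! -f "$tree/include/linux/compat_autoconf.h" ]; then
        echo "missing: $tree/include/linux/compat_autoconf.h" >&2
        status=1
    fi
    # -s is false for a missing file and for an empty (touched) one.
    if [ ! -s "$tree/Module.symvers" ]; then
        echo "missing or empty: $tree/Module.symvers" >&2
        status=1
    fi
    return $status
}
```

For example: `check_ofed_tree /usr/src/ofa_kernel && ./configure --with-o2ib=/usr/src/ofa_kernel --disable-server --enable-quota`.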
            

            NOTE: the compilation problem from LU-5628 still exists. It can be partially worked around by adding --with-max-payload-mb=1 to the configure line and then editing config.h to replace ((1)<<20) with 1048576.
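
The config.h edit in the note can be scripted. This is a minimal sketch, assuming the literal text `((1)<<20)` appears in the file exactly as written in the note; `fix_shift_literal` is a hypothetical helper name, and the macro that carries the value is not named here because the note does not name it.

```shell
#!/bin/sh
# fix_shift_literal (hypothetical helper): replace every literal
# "((1)<<20)" in the given file with "1048576" (1 << 20 = 1048576).
# The result goes to a temporary file first, so a failed sed leaves
# the original file untouched.
fix_shift_literal() {
    file=$1
    # In a basic regular expression, ( ) and < match themselves literally.
    sed 's/((1)<<20)/1048576/g' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

For example, after configure has run in lustre-release: `fix_shift_literal config.h`.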

            utopiabound Nathaniel Clark added a comment
            utopiabound Nathaniel Clark added a comment (edited)

            FYI: Kernel Compatibility (Ubuntu 14.04)

            Kernel   MLNX 2.4-1.0.4   MLNX 3.0-2.0.1
            3.13     Yes              ?
            3.16     NO               Yes
            3.19     NO               ?
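
The compatibility notes above can be encoded as a small lookup for use in a build script. This is a minimal sketch covering only the combinations listed for Ubuntu 14.04; `ofed_kernel_status` is a hypothetical helper name, and the "?" entries are reported as unknown rather than guessed.

```shell
#!/bin/sh
# ofed_kernel_status (hypothetical helper).
# Arguments: MLNX OFED version, kernel major.minor (e.g. 3.13).
# Prints yes, no, or unknown (the "?" entries and anything unlisted).
ofed_kernel_status() {
    case "$1:$2" in
        2.4-1.0.4:3.13) echo yes ;;
        2.4-1.0.4:3.16|2.4-1.0.4:3.19) echo no ;;
        3.0-2.0.1:3.16) echo yes ;;
        *) echo unknown ;;
    esac
}
```

For example: `ofed_kernel_status 2.4-1.0.4 "$(uname -r | cut -d. -f1,2)"`.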


            This sounds very similar to LU-5597

            simmonsja James A Simmons added a comment

            With this solution: https://github.com/ahlabenadam/lustre_fix.git

            I can now load Lustre with IB successfully. Let me test it further.
            In the long term, though, I think this should be fixed on the Lustre side.

            Best Regards,
            Wang Shilong

            wangshilong Wang Shilong (Inactive) added a comment

            BTW, I see a similar problem reported here:
            https://jira.hpdd.intel.com/browse/LU-5597

            We really need to use the Mellanox InfiniBand stack, because Ubuntu's
            built-in IB did not work for us.

            I also tried again, and modprobe reported the following messages:
            [ 7528.946399] ko2iblnd: no symbol version for ib_create_cq
            [ 7528.946399] ko2iblnd: Unknown symbol ib_create_cq (err -22)
            [ 7528.946409] ko2iblnd: no symbol version for rdma_resolve_addr
            [ 7528.946410] ko2iblnd: Unknown symbol rdma_resolve_addr (err -22)
            [ 7528.946414] ko2iblnd: no symbol version for ib_reg_phys_mr
            [ 7528.946415] ko2iblnd: Unknown symbol ib_reg_phys_mr (err -22)
            [ 7528.946419] ko2iblnd: no symbol version for ib_create_fmr_pool
            [ 7528.946419] ko2iblnd: Unknown symbol ib_create_fmr_pool (err -22)
            [ 7528.946426] ko2iblnd: no symbol version for ib_flush_fmr_pool
            [ 7528.946427] ko2iblnd: Unknown symbol ib_flush_fmr_pool (err -22)
            [ 7528.946439] ko2iblnd: no symbol version for ib_dereg_mr
            [ 7528.946440] ko2iblnd: Unknown symbol ib_dereg_mr (err -22)
            [ 7528.946443] ko2iblnd: no symbol version for rdma_reject
            [ 7528.946444] ko2iblnd: Unknown symbol rdma_reject (err -22)
            [ 7528.946448] ko2iblnd: no symbol version for rdma_disconnect
            [ 7528.946449] ko2iblnd: Unknown symbol rdma_disconnect (err -22)
            [ 7528.946471] ko2iblnd: no symbol version for rdma_resolve_route
            [ 7528.946471] ko2iblnd: Unknown symbol rdma_resolve_route (err -22)
            [ 7528.946476] ko2iblnd: no symbol version for rdma_bind_addr
            [ 7528.946476] ko2iblnd: Unknown symbol rdma_bind_addr (err -22)
            [ 7528.946478] ko2iblnd: no symbol version for rdma_create_qp
            [ 7528.946479] ko2iblnd: Unknown symbol rdma_create_qp (err -22)
            [ 7528.946483] ko2iblnd: no symbol version for ib_destroy_cq
            [ 7528.946484] ko2iblnd: Unknown symbol ib_destroy_cq (err -22)
            [ 7528.946486] ko2iblnd: no symbol version for rdma_create_id
            [ 7528.946487] ko2iblnd: Unknown symbol rdma_create_id (err -22)
            [ 7528.946496] ko2iblnd: no symbol version for rdma_listen
            [ 7528.946497] ko2iblnd: Unknown symbol rdma_listen (err -22)
            [ 7528.946499] ko2iblnd: no symbol version for rdma_destroy_qp
            [ 7528.946500] ko2iblnd: Unknown symbol rdma_destroy_qp (err -22)
            [ 7528.946504] ko2iblnd: no symbol version for ib_query_device
            [ 7528.946505] ko2iblnd: Unknown symbol ib_query_device (err -22)
            [ 7528.946507] ko2iblnd: no symbol version for ib_get_dma_mr
            [ 7528.946508] ko2iblnd: Unknown symbol ib_get_dma_mr (err -22)
            [ 7528.946514] ko2iblnd: no symbol version for ib_alloc_pd
            [ 7528.946515] ko2iblnd: Unknown symbol ib_alloc_pd (err -22)
            [ 7528.946522] ko2iblnd: no symbol version for rdma_set_reuseaddr
            [ 7528.946523] ko2iblnd: Unknown symbol rdma_set_reuseaddr (err -22)
            [ 7528.946525] ko2iblnd: no symbol version for rdma_connect
            [ 7528.946526] ko2iblnd: Unknown symbol rdma_connect (err -22)
            [ 7528.946529] ko2iblnd: no symbol version for ib_modify_qp
            [ 7528.946530] ko2iblnd: Unknown symbol ib_modify_qp (err -22)
            [ 7528.946537] ko2iblnd: no symbol version for ib_destroy_fmr_pool
            [ 7528.946538] ko2iblnd: Unknown symbol ib_destroy_fmr_pool (err -22)
            [ 7528.946540] ko2iblnd: no symbol version for rdma_destroy_id
            [ 7528.946540] ko2iblnd: Unknown symbol rdma_destroy_id (err -22)
            [ 7528.946543] ko2iblnd: no symbol version for rdma_accept
            [ 7528.946544] ko2iblnd: Unknown symbol rdma_accept (err -22)
            [ 7528.946553] ko2iblnd: no symbol version for ib_dealloc_pd
            [ 7528.946553] ko2iblnd: Unknown symbol ib_dealloc_pd (err -22)
            [ 7528.946556] ko2iblnd: no symbol version for ib_fmr_pool_map_phys
            [ 7528.946557] ko2iblnd: Unknown symbol ib_fmr_pool_map_phys (err -22)
            [ 7528.946987] LNetError: 25709:0:(api-ni.c:1515:lnet_startup_lndnis()) Can't load LND o2ib, module ko2iblnd, rc=256
            [ 7529.024336] LustreError: 25709:0:(events.c:629:ptlrpc_init_portals()) network initialisation failed
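
One thing that could be checked here is whether the Module.symvers used for the build has an entry for each of these symbols. That this is the cause of the "no symbol version" messages is an assumption, not something confirmed in this ticket. This is a minimal sketch, assuming the tab-separated layout with the symbol name in the second field; `symbols_without_crc` is a hypothetical helper name.

```shell
#!/bin/sh
# symbols_without_crc (hypothetical helper): read symbol names, one per
# line, on stdin and print those with no entry in the given Module.symvers.
# Assumed layout per line: CRC<TAB>symbol<TAB>module<TAB>export-type.
symbols_without_crc() {
    awk -F '\t' -v symvers="$1" '
        BEGIN {
            # Load the known symbols first; an empty file loads nothing.
            while ((getline line < symvers) > 0) {
                split(line, f, "\t")
                have[f[2]] = 1
            }
        }
        !($1 in have)   # print stdin symbols that were not found
    '
}
```

For example, with one symbol name per line in a file called symbols.txt (a placeholder name): `symbols_without_crc /usr/src/ofa_kernel/Module.symvers < symbols.txt`.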

            wangshilong Wang Shilong (Inactive) added a comment
            wangshilong Wang Shilong (Inactive) added a comment (edited)

            Sorry for the incomplete information: I meant that I was using a git tree built from the master branch.


            People

              utopiabound Nathaniel Clark
              wangshilong Wang Shilong (Inactive)