Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.9.0
    • None
    • Ubuntu 14.04
      3.13.0-32-generic
    • 3
    • 16928

    Description

      I compiled the Lustre client on Ubuntu 14.04. It works over TCP/IP, but
      IB does not.

      I downloaded the IB driver for Ubuntu 14.04 (kernel 3.13.0-32-generic) from:
      http://www.mellanox.com/page/mlnx_ofed_eula?mtag=linux_sw_drivers&mrequest=downloads&mtype=ofed&mver=MLNX_OFED-2.3-2.0.0&mname=MLNX_OFED_LINUX-2.3-2.0.0-ubuntu14.04-x86_64.iso

      The failing config.log is attached. The first failure message is:
      "
      /usr/src/mlnx-ofed-kernel-2.3/include/linux/compat-2.6.h:17:35: fatal error: linux/compat_autoconf.h: No such file or directory
      #include <linux/compat_autoconf.h>
      "
      If I skip this error by removing the include from the source file, I still hit the following error:

      "
      configure: error: an external source tree was specified for o2iblnd however I could not find a /usr/src/mlnx-ofed-kernel-2.3/Module.symvers there
      "
      If I touch a Module.symvers there (a small hack), the build finishes. After installing the resulting debs, modprobe lustre with IB produces the following messages:

      [422278.843073] ko2iblnd: Unknown symbol rdma_create_qp (err -22)
      [422278.843080] ko2iblnd: disagrees about version of symbol ib_destroy_cq
      [422278.843081] ko2iblnd: Unknown symbol ib_destroy_cq (err -22)
      [422278.843084] ko2iblnd: disagrees about version of symbol rdma_create_id
      [422278.843085] ko2iblnd: Unknown symbol rdma_create_id (err -22)
      [422278.843101] ko2iblnd: disagrees about version of symbol rdma_listen
      [422278.843103] ko2iblnd: Unknown symbol rdma_listen (err -22)
      [422278.843105] ko2iblnd: disagrees about version of symbol rdma_destroy_qp
      [422278.843107] ko2iblnd: Unknown symbol rdma_destroy_qp (err -22)
      [422278.843113] ko2iblnd: disagrees about version of symbol ib_query_device
      [422278.843115] ko2iblnd: Unknown symbol ib_query_device (err -22)
      [422278.843119] ko2iblnd: disagrees about version of symbol ib_get_dma_mr
      [422278.843120] ko2iblnd: Unknown symbol ib_get_dma_mr (err -22)
      [422278.843131] ko2iblnd: disagrees about version of symbol ib_alloc_pd
      [422278.843132] ko2iblnd: Unknown symbol ib_alloc_pd (err -22)
      [422278.843143] ko2iblnd: disagrees about version of symbol rdma_set_reuseaddr
      [422278.843144] ko2iblnd: Unknown symbol rdma_set_reuseaddr (err -22)
      [422278.843148] ko2iblnd: disagrees about version of symbol rdma_connect
      [422278.843149] ko2iblnd: Unknown symbol rdma_connect (err -22)
      [422278.843154] ko2iblnd: disagrees about version of symbol ib_modify_qp
      [422278.843156] ko2iblnd: Unknown symbol ib_modify_qp (err -22)
      [422278.843168] ko2iblnd: disagrees about version of symbol rdma_destroy_id
      [422278.843169] ko2iblnd: Unknown symbol rdma_destroy_id (err -22)
      [422278.843174] ko2iblnd: disagrees about version of symbol rdma_accept
      [422278.843176] ko2iblnd: Unknown symbol rdma_accept (err -22)
      [422278.843189] ko2iblnd: disagrees about version of symbol ib_dealloc_pd
      [422278.843190] ko2iblnd: Unknown symbol ib_dealloc_pd (err -22)
      [422278.843195] ko2iblnd: disagrees about version of symbol ib_fmr_pool_map_phys
      [422278.843196] ko2iblnd: Unknown symbol ib_fmr_pool_map_phys (err -22)
      [422278.843452] LNetError: 29038:0:(api-ni.c:1515:lnet_startup_lndnis()) Can't load LND o2ib, module ko2iblnd, rc=256
      [422278.853606] LustreError: 29038:0:(events.c:629:ptlrpc_init_portals()) network initialisation failed

      Could you take a look at this issue?
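The log above repeats each failing symbol on up to two lines. As a minimal sketch (the function name `list_unknown_symbols` is hypothetical and not part of Lustre or OFED), the unique set of symbols that ko2iblnd could not resolve can be pulled out of kernel log text like this:

```shell
# list_unknown_symbols: read kernel log text on stdin and print each symbol
# that ko2iblnd reported as "Unknown symbol", once, in sorted order.
# The function name is hypothetical; the message format is taken from the log above.
list_unknown_symbols() {
    sed -n 's/.*ko2iblnd: Unknown symbol \([A-Za-z0-9_]*\) (err.*/\1/p' | sort -u
}
```

Typical use would be `dmesg | list_unknown_symbols`, which gives a short list to compare against the symbols exported by the installed OFED modules.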

          Activity

            [LU-6083] IB with Ubuntu 14.04 client

            Patch landed to master

            utopiabound Nathaniel Clark added a comment

            This issue is handled by patch http://review.whamcloud.com/20523 linked to LU-5953

            utopiabound Nathaniel Clark added a comment
            simmonsja James A Simmons added a comment - - edited

            What is left for this work besides adding Documentation? For me everything works well.


            Modified Build Instructions without need for ahlabenadam fix:

            Install MLNX OFED as normal (must use 3.13 kernel for MLNX 2.4-1.0.4)
            As root:

            cd /usr/src/ofa_kernel
            ./ofed_scripts/gen-compat-autoconf.sh include/linux/compat-3.13.h > include/linux/compat_autoconf.h
            export MODULES_DIR=/lib/modules/$(uname -r)/updates/dkms/./
            ./ofed_scripts/create_Module.symvers.sh
            

            In lustre-release:

            ./configure --with-o2ib=/usr/src/ofa_kernel --disable-server --enable-quota
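Both failures reported in this issue (the missing linux/compat_autoconf.h and the missing Module.symvers) can be checked for before running configure. The following is only a sketch: `check_o2ib_tree` is a hypothetical helper, and the two relative paths are taken from the error messages and commands in this ticket.

```shell
# check_o2ib_tree: verify that an OFED kernel source tree contains the two
# files that configure and the build complained about in this ticket.
# Uses -s so that an empty (merely touched) Module.symvers is also rejected.
check_o2ib_tree() {
    local tree="$1"
    local status=0
    local f
    for f in include/linux/compat_autoconf.h Module.symvers; do
        if [ ! -s "$tree/$f" ]; then
            echo "missing or empty: $tree/$f" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example, `check_o2ib_tree /usr/src/ofa_kernel && ./configure --with-o2ib=/usr/src/ofa_kernel --disable-server --enable-quota` runs configure only when both files are present and non-empty.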
            

            NOTE: The compilation problem from LU-5628 still exists. It can be partially worked around by adding --with-max-payload-mb=1 to the configure line and then editing config.h to replace ((1)<<20) with 1048576.
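The config.h edit described in the note can be scripted rather than done by hand. This is a sketch under the assumption that the literal text ((1)<<20) appears in the generated config.h exactly as quoted; `patch_max_payload` is a hypothetical name.

```shell
# patch_max_payload: replace every literal "((1)<<20)" in the given file
# with its numeric value 1048576. Writes to a temporary file first so the
# original is only replaced if sed succeeds.
patch_max_payload() {
    local file="$1"
    sed 's/((1)<<20)/1048576/g' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

It would be run as `patch_max_payload config.h` in the lustre-release tree after configure has finished and before make.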

            utopiabound Nathaniel Clark added a comment
            utopiabound Nathaniel Clark added a comment - - edited

            FYI: Kernel Compatibility (Ubuntu 14.04)

            Kernel   MLNX 2.4-1.0.4   MLNX 3.0-2.0.1
            3.13     Yes              ?
            3.16     NO               Yes
            3.19     NO               ?
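The compatibility results reported in this comment can be encoded as a small lookup, for instance to warn early in a build script. This is only an illustration: `mlnx_kernel_status` is a hypothetical function, and it knows nothing beyond the six combinations listed here.

```shell
# mlnx_kernel_status OFED_VERSION KERNEL_SERIES
# Prints the status reported in this ticket for Ubuntu 14.04:
# "Yes", "NO", "?" (untested), or "not listed" for any other combination.
mlnx_kernel_status() {
    case "$1:$2" in
        2.4-1.0.4:3.13) echo "Yes" ;;
        2.4-1.0.4:3.16|2.4-1.0.4:3.19) echo "NO" ;;
        3.0-2.0.1:3.16) echo "Yes" ;;
        3.0-2.0.1:3.13|3.0-2.0.1:3.19) echo "?" ;;
        *) echo "not listed" ;;
    esac
}
```

A call such as `mlnx_kernel_status 2.4-1.0.4 "$(uname -r | cut -d. -f1,2)"` looks up the running kernel series.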


            This sounds very similar to LU-5597

            simmonsja James A Simmons added a comment

            With this solution: https://github.com/ahlabenadam/lustre_fix.git

            I can now load Lustre with IB successfully. Let me test it further.
            But in the long term, I think it would be better to fix this issue in Lustre itself.

            Best Regards,
            Wang Shilong

            wangshilong Wang Shilong (Inactive) added a comment

            BTW, I see a similar problem reported here:
            https://jira.hpdd.intel.com/browse/LU-5597

            We really need to use Mellanox InfiniBand, because the Ubuntu built-in IB did
            not work for us.

            Also, I tried again, and modprobe reported the following messages:
            [ 7528.946399] ko2iblnd: no symbol version for ib_create_cq
            [ 7528.946399] ko2iblnd: Unknown symbol ib_create_cq (err -22)
            [ 7528.946409] ko2iblnd: no symbol version for rdma_resolve_addr
            [ 7528.946410] ko2iblnd: Unknown symbol rdma_resolve_addr (err -22)
            [ 7528.946414] ko2iblnd: no symbol version for ib_reg_phys_mr
            [ 7528.946415] ko2iblnd: Unknown symbol ib_reg_phys_mr (err -22)
            [ 7528.946419] ko2iblnd: no symbol version for ib_create_fmr_pool
            [ 7528.946419] ko2iblnd: Unknown symbol ib_create_fmr_pool (err -22)
            [ 7528.946426] ko2iblnd: no symbol version for ib_flush_fmr_pool
            [ 7528.946427] ko2iblnd: Unknown symbol ib_flush_fmr_pool (err -22)
            [ 7528.946439] ko2iblnd: no symbol version for ib_dereg_mr
            [ 7528.946440] ko2iblnd: Unknown symbol ib_dereg_mr (err -22)
            [ 7528.946443] ko2iblnd: no symbol version for rdma_reject
            [ 7528.946444] ko2iblnd: Unknown symbol rdma_reject (err -22)
            [ 7528.946448] ko2iblnd: no symbol version for rdma_disconnect
            [ 7528.946449] ko2iblnd: Unknown symbol rdma_disconnect (err -22)
            [ 7528.946471] ko2iblnd: no symbol version for rdma_resolve_route
            [ 7528.946471] ko2iblnd: Unknown symbol rdma_resolve_route (err -22)
            [ 7528.946476] ko2iblnd: no symbol version for rdma_bind_addr
            [ 7528.946476] ko2iblnd: Unknown symbol rdma_bind_addr (err -22)
            [ 7528.946478] ko2iblnd: no symbol version for rdma_create_qp
            [ 7528.946479] ko2iblnd: Unknown symbol rdma_create_qp (err -22)
            [ 7528.946483] ko2iblnd: no symbol version for ib_destroy_cq
            [ 7528.946484] ko2iblnd: Unknown symbol ib_destroy_cq (err -22)
            [ 7528.946486] ko2iblnd: no symbol version for rdma_create_id
            [ 7528.946487] ko2iblnd: Unknown symbol rdma_create_id (err -22)
            [ 7528.946496] ko2iblnd: no symbol version for rdma_listen
            [ 7528.946497] ko2iblnd: Unknown symbol rdma_listen (err -22)
            [ 7528.946499] ko2iblnd: no symbol version for rdma_destroy_qp
            [ 7528.946500] ko2iblnd: Unknown symbol rdma_destroy_qp (err -22)
            [ 7528.946504] ko2iblnd: no symbol version for ib_query_device
            [ 7528.946505] ko2iblnd: Unknown symbol ib_query_device (err -22)
            [ 7528.946507] ko2iblnd: no symbol version for ib_get_dma_mr
            [ 7528.946508] ko2iblnd: Unknown symbol ib_get_dma_mr (err -22)
            [ 7528.946514] ko2iblnd: no symbol version for ib_alloc_pd
            [ 7528.946515] ko2iblnd: Unknown symbol ib_alloc_pd (err -22)
            [ 7528.946522] ko2iblnd: no symbol version for rdma_set_reuseaddr
            [ 7528.946523] ko2iblnd: Unknown symbol rdma_set_reuseaddr (err -22)
            [ 7528.946525] ko2iblnd: no symbol version for rdma_connect
            [ 7528.946526] ko2iblnd: Unknown symbol rdma_connect (err -22)
            [ 7528.946529] ko2iblnd: no symbol version for ib_modify_qp
            [ 7528.946530] ko2iblnd: Unknown symbol ib_modify_qp (err -22)
            [ 7528.946537] ko2iblnd: no symbol version for ib_destroy_fmr_pool
            [ 7528.946538] ko2iblnd: Unknown symbol ib_destroy_fmr_pool (err -22)
            [ 7528.946540] ko2iblnd: no symbol version for rdma_destroy_id
            [ 7528.946540] ko2iblnd: Unknown symbol rdma_destroy_id (err -22)
            [ 7528.946543] ko2iblnd: no symbol version for rdma_accept
            [ 7528.946544] ko2iblnd: Unknown symbol rdma_accept (err -22)
            [ 7528.946553] ko2iblnd: no symbol version for ib_dealloc_pd
            [ 7528.946553] ko2iblnd: Unknown symbol ib_dealloc_pd (err -22)
            [ 7528.946556] ko2iblnd: no symbol version for ib_fmr_pool_map_phys
            [ 7528.946557] ko2iblnd: Unknown symbol ib_fmr_pool_map_phys (err -22)
            [ 7528.946987] LNetError: 25709:0:(api-ni.c:1515:lnet_startup_lndnis()) Can't load LND o2ib, module ko2iblnd, rc=256
            [ 7529.024336] LustreError: 25709:0:(events.c:629:ptlrpc_init_portals()) network initialisation failed
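One way to investigate "no symbol version" messages like these is to check whether the Module.symvers that the build used actually records a CRC for a given symbol. The sketch below assumes the usual tab-separated Module.symvers layout with the CRC in the first field and the symbol name in the second; `symvers_crc` is a hypothetical helper.

```shell
# symvers_crc FILE SYMBOL
# Print the CRC recorded for SYMBOL in a Module.symvers file, or nothing
# if the symbol is not listed (for example, if the file is empty).
symvers_crc() {
    awk -F'\t' -v sym="$2" '$2 == sym { print $1 }' "$1"
}
```

For example, `symvers_crc /usr/src/ofa_kernel/Module.symvers ib_create_cq` printing nothing would suggest that the build had no version information for that symbol. The value could then be compared with what `modprobe --dump-modversions` reports for the built ko2iblnd module file.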

            wangshilong Wang Shilong (Inactive) added a comment
            wangshilong Wang Shilong (Inactive) added a comment - - edited

            Sorry for the incomplete information. I mean the client built from the git tree on the master branch.


            When you say Lustre client, are you referring to the upstream Lustre client that is part of the Ubuntu kernel tree in drivers/staging/lustre, or the Lustre client built from the community Lustre git tree on and for Ubuntu?

            bogl Bob Glossman (Inactive) added a comment

            People

              utopiabound Nathaniel Clark
              wangshilong Wang Shilong (Inactive)