LU-7019

Lustre client build fails when ./configure called with --with-o2ib=no

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • None
    • Lustre 2.8.0
    • None
    • Ubuntu 12.04 LTS with Linux 3.4.61 (+MOSIX); gcc 4.8.1-2ubuntu1~12.04

    Description

      Until Lustre 2.8 stable is ready, I've been trying every few weeks to build the client from the current Git pull on my Ubuntu machines, which run Linux 3.4.61, and a persistent error causes the build to fail. I've used the same Git pull, or earlier versions of it, to successfully build the server on CentOS 7, but I can't get even the client portion built on Ubuntu.

      For the client, I'm attempting to build Lustre on the same machine on which I built the Linux 3.4.61 kernel that all the Ubuntu machines here are running, so the kernel source, intermediate objects, headers, etc. should all be available.

      The process I'm following is very simple and is based on steps outlined in the old bug report LU-1706, where someone else was trying to make Debian packages of the client:

      git clone git://git.hpdd.intel.com/fs/lustre-release.git
      cd lustre-release
      sh autogen.sh
      ./configure --disable-server --with-o2ib=no --with-linux=/usr/src/linux --with-linux-obj=/usr/src/linux
      make debs

      We don't use Infiniband here; it's all 10 Gigabit Ethernet, so we don't have an Infiniband stack installed; we don't need one and certainly don't want one!

      On CentOS 7 where the server built successfully, we never (knowingly) installed any Infiniband-related packages. I suspect this issue only arises when attempting to build the client portion, or it's in "make debs".

      The build process will attempt to roll along and seems to successfully build userland and some other *.deb packages but I never get a kernel module out for the client. The build fails with the error:

      configure: error: bad --with-o2ib path

      This seems to correspond to the following, a little earlier in the build process:

        # Doesn't seem possible to only build modules...
        ./configure --with-linux=/usr/src/linux-3.4.61 \
        --with-linux-obj=/usr/src/linux-3.4.61 \
        --disable-server \
        --disable-quilt \
        --disable-dependency-tracking \
        --disable-doc \
        --disable-utils \
        --disable-iokit \
        --disable-snmp \
        --disable-tests \
        --enable-quota \
        --with-o2ib=

      I've actually gone and edited that Makefile and even if I completely remove the "--with-o2ib" flag, it still fails to compile; I believe it hangs up on a similar error elsewhere.

      I've tried the hpdd-discuss list and it doesn't seem like the problem has been noticed or fixed by happenstance (I've been trying a different Git pull every few weeks for the last month or two) so I'm hoping a bug report might help.

      I've attached the full buildlog from the attempt for review. Just let me know if there's any further information I can provide, or anything else I can try; I'd be happy to give it a try and report back. Thanks.

          Activity

            simmonsja James A Simmons added a comment - Patch for LU-7090 resolved this
            dmiter Dmitry Eremin (Inactive) added a comment - Patch http://review.whamcloud.com/16183 will fix this.
            scaron Sean Caron added a comment -

            It looks like it's definitely an issue with "make debs"; if I run the process with just a make:

            git clone git://git.hpdd.intel.com/fs/lustre-release.git
            cd lustre-release
            sh autogen.sh
            ./configure --disable-server --with-o2ib=no --with-linux=/usr/src/linux --with-linux-obj=/usr/src/linux
            make

            The build seems to complete successfully; exit code is 0; I seem to get a number of modules successfully built:

            root@buildhost:/usr/src/lustre-release# find ./ -name "*.ko"
            ./lnet/klnds/socklnd/ksocklnd.ko
            ./lnet/lnet/lnet.ko
            ./lnet/selftest/lnet_selftest.ko
            ./lustre/obdclass/llog_test.ko
            ./lustre/obdclass/obdclass.ko
            ./lustre/lov/lov.ko
            ./lustre/mdc/mdc.ko
            ./lustre/lmv/lmv.ko
            ./lustre/ptlrpc/gss/ptlrpc_gss.ko
            ./lustre/ptlrpc/ptlrpc.ko
            ./lustre/osc/osc.ko
            ./lustre/llite/llite_lloop.ko
            ./lustre/llite/lustre.ko
            ./lustre/mgc/mgc.ko
            ./lustre/fid/fid.ko
            ./lustre/fld/fld.ko
            ./lustre/obdecho/obdecho.ko
            ./libcfs/libcfs/libcfs.ko
            root@buildhost:/usr/src/lustre-release#

            But if I then run "make debs", it fails with the same error. It seems to be doing a lot of building just to produce some Debian packages. I noticed this early in the "make debs" process:

            dpkg-source: warning: source directory 'lustre-release' is not <sourcepackage>-<upstreamversion> 'lustre-2.7.58'
            dpkg-source: info: building lustre in lustre_2.7.58-1.tar.gz
            dpkg-source: info: building lustre in lustre_2.7.58-1.dsc

            Is this Git pull really version 2.7.58-1? Or is the "make debs" process going out and pulling a completely different copy of the Lustre source to attempt to build ...?

            Thanks for picking up the ticket and your help so far.

            dmiter Dmitry Eremin (Inactive) added a comment - - edited

            James, this is an issue in the Lustre build scripts for Debian-like distributions. We need to check for "no" instead of "".

            [ "x@ENABLEO2IB@" != "x" ] && \
                    export IB_OPTIONS="--with-o2ib=@O2IBPATHS@"
            

            Recently we changed this to:

            case $with_o2ib in
                    yes)    AS_IF([which ofed_info 2>/dev/null], [
            [...skip...]
                            ENABLEO2IB="yes"
                            ;;
                    no)     ENABLEO2IB="no"
                            ;;
                    *)      O2IBPATHS=$with_o2ib
                            ENABLEO2IB="withpath"
                            OFED="yes"
                            ;;
            esac
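
The difference between the two checks can be sketched in plain shell. This is only an illustration, not the actual patch: the function names `ib_options_old` and `ib_options_new` are hypothetical, and the two arguments stand in for the substituted values of @ENABLEO2IB@ and @O2IBPATHS@. It also assumes O2IBPATHS is left empty when --with-o2ib=no is given, which would be consistent with the empty `--with-o2ib=` seen in the build output.

```shell
#!/bin/sh
# $1 stands in for @ENABLEO2IB@, $2 for @O2IBPATHS@ (illustrative only).

# Existing check: any non-empty value, including "no", emits the option.
ib_options_old() {
    if [ "x$1" != "x" ]; then
        echo "--with-o2ib=$2"
    fi
}

# Suggested check: compare against "no" instead of the empty string.
ib_options_new() {
    if [ "x$1" != "xno" ]; then
        echo "--with-o2ib=$2"
    fi
}

# With ENABLEO2IB="no" and an empty path, the old check still emits
# an option with an empty value, while the new check emits nothing.
echo "old: '$(ib_options_old no "")'"
echo "new: '$(ib_options_new no "")'"
```

With the old check, configure would receive an empty path for --with-o2ib, which may be what triggers the "bad --with-o2ib path" error.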
            

            simmonsja James A Simmons added a comment - Can you build Lustre itself? I mean just sh ./autogen.sh; ./configure --......; make. I'd like to see whether it is a packaging issue or actually an autoconf script issue.
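
One way to tell a packaging failure from a configure failure is to run each stage separately and stop at the first one that fails. The following is a minimal sketch, assuming a POSIX shell; `run_step` is a hypothetical helper, and the commented usage simply reuses the commands quoted in this ticket.

```shell
#!/bin/sh
# run_step NAME COMMAND [ARGS...]
# Runs COMMAND, reports whether it succeeded, and returns its exit status.
run_step() {
    name=$1
    shift
    if "$@"; then
        echo "ok: $name"
    else
        status=$?
        echo "FAILED: $name (exit $status)"
        return "$status"
    fi
}

# Example usage from the top of the source tree (not executed here):
#   run_step autogen sh autogen.sh &&
#   run_step configure ./configure --disable-server --with-o2ib=no \
#       --with-linux=/usr/src/linux --with-linux-obj=/usr/src/linux &&
#   run_step make make &&
#   run_step debs make debs
```

If the first three stages report ok and only the last one fails, the problem is more likely in the packaging rules than in the autoconf scripts.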

            People

              dmiter Dmitry Eremin (Inactive)
              scaron Sean Caron