[LU-5140] Mellanox backport header conflicts with newer kernels

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.6.0
    • Lustre 2.6.0
    • Any Linux environment that supports process namespaces and the external Mellanox OFED stack. In my case it was SLES11 SP3 with an external OFED stack.
    • 3
    • 14179

    Description

      The Mellanox OFED stack has a compatibility layer that allows it to be built across many kernel versions and distributions. Linux process namespaces are one of the features for which Mellanox provides wrappers, to handle the varying levels of support across kernels.

      For Lustre, the libcfs layer does exactly the same thing to handle different levels of process namespace support. To do that, libcfs has to figure out which abstraction to wrap, Mellanox's or the native system's. Currently libcfs does not handle this case properly.
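As a rough illustration of the selection problem described above, the following sketch picks one provider of the uid helpers at compile time. All `DEMO_*` and `demo_*` names are hypothetical stand-ins (they are not libcfs, kernel, or OFED symbols), and the two feature flags are assumed to come from a configure step.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical configure-time results; a real build would derive these
 * from autoconf checks rather than hard-coding them. */
#define DEMO_HAVE_NATIVE_UIDGID 0	/* kernel has no helpers of its own */
#define DEMO_HAVE_VENDOR_UIDGID 1	/* an external stack backports them */

/* Record which layer ends up supplying the helpers. */
#if DEMO_HAVE_NATIVE_UIDGID
# define DEMO_UIDGID_SOURCE "native"
#elif DEMO_HAVE_VENDOR_UIDGID
# define DEMO_UIDGID_SOURCE "vendor"
#else
# define DEMO_UIDGID_SOURCE "local"
#endif

#if DEMO_HAVE_NATIVE_UIDGID || DEMO_HAVE_VENDOR_UIDGID
/* Stand-in for a structured uid type supplied by someone else. */
typedef struct { unsigned int val; } demo_kuid_t;
static inline unsigned int demo_from_kuid(demo_kuid_t uid) { return uid.val; }
#else
/* Local fallback: only defined when nobody else provides the type. */
typedef unsigned int demo_kuid_t;
static inline unsigned int demo_from_kuid(demo_kuid_t uid) { return uid; }
#endif

/* Name of the layer that won the selection. */
const char *demo_uidgid_source(void)
{
	return DEMO_UIDGID_SOURCE;
}

/* Wrap a raw id and unwrap it again through whichever layer is active. */
unsigned int demo_uid_roundtrip(unsigned int raw)
{
	demo_kuid_t uid = { raw };

	return demo_from_kuid(uid);
}
```

The point of the sketch is that exactly one branch defines `demo_kuid_t`; the failure reported in this ticket is what happens when two layers both believe they are responsible.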

          Activity

            Patch has landed. This ticket can be closed.

            simmonsja James A Simmons added the comment above.

            When we tried to build master with OFED 3.12, Cray encountered the following build failure, which is also fixed by James's patch:

            [ 140s] CC [M] /usr/src/packages/BUILD/cray-lustre/lnet/klnds/o2iblnd/o2iblnd.o
            [ 141s] In file included from /usr/src/packages/BUILD/cray-lustre/libcfs/include/libcfs/linux/linux-prim.h:66,
            [ 141s] from /usr/src/packages/BUILD/cray-lustre/libcfs/include/libcfs/linux/libcfs.h:53,
            [ 141s] from /usr/src/packages/BUILD/cray-lustre/libcfs/include/libcfs/libcfs.h:47,
            [ 141s] from /usr/src/packages/BUILD/cray-lustre/lnet/klnds/o2iblnd/o2iblnd.h:71,
            [ 141s] from /usr/src/packages/BUILD/cray-lustre/lnet/klnds/o2iblnd/o2iblnd.c:41:
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:137: error: 'LINUX_BACKPORT' declared as function returning a function
            [ 141s] cc1: warnings being treated as errors
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:137: error: parameter names (without types) in function declaration
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:139: error: 'LINUX_BACKPORT' declared as function returning a function
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:139: error: parameter names (without types) in function declaration
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:142: error: 'LINUX_BACKPORT' declared as function returning a function
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:142: error: parameter names (without types) in function declaration
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:144: error: 'LINUX_BACKPORT' declared as function returning a function
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:144: error: parameter names (without types) in function declaration
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:146: error: 'LINUX_BACKPORT' declared as function returning a function
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:146: error: parameter names (without types) in function declaration
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:148: error: 'LINUX_BACKPORT' declared as function returning a function
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:148: error: parameter names (without types) in function declaration
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:152: error: 'LINUX_BACKPORT' declared as function returning a function
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:152: error: function declaration isn't a prototype
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:151: error: static declaration of 'LINUX_BACKPORT' follows non-static declaration
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:148: error: previous declaration of 'LINUX_BACKPORT' was here
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h: In function 'LINUX_BACKPORT':
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:153: error: 'from_kuid' undeclared (first use in this function)
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:153: error: (Each undeclared identifier is reported only once
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:153: error: for each function it appears in.)
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:153: error: 'ns' undeclared (first use in this function)
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:153: error: 'uid' undeclared (first use in this function)
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:153: error: called object 'LINUX_BACKPORT(<erroneous-expression>)' is not a function
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h: At top level:
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:158: error: 'LINUX_BACKPORT' declared as function returning a function
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:158: error: function declaration isn't a prototype
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:157: error: redefinition of 'LINUX_BACKPORT'
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:152: error: previous definition of 'LINUX_BACKPORT' was here
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h: In function 'LINUX_BACKPORT':
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:159: error: 'from_kgid' undeclared (first use in this function)
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:159: error: 'ns' undeclared (first use in this function)
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:159: error: 'gid' undeclared (first use in this function)
            [ 141s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s/include/linux/uidgid.h:159: error: called object 'LINUX_BACKPORT(<erroneous-expression>)' is not a function
            [ 141s] make[9]: *** [/usr/src/packages/BUILD/cray-lustre/lnet/klnds/o2iblnd/o2iblnd.o] Error 1
            [ 141s] make[8]: *** [/usr/src/packages/BUILD/cray-lustre/lnet/klnds/o2iblnd] Error 2
            [ 141s] make[7]: *** [/usr/src/packages/BUILD/cray-lustre/lnet/klnds] Error 2
            [ 141s] make[6]: *** [/usr/src/packages/BUILD/cray-lustre/lnet] Error 2
            [ 141s] make[5]: *** [_module_/usr/src/packages/BUILD/cray-lustre] Error 2
            [ 141s] make[4]: *** [sub-make] Error 2
            [ 141s] make[3]: *** [all] Error 2
            [ 141s] make[3]: Leaving directory `/usr/src/linux-3.0.101-0.21.1_1.0000.8135-obj/x86_64/cray_ari_s'
            [ 141s] make[2]: *** [modules] Error 2
            [ 141s] make[2]: Leaving directory `/usr/src/packages/BUILD/cray-lustre'
            [ 141s] make[1]: *** [all-recursive] Error 1
            [ 141s] make[1]: Leaving directory `/usr/src/packages/BUILD/cray-lustre'
            [ 141s] make: *** [all] Error 2
            [ 141s] error: Bad exit status from /var/tmp/rpm-tmp.71744 (%build)
            [ 141s]
            [ 141s]
            [ 141s] RPM build errors:
            [ 141s] Bad exit status from /var/tmp/rpm-tmp.71744 (%build)

            paf Patrick Farrell (Inactive) added the comment above.
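One plausible reading of the errors in the log above is that `LINUX_BACKPORT` was not defined as a macro when the OFED `uidgid.h` was parsed, so a declaration of the form `LINUX_BACKPORT(name)(args)` was treated as a function called `LINUX_BACKPORT` that returns a function. The sketch below shows the working case; the `backport_` prefix and all `demo_*` names are assumptions for illustration, not the actual OFED definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed shape of the renaming macro. If this line is removed, the
 * declaration of demo_uid_eq below no longer names a renamed helper and
 * the compiler reports errors similar to those in the log. */
#define LINUX_BACKPORT(sym) backport_##sym

typedef struct { unsigned int val; } demo_kuid_t;

/* Callers write demo_uid_eq(); the symbol is really backport_demo_uid_eq. */
#define demo_uid_eq LINUX_BACKPORT(demo_uid_eq)
static inline bool demo_uid_eq(demo_kuid_t left, demo_kuid_t right)
{
	return left.val == right.val;
}

/* Returns true when the renamed helper compares ids as expected. */
bool demo_backport_check(void)
{
	demo_kuid_t a = { 1000 }, b = { 1000 }, c = { 0 };

	return demo_uid_eq(a, b) && !demo_uid_eq(a, c);
}
```

Under this reading, the fix has to ensure that the header defining the macro is seen before the backported `uidgid.h`, or that the backported header is not pulled in at all.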

            James, I've called for a retest. It's back in the queue.

            bogl Bob Glossman (Inactive) added the comment above.

            Maloo failed to run for patch http://review.whamcloud.com/#/c/10571. Could someone please start the test for this patch? Thank you.

            simmonsja James A Simmons added the comment above.

            The question becomes the order of precedence for the uidgid defines. We have the possible combinations of distro, OFED, and libcfs. The order for all code outside of the o2ib LND driver is a no-brainer: we use the distro definitions if present and the libcfs ones if not. The OFED definitions don't show up outside the o2ib LND driver. Now, in the o2ib LND driver, which order do we want, from most to least important?

            compact-rdma.h -> uidgid.h -> libcfs

            uidgid.h -> compact-rdma.h -> libcfs

            As for defining _LINUX_UIDGID_H, I really can't see a way around this unless we involve the OFED testing in libcfs autoconf, and that would be too messy and ugly.

            simmonsja James A Simmons added the comment above (edited).
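The idea of defining `_LINUX_UIDGID_H` can be sketched as an include-guard preemption: whichever layer is seen first claims the guard, and any later provider is skipped. The example below simulates two headers inside one file; `DEMO_LINUX_UIDGID_H` and the `demo_*` names are hypothetical, and the real headers are of course separate files.

```c
#include <assert.h>
#include <string.h>

/* Simulated local (libcfs-style) layer, seen first. */
#ifndef DEMO_LINUX_UIDGID_H
#define DEMO_LINUX_UIDGID_H	/* claim the guard so later providers are skipped */
typedef unsigned int demo_kuid_t;
#define demo_from_kuid(uid) (uid)
static const char demo_provider[] = "local";
#endif

/* Simulated vendor header, seen second; its body is skipped entirely
 * because the guard above is already defined. */
#ifndef DEMO_LINUX_UIDGID_H
#define DEMO_LINUX_UIDGID_H
typedef struct { unsigned int val; } demo_kuid_t;
#define demo_from_kuid(uid) ((uid).val)
static const char demo_provider[] = "vendor";
#endif

/* Which layer's definitions are in effect in this translation unit. */
const char *demo_active_provider(void)
{
	return demo_provider;
}

/* Unwrap an id through the active definitions. */
unsigned int demo_uid_value(void)
{
	demo_kuid_t uid = { 1000 };

	return demo_from_kuid(uid);
}
```

Swapping the two simulated blocks makes the vendor definitions win instead, which is why the precedence question matters: the outcome depends only on which header the preprocessor reaches first in a given translation unit.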

            That additional change to curproc.h does repair the build with recent OFED on CentOS 6.5, but I can't speak to all other variations. Even in the CentOS build, I'm worried that it may have the effect of using definitions from the OFED uidgid.h in some places and local definitions from curproc.h in others, depending on exactly who includes what.

            bogl Bob Glossman (Inactive) added the comment above.

            People

              bogl Bob Glossman (Inactive)
              simmonsja James A Simmons