
Support for Lustre Servers on Ubuntu 14.04/16.04 Kernel 4.4.0

Details

    • New Feature
    • Resolution: Fixed
    • Minor
    • Lustre 2.11.0
    • Lustre 2.9.0
    • Ubuntu 14.04.5 with Backport Kernel 4.4.0 from Ubuntu 16.04

    Description

      Currently, only SUSE or Red Hat machines can be used as Lustre servers. Ubuntu can only be used as a client.

      Since Lustre has recently started supporting SLES12 with kernel 4.4.0, the very same version used by Ubuntu 16.04 (and by 14.04 via HWE kernels), we had the idea of porting Lustre over, since our future usage of Lustre would greatly benefit from that.

      We made good progress in that respect and have adjusted the kernel patches for "ldiskfs" to also work with Ubuntu's flavour of kernel 4.4.0. Additionally, we have extended the Debian build system provided by Lustre so that it can create both client and server tools and modules.

      You can find the patches against Lustre 2.9.57 attached to this ticket. The kernel patches target version 4.4.0-45.66.

      Compiling and creating the packages works well and produces the needed modules and tools. Both Debian packages install cleanly and the modules (lnet, ldiskfs, lustre) load correctly.

      Unfortunately, once we come to the creation of the Lustre file system, we run into a curious error:

      root@musxbeo050:~#  mkfs.lustre --mgs --mdt --fsname=lustre --backfstype=ldiskfs --index=0 /dev/sda7
      mkfs.lustre FATAL: unhandled/unloaded fs type 1 'ldiskfs'
      mkfs.lustre FATAL: unable to prepare backend (22)
      mkfs.lustre: exiting with 22 (Invalid argument)
      

      This is despite the fact that "ldiskfs" is properly registered with the kernel:

      root@musxbeo050:~# uname -a
      Linux musxbeo050 4.4.0-45-generic #66~14.04.1lustre SMP Mon May 8 18:23:05 CEST 2017 x86_64 x86_64 x86_64 GNU/Linux
      
      root@musxbeo050:~# grep "ldiskfs\|lustre" /proc/filesystems
              ldiskfs
              lustre
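
Editor's note: the grep above matches substrings, so it would also match a type whose name merely contains "ldiskfs" or "lustre". A minimal C sketch of an exact-match check follows. It assumes the usual /proc/filesystems layout (an optional "nodev" flag, a tab, then the type name, one entry per line); the function name fs_type_listed is hypothetical and not part of Lustre.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if `name` appears as a filesystem type in text laid out like
 * /proc/filesystems: one entry per line, an optional "nodev" flag, a tab,
 * then the type name. Only exact matches of the type name count.
 */
static bool fs_type_listed(const char *table, const char *name)
{
	assert(table != NULL && name != NULL);

	size_t name_len = strlen(name);
	const char *line = table;

	while (*line != '\0') {
		const char *end = strchr(line, '\n');
		size_t line_len = end ? (size_t)(end - line) : strlen(line);

		/* The type name is whatever follows the last tab on the line. */
		const char *type = line;
		for (size_t i = 0; i < line_len; i++) {
			if (line[i] == '\t')
				type = line + i + 1;
		}

		size_t type_len = line_len - (size_t)(type - line);
		if (type_len == name_len && memcmp(type, name, name_len) == 0)
			return true;

		if (end == NULL)
			break;
		line = end + 1;
	}
	return false;
}
```

A caller would read /proc/filesystems into a buffer and pass it as `table`. Note that a positive result only shows that the kernel module registered the type; it says nothing about whether the userspace tools can handle that backend.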
      

      The only unusual message in the kernel log is a complaint by the "ldiskfs" module that it can't register itself under the "ext3" alias.

      May 26 14:16:11 musxbeo050 kernel: [159891.600363] LDISKFS-fs: Unable to register as ext3 (-16)
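
Editor's note: the numeric codes in the two outputs above are Linux errno values. The 22 reported by mkfs.lustre is EINVAL ("Invalid argument", as the tool itself prints), and the -16 in the kernel log is -EBUSY, which is what the kernel returns when a filesystem type of that name is already registered. That another driver already owns the "ext3" name on this kernel is an assumption here; the ticket does not confirm it. A minimal sketch for decoding such codes, with the hypothetical helper name kernel_rc_to_text:

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Turn a return code into its errno description. Kernel code reports
 * errors as negative errno values (for example -16), while userspace
 * tools usually print the positive value (for example 22), so the sign
 * is dropped before the lookup.
 */
static const char *kernel_rc_to_text(int rc)
{
	assert(rc != 0); /* 0 means success, there is nothing to decode */
	return strerror(rc < 0 ? -rc : rc);
}
```

On Linux, kernel_rc_to_text(-16) yields the EBUSY description and kernel_rc_to_text(22) the EINVAL description.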
      

      We have not yet tested the ZFS backend, but since it needs no kernel changes at all, we don't expect it to be a great issue once this one is solved.

      Thanks!

          Activity

            [LU-9564] Support for Lustre Servers on Ubuntu 14.04/16.04 Kernel 4.4.0

            Add reference to blocking subtask for o2iblnd.c changes.

            mhschroe Martin Schröder (Inactive) added a comment

            Okay, I will extract the o2iblnd.c changes into a separate changeset and then adjust the dependencies of the current one, so that it depends on the other.

            mhschroe Martin Schröder (Inactive) added a comment

            I would prefer to see those o2iblnd.c changes as a separate patch. That fix isn't specific to server builds. It is needed for any build on current Ubuntu kernel versions, including client builds.

            bogl Bob Glossman (Inactive) added a comment

            Hi Chris. I saw that commit (and an earlier one that was already merged) when I encountered the compile-time issue.

            The problem here is that neither "IB_DEVICE_SG_GAPS_REG" nor "IB_MR_TYPE_SG_GAPS" is available at all on the 4.4.0-series kernels used by Ubuntu. So any code that mentions them outside of a suitable preprocessor guard will fail to compile. In this case, the guard matched even when the symbols were not present.

            The reason for their absence appears to be that there is a general warning against using these symbols:

            https://patchwork.kernel.org/patch/9573483/
            https://lkml.org/lkml/2017/3/13/206

            So either they were removed, or they were never present in the 4.4.0 kernel.

            I changed the code so that Lustre will use them if the symbols are present. If they are missing, it will revert to the old pre-patch behaviour.
            At least that's what I hope my code does. Checking for symbols via preprocessor macros can be iffy if what you check turns out not to be a macro in the first place.

            mhschroe Martin Schröder (Inactive) added a comment (edited)
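
Editor's note: the caveat in the comment above can be illustrated with a small stand-alone C sketch. It assumes the symbols in question are enumerators rather than macros in the kernel headers, which is what makes a plain #ifdef on them unreliable. All DEMO_* and demo_* names are hypothetical stand-ins, not the actual o2iblnd.c change, and the HAVE_ macro is defined by hand here where a real build would presumably set it from a configure-time compile test.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for a kernel header: the constant is an enumerator, not a macro. */
enum demo_mr_type {
	DEMO_MR_TYPE_MEM_REG,
	DEMO_MR_TYPE_SG_GAPS,
};

/*
 * #ifdef only sees preprocessor macros. The enumerator above exists, yet
 * this test takes the #else branch, so guarding code with it would wrongly
 * conclude that the symbol is missing.
 */
#ifdef DEMO_MR_TYPE_SG_GAPS
#define DEMO_IFDEF_SEES_ENUM 1
#else
#define DEMO_IFDEF_SEES_ENUM 0
#endif

/*
 * Assumption: a build system would define this feature macro only after a
 * test program using the enumerator compiled successfully. It is set by
 * hand for this sketch.
 */
#define HAVE_DEMO_MR_TYPE_SG_GAPS 1

/* Use the newer type when it is available, otherwise fall back. */
static enum demo_mr_type demo_pick_mr_type(bool want_gaps)
{
#ifdef HAVE_DEMO_MR_TYPE_SG_GAPS
	if (want_gaps)
		return DEMO_MR_TYPE_SG_GAPS;
#else
	(void)want_gaps;
#endif
	return DEMO_MR_TYPE_MEM_REG;
}

/* Self-check that a caller can run from main(). */
static void demo_self_check(void)
{
	assert(DEMO_IFDEF_SEES_ENUM == 0);
	assert(demo_pick_mr_type(true) == DEMO_MR_TYPE_SG_GAPS);
	assert(demo_pick_mr_type(false) == DEMO_MR_TYPE_MEM_REG);
}
```

The design point is that the guard tests a macro the build system controls, so the fallback path is chosen by an actual compile test rather than by whether the kernel happens to spell the symbol as a macro.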

            For "IB_MR_TYPE_SG_GAPS" you might want to look at LU-10089

            chunteraa Chris Hunter (Inactive) added a comment

            Hi everyone.

            I have rebased the code, fixed the build-time issues and changed the DEB file creation to allow more than one "lustre-*-modules-<KVERS>" package to be installed simultaneously.

            I have checked the compilation on both Ubuntu 14.04 and 16.04, against the most recent Linux kernel version available for each.

            If you have the time, please review the Gerrit change under: https://review.whamcloud.com/#/c/29215

            Note especially the small workaround for "IB_DEVICE_SG_GAPS_REG" and "IB_MR_TYPE_SG_GAPS", which is needed to compile against the Ubuntu kernels because they do not have those macros. The change to "lnet/klnds/o2iblnd/o2iblnd.c" should be broadly compatible with both older and newer kernels.

            Thanks!

            mhschroe Martin Schröder (Inactive) added a comment (edited)

            It's a libtool thing. Looking at my Ubuntu system, I don't see any *.la files installed. I noticed we are installing them.

            simmonsja James A Simmons added a comment

            Okay, I'll rebase and see where the problem might be.

            I do remember the "missing files" issue, which happened when certain configurations would build only static libraries, not shared ones. But I believed I had fixed all of those cases. Let me check.

            mhschroe Martin Schröder (Inactive) added a comment

            That failure was with a rebase on the latest master.

            bogl Bob Glossman (Inactive) added a comment

            It needs a rebase as well.

            simmonsja James A Simmons added a comment

            The latest patch isn't working for me at all.
            'build debs' fails with errors like the following:

            # Create the module-source tarball.
            cd debian/lustre-source/usr/src && tar jcf lustre.tar.bz2 modules 
            rm -rf debian/lustre-source/usr/src/modules
            dh_install -plustre-source
            dh_installchangelogs -p lustre-source lustre/ChangeLog
            dh_installdocs -p lustre-source 
            dh_link -p lustre-source /usr/share/modass/packages/default.sh /usr/share/modass/overrides/lustre-source
            dh_compress -p lustre-source
            dh_installdeb -p lustre-source
            dh_fixperms -p lustre-source
            dh_gencontrol -p lustre-source
            dh_md5sums -p lustre-source
            dh_builddeb -p lustre-source
            dpkg-deb: building package 'lustre-source' in '../lustre-source_2.10.54-14-g65983ee-1_all.deb'.
            dh_testdir
            dh_testroot
            dh_installdirs -p lustre-server-utils
            dh_installdocs -p  lustre-server-utils
            dh_installman -p lustre-server-utils
            dh_install -p lustre-server-utils
            dh_install: lustre-server-utils missing files: debian/tmp/usr/lib/lustre/*.la
            dh_install: missing files, aborting
            debian/rules:239: recipe for target 'binary-lustre-server-utils' failed
            make[1]: *** [binary-lustre-server-utils] Error 255
            make[1]: Leaving directory '/home/bogl/lustre-release'
            dpkg-buildpackage: error: fakeroot debian/rules binary gave error exit status 2
            autoMakefile:1142: recipe for target 'debs' failed
            make: *** [debs] Error 2
            
            bogl Bob Glossman (Inactive) added a comment

            People

              bogl Bob Glossman (Inactive)
              mhschroe Martin Schröder (Inactive)