LU-9564: Support for Lustre Servers on Ubuntu 14.04/16.04 Kernel 4.4.0

Details

    • Type: New Feature
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.11.0
    • Affects Version/s: Lustre 2.9.0
    • Environment: Ubuntu 14.04.5 with Backport Kernel 4.4.0 from Ubuntu 16.04

    Description

      Currently, only SuSE or RedHat machines can be used as Lustre servers. Ubuntu can only be used as a Client.

      Since Lustre has recently started supporting SLES12 with Kernel 4.4.0, the very same version used by Ubuntu 16.04 (and by 14.04 via HWE kernels), we had the idea of porting the Lustre server over, since our future usage of Lustre would greatly benefit from that.

      We made good progress in that respect and have adjusted the kernel patches for "ldiskfs" to also work with Ubuntu's flavour of Kernel 4.4.0. Additionally, we have extended the Debian build system provided by Lustre so that it can create both client and server tools and modules.

      You can find the patches against Lustre 2.9.57 attached to this ticket. The kernel patches target version 4.4.0-45.66.

      Compilation and packaging work well and produce the needed modules and tools. Both Debian packages install cleanly and the modules (lnet, ldiskfs, lustre) load correctly.

      Unfortunately, once we come to the creation of the Lustre file system, we run into a curious error:

      root@musxbeo050:~#  mkfs.lustre --mgs --mdt --fsname=lustre --backfstype=ldiskfs --index=0 /dev/sda7
      mkfs.lustre FATAL: unhandled/unloaded fs type 1 'ldiskfs'
      mkfs.lustre FATAL: unable to prepare backend (22)
      mkfs.lustre: exiting with 22 (Invalid argument)
      

      This is despite the fact that "ldiskfs" is properly registered with the kernel:

      root@musxbeo050:~# uname -a
      Linux musxbeo050 4.4.0-45-generic #66~14.04.1lustre SMP Mon May 8 18:23:05 CEST 2017 x86_64 x86_64 x86_64 GNU/Linux
      
      root@musxbeo050:~# grep "ldiskfs\|lustre" /proc/filesystems
              ldiskfs
              lustre
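The grep above matches substrings, so an exact-name check against the same listing may be useful when scripting this step. The following is a minimal sketch in C; `fs_listed` is a hypothetical helper introduced here for illustration and is not part of Lustre. It assumes the usual `/proc/filesystems` layout of an optional `nodev` flag, a tab, and the filesystem name on each line. Note that it only confirms kernel-side registration and says nothing about how mkfs.lustre locates its backend.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return 1 if fs_name appears as an exact entry in a
 * /proc/filesystems style listing, 0 otherwise.
 * Assumed line format: "[nodev]<TAB>name\n". */
int fs_listed(FILE *listing, const char *fs_name)
{
	char line[256];

	while (fgets(line, sizeof(line), listing) != NULL) {
		/* The name is whatever follows the last tab on the line. */
		char *name = strrchr(line, '\t');

		name = name ? name + 1 : line;
		name[strcspn(name, "\n")] = '\0';	/* drop trailing newline */
		if (strcmp(name, fs_name) == 0)
			return 1;
	}
	return 0;
}
```

A caller would open `/proc/filesystems` with `fopen` and pass the stream together with "ldiskfs" or "lustre".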
      

      The only unusual message in the kernel log is a complaint by the "ldiskfs" module that it can't register itself under the "ext3" alias.

      May 26 14:16:11 musxbeo050 kernel: [159891.600363] LDISKFS-fs: Unable to register as ext3 (-16)
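For reference, the numeric codes in the output above are standard Linux errno values: the kernel log prints the negated value (-16 is EBUSY), while mkfs.lustre prints the positive one (22 is EINVAL, "Invalid argument"). EBUSY from a filesystem registration typically means the name is already taken, which would fit another module having claimed "ext3" first, although nothing in this ticket confirms that. The sketch below only decodes the two codes seen here; `status_name` is a hypothetical helper, and the numeric values are those of Linux.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: map a status code from the logs above to its errno
 * name. Kernel messages carry negative errno values, user-space tools the
 * positive ones, so the sign is ignored. Only the two codes that appear in
 * this ticket are handled. */
const char *status_name(int status)
{
	switch (abs(status)) {
	case EBUSY:	/* 16 on Linux */
		return "EBUSY";
	case EINVAL:	/* 22 on Linux */
		return "EINVAL";
	default:
		return "unknown";
	}
}
```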
      

      We have not yet tested the ZFS backend, but since it needs no kernel changes at all, we don't expect it to be a great issue once this one is solved.

      Thanks!

          Activity

            pjones Peter Jones added a comment -

            Landed for 2.11


            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/29215/
            Subject: LU-9564 build: Add server-build for Ubuntu with Kernel 4.4.0
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 027a7237b560489099ba490db925db17a554f37d
            pjones Peter Jones added a comment -

            Martin

            The commits will be landed by the gatekeeper when he has completed necessary reviews and tests. One of the patches is in the batch being processed ATM - https://git.hpdd.intel.com/?p=fs/lustre-release.git;a=shortlog;h=refs/heads/master-next - and the other will likely be in the next batch

            Peter


            mhschroe Martin Schröder (Inactive) added a comment -

            Okay everyone.

            Both changes have CR+1/V+1. Now, I guess all that is needed is to find someone with the needed permissions to push the code into the mainline.

            So... any volunteers on the front of merging the two commits?

            Once the code is merged and shown to not cause any problems (like last time), it would also be a good idea to add an Ubuntu Server build to the Jenkins config. The Client build is already there, after all.

            Thanks,
                Martin.
            mhschroe Martin Schröder (Inactive) added a comment -

            @Bob Glossman: Please review: https://review.whamcloud.com/#/c/30893/

            mhschroe Martin Schröder (Inactive) added a comment -

            Add reference to blocking subtask for o2iblnd.c changes.

            mhschroe Martin Schröder (Inactive) added a comment -

            Okay, I will extract the o2iblnd.c changes into a separate changeset and then adjust the dependencies of the current one, so that it depends on the other.

            bogl Bob Glossman (Inactive) added a comment -

            I would prefer to see those o2iblnd.c changes as a separate patch. That fix isn't specific to server builds. It is needed for any build on current Ubuntu kernel versions, including client builds.

            mhschroe Martin Schröder (Inactive) added a comment - edited

            Hi Chris. I've seen that commit (and an earlier one that was already merged) when I encountered the compile-time issue.

            The problem here is that neither "IB_DEVICE_SG_GAPS_REG" nor "IB_MR_TYPE_SG_GAPS" is available at all on the 4.4.0-series kernels used by Ubuntu. So any code that mentions them outside of a suitable preprocessor guard will fail to compile. In this case, the guard matched even though the symbols were not present.

            The reason for their absence appears to be that there is a general warning against using these symbols:

            https://patchwork.kernel.org/patch/9573483/
            https://lkml.org/lkml/2017/3/13/206

            So either they removed them, or they never used them on the 4.4.0 Kernel.

            I changed the code so that Lustre will use them if the symbols are present. If they are missing, it will revert to the old pre-patch behaviour.
            At least that's what I hope my code does. Checking for symbols via preprocessor macros can be iffy if what you check turns out not to be a macro in the first place.
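The caveat at the end of the comment above, that a preprocessor check cannot see a symbol which is not a macro, can be illustrated with a minimal sketch. All `DEMO_*` names below are hypothetical stand-ins and make no claim about how the real kernel headers define the two symbols. `HAVE_DEMO_MR_TYPE_SG_GAPS` is likewise a made-up name for the kind of macro a configure-time compile test would define; whether the actual patch works this way is not stated in the ticket.

```c
/* Hypothetical stand-in for a kernel header: one name is an enum constant,
 * the other a macro. Neither mirrors the real kernel definitions. */
enum demo_mr_type {
	DEMO_MR_TYPE_SG_GAPS = 2,
};
#define DEMO_DEVICE_SG_GAPS_REG (1 << 5)

/* #ifdef only knows about macros, so the enum constant is invisible to it
 * even though the compiler itself would accept the name. */
int ifdef_sees_enum(void)
{
#ifdef DEMO_MR_TYPE_SG_GAPS
	return 1;
#else
	return 0;
#endif
}

/* The macro, by contrast, is detected as expected. */
int ifdef_sees_macro(void)
{
#ifdef DEMO_DEVICE_SG_GAPS_REG
	return 1;
#else
	return 0;
#endif
}

/* Common workaround: guard on a HAVE_* macro that the build system defines
 * (for example via -D) only after a test program using the symbol has
 * compiled. Without that macro the fallback branch is taken. */
int mr_type_for_gaps(void)
{
#ifdef HAVE_DEMO_MR_TYPE_SG_GAPS
	return DEMO_MR_TYPE_SG_GAPS;
#else
	return 0;	/* fall back to the pre-patch behaviour */
#endif
}
```

Guarding on a build-time `HAVE_*` macro keeps the source independent of whether a given kernel exposes the symbol as an enum constant or as a macro.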

            chunteraa Chris Hunter (Inactive) added a comment -

            For "IB_MR_TYPE_SG_GAPS" you might want to look at LU-10089

            People

              Assignee: bogl Bob Glossman (Inactive)
              Reporter: mhschroe Martin Schröder (Inactive)
              Votes: 1
              Watchers: 17
