[LU-4800] no automatic module load in newer kernels

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.6.0, Lustre 2.5.3
    • Affects Version/s: Lustre 2.6.0, Lustre 2.5.1, Lustre 2.4.3
    • Labels: None
    • Severity: 3
    • Rank (Obsolete): 13208

    Description

      This problem has been seen in several test environments running Linux kernel versions newer than the 2.6.x and 3.0.x kernels we currently support.
      For Lustre clients built against unpatched, pristine kernel sources with default .config files, I don't get the Lustre modules autoloading at mount time. Example:

      # mount -t lustre -o flock,user_xattr centos2:/lustre /mnt/lustre
      mount.lustre: mount centos2:/lustre at /mnt/lustre failed: No such device
      Are the lustre modules loaded?
      Check /etc/modprobe.conf and /proc/filesystems
      

      If I explicitly preload lustre modules with a modprobe command like "modprobe lustre" then the mount works fine.

      I got the following commentary in email from Andreas:

      This is a problem known to me. If the obdclass module is loaded, then it
      will register the "lustre" filesystem type, so it will appear in
      /proc/filesystems and mount will not modprobe the "lustre" filesystem
      module.

      This has been true since Lustre 1.6 or so (when "mountconf" was first
      added).

      Two options exist:

      • modify the mount.lustre binary to always modprobe the "lustre" module if
        it isn't already loaded
      • change the "lustre" filesystem type registration to be in the "lustre"
        module (as most other filesystems do). That has the problem that the
        servers do not need the "lustre.ko" module loaded, since that is really
        the client VFS interface. It would help if there was a second filesystem
        type "lustre_srv" or similar, that could be used to register server
        mountpoints, and possibly simplify the mount internals (which are
        convoluted because they have to do completely different things for client
        and server mounts).

      Obviously, #1 is easier, but #2 would simplify the code in the long term.
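
The first option above can be sketched in userspace C. This is only an illustration, not the code in mount.lustre: the helper names `module_listed` and `ensure_lustre_module` are hypothetical. It assumes the loaded-module list is read from /proc/modules, where each line starts with the module name followed by a space, and that running "modprobe lustre" (the command quoted in the description) is an acceptable way to load the client module.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return true if `name` is the first field of any
 * line in a /proc/modules-style stream. */
bool module_listed(FILE *fp, const char *name)
{
    char line[512];
    size_t len = strlen(name);
    bool at_line_start = true;

    while (fgets(line, sizeof(line), fp) != NULL) {
        size_t n = strlen(line);

        /* Only compare at the start of a line, and require an exact
         * match of the whole first field, not just a prefix. */
        if (at_line_start && strncmp(line, name, len) == 0 &&
            (line[len] == ' ' || line[len] == '\n' || line[len] == '\0'))
            return true;

        /* A chunk without a trailing newline means the line continues. */
        at_line_start = (n > 0 && line[n - 1] == '\n');
    }
    return false;
}

/* Hypothetical helper: load the "lustre" module unless it is already
 * listed. Returns 0 on success, -1 if modprobe failed. */
int ensure_lustre_module(void)
{
    FILE *fp = fopen("/proc/modules", "r");
    bool loaded = false;

    if (fp != NULL) {
        loaded = module_listed(fp, "lustre");
        fclose(fp);
    }
    if (loaded)
        return 0;

    return system("modprobe lustre") == 0 ? 0 : -1;
}
```

A mount helper following this approach would call `ensure_lustre_module()` before attempting the mount. Checking the module list rather than /proc/filesystems matters here because, as Andreas notes, obdclass alone already makes "lustre" appear in /proc/filesystems.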

          Activity

            pjones Peter Jones added a comment -

            Landed for 2.6


            simmonsja James A Simmons added a comment -

            Patch has landed to master. This ticket can be closed.

            morrone Christopher Morrone (Inactive) added a comment -

            "I'd really like to see the cleanup Andreas and I did eventually go in another time."

            It would probably be best to start a new ticket to track that so it does not get lost.

            bogl Bob Glossman (Inactive) added a comment -

            I agree about following through on those cleanups. I plan to enter a new ticket for that. James, I will be sure to make you a Watcher there.
            simmonsja James A Simmons added a comment - edited

            With 2.6 so close to being released, I'm happy if that lands. I'd really like to see the cleanup Andreas and I did eventually go in another time. Sorry I got carried away with the idea of cleanup.

            bogl Bob Glossman (Inactive) added a comment -

            Preliminary testing seems to confirm my theory. All that fancy handwaving both James and I did looks unneeded. Please see my latest iteration at http://review.whamcloud.com/#/c/10587/6

            bogl Bob Glossman (Inactive) added a comment -

            Looking at both my mods and James', I'm coming to the conclusion that there's a much simpler solution possible. I'm now investigating the possibility that the only essential feature is the added runtime request_module("lustre") in obd_mount.c. The improved error exit cleanups in init_lustre_lite are valuable, and in fact I think init_obdclass needs similar fixes, but that isn't a necessary part of this mod. It should be followed up in another mod entirely (IMHO). James' symbol_get() calls may also be useful, but on the theory that simplest is best I'm seeing if they are really necessary.

            If my latest theory is correct I will push a much simpler mod that does no relocation of the fs register/unregister calls at all.
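
The idea in the comment above can be pictured with a small userspace toy model. This is not kernel code and not the patch under review: every `toy_*` name is hypothetical, `toy_request_module` merely stands in for the kernel's request_module(), and the model assumes obdclass has already registered the "lustre" filesystem type. It only shows why requesting the client module from the mount path is enough to avoid the "No such device" (ENODEV) failure from the description.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Toy view of a client node; both fields are illustrative only. */
struct toy_node {
    bool fstype_registered; /* "lustre" type registered (by obdclass) */
    bool client_loaded;     /* client module ("lustre") is loaded     */
};

/* Stand-in for request_module(): this toy only knows "lustre". */
int toy_request_module(struct toy_node *n, const char *name)
{
    if (strcmp(name, "lustre") != 0)
        return -ENOENT;
    n->client_loaded = true;
    return 0;
}

/* Toy client mount. With request_at_mount set, the mount path asks for
 * the client module itself instead of relying on filesystem-type
 * autoloading, which never triggers once the type is registered. */
int toy_client_mount(struct toy_node *n, bool request_at_mount)
{
    if (!n->fstype_registered)
        return -ENODEV; /* outside the scope of this toy */

    if (request_at_mount && !n->client_loaded)
        toy_request_module(n, "lustre");

    return n->client_loaded ? 0 : -ENODEV;
}
```

With `request_at_mount` false the toy reproduces the reported failure; with it true the mount succeeds without moving where the filesystem type is registered, which is the simplification the comment proposes.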

            simmonsja James A Simmons added a comment -

            Okay, I cleaned up the patch. I'm going to push it as a separate patch in case no one likes it. I merged Bob's changes into my changes, which allows us to go back to using just one struct file_system_type. The patch is at

            http://review.whamcloud.com/#/c/10699

            adilger Andreas Dilger added a comment -

            I don't mind putting the registration of the "lustre_osd" fstype in a separate patch, but I think it makes sense to land that in 2.6 if at all possible.
            bogl Bob Glossman (Inactive) added a comment - edited

            Does it make sense to add additional registrations now in this mod? I don't object to the idea in general, but I don't want to complicate things right now. We need a solution soon for client support. This suddenly became more urgent since RHEL 7 was officially released yesterday, and something is needed just for that alone.

            Could additional registrations maybe be done in a follow-on patch?

            People

              Assignee: bogl Bob Glossman (Inactive)
              Reporter: bogl Bob Glossman (Inactive)
              Votes: 0
              Watchers: 10
