Details

    • Improvement
    • Resolution: Fixed
    • Critical
    • None
    • Lustre 2.4.0
    • 21,524
    • 4869

    Description

      Remove Lustre kernel patches to allow Lustre servers to be more easily ported to new kernels, and to be built against vendor kernels without changing the vendor kernel RPMs. There are a number of different patches; each one must either be replaced with equivalent functionality that already exists in the kernel, or be accepted upstream.

      Corresponding to bugzilla link:
      https://bugzilla.lustre.org/show_bug.cgi?id=21524

      Attachments

        1. fio_sdck_block_size_read.png (41 kB)
        2. fio_sdck_block_size_write.png (41 kB)
        3. fio_sdck_io_depth_read.png (36 kB)
        4. fio_sdck_io_depth_write.png (39 kB)
        5. mdtest_create_8thr.png (62 kB)
        6. mdtest_remove_8thr.png (72 kB)
        7. mdtest_stat_8thr.png (77 kB)
        8. sgpdd_16devs_rsz_read.png (47 kB)
        9. sgpdd_16devs_rsz_write.png (46 kB)

        Issue Links

          Activity

            [LU-20] patchless server kernel
            bogl Bob Glossman (Inactive) added a comment - edited

            The recent landing of 'LU-20 osd-ldiskfs: Make readonly patches optional', https://review.whamcloud.com/27549, has broken Lustre on el6. This change added calls to kallsyms_lookup_name(), a kernel API not previously used. On el6 this API isn't visible to kernel modules because it has no EXPORT_SYMBOL() declaration. This leads to install-time errors like:

            WARNING: /lib/modules/2.6.32-696.3.2.el6_lustre.x86_64/extra/lustre-osd-ldiskfs/fs/osd_ldiskfs.ko needs unknown symbol kallsyms_lookup_name
            WARNING: /lib/modules/2.6.32-696.3.1.el6_lustre.x86_64/weak-updates/lustre-osd-ldiskfs/fs/osd_ldiskfs.ko needs unknown symbol kallsyms_lookup_name
            WARNING: /lib/modules/2.6.32-696.3.2.el6_lustre.x86_64/extra/lustre-osd-ldiskfs/fs/osd_ldiskfs.ko needs unknown symbol kallsyms_lookup_name
            WARNING: /lib/modules/2.6.32-696.3.2.el6.x86_64/weak-updates/lustre-osd-ldiskfs/fs/osd_ldiskfs.ko needs unknown symbol kallsyms_lookup_name

            and runtime errors like:

            osd_ldiskfs: Unknown symbol kallsyms_lookup_name (err 0)
            LustreError: 158-c: Can't load module 'osd-ldiskfs'

            This flaw blocks any use of ldiskfs on el6. It's a pretty serious regression.

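One way to catch this kind of breakage before install time is to check whether the target kernel exports the symbol at all. The following is a minimal sketch, not part of the Lustre build: the helper name symbol_exported is hypothetical, and it assumes a Module.symvers-style file in which the second whitespace-separated field is the symbol name.

```shell
#!/bin/sh
# symbol_exported SYMVERS SYMBOL   (hypothetical helper, for illustration)
# Returns 0 if SYMBOL appears as the second field of a line in SYMVERS,
# 1 if it does not, and 2 if SYMVERS cannot be read.
symbol_exported() {
    symvers=$1
    sym=$2
    [ -r "$symvers" ] || return 2
    # Field 2 of each Module.symvers line is assumed to be the symbol name.
    awk -v s="$sym" '$2 == s { found = 1 } END { exit !found }' "$symvers"
}

# Example use (the Module.symvers location is an assumption about the
# kernel-devel layout on the build host):
#   symbol_exported "/usr/src/kernels/$(uname -r)/Module.symvers" \
#       kallsyms_lookup_name || echo "kallsyms_lookup_name is not exported"
```

A check like this only reports what the kernel exports; it does not change how osd_ldiskfs resolves the symbol.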

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/27549/
            Subject: LU-20 osd-ldiskfs: Make readonly patches optional
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 0f0a43b4ba6660a88f7922aadaba1a69c297142c

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) uploaded a new patch: https://review.whamcloud.com/27549
            Subject: LU-20 osd-ldiskfs: Make readonly patches optional
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: aface21135cc936be2cf72fc2e092a4784fbecc0

            adilger Andreas Dilger added a comment -

            All of the recovery-* tests depend on this functionality to some extent, to allow clients to submit writes to the server that are dropped deterministically before the server restarts.

            Note that the presence of the dev-rdonly patch in our kernel does not prevent people from building patchless kernels. We need that patch for testing ldiskfs, but it is not needed for production with either ldiskfs or ZFS. Note that if you want project quota support for ldiskfs then kernel patches are needed regardless (project quota for ZFS will similarly need ZFS to be patched).

            morrone Christopher Morrone (Inactive) added a comment -

            How many tests actually use the dev_rdonly/dm-flakey functionality?

            If the number is small, perhaps the best path forward is to simply disable those tests until LU-684 is complete. That would allow us to unblock this ticket, LU-20.
            mdiep Minh Diep added a comment -

            Yup, just found that out too. Thanks. This is great news!


            simmonsja James A Simmons added a comment -

            kmod-lustre-2.9.58_57_gc252b3b_dirty-1.el7.x86_64.rpm
            lustre-iokit-2.9.58_57_gc252b3b_dirty-1.el7.x86_64.rpm
            kmod-lustre-osd-ldiskfs-2.9.58_57_gc252b3b_dirty-1.el7.x86_64.rpm
            lustre-osd-ldiskfs-mount-2.9.58_57_gc252b3b_dirty-1.el7.x86_64.rpm
            kmod-lustre-tests-2.9.58_57_gc252b3b_dirty-1.el7.x86_64.rpm
            lustre-resource-agents-2.9.58_57_gc252b3b_dirty-1.el7.x86_64.rpm
            lustre-2.9.58_57_gc252b3b_dirty-1.el7.x86_64.rpm
            lustre-tests-2.9.58_57_gc252b3b_dirty-1.el7.x86_64.rpm
            lustre-debuginfo-2.9.58_57_gc252b3b_dirty-1.el7.x86_64.rpm

            The ext4 source can be found at

            /usr/src/debug/kernel-3.10.0-514.21.1.el7/linux-3.10.0-514.21.1.el7.x86_64/fs/ext4/*

            which is provided by kernel-debug-debuginfo-*. If you look at the LB_EXT4_SRC_DIR macro in lustre-build-ldiskfs.m4 you will see it does the right thing by default.
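To confirm that the ext4 sources are actually installed before building ldiskfs, the path quoted above can be derived from the kernel release string. This is a sketch only: ext4_src_dir is a hypothetical helper that mirrors the directory layout shown in the comment, and it makes no claim about what the LB_EXT4_SRC_DIR macro itself does.

```shell
#!/bin/sh
# ext4_src_dir DEBUG_ROOT RELEASE   (hypothetical helper, for illustration)
# Prints DEBUG_ROOT/kernel-<version>/linux-<release>/fs/ext4 if that
# directory exists, and returns non-zero otherwise. The layout is assumed
# from the path quoted in the comment above.
ext4_src_dir() {
    root=$1
    release=$2
    # Strip the trailing architecture suffix, e.g. ".x86_64".
    version=${release%.*}
    dir="$root/kernel-$version/linux-$release/fs/ext4"
    [ -d "$dir" ] && printf '%s\n' "$dir"
}

# Example use:
#   ext4_src_dir /usr/src/debug 3.10.0-514.21.1.el7.x86_64 \
#       || echo "ext4 sources not installed"
```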
            mdiep Minh Diep added a comment -

            James, what are the resulting rpms? Did you have kmod-lustre-osd-ldiskfs? I don't see where your steps include the ext4 sources.


            morrone Christopher Morrone (Inactive) added a comment -

            Oh, I was confused for a moment and thought that LU-8685 was the LU-684 blocker...but no, it is LU-8729 that blocks LU-684. Now I really don't understand why people thought LU-8685 was a blocker for this ticket.

            As far as I can tell we're still in the same spot as always: we need LU-684 finished. The 26220 patch is just a temporary hack to ease a packaging/distribution issue because LU-684 has not yet been completed. But 26220 will not let us close this ticket.

            simmonsja James A Simmons added a comment -

            Actually it's the fix for LU-8685 that has now landed for RHEL7. We no longer need the patch jbd2-fix-j_list_lock-unlock-3.10-rhel7.patch! We still need 26220 so osd-ldiskfs will work properly with patchless kernels. At LUG the question came up whether the lustre tree needs to be patched to build ldiskfs with just the standard RHEL kernel rpms. I tried it out, and building just works out of the box with patchless kernels. All I did was:

            rpm -ivh kernel-devel-3.10.0-514.21.1.el7.x86_64.rpm

            If you want to build ldiskfs just do:
            rpm -ivh kernel-debuginfo-common-x86_64-3.10.0-514.21.1.el7.x86_64.rpm

            cd ~/lustre-release
            sh ./autogen.sh
            ./configure --with-linux=/usr/src/kernels/3.10.0-514.21.1.el7.x86_64
            make rpms

            Then install the lustre rpms and reboot.

            I gave the above example for those people like me that have multiple entries in /usr/src/kernels.
            That is all that is needed now. We have arrived.
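When several kernel-devel trees are installed, the value passed to --with-linux has to name one of them explicitly. The sketch below shows one way to resolve and validate that directory before running configure; kernel_build_dir is a hypothetical helper, and defaulting to the running kernel via uname -r is an assumption, not something the build steps above require.

```shell
#!/bin/sh
# kernel_build_dir BASE [RELEASE]   (hypothetical helper, for illustration)
# Prints BASE/RELEASE if that directory exists; otherwise reports the
# missing tree on stderr and returns 1. RELEASE defaults to `uname -r`.
kernel_build_dir() {
    base=$1
    release=${2:-$(uname -r)}
    dir="$base/$release"
    if [ -d "$dir" ]; then
        printf '%s\n' "$dir"
    else
        echo "no kernel-devel tree at $dir" >&2
        return 1
    fi
}

# Example use with the release from the steps above:
#   linux_dir=$(kernel_build_dir /usr/src/kernels 3.10.0-514.21.1.el7.x86_64) \
#       && ./configure --with-linux="$linux_dir"
```

Failing early here avoids a longer configure run against a tree that is not installed.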

            morrone Christopher Morrone (Inactive) added a comment -

            In any event, hopefully James' revelation that the RHEL kernel is now shipping with the fix means that we can drop change 26220 and just finish this ticket?

            People

              green Oleg Drokin
              yong.fan nasf (Inactive)
              Votes: 0
              Watchers: 35
