Details

    • Technical task
    • Resolution: Won't Fix
    • Major
    • None
    • Lustre 2.5.0
    • None
    • 17,895
    • 8423

    Description

      In order to submit the raid5-mmp-unplug-dev patch upstream, this needs to be updated for the latest kernel. Unfortunately, the affected code seems to have changed since the patch was written, so I'm not sure whether a simple "best guess update" of the patch will be correct.

      I'll attach my "best guess" patch, but it needs to be verified by someone who actually understands the MD RAID code better.

          Activity

            [LU-3406] Submit raid5-mmp-unplug-dev patch upstream

            adilger Andreas Dilger added a comment - I think that makes sense. Alternatively, the patch could just be removed from the patch series files and left in kernel_patches in case anyone wants to use it. It would be easier to find there than in the contrib directory.

            simmonsja James A Simmons added a comment - Since this will not be fixed, I suggest we delete the patches we currently carry. Another option is to keep the patches in contrib until someone wants to work on a version that can be accepted upstream, especially since upstream has shown potential corruption with our current patch.

            simmonsja James A Simmons added a comment - I submitted a similar patch to dm-devel@redhat.com. You can see the message at https://www.redhat.com/archives/dm-devel/2014-November/msg00004.html. I would like to see what the feedback is so we can develop an approach that is acceptable upstream and then backport it to supported distros.
            bfaccini Bruno Faccini (Inactive) added a comment (edited)

            James,
            I apologize for not having updated this ticket, or the associated change #6652, for months now, although I have been assigned to higher-priority tasks since then. What I really neglected was to give a detailed update on where I stood on this, so I will try to do it now.

            After I pushed patch-set #1 of LU-6652, I ran local tests on an ad-hoc HA platform to verify the patch's functionality and correct behavior, and discovered that the raid456 module generated no debug trace on MMP block reads, although it did on MMP block writes. Judging from both the ext4/ldiskfs and md/raid5 source code this seems impossible, yet iostat/blktrace monitoring confirmed it.
            This is why the subsequent patch-sets 2-5 (I don't remember why I removed the "fortestonly" param ...) only add more debug traces to help understand what is going on.

            BTW, at that time, before giving up due to higher priorities, I also tried to verify the behavior of the current/original patch, and it exhibited the same unexpected results.

            So that is where I was, and still am, on this. If you pursue rebasing my patch, I strongly suggest verifying the patch's functionality and behavior again at the lowest level.


            simmonsja James A Simmons added a comment - Patch http://review.whamcloud.com/6652 has been updated to the latest master. If it proves stable, we should push it upstream for feedback and see what the final result is.

            bfaccini Bruno Faccini (Inactive) added a comment - Oops, I forgot to mention that the patch is at http://review.whamcloud.com/6652.
            Also, auto-tests never started for patch-set #1, for unexplained reasons, so I submitted patch-set #2 with less restrictive test parameters just in case.
            Meanwhile, I am working on a local test platform and use cases to ensure that MMP works fine over SW-RAID.

            bfaccini Bruno Faccini (Inactive) added a comment - I spent some time digging in the latest (3.9.4) kernel sources and can confirm there is still no way to bypass the MD/RAID5 stripe cache on a read request.
            I am first testing a patch (which introduces a new flag) against the kernel version currently supported for Lustre servers; it will then be exercised under HA/MMP tests.
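
A deliberately simplified sketch of why a read that cannot bypass a per-node cache is a problem for MMP. This is a toy model in Python, not kernel code: `SharedDisk`, `CachedArray`, `bypass_cache`, and the block number are all hypothetical names chosen for illustration, and the real MD/RAID5 stripe cache is considerably more involved. The only assumption carried over from the discussion is that a read may be satisfied from the cache without reaching the shared disk.

```python
class SharedDisk:
    """Backing store visible to every node (stands in for the shared RAID members)."""

    def __init__(self):
        self.blocks = {}

    def read(self, blk):
        return self.blocks.get(blk, 0)

    def write(self, blk, value):
        self.blocks[blk] = value


class CachedArray:
    """One node's view of the array, with a toy per-node read cache."""

    def __init__(self, disk):
        self.disk = disk
        self.cache = {}

    def read(self, blk, bypass_cache=False):
        # Hypothetical flag: when set, ignore any cached copy and go to disk.
        if not bypass_cache and blk in self.cache:
            return self.cache[blk]
        value = self.disk.read(blk)
        self.cache[blk] = value
        return value

    def write(self, blk, value):
        # Write-through, so the other node could see the update on disk.
        self.cache[blk] = value
        self.disk.write(blk, value)


MMP_BLOCK = 7  # arbitrary block number for the toy MMP sequence counter

disk = SharedDisk()
node_a = CachedArray(disk)
node_b = CachedArray(disk)

node_a.write(MMP_BLOCK, 1)                          # node A owns the filesystem
first = node_b.read(MMP_BLOCK)                      # node B samples the counter
node_a.write(MMP_BLOCK, 2)                          # node A bumps the counter
stale = node_b.read(MMP_BLOCK)                      # served from node B's cache
fresh = node_b.read(MMP_BLOCK, bypass_cache=True)   # forced to the shared disk

print(first, stale, fresh)
```

In this model node B only notices that node A is still alive when its read is forced to the shared disk, which is the behavior a cache-bypass flag would be meant to provide.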

            bfaccini Bruno Faccini (Inactive) added a comment - BTW, with the currently supported kernels and the current patch version, my earlier comment applies to md_raid5_unplug_device() rather than md_wakeup_thread(). That call was already in the original patch submitted for BZ #17895 by Jinshan; maybe he can help me confirm whether it is necessary.

            bfaccini Bruno Faccini (Inactive) added a comment - Ok, so let's go for a new flag.
            But looking at the MD/RAID5 source code, I am now concerned about whether the md_wakeup_thread() call at the end of make_request() is really needed when the flag is set. It seems to me that it should already have been called within release_stripe[_plug]() and its underlying routines. The extra call would then have no effect, and could be considered useless and costly.

            adilger Andreas Dilger added a comment - I don't think _META is used less than _SYNC. It is often used for metadata I/O requests to increase their priority in the scheduler. I think it is probably best to try a new REQ flag and see what the upstream MD maintainer thinks. This is a generic bug with ext4 MMP, not Lustre-specific, so some kind of solution should be possible.
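
The trade-off between reusing an existing flag and adding a new one can be sketched with a small Python model. Everything here is illustrative: the flag values are arbitrary, `MMP_DIRECT` is a hypothetical name for the proposed new flag, and the flag combinations assigned to each request type are assumptions rather than a description of what ext4 or MD actually submit.

```python
from enum import IntFlag


class ReqFlag(IntFlag):
    # Arbitrary bit values; only the names mirror the flags discussed above.
    SYNC = 1
    META = 2
    MMP_DIRECT = 4  # hypothetical dedicated flag for MMP block I/O


# Assumed flag combinations for a few request types (illustration only).
requests = {
    "mmp_read": ReqFlag.SYNC | ReqFlag.META | ReqFlag.MMP_DIRECT,
    "journal_commit": ReqFlag.SYNC | ReqFlag.META,
    "inode_table_read": ReqFlag.META,
    "data_read": ReqFlag(0),
}


def bypassed(trigger):
    """Return the request types that would skip the cache if `trigger` selected them."""
    return sorted(name for name, flags in requests.items() if flags & trigger)


print("keyed on META:      ", bypassed(ReqFlag.META))
print("keyed on MMP_DIRECT:", bypassed(ReqFlag.MMP_DIRECT))
```

Under these assumptions, keying the bypass on a widely used flag would also catch ordinary metadata I/O, while a dedicated flag selects only the MMP request.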

            People

              bfaccini Bruno Faccini (Inactive)
              adilger Andreas Dilger
              Votes: 0
              Watchers: 7
