Lustre / LU-5525

ASSERTION( new_lock->l_readers + new_lock->l_writers == 0 ) failed

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Critical
    • None
    • None
    • Environment: MDS node, Lustre 2.4.2-14chaos, ZFS OBD
    • 3
    • 15383

    Description

      After upgrading to Lustre 2.4.2-14chaos (see github.com/chaos/lustre), we soon hit the following assertion on one of our MDS nodes:

      mdt_handler.c:3652:mdt_intent_lock_replace()) ASSERTION( new_lock->l_readers + new_lock->l_writers == 0 ) failed

      Perhaps most significantly, this tag of our Lustre tree includes the patch entitled:

      LU-4584 mdt: ensure orig lock is found in hash upon resend

      James Simmons reported this assertion when he tested the LU-4584 patch, but Bruno evaluated the assertion as unrelated to the patch.

      Whether it is related or not, we need to fix the problem.

      Attachments

      Issue Links

      Activity

            green Oleg Drokin added a comment -

            I now refreshed http://review.whamcloud.com/#/c/9488/ to a more correct version.
            Also updated open_by_fid part to be aware of resends.

            green Oleg Drokin added a comment -

            Ok. I think I got to the root of this (using my patch http://review.whamcloud.com/11842 to make every request trigger a resend, which is great for testing this code path: it is rarely hit in our testing, but gets hit quite a bit on larger systems now that LLNL added the patch to shrink client-supplied buffers).

            First of all, the code as-is should have failed with this assert on the very first resend, but did not, due to a bug. That bug was fixed in our tree, but not in 2.4 (LU-3428, http://review.whamcloud.com/6511), so you need this patch first.

            The assertion itself is due to a logic flaw. With that in mind, it becomes clear that the LU-4584 patch you are carrying is wrong, in particular this part of it:

            --- a/lustre/mdt/mdt_open.c
            +++ b/lustre/mdt/mdt_open.c
            @@ -1710,6 +1710,10 @@ int mdt_reint_open(struct mdt_thread_info *info, struct m
                            /* the open lock might already be gotten in
                             * mdt_intent_fixup_resent */
                            LASSERT(lustre_msg_get_flags(req->rq_reqmsg) & MSG_RESENT);
            +
            +               /* take again a ref on old lock found to be resent */
            +               ldlm_lock_addref(&lhc->mlh_reg_lh, lhc->mlh_reg_mode);
            +
                            if (create_flags & MDS_OPEN_LOCK)
                                    mdt_set_disposition(info, ldlm_rep, DISP_OPEN_LOCK);
                    } else {
            

            The reason for that is because the mdt_intent_lock_replace assumes the lock has already been "replaced" into the client export, so it does not need any of those references - it cannot go away because it's already "owned" by the (not yet aware) client.

            With this part removed (and the uninitialized-variable fix from above), I am no longer hitting the assertion or the terrible lock deadlocks on resend, even though my racer testing on your tree cannot complete because other issues long since fixed in the master tree, such as LU-4725 and LU-5144, get in the way.

            At least this class of problems (related to LU-2827) should be extinguished for you now, and should hold you until you are ready to move to a newer release that has the more comprehensive LU-2827 patch with all of its follow-up fixes.


            morrone Christopher Morrone (Inactive) added a comment -

            Looks like a bug. You can work around it by using --with-downstream-release:

            $ sh autogen.sh
            $ ./configure --with-downstream-release=14chaos --disable-liblustre
            $ make rpms

            If you are trying to reproduce what we see, note that we use zfs, not ldiskfs.

            Also, is reproduction likely to be an effective approach to this problem? What do we know about the code, and how do we intend to reproduce the issue?

            bfaccini Bruno Faccini (Inactive) added a comment -

            I first cloned the chaos git tree, then checked out the 2.4.2-14chaos tag. I then applied our usual procedure, described at "https://wiki.hpdd.intel.com/pages/viewpage.action?pageId=8126821", to generate an ldiskfs-based (sorry, I forgot to mention this) server build, and got these errors during the "make rpms" step:

            ...............

            Wrote: /home/bruno.faccini/kernel/rpmbuild/SRPMS/lustre-iokit-1.4.0-1.src.rpm
            make[1]: Leaving directory `/home/bruno.faccini/lustre-chaos/lustre-iokit'
            rpmbuilddir=`mktemp -t -d lustre-build-$USER-XXXXXXXX`; \
                    make  \
                            rpmbuilddir="$rpmbuilddir" \
                            rpm-local || exit 1; \
                    rpmbuild \
                            --define "_tmppath $rpmbuilddir/TMP" \
                            --define "_topdir $rpmbuilddir" \
                            --define "build_src_rpm 1" \
                            --define "dist %{nil}" \
                    -ts lustre-2.4.2.tar.gz || exit 1; \
                    cp $rpmbuilddir/SRPMS/*.src.rpm . || exit 1; \
                    rm -f -R $rpmbuilddir
            make[1]: Entering directory `/home/bruno.faccini/lustre-chaos'
            make[1]: Leaving directory `/home/bruno.faccini/lustre-chaos'
            error: Macro %rel_down has empty body
            error: Macro %rel_build has empty body
            error: line 96: Empty tag: Release:
            make: *** [srpm] Error 1

            So I suspect I am not using the right procedure for your tree, since we don't have such an "rpmbuilddir" variable/macro in our build infrastructure, which plays with/changes the "_topdir" setting ...
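            The macro errors have the shape of an unset downstream-release string. Below is a plain-shell sketch of that failure mode, under the assumption (not verified against the chaos spec file) that the Release: tag is composed from the %rel_down macro named in the error output; it lines up with the --with-downstream-release workaround Christopher suggested above.

```shell
# Illustrative sketch only, NOT the actual spec logic: models how an
# unset downstream-release string leaves the RPM Release: tag empty.

rel_down=""                    # configure ran without --with-downstream-release
release="$rel_down"            # models: Release: %rel_down
if [ -z "$release" ]; then
    # rpmbuild's complaint for an empty Release: tag
    echo "Empty tag: Release:"
fi

rel_down="14chaos"             # ./configure --with-downstream-release=14chaos
release="$rel_down"
echo "Release: $release"       # prints: Release: 14chaos
```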

            morrone Christopher Morrone (Inactive) added a comment -

            But now I encounter issues during the make step. Chris, do you use special tricks to build from your source tree?

            Maybe? Could you be more specific about the commands you issued and the problem that you saw? If you had trouble in the liblustre part, try adding --disable-liblustre to the configure command line.

            bfaccini Bruno Faccini (Inactive) added a comment -

            Chris, could you please answer my previous request and detail any specific procedure to build from the LLNL source tree/git?

            bfaccini Bruno Faccini (Inactive) added a comment -

            James, it's too bad that at the time of this crash you ran with a Lustre debug mask of only "D_IOCTL+D_NETERROR+D_WARNING+D_ERROR+D_EMERG+D_HA+D_CONFIG+D_CONSOLE", which did not contain at least "D_RPCTRACE+D_DLMTRACE" in addition. That would have greatly helped in navigating the crash dump.

            bfaccini Bruno Faccini (Inactive) added a comment -

            But now I encounter issues during the make step. Chris, do you use special tricks to build from your source tree?

            bfaccini Bruno Faccini (Inactive) added a comment -

            Ok, found that the 2.6.32-431.3.1 kernel is compatible with the 2.4.2-14chaos kernel patches.

            morrone Christopher Morrone (Inactive) added a comment -

            I wanted to ask you if you had already tried to run your LU-4584 reproducer against it?

            Yes. The LU-4584 patch did appear to help with the evictions on unlink.

            People

              bfaccini Bruno Faccini (Inactive)
              morrone Christopher Morrone (Inactive)
              Votes: 0
              Watchers: 7
