LBUG mdt_reconstruct_open()) ASSERTION( (!(rc < 0) || (lustre_msg_get_transno(req->rq_repmsg) == 0)) )

Details

    • Bug
    • Resolution: Fixed
    • Major
    • None
    • Lustre 2.1.4
    • 3
    • 7064

    Description

      This issue has already been hit on Lustre 2.2 (see LU-1702); the traces are exactly the same as for LU-1702.

      It's been hit four consecutive times so it seems quite easy to reproduce.

      2013-03-06 16:05:01 LustreError: 31751:0:(mdt_open.c:1023:mdt_reconstruct_open()) ASSERTION( (!(rc < 0) || (lustre_msg_get_transno(req->rq_repmsg) == 0)) ) failed:
      2013-03-06 16:05:01 LustreError: 31751:0:(mdt_open.c:1023:mdt_reconstruct_open()) LBUG
      2013-03-06 16:05:01 Pid: 31751, comm: mdt_145
      2013-03-06 16:05:01
      2013-03-06 16:05:01 Call Trace:
      2013-03-06 16:05:01 [<ffffffffa04a27f5>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      2013-03-06 16:05:01 [<ffffffffa04a2e07>] lbug_with_loc+0x47/0xb0 [libcfs]
      2013-03-06 16:05:01 [<ffffffffa0d9ed87>] mdt_reconstruct_open+0x7c7/0xa80 [mdt]
      2013-03-06 16:05:01 [<ffffffffa0d908c5>] mdt_reconstruct+0x45/0x120 [mdt]
      2013-03-06 16:05:01 [<ffffffffa0d7d099>] mdt_reint_internal+0x709/0x8e0 [mdt]
      2013-03-06 16:05:01 [<ffffffffa0d7d53d>] mdt_intent_reint+0x1ed/0x500 [mdt]
      2013-03-06 16:05:01 [<ffffffffa0d7bc09>] mdt_intent_policy+0x379/0x690 [mdt]
      2013-03-06 16:05:01 [<ffffffffa06ca3c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc]
      2013-03-06 16:05:01 [<ffffffffa06f03dd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc]
      2013-03-06 16:05:01 [<ffffffffa0d7c586>] mdt_enqueue+0x46/0x130 [mdt]
      2013-03-06 16:05:01 [<ffffffffa0d71762>] mdt_handle_common+0x932/0x1750 [mdt]
      2013-03-06 16:05:01 [<ffffffffa0d72655>] mdt_regular_handle+0x15/0x20 [mdt]
      2013-03-06 16:05:01 [<ffffffffa071f4f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc]
      2013-03-06 16:05:01 [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320
      2013-03-06 16:05:01 [<ffffffffa071e7e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
      2013-03-06 16:05:01 [<ffffffff8100412a>] child_rip+0xa/0x20
      2013-03-06 16:05:01 [<ffffffffa071e7e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
      2013-03-06 16:05:01 [<ffffffffa071e7e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
      2013-03-06 16:05:01 [<ffffffff81004120>] ? child_rip+0x0/0x20

      In this crash, the file that triggered the LBUG is a file created by MPI-IO.

      The onsite support team made the following analysis:

      The return status (rc) is -EREMOTE (-66) and the disposition mask appears to be
      DISP_IT_EXECD | DISP_LOOKUP_EXECD | DISP_LOOKUP_POS | DISP_OPEN_OPEN | DISP_OPEN_LOCK.
      Based on this information, it is likely that, just before the LBUG, the MDS ran mdt_reint_open() and it returned -EREMOTE.

      So mdt_reint_open() would return -EREMOTE, and mdt_reconstruct_open() does not take
      into account that, when -EREMOTE is returned, the reply message transno is not cleared.
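
      In the 2.1 code this check is an ergo()-style assertion, where ergo(a, b) expands to (!(a) || (b)), i.e. exactly the condition printed in the LBUG. The sketch below is only a simplified illustration of the analysis above; it is not the actual LU-2927 fix, and the exact origin of rc inside mdt_reconstruct_open() is omitted:

      /* Simplified sketch of the failing check in mdt_reconstruct_open()
       * (Lustre 2.1, mdt_open.c:1023): a negative saved result must come
       * with a zero transno in the reconstructed reply. */
      LASSERT(ergo(rc < 0, lustre_msg_get_transno(req->rq_repmsg) == 0));

      /* Hypothetical guard matching the analysis above (NOT the landed fix,
       * which came via LU-2927 / http://review.whamcloud.com/5694): when the
       * original open returned -EREMOTE the reply transno was never cleared,
       * so a reconstructed -EREMOTE reply would need to be tolerated, for
       * example by zeroing the transno instead of asserting. */
      if (rc == -EREMOTE && lustre_msg_get_transno(req->rq_repmsg) != 0)
              lustre_msg_set_transno(req->rq_repmsg, 0);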

      In the attached file you can find the struct mdt_thread_info data that triggered
      the LBUG, as well as the req data (struct ptlrpc_request)
      and the lcd data (struct lsd_client_data).

      Attachments

        Issue Links

          Activity

            [LU-2943] LBUG mdt_reconstruct_open()) ASSERTION( (!(rc < 0) || (lustre_msg_get_transno(req->rq_repmsg) == 0)) )

            What is the current status of the latest patch?

            louveta Alexandre Louvet (Inactive) added a comment

            New version (patch-set #3) of the b2_1 port/patch http://review.whamcloud.com/5954 has been submitted and successfully passed auto-tests.

            bfaccini Bruno Faccini (Inactive) added a comment

            Hmm, sorry about that; due to your report I am currently looking again at the original/master patch from LU-2927. It seems that I missed some case in my first back-port, and I am testing a new version. Will keep you updated soon.

            bfaccini Bruno Faccini (Inactive) added a comment

            Bruno, sorry to say that just after installing the patch, we got a lot of crashes on 3 large clusters.

            The LBUG message is:
            (mdt_handler.c:3411:mdt_intent_reint()) ASSERTION( lustre_handle_is_used(&lhc->mlh_reg_lh) ) failed:
            (mdt_handler.c:3411:mdt_intent_reint()) LBUG

            followed by this stack:
            Call Trace:
            [<ffffffffa041a7f5>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
            [<ffffffffa041ae07>] lbug_with_loc+0x47/0xb0 [libcfs]
            [<ffffffffa0c0a841>] mdt_intent_reint+0x4f1/0x530 [mdt]
            [<ffffffffa0c08c09>] mdt_intent_policy+0x379/0x690 [mdt]
            [<ffffffffa06653c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc]
            [<ffffffffa068b3dd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc]
            [<ffffffffa0c09586>] mdt_enqueue+0x46/0x130 [mdt]
            [<ffffffffa0bfe762>] mdt_handle_common+0x932/0x1750 [mdt]
            [<ffffffffa0bff655>] mdt_regular_handle+0x15/0x20 [mdt]
            [<ffffffffa06ba4f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc]
            [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320
            [<ffffffffa06b97e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
            [<ffffffff8100412a>] child_rip+0xa/0x20
            [<ffffffffa06b97e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
            [<ffffffffa06b97e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
            [<ffffffff81004120>] ? child_rip+0x0/0x20

            Kernel panic - not syncing: LBUG
            Pid: 60412, comm: mdt_440 Not tainted 2.6.32-220.23.1.bl6.Bull.28.8.x86_64 0000001
            Call Trace:
            [<ffffffff81484650>] ? panic+0x78/0x143
            [<ffffffffa041ae5b>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
            [<ffffffffa0c0a841>] ? mdt_intent_reint+0x4f1/0x530 [mdt]
            [<ffffffffa0c08c09>] ? mdt_intent_policy+0x379/0x690 [mdt]
            [<ffffffffa06653c1>] ? ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc]
            [<ffffffffa068b3dd>] ? ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc]
            [<ffffffffa0c09586>] ? mdt_enqueue+0x46/0x130 [mdt]
            [<ffffffffa0bfe762>] ? mdt_handle_common+0x932/0x1750 [mdt]
            [<ffffffffa0bff655>] ? mdt_regular_handle+0x15/0x20 [mdt]
            [<ffffffffa06ba4f6>] ? ptlrpc_main+0xd16/0x1a80 [ptlrpc]
            [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320
            [<ffffffffa06b97e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
            [<ffffffff8100412a>] ? child_rip+0xa/0x20
            [<ffffffffa06b97e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
            [<ffffffffa06b97e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
            [<ffffffff81004120>] ? child_rip+0x0/0x20

            That was observed on systems running Lustre 2.1.5 plus the following patches:

            • ORNL-22 general ptlrpcd threads pool support
            • LU-1144 implement a NUMA aware ptlrpcd binding policy
            • LU-1110 MDS Oops in osd_xattr_get() during file open by FID
            • LU-2613 too much unreclaimable slab space
            • LU-2624 ptlrpc fix thread stop
            • LU-2683 client deadlock in cl_lock_mutex_get
            • LU-2943 LBUG in mdt_reconstruct_open()

            I agree this is not the same context as before, but it is located exactly where the patch modifies the source code.
            The patch was removed on the 3 affected clusters and, since then, stability is back.

            Alex.
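
            For context, the assertion that fired here only checks that the DLM lock handle carried by the intent reply was actually filled in. Below is a minimal sketch of that check, assuming the usual lustre_handle definition from the Lustre headers (not a quote of the mdt_intent_reint() code):

            /* A lustre_handle is "used" once ldlm has filled in its cookie,
             * i.e. a lock was actually enqueued.  mdt_intent_reint() asserts
             * this on lhc->mlh_reg_lh before packing the intent reply, so
             * hitting the LBUG means the reint path returned without
             * providing the DLM lock the reply is expected to carry. */
            struct lustre_handle {
                    __u64 cookie;
            };

            static inline int lustre_handle_is_used(struct lustre_handle *lh)
            {
                    return lh->cookie != 0;
            }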

            louveta Alexandre Louvet (Inactive) added a comment

            Bruno,
            I agree with your b2_1 patch. I tried to backport the b2_4 fix from LU-2927, but there are quite a lot of changes around mdt_open_by_fid_lock(), and I was afraid of missing something.

            patrick.valentin Patrick Valentin (Inactive) added a comment

            The b2_1 port/patch http://review.whamcloud.com/5954 has been submitted and successfully passed auto-tests.

            bfaccini Bruno Faccini (Inactive) added a comment

            Seb,
            Just to share and get this done as quickly as possible: did you experience regressions in your porting attempts? If so, of what kind?
            I have one build in progress and will let you know about testing progress soon.

            bfaccini Bruno Faccini (Inactive) added a comment

            Yes, I am working on it and, as you pointed out, it is not an easy one; I will keep you updated.

            bfaccini Bruno Faccini (Inactive) added a comment

            Hi Bruno,

            We tried to backport the fix from LU-2927 to b2_1, but we failed. Have you been able to make progress on this?
            For the record, this issue is currently the most disruptive one at CEA.

            TIA,
            Sebastien.

            sebastien.buisson Sebastien Buisson (Inactive) added a comment

            I have found the client node that is most likely the one that sent the failing request,
            and in its log we can see the stack of the request:

            2013-03-06 16:03:04 INFO: task %%A197:17742 blocked for more than 120 seconds.
            2013-03-06 16:03:04 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
            2013-03-06 16:03:04 %%A197 D 0000000000000001 0 17742 17718 0x00000000
            2013-03-06 16:03:04 ffff8806f08eb658 0000000000000086 0000000000000000 ffffffffa0686d24
            2013-03-06 16:03:04 0000000000000008 ffff88080f3e3c00 ffff8806f08eb628 ffffffffa06872e7
            2013-03-06 16:03:04 ffff88080775da70 ffff8806f08ebfd8 000000000000db00 ffff88080775da70
            2013-03-06 16:03:04 Call Trace:
            2013-03-06 16:03:04 [<ffffffff81485ab5>] schedule_timeout+0x205/0x2d0
            2013-03-06 16:03:04 [<ffffffff81486a25>] __down+0x75/0xc0
            2013-03-06 16:03:04 [<ffffffff81081611>] down+0x41/0x50
            2013-03-06 16:03:04 [<ffffffffa08e4ae4>] mdc_enqueue+0x4e4/0x13a0 [mdc]
            2013-03-06 16:03:04 [<ffffffffa08e5bc0>] mdc_intent_lock+0x220/0x630 [mdc]
            2013-03-06 16:03:04 [<ffffffffa0b61920>] lmv_intent_open+0x2d0/0x10a0 [lmv]
            2013-03-06 16:03:04 [<ffffffffa0b62980>] lmv_intent_lock+0x290/0x360 [lmv]
            2013-03-06 16:03:04 [<ffffffffa0a2cbd8>] ll_revalidate_it+0x2b8/0x15b0 [lustre]
            2013-03-06 16:03:04 [<ffffffffa0a2e003>] ll_revalidate_nd+0x133/0x3a0 [lustre]
            2013-03-06 16:03:04 [<ffffffff81171612>] do_lookup+0x62/0x1e0
            2013-03-06 16:03:04 [<ffffffff81172b4a>] path_walk+0x6a/0xe0
            2013-03-06 16:03:04 [<ffffffff81172d1b>] do_path_lookup+0x5b/0xa0
            2013-03-06 16:03:04 [<ffffffff81173c4b>] do_filp_open+0xfb/0xd00
            2013-03-06 16:03:04 [<ffffffff811607b9>] do_sys_open+0x69/0x140
            2013-03-06 16:03:04 [<ffffffff811608d0>] sys_open+0x20/0x30
            2013-03-06 16:03:04 [<ffffffff810030f2>] system_call_fastpath+0x16/0x1b

            I have also attached a file with the complete client log.

            apercher Antoine Percher added a comment

            The master/2.4 fix landed from LU-2927 is at http://review.whamcloud.com/#change,5694; I will evaluate the b2_1 back-port.

            bfaccini Bruno Faccini (Inactive) added a comment

            People

              bfaccini Bruno Faccini (Inactive)
              dmoreno Diego Moreno (Inactive)
              Votes: 2
              Watchers: 12
