Details

    • Type: Bug
    • Resolution: Not a Bug
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.9.0
    • Labels: None
    • Environment: 3.10.0-514.10.2.el7_lustre.x86_64, lustre-2.9.0_srcc6-1.el7.centos.x86_64
    • Severity: 3

    Description

      Our MDT was stuck or barely usable twice in a row lately, and the second time we took a crash dump, which shows that several threads were blocked in lod_qos_prep_create...

      PID: 291558  TASK: ffff88203c7b2f10  CPU: 9   COMMAND: "mdt01_030"
       #0 [ffff881a157f7588] __schedule at ffffffff8168b6a5
       #1 [ffff881a157f75f0] schedule at ffffffff8168bcf9
       #2 [ffff881a157f7600] rwsem_down_write_failed at ffffffff8168d4a5
       #3 [ffff881a157f7688] call_rwsem_down_write_failed at ffffffff81327067
       #4 [ffff881a157f76d0] down_write at ffffffff8168aebd
       #5 [ffff881a157f76e8] lod_qos_prep_create at ffffffffa124031c [lod]
       #6 [ffff881a157f77a8] lod_declare_striped_object at ffffffffa1239a8c [lod]
       #7 [ffff881a157f77f0] lod_declare_object_create at ffffffffa123b0f1 [lod]
       #8 [ffff881a157f7838] mdd_declare_object_create_internal at ffffffffa129d21f [mdd]
       #9 [ffff881a157f7880] mdd_declare_create at ffffffffa1294133 [mdd]
      #10 [ffff881a157f78f0] mdd_create at ffffffffa1295689 [mdd]
      #11 [ffff881a157f79e8] mdt_reint_open at ffffffffa1176f05 [mdt]
      #12 [ffff881a157f7ad8] mdt_reint_rec at ffffffffa116c4a0 [mdt]
      #13 [ffff881a157f7b00] mdt_reint_internal at ffffffffa114edc2 [mdt]
      #14 [ffff881a157f7b38] mdt_intent_reint at ffffffffa114f322 [mdt]
      #15 [ffff881a157f7b78] mdt_intent_policy at ffffffffa1159b9c [mdt]
      #16 [ffff881a157f7bd0] ldlm_lock_enqueue at ffffffffa0b461e7 [ptlrpc]
      #17 [ffff881a157f7c28] ldlm_handle_enqueue0 at ffffffffa0b6f3a3 [ptlrpc]
      #18 [ffff881a157f7cb8] tgt_enqueue at ffffffffa0befe12 [ptlrpc]
      #19 [ffff881a157f7cd8] tgt_request_handle at ffffffffa0bf4275 [ptlrpc]
      #20 [ffff881a157f7d20] ptlrpc_server_handle_request at ffffffffa0ba01fb [ptlrpc]
      #21 [ffff881a157f7de8] ptlrpc_main at ffffffffa0ba42b0 [ptlrpc]
      #22 [ffff881a157f7ec8] kthread at ffffffff810b06ff
      #23 [ffff881a157f7f50] ret_from_fork at ffffffff81696b98

      The disk array (from Dell) that we use for the MDT doesn't report any issue. The load was not particularly high. `kmem -i` does report 76 GB of free memory (60% of TOTAL MEM).

      Attaching the output of `foreach bt`; maybe somebody will have a clue.
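
      For reference, a minimal crash(8) session that produces this kind of output could look like the sketch below; the vmlinux path and output file name are illustrative, and PID 291558 is the stuck mdt01_030 thread from the trace above:

       # open the dump with the matching kernel debuginfo (path illustrative)
       crash /usr/lib/debug/lib/modules/3.10.0-514.10.2.el7_lustre.x86_64/vmlinux vmcore

       # memory usage summary (the free / total figures quoted above)
       crash> kmem -i
       # backtrace of every task, redirected to a file (this is the attached foreach_bt output)
       crash> foreach bt > foreach_bt.txt
       # full frame data for the blocked mdt01_030 thread
       crash> bt -f 291558
       # inspect the rw_semaphore it is waiting on; <addr> would be taken from the down_write frame
       crash> struct rw_semaphore <addr>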

       

      Each time, failing over the MDT resumed operations, but the recovery was a bit long, with a few evictions.

      Lustre: oak-MDT0000: Recovery over after 13:39, of 1144 clients 1134 recovered and 10 were evicted.
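
      For reference, the progress of such a recovery can be watched on the MDS with the standard recovery_status parameter, roughly as sketched below (target name taken from the log line above):

       # shows status (RECOVERING/COMPLETE), completed_clients, evicted_clients and time_remaining
       lctl get_param mdt.oak-MDT0000.recovery_status
       # or poll it until recovery completes
       watch -n 5 'lctl get_param mdt.oak-MDT0000.recovery_status'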
      
      

      Thanks!
      Stephane

      Attachments

        1. oak-io1-s1.lustre.log (1.33 MB, Stephane Thiell)
        2. oak-io1-s2.lustre.log (44 kB, Stephane Thiell)
        3. oak-md1-s1.foreach_bt.txt (429 kB, Stephane Thiell)
        4. oak-md1-s1.lustre.log (274 kB, Stephane Thiell)

        Activity

          [LU-9688] Stuck MDT in lod_qos_prep_create
          niu Niu Yawei (Inactive) made changes -
          Resolution New: Not a Bug [ 6 ]
          Status Original: Open [ 1 ] New: Resolved [ 5 ]

          niu Niu Yawei (Inactive) added a comment -

          Bad disk, not a Lustre issue.

          niu Niu Yawei (Inactive) added a comment -

          Hi Stephane,

          That's good news. If an OST fails to create objects due to a backend storage problem, object creation on the MDT will be blocked; there isn't much we can do in that situation except wait for the storage to recover. Can we close this ticket now? Thanks.
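
          For readers hitting a similar hang: whether the MDT is currently blocked on precreate for some OST can be checked from the MDS, roughly as sketched below (parameter names as in recent Lustre releases, so they may differ slightly in 2.9):

           # one osp device per OST; 0 means precreate is healthy, a negative errno means it is blocked
           lctl get_param osp.*.prealloc_status
           # remaining headroom of precreated objects per OST
           lctl get_param osp.*.prealloc_next_id osp.*.prealloc_last_id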

          sthiell Stephane Thiell added a comment -

          Hi Niu,

          Thanks for looking at this. After making sure that the OSTs were OK and also failing over the MDT, the problem did not appear again. I'm just a bit concerned that the MDT couldn't recover by itself in that specific case.

          Thanks,

          Stephane

          niu Niu Yawei (Inactive) added a comment -

          Yes, the error message you mentioned is related to this issue: because precreate failed instantly, all create threads were blocked waiting for objects to be created.
          I checked the OST log and found that some md raid threads were hung in md_update_sb() at that time; I think that could be the root cause. Has this problem disappeared?
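
          For completeness, the health of the md arrays on the OSS can be checked with the usual tools; the device name below is illustrative:

           # overall raid state and any resync/rebuild in progress
           cat /proc/mdstat
           # per-array detail (device name illustrative)
           mdadm --detail /dev/md0
           # hung kernel threads such as the md_update_sb() one show up as hung-task warnings
           dmesg | grep -i 'blocked for more than'
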
          pjones Peter Jones made changes -
          Assignee Original: WC Triage [ wc-triage ] New: Niu Yawei [ niu ]
          pjones Peter Jones added a comment -

          Niu

          Can you please advise on this one?

          Thanks

          Peter


          sthiell Stephane Thiell added a comment -

          Hi Alex,

          Thanks for the quick reply. That makes sense, because we had some issues with the OSS oak-io1-s1: it became unresponsive, we rebooted it on Jun 19 11:49:32 (you can see that in the logs), and the OSTs were re-mounted at ~ Jun 19 12:00. Sorry I didn't mention that in the original ticket. So, I am attaching logs of the OSTs (OSS oak-io1-s1 and oak-io1-s2) and the MDT (which was mounted on MDS oak-md1-s1). While preparing the logs, I noticed that on the MDT (file oak-md1-s1.lustre.log) there are errors about object precreation on one OST; could that be the issue?

          Jun 19 11:47:23 oak-md1-s1 kernel: LustreError: 191781:0:(osp_precreate.c:615:osp_precreate_send()) oak-OST0016-osc-MDT0000: can't precreate: rc = -11
          Jun 19 11:47:23 oak-md1-s1 kernel: LustreError: 191781:0:(osp_precreate.c:1243:osp_precreate_thread()) oak-OST0016-osc-MDT0000: cannot precreate objects: rc = -11

          Notes:
          o2ib5 is the LNet network of the servers and a few clients
          o2ib, o2ib3, o2ib4 are client-only networks

          Thanks,

          Stephane

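          As a side note on the log lines above: rc = -11 is -EAGAIN, so the OSP precreate thread keeps retrying. The state of that particular OST as seen by the MDT can be checked roughly as below (parameter name assumed from recent Lustre releases):

           # precreate state of the OST that logged the errors above
           lctl get_param osp.oak-OST0016-osc-MDT0000.prealloc_status
           # device list on the MDS: the osp device for OST0016 should show UP
           lctl dl | grep OST0016
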
          sthiell Stephane Thiell made changes -
          Attachment New: oak-io1-s2.lustre.log [ 27069 ]
          sthiell Stephane Thiell made changes -
          Attachment New: oak-io1-s1.lustre.log [ 27068 ]

          People

            Assignee: niu Niu Yawei (Inactive)
            Reporter: sthiell Stephane Thiell
            Votes: 0
            Watchers: 6
