Lustre / LU-11091

MDS threads stuck in lod_qos_prep_create after OSS crash

Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.7.0
    • Labels: None
    • Environment: lustre2.7.3 fe
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

    Description

      An OST disk issue required a reboot of the OSS. This caused MDT threads to hang in lod_qos_prep_create. The MDT required a reboot about 6 hours after the OST recovered.

      OST disk errors

      Jun 18 09:56:37 nbp2-oss5 kernel: sd 16:0:0:7: [sdcu]  Sense Key : Recovered Error [current]
      Jun 18 09:56:37 nbp2-oss5 kernel: sd 16:0:0:7: [sdcu]  <<vendor>> ASC=0x95 ASCQ=0x1
      

      OSS rebooted at Jun 18 14:30:00

      MDT errors at OSS reboot time

      
      Jun 18 12:31:12 nbp2-mds kernel: Call Trace:
      Jun 18 12:31:12 nbp2-mds kernel: [<ffffffff811cb40c>] ? __getblk+0x2c/0x2a0
      Jun 18 12:31:12 nbp2-mds kernel: [<ffffffff81584435>] rwsem_down_failed_common+0x95/0x1d0
      Jun 18 12:31:12 nbp2-mds kernel: [<ffffffff81584593>] rwsem_down_write_failed+0x23/0x30
      Jun 18 12:31:12 nbp2-mds kernel: [<ffffffff812c7fe3>] call_rwsem_down_write_failed+0x13/0x20
      Jun 18 12:31:12 nbp2-mds kernel: [<ffffffffa11f07c0>] ? lod_declare_object_create+0x0/0x450 [lod]
      Jun 18 12:31:12 nbp2-mds kernel: [<ffffffff81583a92>] ? down_write+0x32/0x40
      Jun 18 12:31:12 nbp2-mds kernel: [<ffffffffa11f7065>] lod_qos_prep_create+0xc25/0x1aa0 [lod]
      Jun 18 12:31:12 nbp2-mds kernel: [<ffffffffa0f41459>] ? osd_declare_qid+0x289/0x480 [osd_ldiskfs]
      Jun 18 12:31:12 nbp2-mds kernel: [<ffffffffa11e8c02>] lod_declare_striped_object+0x162/0x980 [lod]
      Jun 18 12:31:12 nbp2-mds kernel: [<ffffffffa0f1b735>] ? osd_declare_object_create+0x1c5/0x340 [osd_ldiskfs]
      Jun 18 12:31:12 nbp2-mds kernel: [<ffffffffa11f0a7f>] lod_declare_object_create+0x2bf/0x450 [lod]
      Jun 18 12:31:12 nbp2-mds kernel: [<ffffffffa125ad76>] mdd_declare_object_create_internal+0x116/0x340 [mdd]
      Jun 18 12:31:12 nbp2-mds kernel: [<ffffffffa125670e>] mdd_create+0x69e/0x1740 [mdd]
      Jun 18 12:31:12 nbp2-mds kernel: [<ffffffffa1118348>] mdo_create+0x18/0x50 [mdt]
      Jun 18 12:31:12 nbp2-mds kernel: [<ffffffffa11224ff>] mdt_reint_open+0x1f8f/0x2c70 [mdt]
      Jun 18 12:31:12 nbp2-mds kernel: [<ffffffffa05d491c>] ? upcall_cache_get_entry+0x29c/0x880 [libcfs]
      Jun 18 12:31:12 nbp2-mds kernel: [<ffffffffa110928d>] mdt_reint_rec+0x5d/0x200 [mdt]
      Jun 18 12:31:12 nbp2-mds kernel: [<ffffffffa10ece7b>] mdt_reint_internal+0x4cb/0x7a0 [mdt]
      Jun 18 12:31:12 nbp2-mds kernel: [<ffffffffa10ed346>] mdt_intent_reint+0x1f6/0x440 [mdt]
      Jun 18 12:31:12 nbp2-mds kernel: [<ffffffffa10eb92e>] mdt_intent_policy+0x4be/0xd10 [mdt]
      Jun 18 12:31:12 nbp2-mds kernel: [<ffffffffa09047a7>] ldlm_lock_enqueue+0x127/0xa50 [ptlrpc]
      Jun 18 12:31:12 nbp2-mds kernel: [<ffffffffa093055b>] ldlm_handle_enqueue0+0x51b/0x14d0 [ptlrpc]
      Jun 18 12:31:12 nbp2-mds kernel: [<ffffffffa09b9eb1>] tgt_enqueue+0x61/0x230 [ptlrpc]
      Jun 18 12:31:12 nbp2-mds kernel: [<ffffffffa09baece>] tgt_request_handle+0x8be/0x1020 [ptlrpc]
      Jun 18 12:31:13 nbp2-mds kernel: [<ffffffffa0964ca1>] ptlrpc_main+0xf41/0x1a80 [ptlrpc]
      Jun 18 12:31:13 nbp2-mds kernel: [<ffffffffa0963d60>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
      Jun 18 12:31:13 nbp2-mds kernel: [<ffffffff810a379e>] kthread+0x9e/0xc0
      Jun 18 12:31:13 nbp2-mds kernel: [<ffffffff8100c28a>] child_rip+0xa/0x20
      Jun 18 12:31:13 nbp2-mds kernel: [<ffffffff810a3700>] ? kthread+0x0/0xc0
      Jun 18 12:31:13 nbp2-mds kernel: [<ffffffff8100c280>] ? child_rip+0x0/0x20
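
The trace above mixes ordinary frames with entries marked `?`, which the kernel prints for addresses found on the stack that may not belong to the actual call chain. A minimal Python sketch for filtering those out when comparing many hung-thread traces is shown below. It assumes the syslog frame format seen in this ticket. The names `FRAME_RE`, `parse_frames`, and `reliable_frames` are hypothetical helpers, not part of Lustre or any kernel tooling.

```python
import re

# One frame of a kernel call trace as logged in this ticket, e.g.
# "[<ffffffffa11f7065>] lod_qos_prep_create+0xc25/0x1aa0 [lod]".
# Assumption: address, optional "?", symbol+offset/size, optional [module].
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<unreliable>\?\s+)?"
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(lines):
    """Return (function, module, is_reliable) for every frame line found."""
    frames = []
    for line in lines:
        m = FRAME_RE.search(line)
        if m is None:
            continue  # not a frame line, e.g. the "Call Trace:" header
        module = m.group("module") or "kernel"
        frames.append((m.group("func"), module, m.group("unreliable") is None))
    return frames


def reliable_frames(lines):
    """Keep only frames that are not marked with '?'."""
    return [(func, mod) for func, mod, ok in parse_frames(lines) if ok]


# A few lines copied from the trace in this ticket.
SAMPLE = [
    "Jun 18 12:31:12 nbp2-mds kernel: Call Trace:",
    "Jun 18 12:31:12 nbp2-mds kernel: [<ffffffff811cb40c>] ? __getblk+0x2c/0x2a0",
    "Jun 18 12:31:12 nbp2-mds kernel: [<ffffffff81584593>] rwsem_down_write_failed+0x23/0x30",
    "Jun 18 12:31:12 nbp2-mds kernel: [<ffffffffa11f7065>] lod_qos_prep_create+0xc25/0x1aa0 [lod]",
]

if __name__ == "__main__":
    for func, mod in reliable_frames(SAMPLE):
        print(f"{mod}: {func}")
```

Applied to the full trace, the first reliable frames are the rwsem write-lock slow path followed by lod_qos_prep_create, which matches the hang described in this ticket.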
       

       

      MDS rebooted at Jun 18 17:58:59

      A backtrace from the time of the MDS crash is attached.

          People

            Assignee: Hongchao Zhang (hongchao.zhang)
            Reporter: Mahmoud Hanafi (mhanafi)