Lustre / LU-11967

MDS LBUG ASSERTION( o->opo_reserved == 0 ) failed

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.13.0, Lustre 2.12.4
    • Affects Version/s: Lustre 2.12.0
    • Labels: None
    • Environment: CentOS 7.6, Kernel 3.10.0-957.1.3.el7_lustre.x86_64, all clients are 2.12.0
    • Severity: 3

    Description

      We just hit the following MDS crash on Fir (2.12), server fir-md1-s1:

      [497493.075367] Lustre: fir-MDT0000: Client 691c85d2-0e39-9e6d-1bfd-ecbaccae5366 (at 10.8.2.27@o2ib6) reconnecting
      [497594.956880] LustreError: 12324:0:(osp_object.c:1458:osp_declare_create()) ASSERTION( o->opo_reserved == 0 ) failed: 
      [497594.967490] LustreError: 12324:0:(osp_object.c:1458:osp_declare_create()) LBUG
      [497594.974807] Pid: 12324, comm: mdt01_074 3.10.0-957.1.3.el7_lustre.x86_64 #1 SMP Fri Dec 7 14:50:35 PST 2018
      [497594.984636] Call Trace:
      [497594.987187]  [<ffffffffc0c5e7cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
      [497594.993859]  [<ffffffffc0c5e87c>] lbug_with_loc+0x4c/0xa0 [libcfs]
      [497595.000177]  [<ffffffffc17cdcc5>] osp_declare_create+0x5a5/0x5b0 [osp]
      [497595.006833]  [<ffffffffc171539f>] lod_sub_declare_create+0xdf/0x210 [lod]
      [497595.013748]  [<ffffffffc1714904>] lod_qos_prep_create+0x15d4/0x1890 [lod]
      [497595.020662]  [<ffffffffc16f5bba>] lod_declare_instantiate_components+0x9a/0x1d0 [lod]
      [497595.028614]  [<ffffffffc17084d5>] lod_declare_layout_change+0xb65/0x10f0 [lod]
      [497595.035988]  [<ffffffffc177a102>] mdd_declare_layout_change+0x62/0x120 [mdd]
      [497595.043172]  [<ffffffffc1782e52>] mdd_layout_change+0x882/0x1000 [mdd]
      [497595.049830]  [<ffffffffc15ea317>] mdt_layout_change+0x337/0x430 [mdt]
      [497595.056398]  [<ffffffffc15f242e>] mdt_intent_layout+0x7ee/0xcc0 [mdt]
      [497595.062968]  [<ffffffffc15efa18>] mdt_intent_policy+0x2e8/0xd00 [mdt]
      [497595.069549]  [<ffffffffc0f41ec6>] ldlm_lock_enqueue+0x366/0xa60 [ptlrpc]
      [497595.076400]  [<ffffffffc0f6a8a7>] ldlm_handle_enqueue0+0xa47/0x15a0 [ptlrpc]
      [497595.083597]  [<ffffffffc0ff1302>] tgt_enqueue+0x62/0x210 [ptlrpc]
      [497595.089851]  [<ffffffffc0ff835a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
      [497595.096881]  [<ffffffffc0f9c92b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
      [497595.104679]  [<ffffffffc0fa025c>] ptlrpc_main+0xafc/0x1fc0 [ptlrpc]
      [497595.111095]  [<ffffffff9d6c1c31>] kthread+0xd1/0xe0
      [497595.116096]  [<ffffffff9dd74c24>] ret_from_fork_nospec_begin+0xe/0x21
      [497595.122657]  [<ffffffffffffffff>] 0xffffffffffffffff
      [497595.127761] Kernel panic - not syncing: LBUG
      [497595.132122] CPU: 41 PID: 12324 Comm: mdt01_074 Kdump: loaded Tainted: G           OEL ------------   3.10.0-957.1.3.el7_lustre.x86_64 #1
      [497595.144451] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018
      [497595.152106] Call Trace:
      [497595.154649]  [<ffffffff9dd61e41>] dump_stack+0x19/0x1b
      [497595.159882]  [<ffffffff9dd5b550>] panic+0xe8/0x21f
      [497595.164763]  [<ffffffffc0c5e8cb>] lbug_with_loc+0x9b/0xa0 [libcfs]
      [497595.171040]  [<ffffffffc17cdcc5>] osp_declare_create+0x5a5/0x5b0 [osp]
      [497595.177668]  [<ffffffffc171539f>] lod_sub_declare_create+0xdf/0x210 [lod]
      [497595.184541]  [<ffffffff9d994d0d>] ? list_del+0xd/0x30
      [497595.189693]  [<ffffffffc1714904>] lod_qos_prep_create+0x15d4/0x1890 [lod]
      [497595.196569]  [<ffffffff9d81a849>] ? ___slab_alloc+0x209/0x4f0
      [497595.202421]  [<ffffffffc0d87f7b>] ? class_handle_hash+0xab/0x2f0 [obdclass]
      [497595.209474]  [<ffffffff9d6d67b0>] ? wake_up_state+0x20/0x20
      [497595.215152]  [<ffffffffc0da7138>] ? lu_buf_alloc+0x48/0x320 [obdclass]
      [497595.221803]  [<ffffffffc0f5be0d>] ? ldlm_cli_enqueue_local+0x27d/0x870 [ptlrpc]
      [497595.229208]  [<ffffffffc16f5bba>] lod_declare_instantiate_components+0x9a/0x1d0 [lod]
      [497595.237131]  [<ffffffffc17084d5>] lod_declare_layout_change+0xb65/0x10f0 [lod]
      [497595.244442]  [<ffffffffc177a102>] mdd_declare_layout_change+0x62/0x120 [mdd]
      [497595.251584]  [<ffffffffc1782e52>] mdd_layout_change+0x882/0x1000 [mdd]
      [497595.258213]  [<ffffffffc15e9b30>] ? mdt_object_lock_internal+0x70/0x3e0 [mdt]
      [497595.265444]  [<ffffffffc15ea317>] mdt_layout_change+0x337/0x430 [mdt]
      [497595.271978]  [<ffffffffc15f242e>] mdt_intent_layout+0x7ee/0xcc0 [mdt]
      [497595.278543]  [<ffffffffc0f8e2f7>] ? lustre_msg_buf+0x17/0x60 [ptlrpc]
      [497595.285083]  [<ffffffffc15efa18>] mdt_intent_policy+0x2e8/0xd00 [mdt]
      [497595.291637]  [<ffffffffc0f40524>] ? ldlm_lock_create+0xa4/0xa40 [ptlrpc]
      [497595.298442]  [<ffffffffc15f1c40>] ? mdt_intent_open+0x350/0x350 [mdt]
      [497595.304999]  [<ffffffffc0f41ec6>] ldlm_lock_enqueue+0x366/0xa60 [ptlrpc]
      [497595.311794]  [<ffffffffc0c69fa3>] ? cfs_hash_bd_add_locked+0x63/0x80 [libcfs]
      [497595.319018]  [<ffffffffc0c6d72e>] ? cfs_hash_add+0xbe/0x1a0 [libcfs]
      [497595.325490]  [<ffffffffc0f6a8a7>] ldlm_handle_enqueue0+0xa47/0x15a0 [ptlrpc]
      [497595.332662]  [<ffffffffc0f927f0>] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc]
      [497595.340268]  [<ffffffffc0ff1302>] tgt_enqueue+0x62/0x210 [ptlrpc]
      [497595.346488]  [<ffffffffc0ff835a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
      [497595.353481]  [<ffffffffc0fd1a51>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
      [497595.361139]  [<ffffffffc0c5ebde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
      [497595.368309]  [<ffffffffc0f9c92b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
      [497595.376083]  [<ffffffffc0f997b5>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
      [497595.382959]  [<ffffffff9d6d67c2>] ? default_wake_function+0x12/0x20
      [497595.389311]  [<ffffffff9d6cba9b>] ? __wake_up_common+0x5b/0x90
      [497595.395263]  [<ffffffffc0fa025c>] ptlrpc_main+0xafc/0x1fc0 [ptlrpc]
      [497595.401650]  [<ffffffffc0f9f760>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
      [497595.409132]  [<ffffffff9d6c1c31>] kthread+0xd1/0xe0
      [497595.414097]  [<ffffffff9d6c1b60>] ? insert_kthread_work+0x40/0x40
      [497595.420279]  [<ffffffff9dd74c24>] ret_from_fork_nospec_begin+0xe/0x21
      [497595.426803]  [<ffffffff9d6c1b60>] ? insert_kthread_work+0x40/0x40
      

      I do have a vmcore.
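
      For a quick look at the failing call path without opening the vmcore, the trace above can be reduced to its verified frames. In these kernel stack dumps, entries prefixed with "?" are addresses found on the stack that the unwinder could not confirm as part of the call chain, so they are usually skipped when reading the trace. The following is a minimal sketch only; the names `FRAME_RE` and `reliable_frames` are hypothetical helpers, not part of Lustre or any kernel tooling, and the sample lines are copied from the log above.

```python
import re

# Matches a frame such as:
#   [497595.171040]  [<ffffffffc17cdcc5>] osp_declare_create+0x5a5/0x5b0 [osp]
# An optional "? " before the symbol marks a frame the unwinder did not verify.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def reliable_frames(log_text):
    """Return (symbol, module) pairs for frames not marked with '?'.

    Frames without a trailing [module] tag are reported as "kernel".
    """
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m is None or m.group("unreliable"):
            continue
        frames.append((m.group("symbol"), m.group("module") or "kernel"))
    return frames


sample = """\
[497595.171040]  [<ffffffffc17cdcc5>] osp_declare_create+0x5a5/0x5b0 [osp]
[497595.177668]  [<ffffffffc171539f>] lod_sub_declare_create+0xdf/0x210 [lod]
[497595.184541]  [<ffffffff9d994d0d>] ? list_del+0xd/0x30
[497595.189693]  [<ffffffffc1714904>] lod_qos_prep_create+0x15d4/0x1890 [lod]
"""

for symbol, module in reliable_frames(sample):
    print(f"{module}:{symbol}")
```

      Run against the full second trace, this kind of filter should leave the same chain as the first, shorter trace: mdt_intent_layout through mdd_layout_change and lod_declare_layout_change down to osp_declare_create.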

      Fir has two MDSes: fir-md1-s1 serves MDT0 and MDT2, and fir-md1-s2 serves MDT1 and MDT3.

      DoM and PFL are both enabled and in use.


      Please let me know if you have any idea how to avoid this.

      Thanks
      Stephane

      People

        Assignee: Lai Siyao (laisiyao)
        Reporter: Stephane Thiell (sthiell)