Lustre / LU-17344

ldlm_resource_get() ASSERTION(name->name[0] != 0) failed

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.16.0
    • Affects Version/s: Lustre 2.14.0, Lustre 2.15.0, Lustre 2.16.0
    • Labels: None
    • Severity: 3

    Description

      A system running LFSCK was crashing in a loop, apparently trying to destroy a bad object FID:

       LustreError: 16300:0:(ldlm_resource.c:1488:ldlm_resource_get()) ASSERTION(name->name[0] != 0) failed:
       kernel:Kernel panic - not syncing: LBUG 
       Call Trace:
       libcfs_call_trace+0x90/0xf0 [libcfs]
       lbug_with_loc+0x4c/0xa0 [libcfs]
       ldlm_resource_get+0x7e9/0x950 [ptlrpc]
       ldlm_lock_create+0x55/0xa60 [ptlrpc]
       ldlm_cli_enqueue_local+0xcc/0x850 [ptlrpc]
       lfsck_layout_slave_conditional_destroy [lfsck]
       lfsck_layout_slave_in_notify+0xa19/0xed0 [lfsck]
       lfsck_in_notify+0x23c/0x320 [lfsck]
       tgt_handle_lfsck_notify+0x5c/0x140 [ptlrpc]
       tgt_request_handle+0x8bf/0x18c0 [ptlrpc]
       ptlrpc_server_handle_request+0x253/0xc40 [ptlrpc]
       ptlrpc_main+0xc4a/0x1cb0 [ptlrpc]
       kthread+0xd1/0xe0
      

      It probably makes sense to have lfsck_layout_slave_conditional_destroy(), or a higher-level caller, check that the FID is valid before calling all the way down to ldlm_cli_enqueue_local().
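
      A minimal sketch of the suggested early check, written as standalone C rather than as a Lustre patch. Every `sketch_*` name below is hypothetical and the structures are simplified stand-ins, not the real Lustre definitions. It also assumes that the first word of the lock resource name is derived from the FID sequence, so a zero sequence is what would trip the assertion. The real code would presumably use Lustre's own FID sanity helpers instead.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for illustration only; not the Lustre structures. */
struct sketch_fid {
	uint64_t seq;
	uint32_t oid;
	uint32_t ver;
};

struct sketch_res_name {
	uint64_t name[4];
};

/* Assumption: name[0] comes from the FID sequence, name[1] from ver/oid. */
static void sketch_build_res_name(const struct sketch_fid *fid,
				  struct sketch_res_name *res)
{
	res->name[0] = fid->seq;
	res->name[1] = ((uint64_t)fid->ver << 32) | fid->oid;
	res->name[2] = 0;
	res->name[3] = 0;
}

/* Mirrors the failing assertion: a zero first word is fatal this deep. */
static void sketch_resource_get(const struct sketch_res_name *res)
{
	assert(res->name[0] != 0);
}

/* Hypothetical validity check, covering only the case the assertion guards. */
static bool sketch_fid_is_valid(const struct sketch_fid *fid)
{
	return fid != NULL && fid->seq != 0;
}

/*
 * Reject a bad FID with an error code at the top of the call chain,
 * so the lock code below never sees an all-zero resource name.
 */
int sketch_conditional_destroy(const struct sketch_fid *fid)
{
	struct sketch_res_name res;

	if (!sketch_fid_is_valid(fid))
		return -EINVAL;

	sketch_build_res_name(fid, &res);
	sketch_resource_get(&res);
	return 0;
}
```

      Returning an error from the caller turns a corrupt FID into a failed LFSCK request instead of a panic, which is what would break the crash loop described above.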

          People

            Assignee: Hongchao Zhang
            Reporter: Andreas Dilger