  1. Lustre
  2. LU-1111

"rm" by FID causes MDS LBUG ("(md_object.h:740:mo_version_get()) ASSERTION(m->mo_ops->moo_version_get) failed")


Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.0.0
    • Labels: None
    • Severity: 3
    • Rank (Obsolete): 6454

    Description

      The MDS panic stack/log looks like the following:
      ==================================================
      LustreError: 64654:0:(md_object.h:740:mo_version_get()) ASSERTION(m->mo_ops->moo_version_get) failed
      LustreError: 64654:0:(md_object.h:740:mo_version_get()) LBUG
      Pid: 64654, comm: mdt_19

      Call Trace:
      [<ffffffffa04b8857>] libcfs_debug_dumpstack+0x57/0x80 [libcfs]
      [<ffffffffa04b8e95>] lbug_with_loc+0x75/0xe0 [libcfs]
      [<ffffffffa04c48b6>] libcfs_assertion_failed+0x66/0x70 [libcfs]
      [<ffffffffa07235e9>] cml_version_get+0xa9/0xc0 [cmm]
      [<ffffffffa0a62bcb>] mdt_obj_version_get+0x14b/0x220 [mdt]
      [<ffffffffa0a63089>] mdt_version_get_check_save+0x39/0xe0 [mdt]
      [<ffffffffa0a63204>] mdt_reint_unlink+0xd4/0x940 [mdt]
      [<ffffffffa0a6084f>] ? mdt_unlink_unpack+0x41f/0x5a0 [mdt]
      [<ffffffffa0a1a256>] ? md_ucred+0x26/0x60 [mdd]
      [<ffffffffa0a1a256>] ? md_ucred+0x26/0x60 [mdd]
      [<ffffffffa0a6167f>] mdt_reint_rec+0x3f/0x100 [mdt]
      [<ffffffffa0668004>] ? lustre_msg_get_flags+0x34/0xa0 [ptlrpc]
      [<ffffffffa0a58a34>] mdt_reint_internal+0x6d4/0x9f0 [mdt]
      [<ffffffffa0a4f756>] ? mdt_reint_opcode+0x96/0x160 [mdt]
      [<ffffffffa0a58d9c>] mdt_reint+0x4c/0x120 [mdt]
      [<ffffffffa0667ad8>] ? lustre_msg_check_version+0xc8/0xe0 [ptlrpc]
      [<ffffffffa0a4d9f5>] mdt_handle_common+0x8d5/0x1810 [mdt]
      [<ffffffffa0665734>] ? lustre_msg_get_opc+0x94/0x100 [ptlrpc]
      [<ffffffffa0a4ea05>] mdt_regular_handle+0x15/0x20 [mdt]
      [<ffffffffa0674641>] ptlrpc_server_handle_request+0x421/0xef0 [ptlrpc]
      [<ffffffff810408fe>] ? activate_task+0x2e/0x40
      [<ffffffff8104d906>] ? try_to_wake_up+0x276/0x380
      [<ffffffff8104da22>] ? default_wake_function+0x12/0x20
      [<ffffffff810411a9>] ? __wake_up_common+0x59/0x90
      [<ffffffffa04b963e>] ? cfs_timer_arm+0xe/0x10 [libcfs]
      [<ffffffffa06759e2>] ptlrpc_main+0x8d2/0x1550 [ptlrpc]
      [<ffffffff8104da10>] ? default_wake_function+0x0/0x20
      [<ffffffff8100d1aa>] child_rip+0xa/0x20
      [<ffffffffa0675110>] ? ptlrpc_main+0x0/0x1550 [ptlrpc]
      [<ffffffff8100d1a0>] ? child_rip+0x0/0x20

      Kernel panic - not syncing: LBUG
      Pid: 64654, comm: mdt_19 Not tainted 2.6.32-71.24.1.el6.Bull.23.x86_64 #1
      Call Trace:
      [<ffffffff81466b14>] panic+0x78/0x137
      [<ffffffffa04b8eeb>] lbug_with_loc+0xcb/0xe0 [libcfs]
      [<ffffffffa04c48b6>] libcfs_assertion_failed+0x66/0x70 [libcfs]
      [<ffffffffa07235e9>] cml_version_get+0xa9/0xc0 [cmm]
      [<ffffffffa0a62bcb>] mdt_obj_version_get+0x14b/0x220 [mdt]
      [<ffffffffa0a63089>] mdt_version_get_check_save+0x39/0xe0 [mdt]
      [<ffffffffa0a63204>] mdt_reint_unlink+0xd4/0x940 [mdt]
      [<ffffffffa0a6084f>] ? mdt_unlink_unpack+0x41f/0x5a0 [mdt]
      [<ffffffffa0a1a256>] ? md_ucred+0x26/0x60 [mdd]
      [<ffffffffa0a1a256>] ? md_ucred+0x26/0x60 [mdd]
      [<ffffffffa0a6167f>] mdt_reint_rec+0x3f/0x100 [mdt]
      [<ffffffffa0668004>] ? lustre_msg_get_flags+0x34/0xa0 [ptlrpc]
      [<ffffffffa0a58a34>] mdt_reint_internal+0x6d4/0x9f0 [mdt]
      [<ffffffffa0a4f756>] ? mdt_reint_opcode+0x96/0x160 [mdt]
      [<ffffffffa0a58d9c>] mdt_reint+0x4c/0x120 [mdt]
      [<ffffffffa0667ad8>] ? lustre_msg_check_version+0xc8/0xe0 [ptlrpc]
      [<ffffffffa0a4d9f5>] mdt_handle_common+0x8d5/0x1810 [mdt]
      [<ffffffffa0665734>] ? lustre_msg_get_opc+0x94/0x100 [ptlrpc]
      [<ffffffffa0a4ea05>] mdt_regular_handle+0x15/0x20 [mdt]
      [<ffffffffa0674641>] ptlrpc_server_handle_request+0x421/0xef0 [ptlrpc]
      [<ffffffff810408fe>] ? activate_task+0x2e/0x40
      [<ffffffff8104d906>] ? try_to_wake_up+0x276/0x380
      [<ffffffff8104da22>] ? default_wake_function+0x12/0x20
      [<ffffffff810411a9>] ? __wake_up_common+0x59/0x90
      [<ffffffffa04b963e>] ? cfs_timer_arm+0xe/0x10 [libcfs]
      [<ffffffffa06759e2>] ptlrpc_main+0x8d2/0x1550 [ptlrpc]
      [<ffffffff8104da10>] ? default_wake_function+0x0/0x20
      [<ffffffff8100d1aa>] child_rip+0xa/0x20
      [<ffffffffa0675110>] ? ptlrpc_main+0x0/0x1550 [ptlrpc]
      [<ffffffff8100d1a0>] ? child_rip+0x0/0x20
      ==================================================

      A reliable reproducer, to be run from a Lustre client, is very simple:
      ======================================================================

      1. touch /ccc/work/.../foo
      2. lfs path2fid /ccc/work/.../foo
        [0x200002cb4:0x3cbb:0x0]
      3. rm /ccc/work/.lustre/fid/[0x200002cb4:0x3cbb:0x0]
        <<< MDS crashes as described above >>>
      ======================================================================

      In-depth analysis of the crash dump and of the relevant source code clearly indicates that "rm/unlink by FID" is not fully implemented, but is also not properly checked and prohibited.
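
To illustrate the failure mode behind the assertion text in the log (ASSERTION(m->mo_ops->moo_version_get) failed), here is a minimal, self-contained sketch. It is not the Lustre code: every name prefixed with sketch_ is hypothetical, and the structures are heavily simplified stand-ins. It only shows the general pattern of an operations table whose method pointer is NULL for some object types, and how a caller can return an error instead of asserting.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are NOT the real Lustre types. */
struct sketch_object;

struct sketch_object_ops {
        /* NULL when an object type does not implement version lookup. */
        long (*version_get)(const struct sketch_object *obj);
};

struct sketch_object {
        const struct sketch_object_ops *ops;
        long version;
};

static long sketch_regular_version_get(const struct sketch_object *obj)
{
        return obj->version;
}

/* A regular object type implements the method ... */
static const struct sketch_object_ops sketch_regular_ops = {
        .version_get = sketch_regular_version_get,
};

/* ... while a special object type (assumed here) leaves it unset. */
static const struct sketch_object_ops sketch_special_ops = {
        .version_get = NULL,
};

/*
 * Defensive lookup: a missing method is reported as -EOPNOTSUPP to the
 * caller. Asserting on the pointer instead would abort the whole process
 * (or, in a kernel, trigger something like an LBUG) on such an object.
 */
static int sketch_version_get_checked(const struct sketch_object *obj,
                                      long *out)
{
        assert(obj != NULL && obj->ops != NULL && out != NULL);

        if (obj->ops->version_get == NULL)
                return -EOPNOTSUPP;

        *out = obj->ops->version_get(obj);
        return 0;
}
```

With an object built on sketch_regular_ops the call returns 0 and fills in the version; with one built on sketch_special_ops it returns -EOPNOTSUPP. Whether returning an error at this level is appropriate for the MDT code path is a separate design question that this sketch does not settle.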

      My assumption is that this should be fixed (that is, checked and prohibited) with the following change:
      ==================================================================================
      [root@gaia1 lustre-2.0.0.1] # diff -urN lustre/mdd/mdd_dir.c.orig lustre/mdd/mdd_dir.c.bfi
      --- lustre/mdd/mdd_dir.c.orig 2011-11-04 18:46:46.000000000 +0100
      +++ lustre/mdd/mdd_dir.c.bfi 2012-02-01 17:18:37.074912630 +0100
      @@ -415,7 +415,7 @@
                       RETURN(rc);
               }

      -        if (mdd_is_append(pobj))
      +        if (mdd_is_immutable(pobj) || mdd_is_append(pobj))
                       RETURN(-EPERM);
       }

      [root@gaia1 lustre-2.0.0.1] #
      ==================================================================================
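
The effect of the proposed one-line change can be modeled outside the kernel. The sketch below is an assumption-laden illustration, not the mdd_dir.c code: the flag values, struct sketch_dir, and all sketch_ functions are hypothetical, and it assumes that the parent directory of the by-FID path carries an immutable flag, which the report implies but does not state.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag bits; the real Lustre flag values are not used here. */
#define SKETCH_FL_IMMUTABLE 0x1u
#define SKETCH_FL_APPEND    0x2u

/* Simplified stand-in for the parent object ("pobj" in the diff). */
struct sketch_dir {
        unsigned int flags;
};

static bool sketch_is_immutable(const struct sketch_dir *d)
{
        return (d->flags & SKETCH_FL_IMMUTABLE) != 0;
}

static bool sketch_is_append(const struct sketch_dir *d)
{
        return (d->flags & SKETCH_FL_APPEND) != 0;
}

/* Models the original check: only an append-only parent is refused. */
static int sketch_may_unlink_orig(const struct sketch_dir *parent)
{
        if (sketch_is_append(parent))
                return -EPERM;
        return 0;
}

/* Models the proposed check: an immutable parent is refused as well. */
static int sketch_may_unlink_patched(const struct sketch_dir *parent)
{
        if (sketch_is_immutable(parent) || sketch_is_append(parent))
                return -EPERM;
        return 0;
}
```

For a parent with only the immutable bit set, sketch_may_unlink_orig() returns 0 while sketch_may_unlink_patched() returns -EPERM, so under the stated assumption the unlink would be rejected before reaching the version lookup that asserts. Parents with no flags are still allowed by both variants.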

      What do you think?

      Also, our first tests with Lustre 2.1.0 show the same behavior.

          People

            Assignee: Lai Siyao (laisiyao)
            Reporter: Alexandre Louvet (louveta, inactive)