Lustre / LU-7779

osd_object_destroy()) ASSERTION( osd_inode_unlinked(inode) || inode->i_nlink == 1 || inode->i_nlink == 2 ) failed

Details

    • Bug
    • Resolution: Won't Fix
    • Blocker
    • None
    • None
    • lola
      build: 2.8.50-6-gf9ca359 ; commit f9ca359284357d145819beb08b316e932f7a3060
    • 3
    • 9223372036854775807

    Description

      The error occurred during soak testing of build '20160215' (see https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20150215). DNE is enabled.
      The MDTs were formatted with ldiskfs, the OSTs with zfs. The MDS nodes are set up in an active-active HA failover configuration (see also the configuration record at https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-Configuration).

      Please note that build '20160215' is a vanilla build of the master branch.
      This issue might be addressed by the changes included in build '20160210', as we didn't observe it during a two-day test session.

      • 2016-02-15-14:08:32 MDS (lola-9) crashed with LBUG:
        <0>LustreError: 4622:0:(osd_handler.c:2790:osd_object_destroy()) ASSERTION( osd_inode_unlinked(inode) || inode->i_nlink == 1 || inode->i_nlink == 2 ) failed: 
        <0>LustreError: 4622:0:(osd_handler.c:2790:osd_object_destroy()) LBUG
        <4>Pid: 4622, comm: orph_cleanup_so
        <4>
        <4>Call Trace:
        <4> [<ffffffffa0737875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
        <4> [<ffffffffa0737e77>] lbug_with_loc+0x47/0xb0 [libcfs]
        <4> [<ffffffffa0ffc2f1>] osd_object_destroy+0x5a1/0x5b0 [osd_ldiskfs]
        <4> [<ffffffffa12655ad>] lod_sub_object_destroy+0x1fd/0x440 [lod]
        <4> [<ffffffffa1265e2d>] ? lod_sub_object_ref_del+0x1fd/0x440 [lod]
        <4> [<ffffffffa1259220>] lod_object_destroy+0x130/0x770 [lod]
        <4> [<ffffffffa12db6fb>] __mdd_orphan_cleanup+0xd6b/0x12b0 [mdd]
        <4> [<ffffffffa12da990>] ? __mdd_orphan_cleanup+0x0/0x12b0 [mdd]
        <4> [<ffffffff8109e78e>] kthread+0x9e/0xc0
        <4> [<ffffffff8100c28a>] child_rip+0xa/0x20
        <4> [<ffffffff8109e6f0>] ? kthread+0x0/0xc0
        <4> [<ffffffff8100c280>] ? child_rip+0x0/0x20
        <4>
        <0>Kernel panic - not syncing: LBUG
        <4>Pid: 4622, comm: orph_cleanup_so Tainted: P           ---------------    2.6.32-504.30.3.el6_lustre.gf9ca359.x86_64 #1
        <4>Call Trace:
        <4> [<ffffffff81529c9c>] ? panic+0xa7/0x16f
        <4> [<ffffffffa0737ecb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
        <4> [<ffffffffa0ffc2f1>] ? osd_object_destroy+0x5a1/0x5b0 [osd_ldiskfs]
        <4> [<ffffffffa12655ad>] ? lod_sub_object_destroy+0x1fd/0x440 [lod]
        <4> [<ffffffffa1265e2d>] ? lod_sub_object_ref_del+0x1fd/0x440 [lod]
        <4> [<ffffffffa1259220>] ? lod_object_destroy+0x130/0x770 [lod]
        <4> [<ffffffffa12db6fb>] ? __mdd_orphan_cleanup+0xd6b/0x12b0 [mdd]
        <4> [<ffffffffa12da990>] ? __mdd_orphan_cleanup+0x0/0x12b0 [mdd]
        <4> [<ffffffff8109e78e>] ? kthread+0x9e/0xc0
        <4> [<ffffffff8100c28a>] ? child_rip+0xa/0x20
        <4> [<ffffffff8109e6f0>] ? kthread+0x0/0xc0
        <4> [<ffffffff8100c280>] ? child_rip+0x0/0x20
        
      • Another MDS node (lola-10) finished its restart and successfully remounted the MDTs at 2016-02-15 14:07:46,269, shortly before the LBUG occurred.

      Attached files: messages, console and vmcore-dmesg.txt of lola-9.

      This ticket might be a duplicate or a recurrence of LU-7579.
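
For anyone comparing this crash against the logs of LU-7579, a small script can pull the failing location out of the console output. The sketch below is only an illustration: the regular expression reflects my reading of the two LustreError lines quoted above (level, pid, a second numeric field, then file:line:function in parentheses), and the names `LINE_RE` and `parse_lustre_error` are hypothetical, not part of any Lustre tool.

```python
import re

# Assumed layout of a console line, read off the log excerpt in this ticket:
#   <level>LustreError: pid:num:(file:line:function()) message
# The meaning of the second numeric field is not stated in the ticket,
# so it is matched but not captured.
LINE_RE = re.compile(
    r"<(?P<level>\d)>LustreError: (?P<pid>\d+):\d+:"
    r"\((?P<file>[^:]+):(?P<line>\d+):(?P<func>\w+)\(\)\) (?P<msg>.*)"
)


def parse_lustre_error(line):
    """Return the fields of one LustreError console line, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["pid"] = int(fields["pid"])
    fields["line"] = int(fields["line"])
    fields["msg"] = fields["msg"].strip()
    return fields


# First LBUG line from the lola-9 console log quoted in the description.
sample = (
    "<0>LustreError: 4622:0:(osd_handler.c:2790:osd_object_destroy()) "
    "ASSERTION( osd_inode_unlinked(inode) || inode->i_nlink == 1 || "
    "inode->i_nlink == 2 ) failed: "
)

if __name__ == "__main__":
    print(parse_lustre_error(sample))
```

Running the same function over the messages file of both tickets would show whether the assertion fires at the same source location and in the same thread.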

          People

            wc-triage WC Triage
            heckes Frank Heckes (Inactive)