
:(llog_osd.c:338:llog_osd_write_rec()) ASSERTION( llh ) failed:

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.9.0
    • Affects Version/s: None
    • Labels: None
    • Severity: 1
    • Rank (Obsolete): 9223372036854775807

    Description

      MDS crash with LBUG.

      <0>LustreError: 39313:0:(llog_osd.c:338:llog_osd_write_rec()) ASSERTION( llh ) failed:
      <0>LustreError: 39313:0:(llog_osd.c:338:llog_osd_write_rec()) LBUG
      <4>Pid: 39313, comm: mdt02_049
      <4>
      <4>Call Trace:
      <4> [<ffffffffa048b895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      <4> [<ffffffffa048be97>] lbug_with_loc+0x47/0xb0 [libcfs]
      <4> [<ffffffffa05bed55>] llog_osd_write_rec+0xfb5/0x1370 [obdclass]
      <4> [<ffffffffa0d46ecb>] ? dynlock_unlock+0x16b/0x1d0 [osd_ldiskfs]
      <4> [<ffffffffa0d2e5d2>] ? iam_path_release+0x42/0x70 [osd_ldiskfs]
      <4> [<ffffffffa0590438>] llog_write_rec+0xc8/0x290 [obdclass]
      <4> [<ffffffffa059910d>] llog_cat_add_rec+0xad/0x480 [obdclass]
      <4> [<ffffffffa0590231>] llog_add+0x91/0x1d0 [obdclass]
      <4> [<ffffffffa0fd04f7>] osp_sync_add_rec+0x247/0xad0 [osp]
      <4> [<ffffffffa0fd0e2b>] osp_sync_add+0x7b/0x80 [osp]
      <4> [<ffffffffa0fc27d6>] osp_object_destroy+0x106/0x150 [osp]
      <4> [<ffffffffa0f068e7>] lod_object_destroy+0x1a7/0x350 [lod]
      <4> [<ffffffffa0f74880>] mdd_finish_unlink+0x210/0x3d0 [mdd]
      <4> [<ffffffffa0f65d35>] ? mdd_attr_check_set_internal+0x275/0x2c0 [mdd]
      <4> [<ffffffffa0f75306>] mdd_unlink+0x8c6/0xca0 [mdd]
      <4> [<ffffffffa0e37788>] mdo_unlink+0x18/0x50 [mdt]
      <4> [<ffffffffa0e3b005>] mdt_reint_unlink+0x835/0x1030 [mdt]
      <4> [<ffffffffa0e37571>] mdt_reint_rec+0x41/0xe0 [mdt]
      <4> [<ffffffffa0e1ced3>] mdt_reint_internal+0x4c3/0x780 [mdt]
      <4> [<ffffffffa0e1d1d4>] mdt_reint+0x44/0xe0 [mdt]
      <4> [<ffffffffa0e1fada>] mdt_handle_common+0x52a/0x1470 [mdt]
      <4> [<ffffffffa0e5c5f5>] mds_regular_handle+0x15/0x20 [mdt]
      <4> [<ffffffffa07750c5>] ptlrpc_server_handle_request+0x385/0xc00 [ptlrpc]
      <4> [<ffffffffa048c5ae>] ? cfs_timer_arm+0xe/0x10 [libcfs]
      <4> [<ffffffffa049d8d5>] ? lc_watchdog_touch+0x65/0x170 [libcfs]
      <4> [<ffffffffa076da69>] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc]
      <4> [<ffffffff81057779>] ? __wake_up_common+0x59/0x90
      <4> [<ffffffffa077789d>] ptlrpc_main+0xafd/0x1780 [ptlrpc]
      <4> [<ffffffff8100c28a>] child_rip+0xa/0x20
      <4> [<ffffffffa0776da0>] ? ptlrpc_main+0x0/0x1780 [ptlrpc]
      <4> [<ffffffff8100c280>] ? child_rip+0x0/0x20
      <4>
      <0>Kernel panic - not syncing: LBUG
      <4>Pid: 39313, comm: mdt02_049 Tainted: G           ---------------  T 2.6.32-504.30.3.el6.20151008.x86_64.lustre253 #1
      <4>Call Trace:
      <4> [<ffffffff81564fb9>] ? panic+0xa7/0x190
      <4> [<ffffffffa048beeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
      <4> [<ffffffffa05bed55>] ? llog_osd_write_rec+0xfb5/0x1370 [obdclass]
      <4> [<ffffffffa0d46ecb>] ? dynlock_unlock+0x16b/0x1d0 [osd_ldiskfs]
      <4> [<ffffffffa0d2e5d2>] ? iam_path_release+0x42/0x70 [osd_ldiskfs]
      <4> [<ffffffffa0590438>] ? llog_write_rec+0xc8/0x290 [obdclass]
      <4> [<ffffffffa059910d>] ? llog_cat_add_rec+0xad/0x480 [obdclass]
      <4> [<ffffffffa0590231>] ? llog_add+0x91/0x1d0 [obdclass]
      <4> [<ffffffffa0fd04f7>] ? osp_sync_add_rec+0x247/0xad0 [osp]
      <4> [<ffffffffa0fd0e2b>] ? osp_sync_add+0x7b/0x80 [osp]
      <4> [<ffffffffa0fc27d6>] ? osp_object_destroy+0x106/0x150 [osp]
      <4> [<ffffffffa0f068e7>] ? lod_object_destroy+0x1a7/0x350 [lod]
      <4> [<ffffffffa0f74880>] ? mdd_finish_unlink+0x210/0x3d0 [mdd]
      <4> [<ffffffffa0f65d35>] ? mdd_attr_check_set_internal+0x275/0x2c0 [mdd]
      <4> [<ffffffffa0f75306>] ? mdd_unlink+0x8c6/0xca0 [mdd]
      <4> [<ffffffffa0e37788>] ? mdo_unlink+0x18/0x50 [mdt]
      <4> [<ffffffffa0e3b005>] ? mdt_reint_unlink+0x835/0x1030 [mdt]
      <4> [<ffffffffa0e37571>] ? mdt_reint_rec+0x41/0xe0 [mdt]
      <4> [<ffffffffa0e1ced3>] ? mdt_reint_internal+0x4c3/0x780 [mdt]
      <4> [<ffffffffa0e1d1d4>] ? mdt_reint+0x44/0xe0 [mdt]
      <4> [<ffffffffa0e1fada>] ? mdt_handle_common+0x52a/0x1470 [mdt]
      <4> [<ffffffffa0e5c5f5>] ? mds_regular_handle+0x15/0x20 [mdt]
      <4> [<ffffffffa07750c5>] ? ptlrpc_server_handle_request+0x385/0xc00 [ptlrpc]
      <4> [<ffffffffa048c5ae>] ? cfs_timer_arm+0xe/0x10 [libcfs]
      <4> [<ffffffffa049d8d5>] ? lc_watchdog_touch+0x65/0x170 [libcfs]
      <4> [<ffffffffa076da69>] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc]
      <4> [<ffffffff81057779>] ? __wake_up_common+0x59/0x90
      <4> [<ffffffffa077789d>] ? ptlrpc_main+0xafd/0x1780 [ptlrpc]
      <4> [<ffffffff8100c28a>] ? child_rip+0xa/0x20
      <4> [<ffffffffa0776da0>] ? ptlrpc_main+0x0/0x1780 [ptlrpc]
      <4> [<ffffffff8100c280>] ? child_rip+0x0/0x20
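
The first call trace above mixes ordinary frames with frames marked `?`, which the kernel prints for return addresses it found on the stack but could not confirm as part of the call chain. As a reading aid only, here is a minimal Python sketch that keeps the unmarked frames and prints the call path from the outermost caller down to the failing function. The regular expression, the helper names `parse_frames` and `call_chain`, and the `kernel` placeholder for frames without a module are all assumptions made for this illustration; they are not part of Lustre or of any kernel tooling.

```python
import re

# Matches one frame of a kernel call trace, for example:
#   <4> [<ffffffffa0590231>] ? llog_add+0x91/0x1d0 [obdclass]
# The "?" marker and the trailing "[module]" are both optional.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(trace_text):
    """Return one dict per stack frame found in trace_text, in log order."""
    frames = []
    for line in trace_text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append({
                "func": m.group("func"),
                # Frames without a "[module]" suffix belong to the core kernel.
                "module": m.group("module") or "kernel",
                "reliable": m.group("unreliable") is None,
            })
    return frames


def call_chain(trace_text):
    """Return 'module:func' for frames not marked '?', outermost caller first."""
    reliable = [f for f in parse_frames(trace_text) if f["reliable"]]
    return [f"{f['module']}:{f['func']}" for f in reversed(reliable)]


# A few lines copied from the first trace in this report.
SAMPLE = """\
<4> [<ffffffffa05bed55>] llog_osd_write_rec+0xfb5/0x1370 [obdclass]
<4> [<ffffffffa0d46ecb>] ? dynlock_unlock+0x16b/0x1d0 [osd_ldiskfs]
<4> [<ffffffffa0590438>] llog_write_rec+0xc8/0x290 [obdclass]
<4> [<ffffffff8100c28a>] child_rip+0xa/0x20
"""

if __name__ == "__main__":
    print(" -> ".join(call_chain(SAMPLE)))
```

Every frame in the second trace (after the kernel panic) carries the `?` marker, so this filter is only useful on the first one.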
      

        Activity

          [LU-8320] :(llog_osd.c:338:llog_osd_write_rec()) ASSERTION( llh ) failed:

          ndauchy Nathan Dauchy (Inactive) added a comment -

          Please re-open until the backport patch lands to 2.7 FE.

          pjones Peter Jones added a comment -

          Landed for 2.9

          gerrit Gerrit Updater added a comment -

          Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/21144/
          Subject: LU-8320 llog: prevent llog ID re-use.
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: a93ede18ababa3fe1ae8f4a5f92e868589a58cb6

          gerrit Gerrit Updater added a comment -

          Mike Pershin (mike.pershin@intel.com) uploaded a new patch: http://review.whamcloud.com/21144
          Subject: LU-8320 llog: prevent llog ID re-use.
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: 722f308635f118d00a5c4a44fa72d18986ccdac9

          jaylan Jay Lan (Inactive) added a comment -

          The LU-5297 patch has not landed to b2_7_fe yet. I cherry-picked it to our nas-2.7.1 and nas-2.7.2 anyway (locally, not pushed to GitHub yet). OTOH, the LU-5297 patch caused conflicts in nas-2.5.3. Since there is a workaround (i.e., find the offending file and remove it), I think we do not need LU-5297 on nas-2.5.3.

          So, we have LU-4528, LU-7079, LU-6696, and LU-5297 on nas-2.7.1 and nas-2.7.2. I tried LU-8320 and it applied cleanly on nas-2.7.x, but I will wait until it gets reviews.

          jaylan Jay Lan (Inactive) added a comment -

          The 2.5.3 Lustre server is still running 2.5.3-6nasS, and the LU-7079 patch was included in 2.5.3-6.1nasS.

          People

            Assignee: tappro Mikhail Pershin
            Reporter: mhanafi Mahmoud Hanafi
            Votes: 0
            Watchers: 9
