Lustre / LU-7453

osp_sync_interpret assertion

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Major
    • None
    • Lustre 2.5.4
    • None
    • 2.5.4-2.6.32_504.30.3.el6.atlas.x86_64.x86_64
    • 3

    Description

      On Wednesday morning (2015-11-18), one of our production MDS nodes hit an LBUG on a failed assertion in osp_sync_interpret():

      {{
      2015-11-18 10:58:35 [1912759.384335] LustreError: 14428:0:(osp_sync.c:352:osp_sync_interpret()) ASSERTION( rc || req->rq_transno ) failed:
      2015-11-18 10:58:35 [1912759.396346] LustreError: 14428:0:(osp_sync.c:352:osp_sync_interpret()) LBUG
      2015-11-18 10:58:35 [1912759.404445] Pid: 14428, comm: ptlrpcd_2
      2015-11-18 10:58:35 [1912759.409032]
      2015-11-18 10:58:35 [1912759.409033] Call Trace:
      2015-11-18 10:58:35 [1912759.414039] [<ffffffffa0430895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      2015-11-18 10:58:35 [1912759.422141] [<ffffffffa0430e97>] lbug_with_loc+0x47/0xb0 [libcfs]
      2015-11-18 10:58:35 [1912759.429363] [<ffffffffa0f6e3db>] osp_sync_interpret+0x50b/0x510 [osp]
      2015-11-18 10:58:35 [1912759.437003] [<ffffffffa075aacd>] ptlrpc_check_set+0x31d/0x1c20 [ptlrpc]
      2015-11-18 10:58:35 [1912759.444806] [<ffffffff8108802b>] ? try_to_del_timer_sync+0x7b/0xe0
      2015-11-18 10:58:35 [1912759.452147] [<ffffffffa0788b13>] ptlrpcd_check+0x3d3/0x610 [ptlrpc]
      2015-11-18 10:58:35 [1912759.459582] [<ffffffffa078924b>] ptlrpcd+0x20b/0x370 [ptlrpc]
      2015-11-18 10:58:35 [1912759.466413] [<ffffffff81064c00>] ? default_wake_function+0x0/0x20
      2015-11-18 10:58:35 [1912759.473654] [<ffffffffa0789040>] ? ptlrpcd+0x0/0x370 [ptlrpc]
      2015-11-18 10:58:35 [1912759.480485] [<ffffffff8109e78e>] kthread+0x9e/0xc0
      2015-11-18 10:58:35 [1912759.486243] [<ffffffff8100c28a>] child_rip+0xa/0x20
      2015-11-18 10:58:35 [1912759.492100] [<ffffffff8109e6f0>] ? kthread+0x0/0xc0
      2015-11-18 10:58:35 [1912759.497954] [<ffffffff8100c280>] ? child_rip+0x0/0x20
      2015-11-18 10:58:35 [1912759.503992]
      2015-11-18 10:58:35 [1912759.506435] Kernel panic - not syncing: LBUG
      2015-11-18 10:58:35 [1912759.511515] Pid: 14428, comm: ptlrpcd_2 Not tainted 2.6.32-504.30.3.el6.atlas.x86_64 #1
      2015-11-18 10:58:35 [1912759.520881] Call Trace:
      2015-11-18 10:58:35 [1912759.523916] [<ffffffff81529cbc>] ? panic+0xa7/0x16f
      2015-11-18 10:58:35 [1912759.529778] [<ffffffffa0430eeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
      2015-11-18 10:58:35 [1912759.537188] [<ffffffffa0f6e3db>] ? osp_sync_interpret+0x50b/0x510 [osp]
      2015-11-18 10:58:35 [1912759.545012] [<ffffffffa075aacd>] ? ptlrpc_check_set+0x31d/0x1c20 [ptlrpc]
      2015-11-18 10:58:35 [1912759.553002] [<ffffffff8108802b>] ? try_to_del_timer_sync+0x7b/0xe0
      2015-11-18 10:58:35 [1912759.560340] [<ffffffffa0788b13>] ? ptlrpcd_check+0x3d3/0x610 [ptlrpc]
      2015-11-18 10:58:35 [1912759.567961] [<ffffffffa078924b>] ? ptlrpcd+0x20b/0x370 [ptlrpc]
      2015-11-18 10:58:35 [1912759.574979] [<ffffffff81064c00>] ? default_wake_function+0x0/0x20
      2015-11-18 10:58:35 [1912759.582209] [<ffffffffa0789040>] ? ptlrpcd+0x0/0x370 [ptlrpc]
      2015-11-18 10:58:35 [1912759.589034] [<ffffffff8109e78e>] ? kthread+0x9e/0xc0
      2015-11-18 10:58:35 [1912759.594981] [<ffffffff8100c28a>] ? child_rip+0xa/0x20
      2015-11-18 10:58:35 [1912759.601024] [<ffffffff8109e6f0>] ? kthread+0x0/0xc0
      2015-11-18 10:58:35 [1912759.606874] [<ffffffff8100c280>] ? child_rip+0x0/0x20
      }}

      Is this related to LU-5629 (https://jira.hpdd.intel.com/browse/LU-5629)?
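
      For readers unfamiliar with the check: the assertion text in the log, `rc || req->rq_transno`, requires that a completed request carry either a nonzero return code or a nonzero transaction number. Below is a minimal standalone sketch of that condition only. `struct fake_req` and `reply_is_consistent` are hypothetical names made up for illustration; this is not the Lustre source and says nothing about why the condition failed here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in holding only the field named in the assertion. */
struct fake_req {
	uint64_t rq_transno;	/* transaction number carried by the reply */
};

/*
 * Mirrors the logged condition "rc || req->rq_transno": true when the
 * request either failed (rc != 0) or completed with a nonzero
 * transaction number. False corresponds to the case that tripped the LBUG.
 */
static bool reply_is_consistent(int rc, const struct fake_req *req)
{
	return rc != 0 || req->rq_transno != 0;
}
```

      Under this reading, the only combination that fails the check is rc == 0 together with rq_transno == 0, that is, a reply reported as successful but with no transaction number attached.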

            People

              Assignee: Alex Zhuravlev (bzzz)
              Reporter: Jesse Hanley (hanleyja)