[LU-4776] suite sanity-scrub: ASSERTION( info->oti_r_locks == 0 )

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Blocker
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.7.0
    • Labels: None
    • Severity: 3
    • Rank (Obsolete): 13141

    Description

      This issue was created by maloo for Dmitry Eremin <dmitry.eremin@intel.com>

      This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/6cce6894-ae66-11e3-9c2b-52540035b04c.

      17:25:05:LustreError: 18368:0:(osd_handler.c:5496:osd_key_exit()) ASSERTION( info->oti_r_locks == 0 ) failed:
      17:25:05:LustreError: 18368:0:(osd_handler.c:5496:osd_key_exit()) LBUG
      17:25:05:Pid: 18368, comm: mdt00_000
      17:25:05:
      17:25:05:Call Trace:
      17:25:05: [<ffffffffa048e895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      17:25:05: [<ffffffffa048ee97>] lbug_with_loc+0x47/0xb0 [libcfs]
      17:25:05: [<ffffffffa0d276cb>] osd_key_exit+0x5b/0xc0 [osd_ldiskfs]
      17:25:05: [<ffffffffa05f7798>] lu_context_exit+0x58/0xa0 [obdclass]
      17:25:05: [<ffffffffa0842584>] ptlrpc_main+0x904/0x1980 [ptlrpc]
      17:25:05: [<ffffffffa0841c80>] ? ptlrpc_main+0x0/0x1980 [ptlrpc]
      17:25:05: [<ffffffff8109aee6>] kthread+0x96/0xa0
      17:25:05: [<ffffffff8100c20a>] child_rip+0xa/0x20
      17:25:05: [<ffffffff8109ae50>] ? kthread+0x0/0xa0
      17:25:05: [<ffffffff8100c200>] ? child_rip+0x0/0x20
      17:25:05:
      17:25:05:Kernel panic - not syncing: LBUG
      17:25:05:Pid: 18368, comm: mdt00_000 Not tainted 2.6.32-431.5.1.el6_lustre.g1131719.x86_64 #1
      17:25:05:Call Trace:
      17:25:05: [<ffffffff81527983>] ? panic+0xa7/0x16f
      17:25:05: [<ffffffffa048eeeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
      17:25:05: [<ffffffffa0d276cb>] ? osd_key_exit+0x5b/0xc0 [osd_ldiskfs]
      17:25:05: [<ffffffffa05f7798>] ? lu_context_exit+0x58/0xa0 [obdclass]
      17:25:05: [<ffffffffa0842584>] ? ptlrpc_main+0x904/0x1980 [ptlrpc]
      17:25:05: [<ffffffffa0841c80>] ? ptlrpc_main+0x0/0x1980 [ptlrpc]
      17:25:05: [<ffffffff8109aee6>] ? kthread+0x96/0xa0
      17:25:05: [<ffffffff8100c20a>] ? child_rip+0xa/0x20
      17:25:05: [<ffffffff8109ae50>] ? kthread+0x0/0xa0
      17:25:05: [<ffffffff8100c200>] ? child_rip+0x0/0x20

          Activity

            ys Yang Sheng added a comment - Another instance: https://testing.whamcloud.com/test_sessions/a775bc96-19e5-4cfa-a841-4acd8d9d6b6d
            yujian Jian Yu added a comment -

            The failure occurred while testing patch http://review.whamcloud.com/11213 on master branch with DNE configuration:
            https://testing.hpdd.intel.com/test_sets/fded45b8-5882-11e4-b081-5254006e85c2

            utopiabound Nathaniel Clark added a comment -

            Another instance, in review-dne-part-1 conf-sanity/22 on master:
            https://maloo.whamcloud.com/test_sets/96808c42-e838-11e3-9bed-52540035b04c

            yong.fan nasf (Inactive) added a comment -

            First, the ASSERT() indicates that some code path called dt_read_lock() without a matching dt_read_unlock().

            Second, the ASSERT() fired on a ptlrpcd thread stack. In general, a ptlrpcd thread should not call dt_{read,write}_lock(), to avoid blocking.

            This log alone is not enough to locate the problem. We need either more logs, or a review of the related code that checks each dt_read_lock() call one by one.
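
            To illustrate the kind of imbalance described in the comment above, here is a minimal sketch in plain C. It is not Lustre code: the struct and every sketch_* function are hypothetical stand-ins, and only the idea of a per-thread read-lock counter that must be zero on exit is taken from the assertion text in the log. It assumes the usual cause of such a leak is an early return on an error path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-thread OSD state; only the idea of a
 * read-lock counter comes from the assertion text in the log. */
struct sketch_thread_info {
    int r_locks; /* read locks this thread currently holds */
};

static void sketch_read_lock(struct sketch_thread_info *info)
{
    info->r_locks++;
}

static void sketch_read_unlock(struct sketch_thread_info *info)
{
    assert(info->r_locks > 0); /* unlocking without a prior lock is also a bug */
    info->r_locks--;
}

/* Mirrors the exit-time check: true when every read lock was released. */
static bool sketch_exit_is_clean(const struct sketch_thread_info *info)
{
    return info->r_locks == 0;
}

/* Leaky pattern: the error path returns while still holding the lock. */
static int sketch_leaky_handler(struct sketch_thread_info *info, bool fail)
{
    sketch_read_lock(info);
    if (fail)
        return -1; /* missing sketch_read_unlock() */
    sketch_read_unlock(info);
    return 0;
}

/* Balanced pattern: a single exit path always releases the lock. */
static int sketch_balanced_handler(struct sketch_thread_info *info, bool fail)
{
    int rc = fail ? -1 : 0;

    sketch_read_lock(info);
    /* ... work that may fail would go here ... */
    sketch_read_unlock(info);
    return rc;
}
```

            In this sketch, the failure path of sketch_leaky_handler() leaves the counter at 1, so the exit-time check reports an unclean exit. That is the same shape of condition as the failed ASSERTION( info->oti_r_locks == 0 ) in the log, although the actual leaking call site in Lustre is not identified here.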

            jlevi Jodi Levi (Inactive) added a comment -

            Fan Yong and Di,
            Could you have a look and comment on this one?
            Thank you!

            People

              wc-triage WC Triage
              maloo Maloo
              Votes: 0
              Watchers: 10