Lustre / LU-7739

replay-single test 70b hangs with LBUG '(mgc_request.c:995:mgc_blocking_ast()) ASSERTION( atomic_read(&cld->cld_refcount) > 0 )'

Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Critical
    • Affects Version/s: Lustre 2.8.0, Lustre 2.9.0
    • Environment: autotest review-dne-part-2
    • Severity: 3

    Description

      replay-single test_70b times out. In the console logs for MDS 2, MDS 3, and MDS 4, we see:

      13:15:37:LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. quota=on. Opts: 
      13:15:37:LustreError: 25429:0:(mgc_request.c:995:mgc_blocking_ast()) ASSERTION( atomic_read(&cld->cld_refcount) > 0 ) failed: 
      13:15:37:LustreError: 25429:0:(mgc_request.c:995:mgc_blocking_ast()) LBUG
      13:15:37:Pid: 25429, comm: ldlm_bl_01
      13:15:37:
      13:15:37:Call Trace:
      13:15:37: [<ffffffffa0467875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      13:15:37: [<ffffffffa0467e77>] lbug_with_loc+0x47/0xb0 [libcfs]
      13:15:37: [<ffffffffa0cff9d9>] mgc_blocking_ast+0x6e9/0x810 [mgc]
      13:15:37: [<ffffffffa0758b57>] ldlm_cancel_callback+0x87/0x280 [ptlrpc]
      13:15:37: [<ffffffffa07779ba>] ldlm_cli_cancel_local+0x8a/0x470 [ptlrpc]
      13:15:37: [<ffffffffa077c55c>] ldlm_cli_cancel+0x9c/0x3e0 [ptlrpc]
      13:15:37: [<ffffffffa0cff3db>] mgc_blocking_ast+0xeb/0x810 [mgc]
      13:15:37: [<ffffffffa0cff2f0>] ? mgc_blocking_ast+0x0/0x810 [mgc]
      13:15:37: [<ffffffffa0780c90>] ldlm_handle_bl_callback+0x130/0x400 [ptlrpc]
      13:15:37: [<ffffffffa0781ba1>] ldlm_bl_thread_main+0x481/0x710 [ptlrpc]
      13:15:37: [<ffffffff810672b0>] ? default_wake_function+0x0/0x20
      13:15:37: [<ffffffffa0781720>] ? ldlm_bl_thread_main+0x0/0x710 [ptlrpc]
      13:15:37: [<ffffffff810a0fce>] kthread+0x9e/0xc0
      13:15:37: [<ffffffff8100c28a>] child_rip+0xa/0x20
      13:15:37: [<ffffffff810a0f30>] ? kthread+0x0/0xc0
      13:15:37: [<ffffffff8100c280>] ? child_rip+0x0/0x20
      13:15:37:
      13:15:37:Kernel panic - not syncing: LBUG
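
      The trace above lists mgc_blocking_ast twice among its reliable frames: the outer call (+0xeb) goes into ldlm_cli_cancel, and ldlm_cancel_callback then invokes the blocking AST again (+0x6e9), where the assertion fires. As a reading aid only, the following minimal Python sketch parses the frames and reports any function that appears more than once. The helper names parse_frames and reentered are hypothetical and not part of Lustre or its test tooling, and the sketch does not diagnose the root cause.

```python
import re
from collections import Counter

# Matches a frame such as:
#   13:15:37: [<ffffffffa0cff9d9>] mgc_blocking_ast+0x6e9/0x810 [mgc]
# A "? " before the symbol marks a frame the unwinder considers unreliable.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<func>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

def parse_frames(log_text):
    """Return one dict per stack frame found in a console log excerpt."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append({
                "func": m["func"],
                "offset": int(m["offset"], 16),
                "module": m["module"] or "kernel",
                "reliable": m["unreliable"] is None,
            })
    return frames

def reentered(frames):
    """Names of functions that occur more than once among reliable frames."""
    counts = Counter(f["func"] for f in frames if f["reliable"])
    return sorted(name for name, n in counts.items() if n > 1)

# Call trace copied from the console log in this ticket.
SAMPLE = """\
13:15:37: [<ffffffffa0467875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
13:15:37: [<ffffffffa0467e77>] lbug_with_loc+0x47/0xb0 [libcfs]
13:15:37: [<ffffffffa0cff9d9>] mgc_blocking_ast+0x6e9/0x810 [mgc]
13:15:37: [<ffffffffa0758b57>] ldlm_cancel_callback+0x87/0x280 [ptlrpc]
13:15:37: [<ffffffffa07779ba>] ldlm_cli_cancel_local+0x8a/0x470 [ptlrpc]
13:15:37: [<ffffffffa077c55c>] ldlm_cli_cancel+0x9c/0x3e0 [ptlrpc]
13:15:37: [<ffffffffa0cff3db>] mgc_blocking_ast+0xeb/0x810 [mgc]
13:15:37: [<ffffffffa0cff2f0>] ? mgc_blocking_ast+0x0/0x810 [mgc]
13:15:37: [<ffffffffa0780c90>] ldlm_handle_bl_callback+0x130/0x400 [ptlrpc]
13:15:37: [<ffffffffa0781ba1>] ldlm_bl_thread_main+0x481/0x710 [ptlrpc]
13:15:37: [<ffffffff810672b0>] ? default_wake_function+0x0/0x20
13:15:37: [<ffffffffa0781720>] ? ldlm_bl_thread_main+0x0/0x710 [ptlrpc]
13:15:37: [<ffffffff810a0fce>] kthread+0x9e/0xc0
13:15:37: [<ffffffff8100c28a>] child_rip+0xa/0x20
13:15:37: [<ffffffff810a0f30>] ? kthread+0x0/0xc0
13:15:37: [<ffffffff8100c280>] ? child_rip+0x0/0x20
"""

if __name__ == "__main__":
    print(reentered(parse_frames(SAMPLE)))
```

      Frames marked with "?" are excluded from the count because the kernel prints them as possibly stale stack entries.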
      

      In the past month, I can find only two occurrences of this error for test_70b. Logs are at:
      2016-01-28 15:21:30 - https://testing.hpdd.intel.com/test_sets/c296d92c-c620-11e5-b4e1-5254006e85c2
      2016-02-03 19:34:24 - https://testing.hpdd.intel.com/test_sets/e4674cb8-caf7-11e5-be8d-5254006e85c2

            People

              Assignee: wc-triage WC Triage
              Reporter: jamesanunez James Nunez (Inactive)