Lustre / LU-11483

replay-dual test_25: ofd_lvbo_init()) ASSERTION( env ) failed

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.12.0
    • Lustre 2.12.0
    • 3

    Description

      This issue was created by maloo for Oleg Drokin <green@whamcloud.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/c9780dee-cb39-11e8-ad90-52540065bddc

      test_25 failed with the following error:

      trevis-13vm6 crashed during replay-dual test_25
      [ 1322.092883] LustreError: Skipped 1 previous similar message
      [ 1322.119196] Lustre: lustre-OST0000: Connection restored to  (at 10.9.4.146@tcp)
      [ 1327.099384] LustreError: 137-5: lustre-OST0001_UUID: not available for connect from 10.9.4.152@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
      [ 1327.101153] LustreError: Skipped 5 previous similar messages
      [ 1327.124964] Lustre: lustre-OST0000: Connection restored to  (at 10.9.4.146@tcp)
      [ 1327.185451] Lustre: lustre-OST0000: Recovery over after 0:19, of 3 clients 3 recovered and 0 were evicted.
      [ 1327.188206] LustreError: 2588:0:(ofd_lvb.c:95:ofd_lvbo_init()) ASSERTION( env ) failed: 
      [ 1327.189159] LustreError: 2588:0:(ofd_lvb.c:95:ofd_lvbo_init()) LBUG
      [ 1327.189473] Lustre: lustre-OST0000: deleting orphan objects from 0x0:9699 to 0x0:9729
      [ 1327.190483] Pid: 2588, comm: tgt_recover_0 3.10.0-862.9.1.el7_lustre.x86_64 #1 SMP Thu Sep 13 05:07:47 UTC 2018
      [ 1327.191428] Call Trace:
      [ 1327.191691]  [<ffffffffc09937cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
      [ 1327.192374]  [<ffffffffc099387c>] lbug_with_loc+0x4c/0xa0 [libcfs]
      [ 1327.192994]  [<ffffffffc1196a3b>] ofd_lvbo_init+0x70b/0x800 [ofd]
      [ 1327.193598]  [<ffffffffc0d27b70>] ldlm_server_completion_ast+0x600/0x9b0 [ptlrpc]
      [ 1327.194512]  [<ffffffffc0cf9748>] ldlm_work_cp_ast_lock+0xa8/0x1d0 [ptlrpc]
      [ 1327.195228]  [<ffffffffc0d414ea>] ptlrpc_set_wait+0x7a/0x8d0 [ptlrpc]
      [ 1327.195898]  [<ffffffffc0cff175>] ldlm_run_ast_work+0xd5/0x3a0 [ptlrpc]
      [ 1327.196572]  [<ffffffffc0d006d9>] __ldlm_reprocess_all+0x129/0x380 [ptlrpc]
      [ 1327.197347]  [<ffffffffc0d00c96>] ldlm_reprocess_res+0x26/0x30 [ptlrpc]
      [ 1327.198080]  [<ffffffffc099ff30>] cfs_hash_for_each_relax+0x250/0x450 [libcfs]
      [ 1327.198793]  [<ffffffffc09a32c5>] cfs_hash_for_each_nolock+0x75/0x1c0 [libcfs]
      [ 1327.199533]  [<ffffffffc0d00cdc>] ldlm_reprocess_recovery_done+0x3c/0x110 [ptlrpc]
      [ 1327.200367]  [<ffffffffc0d1362a>] target_recovery_thread+0xa7a/0x1370 [ptlrpc]
      [ 1327.201102]  [<ffffffff8e8bb621>] kthread+0xd1/0xe0
      [ 1327.201600]  [<ffffffff8ef205f7>] ret_from_fork_nospec_end+0x0/0x39
      [ 1327.202240]  [<ffffffffffffffff>] 0xffffffffffffffff
      [ 1327.202817] Kernel panic - not syncing: LBUG
      

      This seems to have been introduced by the newly landed change https://review.whamcloud.com/#/c/32832/
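The trace above shows ofd_lvbo_init() reached from the recovery thread (tgt_recover_0) via ldlm_server_completion_ast(), with the assertion on env failing. The following is a minimal userspace sketch of that failure pattern, not the Lustre code and not the fix that landed. All names in it (struct exec_env, struct resource, lvbo_init_strict, lvbo_init_tolerant) are hypothetical stand-ins. It only illustrates the difference between a callee that treats a missing environment as fatal and one that falls back to a short-lived local environment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an execution environment (not struct lu_env). */
struct exec_env {
	int initialized;
};

/* Hypothetical stand-in for a lock resource that carries a lock value block. */
struct resource {
	long lvb_size;	/* cached object size, filled in by init */
	int  lvb_ready;	/* set to 1 once the LVB has been initialized */
};

/*
 * Strict variant: mirrors the pattern in the trace, where a NULL env is
 * treated as a fatal condition. It returns -1 instead of panicking so that
 * the sketch can be exercised in userspace.
 */
static int lvbo_init_strict(const struct exec_env *env, struct resource *res)
{
	assert(res != NULL);

	if (env == NULL)
		return -1;	/* kernel code would hit ASSERTION( env ) here */

	res->lvb_size = 0;
	res->lvb_ready = 1;
	return 0;
}

/*
 * Tolerant variant: a caller without an environment (for example a
 * callback run from a recovery thread) gets a temporary local one.
 */
static int lvbo_init_tolerant(const struct exec_env *env, struct resource *res)
{
	struct exec_env local_env;

	if (env == NULL) {
		local_env.initialized = 1;
		env = &local_env;
	}

	return lvbo_init_strict(env, res);
}
```

With a NULL environment, lvbo_init_strict() reports failure and leaves the resource untouched, while lvbo_init_tolerant() completes the initialization. Whether the callee should tolerate a NULL env or the caller should always supply one is a design choice this sketch does not settle.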

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      replay-dual test_25 - trevis-13vm6 crashed during replay-dual test_25

            People

              bzzz Alex Zhuravlev
              maloo Maloo