Lustre / LU-5377

target_destroy_export() ASSERTION( atomic_read(&exp->exp_cb_count) == 0 ) failed: value: 1


Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major
    • Fix Version/s: Lustre 2.7.0

    Description

      Over the last couple of weeks I have started to see a lot of failures of this sort on master when running replay-* tests (the actual subtests vary):

      <4>[17185.946188] Lustre: DEBUG MARKER: == replay-dual test 24: reconstruct on non-existing object == 20:09:49 (1405901389)
      <6>[17186.151929] Lustre: *** cfs_fail_loc=119, val=2147483648***
      <3>[17186.152579] LustreError: 15607:0:(ldlm_lib.c:2399:target_send_reply_msg()) @@@ dropping reply  req@ffff880077707be8 x1474193179823612/t90194313221(0) o36->04f0aee6-2b55-a39b-00e5-049443c0f852@0@lo:0/0 lens 488/456 e 0 to 0 dl 1405901396 ref 1 fl Interpret:/0/0 rc 0/0
      <4>[17203.151745] Lustre: lustre-MDT0000: Client 04f0aee6-2b55-a39b-00e5-049443c0f852 (at 0@lo) reconnecting
      <4>[17203.153386] Lustre: cannot lookup [0x2000013a0:0x73:0x0]: rc = -2; evicting client 04f0aee6-2b55-a39b-00e5-049443c0f852 with export 0@lo
      <3>[17203.156210] LustreError: 167-0: lustre-MDT0000-mdc-ffff8800843257f0: This client was evicted by lustre-MDT0000; in progress operations using this service will fail.
      <3>[17203.158004] LustreError: Skipped 1 previous similar message
      <0>[17203.158200] LustreError: 1764:0:(ldlm_lib.c:1312:target_destroy_export()) ASSERTION( atomic_read(&exp->exp_cb_count) == 0 ) failed: value: 1
      <0>[17203.158203] LustreError: 1764:0:(ldlm_lib.c:1312:target_destroy_export()) LBUG
      <4>[17203.158204] Pid: 1764, comm: obd_zombid
      <4>[17203.158204] 
      <4>[17203.158205] Call Trace:
      <4>[17203.158218]  [<ffffffffa0d3a8a5>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      <4>[17203.158228]  [<ffffffffa0d3aea7>] lbug_with_loc+0x47/0xb0 [libcfs]
      <4>[17203.158256]  [<ffffffffa13a45a1>] target_destroy_export+0x131/0x170 [ptlrpc]
      <4>[17203.158267]  [<ffffffffa087a635>] mdt_destroy_export+0x45/0x220 [mdt]
      <4>[17203.158292]  [<ffffffffa0dd1d5b>] obd_zombie_impexp_cull+0x2db/0x5f0 [obdclass]
      <4>[17203.158308]  [<ffffffffa0dd20d5>] obd_zombie_impexp_thread+0x65/0x190 [obdclass]
      <4>[17203.158311]  [<ffffffff8105de00>] ? default_wake_function+0x0/0x20
      <4>[17203.158327]  [<ffffffffa0dd2070>] ? obd_zombie_impexp_thread+0x0/0x190 [obdclass]
      <4>[17203.158329]  [<ffffffff81098c06>] kthread+0x96/0xa0
      <4>[17203.158332]  [<ffffffff8100c24a>] child_rip+0xa/0x20
      <4>[17203.158333]  [<ffffffff81098b70>] ? kthread+0x0/0xa0
      <4>[17203.158335]  [<ffffffff8100c240>] ? child_rip+0x0/0x20
      <4>[17203.158336] 
      <3>[17203.189643] LustreError: 16573:0:(vvp_io.c:1203:vvp_io_init()) lustre: refresh file layout [0x2000013a0:0x73:0x0] error -5.
      <0>[17203.192716] Kernel panic - not syncing: LBUG
      

      Example crash with modules: /exports/crashdumps/192.168.10.221-2014-07-20-20\:10\:08/
      tag in my tree: master-20140720
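
      For context, the LBUG fires from an assertion in target_destroy_export() requiring that no commit callbacks are still registered against the export when it is torn down; here the counter was 1, i.e. a callback registered on the eviction/reconstruct path was still outstanding when the obd_zombid thread (obd_zombie_impexp_cull() in the trace) destroyed the export. A minimal, self-contained user-space sketch of that invariant check — the struct and function below are simplified stand-ins, not the actual Lustre definitions, which use the kernel atomic_t exp->exp_cb_count and LASSERT/LBUG:

      ```c
      #include <stdio.h>
      #include <stdlib.h>

      /* Simplified stand-in for the export's callback counter; in Lustre
       * this is atomic_t exp_cb_count, incremented while a commit callback
       * is registered and decremented when the callback fires. */
      struct export_stub {
              int exp_cb_count;
      };

      /* Mirrors the failing check: destroying an export while a callback is
       * still pending (count != 0) is a fatal invariant violation. */
      static void target_destroy_export_stub(struct export_stub *exp)
      {
              if (exp->exp_cb_count != 0) {
                      fprintf(stderr,
                              "ASSERTION( exp_cb_count == 0 ) failed: value: %d\n",
                              exp->exp_cb_count);
                      exit(1);        /* the kernel would LBUG/panic here */
              }
              printf("export destroyed cleanly\n");
      }

      int main(void)
      {
              struct export_stub exp = { .exp_cb_count = 0 };

              target_destroy_export_stub(&exp);       /* count == 0: passes */
              return 0;
      }
      ```

      In the crash above the equivalent of exp_cb_count was 1 at destroy time, so the real question is which path registered a callback without dropping its reference before the export reached the zombie-import/export reaper.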

            People

              Assignee: WC Triage (wc-triage)
              Reporter: Oleg Drokin (green)
              Votes: 0
              Watchers: 3
