[LU-5377] target_destroy_export()) ASSERTION( atomic_read(&exp->exp_cb_count) == 0 ) failed: value: 1 Created: 21/Jul/14  Updated: 18/Feb/15  Resolved: 31/Jul/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Oleg Drokin Assignee: WC Triage
Resolution: Duplicate Votes: 0
Labels: None

Issue Links:
Duplicate
duplicates LU-6255 EL7 client replay-dual test_24: MDS r... Closed
Related
is related to LU-4688 target_destroy_export() LBUG Resolved
Severity: 3
Rank (Obsolete): 14984

 Description   

Over the last couple of weeks I have started seeing many failures like the following on master when running the replay-* tests (the actual subtests vary):

<4>[17185.946188] Lustre: DEBUG MARKER: == replay-dual test 24: reconstruct on non-existing object == 20:09:49 (1405901389)
<6>[17186.151929] Lustre: *** cfs_fail_loc=119, val=2147483648***
<3>[17186.152579] LustreError: 15607:0:(ldlm_lib.c:2399:target_send_reply_msg()) @@@ dropping reply  req@ffff880077707be8 x1474193179823612/t90194313221(0) o36->04f0aee6-2b55-a39b-00e5-049443c0f852@0@lo:0/0 lens 488/456 e 0 to 0 dl 1405901396 ref 1 fl Interpret:/0/0 rc 0/0
<4>[17203.151745] Lustre: lustre-MDT0000: Client 04f0aee6-2b55-a39b-00e5-049443c0f852 (at 0@lo) reconnecting
<4>[17203.153386] Lustre: cannot lookup [0x2000013a0:0x73:0x0]: rc = -2; evicting client 04f0aee6-2b55-a39b-00e5-049443c0f852 with export 0@lo
<3>[17203.156210] LustreError: 167-0: lustre-MDT0000-mdc-ffff8800843257f0: This client was evicted by lustre-MDT0000; in progress operations using this service will fail.
<3>[17203.158004] LustreError: Skipped 1 previous similar message
<0>[17203.158200] LustreError: 1764:0:(ldlm_lib.c:1312:target_destroy_export()) ASSERTION( atomic_read(&exp->exp_cb_count) == 0 ) failed: value: 1
<0>[17203.158203] LustreError: 1764:0:(ldlm_lib.c:1312:target_destroy_export()) LBUG
<4>[17203.158204] Pid: 1764, comm: obd_zombid
<4>[17203.158204] 
<4>[17203.158205] Call Trace:
<4>[17203.158218]  [<ffffffffa0d3a8a5>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
<4>[17203.158228]  [<ffffffffa0d3aea7>] lbug_with_loc+0x47/0xb0 [libcfs]
<4>[17203.158256]  [<ffffffffa13a45a1>] target_destroy_export+0x131/0x170 [ptlrpc]
<4>[17203.158267]  [<ffffffffa087a635>] mdt_destroy_export+0x45/0x220 [mdt]
<4>[17203.158292]  [<ffffffffa0dd1d5b>] obd_zombie_impexp_cull+0x2db/0x5f0 [obdclass]
<4>[17203.158308]  [<ffffffffa0dd20d5>] obd_zombie_impexp_thread+0x65/0x190 [obdclass]
<4>[17203.158311]  [<ffffffff8105de00>] ? default_wake_function+0x0/0x20
<4>[17203.158327]  [<ffffffffa0dd2070>] ? obd_zombie_impexp_thread+0x0/0x190 [obdclass]
<4>[17203.158329]  [<ffffffff81098c06>] kthread+0x96/0xa0
<4>[17203.158332]  [<ffffffff8100c24a>] child_rip+0xa/0x20
<4>[17203.158333]  [<ffffffff81098b70>] ? kthread+0x0/0xa0
<4>[17203.158335]  [<ffffffff8100c240>] ? child_rip+0x0/0x20
<4>[17203.158336] 
<3>[17203.189643] LustreError: 16573:0:(vvp_io.c:1203:vvp_io_init()) lustre: refresh file layout [0x2000013a0:0x73:0x0] error -5.
<0>[17203.192716] Kernel panic - not syncing: LBUG

Example crash with modules: /exports/crashdumps/192.168.10.221-2014-07-20-20\:10\:08/
tag in my tree: master-20140720



 Comments   
Comment by Oleg Drokin [ 21/Jul/14 ]

In fact, after a few more crashes, it appears that all of them happened in replay-dual test 24: reconstruct on non-existing object.
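
One way to check which subtest each crash belongs to is to scan the console log for the last DEBUG MARKER line seen before each assertion. The following is only a rough sketch in Python: the function name `crashing_subtests` and both regular expressions are assumptions based on the log lines quoted in the description, not an existing Lustre tool.

```python
import re
from collections import Counter

# Both patterns are assumptions derived from the console log quoted above.
MARKER_RE = re.compile(r"Lustre: DEBUG MARKER: == (\S+ test \w+):")
ASSERT_RE = re.compile(r"target_destroy_export\(\)\) ASSERTION\(")


def crashing_subtests(lines):
    """Count assertions per subtest, keyed by the last DEBUG MARKER seen."""
    hits = Counter()
    current = None
    for line in lines:
        marker = MARKER_RE.search(line)
        if marker:
            # Remember the most recent test banner.
            current = marker.group(1)
        elif ASSERT_RE.search(line):
            # Attribute the assertion to that banner (or "unknown").
            hits[current or "unknown"] += 1
    return hits


# Two lines copied from the log in the description.
sample = [
    "<4>[17185.946188] Lustre: DEBUG MARKER: == replay-dual test 24: "
    "reconstruct on non-existing object == 20:09:49 (1405901389)",
    "<0>[17203.158200] LustreError: 1764:0:(ldlm_lib.c:1312:"
    "target_destroy_export()) ASSERTION( atomic_read(&exp->exp_cb_count) "
    "== 0 ) failed: value: 1",
]

if __name__ == "__main__":
    for subtest, count in crashing_subtests(sample).items():
        print(subtest, count)
```

Running the same function over several saved console logs (for example, with the lines read from each crash dump's dmesg output) would show whether the crashes cluster in a single subtest.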

Comment by Zhenyu Xu [ 31/Jul/14 ]

Duplicate of LU-4688.
