Details
Bug
Resolution: Duplicate
Major
None
Lustre 2.7.0
None
3
14984
Description
Over the last couple of weeks I have started to see a lot of failures of this sort on master when running replay-* tests (the actual subtests vary):
<4>[17185.946188] Lustre: DEBUG MARKER: == replay-dual test 24: reconstruct on non-existing object == 20:09:49 (1405901389)
<6>[17186.151929] Lustre: *** cfs_fail_loc=119, val=2147483648***
<3>[17186.152579] LustreError: 15607:0:(ldlm_lib.c:2399:target_send_reply_msg()) @@@ dropping reply req@ffff880077707be8 x1474193179823612/t90194313221(0) o36->04f0aee6-2b55-a39b-00e5-049443c0f852@0@lo:0/0 lens 488/456 e 0 to 0 dl 1405901396 ref 1 fl Interpret:/0/0 rc 0/0
<4>[17203.151745] Lustre: lustre-MDT0000: Client 04f0aee6-2b55-a39b-00e5-049443c0f852 (at 0@lo) reconnecting
<4>[17203.153386] Lustre: cannot lookup [0x2000013a0:0x73:0x0]: rc = -2; evicting client 04f0aee6-2b55-a39b-00e5-049443c0f852 with export 0@lo
<3>[17203.156210] LustreError: 167-0: lustre-MDT0000-mdc-ffff8800843257f0: This client was evicted by lustre-MDT0000; in progress operations using this service will fail.
<3>[17203.158004] LustreError: Skipped 1 previous similar message
<0>[17203.158200] LustreError: 1764:0:(ldlm_lib.c:1312:target_destroy_export()) ASSERTION( atomic_read(&exp->exp_cb_count) == 0 ) failed: value: 1
<0>[17203.158203] LustreError: 1764:0:(ldlm_lib.c:1312:target_destroy_export()) LBUG
<4>[17203.158204] Pid: 1764, comm: obd_zombid
<4>[17203.158204]
<4>[17203.158205] Call Trace:
<4>[17203.158218] [<ffffffffa0d3a8a5>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
<4>[17203.158228] [<ffffffffa0d3aea7>] lbug_with_loc+0x47/0xb0 [libcfs]
<4>[17203.158256] [<ffffffffa13a45a1>] target_destroy_export+0x131/0x170 [ptlrpc]
<4>[17203.158267] [<ffffffffa087a635>] mdt_destroy_export+0x45/0x220 [mdt]
<4>[17203.158292] [<ffffffffa0dd1d5b>] obd_zombie_impexp_cull+0x2db/0x5f0 [obdclass]
<4>[17203.158308] [<ffffffffa0dd20d5>] obd_zombie_impexp_thread+0x65/0x190 [obdclass]
<4>[17203.158311] [<ffffffff8105de00>] ? default_wake_function+0x0/0x20
<4>[17203.158327] [<ffffffffa0dd2070>] ? obd_zombie_impexp_thread+0x0/0x190 [obdclass]
<4>[17203.158329] [<ffffffff81098c06>] kthread+0x96/0xa0
<4>[17203.158332] [<ffffffff8100c24a>] child_rip+0xa/0x20
<4>[17203.158333] [<ffffffff81098b70>] ? kthread+0x0/0xa0
<4>[17203.158335] [<ffffffff8100c240>] ? child_rip+0x0/0x20
<4>[17203.158336]
<3>[17203.189643] LustreError: 16573:0:(vvp_io.c:1203:vvp_io_init()) lustre: refresh file layout [0x2000013a0:0x73:0x0] error -5.
<0>[17203.192716] Kernel panic - not syncing: LBUG
Example crash with modules: /exports/crashdumps/192.168.10.221-2014-07-20-20\:10\:08/
Tag in my tree: master-20140720