Details
-
Bug
-
Resolution: Cannot Reproduce
-
Minor
-
3
Description
Just had this crash in sanity test 134a:
Oct 11 14:20:08 centos6-14 kernel: [139674.309218] LNetError: 29399:0:(lib-move.c:661:lnet_ni_eager_recv()) ASSERTION( msg->msg_receiving ) failed:
Oct 11 14:20:08 centos6-14 kernel: [139674.310338] LNetError: 29399:0:(lib-move.c:661:lnet_ni_eager_recv()) LBUG
Oct 11 14:20:08 centos6-14 kernel: [139674.310909] Pid: 29399, comm: mdt00_003
Oct 11 14:20:08 centos6-14 kernel: [139674.311699]
Oct 11 14:20:08 centos6-14 kernel: [139674.311700] Call Trace:
Oct 11 14:20:08 centos6-14 kernel: [139674.312345] [<ffffffffa0ad6885>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
Oct 11 14:20:08 centos6-14 kernel: [139674.312897] [<ffffffffa0ad6e87>] lbug_with_loc+0x47/0xb0 [libcfs]
Oct 11 14:20:08 centos6-14 kernel: [139674.313281] [<ffffffffa0cebdd0>] lnet_ni_eager_recv+0x1e0/0x220 [lnet]
Oct 11 14:20:08 centos6-14 kernel: [139674.314228] [<ffffffffa0cee5ad>] lnet_parse_local+0x54d/0xc50 [lnet]
Oct 11 14:20:08 centos6-14 kernel: [139674.314859] [<ffffffff8117757a>] ? cache_alloc_debugcheck_after+0x14a/0x210
Oct 11 14:20:08 centos6-14 kernel: [139674.315554] [<ffffffffa0cef37a>] lnet_parse+0x6ca/0xd20 [lnet]
Oct 11 14:20:08 centos6-14 kernel: [139674.316120] [<ffffffffa0cf014b>] lolnd_send+0x2b/0xa0 [lnet]
Oct 11 14:20:08 centos6-14 kernel: [139674.326047] [<ffffffffa0ce86eb>] lnet_ni_send+0x4b/0xf0 [lnet]
Oct 11 14:20:08 centos6-14 kernel: [139674.326683] [<ffffffffa0cecd63>] lnet_send+0x883/0xba0 [lnet]
Oct 11 14:20:08 centos6-14 kernel: [139674.327206] [<ffffffffa0cedb4c>] LNetPut+0x2fc/0x810 [lnet]
Oct 11 14:20:08 centos6-14 kernel: [139674.327759] [<ffffffffa1375410>] ptl_send_buf+0x1e0/0x540 [ptlrpc]
Oct 11 14:20:08 centos6-14 kernel: [139674.336282] [<ffffffff81042f1c>] ? kvm_clock_read+0x1c/0x20
Oct 11 14:20:08 centos6-14 kernel: [139674.336945] [<ffffffffa1378af5>] ptl_send_rpc+0x665/0xdf0 [ptlrpc]
Oct 11 14:20:08 centos6-14 kernel: [139674.337816] [<ffffffffa136e536>] ptlrpc_send_new_req+0x526/0x980 [ptlrpc]
Oct 11 14:20:08 centos6-14 kernel: [139674.351995] [<ffffffffa136e9fd>] ptlrpc_set_add_req+0x6d/0xb0 [ptlrpc]
Oct 11 14:20:08 centos6-14 kernel: [139674.352573] [<ffffffffa135affe>] ldlm_server_blocking_ast+0x64e/0x8c0 [ptlrpc]
Oct 11 14:20:08 centos6-14 kernel: [139674.353593] [<ffffffffa13ddf49>] tgt_blocking_ast+0x1b9/0x8c0 [ptlrpc]
Oct 11 14:20:08 centos6-14 kernel: [139674.354180] [<ffffffffa0ad634f>] ? cfs_trace_unlock_tcd+0x3f/0xa0 [libcfs]
Oct 11 14:20:08 centos6-14 kernel: [139674.354650] [<ffffffffa0ae2563>] ? libcfs_debug_vmsg2+0x5d3/0xbd0 [libcfs]
Oct 11 14:20:08 centos6-14 kernel: [139674.355232] [<ffffffffa132d094>] ldlm_work_revoke_ast_lock+0xa4/0x1a0 [ptlrpc]
Oct 11 14:20:08 centos6-14 kernel: [139674.366338] [<ffffffffa1372007>] ptlrpc_set_wait+0x77/0x9d0 [ptlrpc]
Oct 11 14:20:08 centos6-14 kernel: [139674.366889] [<ffffffff8117a334>] ? kmem_cache_alloc_node_trace+0x144/0x210
Oct 11 14:20:08 centos6-14 kernel: [139674.371428] [<ffffffffa136919f>] ? ptlrpc_prep_set+0x5f/0x290 [ptlrpc]
Oct 11 14:20:08 centos6-14 kernel: [139674.371957] [<ffffffff810a00e4>] ? __init_waitqueue_head+0x24/0x40
Oct 11 14:20:08 centos6-14 kernel: [139674.372609] [<ffffffffa1369223>] ? ptlrpc_prep_set+0xe3/0x290 [ptlrpc]
Oct 11 14:20:08 centos6-14 kernel: [139674.373181] [<ffffffffa132cff0>] ? ldlm_work_revoke_ast_lock+0x0/0x1a0 [ptlrpc]
Oct 11 14:20:08 centos6-14 kernel: [139674.374623] [<ffffffffa132a0cf>] ldlm_run_ast_work+0xcf/0x440 [ptlrpc]
Oct 11 14:20:08 centos6-14 kernel: [139674.375192] [<ffffffffa1366a46>] ldlm_reclaim_full+0x536/0x8d0 [ptlrpc]
Oct 11 14:20:08 centos6-14 kernel: [139674.375751] [<ffffffffa135bb4c>] ldlm_handle_enqueue0+0x14c/0x1580 [ptlrpc]
Oct 11 14:20:08 centos6-14 kernel: [139674.376318] [<ffffffffa13d0d91>] ? tgt_lookup_reply+0x31/0x190 [ptlrpc]
Oct 11 14:20:08 centos6-14 kernel: [139674.376950] [<ffffffffa13e2f71>] tgt_enqueue+0x61/0x230 [ptlrpc]
Oct 11 14:20:08 centos6-14 kernel: [139674.377476] [<ffffffffa13e3bbc>] tgt_request_handle+0x8bc/0x12e0 [ptlrpc]
Oct 11 14:20:08 centos6-14 kernel: [139674.377925] [<ffffffffa138ecd4>] ptlrpc_main+0xd74/0x1850 [ptlrpc]
Oct 11 14:20:08 centos6-14 kernel: [139674.378344] [<ffffffffa138df60>] ? ptlrpc_main+0x0/0x1850 [ptlrpc]
Oct 11 14:20:08 centos6-14 kernel: [139674.378827] [<ffffffff8109f82e>] kthread+0x9e/0xc0
Oct 11 14:20:08 centos6-14 kernel: [139674.379313] [<ffffffff8100c2ca>] child_rip+0xa/0x20
Oct 11 14:20:08 centos6-14 kernel: [139674.379826] [<ffffffff8109f790>] ? kthread+0x0/0xc0
Oct 11 14:20:08 centos6-14 kernel: [139674.380396] [<ffffffff8100c2c0>] ? child_rip+0x0/0x20
This is current master plus a few patches, two of them in lnet (LU-7245 and LU-5733), but I think the crash is unrelated.
Crashdump failed.
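
Since there is no crashdump, the call trace above is the only evidence. For triage it can help to reduce it to a list of frames, separating the ones the unwinder marked with "?" (possibly stale stack entries) from the rest. The following is only a rough sketch, not part of any Lustre tooling: the names `FRAME_RE` and `parse_frames` are made up for illustration, and the regex assumes the frame layout seen in this log (`[<address>] [?] symbol+0xoff/0xsize [module]`).

```python
import re

# Hypothetical helper, written for the frame format in this ticket only.
# Example frame: "[<ffffffffa0cebdd0>] lnet_ni_eager_recv+0x1e0/0x220 [lnet]"
# A "?" before the symbol marks a frame the unwinder could not confirm.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:[ \t]+\[(?P<module>[A-Za-z_]\w*)\])?"
)


def parse_frames(log_text):
    """Return the stack frames found in log_text, in order of appearance.

    Works on both one-entry-per-line logs and logs flattened onto one line,
    because it searches for frame patterns rather than splitting on newlines.
    """
    frames = []
    for m in FRAME_RE.finditer(log_text):
        frames.append({
            "symbol": m.group("symbol"),
            # Frames without a trailing [module] belong to the core kernel.
            "module": m.group("module") or "kernel",
            "reliable": m.group("unreliable") is None,
        })
    return frames
```

Calling `parse_frames()` on the log text and keeping only the entries with `reliable` set to true gives the confirmed call chain, which here runs from `ptlrpc_main` through `ldlm_reclaim_full` and `LNetPut` down to `lnet_ni_eager_recv`.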