Details
-
Bug
-
Resolution: Duplicate
-
Blocker
-
None
-
None
-
lola
build: tip of master (commit ae3a2891f10a19acf855a90337316dda704da5d) + patches
-
3
Description
The error occurred during soak testing of build '20151214' (see https://wiki.hpdd.intel.com/pages/viewpage.action?title=Soak+Testing+on+Lola&spaceKey=Releases#SoakTestingonLola-20151214).
DNE is enabled. The MDS nodes are configured in an active-active HA configuration. MDTs have been formatted with ldiskfs and OSTs with zfs as the storage backend.
The error happened during normal operation (no faults injected) on MDS node lola-11:
Dec 17 20:31:54 lola-11 kernel: LustreError: 5456:0:(llog_cat.c:465:llog_cat_add_rec()) llog_write_rec -5: lh=ffff8803e78994c0
Dec 17 20:31:54 lola-11 kernel: LustreError: 5324:0:(osp_trans.c:1574:osp_trans_stop()) ASSERTION( !list_empty(&oth->ot_our->our_list) ) failed:
Dec 17 20:31:54 lola-11 kernel: LustreError: 5324:0:(osp_trans.c:1574:osp_trans_stop()) LBUG
Dec 17 20:31:54 lola-11 kernel: Pid: 5324, comm: mdt02_001
Dec 17 20:31:54 lola-11 kernel:
Dec 17 20:31:54 lola-11 kernel: Call Trace:
Dec 17 20:31:54 lola-11 kernel: [<ffffffffa07c3875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
Dec 17 20:31:54 lola-11 kernel: [<ffffffffa07c3e77>] lbug_with_loc+0x47/0xb0 [libcfs]
Dec 17 20:31:54 lola-11 kernel: [<ffffffffa141989d>] osp_trans_stop+0x40d/0x440 [osp]
Dec 17 20:31:54 lola-11 kernel: [<ffffffffa0b836b8>] dt_trans_stop+0x18/0x50 [ptlrpc]
Dec 17 20:31:54 lola-11 kernel: [<ffffffffa0b88fc5>] top_trans_stop+0x5d5/0xe60 [ptlrpc]
Dec 17 20:31:54 lola-11 kernel: [<ffffffffa1321ffe>] ? lod_attr_set+0x12e/0xaa0 [lod]
Dec 17 20:31:54 lola-11 kernel: [<ffffffffa08fd660>] ? lu_ucred+0x20/0x30 [obdclass]
Dec 17 20:31:54 lola-11 kernel: [<ffffffffa130491c>] lod_trans_stop+0x2bc/0x330 [lod]
Dec 17 20:31:54 lola-11 kernel: [<ffffffffa13b001a>] mdd_trans_stop+0x1a/0xac [mdd]
Dec 17 20:31:54 lola-11 kernel: [<ffffffffa139f3ca>] mdd_create+0x12ea/0x1600 [mdd]
Dec 17 20:31:54 lola-11 kernel: [<ffffffffa124b834>] ? mdt_version_save+0x84/0x1a0 [mdt]
Dec 17 20:31:54 lola-11 kernel: [<ffffffffa1250366>] mdt_reint_create+0xbb6/0xcc0 [mdt]
Dec 17 20:31:54 lola-11 kernel: [<ffffffffa0b0912b>] ? lustre_pack_reply_v2+0x1eb/0x280 [ptlrpc]
Dec 17 20:31:54 lola-11 kernel: [<ffffffff81294a3a>] ? strlcpy+0x4a/0x60
Dec 17 20:31:54 lola-11 kernel: [<ffffffffa124aa2d>] mdt_reint_rec+0x5d/0x200 [mdt]
Dec 17 20:31:54 lola-11 kernel: [<ffffffffa123681b>] mdt_reint_internal+0x62b/0xb80 [mdt]
Dec 17 20:31:54 lola-11 kernel: [<ffffffffa123720b>] mdt_reint+0x6b/0x120 [mdt]
Dec 17 20:31:54 lola-11 kernel: [<ffffffffa0b7350c>] tgt_request_handle+0x8ec/0x1470 [ptlrpc]
Dec 17 20:31:54 lola-11 kernel: [<ffffffffa0b1acc1>] ptlrpc_main+0xe41/0x1910 [ptlrpc]
Dec 17 20:31:54 lola-11 kernel: [<ffffffff8152a39e>] ? thread_return+0x4e/0x7d0
Dec 17 20:31:54 lola-11 kernel: [<ffffffffa0b19e80>] ? ptlrpc_main+0x0/0x1910 [ptlrpc]
Dec 17 20:31:54 lola-11 kernel: [<ffffffff8109e78e>] kthread+0x9e/0xc0
Dec 17 20:31:54 lola-11 kernel: [<ffffffff8100c28a>] child_rip+0xa/0x20
Dec 17 20:31:54 lola-11 kernel: [<ffffffff8109e6f0>] ? kthread+0x0/0xc0
Dec 17 20:31:54 lola-11 kernel: [<ffffffff8100c280>] ? child_rip+0x0/0x20
Dec 17 20:31:54 lola-11 kernel:
Dec 17 20:31:54 lola-11 kernel: LustreError: dumping log to /tmp/lustre-log.1450413114.5324
No events on the other soak nodes coincide with this error.
Attached are the syslog, messages, and the kernel debug log named in the error message.
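When matching the attached debug log to the LBUG, it may help to decode the dump file name from the last log line. The sketch below assumes the name follows the pattern lustre-log.<epoch seconds>.<pid>; that reading is inferred from this log line rather than stated in the ticket, and parse_dump_name is a hypothetical helper, not part of any Lustre tool.

```python
import re
from datetime import datetime, timezone

# Assumed naming pattern: lustre-log.<epoch seconds>.<pid>
DUMP_RE = re.compile(r"lustre-log\.(\d+)\.(\d+)$")


def parse_dump_name(path):
    """Return (UTC datetime, pid) decoded from a debug dump path."""
    match = DUMP_RE.search(path)
    if match is None:
        raise ValueError(f"not a recognised dump file name: {path!r}")
    epoch, pid = int(match.group(1)), int(match.group(2))
    return datetime.fromtimestamp(epoch, tz=timezone.utc), pid


when, pid = parse_dump_name("/tmp/lustre-log.1450413114.5324")
print(when.isoformat(), pid)  # 2015-12-18T04:31:54+00:00 5324
```

Under that assumption the PID (5324) matches the thread that hit the LBUG (mdt02_001), and 04:31:54 UTC on Dec 18 agrees with the syslog stamp of Dec 17 20:31:54 if lola-11 logs in a UTC-8 local time zone. The binary dump itself is normally converted to text with lctl debug_file before reading.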