Details
-
Bug
-
Resolution: Duplicate
-
Blocker
-
Lustre 2.8.0
-
lola
build: 2.7.62-28-g0754bc8, 0754bc8f2623bea184111af216f7567608db35b6; soakbuild '20151104.1'
-
3
-
9223372036854775807
Description
The error occurred during soak testing of build '20151104.1' on cluster lola (see https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20151104.1). MDTs are formatted with ldiskfs and OSTs with zfs as the storage backend. DNE is enabled. The MDSes are configured in an HA failover configuration:
- lola-8,9
  - mdt0/mgs, mdt1 primary node lola-8
  - mdt2,mdt3 primary node lola-9
- lola-10,11
  - mdt4/mdt5 primary node lola-10
  - mdt6,mdt7 primary node lola-11
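As a reading aid, the failover layout listed above can be written down as a small lookup table. This is only an illustrative sketch: the names `HA_PAIRS`, `primary_node` and `failover_partner` are hypothetical and are not part of Lustre or of the soak test framework, and "mdt4/mdt5" is assumed to mean that both targets have lola-10 as their primary node.

```python
# Hypothetical model of the HA layout described in this ticket.
# Each dict is one failover pair: node name -> targets it serves as primary.
HA_PAIRS = [
    {"lola-8": ["mgs", "mdt0", "mdt1"], "lola-9": ["mdt2", "mdt3"]},
    {"lola-10": ["mdt4", "mdt5"], "lola-11": ["mdt6", "mdt7"]},
]


def primary_node(target):
    """Return the node listed as primary for the given target."""
    for pair in HA_PAIRS:
        for node, targets in pair.items():
            if target in targets:
                return node
    raise KeyError(target)


def failover_partner(node):
    """Return the other node of the failover pair that contains `node`."""
    for pair in HA_PAIRS:
        if node in pair:
            (partner,) = [n for n in pair if n != node]
            return partner
    raise KeyError(node)


print(primary_node("mdt7"))         # lola-11
print(failover_partner("lola-11"))  # lola-10
```

With this table, the restarted node lola-11 maps to mdt6 and mdt7, and the node that hit the LBUG, lola-9, maps to mdt2 and mdt3 in the other failover pair.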
Event sequence:
- 2015-11-04 17:15:28 restart of lola-11 finished
- The following errors occurred on lola-11:
lola-11.log:Nov 4 17:16:17 lola-11 kernel: LustreError: 5104:0:(llog.c:581:llog_process_thread()) soaked-MDT0007-osp-MDT0006 retry remote llog process
INTL-156
lola-11.log:Nov 4 17:16:18 lola-11 kernel: LustreError: 4984:0:(ldlm_lib.c:1883:check_for_next_transno()) soaked-MDT0007: waking for gap in transno, VBR is OFF (skip: 558345901253, ql: 5, comp: 11, conn: 16, next: 558345901263, next_update 0 last_committed: 558345900230)
- lola-9 hit LBUG
Error message reads as:
Nov 4 17:16:21 lola-9 kernel: LustreError: 6493:0:(osp_md_object.c:1155:osp_md_write()) ASSERTION( obj->opo_ooa->ooa_attr.la_valid & LA_SIZE ) failed:
Nov 4 17:16:21 lola-9 kernel: LustreError: 6493:0:(osp_md_object.c:1155:osp_md_write()) LBUG
Nov 4 17:16:21 lola-9 kernel: Pid: 6493, comm: mdt00_005
Nov 4 17:16:21 lola-9 kernel:
Nov 4 17:16:21 lola-9 kernel: Call Trace:
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa07d2875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa07d2e77>] lbug_with_loc+0x47/0xb0 [libcfs]
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa14518cd>] osp_md_write+0x42d/0x4e0 [osp]
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa092509d>] dt_record_write+0x3d/0x130 [obdclass]
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa08e3398>] llog_osd_write_rec+0x768/0x1c50 [obdclass]
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa08d1416>] llog_write_rec+0xb6/0x270 [obdclass]
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa08da7e3>] llog_cat_add_rec+0x1c3/0x7b0 [obdclass]
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa08d1229>] llog_add+0x89/0x1c0 [obdclass]
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa0bca124>] sub_updates_write+0x1b4/0x14a0 [ptlrpc]
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa0bcbe24>] top_trans_stop+0xa14/0xe30 [ptlrpc]
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa136042e>] ? lod_attr_set+0x12e/0xaa0 [lod]
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa0940800>] ? lu_ucred+0x20/0x30 [obdclass]
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa134391c>] lod_trans_stop+0x2bc/0x330 [lod]
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa13edb8a>] mdd_trans_stop+0x1a/0xac [mdd]
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa13dcf3a>] mdd_create+0x12ea/0x1600 [mdd]
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa128c7a4>] ? mdt_version_save+0x84/0x1a0 [mdt]
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa128ef66>] mdt_reint_create+0xbb6/0xcc0 [mdt]
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa0940800>] ? lu_ucred+0x20/0x30 [obdclass]
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa126e675>] ? mdt_ucred+0x15/0x20 [mdt]
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa128785c>] ? mdt_root_squash+0x2c/0x3f0 [mdt]
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa0b73cf2>] ? __req_capsule_get+0x162/0x6e0 [ptlrpc]
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa128b99d>] mdt_reint_rec+0x5d/0x200 [mdt]
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa127777b>] mdt_reint_internal+0x62b/0xb80 [mdt]
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa127816b>] mdt_reint+0x6b/0x120 [mdt]
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa0bb60ec>] tgt_request_handle+0x8bc/0x12e0 [ptlrpc]
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa0b5d9e1>] ptlrpc_main+0xe41/0x1910 [ptlrpc]
Nov 4 17:16:21 lola-9 kernel: [<ffffffff8152a39e>] ? thread_return+0x4e/0x7d0
Nov 4 17:16:21 lola-9 kernel: [<ffffffffa0b5cba0>] ? ptlrpc_main+0x0/0x1910 [ptlrpc]
Nov 4 17:16:21 lola-9 kernel: [<ffffffff8109e78e>] kthread+0x9e/0xc0
Nov 4 17:16:21 lola-9 kernel: [<ffffffff8100c28a>] child_rip+0xa/0x20
Nov 4 17:16:21 lola-9 kernel: [<ffffffff8109e6f0>] ? kthread+0x0/0xc0
Nov 4 17:16:21 lola-9 kernel: [<ffffffff8100c280>] ? child_rip+0x0/0x20
Nov 4 17:16:21 lola-9 kernel:
Nov 4 17:16:21 lola-9 kernel: LustreError: dumping log to /tmp/lustre-log.1446686181.6493
Attached soak.log, debug log file, messages and console log of node lola-9.
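For anyone triaging the attached messages files, the LustreError lines quoted in this ticket share a common prefix of timestamp, node, thread id, and source location. A minimal sketch of pulling those fields out is shown below. The regular expression is derived only from the lines quoted here, the helper name `parse_lustre_error` is hypothetical, and the second numeric field after the pid is captured as `aux` without interpreting it.

```python
import re

# Pattern inferred from the LustreError lines quoted in this ticket only;
# it is not an official description of the Lustre console log format.
LUSTRE_ERROR_RE = re.compile(
    r"(?P<timestamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<node>\S+)\s+kernel:\s+LustreError:\s+"
    r"(?P<pid>\d+):(?P<aux>\d+):"
    r"\((?P<file>[^:]+):(?P<line>\d+):(?P<function>\w+)\(\)\)\s*"
    r"(?P<message>.*)$"
)


def parse_lustre_error(text):
    """Return the fields of a LustreError line as a dict, or None if no match."""
    match = LUSTRE_ERROR_RE.search(text)
    if match is None:
        return None
    record = match.groupdict()
    # Convert the numeric fields so they can be compared and sorted.
    for key in ("pid", "aux", "line"):
        record[key] = int(record[key])
    return record


sample = ("Nov 4 17:16:21 lola-9 kernel: LustreError: "
          "6493:0:(osp_md_object.c:1155:osp_md_write()) LBUG")
record = parse_lustre_error(sample)
print(record["node"], record["file"], record["line"], record["message"])
```

Because the pattern uses `search` rather than anchoring at the start of the line, it also accepts lines that carry a file-name prefix such as `lola-11.log:`. Call trace lines do not match and return `None`.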
Attachments
Issue Links
- duplicates
-
LU-7039 llog_osd.c:778:llog_osd_next_block()) ASSERTION( last_rec->lrh_index == tail->lrt_index ) failed:
- Resolved