[LU-7298] replay-single test_70b: ASSERTION( dt->do_body_ops->dbo_write ) failed
Created: 14/Oct/15  Updated: 20/Aug/18  Resolved: 20/Aug/18

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: None

Type: Bug
Priority: Major
Reporter: Maloo
Assignee: WC Triage
Resolution: Duplicate
Votes: 0
Labels: None

Issue Links:
Related
is related to LU-10143 LBUG dt_object.h:2166:dt_declare_reco... Resolved
is related to LU-6844 replay-single test 70b failure: 'rund... Resolved
is related to LU-7739 replay-single test 70b hangs with LBU... Resolved
Severity: 3

Description

This issue was created by maloo for John Hammond <john.hammond@intel.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/1e89fd4a-6f78-11e5-a88f-5254006e85c2.

The sub-test test_70b failed with the following error:

test failed to respond and timed out

In this test_70b failure, MDT0 hit an LBUG in dt_record_write() and panicked:

09:32:39:Lustre: DEBUG MARKER: test_70b fail mds4 4 times
09:32:39:LustreError: 11-0: lustre-MDT0003-osp-MDT0000: operation obd_ping to node 10.1.4.179@tcp failed: rc = -107
09:32:39:Lustre: DEBUG MARKER: /usr/sbin/lctl mark mdc.lustre-MDT0003-mdc-*.mds_server_uuid in FULL state after 4 sec
09:32:39:Lustre: DEBUG MARKER: /usr/sbin/lctl mark mdc.lustre-MDT0003-mdc-*.mds_server_uuid in FULL state after 4 sec
09:32:39:Lustre: DEBUG MARKER: mdc.lustre-MDT0003-mdc-*.mds_server_uuid in FULL state after 4 sec
09:32:39:LustreError: 27041:0:(dt_object.c:512:dt_record_write()) ASSERTION( dt->do_body_ops->dbo_write ) failed: 
09:32:39:LustreError: 27041:0:(dt_object.c:512:dt_record_write()) LBUG
09:32:39:Pid: 27041, comm: mdt_out00_002
09:32:39:
09:32:39:Call Trace:
09:32:39: [<ffffffffa049b875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
09:32:39: [<ffffffffa049be77>] lbug_with_loc+0x47/0xb0 [libcfs]
09:32:39: [<ffffffffa05f579f>] dt_record_write+0xbf/0x130 [obdclass]
09:32:39: [<ffffffffa08902ae>] out_tx_write_exec+0x7e/0x300 [ptlrpc]
09:32:39: [<ffffffffa0887dca>] out_tx_end+0xda/0x5d0 [ptlrpc]
09:32:39: [<ffffffffa088d896>] out_handle+0xbd6/0x1890 [ptlrpc]
09:32:39: [<ffffffffa07d44e0>] ? target_bulk_timeout+0x0/0xc0 [ptlrpc]
09:32:39: [<ffffffffa0884aac>] tgt_request_handle+0x8bc/0x12e0 [ptlrpc]
09:32:39: [<ffffffffa082c531>] ptlrpc_main+0xe41/0x1910 [ptlrpc]
09:32:39: [<ffffffffa082b6f0>] ? ptlrpc_main+0x0/0x1910 [ptlrpc]
09:32:39: [<ffffffff810a0fce>] kthread+0x9e/0xc0
09:32:39: [<ffffffff8100c28a>] child_rip+0xa/0x20
09:32:39: [<ffffffff810a0f30>] ? kthread+0x0/0xc0
09:32:39: [<ffffffff8100c280>] ? child_rip+0x0/0x20
09:32:39:

Info required for matching: replay-single 70b
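
For matching further occurrences of this failure, the assertion line in the console log can be reduced to a signature that ignores the timestamp and PID. The following is only a sketch: the helper name `assertion_signature` and the regular expression are assumptions made for illustration and are not part of Maloo or the Lustre test framework.

```python
import re

# Matches console lines of the form seen above:
#   HH:MM:SS:LustreError: PID:0:(file.c:LINE:func()) ASSERTION( expr ) failed:
# The pattern is an assumption derived from the two log excerpts in this ticket.
LBUG_RE = re.compile(
    r"LustreError: (?P<pid>\d+):\d+:"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\) "
    r"ASSERTION\( (?P<expr>.+?) \) failed"
)


def assertion_signature(console_line):
    """Return (file, line, function, expression), or None if no assertion is found."""
    m = LBUG_RE.search(console_line)
    if m is None:
        return None
    return (m["file"], int(m["line"]), m["func"], m["expr"])


if __name__ == "__main__":
    line = ("09:32:39:LustreError: 27041:0:(dt_object.c:512:dt_record_write()) "
            "ASSERTION( dt->do_body_ops->dbo_write ) failed: ")
    print(assertion_signature(line))
```

Two console logs that yield the same tuple hit the same assertion at the same source location, regardless of which service thread reported it.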



Comments
Comment by Bob Glossman (Inactive) [ 17/Oct/15 ]

Another failure on master:
https://testing.hpdd.intel.com/test_sets/b4ecf65c-74c8-11e5-b47d-5254006e85c2

from console log of MDS2:

04:54:57:LustreError: 17327:0:(dt_object.c:512:dt_record_write()) ASSERTION( dt->do_body_ops->dbo_write ) failed: 
04:54:57:LustreError: 17327:0:(dt_object.c:512:dt_record_write()) LBUG
04:54:57:Pid: 17327, comm: mdt_out00_000
04:54:57:
04:54:57:Call Trace:
04:54:57: [<ffffffffa049b875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
04:54:57: [<ffffffffa049be77>] lbug_with_loc+0x47/0xb0 [libcfs]
04:54:57: [<ffffffffa05f579f>] dt_record_write+0xbf/0x130 [obdclass]
04:54:57: [<ffffffffa08908de>] out_tx_write_exec+0x7e/0x300 [ptlrpc]
04:54:57: [<ffffffffa08883fa>] out_tx_end+0xda/0x5d0 [ptlrpc]
04:54:57: [<ffffffffa088dec6>] out_handle+0xbd6/0x1890 [ptlrpc]
04:54:57: [<ffffffffa07d44e0>] ? target_bulk_timeout+0x0/0xc0 [ptlrpc]
04:54:57: [<ffffffffa08850dc>] tgt_request_handle+0x8bc/0x12e0 [ptlrpc]
04:54:57: [<ffffffffa082c9c1>] ptlrpc_main+0xe41/0x1910 [ptlrpc]
04:54:57: [<ffffffffa082bb80>] ? ptlrpc_main+0x0/0x1910 [ptlrpc]
04:54:57: [<ffffffff810a0fce>] kthread+0x9e/0xc0
04:54:57: [<ffffffff8100c28a>] child_rip+0xa/0x20
04:54:57: [<ffffffff810a0f30>] ? kthread+0x0/0xc0
04:54:57: [<ffffffff8100c280>] ? child_rip+0x0/0x20
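
The call trace above differs from the one in the description only in load addresses. A rough way to confirm that two traces follow the same path is to compare symbol and module names and drop the addresses. This is a sketch under stated assumptions: `frames` and `FRAME_RE` are hypothetical names, and frames prefixed with `?` (entries the kernel unwinder marks as unreliable) are skipped.

```python
import re

# One stack frame as printed in the console logs in this ticket, e.g.
#   [<ffffffffa05f579f>] dt_record_write+0xbf/0x130 [obdclass]
# The module suffix is absent for core-kernel symbols such as kthread.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] (?P<guess>\? )?(?P<sym>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<mod>\w+)\])?"
)


def frames(trace_text):
    """Return [(symbol, module_or_None), ...] for the reliable frames in a trace."""
    result = []
    for line in trace_text.splitlines():
        m = FRAME_RE.search(line)
        if m and not m["guess"]:
            result.append((m["sym"], m["mod"]))
    return result


if __name__ == "__main__":
    excerpt = (
        "04:54:57: [<ffffffffa05f579f>] dt_record_write+0xbf/0x130 [obdclass]\n"
        "04:54:57: [<ffffffffa08908de>] out_tx_write_exec+0x7e/0x300 [ptlrpc]\n"
    )
    for sym, mod in frames(excerpt):
        print(sym, mod)
```

If `frames()` returns equal lists for two console logs, the traces pass through the same functions in the same order.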
Comment by James Nunez (Inactive) [ 19/Oct/15 ]

Another failure on master. Logs at https://testing.hpdd.intel.com/test_sets/b4ecf65c-74c8-11e5-b47d-5254006e85c2

Comment by nasf (Inactive) [ 28/Nov/15 ]

Another failure instance on master:
https://testing.hpdd.intel.com/test_sets/77835c1e-9522-11e5-bdeb-5254006e85c2

Comment by James Nunez (Inactive) [ 03/Dec/15 ]

More failures on master in review-dne-part-2:
2015-12-03 01:21:43 - https://testing.hpdd.intel.com/test_sets/6f123fe8-99a8-11e5-b944-5254006e85c2
2015-12-08 23:11:27 - https://testing.hpdd.intel.com/test_sets/8492871e-9e5e-11e5-b163-5254006e85c2
2015-12-17 04:05:37 - https://testing.hpdd.intel.com/test_sets/f4785ce6-a4d7-11e5-83a3-5254006e85c2

Comment by Bob Glossman (Inactive) [ 11/Jan/16 ]

Another failure on master:
https://testing.hpdd.intel.com/test_sets/1b96d0fc-b6dd-11e5-adbf-5254006e85c2

Comment by Richard Henwood (Inactive) [ 16/Feb/16 ]

Another failure on master:

https://testing.hpdd.intel.com/test_sets/2485a3b0-d4d2-11e5-9e3f-5254006e85c2

This is on review-dne-part-2.

Comment by Richard Henwood (Inactive) [ 09/Mar/16 ]

Another failure on master:

https://testing.hpdd.intel.com/test_sets/cc7dfeea-e3b0-11e5-a700-5254006e85c2

This is on review-dne-part-2.

Comment by nasf (Inactive) [ 17/Nov/16 ]

+1 on master:
https://testing.hpdd.intel.com/test_sets/6e3ac50c-ac28-11e6-b1f4-5254006e85c2
