[LU-6717] dt_object.c:512:dt_record_write()) ASSERTION( dt->do_body_ops->dbo_write Created: 12/Jun/15  Updated: 22/Dec/15  Resolved: 26/Aug/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: Lustre 2.8.0

Type: Bug Priority: Critical
Reporter: Cliff White (Inactive) Assignee: Di Wang
Resolution: Fixed Votes: 0
Labels: None
Environment:

Hyperion SWL test


Attachments: File iws15.lustre.log.gz    
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

During the test run, the system started dropping DNE MDTs.

Jun 11 22:44:52 iws15 kernel: LustreError: 10346:0:(dt_object.c:512:dt_record_write()) ASSERTION( dt->do_body_ops->dbo_write ) failed:
Jun 11 22:44:52 iws15 kernel: LustreError: 10346:0:(dt_object.c:512:dt_record_write()) ASSERTION( dt->do_body_ops->dbo_write ) failed:
Jun 11 22:44:52 iws15 kernel: LustreError: 10346:0:(dt_object.c:512:dt_record_write()) LBUG
Jun 11 22:44:52 iws15 kernel: LustreError: 10346:0:(dt_object.c:512:dt_record_write()) LBUG
Jun 11 22:44:52 iws15 kernel: Pid: 10346, comm: mdt_out00_007
Jun 11 22:44:52 iws15 kernel:
Jun 11 22:44:52 iws15 kernel: Call Trace:
Jun 11 22:44:52 iws15 kernel: [<ffffffffa04a2875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
Jun 11 22:44:52 iws15 kernel: [<ffffffffa04a2e77>] lbug_with_loc+0x47/0xb0 [libcfs]
Jun 11 22:44:52 iws15 kernel: [<ffffffffa09ae15f>] dt_record_write+0xbf/0x130 [obdclass]
Jun 11 22:44:52 iws15 kernel: [<ffffffffa0c3c78e>] out_tx_write_exec+0x7e/0x2e0 [ptlrpc]
Jun 11 22:44:52 iws15 kernel: [<ffffffffa0c3b574>] ? out_tx_destroy_exec+0x24/0x1a0 [ptlrpc]
Jun 11 22:44:52 iws15 kernel: [<ffffffffa0c36282>] out_tx_end+0xe2/0x5d0 [ptlrpc]
Jun 11 22:44:52 iws15 kernel: [<ffffffffa0c3c216>] out_handle+0x9d6/0xed0 [ptlrpc]
Jun 11 22:44:52 iws15 kernel: [<ffffffffa0bcf2cc>] ? lustre_msg_get_opc+0x8c/0x100 [ptlrpc]
Jun 11 22:44:52 iws15 kernel: [<ffffffffa0c323be>] tgt_request_handle+0x95e/0x10b0 [ptlrpc]
Jun 11 22:44:53 iws15 kernel: [<ffffffffa0be1c51>] ptlrpc_main+0xe41/0x1970 [ptlrpc]
Jun 11 22:44:53 iws15 kernel: [<ffffffffa0be0e10>] ? ptlrpc_main+0x0/0x1970 [ptlrpc]
Jun 11 22:44:53 iws15 kernel: [<ffffffff8109e71e>] kthread+0x9e/0xc0
Jun 11 22:44:53 iws15 kernel: [<ffffffff8100c20a>] child_rip+0xa/0x20
Jun 11 22:44:53 iws15 kernel: [<ffffffff8109e680>] ? kthread+0x0/0xc0
Jun 11 22:44:53 iws15 kernel: [<ffffffff8100c200>] ? child_rip+0x0/0x20
Jun 11 22:44:53 iws15 kernel:

The system panicked at this point; no log dump is available.



 Comments   
Comment by Cliff White (Inactive) [ 12/Jun/15 ]

The system now hits the LBUG on every restart attempt. I have fsck'd the disks, but it
continues to get errors:

LustreError: 6442:0:(ldlm_lib.c:1751:check_for_next_transno()) lustre-MDT000a: waking for gap in transno, VBR is OFF (skip: 4317567847, ql: 1, comp: 288, conn: 289, next: 4317567849, last_committed: 4317566870)
LustreError: 6442:0:(ldlm_lib.c:1751:check_for_next_transno()) lustre-MDT000a: waking for gap in transno, VBR is OFF (skip: 4317567852, ql: 1, comp: 288, conn: 289, next: 4317567854, last_committed: 4317566870)
LustreError: 6442:0:(ldlm_lib.c:1751:check_for_next_transno()) lustre-MDT000a: waking for gap in transno, VBR is OFF (skip: 4317567857, ql: 1, comp: 288, conn: 289, next: 4317567859, last_committed: 4317566870)
Lustre: lustre-MDT000a-osp-MDT0004: Connection restored to lustre-MDT000a (at 0@lo)
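
The three "gap in transno" messages above differ only in their numeric fields, so it can help to pull those fields out and compare them. The following is a minimal sketch, not part of the original report: the helper name `parse_transno_gap` and the constant `SAMPLE` are hypothetical, and the ticket does not say what each field means, so the script only extracts the values and subtracts them.

```python
import re

# Matches "name: number" pairs such as "skip: 4317567847".
_FIELD_RE = re.compile(r"(\w+): (\d+)")

# One of the messages quoted in this ticket, copied verbatim.
SAMPLE = (
    "LustreError: 6442:0:(ldlm_lib.c:1751:check_for_next_transno()) "
    "lustre-MDT000a: waking for gap in transno, VBR is OFF "
    "(skip: 4317567847, ql: 1, comp: 288, conn: 289, "
    "next: 4317567849, last_committed: 4317566870)"
)


def parse_transno_gap(line):
    """Return the numeric fields of a 'gap in transno' message as a dict.

    Returns None if the line is not such a message. (Hypothetical helper,
    written for illustration only.)
    """
    marker = "waking for gap in transno"
    idx = line.find(marker)
    if idx < 0:
        return None
    # Skip the "pid:0:(file:line:func())" prefix; only the parenthesised
    # list that follows the marker holds the fields we want.
    start = line.find("(", idx)
    if start < 0:
        return None
    return {key: int(value) for key, value in _FIELD_RE.findall(line[start:])}


if __name__ == "__main__":
    fields = parse_transno_gap(SAMPLE)
    print(fields)
    # Difference between the "next" and "skip" values in this message.
    print(fields["next"] - fields["skip"])
```

Running it over all three messages shows that `next` is `skip + 2` in each one, while `last_committed` stays at 4317566870.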

Comment by Doug Oucharek (Inactive) [ 12/Jun/15 ]

Cliff: can you save the state of the filesystem for analysis?

Comment by Cliff White (Inactive) [ 12/Jun/15 ]

Certainly. What information would you like?

Comment by Di Wang [ 12/Jun/15 ]

Do you have a crash dump for this panic?

Comment by Di Wang [ 13/Jun/15 ]

It looks like the llog object disappeared after recovery.

Comment by Di Wang [ 14/Jun/15 ]

Cliff: Could you please tell me which build you used for the test? Tag 2.7.55? Thanks.

Comment by Gerrit Updater [ 15/Jun/15 ]

wangdi (di.wang@intel.com) uploaded a new patch: http://review.whamcloud.com/15278
Subject: LU-6717 llog: Create update llog synchronously
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: ca7ad20c675ce6b0cdaf96eb350bdc46cb0bae78

Comment by Gerrit Updater [ 08/Jul/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/15278/
Subject: LU-6717 llog: Create update llog synchronously
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: f4313edcb837429ccc7f501158514560d602ee85

Generated at Sat Feb 10 02:02:38 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.