Lustre / LU-7138

LBUG: (osd_handler.c:1017:osd_trans_start()) ASSERTION( get_current()->journal_info == ((void *)0) ) failed:

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Blocker
    • Affects Version: Lustre 2.7.0
    • Severity: 1

    Description

      This evening we hit this LBUG on the MDT of our production file system. The file system is currently down, as we hit the same bug every time we attempt to bring the MDT back, as soon as recovery finishes.

      <0>LustreError: 722:0:(osd_handler.c:1017:osd_trans_start()) ASSERTION( get_current()->journal_info == ((void *)0) ) failed:
      <0>LustreError: 722:0:(osd_handler.c:1017:osd_trans_start()) LBUG
      <4>Pid: 722, comm: mdt01_017
      <4>
      <4>Call Trace:
      <4> [<ffffffffa065f895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      <4> [<ffffffffa065fe97>] lbug_with_loc+0x47/0xb0 [libcfs]
      <4> [<ffffffffa17df24d>] osd_trans_start+0x25d/0x660 [osd_ldiskfs]
      <4> [<ffffffffa09b9b4a>] llog_osd_destroy+0x42a/0xd40 [obdclass]
      <4> [<ffffffffa09b2edc>] llog_cat_new_log+0x1ec/0x710 [obdclass]
      <4> [<ffffffffa09b350a>] llog_cat_add_rec+0x10a/0x450 [obdclass]
      <4> [<ffffffffa09ab1e9>] llog_add+0x89/0x1c0 [obdclass]
      <4> [<ffffffffa17f1976>] ? osd_attr_set+0x166/0x460 [osd_ldiskfs]
      <4> [<ffffffffa0d914e2>] mdd_changelog_store+0x122/0x290 [mdd]
      <4> [<ffffffffa0da4d0c>] mdd_changelog_data_store+0x16c/0x320 [mdd]
      <4> [<ffffffffa0dad9b3>] mdd_attr_set+0x12f3/0x1730 [mdd]
      <4> [<ffffffffa088a551>] mdt_reint_setattr+0xf81/0x13a0 [mdt]
      <4> [<ffffffffa087be1c>] ? mdt_root_squash+0x2c/0x3f0 [mdt]
      <4> [<ffffffffa08801dd>] mdt_reint_rec+0x5d/0x200 [mdt]
      <4> [<ffffffffa086423b>] mdt_reint_internal+0x4cb/0x7a0 [mdt]
      <4> [<ffffffffa08649ab>] mdt_reint+0x6b/0x120 [mdt]
      <4> [<ffffffffa0c6f56e>] tgt_request_handle+0x8be/0x1000 [ptlrpc]
      <4> [<ffffffffa0c1f5a1>] ptlrpc_main+0xe41/0x1960 [ptlrpc]
      <4> [<ffffffff8106c4f0>] ? pick_next_task_fair+0xd0/0x130
      <4> [<ffffffffa0c1e760>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
      <4> [<ffffffff8109e66e>] kthread+0x9e/0xc0
      <4> [<ffffffff8100c20a>] child_rip+0xa/0x20
      <4> [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
      <4> [<ffffffff8100c200>] ? child_rip+0x0/0x20
      <4>
      <0>Kernel panic - not syncing: LBUG
      

      The stack trace doesn't quite seem to be the same as for LU-6634 (which in any case doesn't have any suggested fix).


          Activity


            ferner Frederik Ferner (Inactive) added a comment - edited

            Alex,

            could you double-check the bug number that this is a duplicate of? LU-6636 doesn't look right, did you mean LU-6634?

            Also, what would be the best way for us to get a fix/patch for Lustre 2.7?

            Cheers,
            Frederik

            bzzz Alex Zhuravlev added a comment - edited

            a duplicate of LU-6634


            bzzz Alex Zhuravlev added a comment -

            ah, even better - there is llog_trans_destroy() in the master branch already. I'll cook the patch.

            bfaccini Bruno Faccini (Inactive) added a comment -

            Oleg, based on the size of changelog_catalog, which indicates it has reached its maximum size, I believe this ticket is a combination of LU-6556 (catalog no longer able to wrap around) and LU-6634 (wrong call to llog_destroy() to destroy a plain llog on the catalog-full condition while a journal transaction has already been started).
            My master patch for LU-6556 has still not landed, and I think Mike has already started working on a solution for LU-6634.
            So at the moment, I think there was no other solution than to remove the changelog_catalog file to allow the filesystem to be restarted.
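
            (A hedged aside, not from the ticket itself: assuming the llog header occupies one 8192-byte chunk and each catalog entry, a struct llog_logid_rec, is 64 bytes - both assumptions about this Lustre version - the 4,153,280-byte changelog_catalog reported in the ls -l output below works out to a completely full catalog, which supports the "reached its max size" reading even though 4MB does not look big in absolute terms.)

            # rough back-of-the-envelope check; the header and record sizes are assumed values
            SIZE=4153280                          # bytes, from the ls -l output below
            HDR=8192                              # assumed llog header chunk size
            REC=64                                # assumed size of one catalog record (llog_logid_rec)
            echo "catalog entries: $(( (SIZE - HDR) / REC ))"    # prints 64767, i.e. the bitmap is full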

            ferner Frederik Ferner (Inactive) added a comment - edited

            Changelogs are indeed used for a robinhood instance. It appears to still be alive and has managed to keep up with the rate of changes as far as I can see.

            changelog_catalog isn't that big as far as I can see (4MB), and there are indeed not many files below O:

            [bnh65367@cs04r-sc-mds03-01 mdt]$ sudo find O -type f | wc -l
            35
            [bnh65367@cs04r-sc-mds03-01 mdt]$ ls -l changelog_*
            -rw-r--r-- 1 root root 4153280 Aug 19  2014 changelog_catalog
            -rw-r--r-- 1 root root    8384 Aug 19  2014 changelog_users

            Removing changelog_* did indeed allow us to bring the file system back successfully (so far).

            Immediate crisis over, but we'd be very interested in the root cause and how to avoid this in the future (and on our other file systems).

            green Oleg Drokin added a comment -

            Ok, apparently the LOGS/ dir is old-style llogs.

            Anyway, the current theory is that your changelog catalog has overflowed. If you check the size of the "changelog_catalog" file with the MDT filesystem mounted as ldiskfs, it is quite big, I imagine?

            What do you use changelogs for? A robinhood install or some other eager user that immediately consumes all changelogs as they are generated? Is it still alive and well? (As in: was it consuming records while the MDS was up, and did it not happen to wedge itself ages ago?)
            This should also mean that the number of files (not dirs) in the "O" dir is small.

            If you don't care all that much about these changelogs because they have already been consumed, you can just remove the "changelog_catalog" and "changelog_users" files, and then the MDS should be able to start. You will then need to re-enable your changelogs and re-register all consumers.

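            (A hedged sketch of the procedure Oleg describes, using hypothetical names - /dev/mdtdev, /mnt/mdt and lustre03-MDT0000 - adapt them to the real device, mount point and MDT target, only do this if the changelog records have indeed already been consumed, and keep a copy of the files rather than deleting them outright.)

            # with the MDT target stopped, mount the backing device as ldiskfs
            mount -t ldiskfs /dev/mdtdev /mnt/mdt
            mkdir -p /root/changelog-backup
            mv /mnt/mdt/changelog_catalog /mnt/mdt/changelog_users /root/changelog-backup/
            umount /mnt/mdt

            # restart the MDT as usual, then re-register each changelog consumer
            lctl --device lustre03-MDT0000 changelog_register
            # robinhood (or any other consumer) then needs to be pointed at the new cl<N> user id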

            ferner Frederik Ferner (Inactive) added a comment -

            The LOGS/ dir on the MDT is empty, no files at all; the directory is owned by root:root and world-writable.

            green Oleg Drokin added a comment -

            Can you also check that all the files in the LOGS/ dir are root-owned? What are the sizes of the 10 most recently written (based on mtime) ones?

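            (A hedged sketch of those two checks, assuming the MDT is mounted as ldiskfs at a hypothetical /mnt/mdt.)

            find /mnt/mdt/LOGS -maxdepth 1 -type f ! -user root   # should print nothing if all files are root-owned
            ls -lt /mnt/mdt/LOGS | head -n 11                     # sizes of the 10 most recently modified files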

            ferner Frederik Ferner (Inactive) added a comment - edited

            quota shows no quota for any user on the MDT when mounted as ldiskfs, and yes, I can still write a file:

            [bnh65367@cs04r-sc-mds03-01 ~]$ sudo dd if=/dev/zero of=/lustre/lustre03/mdt/LOGS/aaa bs=1024k count=1
            1+0 records in
            1+0 records out
            1048576 bytes (1.0 MB) copied, 0.000997444 s, 1.1 GB/s
            [bnh65367@cs04r-sc-mds03-01 ~]$

            (Just to be sure, I've also checked the output of df -hl and df -hil on all OSS nodes, and all OSTs have space as well.)

            green Oleg Drokin added a comment -

            If you mount as ldiskfs, you should see quota info with the quota tools, I imagine, e.g. quota.

            Also, can you write files in there? E.g. can you do dd if=/dev/zero of=/mnt/mdt-mountpoint/LOGS/aaa bs=1024k count=1 ?
            (Remove the /mnt/mdt-mountpoint/LOGS/aaa afterwards.)
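
            (The same check written out as a small session, with the same hypothetical mount point; the test file is removed again afterwards.)

            mount -t ldiskfs /dev/mdtdev /mnt/mdt-mountpoint      # /dev/mdtdev is hypothetical
            quota -u root                                         # any quota limits visible via the quota tools?
            dd if=/dev/zero of=/mnt/mdt-mountpoint/LOGS/aaa bs=1024k count=1
            rm /mnt/mdt-mountpoint/LOGS/aaa                       # clean up the test file
            umount /mnt/mdt-mountpoint
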
            green Oleg Drokin added a comment -

            Aha, reading into llog_osd_write_rec I see this nice piece:

                    /* if it's the last idx in log file, then return -ENOSPC */
                    if (loghandle->lgh_last_idx >= LLOG_BITMAP_SIZE(llh) - 1)
                            RETURN(-ENOSPC);
            

            I wonder if this is what you are hitting.... Hmmm....
            In fact, looking at the backtrace, we are coming from llog_cat_add_rec, where we have:

                    rc = llog_write_rec(env, loghandle, rec, reccookie, LLOG_NEXT_IDX, th);
                    if (rc < 0)
                            CDEBUG_LIMIT(rc == -ENOSPC ? D_HA : D_ERROR,
                                         "llog_write_rec %d: lh=%p\n", rc, loghandle);
                    up_write(&loghandle->lgh_lock);
                    if (rc == -ENOSPC) {
                            /* try to use next log */
                            loghandle = llog_cat_current_log(cathandle, th);
                            LASSERT(!IS_ERR(loghandle));
                            /* new llog can be created concurrently */
                            if (!llog_exist(loghandle)) {
                                    rc = llog_cat_new_log(env, cathandle, loghandle, th);
                                    if (rc < 0) {
                                            up_write(&loghandle->lgh_lock);
                                            RETURN(rc);
                                    }
                            }
            

            So we caught ENOSPC and went into the creation of a new llog, but got another ENOSPC there as well, where the llog should already be the new one, so it could not be the end of the previous llog - the writes appear to be genuinely failing.

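            (For anyone else hitting this assertion: a hedged way to check the catalog state offline is llog_reader from the lustre utilities, run against the catalog file with the MDT mounted read-only as ldiskfs; the device and mount point below are hypothetical.)

            mount -t ldiskfs -o ro /dev/mdtdev /mnt/mdt
            llog_reader /mnt/mdt/changelog_catalog | head -n 20   # header and first records show how many slots are in use
            umount /mnt/mdt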

            People

              Assignee: green Oleg Drokin
              Reporter: ferner Frederik Ferner (Inactive)
              Votes: 0
              Watchers: 9
