[LU-4376] osd_trans_start() ASSERTION( get_current()->journal_info == ((void *)0) Created: 10/Dec/13 Updated: 14/Dec/21 Resolved: 14/Dec/21 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical |
| Reporter: | Ned Bass | Assignee: | Zhenyu Xu |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | llnl | ||
| Environment: | |||
| Severity: | 3 |
| Rank (Obsolete): | 11982 |
| Description |
|
lfsck crashes with the summary assertion and the following backtrace:

libcfs_debug_dumpstack
lbug_with_loc
osd_trans_start
lod_trans_start
mdd_trans_start
mdd_lfsck_namespace_exec_dir
mdd_lfsck_dir_engine
mdd_lfsck_oit_engine

We had just carried out a procedure from |
| Comments |
| Comment by Ned Bass [ 10/Dec/13 ] |
|
This happens on each reboot, so this production filesystem is offline pending a fix or workaround. Also, we cannot provide crash dumps for this system. |
| Comment by Ned Bass [ 10/Dec/13 ] |
|
Update: it managed to stay up after the last reboot, and the OI scrub completed, so we're now back online. |
| Comment by Peter Jones [ 11/Dec/13 ] |
|
Bobijam, could you please advise on this one? Thanks, Peter |
| Comment by Zhenyu Xu [ 11/Dec/13 ] |
|
It looks like an earlier transaction had not been stopped when another transaction tried to start. What is your code base (the latest git commit plus any patches)? I ask because the lfsck code keeps changing. |
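
For readers unfamiliar with the invariant behind the assertion, the following is a minimal userspace sketch, not Lustre code. It assumes a per-thread slot that plays the role of `current->journal_info`, and it models the check as "no transaction handle may already be attached to the thread when a new one starts". The names `trans_start`, `trans_stop`, `journal_info`, and `NestedTransactionError` are hypothetical and chosen only for illustration.

```python
import threading

# Per-thread slot standing in for the kernel's current->journal_info.
# (Assumption: one slot per thread, empty when no transaction is open.)
_ctx = threading.local()


class NestedTransactionError(RuntimeError):
    """Raised where the real code would trip the assertion."""


def journal_info():
    """Return the handle attached to the calling thread, or None."""
    return getattr(_ctx, "journal_info", None)


def trans_start(name):
    """Start a transaction; refuse if one is already open on this thread."""
    active = journal_info()
    if active is not None:
        # This is the situation described above: an earlier transaction
        # was never stopped, so the slot is still occupied.
        raise NestedTransactionError(
            f"cannot start {name!r}: {active!r} is still open on this thread"
        )
    _ctx.journal_info = name
    return name


def trans_stop(handle):
    """Stop the transaction identified by handle and clear the slot."""
    if journal_info() != handle:
        raise ValueError(f"{handle!r} is not the open transaction")
    _ctx.journal_info = None
```

In this model, calling `trans_start("a")` and then `trans_start("b")` without an intervening `trans_stop("a")` raises `NestedTransactionError`, which corresponds to the double start suspected in this ticket. Including the still-open handle in the error message is a deliberate choice: it records which earlier transaction was left open.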
| Comment by Ned Bass [ 11/Dec/13 ] |
|
We were running this tag: |
| Comment by Zhenyu Xu [ 12/Dec/13 ] |
|
I haven't found an obvious double transaction start in the code so far. Would you mind checking all of the backtraces for a possible transaction re-entry? |
| Comment by Zhenyu Xu [ 07/Jan/14 ] |
|
Added a debug patch at http://review.whamcloud.com/8758 to log current->journal_info in osd_trans_start() |