[LU-2026] MDT kernel panic: Kernel BUG at fs/jbd2/transaction.c:292 RIP [<ffffffff887aae03>] :jbd2:jbd2_journal_start+0x3a/0xdf Created: 25/Sep/12 Updated: 18/Jul/14 Resolved: 18/Jul/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 1.8.7 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Manish Patel (Inactive) | Assignee: | Lai Siyao |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None |
| Attachments: |
|
| Severity: | 3 |
| Rank (Obsolete): | 4152 |
| Description |
|
We had an MDT kernel panic recently. We were able to get a partial core file and from that get the crash log. It managed to kernel panic in a really weird spot. I have asked them to verify that the hardware is solid as there are some messages in the mcelog file. In the meantime, can someone take a look at the log and see if it makes any sense? |
| Comments |
| Comment by Peter Jones [ 25/Sep/12 ] |
|
Bobijam, could you please look into this one? Thanks, Peter |
| Comment by Zhenyu Xu [ 26/Sep/12 ] |
|
Lock enqueue handling on resource 1993378622/164683178 never finished (it kept sleeping in ldlm_expired_completion_wait), and the client's reconnection was no longer accepted. I cannot tell why the MDS could not finish the lock handling, though (it was waiting for the lock's grant or cancellation to finish before moving forward). |
| Comment by Kit Westneat (Inactive) [ 26/Sep/12 ] |
|
The same system is also having problems with certain directories: I'm not sure if it's related, but I thought I'd throw that out there. Also, can you elaborate on how that might cause an oops in the jbd2 code? What are the chances of another oops? |
| Comment by Peter Jones [ 15/Oct/12 ] |
|
Lai, could you please look into this one? Thanks, Peter |
| Comment by Peter Jones [ 18/Jul/14 ] |
|
Releases:
|