[LU-2026] MDT kernel panic: Kernel BUG at fs/jbd2/transaction.c:292 RIP [<ffffffff887aae03>] :jbd2:jbd2_journal_start+0x3a/0xdf Created: 25/Sep/12  Updated: 18/Jul/14  Resolved: 18/Jul/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 1.8.7
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Manish Patel (Inactive) Assignee: Lai Siyao
Resolution: Cannot Reproduce Votes: 0
Labels: None

Attachments: File crash-log.txt.bz2    
Severity: 3
Rank (Obsolete): 4152

 Description   

We recently had an MDT kernel panic. We were able to get a partial core file and extract the crash log from it. The panic occurred in a really unusual spot (Kernel BUG at fs/jbd2/transaction.c:292, in jbd2_journal_start). I have asked them to verify that the hardware is sound, as there are some messages in the mcelog file.

In the meantime, can someone take a look at the log and see if it makes any sense?



 Comments   
Comment by Peter Jones [ 25/Sep/12 ]

Bobijam

Could you please look into this one?

Thanks

Peter

Comment by Zhenyu Xu [ 26/Sep/12 ]

Lock enqueue handling on resource 1993378622/164683178 never finished (it kept sleeping in ldlm_expired_completion_wait), and the MDS stopped accepting client reconnections.

I cannot tell why the MDS could not finish the lock handling, though (it was waiting for the lock to be granted or for its cancellation to finish before moving forward).

Comment by Kit Westneat (Inactive) [ 26/Sep/12 ]

The same system is also having problems with certain directories:
http://jira.whamcloud.com/browse/LU-2025

I'm not sure if it's related, but I thought I'd throw that out there.

Also, can you elaborate on how that might cause an oops in the jbd2 code? What are the chances of another oops?

Comment by Peter Jones [ 15/Oct/12 ]

Lai

Could you please look into this one?

Thanks

Peter

Comment by Peter Jones [ 18/Jul/14 ]

Releases:

  • IEEL 2.0.0 GA; IEEL 2.0.1 RC2 in release testing
  • Lustre 2.5.2 GA
  • Lustre 2.6 RC2 in release testing
  • Initial scoping and timeline for IEEL 2.2 underway
  • Initial scoping and timeline for Lustre 2.7 completed

Sustaining:

  • RHEL7 client builds now routinely occurring on master. Next step is to add into routine automated testing.
  • Initial scoping for adding Ubuntu Lustre clients completed.
  • Sanger now happy with stability of 2.5.x clients
  • ANU unhappy with stability of their DDN filesystem and have concerns about the quality of support that they have been receiving through DDN channels.