[LU-509] Hyperion-mds1 - crash in mdtest Created: 19/Jul/11  Updated: 28/May/17  Resolved: 28/May/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.1.0
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Cliff White (Inactive) Assignee: WC Triage
Resolution: Cannot Reproduce Votes: 0
Labels: None
Environment:

Hyperion - RHEL5 side lustre-2.0.65-2.6.18_238.12.1.el5_lustre_ga34dd87 kernel-2.6.18-238.12.1.el5_lustre


Attachments: crash_8_23.txt, crash_8_24.txt, mds1-crash.txt
Severity: 3
Rank (Obsolete): 10316

 Description   

mdtest was running in a loop with other tests. It completed 5 runs without issue, and the system as a whole ran for 8+ hours without issue.
The MDS then crashed at 5am; the panic stack is attached.
Could this be bad memory?



 Comments   
Comment by Oleg Drokin [ 24/Jul/11 ]

I would not be so sure about the "bad memory" theory.
It does not look like a single-bit error to me or anything of that nature.

Comment by Cliff White (Inactive) [ 25/Jul/11 ]

At this point the tests have run for several more days. We have reformatted the filesystem and done a few hard shutdown/recoveries, with no recurrence of this error. Enter headscratching mode.

Comment by Cliff White (Inactive) [ 24/Aug/11 ]

Had a second crash while running mdtest. I am going to run mdtest a few more times to try to reproduce it. Stack attached.

Comment by Cliff White (Inactive) [ 24/Aug/11 ]

stack from 8/23 crash

Comment by Cliff White (Inactive) [ 24/Aug/11 ]

8/23 crash

Comment by Cliff White (Inactive) [ 26/Aug/11 ]

8/24 crash

Comment by Cliff White (Inactive) [ 26/Aug/11 ]

8/24 crash on mount

Comment by Andreas Dilger [ 28/May/17 ]

Close old issue.

Generated at Sat Feb 10 01:07:45 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.