[LU-38] Kernel panic in ldiskfs on OST unmount Created: 10/Jan/11 Updated: 28/Jun/11 Resolved: 13/Jun/11 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 1.8.6 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Christopher Morrone | Assignee: | Lai Siyao |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
lustre-1.8.3.0-6chaos_2.6.18_93chaos.ch4.3 |
||
| Severity: | 3 |
| Rank (Obsolete): | 10407 |
| Description |
|
When the sysadmins attempted to unmount the OSTs on a number of OSSs to shut them down for scheduled maintenance, six of the nodes kernel panicked. They all say:
Kernel BUG at fs/ldiskfs/mballoc.c:3714
RIP (for what it's worth) is the same for each:
:ldiskfs:ldiskfs_mb_release_inode_pa
The stacks are also the same:
sync_buffer
I apologize for any typos. That had to be copied by hand. |
| Comments |
| Comment by Dan Ferber (Inactive) [ 10/Jan/11 ] |
|
Assigned to Alex for initial analysis. |
| Comment by Alex Zhuravlev [ 10/Jan/11 ] |
|
Christopher, can you confirm from the logs that the devices were turned read-only? This looks like an instance of bug 24214 in Bugzilla. |
| Comment by Christopher Morrone [ 11/Jan/11 ] |
|
It looks like "umount /dev/<ostdev>" does indeed turn the devices read-only as part of the shutdown process. |
| Comment by Lai Siyao [ 13/Jan/11 ] |
|
Bug 16680 explains the cause of this crash, and the patch from bug 22299 should be able to fix it. The fix has landed on 1.8.4 and 2.0; because your version is 1.8.3, I think you can apply it and verify. |
| Comment by Christopher Morrone [ 16/Feb/11 ] |
|
I applied the patch to our copy of ldiskfs. |
| Comment by Peter Jones [ 13/Jun/11 ] |
|
Please reopen if this recurs with the patches applied.