[LU-12386] ldiskfs-fs error: ldiskfs_iget:4374: inode #x: comm ll_ostx_y: bad extra_isize (36832 != 512) Created: 04/Jun/19 Updated: 11/Jun/19 Resolved: 11/Jun/19 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.10.5 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Ruth Klundt (Inactive) | Assignee: | Andreas Dilger |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
lustre 2.10.5.2.chaos-1.ch6_1 |
||
| Attachments: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
A similar error occurred on 2 OSTs, on 2 different nodes using 2 different DDN RAID controllers. Each OST aborted its journal and was remounted read-only. A subsequent e2fsck successfully cleared the problem inodes and the targets re-mounted. We don't have extra_isize showing as a filesystem feature on these OST devices, or at least it doesn't show up in dumpe2fs output. The OSTs have been up and running OK since last September or so. |
| Comments |
| Comment by Andreas Dilger [ 04/Jun/19 ] |
|
Ruth, the "inode #x" part of the message may be relevant if this filesystem was formatted a long time ago. There was a bug in very old mke2fs that didn't zero out the extra inode space in very low-numbered inodes (e.g. inodes 2-15 or so). Otherwise, it appears that this is inode corruption with some random garbage. Do you have the e2fsck output to see if those inodes had other corruption, or was only the i_extra_isize bad? When was the previous time that e2fsck was run? Was there anything run recently that would cause very old files to be accessed for the first time in a long time, or is this corruption on a recently-created file? Note that the extra_isize feature is not needed to use the large inode space, that is enabled by default when the filesystem is formatted with inodes larger than 256 bytes (as with all Lustre filesystems) and enough space is reserved for the current kernel's fixed inode fields (32 bytes currently). The extra_isize feature is only needed for the case where additional space is reserved beyond what is needed beyond the fixed inode fields. |
| Comment by Ruth Klundt (Inactive) [ 05/Jun/19 ] |
|
Thanks for the info about extra_isize, I'm less confused now. The filesystem was created on new gear in September 2018, with the software stack as listed above. One of the OSTs had a sequential group of 5 inodes with the problem, and they all had other corruption such as huge i_size, too many blocks, and dtime set, and bitmap fixes were necessary. Also, the fsck had to use a backup superblock because of 'bad block for block bitmap'. Not sure how to determine whether these inodes are new or old. The inode numbers were in the 36M range, with each OST having ~72M total inodes; currently the number of inodes in use is ~3M on all the OSTs. On the other OST, 8 consecutive inode numbers (in the 33M range) were showing other problems in addition to extra_isize; no bad superblock there, though. The OSTs are relatively large compared to what we've had before on ldiskfs, 74TB, and they are 46% full. Not sure about recent changes in user activity; I'll be looking around for that. |
| Comment by Andreas Dilger [ 05/Jun/19 ] |
|
Ok, that rules out the old mke2fs bug. Do you have the actual e2fsck output? Sometimes it is possible to see, based on what corrupt values are printed, what might have been overwriting a block. It definitely seems like a block-level corruption, since we have 8 512-byte OST inodes per 4KB block. Any errors on the controllers? Any other errors in the filesystems? |
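A minimal sketch of the arithmetic mentioned above (4KB blocks and 512-byte inodes are assumptions matching this filesystem); the inode numbers are hypothetical, chosen only to show how a consecutive run of corrupt inodes maps to a single inode-table block:

```c
/*
 * Sketch of "8 512-byte inodes per 4KB block": consecutive corrupt inode
 * numbers landing in the same inode-table block suggest one bad block
 * rather than per-inode damage.  The inode numbers below are hypothetical.
 */
#include <stdio.h>

#define BLOCK_SIZE       4096
#define INODE_SIZE       512
#define INODES_PER_BLOCK (BLOCK_SIZE / INODE_SIZE)   /* = 8 */

int main(void)
{
    /* A made-up run of eight consecutive inode numbers (inodes are 1-based). */
    unsigned long corrupt[] = { 33554441, 33554442, 33554443, 33554444,
                                33554445, 33554446, 33554447, 33554448 };
    unsigned long n = sizeof(corrupt) / sizeof(corrupt[0]);

    for (unsigned long i = 0; i < n; i++) {
        unsigned long itable_block = (corrupt[i] - 1) / INODES_PER_BLOCK;
        printf("inode %lu -> inode-table block index %lu\n",
               corrupt[i], itable_block);
    }
    return 0;
}
```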
| Comment by Joe Mervini [ 05/Jun/19 ] |
|
I checked out the storage subsystem, and from the storage side of things (this is an SFA12K 10 stack) only 1 drive in the system is reporting a physical error; otherwise there are no other reported errors. However, I checked the IO channels (IB), and one of the channels not associated with the servers hosting these OSTs is reporting symbol errors that appear to be a bad cable. This started getting reported in the controller log at ~16:30 on 6/1. No other messages, with the exception of the 'keep alive' messages, were reported. |
| Comment by Ruth Klundt (Inactive) [ 05/Jun/19 ] |
|
I'm working on getting the e2fsck output cleared to post. Other errors on the filesystem are things that are more or less usual, like high-order page allocation failures and grant complaints. |
| Comment by Ruth Klundt (Inactive) [ 06/Jun/19 ] |
|
The ost002e output was not captured from the beginning. For ost0008, I removed all of the 'fix? yes' lines and the lines describing 'count wrong for group', since they applied to the whole fs - someone has to actually look at these in order to approve release. Let me know if those omissions are of interest. Basically every group had a default value for blocks and inodes and was updated. |
| Comment by Andreas Dilger [ 07/Jun/19 ] |
|
Looking through the logs I don't see any kind of pattern with the broken inodes. They appear to be just random corruption in the inode block, with random flags set and bogus file sizes. It looks like the problem is limited to one block in the inode table (8 inodes), and the superblock, which could be recovered from a backup. The inodes were cleared by e2fsck, since they no longer contained useful information, so there isn't anything that can be done to recover the data there. It doesn't look like there are any other problems with the filesystem. At this point it isn't clear if anything can be done to diagnose the source of this problem. I don't know the hardware well enough to say whether the drive or cable that Joe reported could be causing this or not. |
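As background on the superblock recovery mentioned above (and the 'bad block for block bitmap' noted earlier), here is a small sketch of where backup superblocks live on a sparse_super filesystem with 4KB blocks, which is an assumption about this filesystem's geometry; any of these block numbers can be passed to e2fsck's -b option if the primary superblock is unreadable. Illustrative only, not taken from the e2fsprogs sources:

```c
/*
 * Sketch: with the sparse_super feature, backup superblocks are kept in
 * block groups 0, 1 and powers of 3, 5 and 7.  With 4KB blocks there are
 * 32768 blocks per group, so the backups land at the familiar block
 * numbers 32768, 98304, 163840, ... that e2fsck reports.
 */
#include <stdio.h>

#define BLOCKS_PER_GROUP 32768UL   /* assumes 4KB filesystem blocks */

static int is_power_of(unsigned long n, unsigned long base)
{
    while (n > 1 && n % base == 0)
        n /= base;
    return n == 1;
}

static int group_has_backup_sb(unsigned long group)
{
    if (group <= 1)
        return 1;   /* group 0 holds the primary, group 1 the first backup */
    return is_power_of(group, 3) || is_power_of(group, 5) ||
           is_power_of(group, 7);
}

int main(void)
{
    /* Print the backup superblock locations in the first 64 block groups. */
    for (unsigned long g = 1; g < 64; g++)
        if (group_has_backup_sb(g))
            printf("group %2lu -> backup superblock at block %lu\n",
                   g, g * BLOCKS_PER_GROUP);
    return 0;
}
```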
| Comment by Ruth Klundt (Inactive) [ 11/Jun/19 ] |
|
Thanks for looking, I think we would mostly be concerned with whether there is a need to upgrade anything in order to avoid a repeat occurrence. If not then we can close for now. |
| Comment by Peter Jones [ 11/Jun/19 ] |
|
ok - thanks Ruth |