Details
- Bug
- Resolution: Fixed
- Major
- None
- Lustre 2.10.4
- None
- CentOS 7.4
- 3
- 9223372036854775807
Description
We hit the following issue today on Oak's MDT0. We added another MDT a few days ago, so it was still empty, and today I started to lfs migrate a test directory (from MDT0 to MDT1) when this happened on MDT0. lfs migrate did work for a while (about 40k inodes had been migrated) until MDT0 logged this:
Oct 26 17:26:13 oak-md1-s2 kernel: LDISKFS-fs error (device dm-0): ldiskfs_map_blocks:594: inode #659619751: block 774843950: comm mdt00_100: lblock 0 mapped to illegal pblock (length 1)
Oct 26 17:26:13 oak-md1-s2 kernel: Aborting journal on device dm-0-8.
Oct 26 17:26:13 oak-md1-s2 kernel: LustreError: 3844:0:(osd_handler.c:1586:osd_trans_commit_cb()) transaction @0xffff881b024c6a80 commit error: 2
Oct 26 17:26:13 oak-md1-s2 kernel: LDISKFS-fs (dm-0): Remounting filesystem read-only
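For anyone triaging similar messages, here is a minimal sketch of pulling the device, inode and block numbers out of an LDISKFS-fs error line like the one above. The regular expression and the function name `parse_ldiskfs_error` are my own assumptions, written against this one message; other ldiskfs error formats may not match.

```python
import re

# Assumed pattern, derived only from the single error line quoted in this ticket.
LDISKFS_ERR = re.compile(
    r"LDISKFS-fs error \(device (?P<device>[^)]+)\): "
    r"(?P<func>\w+):(?P<line>\d+): "
    r"inode #(?P<inode>\d+): block (?P<block>\d+): "
    r"comm (?P<comm>\S+): (?P<msg>.*)"
)


def parse_ldiskfs_error(log_line):
    """Return a dict of fields from an LDISKFS-fs error line, or None if it does not match."""
    match = LDISKFS_ERR.search(log_line)
    if match is None:
        return None
    info = match.groupdict()
    # Convert the numeric fields so they can be compared or passed to other tools.
    for key in ("line", "inode", "block"):
        info[key] = int(info[key])
    return info


sample = (
    "Oct 26 17:26:13 oak-md1-s2 kernel: LDISKFS-fs error (device dm-0): "
    "ldiskfs_map_blocks:594: inode #659619751: block 774843950: "
    "comm mdt00_100: lblock 0 mapped to illegal pblock (length 1)"
)
print(parse_ldiskfs_error(sample))
```

The extracted inode number can then be used to search other logs for the same inode.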
I performed a fsck with the new e2fsprogs-1.44.3.wc1-0.el7.x86_64.
I'm not sure this fixed our issue, as I don't see any reference to inode 659619751 in the fsck output. I'm attaching the full fsck logs.
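The check for references to the inode can be repeated with a small script. This is only a sketch: the helper names and the file name `fsck.log` are hypothetical, and it matches the inode number as a whole number so that longer numbers containing the same digits are not reported.

```python
import re


def find_inode_mentions(lines, inode):
    """Return (line_number, text) pairs for lines that contain the inode number as a whole number."""
    # Reject matches that are part of a longer run of digits.
    pattern = re.compile(r"(?<!\d)" + str(inode) + r"(?!\d)")
    hits = []
    for lineno, text in enumerate(lines, start=1):
        if pattern.search(text):
            hits.append((lineno, text.rstrip("\n")))
    return hits


def search_log(path, inode):
    """Scan a log file on disk; undecodable bytes are replaced rather than raising."""
    with open(path, errors="replace") as handle:
        return find_inode_mentions(handle, inode)


# Example usage; "fsck.log" is a hypothetical file name:
# for lineno, text in search_log("fsck.log", 659619751):
#     print(lineno, text)
```

An empty result only means the number does not appear literally in the log; it does not show whether fsck repaired the inode.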
Fortunately, MDT0 then restarted without problems, but I haven't touched the partially migrated directory since then, to avoid further issues on a Friday night on this production system. My feeling is that lfs migrate hit some migrated inode that somehow got corrupted on MDT0. The system has been working fine for weeks now (the only issue we have is the one reported in LU-11205 regarding changelog_clear errors), so I assume this is due to my running lfs migrate. I can perform troubleshooting next week. Any recommendation to avoid this in the future is welcome, thanks much!
Stephane
Attachments
Issue Links
- duplicates
  - LU-12485 (osd_handler.c:2146:osd_object_release()) ASSERTION( !(o->oo_destroyed == 0 && o->oo_inode && o->oo_inode->i_nlink == 0) ) faile [Resolved]
- is related to
  - LU-13157 migrate symlink with target name length > 59 cause crash [Resolved]
  - LU-14511 oak-MDT0000: deleted inode referenced and aborted journal [Resolved]
- is related to
  - LU-10581 osd_handler.c:1978:osd_object_release()) LBUG [Open]