Details
- Bug
- Resolution: Fixed
- Major
- None
- Lustre 2.5.3
- None
- TOSS 2.4-9
- 2
- 9223372036854775807
Description
After a power outage we encountered a hardware error on one of our storage devices that essentially corrupted ~30 files on one of the OSTs. Since then the OST has been read-only and is throwing the following log messages:
[ 351.029519] LustreError: 8974:0:(ofd_obd.c:1376:ofd_create()) fscratch-OST0001: unable to precreate: rc = -5
[ 360.762505] LustreError: 8963:0:(ofd_obd.c:1376:ofd_create()) fscratch-OST0001: unable to precreate: rc = -5
[ 370.784372] LustreError: 8974:0:(ofd_obd.c:1376:ofd_create()) fscratch-OST0001: unable to precreate: rc = -5
I have scrubbed the device in question and rebooted the system to bring the server up normally, but I am still unable to create a file on that OST.
zpool status -v reports the damaged files and recommends restoring them from backup, but I'm inclined to simply remove the files. I know how to do this with ldiskfs, but I don't know how to do it with ZFS. At this point I don't know how to proceed.
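For readers triaging similar logs, the `rc = -5` in the messages above is a negated Linux errno value, and errno 5 is EIO (input/output error), which is consistent with the hardware fault described. The snippet below is a minimal sketch, not part of the original report: the regular expression and the helper name `decode_precreate_error` are illustrative assumptions, written only against the three log lines quoted here.

```python
import errno
import os
import re

# Pattern for the "unable to precreate" lines quoted in this report.
# It is an assumption based on those three lines, not an official Lustre format.
PRECREATE_RE = re.compile(
    r"\((?P<src>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s+"
    r"(?P<target>[\w-]+): unable to precreate: rc = (?P<rc>-?\d+)"
)


def decode_precreate_error(log_line):
    """Return the target name and decoded errno for one log line, or None."""
    match = PRECREATE_RE.search(log_line)
    if match is None:
        return None
    rc = int(match.group("rc"))
    code = abs(rc)  # kernel return codes are negated errno values
    return {
        "target": match.group("target"),
        "function": match.group("func"),
        "rc": rc,
        "errno_name": errno.errorcode.get(code, "UNKNOWN"),
        "message": os.strerror(code),
    }


line = (
    "[ 351.029519] LustreError: 8974:0:(ofd_obd.c:1376:ofd_create()) "
    "fscratch-OST0001: unable to precreate: rc = -5"
)
info = decode_precreate_error(line)
print(info["target"], info["errno_name"], info["message"])
```

The same helper can be run over a whole console log to confirm that every precreate failure on the OST reports the same error code.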
Issue Links
- is related to: LU-7585 Implement OI Scrub for ZFS (Resolved)
Besides using OI Scrub to repair the corruption in the ZFS filesystem, I think the only other option is to migrate the files off this OST onto other OSTs and then reformat it. How much additional space this consumes on each of the other OSTs depends on how many OSTs there are.
That isn't a great solution, but the repair tools for ZFS are somewhat less robust than for ldiskfs since ZFS itself doesn't get corrupted very easily. Once the OI Scrub functionality is available for ZFS we will be able to repair a fair amount of corruption in the backing filesystem, though it still wouldn't be possible to recover the user data in any OST objects that were corrupted.
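As a rough illustration of why the space cost of migration depends on the OST count, the sketch below divides the data held on the damaged OST across the remaining OSTs. It assumes the migrated data spreads evenly, which Lustre does not guarantee; the function name `extra_space_per_ost` and the 8 TiB figure are hypothetical and do not come from this ticket.

```python
def extra_space_per_ost(used_bytes, total_osts):
    """Average extra bytes each remaining OST absorbs, assuming an even spread."""
    remaining = total_osts - 1  # the damaged OST itself receives nothing
    if remaining < 1:
        raise ValueError("need at least one other OST to migrate to")
    return used_bytes / remaining


TIB = 1024 ** 4
used = 8 * TIB  # hypothetical amount of data on the damaged OST

for total in (2, 4, 16):
    share = extra_space_per_ost(used, total)
    print(f"{total} OSTs: about {share / TIB:.2f} TiB extra per remaining OST")
```

With only two OSTs the surviving one has to absorb everything, while on a filesystem with many OSTs the per-OST impact is small.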