Details
-
Bug
-
Resolution: Fixed
-
Critical
-
None
-
Lustre 1.8.x (1.8.0 - 1.8.5)
-
None
-
Sun hardware running mdadm
-
4120
Description
An OST on one of our scratch file systems was deactivated, and attempts to reactivate it failed with:
[204919.753933] Lustre: scratch1-MDT0000: scratch1-OST0001_UUID now active, resetting orphans
[204919.753939] Lustre: Skipped 1 previous similar message
[204919.754155] LustreError: 10403:0:(osc_create.c:589:osc_create()) scratch1-OST0001-osc: oscc recovery failed: -22
[204919.754166] Lustre: scratch1-OST0001_UUID: Failed to clear orphan objects on OST: -22
[204919.754170] Lustre: scratch1-OST0001_UUID: Sync failed deactivating: rc -22
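For reference, the -22 in these messages follows the usual Linux kernel convention of returning a negated errno value, and errno 22 is EINVAL. A minimal sketch for decoding such codes from log lines is below; the helper names `rc_from_line` and `describe_rc` are made up for illustration and are not part of Lustre.

```python
import errno
import os
import re

# Matches a trailing negative integer such as "... failed: -22" or "... rc -22".
TRAILING_RC = re.compile(r"(-\d+)\s*$")


def rc_from_line(line):
    """Return the trailing negative return code in a log line, or None."""
    match = TRAILING_RC.search(line)
    return int(match.group(1)) if match else None


def describe_rc(rc):
    """Map a negated errno (kernel style) to its symbolic name and message."""
    code = abs(rc)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    sample = (
        "[204919.754170] Lustre: scratch1-OST0001_UUID: "
        "Sync failed deactivating: rc -22"
    )
    rc = rc_from_line(sample)
    if rc is not None:
        name, message = describe_rc(rc)
        print(f"rc {rc}: {name} ({message})")
```

This only names the error code; it does not say which argument the OST rejected.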
First: Is lfsck the proper tool to recover from this error?
Second: Since this is the first time that I have ever used lfsck, I am not sure what to expect. In this particular case there are 32 OSTs on this file system. Following the examples in the manual, I started a read-only run against all OSTs Saturday night, and by the following morning, from what I could judge, it had only gotten through 11 of the 32 OSTs (if it runs sequentially). It reported LOTS of zero-length orphaned inodes. Unfortunately, I didn't pipe the output to a log file, so the information regarding the OST (2/32) that couldn't be reactivated was lost when I ran out of line buffer space. So I stopped the run and restarted it against only that OST. Because it has been running for many hours, listing user files, I am guessing the answer is no, but is running against a single OST a valid option, or must all OSTs be accounted for?
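To avoid losing output to the terminal scrollback on the next run, one option is a small wrapper that echoes every line and appends it to a log file at the same time. This is only a sketch: `run_and_log` is a hypothetical helper, and the command list passed to it would be whatever lfsck invocation is being used.

```python
import subprocess
import sys


def run_and_log(cmd, log_path):
    """Run cmd, echoing each output line to stdout and appending it to log_path.

    Returns the exit status of the command.
    """
    with open(log_path, "a", encoding="utf-8", errors="replace") as log:
        proc = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep errors interleaved with normal output
            text=True,
            errors="replace",
        )
        for line in proc.stdout:
            sys.stdout.write(line)
            log.write(line)
            log.flush()  # so the log stays usable if the run is interrupted
        return proc.wait()


if __name__ == "__main__":
    # Usage: python run_and_log.py <logfile> <command> [args...]
    if len(sys.argv) < 3:
        sys.exit("usage: run_and_log.py <logfile> <command> [args...]")
    sys.exit(run_and_log(sys.argv[2:], sys.argv[1]))
```

Flushing after each line keeps the log current for a run that lasts many hours and may have to be stopped partway through.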
Third: The db files were generated while the file system was quiesced, after e2fsck had run cleanly against all targets. Must lfsck be run with the file system offline, or can it be run while the file system is serving clients? Because of my inexperience with lfsck, I don't know how long a run should take, and since I am using the -n option I will need to run it again with corrective options. Also, when running with corrective options, is -l -c the preferred method?
Issue Links
- is related to LU-14 live replacement of OST (Resolved)