Details
-
Bug
-
Resolution: Fixed
-
Minor
-
None
-
Lustre 2.0.0
-
None
-
linux-2.6.32-71.24.1
-
3
-
6521
Description
Hi,
Some users at the CEA site are complaining about inconsistencies between the usage reported by "lfs quota -u" and by "du -s".
After a long investigation, on-site support finally found that the missing file system space is consumed by orphaned objids on the OSTs, and that this is a consequence of the LU-601 work-around.
When it was impossible to restart the MDS (it systematically asserted in "tgt_recov"), the only solution was to mount the volume in ldiskfs mode and rename the PENDING subdirectory.
Now there are several old "PENDING*" directories, and a lot of orphaned objids belonging to FIDs in these directories.
In order to recover all this lost space, support is asking whether it is safe to run "lfsck", or whether they have to build their own tool to parse all OSTs offline and remove all objids that belong to FIDs in the PENDING* directories.
It is also possible that the PENDING directory was sometimes removed instead of renamed. In that case, is the recovery procedure identical, or is there something else to do?
TIA
Patrick
Below is the support report, and I have also attached the files containing the traces of the commands executed on Client, MDT and OST.
#context: Some time ago, a few users started to report Lustre quota inconsistencies between the "lfs quota -u" report and "du -s" run over their full hierarchy/sub-tree. "lfs quotacheck" did not fix the inconsistencies.

#consequences: Quotas are unusable and inaccurate for these users, and file system space (a lot?) is consumed by orphaned objids on the OSTs.

#details: The first check established that the inconsistencies are due to real (orphaned) file system space/block consumption, and not just a bad quota value. The second step identified that the orphaned objids belong to FIDs in the multiple PENDING* directories on the MDS that were moved as part of the LU-601 work-around. See the [Client,MDT,OST]_side files for the details.

So what can we do now to recover all the space/blocks used by the orphaned objids? Can we safely run "lfsck", or do we need to build our own tool to parse all OSTs offline and remove all objids that belong to FIDs in the PENDING* directories?
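
If a site-built offline tool turns out to be needed, its core could be a dry-run cross-check like the sketch below. This is only an illustration under stated assumptions, not a Lustre utility: the function name `orphan_candidates` and its three inputs are hypothetical, and the sketch assumes that the (OST index, objid) pairs referenced by entries under the PENDING* directories, the pairs referenced by files in the live namespace, and the objids actually present on each OST have already been extracted by some other means. It deletes nothing and only reports which objects look safe to review for removal.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical representation: (OST index, object id) of one OST object.
ObjRef = Tuple[int, int]


def orphan_candidates(
    pending_refs: Iterable[ObjRef],
    live_refs: Iterable[ObjRef],
    ost_objects: Dict[int, Set[int]],
) -> Dict[int, List[int]]:
    """Return, per OST index, the sorted objids that are referenced from
    PENDING* entries, still present on the OST, and NOT referenced by any
    file in the live namespace. Dry run only: nothing is removed."""
    live = set(live_refs)
    candidates: Dict[int, List[int]] = {}
    for ost_idx, objid in set(pending_refs):
        # Safety check: never propose an object a live file still uses.
        if (ost_idx, objid) in live:
            continue
        # Skip objects that no longer exist on the OST.
        if objid not in ost_objects.get(ost_idx, set()):
            continue
        candidates.setdefault(ost_idx, []).append(objid)
    return {idx: sorted(ids) for idx, ids in candidates.items()}


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    pending = [(0, 101), (0, 102), (1, 200), (1, 201)]
    live = [(0, 102), (1, 300)]
    on_ost = {0: {100, 101, 102}, 1: {200, 300}}
    for idx, ids in sorted(orphan_candidates(pending, live, on_ost).items()):
        print(f"OST {idx}: {ids}")
```

Excluding anything still referenced from the live namespace is a deliberate design choice: it keeps the candidate list conservative, so a mistake in collecting the PENDING* references cannot by itself lead to removing data that is in use.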