<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:11:05 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92">
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-860] Lustre quota inconsistencies after multiple usages of LU-601 work-around</title>
                <link>https://jira.whamcloud.com/browse/LU-860</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Hi,&lt;/p&gt;

&lt;p&gt;Some users at the CEA site complain about inconsistencies between the &quot;lfs quota -u&quot; report and &quot;du -s&quot; output.&lt;/p&gt;

&lt;p&gt;After long investigations, on-site support finally found that the lost file system space is consumed by orphaned objids on the OSTs, and is a consequence of the &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-601&quot; title=&quot;kernel BUG at fs/jbd2/transaction.c:1030&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-601&quot;&gt;&lt;del&gt;LU-601&lt;/del&gt;&lt;/a&gt; work-around.&lt;br/&gt;
When it was impossible to restart the MDS (it systematically asserted in &quot;tgt_recov&quot;), the only solution was to mount the volume in ldiskfs mode and rename the PENDING subdirectory.&lt;br/&gt;
Now there are several old &quot;PENDING*&quot; directories, and a lot of orphaned objids belonging to FIDs in these directories.&lt;/p&gt;
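
&lt;p&gt;For illustration only, the rename work-around amounted to the following steps (the device and mount-point names here are placeholders, not the actual ones used on site):&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
# Hypothetical sketch -- /dev/mdtdev and /mnt/mdt are placeholder names.
# With the MDT target stopped, mount its backing device as plain ldiskfs:
mount -t ldiskfs /dev/mdtdev /mnt/mdt
# Rename PENDING out of the way so recovery no longer trips over it:
mv /mnt/mdt/PENDING /mnt/mdt/PENDING.old.1
umount /mnt/mdt
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;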

&lt;p&gt;In order to recover all this lost space, the support team is asking whether it is safe to run &quot;lfsck&quot;, or whether they have to build their own tool to parse all OSTs offline and remove all objids that belong to FIDs in the PENDING* directories.&lt;/p&gt;

&lt;p&gt;Perhaps the PENDING directory was sometimes removed instead of renamed. In that case, is the recovery identical, or is there something else to do?&lt;/p&gt;

&lt;p&gt;TIA&lt;br/&gt;
Patrick&lt;/p&gt;

&lt;p&gt;Below is the support report, and I have also attached the files containing the traces of the commands executed on Client, MDT and OST. &lt;/p&gt;


&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
#context: 

Some time ago, a few users started to report Lustre quota inconsistencies between the &quot;lfs quota -u&quot;
report and &quot;du -s&quot; over their full hierarchy/sub-tree. &quot;lfs quotacheck&quot; did not fix the inconsistencies.

#consequences: 
Quotas are unusable and inaccurate for these users, and a (possibly large) amount of filesystem space is
consumed by orphaned objids on the OSTs.

#details:

The 1st check was to confirm that the inconsistencies are due to real (and orphaned) filesystem space/block
consumption, and not just a bad quota value.

The 2nd was to identify that the orphaned objids belong to FIDs in the multiple PENDING* directories on the MDS
that were moved as part of the LU-601 work-around.

See [Client,MDT,OST]_side files showing the details.

So what can we do now to recover all the space/blocks used by the orphaned objids? Can we safely run &quot;lfsck&quot;,
or do we need to build our own tool to parse all OSTs offline and remove all objids that belong to FIDs in the
PENDING* directories?

&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</description>
                <environment>linux-2.6.32-71.24.1</environment>
        <key id="12477">LU-860</key>
            <summary>Lustre quota inconsistencies after multiple usages of LU-601 work-around</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="bobijam">Zhenyu Xu</assignee>
                                    <reporter username="patrick.valentin">Patrick Valentin</reporter>
                        <labels>
                    </labels>
                <created>Wed, 16 Nov 2011 13:11:24 +0000</created>
                <updated>Thu, 5 Jan 2012 09:41:36 +0000</updated>
                            <resolved>Thu, 5 Jan 2012 09:41:36 +0000</resolved>
                                    <version>Lustre 2.0.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>4</watches>
                                                                            <comments>
                            <comment id="23103" author="johann" created="Wed, 16 Nov 2011 17:16:27 +0000"  >&lt;p&gt;I would not recommend running lfsck, which is time-consuming. What about moving the files of all the PENDING* directories back into the namespace and unlinking them again from a Lustre client (using the &quot;unlink&quot; command instead of rm, to avoid the stat)?&lt;/p&gt;</comment>
                            <comment id="23120" author="pjones" created="Thu, 17 Nov 2011 08:12:24 +0000"  >&lt;p&gt;Bobi&lt;/p&gt;

&lt;p&gt;Could you please comment on this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="23446" author="bobijam" created="Sat, 26 Nov 2011 10:02:16 +0000"  >&lt;p&gt;I think manually unlinking the files is a workable stopgap. Meanwhile, we will try to fix the &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-601&quot; title=&quot;kernel BUG at fs/jbd2/transaction.c:1030&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-601&quot;&gt;&lt;del&gt;LU-601&lt;/del&gt;&lt;/a&gt; issue to make things right, lest this issue happen again.&lt;/p&gt;</comment>
                            <comment id="23918" author="bfaccini" created="Thu, 8 Dec 2011 11:43:50 +0000"  >&lt;p&gt;Hello all,&lt;/p&gt;

&lt;p&gt;Just a quick comment on the &quot;moving the files of all the PENDING* directories back to the namespace and unlinking them again&quot; proposal from Johann: having a look at the MDT inodes in the PENDING* directories, I have found that they don&apos;t have any &quot;lov&quot; EA! So how will the unlink process be able to find the associated orphaned objids we want to remove on the OSTs?&lt;/p&gt;

&lt;p&gt;In case that fails, what do you think about mounting all OSTs and removing all objids that refer to any PENDING*/&amp;lt;FID&amp;gt; in their &quot;fid&quot; EA?&lt;/p&gt;

&lt;p&gt;And lastly, OK about &quot;lfsck&quot; being heavily time-consuming, but you did not answer my original question, &quot;Can we safely run lfsck?&quot;, and I will sharpen it: &quot;Is lfsck THE tool pushed/supported by Whamcloud to repair Lustre inconsistencies, or not?&quot;&lt;/p&gt;

&lt;p&gt;Bruno.&lt;/p&gt;
</comment>
                            <comment id="23949" author="bobijam" created="Thu, 8 Dec 2011 22:07:00 +0000"  >&lt;p&gt;Yes, it is safe to run lfsck.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;       lfsck is used to check and repair the distributed coherency of a Lustre filesystem.
OPTIONS
       -c     Create (empty) missing OST objects referenced by MDS inodes.

       -d     Delete orphaned objects from the filesystem.  Since objects on the OST are often only one of
              several stripes of a file, it can be difficult to put multiple objects back together into a
              single usable file.

       -h     Print a brief help message.

       -l     Put orphaned objects into a lost+found directory in the root of the filesystem.

       -n     Do not repair the filesystem, just perform a read-only check (default).

       -v     Verbose operation - more verbosity by specifying the option multiple times.

&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;


&lt;p&gt;BTW, it would be tedious work to mount all OSTs and find all objects which have the intended fid, since the fid info lies in the &quot;fid&quot; EA of the OST objects.&lt;/p&gt;
</comment>
                            <comment id="24035" author="adilger" created="Fri, 9 Dec 2011 13:30:52 +0000"  >&lt;p&gt;Note that it is possible to generate the lfsck mdsdb and ostdb while the filesystem is mounted, by running:&lt;/p&gt;

&lt;p&gt;  e2fsck -fn --mdsdb {mdsdb_file} /dev/{mdsdev}&lt;/p&gt;

&lt;p&gt;Note the &quot;-n&quot; option here.  The database file may be slightly inconsistent (e.g. contain files that were deleted during the run) but the code should handle this internally.&lt;/p&gt;

&lt;p&gt;When the databases are created, please run:&lt;/p&gt;

&lt;p&gt;  lfsck -nv --mdsdb {mdsdb_file} --ostdb {ostdb_file ...} {lustre mountpoint}&lt;/p&gt;

&lt;p&gt;(note again the -n here) to ensure that this is doing what you expect it to (e.g. the number of orphaned objects is reasonable compared to the amount of missing space).  The -n option will prevent lfsck from actually making any changes to the filesystem.  I would recommend checking several of the OST objects that lfsck thinks should be deleted to get their MDS FID (this should be possible with &apos;debugfs -c -R &quot;stat &amp;lt;O/0/d$((objid % 32))/$objid&quot; /dev/{ostdev}&apos;, which should print out the filter_fid xattr with the parent MDS inode number), and then running &apos;debugfs -c -R &quot;ncheck $ino1 $ino2 $ino3 ...&quot; /dev/{mdsdev}&apos; to check whether any MDS inodes reference those objects.&lt;/p&gt;</comment>
                            <comment id="24605" author="bfaccini" created="Tue, 13 Dec 2011 08:56:43 +0000"  >&lt;p&gt;Yeah! The &quot;back to namespace&quot; + unlink method has been applied to all &quot;PENDING*/*&quot; files/FIDs, and 40TB have been &quot;magically&quot; recovered/freed, with the associated quotas corrected!&lt;/p&gt;

&lt;p&gt;BTW, I am still puzzled about how the FID&amp;lt;-&amp;gt;ObjID[s] relation was reconstructed.&lt;/p&gt;</comment>
                            <comment id="24606" author="johann" created="Tue, 13 Dec 2011 09:20:26 +0000"  >&lt;p&gt;That&apos;s the magic of Christmas &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/smile.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;br/&gt;
More seriously, are you sure that the files under PENDING*/* had no LOV EA? Do you still have the output of debugfs against one of those files?&lt;/p&gt;</comment>
                            <comment id="24628" author="bfaccini" created="Tue, 13 Dec 2011 13:01:10 +0000"  >
&lt;p&gt;I will try to find it in my logs and attach it ...&lt;/p&gt;

&lt;p&gt;In fact, it is finally taking more time/work to get to Christmas ...&lt;/p&gt;

&lt;p&gt;I need to tell the whole story now:&lt;/p&gt;

&lt;p&gt;     _ instead of moving the &quot;PENDING.old*&quot; content back into the namespace, I moved the directories, unlinked all their content/files, and finally ran &quot;rmdir&quot; on all the dirs.&lt;/p&gt;

&lt;p&gt;     _ it took quite some time to understand the following &quot;live&quot;: this led to a situation where the original/1st PENDING directory was no longer present to satisfy the multiple conditions/controls/attributes (link EA, inode/generation number in the OI database, ...) checked during FS/MDT start/mount!&lt;/p&gt;

&lt;p&gt;But now the FS has finally started, and it seems we are beginning to hear some Christmas songs ...&lt;/p&gt;</comment>
                            <comment id="25865" author="pjones" created="Thu, 5 Jan 2012 09:41:36 +0000"  >&lt;p&gt;Bull confirms that this issue is now resolved.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                            <attachment id="10621" name="Client_side" size="14267" author="patrick.valentin" created="Wed, 16 Nov 2011 13:11:24 +0000"/>
                            <attachment id="10622" name="MDT_side" size="5147" author="patrick.valentin" created="Wed, 16 Nov 2011 13:11:24 +0000"/>
                            <attachment id="10623" name="OST_side" size="2524" author="patrick.valentin" created="Wed, 16 Nov 2011 13:11:24 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvhq7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>6521</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>