<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:43:18 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-4504] User quota problem after Lustre upgrade (2.1.4 to 2.4.1)</title>
                <link>https://jira.whamcloud.com/browse/LU-4504</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;After the upgrade at KIT,  the user quotas are not reported correctly. The quota for root seems to be OK. The user quota is 0 on all OSTs, which is wrong.&lt;/p&gt;

&lt;p&gt;e.g. for root:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;[root@pfs2n13 ~]# lfs quota -u root -v /lustre/pfs2wor2/client/
Disk quotas for user root (uid 0):
     Filesystem      kbytes   quota   limit   grace   files   quota   limit   grace
/lustre/pfs2wor2/client/
                 4332006768       0       0       -     790       0       0       -
pfs2wor2-MDT0000_UUID
                    2349176       -       0       -     790       -       0       -
pfs2wor2-OST0000_UUID
                  134219820       -       0       -       -       -       -       -
pfs2wor2-OST0001_UUID
                         12       -       0       -       -       -       -       -
pfs2wor2-OST0002_UUID
                  134219788       -       0       -       -       -       -       -
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;


&lt;p&gt;for a user:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;[root@pfs2n3 ~]# lfs quota -v -u aj9102 /lustre/pfs2wor1/client/
Disk quotas for user aj9102 (uid 3522):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
/lustre/pfs2wor1/client/
                    448       0       0       -    3985       0       0       -
pfs2wor1-MDT0000_UUID
                    448       -       0       -    3985       -       0       -
pfs2wor1-OST0000_UUID
                      0       -       0       -       -       -       -       -
pfs2wor1-OST0001_UUID
                      0       -       0       -       -       -       -       -
pfs2wor1-OST0002_UUID
                      0       -       0       -       -       -       -       -
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</description>
                <environment></environment>
        <key id="22781">LU-4504</key>
            <summary>User quota problem after Lustre upgrade (2.1.4 to 2.4.1)</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="niu">Niu Yawei</assignee>
                                    <reporter username="orentas">Oz Rentas</reporter>
                        <labels>
                    </labels>
                <created>Fri, 17 Jan 2014 15:35:01 +0000</created>
                <updated>Thu, 2 Feb 2017 17:32:37 +0000</updated>
                            <resolved>Thu, 2 Feb 2017 17:32:37 +0000</resolved>
                                    <version>Lustre 2.4.1</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>10</watches>
                                                                            <comments>
                            <comment id="75192" author="pjones" created="Fri, 17 Jan 2014 16:50:44 +0000"  >&lt;p&gt;Niu&lt;/p&gt;

&lt;p&gt;Can you please help with this issue?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="75252" author="niu" created="Mon, 20 Jan 2014 01:58:48 +0000"  >&lt;p&gt;What&apos;s the e2fsprogs version? Looks like it&apos;s dup of &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3784&quot; title=&quot;Quota issue on system upgraded to 2.4.x&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3784&quot;&gt;&lt;del&gt;LU-3784&lt;/del&gt;&lt;/a&gt;. Could you try the following:&lt;/p&gt;

&lt;p&gt;1. upgrade your e2fsprogs to the latest version, which has the fix for &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3784&quot; title=&quot;Quota issue on system upgraded to 2.4.x&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3784&quot;&gt;&lt;del&gt;LU-3784&lt;/del&gt;&lt;/a&gt; (tune2fs: update i_size in ext2fs_file_write()) (you can download it from build.whamcloud.com)&lt;br/&gt;
2. disable then enable each MDT/OST device by &quot;tune2fs -O ^quota&quot; and &quot;tune2fs -O quota&quot;&lt;/p&gt;</comment>
                            <comment id="75348" author="orentas" created="Tue, 21 Jan 2014 15:26:31 +0000"  >&lt;p&gt;The e2fsprogs RPMs installed are: &lt;br/&gt;
-e2fsprogs-1.42.7.wc2-7.el6.x86_64.rpm&lt;br/&gt;
-e2fsprogs-libs-1.42.7.wc2-7.el6.x86_64.rpm.  &lt;/p&gt;

&lt;p&gt;The procedure outlined in step #2 has been performed - we did disable/enable quota for all the mdt and ost devices with:&lt;br/&gt;
tune2fs -O ^quota $dev&lt;br/&gt;
tune2fs -O quota $dev&lt;/p&gt;

&lt;p&gt;Same result.  Any ideas on what to try next, or any debugging that can be done?  Thanks.&lt;/p&gt;


</comment>
                            <comment id="75405" author="niu" created="Wed, 22 Jan 2014 01:46:08 +0000"  >&lt;p&gt;Is this the only user who has incorrect quota usage, or is other users&apos; usage incorrect as well? Could you try writing as the user to see if the newly written bytes are accounted?&lt;/p&gt;

&lt;p&gt;And I want to confirm that the e2fsprogs installed is the latest build from build.whamcloud.com; I&apos;m not sure if the older 1.42.7-wc2 includes the patch.&lt;/p&gt;</comment>
                            <comment id="75476" author="orentas" created="Thu, 23 Jan 2014 01:33:24 +0000"  >&lt;p&gt;The problem affects at least two users on one file system, and another two users on another separate lustre file system.&lt;/p&gt;

&lt;p&gt;Today, the customer discovered the issue on another lustre fs with other users:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;root@ic2n992:/pfs/data2/home# lfs quota -u ho_anfuchs .
Disk quotas for user ho_anfuchs (uid 900085):
      Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
               . 1982128       0       0       -   20350       0       0       -
root@ic2n992:/pfs/data2/home# du -hs ho/ho_kim/ho_anfuchs
5.8G    ho/ho_kim/ho_anfuchs
root@ic2n992:/pfs/data2/home# find ho/ho_kim/ho_anfuchs ! ( -user ho_anfuchs ) -exec ls -l {} \;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;root@ic2n992:/pfs/data2/home# lfs quota -u kn_pop164377 .
Disk quotas for user kn_pop164377 (uid 900025):
      Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
               . 15359272      0       0       -  167897       0       0       -
root@ic2n992:/pfs/data2/home# du -hs kn/kn_kn/kn_pop164377
71G     kn/kn_kn/kn_pop164377
root@ic2n992:/pfs/data2/home# find kn/kn_kn/kn_pop164377 ! ( -user kn_pop164377 ) -exec ls -l {} \;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;


&lt;p&gt;The e2fsprogs was downloaded from whamcloud, with the following patch applied: &lt;a href=&quot;http://git.whamcloud.com/?p=tools/e2fsprogs.git;a=commit;h=470ca046b1&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://git.whamcloud.com/?p=tools/e2fsprogs.git;a=commit;h=470ca046b1&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I will have the customer do a write operation to see if the accounting changes.&lt;/p&gt;</comment>
                            <comment id="75536" author="orentas" created="Fri, 24 Jan 2014 01:47:33 +0000"  >&lt;p&gt;Newly written bytes are accounted. This works for all OSTs.&lt;/p&gt;</comment>
                            <comment id="76009" author="johann" created="Fri, 31 Jan 2014 21:41:32 +0000"  >&lt;p&gt;I wonder whether some of the OST objects belonging to those users did not get the proper UID/GID. Could you please check on one of the OST reporting the wrong usage if all the objects have the correct UID/GID? You can do it by unmounting the OST, mounting it with -t ldiskfs and run a find command to compute usage of the user and compare it with what is reported by lfs quota. Thanks in advance.&lt;/p&gt;</comment>
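Johann's per-OST check above can be sketched in a few lines. A minimal sketch in Python rather than a shell `find` one-liner, assuming the OST has been unmounted and re-mounted read-only as ldiskfs at a placeholder path (the mountpoint name is illustrative, not from this ticket):

```python
import os
from collections import defaultdict

def usage_by_uid(mountpoint):
    """Sum per-UID space usage in kbytes from st_blocks (512-byte units)."""
    usage = defaultdict(int)
    for root, _dirs, files in os.walk(mountpoint):
        for name in files:
            try:
                st = os.lstat(os.path.join(root, name))
            except OSError:
                continue  # an object may vanish while scanning
            usage[st.st_uid] += st.st_blocks * 512 // 1024
    return dict(usage)

if __name__ == "__main__":
    # /mnt/ost_ldiskfs is a placeholder for the read-only ldiskfs mount
    for uid, kb in sorted(usage_by_uid("/mnt/ost_ldiskfs").items()):
        print(f"uid {uid}: {kb} kbytes")
```

The per-UID totals can then be compared against what `lfs quota -v` reports for each OST.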
                            <comment id="76156" author="orentas" created="Tue, 4 Feb 2014 03:44:26 +0000"  >&lt;p&gt;Thanks for the response. Unfortunately, the FS is in production and unmounting anything cannot easily be done.  Are there any other options for gathering the information you&apos;re asking for? Please advise.&lt;br/&gt;
Thanks,&lt;br/&gt;
Oz&lt;/p&gt;</comment>
                            <comment id="76446" author="niu" created="Fri, 7 Feb 2014 08:25:44 +0000"  >&lt;p&gt;Did the customer ever see the problem before upgrading? We&apos;d first like to make sure that these users didn&apos;t have the same problem before upgrading.&lt;/p&gt;</comment>
                            <comment id="76690" author="orentas" created="Tue, 11 Feb 2014 02:12:00 +0000"  >&lt;p&gt;Response from customer:&lt;br/&gt;
The file systems went into production in September 2013 with Lustre 2.1.x. Actually, we never checked the quotas carefully before the upgrade to 2.4 (done in December 2013), i.e. it is possible that the problem existed before the upgrade. On the other hand, we also have one user who has this problem and who created all his files in January (and he got access to the cluster after the 2.4 upgrade).&lt;/p&gt;

&lt;p&gt;In addition, I also found a way to answer the question of Johann Lombardi from 31/Jan/14 9:41 PM without unmounting the OST, by following chapter&lt;br/&gt;
13.14 of the Lustre manual. The file I checked had the correct UID and GID. For more details see the attached file (pfs2wor2_check_quotas_bad_user_20140204.txt).&lt;/p&gt;</comment>
                            <comment id="76693" author="niu" created="Tue, 11 Feb 2014 03:32:54 +0000"  >&lt;blockquote&gt;
&lt;p&gt;In addition, I also found a response to the question of Johann Lombardi from 31/Jan/14 9:41 PM without unmounting the OST by following chapter&lt;br/&gt;
13.14 of the Lustre manual. The file I checked had the correct UID and GID. For more details see attached file (pfs2wor2_check_quotas_bad_user_20140204.txt).&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;It looks like you only checked one file; it would be better to run a script to check all files belonging to user &quot;es_asaramet&quot;.&lt;/p&gt;</comment>
                            <comment id="76850" author="orentas" created="Wed, 12 Feb 2014 17:05:54 +0000"  >&lt;p&gt;From customer:&lt;/p&gt;

&lt;p&gt;In the attached file pfs2wor2_check_quotas_bad_user_20140204.txt, the command &quot;lfs quota -v -u es_asaramet&quot; shows 0 quota usage for pfs2wor2-OST0007. We have found a non-empty file (refinementSurfaces.o) which is located on OST0007. Hence the 0 quota usage reported by lfs quota -v for this user is wrong. And since the file really belongs to the user with UID 900044 (es_asaramet) in the underlying ldiskfs, Johann Lombardi&apos;s assumption that the UID/GID in the underlying ldiskfs might be wrong cannot be the reason for this 0 quota usage on pfs2wor2-OST0007.&lt;/p&gt;

&lt;p&gt;More information from another ticket the customer just today opened regarding quota:&lt;/p&gt;

&lt;p&gt;last December DDN upgraded several Lustre file systems to Lustre 2.4.  &lt;br/&gt;
However, I now recognized that quota enforcement does not work.  &lt;br/&gt;
For example we have:  &lt;br/&gt;
root@ic2n993:~# lfs quota -g imk-tro /pfs/imk &lt;br/&gt;
Disk quotas for group imk-tro (gid 44470): &lt;br/&gt;
    Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace &lt;br/&gt;
      /pfs/imk 3067173276*      0       1       -   11115       0       0       - &lt;br/&gt;
Note that the quota block limit was set to 1 before files were created. &lt;/p&gt;

&lt;p&gt;I was just reading the Lustre manual and checked the following command:  &lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;root@pfs2n6 ~&amp;#93;&lt;/span&gt;# lctl get_param osd-*.*.quota_slave.info &lt;br/&gt;
osd-ldiskfs.pfs2wor1-MDT0000.quota_slave.info= &lt;br/&gt;
target name:    pfs2wor1-MDT0000 &lt;br/&gt;
pool ID:        0 &lt;br/&gt;
type:           md &lt;br/&gt;
quota enabled:  none &lt;br/&gt;
conn to master: setup &lt;br/&gt;
space acct:     ug &lt;br/&gt;
user uptodate:  glb&lt;span class=&quot;error&quot;&gt;&amp;#91;0&amp;#93;&lt;/span&gt;,slv&lt;span class=&quot;error&quot;&gt;&amp;#91;0&amp;#93;&lt;/span&gt;,reint&lt;span class=&quot;error&quot;&gt;&amp;#91;0&amp;#93;&lt;/span&gt; &lt;br/&gt;
group uptodate: glb&lt;span class=&quot;error&quot;&gt;&amp;#91;0&amp;#93;&lt;/span&gt;,slv&lt;span class=&quot;error&quot;&gt;&amp;#91;0&amp;#93;&lt;/span&gt;,reint&lt;span class=&quot;error&quot;&gt;&amp;#91;0&amp;#93;&lt;/span&gt; &lt;br/&gt;
Since &quot;quota enabled&quot; is set to none I assume that the following command  &lt;br/&gt;
was not executed:  &lt;/p&gt;
&lt;p&gt;lctl conf_param pfs2wor1.quota.ost=ug&lt;br/&gt;
Do you believe this assumption is correct?&lt;/p&gt;


&lt;p&gt;BTW: I tried to check the MGS parameters according to a description of  &lt;br/&gt;
Kit Westneat and did the following:  &lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;root@pfs2n2 ~&amp;#93;&lt;/span&gt;# debugfs -R &apos;rdump CONFIGS /tmp/&apos; /dev/mapper/vg_pfs2dat1-mgs &lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;root@pfs2n2 ~&amp;#93;&lt;/span&gt;# for x in /tmp/CONFIGS/*; do echo $x:; llog_reader $x | grep param ; echo; done &lt;br/&gt;
However, the last command hangs. The command &lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;root@pfs2n2 ~&amp;#93;&lt;/span&gt;# llog_reader /tmp/CONFIGS/lustre-params &lt;br/&gt;
permanently repeats the following line:  &lt;br/&gt;
Bit 0 of 135136 not set &lt;/p&gt;</comment>
                            <comment id="76946" author="niu" created="Thu, 13 Feb 2014 08:45:19 +0000"  >&lt;blockquote&gt;
&lt;p&gt;pfs2wor2_check_quotas_bad_user_20140204.txt the command &quot;lfs quota -v -u es_asaramet&quot; shows 0 quota usage for pfs2wor2-OST0007. We have found a non empty file (refinementSurfaces.o) which is located on OST0007. Hence the 0 quota usage of lfs quota -v for this user is wrong. And since the file really belongs to the user with UID 900044 (es_asaramet) in the underlying ldiskfs, the assumption of Johann Lombardi that the UID/GID in the underlying ldiskfs might be wrong cannot be the reason for this 0 quota usage on pfs2wor2-OST0007.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;I see, thanks. Could you check the &quot;/proc/fs/lustre/osd-ldiskfs/pfs2wor2-OST0007/quota_slave/acct_user&quot; and post the result here?&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;I was just reading the Lustre manual and checked the following command: &lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;root@pfs2n6 ~&amp;#93;&lt;/span&gt;# lctl get_param osd-*.*.quota_slave.info &lt;br/&gt;
osd-ldiskfs.pfs2wor1-MDT0000.quota_slave.info= &lt;br/&gt;
target name: pfs2wor1-MDT0000 &lt;br/&gt;
pool ID: 0 &lt;br/&gt;
type: md &lt;br/&gt;
quota enabled: none &lt;br/&gt;
conn to master: setup &lt;br/&gt;
space acct: ug &lt;br/&gt;
user uptodate: glb&lt;span class=&quot;error&quot;&gt;&amp;#91;0&amp;#93;&lt;/span&gt;,slv&lt;span class=&quot;error&quot;&gt;&amp;#91;0&amp;#93;&lt;/span&gt;,reint&lt;span class=&quot;error&quot;&gt;&amp;#91;0&amp;#93;&lt;/span&gt; &lt;br/&gt;
group uptodate: glb&lt;span class=&quot;error&quot;&gt;&amp;#91;0&amp;#93;&lt;/span&gt;,slv&lt;span class=&quot;error&quot;&gt;&amp;#91;0&amp;#93;&lt;/span&gt;,reint&lt;span class=&quot;error&quot;&gt;&amp;#91;0&amp;#93;&lt;/span&gt; &lt;br/&gt;
Since &quot;quota enabled&quot; is set to none I assume that the following command &lt;br/&gt;
was not executed:&lt;br/&gt;
lctl conf_param pfs2wor1.quota.ost=ug&lt;br/&gt;
Do you believe this assumption is correct?&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;The info you checked is for metadata; you should check the quota_slave.info on the OSTs to see if data quota is enabled, and enable it with &quot;lctl conf_param $FSNAME.quota.ost=ug&quot;.&lt;/p&gt;</comment>
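The "quota enabled" check discussed in these comments boils down to scanning the output of `lctl get_param osd-*.*.quota_slave.info` (run on each MDS and OSS) for targets still reporting `none`. A minimal sketch in Python; the parser assumes the field layout matches the sample quoted in this ticket:

```python
def targets_without_quota(text):
    """Return {target_name: flags} for targets whose 'quota enabled' is off."""
    disabled = {}
    target = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("target name:"):
            target = line.split(":", 1)[1].strip()
        elif line.startswith("quota enabled:") and target:
            flags = line.split(":", 1)[1].strip()
            if flags in ("none", ""):
                disabled[target] = flags
    return disabled

# Sample output taken verbatim from this ticket
sample = """\
osd-ldiskfs.pfs2wor1-MDT0000.quota_slave.info=
target name:    pfs2wor1-MDT0000
pool ID:        0
type:           md
quota enabled:  none
conn to master: setup
space acct:     ug
"""

if __name__ == "__main__":
    print(targets_without_quota(sample))  # {'pfs2wor1-MDT0000': 'none'}
```

Any target listed by this check would then need quota enabled via `lctl conf_param $FSNAME.quota.ost=ug` (or `.mdt=ug` for metadata) on the MGS.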
                            <comment id="77454" author="orentas" created="Thu, 20 Feb 2014 03:45:44 +0000"  >&lt;p&gt;The customer checked it on the MDS because the example in chapter 21.2 of the Lustre Operations Manual also displays this information for the MDT. &lt;/p&gt;

&lt;p&gt;Is this example of the manual wrong?&lt;/p&gt;

&lt;p&gt;The OSTs also showed quota enabled: none&lt;/p&gt;

&lt;p&gt;The customer checked and verified they have the same problem on their test system.&lt;/p&gt;

&lt;p&gt;Running &quot;lctl conf_param pfscdat1.quota.ost=ug&quot; on the MGS indeed fixed the problem.&lt;/p&gt;

&lt;p&gt;It would also be good to get a response on why llog_reader hangs permanently for some file systems while we try to check MGS parameters.&lt;/p&gt;</comment>
                            <comment id="77458" author="niu" created="Thu, 20 Feb 2014 06:09:37 +0000"  >&lt;blockquote&gt;
&lt;p&gt;The customer checked it on the MDS because the example in chapter 21.2 of the Lustre Operations Manual also displays this information for the MDT.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;The command should be run on each MDS or OSS to check the quota status of each MDT &amp;amp; OST. The example in the manual displayed only the output of the MDS.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;It would also be good to get a response why the llog_reader hangs permanently for some file systems while we try to check MGS parameters?&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;It looks like &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-632&quot; title=&quot;llog_reader loops forever on an empty file&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-632&quot;&gt;&lt;del&gt;LU-632&lt;/del&gt;&lt;/a&gt;, is the file empty?&lt;/p&gt;</comment>
                            <comment id="77519" author="orentas" created="Thu, 20 Feb 2014 20:58:00 +0000"  >&lt;p&gt; &amp;gt; The command should be run on MDS or OSS to check the quota status of each MDT &amp;amp; OST. The example in manual displayed only the output of MDS.&lt;/p&gt;

&lt;p&gt;Thank you for this clarification!&lt;/p&gt;

&lt;p&gt; &amp;gt; It looks like &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-632&quot; title=&quot;llog_reader loops forever on an empty file&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-632&quot;&gt;&lt;del&gt;LU-632&lt;/del&gt;&lt;/a&gt;, is the file empty?&lt;/p&gt;

&lt;p&gt;Yes, the file is indeed empty. The displayed error message was misleading.&lt;/p&gt;</comment>
                            <comment id="77520" author="orentas" created="Thu, 20 Feb 2014 21:02:39 +0000"  >&lt;p&gt; &amp;gt; Niu Yawei added a comment - 13/Feb/14 8:45 AM  &amp;gt; I see, thanks. Could you check the &quot;/proc/fs/lustre/osd-ldiskfs/pfs2wor2-OST0007/quota_slave/acct_user&quot; and post the result here?&lt;/p&gt;

&lt;p&gt;A file with the requested information is attached, pfs2wor2-OST0007_acct_user_20140213.txt.&lt;/p&gt;

&lt;p&gt;It is a bit strange that rather few user IDs appear and that user ID&lt;br/&gt;
900044 is missing.&lt;br/&gt;
I also checked the used capacity and compared it with the lfs df output, and there is a difference of more than 100 GB.&lt;/p&gt;</comment>
                            <comment id="77824" author="orentas" created="Tue, 25 Feb 2014 16:30:41 +0000"  >&lt;p&gt;Any updates? &lt;/p&gt;

&lt;p&gt;Please let us know if you need any additional information or if there is any additional debugging we could be doing.  We would like to close this one out fairly soon.&lt;/p&gt;

&lt;p&gt;Thanks!&lt;/p&gt;</comment>
                            <comment id="77889" author="niu" created="Wed, 26 Feb 2014 06:11:00 +0000"  >&lt;p&gt;I cooked a debug patch for e2fsprogs: &lt;a href=&quot;http://review.whamcloud.com/#/c/9397&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/9397&lt;/a&gt; and it&apos;s built on &lt;a href=&quot;http://build.whamcloud.com/job/e2fsprogs-reviews/200/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://build.whamcloud.com/job/e2fsprogs-reviews/200/&lt;/a&gt; .&lt;/p&gt;

&lt;p&gt;Oz, could you install the e2fsprogs from &lt;a href=&quot;http://build.whamcloud.com/job/e2fsprogs-reviews/200/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://build.whamcloud.com/job/e2fsprogs-reviews/200/&lt;/a&gt; and collect the debug information while disabling/enabling quota for the OST0007 device? (by tune2fs -O ^quota &amp;amp; tune2fs -O quota command). Thanks a lot.&lt;/p&gt;</comment>
                            <comment id="79135" author="orentas" created="Wed, 12 Mar 2014 15:52:47 +0000"  >&lt;p&gt;The debug e2fsprogs has been installed on the OSS, and quota has been disabled / enabled on the affected OST.  The output from running &apos;tune2fs -O quota&apos; can be taken from &lt;a href=&quot;http://ddntsr.com/ftp/2014-03-12-SR28763_tunefs-O-quota.out.tgz&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://ddntsr.com/ftp/2014-03-12-SR28763_tunefs-O-quota.out.tgz&lt;/a&gt; and the new &apos;lctl get_param osd-*.*.quota_slave.info&apos; output is attached.&lt;/p&gt;</comment>
                            <comment id="79203" author="niu" created="Thu, 13 Mar 2014 01:44:00 +0000"  >&lt;p&gt;Hi, Oz&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;[DEBUG] quotaio_tree.c:316:qtree_write_dquot:: writing ddquot 1: id=900044 off=0, info-&amp;gt;dqi_entry_size=72^M
[DEBUG] quotaio_tree.c:253:do_insert_tree:: inserting in tree: treeblk=1, depth=0^M
[DEBUG] quotaio_tree.c:253:do_insert_tree:: inserting in tree: treeblk=2, depth=1^M
[DEBUG] quotaio_tree.c:253:do_insert_tree:: inserting in tree: treeblk=34, depth=2^M
[DEBUG] quotaio_tree.c:253:do_insert_tree:: inserting in tree: treeblk=36, depth=3^M
[DEBUG] quotaio_tree.c:330:qtree_write_dquot:: writing ddquot 2: id=900044 off=34168, info-&amp;gt;dqi_entry_size=72^M
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It looks like the accounting information for 900044 is written back; could you verify whether the space accounting for 900044 on this OST is now fixed? Thank you.&lt;/p&gt;</comment>
                            <comment id="79316" author="orentas" created="Fri, 14 Mar 2014 05:11:52 +0000"  >&lt;p&gt;It appears the space accounting for 900044 on OST0007 is fixed.  For details see attached file pfs2wor2-OST0007_acct_user_20140313.txt.&lt;/p&gt;

&lt;p&gt;The customer performed further investigations on all 4 of the affected file systems.  The details can be seen in the attached file pfs2wor2_check_quotas_bad_user_20140313.txt.&lt;/p&gt;

&lt;p&gt;Results:&lt;br/&gt;
1. On the pfs2wor2 file system it seems that the quota problems of both affected users were fixed by only re-enabling quotas for OST0007.&lt;br/&gt;
2. For all other file systems (pfs2dat1, pfs2dat2, pfs2wor1) we still have users with quota issues. Only for the pfs2dat1 file system did we find an OST which might be the cause. But please note that for this file system we would rather wait for the next maintenance window before taking disruptive actions, since it contains home directories.&lt;/p&gt;

&lt;p&gt;Questions:&lt;br/&gt;
1. What might have caused these issues?&lt;br/&gt;
2. How can this problem be resolved and prevented from reoccurring?&lt;br/&gt;
NOTE: Quota was reset (disabled / enabled) on all the OSTs back in December.&lt;/p&gt;</comment>
                            <comment id="79317" author="niu" created="Fri, 14 Mar 2014 05:31:08 +0000"  >&lt;blockquote&gt;
&lt;p&gt;It appears the space accounting for 900044 on OST0007 is fixed. For details see attached file pfs2wor2-OST0007_acct_user_20140313.txt.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;Good news, thank you.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Questions:&lt;br/&gt;
1. What might have caused these issues?&lt;br/&gt;
2. How can this problem be resolved and prevented from reoccurring?&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;It could be because the e2fsprogs on OST0007 was not up to date; could you verify whether the e2fsprogs on OST0007 (or the other problematic OSTs) was the same as on the others?&lt;/p&gt;</comment>
                            <comment id="79916" author="orentas" created="Thu, 20 Mar 2014 18:45:56 +0000"  >&lt;p&gt;1. The software is usually installed by pdsh, i.e. it is the same on all servers.&lt;br/&gt;
2. This does not explain why some OSTs on the same OSS showed no problems with their quotas.&lt;br/&gt;
3. Also, OST0007 did not have a quota problem for most users.&lt;/p&gt;

&lt;p&gt;Since we see the same problem on all 4 file systems this is a general problem, i.e. not something which happened once by chance.&lt;/p&gt;

&lt;p&gt;I just had a look at the upgrade documentation which was sent by the vendor field engineer. He wrote (translated): tunefs is problematic and does not always work. Maybe Sven can comment on what exactly was meant here and whether this could be the reason.&lt;/p&gt;

&lt;p&gt;Anyway I wonder how we could clearly repair the problem for the remaining file systems during the next maintenance. I see 2 problems:&lt;br/&gt;
1. The vendor had written that he had done the following for one file system and this had not fixed the quota problem:&lt;br/&gt;
turn off quota first, turn it back on and run an e2fsck&lt;br/&gt;
How can	we be sure to have a procedure which clearly fixes the problem?&lt;br/&gt;
2. The problem might move to different users after disabling and re-enabling quotas. How can we easily and quickly find out if the problem still appears?&lt;/p&gt;</comment>
                            <comment id="79926" author="orentas" created="Thu, 20 Mar 2014 19:37:58 +0000"  >&lt;p&gt;Another interesting thing to note is both user quotas and group quotas are used, but there was not a problem with group quotas.&lt;/p&gt;</comment>
                            <comment id="79961" author="niu" created="Fri, 21 Mar 2014 03:19:18 +0000"  >&lt;blockquote&gt;
&lt;p&gt;1. The software is usually installed by pdsh, i.e. it is the same on all servers.&lt;br/&gt;
2. This does not explain why some OSTs on the same OSS showed no problems with their quotas.&lt;br/&gt;
3. Also, OST0007 did not have a quota problem for most users.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Comparing the two versions of the accounting information for OST0007 (before and after executing tune2fs), we can see that a lot of user accounting was fixed, so I think many users had accounting problems that simply went undiscovered. Maybe it&apos;s the same for the other OSTs on the same OSS?&lt;/p&gt;

&lt;p&gt;Another possibility is that the customer just missed running tune2fs on OST0007?&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;1. The vendor had written that he had done the following for one file system and this had not fixed the quota problem:&lt;br/&gt;
turn off quota first, turn it back on and run an e2fsck&lt;br/&gt;
How can	we be sure to have a procedure which clearly fixes the problem?&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;I think first we&apos;d make sure we are using the correct e2fsprogs. To verify whether the accounting information is fixed, you can check the &quot;acct_user/group&quot; files in proc.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;2. The problem might move to different users after disabling and re-enabling quotas. How can we easily and quickly find out if the problem still appears?&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Disabling/re-enabling quota is just for triggering a quotacheck; you can verify the accounting information in the proc files.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Another interesting thing to note is both user quotas and group quotas are used, but there was not a problem with group quotas.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;I think it is probably because it was just not detected.&lt;/p&gt;</comment>
                            <comment id="80396" author="orentas" created="Thu, 27 Mar 2014 17:51:35 +0000"  >&lt;p&gt;The customer doesn&apos;t believe that tune2fs was missed on some OSTs. Either this is a general Lustre problem or it is a problem with the vendor&apos;s tunefs wrapper script.&lt;/p&gt;

&lt;p&gt;Concerning this wrapper script, the field engineer sent the following:&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt; &amp;gt; I just had a look at the upgrade documentation which was sent by the vendor: tunefs is problematic and does not always work.&lt;/p&gt;

&lt;p&gt;I have not completely isolated the problem yet. I think it is the EXAScaler tunefs wrapper script es_tunefs. It does a second tunefs to set the MMP timeout, which may be causing the problem.&lt;br/&gt;
If tunefs was not done correctly the OSTs do not register with the MGS correctly and the clients cannot access some of the OSTs. This can easily be verified by running &quot;lfs df&quot; on a client. I have noticed that this behaviour is worse with Lustre 2.4.x but I have seen it with older Lustre versions as well.&lt;/p&gt;
&lt;hr /&gt;

&lt;p&gt;Since I had noticed a strange difference between user and group quotas I wrote a perl script which checks the sum of &quot;acct_user/group&quot;&lt;br/&gt;
in proc. The perl script and a text file with the output on all file systems is attached.&lt;/p&gt;

&lt;p&gt;Here are the results:&lt;br/&gt;
1. Some OSTs are affected and others are not affected, i.e.&lt;br/&gt;
this is an easy way to find out which OSTs are affected.&lt;br/&gt;
2. On the affected OSTs both inodes and kbytes are wrong.&lt;br/&gt;
3. We have higher group values and higher user values, i.e.&lt;br/&gt;
both user and group quotas are affected.&lt;br/&gt;
4. On the different file systems in nearly all cases either group values or user values are higher. The reason for this behaviour is not clear.&lt;/p&gt;

&lt;p&gt;Do you have any comments or ideas about the possible reason for the problem?&lt;/p&gt;</comment>
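The consistency check described in the comment above (the customer's perl script) can be re-sketched in Python: sum inodes and kbytes over every id record in an OST's acct_user and acct_group proc files and compare the totals; since both views account for the same objects, the sums should match on a healthy OST. The record layout below is an assumption modeled on osd-ldiskfs acct_* output, and the sample numbers are illustrative, not from this ticket:

```python
import re

# Matches the usage part of an accounting record, e.g.
#   usage:   { inodes: 209, kbytes: 33856 }
USAGE_RE = re.compile(r"inodes:\s*(\d+),\s*kbytes:\s*(\d+)")

def sum_accounting(text):
    """Return (total_inodes, total_kbytes) over all id records."""
    inodes = kbytes = 0
    for m in USAGE_RE.finditer(text):
        inodes += int(m.group(1))
        kbytes += int(m.group(2))
    return inodes, kbytes

# Illustrative samples (assumed format, made-up numbers)
acct_user = """\
usr_accounting:
- id:      0
  usage:   { inodes: 209, kbytes: 33856 }
- id:      900044
  usage:   { inodes: 12, kbytes: 4096 }
"""
acct_group = """\
grp_accounting:
- id:      0
  usage:   { inodes: 209, kbytes: 33856 }
- id:      900085
  usage:   { inodes: 15, kbytes: 8192 }
"""

if __name__ == "__main__":
    u, g = sum_accounting(acct_user), sum_accounting(acct_group)
    print("user:", u, "group:", g, "match:", u == g)
```

A mismatch between the two totals on a given OST flags it as affected, which is exactly how the customer identified the problematic OSTs without any disruptive action.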
                            <comment id="80426" author="niu" created="Fri, 28 Mar 2014 01:41:44 +0000"  >&lt;blockquote&gt;
&lt;p&gt;Do you have any comments or ideas about the possible reason for the problem?&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;This sounds like the same problem as OST0007, which was fixed by re-running &quot;tune2fs -O quota&quot; (with up-to-date e2fsprogs); can these problematic OSTs be fixed in the same way or not?&lt;/p&gt;</comment>
                            <comment id="82702" author="orentas" created="Mon, 28 Apr 2014 22:44:54 +0000"  >&lt;p&gt;The customer ran through the &quot;tune2fs -O quota&quot; procedure last week during their scheduled downtime.  However, this did not resolve the problem.&lt;/p&gt;

&lt;p&gt;For OST pfs2dat2-OST0000 the customer also used the patched e2fsprogs and collected all output. &lt;/p&gt;

&lt;p&gt;The log file with the additional details can be downloaded from &quot;http://ddntsr.com/ftp/2014-04-28-SR28763_tunefs_20140424.txt.gz&quot; (69MB)&lt;/p&gt;</comment>
                            <comment id="82711" author="niu" created="Tue, 29 Apr 2014 01:26:25 +0000"  >&lt;p&gt;Oz, which uid/gid has the problem on pfs2dat2-OST0000?&lt;/p&gt;</comment>
                            <comment id="82865" author="orentas" created="Wed, 30 Apr 2014 16:26:15 +0000"  >&lt;p&gt;We do not know which uid/gid has wrong quotas on pfs2dat2-OST0000.&lt;br/&gt;
We used our Perl script, which sums up all user and group quotas from acct_user/group in proc. This should show the same results for users and groups, but it does not for pfs2dat2-OST0000.&lt;/p&gt;

&lt;p&gt;In detail, before the maintenance and after clients were unmounted the script reported this for pfs2dat2-OST0000:&lt;br/&gt;
Sum of inodes of users:      9353416&lt;br/&gt;
Sum of inodes of groups:     9447415&lt;br/&gt;
Sum of kbytes of users:  11926483836&lt;br/&gt;
Sum of kbytes of groups: 12132828844&lt;/p&gt;

&lt;p&gt;After servers were upgraded to Lustre 2.4.3 and quotas were re-enabled (with normal e2fsprogs):&lt;br/&gt;
Sum of inodes of users:      9325574&lt;br/&gt;
Sum of inodes of groups:     9446294&lt;br/&gt;
Sum of kbytes of users:  11897886304&lt;br/&gt;
Sum of kbytes of groups: 12132673600&lt;br/&gt;
Note the changes although clients were not mounted in the meantime.&lt;/p&gt;

&lt;p&gt;After just re-enabling quotas again for pfs2dat2-OST0000 (with normal e2fsprogs):&lt;br/&gt;
Sum of inodes of users:      9325357&lt;br/&gt;
Sum of inodes of groups:     9446077&lt;br/&gt;
Sum of kbytes of users:  11897857144&lt;br/&gt;
Sum of kbytes of groups: 12132644440&lt;br/&gt;
Note that tune2fs -O quota reported messages like these:&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;ERROR&amp;#93;&lt;/span&gt; quotaio_tree.c:277:do_insert_tree:: Inserting already present quota entry (block 5).&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;ERROR&amp;#93;&lt;/span&gt; quotaio_tree.c:277:do_insert_tree:: Inserting already present quota entry (block 35).&lt;/p&gt;

&lt;p&gt;After re-enabling quotas again for pfs2dat2-OST0000 (with patched e2fsprogs):&lt;br/&gt;
Sum of inodes of users:      9325357&lt;br/&gt;
Sum of inodes of groups:     9446077&lt;br/&gt;
Sum of kbytes of users:  11897857144&lt;br/&gt;
Sum of kbytes of groups: 12132644440&lt;/p&gt;

&lt;p&gt;It is also interesting that only one OST of pfs2dat2 has the same value for users and groups. For the pfs2wor2 file system, most OSTs show the same values. pfs2dat2 has 219 million files and stripe count 1,&lt;br/&gt;
pfs2wor2 has 69 million files and a default stripe count of 2.&lt;/p&gt;

&lt;p&gt;Is further investigation possible with this information and with the provided tune2fs logs?&lt;/p&gt;

&lt;p&gt;If not, the customer will develop another script to find out which uids/gids have wrong quotas on pfs2dat2-OST0000. Since this takes some effort, I just wanted to check whether this is really needed/helpful.&lt;/p&gt;</comment>
                            <comment id="83130" author="niu" created="Sun, 4 May 2014 03:52:43 +0000"  >&lt;blockquote&gt;
&lt;p&gt;Note the changes although clients were not mounted in the meantime.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;Orphan cleanup may have removed some files.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Note that tune2fs -O quota reported messages like these:&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;ERROR&amp;#93;&lt;/span&gt; quotaio_tree.c:277:do_insert_tree:: Inserting already present quota entry (block 5).&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;ERROR&amp;#93;&lt;/span&gt; quotaio_tree.c:277:do_insert_tree:: Inserting already present quota entry (block 35).&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;I noticed that the UID/GIDs on this system are very large; some UIDs are larger than 2G. I think there could be a defect in e2fsprogs which handles large IDs incorrectly. For example:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;[DEBUG] quotaio.c:326:quota_file_create:: Creating quota ino=3, type=0^M
[DEBUG] quotaio_tree.c:316:qtree_write_dquot:: writing ddquot 1: id=2171114240 off=0, info-&amp;gt;dqi_entry_size=72^M
[DEBUG] quotaio_tree.c:253:do_insert_tree:: inserting in tree: treeblk=1, depth=0^M
[DEBUG] quotaio_tree.c:253:do_insert_tree:: inserting in tree: treeblk=0, depth=1^M
[DEBUG] quotaio_tree.c:253:do_insert_tree:: inserting in tree: treeblk=0, depth=2^M
[DEBUG] quotaio_tree.c:253:do_insert_tree:: inserting in tree: treeblk=0, depth=3^M
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;e2fsprogs is writing UID 2171114240 into the quota file, and later on...&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;[DEBUG] quotaio_tree.c:316:qtree_write_dquot:: writing ddquot 1: id=2171114240 off=0, info-&amp;gt;dqi_entry_size=72^M
[DEBUG] quotaio_tree.c:253:do_insert_tree:: inserting in tree: treeblk=1, depth=0^M
[DEBUG] quotaio_tree.c:253:do_insert_tree:: inserting in tree: treeblk=2, depth=1^M
[DEBUG] quotaio_tree.c:253:do_insert_tree:: inserting in tree: treeblk=3, depth=2^M
[DEBUG] quotaio_tree.c:253:do_insert_tree:: inserting in tree: treeblk=4, depth=3^M
[ERROR] quotaio_tree.c:277:do_insert_tree:: Inserting already present quota entry (block 5).^M
[DEBUG] quotaio_tree.c:330:qtree_write_dquot:: writing ddquot 2: id=2171114240 off=11543712, info-&amp;gt;dqi_entry_size=72^M
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;e2fsprogs tries to write UID 2171114240 into the quota file again. It looks like UID 2171114240 got duplicated in the in-memory dict.&lt;/p&gt;

&lt;p&gt;I&apos;ll investigate further to see what happens when inserting a large ID into the in-memory dict.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Is further investigation possible with this information and with the provided tune2fs logs?&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;Yes, no need to develop a new script for now. I just want to get confirmation from the customer that they really have such large UID/GIDs.&lt;/p&gt;</comment>
                            <comment id="83203" author="orentas" created="Mon, 5 May 2014 16:30:14 +0000"  >&lt;p&gt;Thanks Niu.  Here is the response from the customer:&lt;/p&gt;

&lt;p&gt;We have pretty huge UIDs/GIDs. However, they are by far not as huge as reported. The largest UID is 901987 and the largest GID is 890006.&lt;/p&gt;</comment>
                            <comment id="83275" author="niu" created="Tue, 6 May 2014 08:19:03 +0000"  >&lt;p&gt;The huge UID/GIDs may have been caused by a Lustre defect described in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4345&quot; title=&quot;failed to update accounting ZAP for user&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4345&quot;&gt;&lt;del&gt;LU-4345&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;And it looks like there is a defect in e2fsprogs which can corrupt the dict lookup when the difference between two keys is greater than 2G.&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;&lt;span class=&quot;code-keyword&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;code-object&quot;&gt;int&lt;/span&gt; dict_uint_cmp(&lt;span class=&quot;code-keyword&quot;&gt;const&lt;/span&gt; void *a, &lt;span class=&quot;code-keyword&quot;&gt;const&lt;/span&gt; void *b)
{
        unsigned &lt;span class=&quot;code-object&quot;&gt;int&lt;/span&gt;    c, d;

        c = VOIDPTR_TO_UINT(a);
        d = VOIDPTR_TO_UINT(b);

        &lt;span class=&quot;code-keyword&quot;&gt;return&lt;/span&gt; c - d;
}
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This function computes an unsigned int difference but returns it as a signed int, and quota relies on this function to insert IDs into the dict during quotacheck. I think that&apos;s why we see duplicate IDs on quotacheck. I&apos;ll cook up a patch to fix this soon.&lt;/p&gt;</comment>
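The arithmetic behind this bug can be modeled outside of C: the comparator computes an unsigned 32-bit difference and hands it back as a signed int, so whenever the true difference is at least 2^31 the sign flips and the tree ordering breaks. A small Python model (of the 32-bit arithmetic only, not the actual e2fsprogs code) using UID 2171114240 from the log:

```python
def dict_uint_cmp_c(c, d):
    """Model the buggy C comparator: 32-bit unsigned subtraction
    reinterpreted as a signed 32-bit int."""
    diff = (c - d) % 2**32                       # unsigned 32-bit wraparound
    return diff - 2**32 if diff >= 2**31 else diff

def dict_uint_cmp_fixed(c, d):
    """Overflow-safe comparison: return the sign of the ordering,
    never the raw difference."""
    return (c > d) - (d > c)

big, small = 2171114240, 0
# The buggy comparator claims big is "less than" small, because the
# difference 2171114240 exceeds 2**31 and wraps negative.
print(dict_uint_cmp_c(big, small))      # negative despite big being larger
print(dict_uint_cmp_fixed(big, small))  # 1, the correct ordering
```

With the broken ordering, a lookup for an already-inserted large ID can descend down the wrong branch, miss the existing node, and insert the ID a second time, which matches the "Inserting already present quota entry" errors in the log.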
                            <comment id="83277" author="niu" created="Tue, 6 May 2014 09:09:15 +0000"  >&lt;p&gt;&lt;a href=&quot;http://review.whamcloud.com/10227&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/10227&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="83604" author="orentas" created="Fri, 9 May 2014 05:23:35 +0000"  >&lt;p&gt;From the customer:&lt;/p&gt;

&lt;p&gt;It&apos;s good news that you found possible reasons for the problem.&lt;br/&gt;
We will install the patches during our next maintenance, which is expected to take place within the next 2 months.  However, DDN will have to provide a Lustre version which includes those patches.&lt;/p&gt;

&lt;p&gt;For the huge UID/GIDs caused by the lustre defect described in&lt;br/&gt;
&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4345&quot; title=&quot;failed to update accounting ZAP for user&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4345&quot;&gt;&lt;del&gt;LU-4345&lt;/del&gt;&lt;/a&gt;: Is there a way to repair the bad IDs on the OST objects?&lt;/p&gt;</comment>
                            <comment id="83605" author="niu" created="Fri, 9 May 2014 06:07:35 +0000"  >&lt;blockquote&gt;
&lt;p&gt;For the huge UID/GIDs caused by the lustre defect described in&lt;br/&gt;
&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4345&quot; title=&quot;failed to update accounting ZAP for user&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4345&quot;&gt;&lt;del&gt;LU-4345&lt;/del&gt;&lt;/a&gt;: Is there a way to repair the bad IDs on the OST objects?&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Fix bad IDs on existing OST objects: &lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;Find the objects with bad IDs first (mount the OST device with ldiskfs to check IDs of each file or use debugfs without umount)&lt;/li&gt;
	&lt;li&gt;Get the correct ID from the MDT (see Lustre manual section 13.14 to identify which file the object belongs to).&lt;/li&gt;
	&lt;li&gt;Set the correct IDs for the objects on the OST directly.&lt;/li&gt;
&lt;/ul&gt;
</comment>
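The first step above (finding the objects with bad IDs) can be sketched as follows. This assumes the OST device is mounted read-only via ldiskfs at some mount point; the mount point, the threshold, and the helper name are hypothetical, with the threshold chosen from the customer's statement that their largest legitimate UID is 901987.

```python
import os

def find_bad_ids(root, max_valid_id=1000000):
    """Walk a mounted ldiskfs OST tree (hypothetical mount point) and report
    objects whose uid or gid exceeds the site's largest legitimate ID.
    Returns a list of (path, uid, gid) tuples for the suspect objects."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)           # lstat: do not follow symlinks
            if st.st_uid > max_valid_id or st.st_gid > max_valid_id:
                bad.append((path, st.st_uid, st.st_gid))
    return bad
```

The remaining steps (resolving each object back to its MDT file to learn the correct owner, then applying it) still require the manual-style lookup described above; this only automates locating the damaged objects.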
                            <comment id="90870" author="jfc" created="Tue, 5 Aug 2014 15:46:56 +0000"  >&lt;p&gt;Hello Oz,&lt;br/&gt;
We&apos;ve recently heard from another site that Niu&apos;s fixes have resolved the quota problems they were seeing.&lt;br/&gt;
Has DDN installed a new version at this site, with those patches?&lt;br/&gt;
If so, do you have any news on this?&lt;br/&gt;
Thanks,&lt;br/&gt;
~ jfc. &lt;/p&gt;</comment>
                            <comment id="90948" author="niu" created="Wed, 6 Aug 2014 01:46:21 +0000"  >&lt;p&gt;I updated the procedure for fixing bad IDs on OST objects (see my previous comment). Thanks.&lt;/p&gt;</comment>
                            <comment id="91477" author="rganesan@ddn.com" created="Tue, 12 Aug 2014 21:11:30 +0000"  >&lt;p&gt;Hello Niu, &lt;/p&gt;

&lt;p&gt;We don&apos;t think manually fixing the OST objects is a good idea, since the file system has more than 100 million files.&lt;br/&gt;
The customer is expecting a better way to fix this issue.&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
Rajesh&lt;/p&gt;</comment>
                            <comment id="91589" author="niu" created="Thu, 14 Aug 2014 01:23:17 +0000"  >&lt;blockquote&gt;
&lt;p&gt;We don&apos;t think manually fixing the OST objects is a good idea, since the file system has more than 100 million files.&lt;br/&gt;
The customer is expecting a better way to fix this issue.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;I can&apos;t think of any other good way to fix the bad IDs. I think running a script instead of repeating the commands manually would be better.&lt;/p&gt;</comment>
                            <comment id="183088" author="mdiep" created="Thu, 2 Feb 2017 17:19:54 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.whamcloud.com/secure/ViewProfile.jspa?name=rganesan%40ddn.com&quot; class=&quot;user-hover&quot; rel=&quot;rganesan@ddn.com&quot;&gt;rganesan@ddn.com&lt;/a&gt;, &lt;a href=&quot;https://jira.whamcloud.com/secure/ViewProfile.jspa?name=orentas&quot; class=&quot;user-hover&quot; rel=&quot;orentas&quot;&gt;orentas&lt;/a&gt;, have you resolved this issue? do you need anything else from this ticket?&lt;/p&gt;</comment>
                            <comment id="183094" author="orentas" created="Thu, 2 Feb 2017 17:27:57 +0000"  >&lt;p&gt;Yes, a long time ago.  Please close. Thanks!&lt;/p&gt;</comment>
                            <comment id="183096" author="mdiep" created="Thu, 2 Feb 2017 17:32:37 +0000"  >&lt;p&gt;Thank you Sir!&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10120">
                    <name>Blocker</name>
                                                                <inwardlinks description="is blocked by">
                                        <issuelink>
            <issuekey id="24596">LU-5018</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="22341">LU-4345</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="24982">LU-5129</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="14018" name="check_quotas_bad_user_20140123.txt" size="6697" author="orentas" created="Fri, 24 Jan 2014 01:47:33 +0000"/>
                            <attachment id="14615" name="compare_user_group_quotas_20140327.txt" size="59783" author="orentas" created="Thu, 27 Mar 2014 17:51:35 +0000"/>
                            <attachment id="14279" name="pfs2n18-quota_slaveinfo.txt" size="2730" author="orentas" created="Wed, 12 Mar 2014 15:52:47 +0000"/>
                            <attachment id="14143" name="pfs2wor2-OST0007_acct_user_20140213.txt" size="3447" author="orentas" created="Thu, 20 Feb 2014 21:02:39 +0000"/>
                            <attachment id="14296" name="pfs2wor2-OST0007_acct_user_20140313.txt" size="6104" author="orentas" created="Fri, 14 Mar 2014 05:11:52 +0000"/>
                            <attachment id="14086" name="pfs2wor2_check_quotas_bad_user_20140204.txt" size="8796" author="orentas" created="Tue, 11 Feb 2014 02:12:00 +0000"/>
                            <attachment id="14295" name="pfs2wor2_check_quotas_bad_user_20140313.txt" size="20066" author="orentas" created="Fri, 14 Mar 2014 05:11:52 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10490" key="com.atlassian.jira.plugin.system.customfieldtypes:datepicker">
                        <customfieldname>End date</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Thu, 14 Aug 2014 15:35:01 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzwd27:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>12320</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10493" key="com.atlassian.jira.plugin.system.customfieldtypes:datepicker">
                        <customfieldname>Start date</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Fri, 17 Jan 2014 15:35:01 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    </customfields>
    </item>
</channel>
</rss>