<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:59:44 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-6382] quota : inconsistence between master &amp; slave</title>
                <link>https://jira.whamcloud.com/browse/LU-6382</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;We have a quota problem on one of our OSTs.&lt;/p&gt;

&lt;p&gt;Here are the error logs:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;LustreError: 11-0: lustre1-MDT0000-lwp-OST0008: Communicating with 10.225.8.3@o2ib, operation ldlm_enqueue failed with -3.

LustreError: 12476:0:(qsd_handler.c:340:qsd_req_completion()) $$$ DQACQ failed with -3, flags:0x9 qsd:lustre1-OST0008 qtype:grp
id:10011 enforced:1 granted:1276244380 pending:0 waiting:128 req:1 usage:1276244415 qunit:0 qtune:0 edquot:0

LustreError: 12476:0:(qsd_handler.c:767:qsd_op_begin0()) $$$ ID isn&apos;t enforced on master, it probably due to a legeal race, if this
message is showing up constantly, there could be some inconsistence between master &amp;amp; slave, and quota reintegration needs be
re-triggered. qsd:lustre1-OST0008 qtype:grp id:10011 enforced:1 granted:1276244380 pending:0 waiting:0 req:0 usage:1276244415
qunit:0 qtune:0 edquot:0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
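
&lt;p&gt;When comparing such messages across OSTs, it can help to pull the key:value fields apart. The following is only a minimal sketch, assuming the field layout shown in the log above and that only the field portion of the message (from qsd: onward) is passed in; parse_qsd_fields and usage_over_granted are hypothetical helper names, not part of Lustre.&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-python&quot;&gt;
```python
import re

# Field portion of the qsd_op_begin0() message above, joined onto one line.
SAMPLE = (
    "qsd:lustre1-OST0008 qtype:grp id:10011 enforced:1 "
    "granted:1276244380 pending:0 waiting:0 req:0 "
    "usage:1276244415 qunit:0 qtune:0 edquot:0"
)

# Assumption: every field is a key:value token with no embedded whitespace.
FIELD_RE = re.compile(r"(\w+):(\S+)")


def parse_qsd_fields(text):
    """Return the key:value tokens of a QSD log message as a dict.

    Purely numeric values become int; everything else stays a string.
    """
    fields = {}
    for key, value in FIELD_RE.findall(text):
        fields[key] = int(value) if value.isdigit() else value
    return fields


def usage_over_granted(fields):
    """Amount by which usage exceeds granted, or 0 (units as reported in the log)."""
    return max(0, fields["usage"] - fields["granted"])


if __name__ == "__main__":
    parsed = parse_qsd_fields(SAMPLE)
    print(parsed["qsd"], parsed["qtype"], parsed["id"], usage_over_granted(parsed))
```
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;For the message above, this reports that usage exceeds granted by 35 for group 10011 on lustre1-OST0008.&lt;/p&gt;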

&lt;p&gt;The errors occur only on this OST (and only for that group ID).&lt;/p&gt;

&lt;p&gt;We set the quotas with these commands:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;lfs setquota -g $gid --block-softlimit 40t --block-hardlimit 40t /lustre1
lfs setquota -u $uid --inode-softlimit 1000000 --inode-hardlimit 1000000 /lustre1
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;For group 10011, we disabled the quotas 1 or 2 days before the errors occurred, using:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;lfs setquota -g 10011 --block-softlimit 0 --block-hardlimit 0 /lustre1
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;What does &quot;quota reintegration needs be re-triggered&quot; mean? I guess it means running &quot;lfs quotacheck&quot; on the filesystem, right?&lt;/p&gt;

&lt;p&gt;Thanks&lt;br/&gt;
JS&lt;/p&gt;</description>
                <environment>We are running Lustre 2.5.3 on all our servers, with ZFS 0.6.3 on the OSS and ldiskfs/ext4 on the MDS (all 18 servers are running CentOS 6.5).&lt;br/&gt;
The client nodes are running Lustre 2.4.3 on CentOS 6.6.&lt;br/&gt;
</environment>
        <key id="29152">LU-6382</key>
            <summary>quota : inconsistence between master &amp; slave</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="5">Cannot Reproduce</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="jslandry">JS Landry</reporter>
                        <labels>
                    </labels>
                <created>Wed, 18 Mar 2015 18:27:09 +0000</created>
                <updated>Wed, 8 Feb 2023 10:31:08 +0000</updated>
                            <resolved>Sat, 9 Oct 2021 06:43:38 +0000</resolved>
                                    <version>Lustre 2.4.3</version>
                    <version>Lustre 2.5.3</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>4</watches>
                                                                            <comments>
                            <comment id="362074" author="eaujames" created="Wed, 8 Feb 2023 10:27:47 +0000"  >&lt;p&gt;The CEA hit this issue in production on a ClusterStor Lustre version (server side, 2.12.4...).&lt;br/&gt;
Some users have their quota ID enforced on the OSTs (slave, QSD) but not on MDT0000 (master, QMT). If the slave quota limits are exceeded on an OST, the clients fall back from BIO to sync IO:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-c&quot;&gt;
&lt;span class=&quot;code-keyword&quot;&gt;&lt;span class=&quot;code-object&quot;&gt;int&lt;/span&gt;&lt;/span&gt; vvp_io_write_commit(&lt;span class=&quot;code-keyword&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;code-keyword&quot;&gt;struct&lt;/span&gt; lu_env *env, &lt;span class=&quot;code-keyword&quot;&gt;struct&lt;/span&gt; cl_io *io)
{
......
        &lt;span class=&quot;code-comment&quot;&gt;/* out of quota, &lt;span class=&quot;code-keyword&quot;&gt;try&lt;/span&gt; sync write */&lt;/span&gt;                          
        if (rc == -EDQUOT &amp;amp;&amp;amp; !cl_io_is_mkwrite(io)) {               
                &lt;span class=&quot;code-keyword&quot;&gt;struct&lt;/span&gt; ll_inode_info *lli = ll_i2info(inode);       
                                                                    
                rc = vvp_io_commit_sync(env, io, queue,            
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This causes &lt;b&gt;a lot of small write IOs&lt;/b&gt; from those users&apos; jobs on the OSTs, which quickly increases the load on the OSS (RAID6 parity calculations) and the disk usage (RAID6 on rotational disks with no OST write cache). The overall filesystem was really slow.&lt;/p&gt;

&lt;p&gt;The issue was resolved by forcing a quota reintegration on the OSS:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;lctl set_param osd-ldiskfs.*.quota_slave.force_reint=1
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                                        </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="22542">LU-4404</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10030" key="com.atlassian.jira.plugin.system.customfieldtypes:labels">
                        <customfieldname>Epic/Theme</customfieldname>
                        <customfieldvalues>
                                        <label>Quota</label>
            <label>zfs</label>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzx8un:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>