<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:40:01 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-4139] Significant performance issue when user over soft quota limit</title>
                <link>https://jira.whamcloud.com/browse/LU-4139</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;When a user goes over their softlimit there is a major performance hit.&lt;/p&gt;

&lt;p&gt;Testing showed a file copied in 3 sec when under the softlimit and 7 min when over the softlimit.&lt;/p&gt;

&lt;p&gt;Can be reproduced by simply testing below and over the softlimit.&lt;/p&gt;

&lt;p&gt;See the attached trace for when the copy was slow.&lt;/p&gt;</description>
                <environment></environment>
        <key id="21613">LU-4139</key>
            <summary>Significant performance issue when user over soft quota limit</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="niu">Niu Yawei</assignee>
                                    <reporter username="mhanafi">Mahmoud Hanafi</reporter>
                        <labels>
                    </labels>
                <created>Wed, 23 Oct 2013 22:19:11 +0000</created>
                <updated>Wed, 4 Dec 2019 09:07:29 +0000</updated>
                            <resolved>Tue, 3 Dec 2013 19:15:53 +0000</resolved>
                                    <version>Lustre 2.4.1</version>
                                    <fixVersion>Lustre 2.6.0</fixVersion>
                    <fixVersion>Lustre 2.4.2</fixVersion>
                    <fixVersion>Lustre 2.5.1</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="69693" author="pjones" created="Wed, 23 Oct 2013 22:29:59 +0000"  >&lt;p&gt;Niu&lt;/p&gt;

&lt;p&gt;Could you please comment on this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="69711" author="niu" created="Thu, 24 Oct 2013 03:39:24 +0000"  >&lt;p&gt;Hi, Mahmoud&lt;/p&gt;

&lt;p&gt;This is by design, not a bug.&lt;/p&gt;

&lt;p&gt;To achieve accurate grace time management, we have to shrink the qunit size to the least qunit size (1K); then the quota slave can only acquire a 1K quota limit from the master each time, which definitely hurts performance. For good performance, a large qunit would have to be kept when approaching the softlimit; however, the grace timer would then not be triggered (or stopped) exactly at the user&apos;s real disk usage, which I think would be a really bad user experience.&lt;/p&gt;

&lt;p&gt;We think the performance when over the softlimit (or approaching the hardlimit) is less important compared with an accurate grace timer (or an accurate -EDQUOT), so we chose to sacrifice performance (in certain special cases) for accuracy. This is actually the reason dynamic qunit was introduced.&lt;/p&gt;</comment>
                            <comment id="69771" author="adilger" created="Thu, 24 Oct 2013 16:24:33 +0000"  >&lt;p&gt;Niu, when you write &quot;1k qunit&quot; is that really 1kB of data, or 1024 blocks?  It doesn&apos;t make sense to only have 1kB of quota if the blocks are allocated in chunks of 4kB. &lt;/p&gt;</comment>
                            <comment id="69846" author="mhanafi" created="Thu, 24 Oct 2013 21:31:15 +0000"  >&lt;p&gt;3 sec to 7 min is not a very good trade-off. This behaviour makes the softlimit useless for us. Is there a way we can tune how much min qunit is allocated if a user is over their softlimit?&lt;/p&gt;
</comment>
                            <comment id="69859" author="jaylan" created="Thu, 24 Oct 2013 22:56:12 +0000"  >&lt;p&gt;The qsd_internal.h showed the meaning of (qunit == 1024) as:&lt;br/&gt;
    1MB or 1K inodes.&lt;/p&gt;
</comment>
                            <comment id="69865" author="niu" created="Fri, 25 Oct 2013 02:57:49 +0000"  >&lt;blockquote&gt;
&lt;p&gt;The qsd_internal.h showed the meaning of (qunit == 1024) as:&lt;br/&gt;
1MB or 1K inodes.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;Right, it is 1K blocks (1M).&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;3sec to 7 min is not a very good trade off. This behaviour makes softlimit useless for us. Is there a way we can tune how much min qunit is allocated if a user is over their softlimit.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;After going over the softlimit, the client turns to sync write instead of writing to cache, and the quota slave only acquires the minimum quota limit which can just satisfy the incoming write each time; that means a two-RPC round-trip delay would be added to each write operation, but I don&apos;t think the performance gap should be so big (3 sec to 7 min). Could you verify that all data was really flushed back (for the 3 sec copy)? Or could you retry the test with direct I/O to see the difference? Thanks.&lt;/p&gt;</comment>
                            <comment id="69922" author="mhanafi" created="Fri, 25 Oct 2013 16:20:53 +0000"  >&lt;p&gt;Here is what my testing showed:&lt;/p&gt;

&lt;p&gt;Using IOR with direct I/O, 3 threads writing to a single ost:&lt;br/&gt;
Over Softlimit: 316MB/sec&lt;br/&gt;
Under Softlimit: 356MB/sec&lt;/p&gt;


&lt;p&gt;Using IOR with buffered I/O, 3 threads writing to a single ost:&lt;br/&gt;
Over Softlimit: 3.33MB/sec&lt;br/&gt;
Under Softlimit: 2.91MB/sec&lt;/p&gt;

&lt;p&gt;Using &apos;cp&apos; to copy a 10G file:&lt;br/&gt;
Under Softlimit: 7.8 sec&lt;br/&gt;
Over Softlimit: &amp;gt;10 min. I didn&apos;t want to wait for it to finish.&lt;/p&gt;

&lt;p&gt;What I noticed is that when I was over my softlimit using cp, all the I/O was 4KB RPCs. I was able to see this happen in the middle of my test: as I went over my softlimit, the RPCs would drop from 1MB to 4KB. Also, using IOR with buffered I/O all the RPCs were 4K. It seems that the smaller I/O sizes are the main issue.&lt;/p&gt;


&lt;p&gt;MY IOR COMMANDS&lt;br/&gt;
Buffered IO&lt;br/&gt;
mpiexec -n 3 /u/mhanafi/bin/IOR  -a POSIX -C -Y -e -i 1 -w -t 1m -b 1g -v -O useExistingTestFile=1,keepFile=1,filePerProc=1,intraTestBarriers=1,testFile=ior,outlierThreshold=1&lt;/p&gt;

&lt;p&gt;Direct IO&lt;br/&gt;
mpiexec -n 3 /u/mhanafi/bin/IOR  -a POSIX -C -B -i 1 -w -t 1m -b 1g -v -O useExistingTestFile=1,keepFile=1,filePerProc=1,intraTestBarriers=1,testFile=ior,outlierThreshold=1&lt;/p&gt;




</comment>
                            <comment id="69974" author="adilger" created="Sat, 26 Oct 2013 06:06:46 +0000"  >&lt;p&gt;Niu,&lt;br/&gt;
I think it is unintentional that over softlimit IO is done in 4kB chunks, even if the qunit is getting 1MB chunks.  Is it possible to avoid throttling the clients if there is a large gap between the soft and hard quota limit (i.e. treating over softlimit the same as under softlimit if there is still a large margin before the hardlimit)?&lt;/p&gt;</comment>
                            <comment id="69990" author="niu" created="Mon, 28 Oct 2013 04:20:11 +0000"  >&lt;blockquote&gt;
&lt;p&gt;What I noticed is when I was over my softlimit using cp all the I/O was 4KB RPCs. I was able to see this happen in the middle of my test as I would go over my softlimit the RPCs would drop from 1MB to 4KB. Also using IOR with buffered I/O all the RPCs where 4K. It seems that the smaller I/O sizes are the main issue.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;For quota accuracy, when approaching (or over) the quota hardlimit (or softlimit), the client turns to sync write (see bug16642), and in consequence the RPC size will be the page size, 4K (pages can&apos;t be cached on the client; they have to be synced out on write).&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;I think it is unintentional that over softlimit IO is done in 4kB chunks, even if the qunit is getting 1MB chunks. Is it possible to avoid throttling the clients if there is a large gap between the soft and hard quota limit (i.e. treating over softlimit the same as under softlimit if there is still a large margin before the hardlimit)?&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;As I described above, page-size (4KB) I/O is because of sync write on the client. To avoid sync write on the client after going over the softlimit, I think we can probably tweak the qunit size differently when over the softlimit. I&apos;ll try to cook a patch.&lt;/p&gt;</comment>
                            <comment id="69991" author="jaylan" created="Mon, 28 Oct 2013 05:03:14 +0000"  >&lt;p&gt;While the servers run 2.4.1, the clients are 2.1.5. The client code has no knowledge of the new quota rules. Which variable/field enforces sync write, and how does the server tell clients to start using sync write? I found where the qunit is adjusted, but I have not figured out how the sync write is enforced.&lt;/p&gt;</comment>
                            <comment id="69993" author="niu" created="Mon, 28 Oct 2013 05:36:24 +0000"  >&lt;blockquote&gt;
&lt;p&gt;While the servers run 2.4.1, the clients are 2.1.5. The client code has no knowledge of the new quota rules. Which variable/field enforces sync write, and how does the server tell clients to start using sync write? I found where the qunit is adjusted, but I have not figured out how the sync write is enforced.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;The new quota code didn&apos;t change the client protocol, so triggering sync write when approaching the limit is the same as before; please check the server code qsd_op_begin0():&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;                        __u64   usage;

                        lqe_read_lock(lqe);
                        usage  = lqe-&amp;gt;lqe_usage;
                        usage += lqe-&amp;gt;lqe_pending_write;
                        usage += lqe-&amp;gt;lqe_waiting_write;
                        usage += qqi-&amp;gt;qqi_qsd-&amp;gt;qsd_sync_threshold;

                        &lt;span class=&quot;code-comment&quot;&gt;/* &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; we should notify client to start sync write */&lt;/span&gt;
                        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (usage &amp;gt;= lqe-&amp;gt;lqe_granted - lqe-&amp;gt;lqe_pending_rel)
                                *flags |= LQUOTA_OVER_FL(qqi-&amp;gt;qqi_qtype);
                        &lt;span class=&quot;code-keyword&quot;&gt;else&lt;/span&gt;
                                *flags &amp;amp;= ~LQUOTA_OVER_FL(qqi-&amp;gt;qqi_qtype);
                        lqe_read_unlock(lqe);
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And the client code osc_queue_async_io() -&amp;gt; osc_quota_chkdq().&lt;/p&gt;</comment>
                            <comment id="70019" author="niu" created="Mon, 28 Oct 2013 16:15:59 +0000"  >&lt;p&gt;Lose some grace time accuracy to improve write performance when over softlimit: &lt;a href=&quot;http://review.whamcloud.com/8078&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/8078&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="70079" author="mhanafi" created="Mon, 28 Oct 2013 21:07:42 +0000"  >&lt;p&gt;How does this patch help with the 4k io sizes? I think that is the real issue with the performance. &lt;/p&gt;</comment>
                            <comment id="70091" author="niu" created="Tue, 29 Oct 2013 02:26:50 +0000"  >&lt;blockquote&gt;
&lt;p&gt;How does this patch help with the 4k io sizes? I think that is the real issue with the performance.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;The 4K I/O size is caused by the over-quota flag on the client. With this patch, the slave can acquire/pre-acquire a little more spare limit each time when over the softlimit, so the over-quota flag won&apos;t be set on the client anymore.&lt;/p&gt;</comment>
                            <comment id="70216" author="mhanafi" created="Wed, 30 Oct 2013 00:18:36 +0000"  >&lt;p&gt;New benchmark numbers with the patch:&lt;/p&gt;

&lt;p&gt;Direct I/O&lt;br/&gt;
Under Softlimit: 383MB/sec&lt;br/&gt;
Over Softlimit: 359MB/sec&lt;/p&gt;

&lt;p&gt;Buffered I/O&lt;br/&gt;
Under Softlimit: 316MB/sec&lt;br/&gt;
Over Softlimit: 304MB/sec&lt;/p&gt;

&lt;p&gt;So it looks good! &lt;/p&gt;</comment>
                            <comment id="72113" author="yujian" created="Fri, 22 Nov 2013 08:24:12 +0000"  >&lt;p&gt;Patch &lt;a href=&quot;http://review.whamcloud.com/8078&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/8078&lt;/a&gt; landed on master branch and was cherry-picked to Lustre b2_4 branch.&lt;/p&gt;</comment>
                            <comment id="72724" author="mhanafi" created="Tue, 3 Dec 2013 18:28:02 +0000"  >&lt;p&gt;Please close this one.&lt;/p&gt;</comment>
                            <comment id="72732" author="pjones" created="Tue, 3 Dec 2013 19:15:53 +0000"  >&lt;p&gt;ok Mahmoud&lt;/p&gt;</comment>
                            <comment id="73656" author="niu" created="Tue, 17 Dec 2013 06:14:07 +0000"  >&lt;p&gt;backported to b2_5:  &lt;a href=&quot;http://review.whamcloud.com/8603&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/8603&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="34784">LU-7795</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="13677" name="quota.debug.slow.gz" size="946405" author="mhanafi" created="Wed, 23 Oct 2013 22:19:11 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzw6pr:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>11230</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10023"><![CDATA[4]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>