<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:31:12 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-10005] File creation to slave MDT is much slower than primary MDT on DNE1 configuration</title>
                <link>https://jira.whamcloud.com/browse/LU-10005</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;There is one MDS and two MDTs. Both MDTs have a symmetric hardware setup.&lt;br/&gt;
This is a DNE1 setup with a static MDT allocation for each directory, as shown below.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@c01 ~]# lfs mkdir -i 0 /scratch0/dir0
[root@c01 ~]# lfs mkdir -i 1 /scratch0/dir1
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If I run mdtest against each MDT separately, file creation on the slave MDT (MDT0001) is much slower than on the primary MDT (MDT0000).&lt;br/&gt;
Here is a quick summary:&lt;/p&gt;

&lt;p&gt;1. MDT0000 on MDT0 : 154K ops/sec &lt;br/&gt;
2. MDT0001 on MDT1 :   94K ops/sec&lt;/p&gt;

&lt;p&gt;I also tested with the devices swapped: the MDT1 device was reformatted as MDT0000 and the MDT0 device was reformatted as MDT0001.&lt;/p&gt;

&lt;p&gt;3. MDT0000 on original MDT1 device : 151K ops/sec &lt;br/&gt;
4. MDT0001 on original MDT0 device : 106K ops/sec&lt;/p&gt;

&lt;p&gt;These benchmark results show that the MDT device and backend storage are not the problem; it doesn&apos;t matter which device is used. In either case, file creation on MDT0001 is slower than on MDT0000.&lt;/p&gt;

&lt;p&gt;Here are the full mdtest results.&lt;br/&gt;
mpirun -np 128 /work/tools/bin/mdtest -n 10000 -v -d /scratch0/dir1 -i 3 -p 60 -F -u&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Format MDT0 device with MDT0000
SUMMARY: (of 3 iterations)
   Operation                      Max            Min           Mean        Std Dev
   ---------                      ---            ---           ----        -------
   File creation     :     155589.552     152790.159     154009.399       1170.995
   File stat         :     454932.894     444775.351     449009.216       4315.516
   File read         :     233628.858     230038.744     232081.775       1507.029
   File removal      :     188460.588     184435.235     186712.008       1685.251
   Tree creation     :        551.714        444.141        493.856         44.292
   Tree removal      :         19.593         18.601         18.984          0.436
V-1: Entering timestamp...

Format MDT1 device with MDT0001
SUMMARY: (of 3 iterations)
   Operation                      Max            Min           Mean        Std Dev
   ---------                      ---            ---           ----        -------
   File creation     :      97428.133      92734.086      94657.494       2007.797
   File stat         :     463844.746     439627.133     450037.946      10174.240
   File read         :     234910.249     232565.024     233533.717        999.923
   File removal      :     186289.259     181171.423     184208.010       2195.839
   Tree creation     :        476.266         32.866        325.249        206.784
   Tree removal      :         19.429         14.144         17.055          2.191
V-1: Entering timestamp...

Reformat MDT1 device as MDT0000
SUMMARY: (of 3 iterations)
   Operation                      Max            Min           Mean        Std Dev
   ---------                      ---            ---           ----        -------
   File creation     :     155432.973     145656.215     151697.151       4311.335
   File stat         :     436363.906     420914.320     428509.377       6309.935
   File read         :     231848.424     229879.273     230823.486        805.926
   File removal      :     189856.501     186441.697     187710.599       1525.794
   Tree creation     :        564.044        432.872        504.217         54.166
   Tree removal      :         18.839         17.053         17.802          0.757
V-1: Entering timestamp...

Reformat MDT0 device as MDT0001
SUMMARY: (of 3 iterations)
   Operation                      Max            Min           Mean        Std Dev
   ---------                      ---            ---           ----        -------
   File creation     :     110312.440     103512.106     106042.905       3036.285
   File stat         :     443284.493     425246.521     435923.695       7728.341
   File read         :     226239.692     225898.388     226067.629        139.351
   File removal      :     186702.519     181944.612     184773.293       2043.883
   Tree creation     :        533.233         28.863        342.123        223.290
   Tree removal      :         17.901         17.260         17.650          0.280
V-1: Entering timestamp...

&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</description>
                <environment>b2_10</environment>
        <key id="48375">LU-10005</key>
            <summary>File creation to slave MDT is much slower than primary MDT on DNE1 configuration</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="laisiyao">Lai Siyao</assignee>
                                    <reporter username="ihara">Shuichi Ihara</reporter>
                        <labels>
                    </labels>
                <created>Tue, 19 Sep 2017 11:17:12 +0000</created>
                <updated>Wed, 5 Sep 2018 18:34:48 +0000</updated>
                            <resolved>Sun, 17 Dec 2017 16:19:10 +0000</resolved>
                                                    <fixVersion>Lustre 2.11.0</fixVersion>
                    <fixVersion>Lustre 2.10.4</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>8</watches>
                                                                            <comments>
                            <comment id="208766" author="adilger" created="Tue, 19 Sep 2017 17:15:03 +0000"  >&lt;p&gt;Hi Ihara,&lt;br/&gt;
I&apos;m wondering if there is some extra overhead in looking up the &lt;tt&gt;scratch0-&amp;gt;dir1&lt;/tt&gt; directory entry that is causing the MDT0001 operations to be slower?  It would be useful to run mdtest inside the &lt;tt&gt;dir1&lt;/tt&gt; directory to avoid the extra lookup, and determine if that is the cause of the slowdown:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# cd /scratch0/dir1
# mpirun -np 128 /work/tools/bin/mdtest -n 10000 -v -d . -i 3 -p 60 -F -u
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The &lt;tt&gt;scratch0&lt;/tt&gt; directory should always be cached on the client, but I&apos;m wondering if there is some problem with the locking on &lt;tt&gt;dir1&lt;/tt&gt; that is preventing it from being cached?&lt;/p&gt;</comment>
                            <comment id="208809" author="di.wang" created="Tue, 19 Sep 2017 20:43:41 +0000"  >&lt;p&gt;Another possible cause is that the default LOV striping cache does not work correctly, which might cause each file open/create (on a non-root MDT) to try to get the default striping from the root MDT (an extra RPC). See lod_ah_init()-&amp;gt;lod_get_default_lov_striping().  &lt;/p&gt;

&lt;p&gt;I am not sure whether the OSP cache works in this case; I will check.&lt;/p&gt;</comment>
                            <comment id="208819" author="gerrit" created="Tue, 19 Sep 2017 22:05:14 +0000"  >&lt;p&gt;wangdi (di.wang@intel.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/29078&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/29078&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-10005&quot; title=&quot;File creation to slave MDT is much slower than primary MDT on DNE1 configuration&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-10005&quot;&gt;&lt;del&gt;LU-10005&lt;/del&gt;&lt;/a&gt; osp: cache non-exist EA&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 8a7e2c23d4bd35d0d544e91bdc51ce0abff895ac&lt;/p&gt;</comment>
                            <comment id="208823" author="di.wang" created="Tue, 19 Sep 2017 22:06:41 +0000"  >&lt;p&gt;Ihara, please try this patch, thanks. &lt;/p&gt;</comment>
                            <comment id="208840" author="ihara" created="Wed, 20 Sep 2017 02:09:45 +0000"  >&lt;p&gt;Thanks, WangDi.&lt;br/&gt;
The patch gives better results.&lt;/p&gt;

&lt;p&gt;MDT0000&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;SUMMARY: (of 3 iterations)
   Operation                      Max            Min           Mean        Std Dev
   ---------                      ---            ---           ----        -------
   File creation     :     145472.642     137093.283     140274.791       3706.089
   File stat         :     443154.312     431557.793     436764.649       4807.570
   File read         :     233326.068     229897.041     231796.549       1424.131
   File removal      :     186842.911     186376.008     186627.793        192.368
   Tree creation     :        572.336        436.418        499.243         55.961
   Tree removal      :         19.798         18.276         19.165          0.647
V-1: Entering timestamp...
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;MDT0001&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;SUMMARY: (of 3 iterations)
   Operation                      Max            Min           Mean        Std Dev
   ---------                      ---            ---           ----        -------
   File creation     :     153350.909     135825.706     143892.288       7222.027
   File stat         :     462687.460     449961.013     457977.445       5697.362
   File read         :     230307.880     224196.385     226475.436       2726.092
   File removal      :     192887.031     187816.726     189799.023       2212.682
   Tree creation     :        514.550        399.017        451.076         47.852
   Tree removal      :         18.876         17.538         18.222          0.546

V-1: Entering timestamp...
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;BTW, I didn&apos;t see such performance differences with IEEL3.0. Is there something we did in lustre-2.7 that was missed or changed after lustre-2.7, so that this issue showed up?&lt;/p&gt;
</comment>
                            <comment id="208842" author="ihara" created="Wed, 20 Sep 2017 02:14:58 +0000"  >&lt;p&gt;BTW, after applying patch &lt;a href=&quot;https://review.whamcloud.com/29078&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/29078&lt;/a&gt;,&lt;br/&gt;
overall average file creation performance on the primary MDT (MDT0000) drops.&lt;/p&gt;

&lt;p&gt;I ran mdtest 10 times with a 60-second interval:&lt;br/&gt;
mpirun --allow-run-as-root /work/tools/bin/mdtest -n 10000 -v -d /scratch0/dir0 -i 10 -p 60 -u -F &lt;/p&gt;

&lt;p&gt;without patch &lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@c01 ~]# grep &apos;V-1:   File creation&apos; mdtest-default-dir0-loop.log 
V-1:   File creation     :          8.723 sec,     146739.418 ops/sec
V-1:   File creation     :          8.855 sec,     144543.110 ops/sec
V-1:   File creation     :          8.829 sec,     144978.787 ops/sec
V-1:   File creation     :          8.803 sec,     145404.742 ops/sec
V-1:   File creation     :          8.637 sec,     148192.295 ops/sec
V-1:   File creation     :          9.084 sec,     140911.792 ops/sec
V-1:   File creation     :          8.837 sec,     144853.049 ops/sec
V-1:   File creation     :          9.288 sec,     137808.205 ops/sec
V-1:   File creation     :          9.046 sec,     141502.448 ops/sec
V-1:   File creation     :          9.392 sec,     136287.278 ops/sec
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;with patch &lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@c01 ~]# grep &apos;V-1:   File creation&apos; mdtest-LU10005-dir0-loop.log 
V-1:   File creation     :          8.874 sec,     144246.332 ops/sec
V-1:   File creation     :          8.552 sec,     149675.160 ops/sec
V-1:   File creation     :          9.211 sec,     138957.104 ops/sec
V-1:   File creation     :          9.058 sec,     141315.265 ops/sec
V-1:   File creation     :          9.297 sec,     137673.095 ops/sec
V-1:   File creation     :          9.263 sec,     138185.327 ops/sec
V-1:   File creation     :          9.469 sec,     135184.898 ops/sec
V-1:   File creation     :          9.266 sec,     138134.736 ops/sec
V-1:   File creation     :          9.373 sec,     136563.934 ops/sec
V-1:   File creation     :          9.486 sec,     134930.710 ops/sec
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="208873" author="di.wang" created="Wed, 20 Sep 2017 04:14:35 +0000"  >&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;BTW, I didn&apos;t see such performance differences with IEEL3.0. Is there something we did in lustre-2.7 that was missed or changed after lustre-2.7, so that this issue showed up?
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I think this was introduced by &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8998&quot; title=&quot;Progressive File Layout (PFL)&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8998&quot;&gt;&lt;del&gt;LU-8998&lt;/del&gt;&lt;/a&gt; &lt;a href=&quot;https://review.whamcloud.com/#/c/24823&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/#/c/24823&lt;/a&gt;, which landed in 2.10, so 2.7 should be fine.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;BTW, after applying patch https://review.whamcloud.com/29078,
overall average file creation performance on the primary MDT (MDT0000) drops.
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Hmm, I did not touch anything in the path of local file open/creation. Is it repeatable? This drop seems unlikely to be related to the patch, IMHO. I will check again. Thanks.&lt;/p&gt;</comment>
                            <comment id="216525" author="gerrit" created="Sun, 17 Dec 2017 06:21:05 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/29078/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/29078/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-10005&quot; title=&quot;File creation to slave MDT is much slower than primary MDT on DNE1 configuration&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-10005&quot;&gt;&lt;del&gt;LU-10005&lt;/del&gt;&lt;/a&gt; osp: cache non-exist EA&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: b2fa448050aff0b5230c8cc94e8baf848fbb4ded&lt;/p&gt;</comment>
                            <comment id="216554" author="pjones" created="Sun, 17 Dec 2017 16:19:10 +0000"  >&lt;p&gt;Landed for 2.11&lt;/p&gt;</comment>
                            <comment id="216636" author="gerrit" created="Mon, 18 Dec 2017 18:37:18 +0000"  >&lt;p&gt;Minh Diep (minh.diep@intel.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/30585&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/30585&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-10005&quot; title=&quot;File creation to slave MDT is much slower than primary MDT on DNE1 configuration&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-10005&quot;&gt;&lt;del&gt;LU-10005&lt;/del&gt;&lt;/a&gt; osp: cache non-exist EA&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_10&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 96e35818b3c09978f7f05020a8f2641e0de0c92c&lt;/p&gt;</comment>
                            <comment id="217026" author="gerrit" created="Fri, 22 Dec 2017 00:00:41 +0000"  >&lt;p&gt;Minh Diep (minh.diep@intel.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/30643&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/30643&lt;/a&gt;&lt;br/&gt;
Subject: Revert &quot;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-10005&quot; title=&quot;File creation to slave MDT is much slower than primary MDT on DNE1 configuration&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-10005&quot;&gt;&lt;del&gt;LU-10005&lt;/del&gt;&lt;/a&gt; osp: cache non-exist EA&quot;&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 8da4d30a18ce1371624730225ad1b324e5128db1&lt;/p&gt;</comment>
                            <comment id="225872" author="gerrit" created="Thu, 12 Apr 2018 15:25:46 +0000"  >&lt;p&gt;John L. Hammond (john.hammond@intel.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/30585/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/30585/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-10005&quot; title=&quot;File creation to slave MDT is much slower than primary MDT on DNE1 configuration&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-10005&quot;&gt;&lt;del&gt;LU-10005&lt;/del&gt;&lt;/a&gt; osp: cache non-exist EA&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_10&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 3a59349d931250c3ea008a68f8f0121500d984a4&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                                        </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="49912">LU-10406</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzzkev:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>