<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:34:19 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-10355] Lower write throughput after putting many files into Lustre</title>
                <link>https://jira.whamcloud.com/browse/LU-10355</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Hi,&lt;/p&gt;

&lt;p&gt;The total Lustre write throughput became lower after I transferred many small files to Lustre (about 1,000,000 files, 1MB per file). The write throughput stays the same even after these small files are removed. I have tested with obdfilter-survey, and it also shows the lower throughput. Do you have any suggestions to make Lustre cleaner? Thanks.&lt;/p&gt;</description>
                <environment>Lustre 2.10.1 + OSD_ZFS </environment>
        <key id="49656">LU-10355</key>
            <summary>Lower write throughput after putting many files into Lustre</summary>
                <type id="9" iconUrl="https://jira.whamcloud.com/images/icons/issuetypes/undefined.png">Question/Request</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="pjones">Peter Jones</assignee>
                                    <reporter username="sebg-crd-pm">sebg-crd-pm</reporter>
                        <labels>
                    </labels>
                <created>Fri, 8 Dec 2017 09:59:15 +0000</created>
                <updated>Thu, 6 Sep 2018 10:15:07 +0000</updated>
                            <resolved>Thu, 6 Sep 2018 10:15:07 +0000</resolved>
                                    <version>Lustre 2.10.1</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="215850" author="adilger" created="Fri, 8 Dec 2017 22:39:25 +0000"  >&lt;p&gt;Hi, could you please give us some more details about your test, and how big the performance problem is?  Is this a newly-formatted filesystem, or is this an existing filesystem that you are running the test on?  Are you creating/removing/creating files in the same directory, or a new directory each time?  What are the actual speeds (creates/sec) for the first and second test?  Are you using HDD or SSD storage for the MDT and OST?&lt;/p&gt;

&lt;p&gt;In general, a disk-based filesystem will get slower over time, because the free space becomes fragmented, and a significant performance difference exists between the inner and outer tracks of a disk (about 50% slower; see &lt;a href=&quot;https://unix.stackexchange.com/questions/293176/hdds-outer-track-vs-inner-track-performance-benchmarks/409832&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;this StackExchange question&lt;/a&gt; for example).&lt;/p&gt;

&lt;p&gt;Note that ZFS doesn&apos;t immediately delete the files from disk after they are removed.  There are &quot;internal snapshots&quot; (at least 4) that are kept in the filesystem in case of a crash, which may take tens of seconds to actually be removed if the filesystem is not being modified.  This may perturb your test results if you are doing create/remove/create in quick succession.&lt;/p&gt;

&lt;p&gt;Secondly, if you create 1M files in a directory, the size of the directory itself needs to grow to store the filenames, but these blocks are not freed when the files are deleted.  File access in a large directory is slower than in a small directory, since the filenames are hashed and distributed around the whole directory, and blocks may need to be read from disk.  However, the performance of the directory should not continue to get worse after the first few cycles if you continually create/remove files in the same directory.&lt;/p&gt;</comment>
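The delayed-free behavior described above can be observed directly: ZFS exposes the amount of space still queued for deferred freeing as the pool's `freeing` property. A minimal sketch for draining it before re-running a benchmark (the pool name `tank` is a placeholder):

```shell
#!/bin/sh
# Show how many bytes ZFS still has queued for deferred freeing.
# Non-zero output means recently deleted files are still being reclaimed.
zpool get -H -o value freeing tank

# Poll until deferred frees have drained before starting the next
# create/remove cycle, so pending frees do not perturb the results.
while [ "$(zpool get -H -o value freeing tank)" != "0" ]; do
    sleep 5
done
```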
                            <comment id="216037" author="sebg-crd-pm" created="Tue, 12 Dec 2017 10:28:15 +0000"  >&lt;p&gt;Lustre configuration:&lt;/p&gt;

&lt;p&gt;MDS x2: SSD x10, mirror, per MDS&lt;/p&gt;

&lt;p&gt;OSS x2: HDD raidz2 (9+2) x3, per OSS&lt;/p&gt;

&lt;p&gt;1. Test Case 1: with many small files (about 1M files, in separate directories, 500 files per directory)&lt;/p&gt;

&lt;p&gt;8-client IOR 512GB test, write throughput: 3GB/s&lt;/p&gt;

&lt;p&gt;2. Test Case 2: then re-create the zpool and format everything&lt;/p&gt;

&lt;p&gt;8-client IOR 512GB test, write throughput: 4GB/s&lt;/p&gt;

&lt;p&gt;3. Test Case 3: newly created zpool + copy many small files (about 1M files in the same directory)&lt;/p&gt;

&lt;p&gt;Tested one OST with zfs fio: 1.3GB/s (newly created) =&amp;gt; 1GB/s (after copying 1M files)&lt;/p&gt;

&lt;p&gt;I will also test based on one zfs pool with 1M files (separate directories).&lt;/p&gt;

&lt;p&gt;It looks like the OSS zfs became slower after saving many files.&lt;/p&gt;</comment>
                            <comment id="216038" author="adilger" created="Tue, 12 Dec 2017 10:56:09 +0000"  >&lt;p&gt;How long after the small-file create was the fio test run?  If it was right after the small-file create+write, the large-file writes may be waiting for the previous small files to be written out?&lt;/p&gt;

&lt;p&gt;We are working on the &#8220;Data on MDS&#8221; feature to separate small files onto the MDS, but it is not released yet. &lt;/p&gt;</comment>
                            <comment id="216083" author="adilger" created="Tue, 12 Dec 2017 18:30:50 +0000"  >&lt;p&gt;Does the test write 512GB in total, or per client? How big is the zpool in total?&lt;/p&gt;

&lt;p&gt;Have you tried this test on a local ZFS filesystem, without Lustre? It may be that this is the behavior of ZFS. &lt;/p&gt;</comment>
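A local ZFS baseline as suggested here could be approximated with fio; a sketch of a streaming-write job (the mount path, file sizes, and job count are placeholders, not values from this ticket):

```shell
#!/bin/sh
# Sequential-write baseline against a local ZFS mount, to compare a
# fresh pool against one that has held 1M small files.
# --direct=0 because ZFS historically did not support O_DIRECT.
fio --name=seqwrite \
    --directory=/mnt/zfs-test \
    --rw=write --bs=1m --size=4g --numjobs=4 \
    --ioengine=libaio --direct=0 \
    --group_reporting
```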
                            <comment id="216275" author="sebg-crd-pm" created="Thu, 14 Dec 2017 11:07:39 +0000"  >&lt;p&gt;Hi,&lt;/p&gt;

&lt;p&gt;Test Cases 1/2 are 8 Lustre clients, each client writing 512GB (Lustre, 6 OSTs, raidz 9+2).&lt;/p&gt;

&lt;p&gt;Test Case 3 is on a local ZFS filesystem (one zpool, local test).&lt;/p&gt;

&lt;p&gt;We look forward to the &#8220;Data on MDS&#8221; feature for small-file access when it is available.&lt;/p&gt;

&lt;p&gt;I have tested other cases regarding performance:&lt;/p&gt;

&lt;p&gt;Test Case 4: local zfs pool, capacity 47TB (raidz 9+2), 1M files (130k per file)&lt;/p&gt;

&lt;p&gt;The throughput is the same as without these small files (1.3GB/s).&lt;/p&gt;

&lt;p&gt;=&amp;gt; So many very small files do not impact throughput.&lt;/p&gt;

&lt;p&gt;Test Case 5: local zfs pool, capacity 47TB (raidz 9+2), 5000 files (1GB per file)&lt;/p&gt;

&lt;p&gt;It gets almost only &quot;half&quot; the throughput (655MB/s) compared with a newly created zpool (1.3GB/s).&lt;/p&gt;

&lt;p&gt;=&amp;gt; It looks like the throughput has been impacted by the inner/outer tracks.&lt;/p&gt;

&lt;p&gt;But this case only uses 10% of the capacity, so I expected the zpool throughput to drop only 5~10%, just like a single disk. It seems there is some other reason the zpool throughput dropped to 50%.&lt;/p&gt;

&lt;p&gt;Do you have any suggestions? Thanks.&lt;/p&gt;</comment>
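Free-space fragmentation, rather than track position alone, is another candidate for a drop like this; recent zpool versions report a fragmentation metric per pool. A minimal check (the pool name `tank` is a placeholder):

```shell
#!/bin/sh
# List pool capacity and free-space fragmentation. A high FRAG value
# with low CAP suggests fragmented metaslabs are slowing allocation,
# rather than an inner-track slowdown.
zpool list -o name,size,alloc,free,frag,cap tank
```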
                            <comment id="216332" author="jgmitter" created="Thu, 14 Dec 2017 18:11:47 +0000"  >&lt;p&gt;    &quot; We look forward to this feature &quot;&#8220;Data on MDS&#8221;  for small files access if it is available.&quot;&lt;/p&gt;

&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;Data on MDT will be available beginning with the Lustre 2.11.0 release projected for the end of Q1.&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="216414" author="pjones" created="Fri, 15 Dec 2017 18:08:12 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.whamcloud.com/secure/ViewProfile.jspa?name=sebg-crd-pm&quot; class=&quot;user-hover&quot; rel=&quot;sebg-crd-pm&quot;&gt;sebg-crd-pm&lt;/a&gt; are you able to test the pre-release 2.11 code including the data on MDT feature?&lt;/p&gt;</comment>
                            <comment id="216703" author="sebg-crd-pm" created="Tue, 19 Dec 2017 08:13:22 +0000"  >&lt;p&gt;I am focusing on throughput performance and other issues, so I may test Data on MDT later.&lt;/p&gt;

&lt;p&gt;Can I download the pre-release 2.11 code now?&lt;/p&gt;</comment>
                            <comment id="216715" author="pjones" created="Tue, 19 Dec 2017 13:55:45 +0000"  >&lt;p&gt;Yes the latest 2.11 pre-release build can always be accessed via &lt;a href=&quot;https://build.hpdd.intel.com/job/lustre-master/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://build.hpdd.intel.com/job/lustre-master/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;However, when are you targeting entering production?&lt;/p&gt;</comment>
                            <comment id="224198" author="pjones" created="Wed, 21 Mar 2018 16:45:22 +0000"  >&lt;p&gt;2.11 RC1 is now in testing&lt;/p&gt;</comment>
                            <comment id="233108" author="pjones" created="Thu, 6 Sep 2018 10:15:07 +0000"  >&lt;p&gt;2.11 has been GA for some time&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzzp0n:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>