<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:18:53 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-1695] Demonstrate MDS performance with increasing client load for SMP Affinity</title>
                <link>https://jira.whamcloud.com/browse/LU-1695</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Cray has found that plotting MDS performance against the number of clients (or ranks), using either mdtest or metabench, shows that MDS performance for create/stat/unlink rises from 1 to 64 clients, where it peaks, then declines as additional clients are added to the test.  Historically, more than 64 clients were not needed to show MDS performance saturation.  The problem is that using more than 64 clients leads to a decline in performance rather than the plateau that would be expected given the limitation of a single MDS.&lt;/p&gt;

&lt;p&gt;The following data were gathered using metabench to measure rates of create/stat/unlink for a fixed number of files spread over a growing number of clients.  The Lustre servers were running Lustre 2.1.1 plus patches, and the clients were running Lustre 1.8.6 on Cray XE6.  The data are for 1M files, but the degradation of create and unlink rates as the number of clients increases is consistent across a broad range of file counts.  Furthermore, the degradation is more severe when all files are in a single directory (as expected).&lt;/p&gt;

&lt;p&gt;           Individual directories:&lt;br/&gt;
1M files&lt;br/&gt;
 Ranks   Nodes Creates   Stats  Unlinks&lt;br/&gt;
   512      32   18868   52702   13730 &lt;br/&gt;
  1024      64   20615   55660   15427 &lt;br/&gt;
  2048     128   19583   54987   11249 &lt;br/&gt;
  4096     256   16587   54386    9586 &lt;br/&gt;
  8192     512   13807   52892    7910 &lt;/p&gt;

&lt;p&gt;           Shared directory:&lt;br/&gt;
1M files&lt;br/&gt;
 Ranks   Nodes Creates   Stats  Unlinks&lt;br/&gt;
   512      32   19636   56030    9905  &lt;br/&gt;
  1024      64   20149   56807   10190  &lt;br/&gt;
  1024      64   16610   58880    9339  &lt;br/&gt;
  2048     128   19890   57257    9343  &lt;br/&gt;
  4096     256    6906   55991    4338  &lt;br/&gt;
  8192     512    6348   59329    2761  &lt;/p&gt;

&lt;p&gt;The DoD&apos;s HPCMOD office first reported this &quot;behavior&quot; to Sun and Cray several years ago, following a test they funded to compare Lustre and GPFS metadata performance.  For a small range of clients, Lustre outperformed GPFS, but then, instead of hitting a plateau with increasing client load, the Lustre MDS performance declined significantly (beyond 64 or 128 nodes, depending on the test run).  At the time, Sun told Cray and its customer that making the MDS SMP-aware would resolve the problem.&lt;/p&gt;

&lt;p&gt;As a result, we need to add a test of create/stat/unlink rates across a wide range of client counts to the qualification of the SMP affinity feature.  We need to show results before and after the SMP patches.  If there is no effect, then these results will provide a baseline for comparison with future investigations.&lt;/p&gt;
</description>
                <environment>Cray XE6 with Lustre 2.1.1 MDS/OSS</environment>
        <key id="15321">LU-1695</key>
            <summary>Demonstrate MDS performance with increasing client load for SMP Affinity</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="4">Incomplete</resolution>
                                        <assignee username="liang">Liang Zhen</assignee>
                                    <reporter username="jcarrier">John Carrier</reporter>
                        <labels>
                    </labels>
                <created>Wed, 25 Jul 2012 23:22:33 +0000</created>
                <updated>Wed, 28 Feb 2018 20:31:51 +0000</updated>
                            <resolved>Wed, 28 Feb 2018 20:31:51 +0000</resolved>
                                                                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                            <comments>
                            <comment id="43042" author="bryon" created="Fri, 10 Aug 2012 15:53:58 +0000"  >&lt;p&gt;Liang will look at Toro test results from Aug 4-5 to check for degradation.  He will also run SMP Scaling code on Hyperion DAT cluster to look at larger scales.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="13418">LU-1167</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzurxb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2186</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>