<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:01:17 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-13439] DNE3: MDT QOS tuning to avoid full MDTs completely</title>
                <link>https://jira.whamcloud.com/browse/LU-13439</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Testing for &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-13417&quot; title=&quot;DNE3: mkdir() automatically create remote directory on MDS which has more space&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-13417&quot;&gt;&lt;del&gt;LU-13417&lt;/del&gt;&lt;/a&gt; showed that &quot;&lt;tt&gt;lfs setdirstripe -D -c 1 -i -1 /mnt/testfs&lt;/tt&gt;&quot; now causes subdirectories to be created on different MDTs when &lt;tt&gt;qos_threshold_rr&lt;/tt&gt; is reduced.  However, mkdir still failed once one MDT ran out of space, even though free inodes remained, and multiple kernel errors were also reported.  For mkdir this is a real problem because each directory needs at least one block, so the QOS code should completely avoid selecting MDTs with little free space (e.g. below 5% of the average MDT free space).&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# lfs df
UUID                   1K-blocks        Used   Available Use% Mounted on
testfs-MDT0000_UUID       125368        9508      104624   9% /mnt/testfs[MDT:0]
testfs-MDT0001_UUID       125368       93560       20572  82% /mnt/testfs[MDT:1]

# lfs df -i
UUID                      Inodes       IUsed       IFree IUse% Mounted on
testfs-MDT0000_UUID       100000       20295       79705  21% /mnt/testfs[MDT:0]
testfs-MDT0001_UUID       100000       40580       59420  41% /mnt/testfs[MDT:1]

# ./createmany -d /mnt/testfs/dir 1000
total: 1000 mkdir in 0.57 seconds: 1768.49 ops/second
[root@centos7 tests]# lfs getdirstripe -m /mnt/testfs/dir* | sort | uniq -c
    871 0
    129 1

# ./createmany -d /mnt/testfs/dsub/d 1000
total: 1000 mkdir in 1.64 seconds: 608.97 ops/second
# lfs getdirstripe -m /mnt/testfs/dsub/d[0-9]* | sort | uniq -c
    860 0
    140 1
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;These show a reasonable distribution of directories, with over 85% of them going to MDT0000.&lt;/p&gt;

&lt;p&gt;However, when creating more directories the space balance doesn&apos;t change very much:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# ./createmany -d /mnt/testfs/dsub/d 1000 9000
 - mkdir 5742 (time 1586376135.48 total 10.00 last 574.16)
mkdir(/mnt/testfs/dsub/d9621) error: No space left on device
total: 8621 mkdir in 15.30 seconds: 563.46 ops/second
# lfs df
UUID                   1K-blocks        Used   Available Use% Mounted on
testfs-MDT0000_UUID       125368       48572       65560  43% /mnt/testfs[MDT:0]
testfs-MDT0001_UUID       125368      125368           0 100% /mnt/testfs[MDT:1]

# lfs df -i
UUID                      Inodes       IUsed       IFree IUse% Mounted on
testfs-MDT0000_UUID       100000       29764       70236  30% /mnt/testfs[MDT:0]
testfs-MDT0001_UUID       100000       50330       49670  51% /mnt/testfs[MDT:1]
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;This shows that mkdir fails with &lt;tt&gt;-ENOSPC&lt;/tt&gt; in the &quot;&lt;tt&gt;-i -1&lt;/tt&gt;&quot; directory even though MDT0000 still has plenty of free blocks and inodes.  Checking the directories that were created shows that the distribution didn&apos;t change very much:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# lfs getdirstripe -m /mnt/testfs/dsub/d1[0-9][0-9][0-9] | sort | uniq -c
    891 0
    109 1
# lfs getdirstripe -m /mnt/testfs/dsub/d2[0-9][0-9][0-9] | sort | uniq -c
    882 0
    118 1
# lfs getdirstripe -m /mnt/testfs/dsub/d3[0-9][0-9][0-9] | sort | uniq -c
    887 0
    113 1
# lfs getdirstripe -m /mnt/testfs/dsub/d4[0-9][0-9][0-9] | sort | uniq -c
    881 0
    119 1
# lfs getdirstripe -m /mnt/testfs/dsub/d5[0-9][0-9][0-9] | sort | uniq -c
    884 0
    116 1
# lfs getdirstripe -m /mnt/testfs/dsub/d6[0-9][0-9][0-9] | sort | uniq -c
    862 0
    138 1
# lfs getdirstripe -m /mnt/testfs/dsub/d7[0-9][0-9][0-9] | sort | uniq -c
    884 0
    116 1
# lfs getdirstripe -m /mnt/testfs/dsub/d8[0-9][0-9][0-9] | sort | uniq -c
    886 0
    114 1
# lfs getdirstripe -m /mnt/testfs/dsub/d9[0-9][0-9][0-9] | sort | uniq -c
    554 0
     67 1
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;I suspect this is related to &quot;&lt;tt&gt;qos_maxage=60&lt;/tt&gt;&quot; on the client, which keeps it from getting a new space update while &quot;&lt;tt&gt;createmany -d&lt;/tt&gt;&quot; is running, combined with the relatively small amount of space on the MDTs.  However, even after waiting a long time, it still does not allow directories to be created on the MDT that has free space:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# ./createmany -d /mnt/testfs/dsub/d 10000 1000
mkdir(/mnt/testfs/dsub/d10001) error: No space left on device
total: 1 mkdir in 0.01 seconds: 104.60 ops/second
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I think two improvements are needed:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;the QOS code should stop allocating on an MDT before it becomes too full.  An MDT should no longer be selected once its free space or free inodes drop below a minimum of ~10% of the average across all MDTs.  This will avoid hitting &lt;tt&gt;-ENOSPC&lt;/tt&gt; during creation, either from the directory or the llogs.  Since directories take space, either free blocks or free inodes being much lower than average should be a reason not to use the MDT.&lt;/li&gt;
	&lt;li&gt;the default &quot;&lt;tt&gt;qos_threshold_rr=17%&lt;/tt&gt;&quot; is too high to start balancing directory creation across MDTs. This might mean that a large MDT0000 is used for many millions of files and top-level directories before any balancing even starts. At that point it will be harder to restore the balance between the MDTs because so many top-level directories and subdirectories have already been created on MDT0000. I think it would be better to have a smaller &quot;&lt;tt&gt;qos_threshold_rr=5%&lt;/tt&gt;&quot; or &quot;&lt;tt&gt;=10%&lt;/tt&gt;&quot; by default, to avoid the MDTs becoming too imbalanced before QOS starts.&lt;/li&gt;
&lt;/ul&gt;
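&lt;p&gt;As a rough illustration of the first improvement, the sketch below filters out MDTs whose free blocks or inodes fall under a percentage of the average before any weighted selection is made.  It is not the Lustre implementation: &lt;tt&gt;MdtStat&lt;/tt&gt;, &lt;tt&gt;usable_mdts&lt;/tt&gt;, &lt;tt&gt;pick_mdt&lt;/tt&gt; and the &lt;tt&gt;min_pct&lt;/tt&gt; parameter are hypothetical names, and weighting by free blocks is an assumption made only for the example.&lt;/p&gt;

```python
import errno
import os
import random
from dataclasses import dataclass


@dataclass
class MdtStat:
    """Free-space snapshot for one MDT (hypothetical type, not a Lustre structure)."""
    index: int
    blocks_free: int  # free 1K-blocks, as in "lfs df"
    inodes_free: int  # free inodes, as in "lfs df -i"


def usable_mdts(mdts, min_pct=10):
    """Keep MDTs whose free blocks and inodes are at least min_pct of the average."""
    if not mdts:
        return []
    avg_blocks = sum(m.blocks_free for m in mdts) / len(mdts)
    avg_inodes = sum(m.inodes_free for m in mdts) / len(mdts)
    return [
        m for m in mdts
        if m.blocks_free > 0 and m.inodes_free > 0
        and m.blocks_free * 100 >= avg_blocks * min_pct
        and m.inodes_free * 100 >= avg_inodes * min_pct
    ]


def pick_mdt(mdts, min_pct=10, rng=random):
    """Pick one usable MDT at random, weighted by free blocks (assumed weighting)."""
    candidates = usable_mdts(mdts, min_pct)
    if not candidates:
        # Nothing has enough room: report ENOSPC instead of trying a full MDT.
        raise OSError(errno.ENOSPC, os.strerror(errno.ENOSPC))
    weights = [m.blocks_free for m in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


# Free space taken from the "lfs df" output after the failed createmany run.
after_failure = [MdtStat(0, 65560, 70236), MdtStat(1, 0, 49670)]
print([m.index for m in usable_mdts(after_failure)])
```

&lt;p&gt;With the numbers from the failed run, only MDT0000 remains a candidate, so under this assumed policy the mkdir would be sent to MDT0000 instead of failing on the full MDT0001.&lt;/p&gt;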
</description>
                <environment></environment>
        <key id="58701">LU-13439</key>
            <summary>DNE3: MDT QOS tuning to avoid full MDTs completely</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="laisiyao">Lai Siyao</assignee>
                                    <reporter username="adilger">Andreas Dilger</reporter>
                        <labels>
                            <label>dne3</label>
                    </labels>
                <created>Wed, 8 Apr 2020 21:29:17 +0000</created>
                <updated>Sun, 29 May 2022 14:30:25 +0000</updated>
                            <resolved>Wed, 5 May 2021 12:58:40 +0000</resolved>
                                                    <fixVersion>Lustre 2.15.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>2</watches>
                                                                            <comments>
                            <comment id="267199" author="adilger" created="Wed, 8 Apr 2020 21:44:10 +0000"  >&lt;p&gt;I think the other change needed to make the MDT balancing work better is to make remote directory creation much more aggressive in the &quot;&lt;tt&gt;ROOT/&lt;/tt&gt;&quot; directory than the default &quot;&lt;tt&gt;qos_threshold_rr&lt;/tt&gt;&quot;.  That would allow the MDT space balancing to work better for high-level directories and new filesystems, and having a spread across MDTs at the top level reduces the need for a &lt;b&gt;lot&lt;/b&gt; more lower-level remote directories.&lt;/p&gt;</comment>
                            <comment id="269750" author="adilger" created="Sat, 9 May 2020 10:51:14 +0000"  >&lt;p&gt;The patch in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-13417&quot; title=&quot;DNE3: mkdir() automatically create remote directory on MDS which has more space&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-13417&quot;&gt;&lt;del&gt;LU-13417&lt;/del&gt;&lt;/a&gt; should address the &quot;&lt;tt&gt;qos_threshold_rr&lt;/tt&gt;&quot; issue.&#160; It should be noted that setting &quot;&lt;tt&gt;-D -c 1 -i -1&lt;/tt&gt;&quot; on 2.13 and later &lt;b&gt;will&lt;/b&gt; round-robin subdirectories across all MDTs if they are evenly balanced (e.g. at format time) and will use QOS to balance across MDTs if their free space imbalance exceeds &lt;tt&gt;qos_threshold_rr&lt;/tt&gt;.&lt;/p&gt;</comment>
                            <comment id="299455" author="adilger" created="Thu, 22 Apr 2021 03:02:52 +0000"  >&lt;p&gt;Lai, I had an idea about this that I think will help a lot. For the DNE auto-remote directory creation (&lt;tt&gt;-i -1 -c 1&lt;/tt&gt;) it should &lt;b&gt;only&lt;/b&gt; create a remote subdirectory if the MDT of the parent directory is more full than other MDTs (e.g. parent MDT has less than the average free space/inodes of other MDTs). It doesn&apos;t make sense to &quot;space balance&quot; a subdirectory if the parent is already on an MDT that is less full than other MDTs. &lt;/p&gt;
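
&lt;p&gt;The check described above could look roughly like the sketch below.  It is only an illustration: &lt;tt&gt;should_create_remote&lt;/tt&gt; and the &lt;tt&gt;free&lt;/tt&gt; mapping are hypothetical, and treating a shortfall in either blocks or inodes as &quot;more full&quot; is an assumption, not something taken from the patch.&lt;/p&gt;

```python
def should_create_remote(parent_idx, free):
    """Decide whether a new subdirectory should leave the parent's MDT.

    free maps MDT index -> (free_blocks, free_inodes); a hypothetical
    input shape chosen for this sketch.
    """
    others = [v for i, v in free.items() if i != parent_idx]
    if not others:
        return False  # single MDT: there is nowhere else to go
    avg_blocks = sum(b for b, _ in others) / len(others)
    avg_inodes = sum(n for _, n in others) / len(others)
    parent_blocks, parent_inodes = free[parent_idx]
    # Assumption: go remote only if the parent MDT has less free blocks
    # or less free inodes than the average of the other MDTs.
    return avg_blocks > parent_blocks or avg_inodes > parent_inodes


# Free space from the first "lfs df" / "lfs df -i" output in the description.
free = {0: (104624, 79705), 1: (20572, 59420)}
print(should_create_remote(0, free), should_create_remote(1, free))
```

&lt;p&gt;With those numbers a parent on MDT0000 keeps its subdirectories local, while a parent on MDT0001 would have them created remotely.&lt;/p&gt;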

&lt;p&gt;This will also help reduce the number of remote subdirectories that are created. &lt;/p&gt;</comment>
                            <comment id="299678" author="gerrit" created="Sun, 25 Apr 2021 11:03:16 +0000"  >&lt;p&gt;Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/43445&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/43445&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-13439&quot; title=&quot;DNE3: MDT QOS tuning to avoid full MDTs completely&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-13439&quot;&gt;&lt;del&gt;LU-13439&lt;/del&gt;&lt;/a&gt; lmv: qos stay on current MDT if less full&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: d656423a0b640d9427693efa7c16c26ed6d9ea9a&lt;/p&gt;</comment>
                            <comment id="300520" author="gerrit" created="Wed, 5 May 2021 02:52:22 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/43445/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/43445/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-13439&quot; title=&quot;DNE3: MDT QOS tuning to avoid full MDTs completely&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-13439&quot;&gt;&lt;del&gt;LU-13439&lt;/del&gt;&lt;/a&gt; lmv: qos stay on current MDT if less full&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 3f6fc483013da443b1494d81efe2d271ac67f901&lt;/p&gt;</comment>
                            <comment id="300551" author="pjones" created="Wed, 5 May 2021 12:58:40 +0000"  >&lt;p&gt;Landed for 2.15&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="58656">LU-13417</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="70278">LU-15850</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="64648">LU-14762</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="58702">LU-13440</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i00xhr:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>