<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:55:38 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-12785] DOM2: dynamic DoM component size as MDT becomes full</title>
                <link>https://jira.whamcloud.com/browse/LU-12785</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;As the MDT becomes full, it makes sense to reduce the size of DoM components, or remove them from the layout completely, when they were created from a default directory layout or the root filesystem layout.  I think a reasonable heuristic would be that if the percentage of free inodes is larger than the percentage of free space, the size of the DoM component can be increased (up to the &lt;tt&gt;mdt.&amp;#42;.dom_stripesize&lt;/tt&gt; maximum).  If the percentage of free inodes is smaller than the percentage of free space, or if the MDT is within a configurable threshold (e.g. &lt;tt&gt;mdt.&amp;#42;.dom_threshold&lt;/tt&gt;=10%) of being full, the DoM component size should be cut in half, and within 1/2 of &lt;tt&gt;mdt.&amp;#42;.dom_threshold&lt;/tt&gt; the DoM component should be removed (or similar, see more complex options below).&lt;/p&gt;
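
&lt;p&gt;As a rough illustration of the threshold part of this heuristic only (not the inode/space comparison), a minimal sketch could look like the following.  The function name, its parameters, and the &lt;tt&gt;DOM_MIN_SIZE&lt;/tt&gt; macro are hypothetical and are not existing Lustre code; the sketch assumes free space is given as a whole percentage and that the result must stay a multiple of 64KiB.&lt;/p&gt;

```c
/* Hypothetical helper, not Lustre code: shrink a default DoM component
 * size as the MDT free space drops below a configurable threshold.
 * free_pct and threshold_pct are percentages of MDT space (0..100),
 * cur_size is the current DoM component size in bytes.
 */
#define DOM_MIN_SIZE 65536UL /* assumed equal to LOV_MIN_STRIPE_SIZE (64KiB) */

unsigned long dom_size_by_threshold(unsigned long cur_size,
                                    unsigned int free_pct,
                                    unsigned int threshold_pct)
{
        /* more free space than the threshold: leave the size alone */
        if (free_pct > threshold_pct)
                return cur_size;

        /* within half of the threshold: drop the DoM component */
        if (threshold_pct / 2 >= free_pct)
                return 0;

        /* within the threshold: halve, rounding down to a 64KiB multiple,
         * so a 64KiB component halves to 0 and is effectively removed */
        return (cur_size / 2) / DOM_MIN_SIZE * DOM_MIN_SIZE;
}
```

&lt;p&gt;With a 10% threshold, a 1MiB component would stay at 1MiB while 50% of the space is free, shrink to 512KiB at 8% free, and be dropped at 5% free.&lt;/p&gt;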

&lt;p&gt;Note that the DoM component size must be a multiple of &lt;tt&gt;LOV_MIN_STRIPE_SIZE&lt;/tt&gt; (64KiB) so it will not be possible to exactly match the inode ratio with the blocks ratio, but it makes sense to keep them relatively well balanced by default.&lt;/p&gt;

&lt;p&gt;One possible policy is that each 1/4 reduction in free space below &lt;tt&gt;mdt.&amp;#42;.dom_threshold&lt;/tt&gt; should reduce the DoM component size by 1/2 until it is below the 64KiB minimum component size.  That would ensure that the ldiskfs MDT+DoM filesystem is not completely filled with DoM data as it gets close to full.  This is most critical for ldiskfs filesystems, since ZFS has dynamic inode allocation, but it can still help ZFS avoid being totally filled by DoM data.  This should also be helped by &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12624&quot; title=&quot;DNE3: striped directory allocate stripes by QoS&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12624&quot;&gt;&lt;del&gt;LU-12624&lt;/del&gt;&lt;/a&gt;, which balances DNE directory allocations across MDTs, but that is only a coarse-grained balance and will not prevent MDTs from filling with DoM data too quickly.&lt;/p&gt;</description>
                <environment></environment>
        <key id="56945">LU-12785</key>
            <summary>DOM2: dynamic DoM component size as MDT becomes full</summary>
                <type id="4" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11310&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="tappro">Mikhail Pershin</assignee>
                                    <reporter username="tappro">Mikhail Pershin</reporter>
                        <labels>
                            <label>DoM2</label>
                    </labels>
                <created>Wed, 18 Sep 2019 19:03:09 +0000</created>
                <updated>Thu, 18 Mar 2021 02:13:05 +0000</updated>
                            <resolved>Fri, 19 Jun 2020 22:08:56 +0000</resolved>
                                                    <fixVersion>Lustre 2.14.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>4</watches>
                                                                            <comments>
                            <comment id="259409" author="adilger" created="Sat, 7 Dec 2019 01:03:23 +0000"  >&lt;p&gt;Mike, what is the behavior of DoM today if, say, one were to enable a default &quot;&lt;tt&gt;-E 64K -L mdt&lt;/tt&gt;&quot; component on an MDT that was formatted with only the old default 2.5KB per inode?&lt;/p&gt;

&lt;p&gt;1)  Would the files get &lt;tt&gt;ENOSPC&lt;/tt&gt; errors when the MDT was totally full, and the filesystem would be unusable?&lt;br/&gt;
2) Would the &lt;tt&gt;mdt&lt;/tt&gt; component be automatically dropped when the filesystem was totally full (allowing some limited use, but there would be no free space for directory/changelog block allocations)?&lt;br/&gt;
3) Is there some free blocks threshold on the MDT below which DoM will drop the &lt;tt&gt;mdt&lt;/tt&gt; component, but reserve some space for non-DoM allocations so the filesystem can continue to work?&lt;/p&gt;

&lt;p&gt;If it is not #3, it seems that this would be relatively easy to implement and backport to a 2.12.x release, so that it would be possible to enable DoM by default on all filesystems; if the MDT wasn&apos;t formatted for it, it would just revert to non-DoM behavior for most files.&lt;/p&gt;</comment>
                            <comment id="259594" author="tappro" created="Wed, 11 Dec 2019 12:36:44 +0000"  >&lt;p&gt;Andreas, the DoM size is currently limited only by &lt;tt&gt;lod.*.dom_stripesize&lt;/tt&gt;, so this can be implemented in any of those ways. Once a DoM threshold is introduced, it will be possible to limit the component size or drop the component.&lt;/p&gt;</comment>
                            <comment id="262673" author="adilger" created="Wed, 5 Feb 2020 21:04:17 +0000"  >&lt;p&gt;Note that while the internal variable is named &quot;&lt;tt&gt;lod&amp;#95;dom&amp;#95;max&amp;#95;stripesize&lt;/tt&gt;&quot;, the userspace tunable parameter is actually named &quot;&lt;tt&gt;dom_stripesize&lt;/tt&gt;&quot;, which is confusing for everyone.  I would normally suggest fixing such a mismatch by renaming the internal variable to match the tunable, so that searching for the name finds both the internal variable and the parameter handling functions.  In this case, however, I think the &quot;&lt;tt&gt;&amp;#95;max&amp;#95;&lt;/tt&gt;&quot; part of the name is important for both the code and the user&apos;s understanding of what that parameter does.&lt;/p&gt;

&lt;p&gt;I think it would be useful to submit a patch to add a second &quot;&lt;tt&gt;dom&amp;#95;stripesize&amp;#95;max&lt;/tt&gt;&quot; tunable for userspace that also sets the &lt;tt&gt;lod_dom_stripesize&amp;#95;max&lt;/tt&gt;, then add a warning message in the next release if &quot;&lt;tt&gt;dom_stripesize&lt;/tt&gt;&quot; is used, and eventually deprecate/remove the &quot;&lt;tt&gt;dom&amp;#95;stripesize&lt;/tt&gt;&quot; tunable.  We might consider naming the new tunable &quot;&lt;tt&gt;dom&amp;#95;stripesize&amp;#95;max&amp;#95;kb&lt;/tt&gt;&quot;, since it doesn&apos;t really make sense to store it in units of bytes (currently it must always be a multiple of 64KB).&lt;/p&gt;</comment>
                            <comment id="262686" author="adilger" created="Thu, 6 Feb 2020 00:33:22 +0000"  >&lt;p&gt;I think it would be useful to have a setting like &quot;&lt;tt&gt;dom&amp;#95;stripesize=-1&lt;/tt&gt;&quot; (possibly set as the default), together with a basic helper function called by &lt;tt&gt;lod_fix_dom_stripe()&lt;/tt&gt; like the following, which could be improved later if needed:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
&lt;span class=&quot;code-comment&quot;&gt;/* Max files created before dom_max_stripesize is recalculated */&lt;/span&gt;
unsigned &lt;span class=&quot;code-object&quot;&gt;long&lt;/span&gt; lod_dom_max_stripesize_recalc_count = 1048576;

unsigned &lt;span class=&quot;code-object&quot;&gt;int&lt;/span&gt; lod_dom_stripesize_tune(struct lod_device *lod)
{
        unsigned &lt;span class=&quot;code-object&quot;&gt;long&lt;/span&gt; avg_free_kb;

        &lt;span class=&quot;code-comment&quot;&gt;/* autotune is disabled by a specific max_stripesize set by user */&lt;/span&gt;
        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (lod-&amp;gt;lod_dom_max_stripesize != -1)
                &lt;span class=&quot;code-keyword&quot;&gt;return&lt;/span&gt; lod-&amp;gt;lod_dom_max_stripesize;

        &lt;span class=&quot;code-comment&quot;&gt;/*  &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;code-keyword&quot;&gt;this&lt;/span&gt; has never been set, then block &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; one thread to finish it */&lt;/span&gt;
        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (unlikely(lod-&amp;gt;lod_dom_stripesize_tune == 0)) {
                spin_lock(&amp;amp;lod-&amp;gt;lod_dom_stripesize_tune_lock);
                &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (lod-&amp;gt;lod_dom_stripesize_tune)
                        &lt;span class=&quot;code-keyword&quot;&gt;goto&lt;/span&gt; out_unlock;
        &lt;span class=&quot;code-comment&quot;&gt;/* don&apos;t really care &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;code-keyword&quot;&gt;this&lt;/span&gt; check is racy on SMP &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; there is _some_ limit set */&lt;/span&gt;
        } &lt;span class=&quot;code-keyword&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (++lod-&amp;gt;lod_dom_stripesize_count &amp;lt; lod-&amp;gt;lod_dom_stripesize_recalc ||
                   !spin_trylock(&amp;amp;lod-&amp;gt;lod_dom_stripesize_tune_lock)) {
                &lt;span class=&quot;code-keyword&quot;&gt;goto&lt;/span&gt; out;
        }
        lod-&amp;gt;lod_dom_stripesize_count = 0;

        &lt;span class=&quot;code-comment&quot;&gt;/* I _think_ statfs is always cached by &lt;span class=&quot;code-keyword&quot;&gt;this&lt;/span&gt; point, but that should be checked */&lt;/span&gt;

        avg_free_kb = osfs-&amp;gt;os_bavail * (osfs-&amp;gt;os_bsize &amp;gt;&amp;gt; 10) / (osfs-&amp;gt;os_ffree + 1);

        /* This algorithm may need to change &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; ZFS, since we estimate the value
         *      os_ffree = os_bfree * usedobjs / usedblocks
         *      avg_used_kb = usedblocks / usedobjs
         * so,
         *     avg_free_kb = avg_used_kb * (os_bavail / os_bfree)
         * which means avg_free_kb will always be lower than avg_used_kb so the
         * lod_dom_stripesize_tune will never increase (inode count will just grow),
         * which is bad since ZFS is much more flexible with allocation than ldiskfs...
         */
        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (avg_free_kb &amp;lt; lod-&amp;gt;lod_dom_stripesize_tune * 3 / 4 ||
            avg_free_kb &amp;gt;= lod-&amp;gt;lod_dom_stripesize_tune * 9 / 4)
                &lt;span class=&quot;code-comment&quot;&gt;/* avg_free_kb is in KiB, so convert the byte-sized limits to KiB too */&lt;/span&gt;
                lod-&amp;gt;lod_dom_stripesize_tune =
                        min(avg_free_kb &amp;amp; ~((LOV_MIN_STRIPE_SIZE &amp;gt;&amp;gt; 10) - 1),
                            DT_MAX_BRW_SIZE &amp;gt;&amp;gt; 10);

        &lt;span class=&quot;code-comment&quot;&gt;/* allow at most 10% of the filesystem to be used before recalculating */&lt;/span&gt;
        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (lod-&amp;gt;lod_dom_stripesize_tune &amp;gt; 0) {
                lod-&amp;gt;lod_dom_stripesize_recalc =
                        min(osfs-&amp;gt;os_bavail * (osfs-&amp;gt;os_bsize &amp;gt;&amp;gt; 10) /
                            (lod-&amp;gt;lod_dom_stripesize_tune * 10),
                            lod-&amp;gt;lod_dom_stripesize_recalc_count);
        } &lt;span class=&quot;code-keyword&quot;&gt;else&lt;/span&gt; {
                lod-&amp;gt;lod_dom_stripesize_recalc = lod-&amp;gt;lod_dom_stripesize_recalc_count;
        }
out_unlock:
        spin_unlock(&amp;amp;lod-&amp;gt;lod_dom_stripesize_tune_lock);
out:
        &lt;span class=&quot;code-keyword&quot;&gt;return&lt;/span&gt; lod-&amp;gt;lod_dom_stripesize_tune;
}
&lt;/pre&gt;
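&lt;p&gt;To make the tuning arithmetic in the helper above easier to check by hand, here is a stand-alone sketch of just the calculation, without locking or statfs handling.  The function names, the &lt;tt&gt;MIN_STRIPE_KB&lt;/tt&gt; macro, and the &lt;tt&gt;max_kb&lt;/tt&gt; parameter are hypothetical; the sketch assumes all sizes are kept in KiB and that &lt;tt&gt;max_kb&lt;/tt&gt; stands in for the maximum bulk RPC size.&lt;/p&gt;

```c
/* Hypothetical stand-alone version of the tuning arithmetic, not Lustre
 * code.  All sizes are in KiB.
 */
#define MIN_STRIPE_KB 64UL /* assumed LOV_MIN_STRIPE_SIZE expressed in KiB */

/* Average free space (KiB) per free inode; the +1 avoids division by zero. */
unsigned long dom_avg_free_kb(unsigned long bavail, unsigned long bsize,
                              unsigned long ffree)
{
        return bavail * (bsize / 1024) / (ffree + 1);
}

/* Return the new tuned limit: keep cur_kb while avg_kb stays within
 * [3/4, 9/4) of it, otherwise round avg_kb down to a 64KiB multiple
 * and cap it at max_kb.
 */
unsigned long dom_tune_kb(unsigned long cur_kb, unsigned long avg_kb,
                          unsigned long max_kb)
{
        unsigned long new_kb;

        /* hysteresis band: small changes do not move the limit */
        if (avg_kb >= cur_kb * 3 / 4) {
                if (cur_kb * 9 / 4 > avg_kb)
                        return cur_kb;
        }

        new_kb = avg_kb / MIN_STRIPE_KB * MIN_STRIPE_KB;
        return new_kb > max_kb ? max_kb : new_kb;
}
```

&lt;p&gt;For example, with 1000000 free 4KiB blocks and 499999 free inodes the average is 8KiB per inode, which rounds down to 0 and so would leave no room for a DoM component, while an average of 300KiB per inode starting from an unset limit would give a 256KiB limit.&lt;/p&gt;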
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="262688" author="adilger" created="Thu, 6 Feb 2020 01:53:08 +0000"  >&lt;blockquote&gt;
&lt;p&gt; This algorithm may need to change for ZFS, since we estimate the value&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;One option would be to set an &lt;tt&gt;OS_STATE_FILES_EST&lt;/tt&gt; flag in the &lt;tt&gt;osfs-&amp;gt;os_state&lt;/tt&gt; field, so that this code (and clients as well) can see that the &lt;tt&gt;os_files&lt;/tt&gt; field is estimated (and by extension &lt;tt&gt;os_ffree&lt;/tt&gt; as well), and use a different algorithm for deciding the maximum &lt;tt&gt;lod_dom_stripesize_tune&lt;/tt&gt; value (perhaps just limiting it to &lt;tt&gt;MD_MAX_BRW_SIZE&lt;/tt&gt; by default or a static value set by the admin).&lt;/p&gt;</comment>
                            <comment id="265178" author="gerrit" created="Thu, 12 Mar 2020 11:22:41 +0000"  >&lt;p&gt;Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/37904&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/37904&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12785&quot; title=&quot;DOM2: dynamic DoM component size as MDT becomes full&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12785&quot;&gt;&lt;del&gt;LU-12785&lt;/del&gt;&lt;/a&gt; dom: adjust DOM stripe size by free space&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 9eef5743de88857ff7f3c05d2bb2b7c0d2bd5d41&lt;/p&gt;</comment>
                            <comment id="268344" author="gerrit" created="Thu, 23 Apr 2020 12:48:58 +0000"  >&lt;p&gt;Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/38337&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/38337&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12785&quot; title=&quot;DOM2: dynamic DoM component size as MDT becomes full&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12785&quot;&gt;&lt;del&gt;LU-12785&lt;/del&gt;&lt;/a&gt; dom: fix DoM component deletion code&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: f4e21c851bd4b0f0cae6b13223ae62e43fb335fc&lt;/p&gt;</comment>
                            <comment id="269490" author="gerrit" created="Thu, 7 May 2020 05:42:09 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/37904/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/37904/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12785&quot; title=&quot;DOM2: dynamic DoM component size as MDT becomes full&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12785&quot;&gt;&lt;del&gt;LU-12785&lt;/del&gt;&lt;/a&gt; dom: adjust DOM stripe size by free space&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: f2a3bbfb3f3fef910201259dd1827bf8c475da06&lt;/p&gt;</comment>
                            <comment id="273313" author="gerrit" created="Fri, 19 Jun 2020 16:50:24 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/38337/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/38337/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12785&quot; title=&quot;DOM2: dynamic DoM component size as MDT becomes full&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12785&quot;&gt;&lt;del&gt;LU-12785&lt;/del&gt;&lt;/a&gt; dom: fix DoM component deletion code&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: b24ba6c6ea1b3cc514241b01968bf31bc8f9cf46&lt;/p&gt;</comment>
                            <comment id="273371" author="pjones" created="Fri, 19 Jun 2020 22:08:56 +0000"  >&lt;p&gt;Landed for 2.14&lt;/p&gt;</comment>
                            <comment id="279812" author="gerrit" created="Thu, 17 Sep 2020 11:40:00 +0000"  >&lt;p&gt;Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/39958&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/39958&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12785&quot; title=&quot;DOM2: dynamic DoM component size as MDT becomes full&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12785&quot;&gt;&lt;del&gt;LU-12785&lt;/del&gt;&lt;/a&gt; dom: adjust DOM stripe size by free space&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 2a9ebe70d33e6f02cb7db7d3d810c16ef40e587a&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="56556">LU-12624</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="57575">LU-13058</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i00mz3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>