<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:05:26 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-13932] reduce maximum wait_time for MMP recovery</title>
                <link>https://jira.whamcloud.com/browse/LU-13932</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;When the MMP block is written, &lt;tt&gt;mmp_check_interval&lt;/tt&gt; is computed as &lt;tt&gt;EXT4_MMP_CHECK_MULT = 2&lt;/tt&gt; times the actual IO completion time:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
        mmp_check_interval = max(min(EXT4_MMP_CHECK_MULT * diff / HZ, 
                                     EXT4_MMP_MAX_CHECK_INTERVAL),
                                 EXT4_MMP_MIN_CHECK_INTERVAL);
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
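&lt;p&gt;As a minimal userspace sketch of the clamp above: the minimum (5s) and maximum (300s) values are assumptions inferred from the figures quoted in this ticket, and &lt;tt&gt;SKETCH_HZ&lt;/tt&gt;, &lt;tt&gt;ul_min()&lt;/tt&gt;, &lt;tt&gt;ul_max()&lt;/tt&gt; and &lt;tt&gt;sketch_check_interval()&lt;/tt&gt; are hypothetical names that do not exist in ext4.&lt;/p&gt;

```c
/* Constants as assumed from the figures quoted in this ticket. */
#define EXT4_MMP_CHECK_MULT          2UL
#define EXT4_MMP_MIN_CHECK_INTERVAL  5UL    /* seconds; assumed */
#define EXT4_MMP_MAX_CHECK_INTERVAL  300UL  /* seconds; assumed */

/* Hypothetical stand-in for the kernel HZ value (jiffies per second). */
#define SKETCH_HZ 1000UL

static unsigned long ul_min(unsigned long a, unsigned long b)
{
        return a > b ? b : a;
}

static unsigned long ul_max(unsigned long a, unsigned long b)
{
        return a > b ? a : b;
}

/*
 * diff_jiffies is the time the MMP block write took, in jiffies.
 * The result is twice that time in seconds, clamped to [MIN, MAX].
 */
static unsigned long sketch_check_interval(unsigned long diff_jiffies)
{
        unsigned long scaled = EXT4_MMP_CHECK_MULT * diff_jiffies / SKETCH_HZ;

        return ul_max(ul_min(scaled, EXT4_MMP_MAX_CHECK_INTERVAL),
                      EXT4_MMP_MIN_CHECK_INTERVAL);
}
```

&lt;p&gt;Under these assumptions a 10s write (10000 jiffies) gives a 20s interval, a 0.1s write is raised to the 5s minimum, and a 400s write is capped at 300s.&lt;/p&gt;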
&lt;p&gt;Later, during MMP recovery after a crash, the &lt;tt&gt;wait_time&lt;/tt&gt; is computed as the smaller of 2x &lt;tt&gt;mmp_check_interval&lt;/tt&gt; plus 1s or &lt;tt&gt;mmp_check_interval&lt;/tt&gt; plus 60s:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
        wait_time = min(mmp_check_interval * 2 + 1,
                        mmp_check_interval + 60);

        &lt;span class=&quot;code-comment&quot;&gt;/* Print MMP interval &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; more than 20 secs. */&lt;/span&gt;
        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (wait_time &amp;gt; EXT4_MMP_MIN_CHECK_INTERVAL * 4)
                ext4_warning(sb, &lt;span class=&quot;code-quote&quot;&gt;&quot;MMP interval %u higher than expected, please&quot;&lt;/span&gt;
                             &lt;span class=&quot;code-quote&quot;&gt;&quot; wait.\n&quot;&lt;/span&gt;, wait_time * 2);
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;There should be &lt;em&gt;some&lt;/em&gt; margin to compensate for nodes that became busier after the last time the MMP block was updated, but this seems excessive, given that we also need to wait twice &lt;em&gt;that&lt;/em&gt; interval in order to finish recovery (once to detect if the MMP block is idle, and once again after writing our own MMP block to detect races with other nodes also trying to mount the filesystem). With &lt;tt&gt;mmp_check_interval&lt;/tt&gt; at its 300s maximum, &lt;tt&gt;wait_time&lt;/tt&gt; is 360s, so the mount may take 12 minutes (720s) after all of the doublings are taken into account.&lt;/p&gt;
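&lt;p&gt;The 720s figure can be traced with a small sketch of the current formula. It assumes that &lt;tt&gt;mmp_check_interval&lt;/tt&gt; is capped at 300s and that recovery waits exactly twice; &lt;tt&gt;sketch_wait_time()&lt;/tt&gt; and &lt;tt&gt;sketch_mount_delay()&lt;/tt&gt; are hypothetical helper names.&lt;/p&gt;

```c
/* Assumed upper bound for mmp_check_interval, in seconds. */
#define EXT4_MMP_MAX_CHECK_INTERVAL 300UL

static unsigned long ul_min(unsigned long a, unsigned long b)
{
        return a > b ? b : a;
}

/* Current formula: the smaller of 2x + 1 and x + 60 seconds. */
static unsigned long sketch_wait_time(unsigned long check_interval)
{
        return ul_min(check_interval * 2 + 1, check_interval + 60);
}

/*
 * Recovery waits twice: once to see that the MMP block is idle and
 * once more after writing our own MMP block.
 */
static unsigned long sketch_mount_delay(unsigned long check_interval)
{
        return 2 * sketch_wait_time(check_interval);
}
```

&lt;p&gt;With these assumptions, &lt;tt&gt;sketch_mount_delay(EXT4_MMP_MAX_CHECK_INTERVAL)&lt;/tt&gt; evaluates to 2 * (300 + 60) = 720 seconds, while a 5s interval gives 2 * 11 = 22 seconds.&lt;/p&gt;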

&lt;p&gt;We don&apos;t really need to increase the &lt;tt&gt;wait_time&lt;/tt&gt; by a factor of two or 60s. It would be enough to use e.g. &lt;tt&gt;min(mmp_check_interval * 2, mmp_check_interval + 20)&lt;/tt&gt; or similar, given that the second value takes precedence as soon as &lt;tt&gt;mmp_check_interval&lt;/tt&gt; is above 20s. This would reduce the maximum wait interval to 640s (-80s).&lt;/p&gt;
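&lt;p&gt;A rough check of the proposed formula, again assuming a 300s cap on &lt;tt&gt;mmp_check_interval&lt;/tt&gt; and two waits per recovery. The names &lt;tt&gt;sketch_proposed_wait_time()&lt;/tt&gt; and &lt;tt&gt;sketch_proposed_mount_delay()&lt;/tt&gt; are hypothetical and only illustrate the arithmetic.&lt;/p&gt;

```c
/* Assumed upper bound for mmp_check_interval, in seconds. */
#define EXT4_MMP_MAX_CHECK_INTERVAL 300UL

static unsigned long ul_min(unsigned long a, unsigned long b)
{
        return a > b ? b : a;
}

/* Proposed formula: the smaller of 2x and x + 20 seconds. */
static unsigned long sketch_proposed_wait_time(unsigned long check_interval)
{
        return ul_min(check_interval * 2, check_interval + 20);
}

/* Recovery still waits twice, so the total delay is 2x the wait_time. */
static unsigned long sketch_proposed_mount_delay(unsigned long check_interval)
{
        return 2 * sketch_proposed_wait_time(check_interval);
}
```

&lt;p&gt;At the assumed 300s cap this gives 2 * (300 + 20) = 640 seconds, which is 80 seconds less than the 720 seconds of the current formula.&lt;/p&gt;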

&lt;p&gt;Also, based on the MMP code in ZFS, it probably makes sense for the &lt;tt&gt;mmp_check_interval&lt;/tt&gt; written by &lt;tt&gt;ext4_kmmpd()&lt;/tt&gt; to use a decaying average time rather than just the most recent interval. That would avoid writing a very short interval during a period of alternating short and long IO submission times (e.g. due to fluctuating load or intermittent IO errors). Something like:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
        new_check_interval = EXT4_MMP_CHECK_MULT * diff / HZ;
        /* Increase mmp_check_interval immediately &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; IO completion time
         * is longer, but decay slowly to minimum &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; it is shorter.
         */
        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (new_check_interval &amp;gt;= mmp_check_interval)
                mmp_check_interval = min(new_check_interval,
                                         EXT4_MMP_MAX_CHECK_INTERVAL);
        &lt;span class=&quot;code-keyword&quot;&gt;else&lt;/span&gt;
                mmp_check_interval = (mmp_check_interval * 15 +
                                      max(EXT4_MMP_MIN_CHECK_INTERVAL,
                                          new_check_interval)) / 16;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</description>
                <environment></environment>
        <key id="60518">LU-13932</key>
            <summary>reduce maximum wait_time for MMP recovery</summary>
                <type id="4" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11310&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="adilger">Andreas Dilger</reporter>
                        <labels>
                            <label>e2fsprogs</label>
                            <label>easy</label>
                            <label>ldiskfs</label>
                    </labels>
                <created>Fri, 28 Aug 2020 02:45:05 +0000</created>
                <updated>Fri, 7 Oct 2022 22:47:53 +0000</updated>
                                                                                <due></due>
                            <votes>0</votes>
                                    <watches>2</watches>
                                                                                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                                        </outwardlinks>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i018jb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>