<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:46:41 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-11760] formatted OST recognition change</title>
                <link>https://jira.whamcloud.com/browse/LU-11760</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;During failover/failback testing (stripe count 2), mdtest jobs failed repeatedly because of missing OST objects.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;V-1: Entering unique_dir_access...
V-1: Entering mdtest_stat...
08/14/2018 21:46:25: Process 25(nid00016): FAILED in mdtest_stat, unable to stat file: No such file or directory
08/14/2018 21:46:25: Process 30(nid00045): FAILED in mdtest_stat, unable to stat file: No such file or directory&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;The failure occurred because a range of objects was absent on one of the OSTs.&lt;/p&gt;

&lt;p&gt;A possible marker of the failure is the following message, logged on the OST after recovery:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Aug 14 21:46:07 snx11205n004 kernel: format at ofd_dev.c:1713:ofd_create_hdl doesn&apos;t end in newline &lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
</description>
                <environment></environment>
        <key id="54255">LU-11760</key>
            <summary>formatted OST recognition change</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="scherementsev">Sergey Cheremencev</assignee>
                                    <reporter username="scherementsev">Sergey Cheremencev</reporter>
                        <labels>
                            <label>patch</label>
                    </labels>
                <created>Tue, 11 Dec 2018 14:24:15 +0000</created>
                <updated>Thu, 12 Sep 2019 04:18:15 +0000</updated>
                            <resolved>Wed, 21 Aug 2019 11:26:27 +0000</resolved>
                                                    <fixVersion>Lustre 2.13.0</fixVersion>
                    <fixVersion>Lustre 2.12.3</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>2</watches>
                                                                            <comments>
                            <comment id="238379" author="sergey" created="Tue, 11 Dec 2018 14:27:24 +0000"  >&lt;p&gt;The failures are caused by an uncommitted journal transaction.&lt;br/&gt;
By default, jbd2 commits a transaction every 5 seconds or when the journal buffers are full.&lt;br/&gt;
Creating one object doesn&apos;t need much space (inode size + direntry). Thus, under mdtest load, the&lt;br/&gt;
journal buffers usually don&apos;t fill up and a commit occurs every 5 seconds.&lt;br/&gt;
Because the OST creates objects asynchronously (except in several rare cases), we can get a situation&lt;br/&gt;
where the MDS thinks it has successfully created 100 000 objects even though these objects have not been written to disk on the OST side.&lt;br/&gt;
In case of failover, these uncommitted objects could be lost and should be recreated after the connection from OST to MDT is established.&lt;br/&gt;
The MDT sends LAST_ID to the OST, which should recreate the missing objects.&lt;br/&gt;
This scheme has worked fine for a long time; however, there is an exception for the case when the difference between MDS_Last_ID and OST_Last_ID is bigger than 100 000 (5 * OST_MAX_PRECREATE).&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;                /* This can happen if a new OST is formatted and installed
                 * in place of an old one at the same index.  Instead of
                 * precreating potentially millions of deleted old objects
                 * (possibly filling the OST), only precreate the last batch.
                 * LFSCK will eventually clean up any orphans. LU-14 */
                if (diff &amp;gt; 5 * OST_MAX_PRECREATE) {
                        diff = OST_MAX_PRECREATE / 2; 
                        LCONSOLE_WARN(&quot;%s: Too many FIDs to precreate &quot;
                                      &quot;OST replaced or reformatted: &quot;
                                      &quot;LFSCK will clean up&quot;,
                                      ofd_name(ofd)); 
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;In that case the OST decides it has been reformatted and recreates only the last 10 000 objects (OST_MAX_PRECREATE / 2). For us this means we can have huge gaps in OST objects.&lt;br/&gt;
I think the 100 000 constant was appropriate for older systems and should be revised.&lt;/p&gt;</comment>
                            <comment id="238380" author="gerrit" created="Tue, 11 Dec 2018 14:29:25 +0000"  >&lt;p&gt;Sergey Cheremencev (c17829@cray.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/33833&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/33833&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11760&quot; title=&quot;formatted OST recognition change&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11760&quot;&gt;&lt;del&gt;LU-11760&lt;/del&gt;&lt;/a&gt; ofd: formatted OST recognition change&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 829732b1fd8f11e4a4aa5bc7ef6509c4aa771dc3&lt;/p&gt;</comment>
                            <comment id="247688" author="gerrit" created="Sat, 25 May 2019 04:55:53 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/33833/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/33833/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11760&quot; title=&quot;formatted OST recognition change&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11760&quot;&gt;&lt;del&gt;LU-11760&lt;/del&gt;&lt;/a&gt; ofd: formatted OST recognition change&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: d07d9c5ed0aa1d6614944c7d1e0ca55cba301dc4&lt;/p&gt;</comment>
                            <comment id="247703" author="pjones" created="Sat, 25 May 2019 05:36:39 +0000"  >&lt;p&gt;Landed for 2.13&lt;/p&gt;</comment>
                            <comment id="250206" author="adilger" created="Thu, 27 Jun 2019 22:07:29 +0000"  >&lt;p&gt;I think that this is not a full solution to the problem, and is causing a lot of test failures for conf-sanity test_69 for OSTs that do not have 500k free inodes.&lt;/p&gt;

&lt;p&gt;I think there are two problems with the patch that was landed:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;this essentially is changing &lt;tt&gt;OST_MAX_PRECREATE&lt;/tt&gt; on the OST side, but not on the MDS, so isn&apos;t properly handling the recovery&lt;/li&gt;
	&lt;li&gt;the root of the problem is not the number of objects created &lt;em&gt;per RPC&lt;/em&gt; but rather the number of objects created &lt;em&gt;per commit&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;I don&apos;t mind to have a change that is increasing the maximum number of objects created per commit, but this needs to be negotiated between the MDS and OSS at connect time, with a new &lt;tt&gt;OBD_CONNECT2_MAX_PRECREATE&lt;/tt&gt; flag and an &lt;tt&gt;ocd_max_precreate&lt;/tt&gt; field (probably using &lt;tt&gt;__u32 padding1&lt;/tt&gt;) that passes the current &lt;tt&gt;OST_MAX_PRECREATE&lt;/tt&gt; value, and use &lt;tt&gt;OST_MAX_PRECREATE_OLD = 20000&lt;/tt&gt; if the feature is not available.  This also allows fixing the problem properly in older releases.&lt;/p&gt;

&lt;p&gt;Until this is fixed (and even afterward), the proper solution is to track on the OST how many objects have been created by each MDT/sequence within the current transaction (use a commit callback to reset the &quot;created this transaction&quot; counter to zero), and force a sync journal commit during precreate if the number exceeds &lt;tt&gt;OST_MAX_PRECREATE&lt;/tt&gt;.  If there are multiple MDTs creating, or the create rate is not too large, or if the clients do some IO and force a transaction commit anyway, then there is no need for a commit.  This also avoids the future bug when 500k+ precreates can happen within a single commit, since we are already over 150k/s creates on the MDS, and that is processing separate RPCs while the OST is in a fast local loop precreating objects.&lt;/p&gt;

&lt;p&gt;I &lt;em&gt;was&lt;/em&gt; going to say that the &lt;tt&gt;if (diff &amp;gt; 500000)&lt;/tt&gt; check in &lt;tt&gt;ofd_create_hdl()&lt;/tt&gt; could be modified to also check if &lt;tt&gt;LAST_ID &amp;lt; 5&lt;/tt&gt; or similar, to detect if this is a newly-formatted filesystem, but I realize that this case may also happen if e.g. the OST is restored from a backup or a snapshot and has an old &lt;tt&gt;LAST_ID&lt;/tt&gt; but is not new.  It would be useful to update the comment to reflect this.&lt;/p&gt;</comment>
                            <comment id="250323" author="sergey" created="Fri, 28 Jun 2019 20:40:52 +0000"  >&lt;blockquote&gt;&lt;p&gt;I don&apos;t mind to have a change that is increasing the maximum number of objects created per commit, but this needs to be negotiated between the MDS and OSS at connect time, with a new&#160;&lt;tt&gt;OBD_CONNECT2_MAX_PRECREATE&lt;/tt&gt;&#160;flag and an&#160;&lt;tt&gt;ocd_max_precreate&lt;/tt&gt;&#160;field (probably using&#160;&lt;tt&gt;__u32 padding1&lt;/tt&gt;) that passes the current&#160;&lt;tt&gt;OST_MAX_PRECREATE&lt;/tt&gt;&#160;value, and use&#160;&lt;tt&gt;OST_MAX_PRECREATE_OLD = 20000&lt;/tt&gt;&#160;if the feature is not available. This also allows fixing the problem properly in older releases.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;If we also increase OST_MAX_PRECREATE on the MDS, I am afraid the MDS will ask the OST to precreate too rarely. This may cause several different problems. For example, after failover the OST would need to precreate more objects (500 000 instead of 20 000), which takes extra time.&lt;/p&gt;


&lt;p&gt; I suggest using the second approach (the commit callback) and reverting &lt;a href=&quot;https://review.whamcloud.com/33833&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/33833&lt;/a&gt;.&lt;br/&gt;
 I will push a patch.&lt;/p&gt;</comment>
                            <comment id="250326" author="gerrit" created="Fri, 28 Jun 2019 20:53:55 +0000"  >&lt;p&gt;Sergey Cheremencev (c17829@cray.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/35373&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/35373&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11760&quot; title=&quot;formatted OST recognition change&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11760&quot;&gt;&lt;del&gt;LU-11760&lt;/del&gt;&lt;/a&gt; ofd: limit num of objects to create in 1 transaction&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: e75f4977695657710a9144a31a0d149fa1925ef8&lt;/p&gt;</comment>
                            <comment id="250376" author="gerrit" created="Sun, 30 Jun 2019 07:54:17 +0000"  >&lt;p&gt;Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/35388&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/35388&lt;/a&gt;&lt;br/&gt;
Subject: Revert &quot;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11760&quot; title=&quot;formatted OST recognition change&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11760&quot;&gt;&lt;del&gt;LU-11760&lt;/del&gt;&lt;/a&gt; ofd: formatted OST recognition change&quot;&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 563277b8d728c45bd89074c07a22f1432beea344&lt;/p&gt;</comment>
                            <comment id="250952" author="gerrit" created="Wed, 10 Jul 2019 15:19:15 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/35388/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/35388/&lt;/a&gt;&lt;br/&gt;
Subject: Revert &quot;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11760&quot; title=&quot;formatted OST recognition change&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11760&quot;&gt;&lt;del&gt;LU-11760&lt;/del&gt;&lt;/a&gt; ofd: formatted OST recognition change&quot;&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 8065d44c0a2b29885ca429674ccab7785d2db08b&lt;/p&gt;</comment>
                            <comment id="253332" author="gerrit" created="Wed, 21 Aug 2019 04:44:22 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/35373/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/35373/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11760&quot; title=&quot;formatted OST recognition change&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11760&quot;&gt;&lt;del&gt;LU-11760&lt;/del&gt;&lt;/a&gt; ofd: limit num of objects to create in 1 transaction&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 4485ee8be4cf224e2543f6344efc6e1cb295a0a7&lt;/p&gt;</comment>
                            <comment id="253361" author="pjones" created="Wed, 21 Aug 2019 11:26:27 +0000"  >&lt;p&gt;Landed for 2.13&lt;/p&gt;</comment>
                            <comment id="253763" author="gerrit" created="Wed, 28 Aug 2019 15:40:53 +0000"  >&lt;p&gt;Minh Diep (mdiep@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/35951&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/35951&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11760&quot; title=&quot;formatted OST recognition change&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11760&quot;&gt;&lt;del&gt;LU-11760&lt;/del&gt;&lt;/a&gt; ofd: limit num of objects to create in 1 transaction&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: e08f46ed1e8fc4daf8ac092ca305438ea354ec24&lt;/p&gt;</comment>
                            <comment id="254581" author="gerrit" created="Thu, 12 Sep 2019 03:48:54 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/35951/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/35951/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11760&quot; title=&quot;formatted OST recognition change&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11760&quot;&gt;&lt;del&gt;LU-11760&lt;/del&gt;&lt;/a&gt; ofd: limit num of objects to create in 1 transaction&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 963559b3087bcbb0bdd541c983085eff7feca882&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="55904">LU-12415</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="55888">LU-12404</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i007s7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>