<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:43:46 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-11426] 2/2 Olafs agree: changelog entries are emitted out of order</title>
                <link>https://jira.whamcloud.com/browse/LU-11426</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;When storing changelogs we do not strictly order the entries with respect to index.&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
&lt;span class=&quot;code-object&quot;&gt;int&lt;/span&gt; mdd_changelog_store(&lt;span class=&quot;code-keyword&quot;&gt;const&lt;/span&gt; struct lu_env *env, struct mdd_device *mdd,
                        struct llog_changelog_rec *rec, struct thandle *th)
{
        ...

        &lt;span class=&quot;code-comment&quot;&gt;/* llog_lvfs_write_rec sets the llog tail len */&lt;/span&gt;
        rec-&amp;gt;cr_hdr.lrh_type = CHANGELOG_REC;
        rec-&amp;gt;cr.cr_time = cl_time();

        spin_lock(&amp;amp;mdd-&amp;gt;mdd_cl.mc_lock);
        /* NB: I suppose it&apos;s possible llog_add adds out of order wrt cr_index,
         * but as &lt;span class=&quot;code-object&quot;&gt;long&lt;/span&gt; as the MDD transactions are ordered correctly &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; e.g.
         * rename conflicts, I don&apos;t think &lt;span class=&quot;code-keyword&quot;&gt;this&lt;/span&gt; should matter. */
        rec-&amp;gt;cr.cr_index = ++mdd-&amp;gt;mdd_cl.mc_index;
        spin_unlock(&amp;amp;mdd-&amp;gt;mdd_cl.mc_lock);

        ctxt = llog_get_context(obd, LLOG_CHANGELOG_ORIG_CTXT);
        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (ctxt == NULL)
                &lt;span class=&quot;code-keyword&quot;&gt;return&lt;/span&gt; -ENXIO;

        llog_th = thandle_get_sub(env, th, ctxt-&amp;gt;loc_handle-&amp;gt;lgh_obj);
        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (IS_ERR(llog_th))
                GOTO(out_put, rc = PTR_ERR(llog_th));

        &lt;span class=&quot;code-comment&quot;&gt;/* nested journal transaction */&lt;/span&gt;
        rc = llog_add(env, ctxt-&amp;gt;loc_handle, &amp;amp;rec-&amp;gt;cr_hdr, NULL, llog_th);
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;This may not be a bug in itself but it means that a changelog consumer must account for gaps in the sequence when clearing changelog entries.&lt;/p&gt;</description>
                <environment></environment>
        <key id="53413">LU-11426</key>
            <summary>2/2 Olafs agree: changelog entries are emitted out of order</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="qian_wc">Qian Yingjin</assignee>
                                    <reporter username="jhammond">John Hammond</reporter>
                        <labels>
                            <label>ORNL</label>
                            <label>changelog</label>
                            <label>hsm</label>
                            <label>llnl</label>
                    </labels>
                <created>Tue, 25 Sep 2018 12:49:00 +0000</created>
                <updated>Thu, 24 Oct 2019 21:23:56 +0000</updated>
                            <resolved>Sat, 28 Sep 2019 03:40:11 +0000</resolved>
                                                    <fixVersion>Lustre 2.13.0</fixVersion>
                    <fixVersion>Lustre 2.12.3</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>18</watches>
                                                                            <comments>
                            <comment id="233975" author="adilger" created="Tue, 25 Sep 2018 14:38:06 +0000"  >&lt;p&gt;On the one hand, out-of-order records are not great, and fixing that would be nice if it were possible without significant overhead.&#160; That said, this race can only happen for two records that are being generated within the same transaction group, since the transaction handles are already open, so the later (lower-numbered) record will always commit together with the earlier (higher-numbered) record.&lt;/p&gt;

&lt;p&gt;This means it should always be possible to reorder the records within a transaction group to be sequential and contiguous.  Whether we should try to do this at the llapi layer, in the llog code, or up in the consumer is an open question, but I&apos;d think it makes sense to do it in llapi or llog so that we don&apos;t need to burden every consumer to handle this quirk.&lt;/p&gt;</comment>
                            <comment id="234002" author="olaf" created="Wed, 26 Sep 2018 08:14:24 +0000"  >&lt;p&gt;&lt;em&gt;This may not be a bug in itself but it means that a changelog consumer must account for gaps in the sequence when clearing changelog entries.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Looking at the code, the problem may lie more with restarting a read: clearing apparently stops at the first record with a higher &lt;tt&gt;cr_index&lt;/tt&gt;, but the reader filters records by &lt;tt&gt;cr_index&lt;/tt&gt; even if they come after records that are being transmitted to the reader.&lt;/p&gt;

&lt;p&gt;In any case, this issue greatly complicates changelog readers.&lt;/p&gt;</comment>
                            <comment id="234145" author="olaf" created="Sun, 30 Sep 2018 13:38:57 +0000"  >&lt;p&gt;And the final complication is that every once in a blue moon, an entry is indeed just not there. So you get a sequence like&#160;6423027 -&#160;6423029 and&#160;6423028 never shows up. If these numbers were always increasing, then a &quot;missing&quot; entry would be easy to spot and handle. But as the situation stands, how do I tell the difference between &quot;missing&quot; and &quot;out of order&quot;? Right now the best I can do is to assume that if an expected entry doesn&apos;t show up within the next N entries, it never will. But what is a good value for N?&lt;/p&gt;</comment>
                            <comment id="235816" author="olaf" created="Mon, 29 Oct 2018 17:14:37 +0000"  >&lt;p&gt;One more thing: I have now seen cases where two changelog entries affecting the same target have the same nanosecond timestamp. The implication is that if we cannot use the index to sort entries, and it turns out that entries may not be emitted in the correct order, then sorting by timestamp will not work either.&lt;/p&gt;</comment>
                            <comment id="246019" author="olaf" created="Thu, 18 Apr 2019 18:46:41 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11581&quot; title=&quot;Not all changelog entries are returned to userspace&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11581&quot;&gt;&lt;del&gt;LU-11581&lt;/del&gt;&lt;/a&gt;&#160;was closed as a duplicate of this one. The replication instructions are of interest for this LU as well.&lt;/p&gt;</comment>
                            <comment id="250155" author="simmonsja" created="Thu, 27 Jun 2019 17:58:54 +0000"  >&lt;p&gt;So I looked at the timestamps like you described and it&apos;s just handled in a strange way. The timestamp is created with cl_time(), which uses the format (secs &amp;lt;&amp;lt; 30) + nsecs. Strange that nanoseconds since epoch wasn&apos;t just sent, since it&apos;s not hard math &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/tongue.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&#160;&lt;/p&gt;</comment>
                            <comment id="252594" author="olaf" created="Tue, 6 Aug 2019 14:44:53 +0000"  >&lt;p&gt;It turns out that&#160;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11581&quot; title=&quot;Not all changelog entries are returned to userspace&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11581&quot;&gt;&lt;del&gt;LU-11581&lt;/del&gt;&lt;/a&gt; is a major problem for HPE DMF, and since it was marked a duplicate of this one, that makes this LU a major problem for us&lt;/p&gt;</comment>
                            <comment id="252644" author="qian_wc" created="Wed, 7 Aug 2019 01:46:11 +0000"  >&lt;p&gt;Hi Olaf,&lt;/p&gt;

&lt;p&gt;Out-of-order changelog records can be easily reproduced by using the test_160j scripts (&lt;a href=&quot;https://review.whamcloud.com/#/c/35650/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/#/c/35650/&lt;/a&gt;) on a single machine.&lt;/p&gt;

&lt;p&gt;But I am&#160;curious how the HPE DMF solution generates the following message:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
2019-06-21 15:23:51.045796+01:00 946781.704204 tesseract-dmf-core01 dmf-lustre-changelog-processor 39762 39847 INFO ---------- tessfs1-MDT0001 Lost 2276026
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Although the current changelog mechanism cannot ensure that changelog records are strictly in index order, it should not report &quot;Lost 2276026&quot;; the record should eventually appear within N (which is 1024*2 in the patch &lt;a href=&quot;https://review.whamcloud.com/#/c/35650/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/#/c/35650/&lt;/a&gt;) records around &quot;6423027 -&#160;6423029&quot;.&lt;/p&gt;

&lt;p&gt;&#160;&lt;/p&gt;

&lt;p&gt;For the question: how do I tell the difference between &quot;missing&quot; and &quot;out of order&quot;?&#160;&lt;/p&gt;

&lt;p&gt;I think you do not need to worry about cancelling records that are out of order; the initial version of the above patch already considers this case and handles it correctly.&lt;/p&gt;</comment>
                            <comment id="252662" author="olaf" created="Wed, 7 Aug 2019 13:49:45 +0000"  >&lt;p&gt;The DMF7 changelog processor tracks the indices of records it has read but not yet cleared. If there appears to be a gap in that sequence, it will not clear beyond that gap until at least 1000 newer records have been seen. At that point it assumes that the record in the gap will never turn up, reports it as &quot;lost&quot;, and allows clearing the log beyond the gap.&lt;/p&gt;

&lt;p&gt;If, according to the patch, that window should be extended from 1000 to 1024*2 = 2048, we can certainly do that, as it is a simple tunable of the program.&lt;/p&gt;
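
&lt;p&gt;The gap-handling policy described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the DMF7 implementation: the class name GapTracker, its methods, and the exact meaning of "newer records seen" (here, the number of records currently held beyond the first gap) are all hypothetical.&lt;/p&gt;

```python
class GapTracker:
    """Hypothetical sketch: tracks changelog indices read but not yet cleared."""

    def __init__(self, last_cleared, window=1000):
        self.clear_point = last_cleared  # highest index that is safe to clear
        self.window = window             # newer records to wait for before giving up
        self.pending = set()             # indices read beyond the first gap
        self.lost = []                   # indices declared lost

    def _advance(self):
        # Move the clear point forward across every contiguous index already read.
        while self.clear_point + 1 in self.pending:
            self.clear_point += 1
            self.pending.remove(self.clear_point)

    def observe(self, index):
        """Record one index as read; return the index up to which clearing is safe."""
        if self.clear_point >= index:
            # Arrived after being cleared or declared lost; nothing to track.
            return self.clear_point
        self.pending.add(index)
        self._advance()
        # Everything still pending is newer than the gap at clear_point + 1.
        while len(self.pending) >= self.window:
            self.clear_point += 1
            self.lost.append(self.clear_point)  # assume it will never turn up
            self._advance()
        return self.clear_point
```

&lt;p&gt;With this shape, changing the window from 1000 to 2048 is a one-argument change, for example GapTracker(last_cleared, window=2048).&lt;/p&gt;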

&lt;p&gt;However, when I disable clearing the changelog, and then read it using lfs, I can easily verify that most lost records do not show up within the next million records. In a handful of cases records show up inconsistently. When that happens, they always appear to be part of a small out-of-order sequence.&lt;/p&gt;
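
&lt;p&gt;A check of this kind on a full dump could look something like the sketch below. It assumes the record indices have already been extracted, in read order, into a non-empty Python list; the function name classify and the output shape are hypothetical, and parsing the lfs output is deliberately left out.&lt;/p&gt;

```python
def classify(indices):
    """Hypothetical sketch: split a dump into missing and out-of-order indices.

    `indices` holds cr_index values in the order they were read.
    Returns (missing, out_of_order), where out_of_order is a list of
    (index, position) pairs for records read after a higher index.
    """
    if not indices:
        return [], []
    seen = set(indices)
    # Indices inside the observed range that never appear anywhere in the dump.
    missing = [i for i in range(min(indices), max(indices) + 1) if i not in seen]
    out_of_order = []
    highest = indices[0]
    for position, index in enumerate(indices):
        if highest > index:
            out_of_order.append((index, position))
        highest = max(highest, index)
    return missing, out_of_order
```

&lt;p&gt;Records reported in the first list never show up in the dump at all, while records in the second list did show up, only later than their index suggests.&lt;/p&gt;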

&lt;p&gt;For what it is worth, the dmf changelog processor has never, to my knowledge, encountered&#160;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11205&quot; title=&quot;Failure to clear the changelog for user 1 on MDT&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11205&quot;&gt;&lt;del&gt;LU-11205&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;</comment>
                            <comment id="252914" author="olaf" created="Fri, 9 Aug 2019 21:32:36 +0000"  >&lt;p&gt;HPE is also working on this issue with Cray, and we&apos;ve been testing code that appears to improve matters. (Cray bug id may be&#160;LUS-7691)&lt;/p&gt;</comment>
                            <comment id="254701" author="gerrit" created="Sat, 14 Sep 2019 13:09:03 +0000"  >&lt;p&gt;Andrew Perepechko (c17827@cray.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/36187&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/36187&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11426&quot; title=&quot;2/2 Olafs agree: changelog entries are emitted out of order&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11426&quot;&gt;&lt;del&gt;LU-11426&lt;/del&gt;&lt;/a&gt; llog: changelog records reordering&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: a4611acc55049002eb6736d8fc504aab23e4a71c&lt;/p&gt;</comment>
                            <comment id="255512" author="gerrit" created="Fri, 27 Sep 2019 23:11:57 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/36187/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/36187/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11426&quot; title=&quot;2/2 Olafs agree: changelog entries are emitted out of order&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11426&quot;&gt;&lt;del&gt;LU-11426&lt;/del&gt;&lt;/a&gt; llog: changelog records reordering&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 1fa0a984c5c3863d8f40b3b0d63c3d08cfa1a9f0&lt;/p&gt;</comment>
                            <comment id="255525" author="pjones" created="Sat, 28 Sep 2019 03:40:11 +0000"  >&lt;p&gt;Landed for 2.13&lt;/p&gt;</comment>
                            <comment id="255531" author="bruno.travouillon" created="Sat, 28 Sep 2019 03:52:16 +0000"  >&lt;p&gt;Hi Peter,&lt;br/&gt;
Can we consider a backport to LTS12? (2.12.4?)&lt;/p&gt;</comment>
                            <comment id="255532" author="pjones" created="Sat, 28 Sep 2019 03:57:41 +0000"  >&lt;p&gt;Yes absolutely - a number of sites have hit this&lt;/p&gt;</comment>
                            <comment id="255546" author="simmonsja" created="Sat, 28 Sep 2019 17:45:55 +0000"  >&lt;p&gt;Backporting this fix will require a number of other changelog fixes to land first. One of them is the patch that adds polling. Is that patch needed for 2.12 LTS?&lt;/p&gt;</comment>
                            <comment id="255552" author="gerrit" created="Sat, 28 Sep 2019 21:31:36 +0000"  >&lt;p&gt;Minh Diep (mdiep@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/36316&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/36316&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11426&quot; title=&quot;2/2 Olafs agree: changelog entries are emitted out of order&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11426&quot;&gt;&lt;del&gt;LU-11426&lt;/del&gt;&lt;/a&gt; llog: changelog records reordering&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 2b7ea4e3cf43f18b1e1354044d024e085701946e&lt;/p&gt;</comment>
                            <comment id="255553" author="pjones" created="Sat, 28 Sep 2019 21:55:53 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.whamcloud.com/secure/ViewProfile.jspa?name=simmonsja&quot; class=&quot;user-hover&quot; rel=&quot;simmonsja&quot;&gt;simmonsja&lt;/a&gt; could you please be more specific as to what is missing from b2_12 before the benefits of this patch could be realized?&lt;/p&gt;</comment>
                            <comment id="255706" author="simmonsja" created="Tue, 1 Oct 2019 13:28:56 +0000"  >&lt;p&gt;We have also&lt;/p&gt;

&lt;p&gt;Lustre commit&#160;d7bb6647cd4dd26949bceb6a099cd606623aff2b (&quot;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11626&quot; title=&quot;mdc: obd might go away while referenced by code in mdc_changelog&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11626&quot;&gt;&lt;del&gt;LU-11626&lt;/del&gt;&lt;/a&gt; mdc: hold obd while processing changelog&quot;)&lt;/p&gt;

&lt;p&gt;The other patch is &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12533&quot; title=&quot;Improve readahead RPC issuance for large window sizes&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12533&quot;&gt;&lt;del&gt;LU-12533&lt;/del&gt;&lt;/a&gt;, but I don&apos;t know if that is more a feature or a fix, so it&apos;s not clear in that case.&lt;/p&gt;

&lt;p&gt;&#160;&lt;/p&gt;</comment>
                            <comment id="255708" author="pjones" created="Tue, 1 Oct 2019 13:34:26 +0000"  >&lt;p&gt;Thanks James. The second patch looks like a nice performance improvement but I am not sure how this would relate to changelog correctness. Are you just listing all the patches that you are currently applying to b2_12 regardless of their relevance to this particular issue?&lt;/p&gt;</comment>
                            <comment id="255738" author="simmonsja" created="Tue, 1 Oct 2019 17:33:43 +0000"  >&lt;p&gt;I&apos;m also applying &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11617&quot; title=&quot;lockdep exposed a possible deadlock in chlg_open()&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11617&quot;&gt;&lt;del&gt;LU-11617&lt;/del&gt;&lt;/a&gt;, which is a potential bug, as well as the above patches. The reason for also including &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12533&quot; title=&quot;Improve readahead RPC issuance for large window sizes&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12533&quot;&gt;&lt;del&gt;LU-12533&lt;/del&gt;&lt;/a&gt; was so the patches applied cleanly to 2.12 LTS. If it&apos;s not needed then that&apos;s fine &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/smile.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/p&gt;</comment>
                            <comment id="256055" author="gerrit" created="Tue, 8 Oct 2019 13:26:05 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/36316/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/36316/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11426&quot; title=&quot;2/2 Olafs agree: changelog entries are emitted out of order&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11426&quot;&gt;&lt;del&gt;LU-11426&lt;/del&gt;&lt;/a&gt; llog: changelog records reordering&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 5c5a1e9b4839c3d6a70b3b7e768944f6dc237c2e&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                        <issuelink>
            <issuekey id="57172">LU-12869</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="52894">LU-11205</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="53844">LU-11581</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i0031r:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>