<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:55:49 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-5939] Error: trying to overwrite bigger transno</title>
                <link>https://jira.whamcloud.com/browse/LU-5939</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;I&apos;ve been running sanity-hsm test 90 several times on this cluster, and nearly every time I run the test I see the following in dmesg on the MDS:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Lustre: DEBUG MARKER: == sanity-hsm test 90: Archive/restore a file list == 15:39:24 (1416440364)
Lustre: HSM agent bb8c2497-7403-4909-0e46-6614668e8ed7 already registered
LustreError: 26047:0:(mdt_coordinator.c:957:mdt_hsm_cdt_start()) scratch-MDT0000: Coordinator already started
LustreError: 19956:0:(tgt_lastrcvd.c:806:tgt_last_rcvd_update()) scratch-MDT0000: trying to overwrite bigger transno:on-disk: 25769818612, new: 25769818611 replay: 0. see LU-617.
LustreError: 19956:0:(tgt_lastrcvd.c:806:tgt_last_rcvd_update()) Skipped 5 previous similar messages
Lustre: DEBUG MARKER: == sanity-hsm test complete, duration 37 sec == 15:39:50 (1416440390)
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;From the kernel logs, I see:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;...
00000001:00020000:9.0:1416440377.839622:0:19956:0:(tgt_lastrcvd.c:806:tgt_last_rcvd_update()) scratch-MDT0000: trying to overwrite bigger transno:on-disk: 25769818612, new: 25769818611 replay: 0. see LU-617.
...
00000001:00080000:8.0:1416440377.869378:0:30331:0:(tgt_lastrcvd.c:1231:tgt_txn_stop_cb()) More than one transaction 25769818612
...
00000001:00080000:8.0:1416440377.869423:0:30331:0:(tgt_lastrcvd.c:1231:tgt_txn_stop_cb()) More than one transaction 25769818612
...
00000001:00080000:8.0:1416440377.869508:0:30331:0:(tgt_lastrcvd.c:1231:tgt_txn_stop_cb()) More than one transaction 25769818612
...
00000100:00100000:8.0:1416440377.869685:0:30331:0:(service.c:2116:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc mdt00_002:bb8c2497-7403-4909-0e46-6614668e8ed7+713:21533:x1485210712561904:12345-192.168.2.111@o2ib:57 Request procesed in 30116us (30167us total) trans 25769818612 rc 0/0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Similarly for other transaction numbers:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000001:00020000:0.0:1416440378.133498:0:19955:0:(tgt_lastrcvd.c:806:tgt_last_rcvd_update()) scratch-MDT0000: trying to overwrite bigger transno:on-disk: 25769818617, new: 25769818614 replay: 0. see LU-617.
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;and&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000001:00020000:1.0F:1416440378.133518:0:31313:0:(tgt_lastrcvd.c:806:tgt_last_rcvd_update()) scratch-MDT0000: trying to overwrite bigger transno:on-disk: 25769818619, new: 25769818618 replay: 0. see LU-617.
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
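
&lt;p&gt;The ordering problem behind these messages can be pictured with a toy model. This is only a sketch under stated assumptions, not the code in tgt_last_rcvd_update(): the &lt;tt&gt;ClientSlot&lt;/tt&gt; class and its fields are hypothetical, and keeping the larger stored value when an update is rejected is an assumption made for illustration. It shows two requests from one client reaching the slot in the reverse order of their transaction numbers, using the values from the first message above.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
```python
import operator


class ClientSlot:
    """Toy stand-in for one client's last_rcvd slot (hypothetical name)."""

    def __init__(self):
        self.on_disk_transno = 0
        self.warnings = []

    def update(self, new_transno):
        # A transno lower than the stored one means requests from the same
        # client reached this point out of order.
        if operator.lt(new_transno, self.on_disk_transno):
            self.warnings.append(
                "trying to overwrite bigger transno:on-disk: %d, new: %d"
                % (self.on_disk_transno, new_transno)
            )
            # Assumption: the toy keeps the larger value already stored.
            return False
        self.on_disk_transno = new_transno
        return True


slot = ClientSlot()
# Two unserialized requests: the later transno reaches the slot first.
first = slot.update(25769818612)
second = slot.update(25769818611)
```
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;In this model the second update is refused and one warning is recorded, which is the pattern seen in the logs.&lt;/p&gt;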

&lt;p&gt;Before running sanity-hsm test 90, the copytool was started on the agent, c11.&lt;/p&gt;</description>
                <environment>OpenSFS cluster running lustre-master tag 2.6.90 build #2745 with one MDS/MDT, three OSSs with two OSTs each and three clients.</environment>
        <key id="27659">LU-5939</key>
            <summary>Error: trying to overwrite bigger transno</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="tappro">Mikhail Pershin</assignee>
                                    <reporter username="jamesanunez">James Nunez</reporter>
                        <labels>
                            <label>HB</label>
                    </labels>
                <created>Thu, 20 Nov 2014 00:17:32 +0000</created>
                <updated>Sun, 1 Nov 2015 17:12:38 +0000</updated>
                            <resolved>Sun, 24 May 2015 12:51:24 +0000</resolved>
                                    <version>Lustre 2.7.0</version>
                                    <fixVersion>Lustre 2.8.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>14</watches>
                                                                            <comments>
                            <comment id="99641" author="jamesanunez" created="Thu, 20 Nov 2014 00:32:36 +0000"  >&lt;p&gt;I&apos;ve attached the kernel debug log from the MDS; hsm_log_1.txt.&lt;/p&gt;</comment>
                            <comment id="99775" author="jlevi" created="Fri, 21 Nov 2014 18:51:52 +0000"  >&lt;p&gt;Mike,&lt;br/&gt;
Can you please comment on this one?&lt;br/&gt;
Thank you!&lt;/p&gt;</comment>
                            <comment id="100462" author="jlevi" created="Tue, 2 Dec 2014 19:03:24 +0000"  >&lt;p&gt;Mike,&lt;br/&gt;
Have you had a chance to look at this one?&lt;/p&gt;</comment>
                            <comment id="103893" author="tappro" created="Mon, 19 Jan 2015 18:53:43 +0000"  >&lt;p&gt;I am not familiar with the HSM coordinator design, but it seems to perform many transactions per single RPC, which causes the &apos;More than one transaction&apos; message. It also seems that HSM RPCs from the client are sent without the mdc RPC lock and are not serialized, which causes the &apos;trying to overwrite bigger transno&apos; message. This may not be a problem if HSM requests don&apos;t need recovery; in that case the coordinator doesn&apos;t need transactions for its requests or a last_rcvd file update.&lt;/p&gt;</comment>
                            <comment id="104207" author="tappro" created="Wed, 21 Jan 2015 17:44:34 +0000"  >&lt;p&gt;Considering that HSM requests are filtered out during recovery, they are not supposed to be replayed during Lustre recovery, and the HSM copytool will just send them again. In that case such updates shouldn&apos;t get a transaction number assigned or update the last_rcvd slot.&lt;/p&gt;</comment>
                            <comment id="104891" author="tappro" created="Tue, 27 Jan 2015 19:20:43 +0000"  >&lt;p&gt;Meanwhile, I am not sure how to fix that properly. HSM uses the common API to access related objects, and all transactions are started by MDD. There is no easy way to deny transaction number generation for some of them.&lt;/p&gt;</comment>
                            <comment id="105035" author="adilger" created="Wed, 28 Jan 2015 22:16:20 +0000"  >&lt;p&gt;Any comments from HSM folks on this issue?&lt;/p&gt;</comment>
                            <comment id="105089" author="adegremont" created="Thu, 29 Jan 2015 17:11:12 +0000"  >&lt;p&gt;Sorry, I did not see this issue.&lt;/p&gt;

&lt;p&gt;HSM code does 2 kinds of disk access:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;It manages the coordinator llog. This stores all requests and especially their status. Each time the request is added, started, completed, the corresponding llog records are updated.&lt;/li&gt;
	&lt;li&gt;File HSM status is also updated (LMA EA is updated) when a request is started or finished.&lt;br/&gt;
IIRC, those 2 modifications are not in the same transaction.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Nothing in the HSM code will resend the HSM request registrations. HSM (wrongly?) relies on the Lustre recovery mechanism. HSM actions (archive this file, restore this one, etc.) are supposed to be kept across MDT restarts.&lt;br/&gt;
I think those requests should be replayed and treated like any other RPCs.&lt;/p&gt;

&lt;p&gt;Another thing that could lead to a very big transaction is sending an archive request for a big list of files. The MDT will try to add them (one llog record per file) in one transaction. So far, this file list was limited by other Lustre limitations (LNET message size, KUC, ...). I don&apos;t know if these limitations have changed, but the test is still using only 51 files.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;2. Don&#8217;t update last_rcvd and don&#8217;t assign transaction to such requests if they don&#8217;t need transaction and/or recovery by replaying them.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;AFAIK, replay and recovery are tightly coupled. I don&apos;t know how you can manage replay without recovery. Or would this be done by hand? Is there code somewhere in Lustre which does that?&lt;/p&gt;</comment>
                            <comment id="105149" author="tappro" created="Thu, 29 Jan 2015 23:51:36 +0000"  >&lt;p&gt;Aurelien, I decided that HSM requests don&apos;t rely on Lustre recovery (replays) because HSM requests are filtered out during recovery: check tgt_filter_recovery_requests(), there are no MDS_HSM_ opcodes listed. This must be fixed if they are supposed to be replayed, and it also means we should serialize HSM requests from mdc with mdc_rpc_lock(). This should also solve the &apos;overwrite bigger transno&apos; issue.&lt;/p&gt;</comment>
                            <comment id="105681" author="adegremont" created="Wed, 4 Feb 2015 16:35:07 +0000"  >&lt;p&gt;I think this was forgotten. If somebody thinks this is bad, say so, but IMO we should replay them, and so serialize them.&lt;/p&gt;</comment>
                            <comment id="105715" author="adilger" created="Wed, 4 Feb 2015 19:24:48 +0000"  >&lt;p&gt;If the MDC is serializing change requests, is this going to hurt HSM performance, or is there only a single HSM request active on any client/agent node at once?&lt;/p&gt;

&lt;p&gt;Adding the HSM opcodes to the recovery list seems like a simple and low-risk change, and serializing the HSM requests on the MDC is also only limited to HSM usage so doesn&apos;t seem too high risk. &lt;/p&gt;</comment>
                            <comment id="105996" author="tappro" created="Fri, 6 Feb 2015 04:51:38 +0000"  >&lt;p&gt;Yes, I am working on a patch already. As for HSM agent performance, I am trying to serialize only the requests that might update data on the server.&lt;/p&gt;</comment>
                            <comment id="106002" author="adegremont" created="Fri, 6 Feb 2015 09:22:01 +0000"  >&lt;p&gt;On my side, I&apos;ve identified 3 RPCs which can update data on the server:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;MDS_HSM_PROGRESS&lt;/li&gt;
	&lt;li&gt;MDS_HSM_STATE_SET&lt;/li&gt;
	&lt;li&gt;MDS_HSM_REQUEST&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Serializing those requests can slow the request ingestion rate, but I think that is acceptable.&lt;/p&gt;
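
&lt;p&gt;As a rough illustration of serializing only the modifying requests, here is a toy sketch. It is not Lustre code: &lt;tt&gt;ToyMdc&lt;/tt&gt; and its fields are hypothetical, a Python lock merely plays the role of mdc_rpc_lock(), and MDS_HSM_STATE_GET is used only as an example of a request assumed to be read-only.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
```python
import threading

# Opcodes named in this ticket as able to update data on the server.
MODIFYING_OPCODES = frozenset(
    ["MDS_HSM_PROGRESS", "MDS_HSM_STATE_SET", "MDS_HSM_REQUEST"]
)


class ToyMdc:
    """Hypothetical client model; not Lustre code."""

    def __init__(self):
        self._rpc_lock = threading.Lock()  # plays the role of mdc_rpc_lock()
        self._next_transno = 0
        self.arrival_order = []

    def send(self, opcode):
        if opcode not in MODIFYING_OPCODES:
            # Read-only requests skip the lock and get no transno.
            return None
        with self._rpc_lock:
            # Assigning the transno and recording its arrival happen under
            # one lock, so arrival order always matches transno order.
            self._next_transno += 1
            transno = self._next_transno
            self.arrival_order.append(transno)
            return transno


mdc = ToyMdc()
workers = [
    threading.Thread(target=mdc.send, args=("MDS_HSM_PROGRESS",))
    for _ in range(8)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
read_only = mdc.send("MDS_HSM_STATE_GET")
```
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;The cost is that modifying requests from one client go through one at a time, which is the ingestion slowdown mentioned above.&lt;/p&gt;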

&lt;p&gt;If you&apos;re working on a patch, please also add the MUTABOR flag to the MDS_HSM_REQUEST RPC.&lt;/p&gt;</comment>
                            <comment id="106157" author="tappro" created="Sat, 7 Feb 2015 06:19:13 +0000"  >&lt;p&gt;Thanks for the help. I identified the first two as well, but not MDS_HSM_REQUEST. Is it a specific action that updates data on disk, or does it do that upon any request?&lt;/p&gt;</comment>
                            <comment id="106221" author="gerrit" created="Mon, 9 Feb 2015 03:51:46 +0000"  >&lt;p&gt;Mike Pershin (mike.pershin@intel.com) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/13684&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/13684&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-5939&quot; title=&quot;Error: trying to overwrite bigger transno&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-5939&quot;&gt;&lt;del&gt;LU-5939&lt;/del&gt;&lt;/a&gt; hsm: make HSM modification requests replayable&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 726362d38cb95233acabacb3fe98ed484f0bff4e&lt;/p&gt;</comment>
                            <comment id="106222" author="tappro" created="Mon, 9 Feb 2015 04:15:59 +0000"  >&lt;p&gt;Note, the patch above just makes HSM requests replayable. It contains no tests because I found that HSM actions can&apos;t recover for reasons not related to this particular patch. I&apos;ve created &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6223&quot; title=&quot;HSM recovery needs more tests and fixes&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6223&quot;&gt;LU-6223&lt;/a&gt; for HSM recovery testing.&lt;/p&gt;

&lt;p&gt;So this particular patch should solve just the problem with the &quot;overwrite bigger transno&quot; message.&lt;/p&gt;</comment>
                            <comment id="106767" author="jay" created="Thu, 12 Feb 2015 04:57:29 +0000"  >&lt;p&gt;Hmm, I&apos;m not comfortable with HSM requests being able to make multiple transactions, because I&apos;m afraid it may cause problems down the road. OUT can have multiple transactions for one RPC because it was carefully designed for this, but that can&apos;t be applied to HSM. I&apos;d like to have an alternative way to fix this problem by limiting HSM requests to only one transaction per RPC.&lt;/p&gt;</comment>
                            <comment id="106796" author="tappro" created="Thu, 12 Feb 2015 14:34:51 +0000"  >&lt;p&gt;Yes, I agree, that would be better&lt;/p&gt;</comment>
                            <comment id="106798" author="adegremont" created="Thu, 12 Feb 2015 14:44:28 +0000"  >&lt;p&gt;IIRC, HSM_REQUEST stores a list of requests to be done in an llog. One RPC can send a request for the same action (archive, restore, ...) for a list of files. One llog record will be added for each file (with the same compound_id, to be able to rebuild this request later).&lt;/p&gt;

&lt;p&gt;Records are added using &lt;tt&gt;llog_cat_add()&lt;/tt&gt;. If we want to have only one transaction, we need a special version which can add several records in one call, and update &lt;tt&gt;mdt_hsm_add_actions()&lt;/tt&gt; accordingly.&lt;/p&gt;</comment>
                            <comment id="106807" author="jay" created="Thu, 12 Feb 2015 16:56:15 +0000"  >&lt;p&gt;Exactly, llog_cat_add() can be revised to carry a transaction handler parameter, so we can start a transaction in mdt_hsm_add_actions() and use it for all llog operations later.&lt;/p&gt;

&lt;p&gt;The only concern is about the size of the transaction. I remember that there is a limitation for it, but I&apos;m not an OSD expert. If that is the case, we also need to take log file creation into account for the transaction size.&lt;/p&gt;</comment>
                            <comment id="106814" author="jay" created="Thu, 12 Feb 2015 17:19:08 +0000"  >&lt;p&gt;On second thought, we don&apos;t even need to add a parameter to llog_cat_add(). We just need to call the llog_add() series of interfaces instead, just as we do for changelog.&lt;/p&gt;</comment>
                            <comment id="106816" author="adegremont" created="Thu, 12 Feb 2015 17:28:04 +0000"  >&lt;p&gt;The problem is we should declare the number of credits we need for the transaction in advance. So we need to also update the credit declaration.&lt;/p&gt;</comment>
                            <comment id="106902" author="tappro" created="Fri, 13 Feb 2015 05:04:25 +0000"  >&lt;p&gt;I am not sure it is about llog records only. llog_cat_add() causes a local transaction, which produces no transaction number, so there must be another update, maybe to the attributes of the file or something like that. I can give more details about the HSM request type and the operations behind the multiple transnos later today. Meanwhile, llog_cat_add() should be replaced with llog_add() in any case.&lt;/p&gt;

&lt;p&gt;As for putting everything into a single transaction, we still have another way to go: use the same mechanism that OUT uses to control a batch of updates. This will cause a compatibility problem, but maybe it is not so difficult to solve. I mean we shouldn&apos;t rule this case out completely and should review it too. This belongs in the context of &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6223&quot; title=&quot;HSM recovery needs more tests and fixes&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6223&quot;&gt;LU-6223&lt;/a&gt; though.&lt;/p&gt;</comment>
                            <comment id="106923" author="jay" created="Fri, 13 Feb 2015 12:28:40 +0000"  >&lt;p&gt;Hi Mike, you have the expertise on recovery - so if you think it&apos;s better to go for multiple transactions, I&apos;m good. Sorry for noise.&lt;/p&gt;</comment>
                            <comment id="106932" author="tappro" created="Fri, 13 Feb 2015 14:56:35 +0000"  >&lt;p&gt;I&apos;ve got a trace for the last occurrence of &apos;Multiple transactions&apos;. It is not about llog; it is HSM_PROGRESS, and the MDT makes several disk changes:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;mdt_hsm_attr_set()&lt;/li&gt;
	&lt;li&gt;another mdt_hsm_attr_set()&lt;/li&gt;
	&lt;li&gt;mo_swap_layouts()&lt;/li&gt;
&lt;/ul&gt;
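
&lt;p&gt;To make the effect of the calls listed above concrete, here is a toy sketch, not Lustre code. &lt;tt&gt;ToyRpc&lt;/tt&gt; and &lt;tt&gt;run_step()&lt;/tt&gt; are hypothetical names, and the rule used (note every transaction after the first one for the same RPC) is an assumption, since the exact condition in tgt_txn_stop_cb() is not shown in this ticket.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
```python
class ToyRpc:
    """Hypothetical per-RPC bookkeeping; not Lustre code."""

    def __init__(self, transno):
        self.transno = transno
        self.txn_count = 0
        self.messages = []

    def run_step(self, name):
        # Each lower-level call opens and closes its own transaction.
        self.txn_count += 1
        if self.txn_count != 1:
            # Assumed rule: note every transaction after the first.
            self.messages.append(
                "More than one transaction %d (at %s)" % (self.transno, name)
            )


rpc = ToyRpc(25769818612)
for step in ("mdt_hsm_attr_set", "mdt_hsm_attr_set", "mo_swap_layouts"):
    rpc.run_step(step)
```
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;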


&lt;p&gt;This is the hsm_archive operation from sanity-hsm.sh. Each call to the lower level (MDD) causes a separate transaction. I am not sure how to solve that right now. Ideally the transaction should start in the MDT, but the mo_... interface cannot pass transaction details to the MDD. I think we might allow multiple transactions for this case and for restore, with additional checks. This is to be continued in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6223&quot; title=&quot;HSM recovery needs more tests and fixes&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6223&quot;&gt;LU-6223&lt;/a&gt;.&lt;/p&gt;</comment>
                            <comment id="114724" author="gerrit" created="Fri, 8 May 2015 14:58:01 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) merged in patch &lt;a href=&quot;http://review.whamcloud.com/13684/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/13684/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-5939&quot; title=&quot;Error: trying to overwrite bigger transno&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-5939&quot;&gt;&lt;del&gt;LU-5939&lt;/del&gt;&lt;/a&gt; hsm: make HSM modification requests replayable&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 9eda825b1b449baaf2676cc80ccae79d4297cf2d&lt;/p&gt;</comment>
                            <comment id="116292" author="pjones" created="Sun, 24 May 2015 12:51:24 +0000"  >&lt;p&gt;Landed for 2.8&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="28603">LU-6223</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="28693">LU-6244</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="16395" name="hsm_log_1.txt" size="214" author="jamesanunez" created="Thu, 20 Nov 2014 00:32:36 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzx19j:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>16583</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>