<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:34:03 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-17270] TN status is lost if TN_EVENT_TAG_RX_OK occurs before TN_EVENT_TX_OK</title>
                <link>https://jira.whamcloud.com/browse/LU-17270</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;When kfilnd issues a bulk get, the originator posts a tagged receive buffer and then the BULK_GET TN transitions to TN_STATE_WAIT_COMP where it waits for the TN_EVENT_TX_OK and TN_EVENT_TAG_RX_OK events to complete the transaction.&lt;/p&gt;

&lt;p&gt;These events may arrive at the originator in either order. If the TX_OK arrives first, the BULK_GET TN transitions to TN_STATE_WAIT_TAG_COMP. If the TAG_RX_OK then arrives with a non-zero status, that status is recorded in the TN and handled correctly.&lt;/p&gt;

&lt;p&gt;If the TAG_RX_OK arrives first, the status of this event is not recorded in the TN, and the BULK_GET TN transitions to TN_STATE_WAIT_SEND_COMP. When the TX_OK event then arrives, the message is finalized as though the TN completed successfully when in fact it failed. This can result in data corruption.&lt;/p&gt;
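
&lt;p&gt;The ordering problem described above can be illustrated with a small toy model. This is a minimal Python sketch, not the kfilnd C code: the names State, Event, Transaction, record_status, handle_event, final_status and the check_in_wait_comp switch are hypothetical, and the model covers only the two events and three wait states named in this report. Setting check_in_wait_comp to False imitates the reported behavior. Setting it to True is an assumption about what a fix might do, based only on the idea of checking the TAG_RX_OK status while in WAIT_COMP.&lt;/p&gt;

```python
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    WAIT_COMP = auto()
    WAIT_TAG_COMP = auto()
    WAIT_SEND_COMP = auto()
    DONE = auto()


class Event(Enum):
    TX_OK = auto()
    TAG_RX_OK = auto()


@dataclass
class Transaction:
    state: State = State.WAIT_COMP
    status: int = 0  # first non-zero event status recorded so far


def record_status(tn, status):
    # Keep the first failure; never overwrite it with a later success.
    if status != 0 and tn.status == 0:
        tn.status = status


def handle_event(tn, event, status, check_in_wait_comp):
    # Advance the toy state machine by one event.
    if tn.state is State.WAIT_COMP and event is Event.TX_OK:
        tn.state = State.WAIT_TAG_COMP
    elif tn.state is State.WAIT_COMP and event is Event.TAG_RX_OK:
        # Reported behavior: the status is dropped here (flag is False).
        # Assumed fix: record it before changing state (flag is True).
        if check_in_wait_comp:
            record_status(tn, status)
        tn.state = State.WAIT_SEND_COMP
    elif tn.state is State.WAIT_TAG_COMP and event is Event.TAG_RX_OK:
        record_status(tn, status)
        tn.state = State.DONE
    elif tn.state is State.WAIT_SEND_COMP and event is Event.TX_OK:
        tn.state = State.DONE
    else:
        raise ValueError(f"unexpected {event.name} in {tn.state.name}")


def final_status(events, check_in_wait_comp):
    # Feed a sequence of (event, status) pairs to a fresh transaction.
    tn = Transaction()
    for event, status in events:
        handle_event(tn, event, status, check_in_wait_comp)
    return tn.status


# The status value -61 is taken from the logs in this report.
tx_first = [(Event.TX_OK, 0), (Event.TAG_RX_OK, -61)]
rx_first = [(Event.TAG_RX_OK, -61), (Event.TX_OK, 0)]

print(final_status(tx_first, check_in_wait_comp=False))  # -61, failure kept
print(final_status(rx_first, check_in_wait_comp=False))  # 0, failure lost
print(final_status(rx_first, check_in_wait_comp=True))   # -61, failure kept
```

&lt;p&gt;In this sketch the only difference between the lost and the preserved status is whether the WAIT_COMP branch records the TAG_RX_OK status before moving to WAIT_SEND_COMP.&lt;/p&gt;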

&lt;p&gt;The following log shows the working case, where TX_OK arrives first:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# TX_OK first with status 0
00000800:40000000:3.0:1699032572.037560:0:61649:0:(kfilnd_tn.c:299:kfilnd_tn_state_change()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000000105cad8: 17@kfi:1 -&amp;gt; 0@kfi(00000000eb641e7b):0x0 TN_STATE_TAGGED_RECV_POSTED -&amp;gt; TN_STATE_WAIT_COMP state change
00000800:40000000:5.0:1699032572.037584:0:61651:0:(kfilnd_tn.c:1084:kfilnd_tn_state_wait_comp()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000000105cad8: 17@kfi:1 -&amp;gt; 0@kfi(00000000eb641e7b):0x0 TN_EVENT_TX_OK event status 0 &amp;lt;&amp;lt;&amp;lt;&amp;lt; TX_OK first

# TAG_RX_OK second with status -61
00000800:40000000:5.0:1699032572.374710:0:61651:0:(kfilnd_tn.c:1241:kfilnd_tn_state_wait_tag_comp()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000000105cad8: 17@kfi:1 -&amp;gt; 0@kfi(00000000eb641e7b):0x0 TN_EVENT_TAG_RX_OK event status -61
00000800:40000000:5.0:1699032572.374712:0:61651:0:(kfilnd_tn.c:313:kfilnd_tn_status_update()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000000105cad8: 17@kfi:1 -&amp;gt; 0@kfi(00000000eb641e7b):0x0 0 -&amp;gt; -61 status change
00000800:40000000:5.0:1699032572.374713:0:61651:0:(kfilnd_tn.c:319:kfilnd_tn_status_update()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000000105cad8: 17@kfi:1 -&amp;gt; 0@kfi(00000000eb641e7b):0x0 0 -&amp;gt; 0 health status change
00000400:00000200:5.0:1699032572.374714:0:61651:0:(lib-msg.c:1046:lnet_is_health_check()) Msg 0000000039f6c737 is in inconsistent state, don&apos;t perform health checking (-61, 0)
00000400:00000200:5.0:1699032572.374715:0:61651:0:(lib-msg.c:1051:lnet_is_health_check()) health check = 0, status = -61, hstatus = 0
00000400:40000000:5.0:1699032572.374715:0:61651:0:(lib-msg.c:979:lnet_msg_detach_md()) md 00000000b5e76aa5 msg 0000000039f6c737

# Correct status is passed to upper layer
00000100:40000000:5.0:1699032572.374716:0:61651:0:(events.c:481:server_bulk_callback()) event type 5, status -61, desc 000000002a3c77d3
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The following log shows the broken case, where TAG_RX_OK arrives first:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# TAG_RX_OK first with status -61
00000800:40000000:5.0:1699032576.237177:0:61651:0:(kfilnd_tn.c:1084:kfilnd_tn_state_wait_comp()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000000105cad8: 17@kfi:1 -&amp;gt; 0@kfi(00000000eb641e7b):0x0 TN_EVENT_TAG_RX_OK event status -61
00000800:40000000:5.0:1699032576.237178:0:61651:0:(kfilnd_tn.c:299:kfilnd_tn_state_change()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000000105cad8: 17@kfi:1 -&amp;gt; 0@kfi(00000000eb641e7b):0x0 TN_STATE_WAIT_COMP -&amp;gt; TN_STATE_WAIT_SEND_COMP state change

# TX_OK second with status 0
00000800:40000000:5.0:1699032576.237242:0:61651:0:(kfilnd_tn.c:1178:kfilnd_tn_state_wait_send_comp()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000000105cad8: 17@kfi:1 -&amp;gt; 0@kfi(00000000eb641e7b):0x0 TN_EVENT_TX_OK event status 0
00000400:00000200:5.0:1699032576.237243:0:61651:0:(lib-msg.c:1051:lnet_is_health_check()) health check = 1, status = 0, hstatus = 0
00000400:00000200:5.0:1699032576.237244:0:61651:0:(lib-msg.c:822:lnet_health_check()) health check: 17@kfi-&amp;gt;0@kfi: GET: OK
00000400:40000000:5.0:1699032576.237247:0:61651:0:(lib-msg.c:979:lnet_msg_detach_md()) md 00000000c76d55f1 msg 00000000c54fcba7
00000400:40000000:5.0:1699032576.237248:0:61651:0:(lib-msg.c:979:lnet_msg_detach_md()) md 00000000c76d55f1 msg 000000003842aa4c
00000400:00000200:5.0:1699032576.237249:0:61651:0:(lib-md.c:68:lnet_md_unlink()) Unlinking md 00000000c76d55f1
00000800:40000000:5.0:1699032576.237251:0:61651:0:(kfilnd_tn.c:1473:kfilnd_tn_free()) KFILND_MSG_BULK_GET_REQ Transaction ID 000000000105cad8: 17@kfi:1 -&amp;gt; 0@kfi(00000000eb641e7b):0x0 Transaction freed

# Upper layer got status 0, corruption detected
00000001:00020000:4.0:1699032576.261312:0:58034:0:(brw_test.c:230:brw_check_page()) Bad data in page 000000008c1819d6: 0xbeefbeefbeefbeef, 0xeeb0eeb1eeb2eeb3 expected
00000001:00020000:4.0:1699032576.261312:0:58034:0:(brw_test.c:266:brw_check_bulk()) Bulk page 000000008c1819d6 (0/256) is corrupted!
00000001:00020000:4.0:1699032576.261313:0:58034:0:(brw_test.c:432:brw_bulk_ready()) Bulk data rpc 0000000007c0e22a from 12345-0@kfi is corrupted!
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</description>
                <environment></environment>
        <key id="78829">LU-17270</key>
            <summary>TN status is lost if TN_EVENT_TAG_RX_OK occurs before TN_EVENT_TX_OK</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="hornc">Chris Horn</assignee>
                                    <reporter username="hornc">Chris Horn</reporter>
                        <labels>
                    </labels>
                <created>Tue, 7 Nov 2023 22:36:55 +0000</created>
                <updated>Tue, 7 Nov 2023 22:48:29 +0000</updated>
                                                                                <due></due>
                            <votes>0</votes>
                                    <watches>2</watches>
                                                                            <comments>
                            <comment id="392120" author="gerrit" created="Tue, 7 Nov 2023 22:48:29 +0000"  >&lt;p&gt;&quot;Chris Horn &amp;lt;chris.horn@hpe.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/53027&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/53027&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-17270&quot; title=&quot;TN status is lost if TN_EVENT_TAG_RX_OK occurs before TN_EVENT_TX_OK&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-17270&quot;&gt;LU-17270&lt;/a&gt; kfilnd: Check status of TAG_RX_OK in WAIT_COMP&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: c10b92f65813eec7712fdb62432dd393323be350&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i040z3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>