<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:19:52 UTC 2024

-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-15618] ksock_conn ref leak on shutdown</title>
                <link>https://jira.whamcloud.com/browse/LU-15618</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;This is a bug with:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;commit 47b7b319783f27023b0cefe54a2a2eea678284f2
Author: Doug Oucharek &amp;lt;doug.s.oucharek@intel.com&amp;gt;
Date:   Wed Mar 2 12:08:00 2016 +0800

    LU-8106 lnet: Do not drop message when shutting down LNet
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;That change makes it so that if we fail to look up the peer NI with ESHUTDOWN, lnet_parse() returns 0 instead of dropping the message. As a result, ksocknal_process_receive() isn&apos;t aware that anything went wrong in lnet_parse(), and the extra ref taken before parsing is left on the ksock_conn.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;                ksocknal_conn_addref(conn);     /* ++ref while parsing */


                rc = lnet_parse(conn-&amp;gt;ksnc_peer-&amp;gt;ksnp_ni,
                                &amp;amp;hdr,
                                &amp;amp;conn-&amp;gt;ksnc_peer-&amp;gt;ksnp_id.nid,
                                conn, 0);
                if (rc &amp;lt; 0) {
                        /* I just received garbage: give up on this conn */
                        ksocknal_new_packet(conn, 0);
                        ksocknal_close_conn_and_siblings(conn, rc);
                        CDEBUG(D_NET, &quot;pre %p %u\n&quot;, conn,
                               refcount_read(&amp;amp;conn-&amp;gt;ksnc_conn_refcount));
                        ksocknal_conn_decref(conn);
                        return (-EPROTO);
                }
                &amp;lt;&amp;lt;&amp;lt;&amp;lt; REF LEAKED &amp;gt;&amp;gt;&amp;gt;&amp;gt;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This prevents ksocklnd from shutting down, because it gets stuck waiting forever for the associated peer_ni to be destroyed. One symptom is a message like the following repeating in the console log:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000800:00000200:0.0:1646425022.404025:0:6589:0:(socklnd.c:2369:ksocknal_shutdown()) waiting for 2 peers to disconnect
00000800:00000200:0.0:1646425023.404594:0:6589:0:(socklnd.c:2369:ksocknal_shutdown()) waiting for 2 peers to disconnect
00000800:00000200:0.0:1646425024.405196:0:6589:0:(socklnd.c:2369:ksocknal_shutdown()) waiting for 2 peers to disconnect
00000800:00000200:0.0:1646425025.405475:0:6589:0:(socklnd.c:2369:ksocknal_shutdown()) waiting for 2 peers to disconnect
00000800:00000200:0.0:1646425026.406188:0:6589:0:(socklnd.c:2369:ksocknal_shutdown()) waiting for 2 peers to disconnect
00000800:00000200:0.0:1646425027.407158:0:6589:0:(socklnd.c:2369:ksocknal_shutdown()) waiting for 2 peers to disconnect
00000800:00000200:0.0:1646425028.407259:0:6589:0:(socklnd.c:2369:ksocknal_shutdown()) waiting for 2 peers to disconnect
00000800:00000200:0.0:1646425029.407153:0:6589:0:(socklnd.c:2369:ksocknal_shutdown()) waiting for 2 peers to disconnect
00000800:00000200:0.0:1646425030.407047:0:6589:0:(socklnd.c:2369:ksocknal_shutdown()) waiting for 2 peers to disconnect
00000800:00000200:0.0:1646425031.407210:0:6589:0:(socklnd.c:2369:ksocknal_shutdown()) waiting for 2 peers to disconnect
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It seems more appropriate for lnet_parse() to return &amp;lt;0 in this case, signaling to LNDs that the parse failed. They can then handle it the same way as EPROTO and similar errors.&lt;/p&gt;</description>
                <environment></environment>
        <key id="68971">LU-15618</key>
            <summary>ksock_conn ref leak on shutdown</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="hornc">Chris Horn</assignee>
                                    <reporter username="hornc">Chris Horn</reporter>
                        <labels>
                    </labels>
                <created>Fri, 4 Mar 2022 21:03:08 +0000</created>
                <updated>Mon, 28 Nov 2022 19:13:47 +0000</updated>
                            <resolved>Sat, 11 Jun 2022 15:47:17 +0000</resolved>
                                                    <fixVersion>Lustre 2.16.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>4</watches>
                                                                            <comments>
                            <comment id="328155" author="gerrit" created="Fri, 4 Mar 2022 21:33:08 +0000"  >&lt;p&gt;&quot;Chris Horn &amp;lt;chris.horn@hpe.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/46711&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/46711&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15618&quot; title=&quot;ksock_conn ref leak on shutdown&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15618&quot;&gt;&lt;del&gt;LU-15618&lt;/del&gt;&lt;/a&gt; lnet: Return ESHUTDOWN in lnet_parse()&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 88092ffe31af313abefd70df88d419a8f0651954&lt;/p&gt;</comment>
                            <comment id="328264" author="cbordage" created="Mon, 7 Mar 2022 14:08:39 +0000"  >&lt;p&gt;Is this only for ksock?&lt;/p&gt;</comment>
                            <comment id="328285" author="hornc" created="Mon, 7 Mar 2022 15:53:24 +0000"  >&lt;p&gt;The change impacts all LNDs, but I haven&apos;t observed or figured out whether this bug causes a problem with other LNDs.&lt;/p&gt;</comment>
                            <comment id="328332" author="hornc" created="Mon, 7 Mar 2022 21:40:28 +0000"  >&lt;p&gt;If I run the reproducer with o2iblnd I see a similar symptom, but I haven&apos;t root-caused it to the same bug.&lt;/p&gt;</comment>
                            <comment id="328988" author="hornc" created="Fri, 11 Mar 2022 20:51:04 +0000"  >&lt;p&gt;If I run the reproducer on o2iblnd with the fix applied, I can&apos;t reproduce the symptoms. So it seems like the fix applies to ko2iblnd as well.&lt;/p&gt;</comment>
                            <comment id="336831" author="adilger" created="Mon, 6 Jun 2022 15:24:23 +0000"  >&lt;p&gt;Chris, any idea why this started failing recently, even though the patch is 6 years old?&lt;/p&gt;</comment>
                            <comment id="337427" author="gerrit" created="Sat, 11 Jun 2022 05:41:09 +0000"  >&lt;p&gt;&quot;Oleg Drokin &amp;lt;green@whamcloud.com&amp;gt;&quot; merged in patch &lt;a href=&quot;https://review.whamcloud.com/46711/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/46711/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15618&quot; title=&quot;ksock_conn ref leak on shutdown&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15618&quot;&gt;&lt;del&gt;LU-15618&lt;/del&gt;&lt;/a&gt; lnet: Return ESHUTDOWN in lnet_parse()&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 4fbd0705a3d25bbc85e953f81e697e5006b215ce&lt;/p&gt;</comment>
                            <comment id="337520" author="pjones" created="Sat, 11 Jun 2022 15:47:17 +0000"  >&lt;p&gt;Landed for 2.16&lt;/p&gt;</comment>
                            <comment id="354374" author="gerrit" created="Mon, 28 Nov 2022 19:13:47 +0000"  >&lt;p&gt;&quot;Olaf Faaland &amp;lt;faaland1@llnl.gov&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/49259&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/49259&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15618&quot; title=&quot;ksock_conn ref leak on shutdown&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15618&quot;&gt;&lt;del&gt;LU-15618&lt;/del&gt;&lt;/a&gt; lnet: Return ESHUTDOWN in lnet_parse()&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_15&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 28788d5e91eae26dc562f085f8577fd8c2813718&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                        <issuelink>
            <issuekey id="55330">LU-12148</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="58026">LU-13218</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="36684">LU-8106</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i02k0n:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>