<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:12:23 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-14741] Close RPC might get stuck behind normal RPCs waiting for slot</title>
                <link>https://jira.whamcloud.com/browse/LU-14741</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;It looks like obd_get_mod_rpc_slot places all RPCs waiting for a slot into a single exclusive waitq:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;               wait_event_idle_exclusive(cli-&amp;gt;cl_mod_rpcs_waitq,
                                          obd_mod_rpc_slot_avail(cli,
                                                                 close_req));
        } while (true);
}
EXPORT_SYMBOL(obd_get_mod_rpc_slot);&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;The problem is that CLOSE RPCs have a higher chance of being sent. So if a CLOSE RPC completes and frees a slot, only the one waiter at the head of the waitq is woken up. If that waiter happens to be a non-close RPC, it goes back to sleep, and nothing wakes up the close RPC waiting somewhere further down the list.&lt;/p&gt;

&lt;p&gt;Normally this is not much of a visible problem, because the hope is that eventually a normal RPC or a few will complete and the close RPC will get its turn. But sometimes the entire available queue is plugged with requests waiting on, say, an open lock that needs the close to finish first, and if the close is stuck down the list, we have a deadlock. This seems to be especially common with NFS servers, but could also manifest in master now that opencache is on by default.&lt;/p&gt;

&lt;p&gt;We should either have separate waitqs for close and non-close RPCs, or do wake_up_all() for completed CLOSE RPCs.&lt;/p&gt;</description>
                <environment></environment>
        <key id="64569">LU-14741</key>
            <summary>Close RPC might get stuck behind normal RPCs waiting for slot</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="green">Oleg Drokin</assignee>
                                    <reporter username="green">Oleg Drokin</reporter>
                        <labels>
                    </labels>
                <created>Mon, 7 Jun 2021 19:08:30 +0000</created>
                <updated>Sat, 19 Nov 2022 16:23:03 +0000</updated>
                            <resolved>Sat, 19 Nov 2022 16:19:48 +0000</resolved>
                                    <version>Lustre 2.12.6</version>
                    <version>Lustre 2.12.7</version>
                    <version>Lustre 2.15.0</version>
                                    <fixVersion>Lustre 2.15.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>5</watches>
                                                                            <comments>
                            <comment id="303767" author="gerrit" created="Mon, 7 Jun 2021 19:22:06 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/43941&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/43941&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-14741&quot; title=&quot;Close RPC might get stuck behind normal RPCs waiting for slot&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-14741&quot;&gt;&lt;del&gt;LU-14741&lt;/del&gt;&lt;/a&gt; obdclass: Wake up entire queue of requests on close completion&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 5f3a9a7f292c3c46ac6b249db7066d5826559c55&lt;/p&gt;</comment>
                            <comment id="305847" author="gerrit" created="Wed, 30 Jun 2021 03:16:18 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/43941/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/43941/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-14741&quot; title=&quot;Close RPC might get stuck behind normal RPCs waiting for slot&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-14741&quot;&gt;&lt;del&gt;LU-14741&lt;/del&gt;&lt;/a&gt; obdclass: Wake up entire queue of requests on close completion&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: a4e1567d67559b797a5c24ee0bfbca4a52649c47&lt;/p&gt;</comment>
                            <comment id="320843" author="gerrit" created="Tue, 14 Dec 2021 13:47:03 +0000"  >&lt;p&gt;&quot;Etienne AUJAMES &amp;lt;eaujames@ddn.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/45850&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/45850&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-14741&quot; title=&quot;Close RPC might get stuck behind normal RPCs waiting for slot&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-14741&quot;&gt;&lt;del&gt;LU-14741&lt;/del&gt;&lt;/a&gt; obdclass: Wake up entire queue of requests on close completion&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 963c6e18113fc7e044b83ae0feedb68894dbe073&lt;/p&gt;</comment>
                            <comment id="353610" author="pjones" created="Sat, 19 Nov 2022 16:19:48 +0000"  >&lt;p&gt;IIUC this fix was landed to master for 2.15.0&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                        <issuelink>
            <issuekey id="70643">LU-15915</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="70643">LU-15915</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i01wen:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>