<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:34:12 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-3471] &quot;client_obd_lock_t cl_loi_list_lock&quot; in struct client_obd should not be a spin lock (b1_8)</title>
                <link>https://jira.whamcloud.com/browse/LU-3471</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;client_obd_lock_t cl_loi_list_lock of struct client_obd is used to protect async page operations which are not guaranteed to be non-blocking, even on Linux; therefore&lt;br/&gt;
a spinlock (used for the Linux implementation of cl_loi_list_lock) is not appropriate.&lt;/p&gt;

&lt;p&gt;For example, in the call chain:&lt;br/&gt;
osc_check_rpcs() -&amp;gt; osc_send_oap_rpc() -&amp;gt; ptlrpcd_add_req():&lt;/p&gt;

&lt;p&gt;osc_check_rpcs() is called with the cli-&amp;gt;cl_loi_list_lock spinlock held, and&lt;br/&gt;
ptlrpcd_add_req() may wait with a timeout.&lt;/p&gt;

&lt;p&gt;In &lt;a href=&quot;http://jira-nss.xy01.xyratex.com:8080/browse/MRP-1053&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://jira-nss.xy01.xyratex.com:8080/browse/MRP-1053&lt;/a&gt; a hang was discovered, caused by scheduling from a process holding the cl_loi_list_lock.&lt;br/&gt;
The corresponding core dump has already been removed, and I do not remember exactly how it hung that time.&lt;/p&gt;

&lt;p&gt;In newer kernels, e.g. 3.0.42, the following call chain may block:&lt;br/&gt;
osc_check_rpcs() -&amp;gt; osc_send_oap_rpc() -&amp;gt; ll_ap_make_ready() -&amp;gt; clear_page_dirty_for_io() -&amp;gt; page_mkclean() -&amp;gt; page_mkclean_file()&lt;/p&gt;

&lt;p&gt;page_mkclean_file() locks a mutex:&lt;br/&gt;
mutex_lock(&amp;amp;mapping-&amp;gt;i_mmap_mutex);&lt;/p&gt;

&lt;p&gt;See &lt;a href=&quot;http://jira-nss.xy01.xyratex.com:8080/browse/LELUS-116&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://jira-nss.xy01.xyratex.com:8080/browse/LELUS-116&lt;/a&gt; for more details.&lt;/p&gt;
</description>
                <environment></environment>
        <key id="19431">LU-3471</key>
            <summary>&quot;client_obd_lock_t cl_loi_list_lock&quot; in struct client_obd should not be a spin lock (b1_8)</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="5">Cannot Reproduce</resolution>
                                        <assignee username="keith">Keith Mannthey</assignee>
                                    <reporter username="vsaveliev">Vladimir Saveliev</reporter>
                        <labels>
                            <label>patch</label>
                    </labels>
                <created>Fri, 14 Jun 2013 09:33:35 +0000</created>
                <updated>Fri, 25 Apr 2014 18:26:28 +0000</updated>
                            <resolved>Fri, 25 Apr 2014 18:26:28 +0000</resolved>
                                                                        <due></due>
                            <votes>0</votes>
                                    <watches>2</watches>
                                                                            <comments>
                            <comment id="60630" author="vsaveliev" created="Fri, 14 Jun 2013 09:44:29 +0000"  >&lt;p&gt;Please take a look at the patch:&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/6646&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/6646&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="60681" author="keith" created="Fri, 14 Jun 2013 17:04:24 +0000"  >&lt;p&gt;When I click on &lt;a href=&quot;http://jira-nss.xy01.xyratex.com:8080/browse/LELUS-116&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://jira-nss.xy01.xyratex.com:8080/browse/LELUS-116&lt;/a&gt; I get a Server not found error.  I get the same thing for the MRP link. &lt;/p&gt;</comment>
                            <comment id="60686" author="keith" created="Fri, 14 Jun 2013 17:32:51 +0000"  >&lt;p&gt;This seems to be a 1.8 functional improvement. I don&apos;t know if many improvements like this have been taken into the tree in a while. &lt;/p&gt;

&lt;p&gt;What version of the 1.8 tree did the initial problem hit with?&lt;/p&gt;

&lt;p&gt;Is this issue still relevant for Master? &lt;/p&gt;</comment>
                            <comment id="60736" author="vsaveliev" created="Sat, 15 Jun 2013 09:28:14 +0000"  >&lt;p&gt;&amp;gt; When I click on &lt;a href=&quot;http://jira-nss.xy01.xyratex.com:8080/browse/LELUS-116&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://jira-nss.xy01.xyratex.com:8080/browse/LELUS-116&lt;/a&gt; I get a Server not found error. I get the same thing for the MRP link&lt;/p&gt;

&lt;p&gt;Ok&lt;/p&gt;

&lt;p&gt;&amp;gt; This seems to be a 1.8 functional improvement.&lt;/p&gt;

&lt;p&gt;This is a fix for reproducible lockups.&lt;/p&gt;

&lt;p&gt;&amp;gt; What version of the 1.8 tree did the initial problem hit with?&lt;/p&gt;

&lt;p&gt;LELUS-116 reports the failure on 2.2.&lt;br/&gt;
MRP-1053 is about this bug hit on Oracle&apos;s 1.8.&lt;/p&gt;

&lt;p&gt;&amp;gt; Is this issue still relevant for Master?&lt;/p&gt;

&lt;p&gt;2.4 does not have this problem, since the new IO engine (&lt;a href=&quot;https://jira.hpdd.intel.com/browse/LU-1030&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://jira.hpdd.intel.com/browse/LU-1030&lt;/a&gt;) was introduced.&lt;/p&gt;</comment>
                            <comment id="60753" author="keith" created="Mon, 17 Jun 2013 15:06:20 +0000"  >&lt;p&gt;Can you provide some more details about the tickets you reference?&lt;/p&gt;

&lt;p&gt;Can you confirm the code version in MRP-1053?  &lt;/p&gt;

&lt;p&gt;How can the lockup be reproduced?  Is there a test for this issue?&lt;/p&gt;</comment>
                            <comment id="61817" author="vsaveliev" created="Thu, 4 Jul 2013 11:58:42 +0000"  >&lt;p&gt;&amp;gt; Can you provide some more details about the tickets you reference?&lt;/p&gt;

&lt;p&gt;These tickets are about lockups, discovered with the help of crash(8) and core dumps, where a process blocks and gets rescheduled while holding a spinlock.&lt;/p&gt;

&lt;p&gt;&amp;gt; Can you confirm the code version in MRP-1053?&lt;/p&gt;

&lt;p&gt;MRP-1053 is about Oracle&apos;s b1_8.&lt;/p&gt;

&lt;p&gt;&amp;gt; How can the lockup be reproduced? Is there a test for this issue?&lt;/p&gt;

&lt;p&gt;We reproduced the lockup with Oracle&apos;s b1_8 in order to recall the exact traces that deadlocked.&lt;/p&gt;

&lt;p&gt;It turned out that the patch from &lt;a href=&quot;https://projectlava.xyratex.com/show_bug.cgi?id=21812&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://projectlava.xyratex.com/show_bug.cgi?id=21812&lt;/a&gt; is responsible for that particular lockup.&lt;br/&gt;
As long as Intel&apos;s 1.8 does not include that patch, you will not be able to reproduce it.&lt;/p&gt;

&lt;p&gt;You can probably close the bug.&lt;/p&gt;

&lt;p&gt;But please review the example in the description. It describes a call chain where a process may reschedule while holding the spinlock.&lt;/p&gt;

&lt;p&gt;Also, scheduling may become possible due to changes coming to Linux.&lt;br/&gt;
For example, in linux-3.0.42, page_mkclean_file() (which is called with the spinlock held) may block.&lt;/p&gt;
</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvtdb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>8699</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>