<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:33:23 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-17190] Client-side high priority I/O handling under lock blocking AST</title>
                <link>https://jira.whamcloud.com/browse/LU-17190</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;We found a deadlock caused by parallel DIO:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
T1: writer
Obtain DLM extent lock: L1=PW[0, EOF]
T2: DIO reader: 50M data, iosize=64M, max_pages_per_rpc=1024 (4M) max_rpcs_in_flight=8
ll_direct_IO_impl()
use all available RPC slots: number of read RPCs in flight is 9
on the server side:
-&amp;gt;tgt_brw_read()
-&amp;gt;tgt_brw_lock() # server side locking
-&amp;gt; Try to cancel the conflict locks on client: L1=PW[0, EOF]
T3: reader
take DLM lock ref on L1=PW[0, EOF]
Read-ahead pages (prepare pages);
wait &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; RPC slots to send the read RPCs to OST
deadlock: T2-&amp;gt;T3: T2 is waiting &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; T3 to release DLM extent lock L1;
T3-&amp;gt;T2: T3 is waiting &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; T2 to finish and free RPC slots...
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;&#160;&lt;/p&gt;

&lt;p&gt;To solve this problem, we propose client-side high-priority handling for I/O whose protecting extent lock is under a blocking AST.&lt;/p&gt;

&lt;p&gt;It is implemented as follows:&lt;/p&gt;

&lt;p&gt;When the client receives a lock blocking AST and the lock is in use (its reader and writer counts are not both zero), it checks whether any I/O extents (osc_extent) protected by this lock are still outstanding (i.e. waiting for an RPC slot). Such read/write I/Os are marked as high priority and put on the HP list. Thus the client will force the HP I/Os to be sent even when all available RPC slots are used up.&lt;/p&gt;

&lt;p&gt;This also makes the I/O engine in the OSC layer more efficient. For normal urgent I/O, the client iterates over the object list and sends the I/Os one by one, and the in-flight I/O count cannot exceed the maximum RPCs in flight.&lt;/p&gt;

&lt;p&gt;The high-priority I/Os are put on the client&apos;s HP list and are handled more quickly.&lt;/p&gt;

&lt;p&gt;This avoids the possible deadlock caused by parallel DIO and responds to the lock blocking AST more quickly.&lt;/p&gt;</description>
                <environment></environment>
        <key id="78372">LU-17190</key>
            <summary>Client-side high priority I/O handling under lock blocking AST</summary>
                <type id="2" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11311&amp;avatarType=issuetype">New Feature</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="qian_wc">Qian Yingjin</assignee>
                                    <reporter username="qian_wc">Qian Yingjin</reporter>
                        <labels>
                    </labels>
                <created>Fri, 13 Oct 2023 02:37:34 +0000</created>
                <updated>Fri, 10 Nov 2023 03:22:47 +0000</updated>
                                                                                <due></due>
                            <votes>0</votes>
                                    <watches>7</watches>
                                                                            <comments>
                            <comment id="389443" author="gerrit" created="Mon, 16 Oct 2023 15:02:43 +0000"  >&lt;p&gt;&quot;Qian Yingjin &amp;lt;qian@ddn.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/52711&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/52711&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-17190&quot; title=&quot;Client-side high priority I/O handling under lock blocking AST&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-17190&quot;&gt;LU-17190&lt;/a&gt; osc: force I/O when all RPC slots used out by DIO&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 612a050c6386a18306d627a095e3ad29d1455177&lt;/p&gt;</comment>
                            <comment id="389460" author="adilger" created="Mon, 16 Oct 2023 15:56:55 +0000"  >&lt;p&gt;There may be a couple of other ways to fix this issue:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;don&apos;t allow sending lockless DIO RPCs when the client is already holding a DLM lock on that object.&lt;/li&gt;
	&lt;li&gt;similar to the previous option, but have the lockless DIO RPC block on the local DLM lock while it is in-use by another thread and then &quot;pre-cancel&quot; the DLM lock with the RPCs under the blocked lock. This is not great, because it may block the new IO for a long time.&lt;/li&gt;
	&lt;li&gt;have the &quot;lockless DIO&quot; requests still send a DLM lock handle in the OSS RPC, so that the server doesn&apos;t try to get another lock. This might make Patrick unhappy, since it will add a DLM lock lookup to every DIO syscall, but it allows the &quot;lockless DIO&quot; and regular BIO to be submitted and processed at the same time.&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="389806" author="qian_wc" created="Wed, 18 Oct 2023 16:28:39 +0000"  >&lt;p&gt;Hi Andreas,&lt;/p&gt;

&lt;p&gt;&#160;&lt;/p&gt;

&lt;p&gt;I have implemented two solutions for this bug:&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;&lt;a href=&quot;https://review.whamcloud.com/#/c/ex/lustre-release/+/52682/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/#/c/ex/lustre-release/+/52682/&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;https://review.whamcloud.com/c/ex/lustre-release/+/52747&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/ex/lustre-release/+/52747&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;The first patch puts all acquired DLM extent locks in a list. After submitting all read-ahead I/Os, the client releases the acquired DLM extent locks. This way, when a lock blocking AST arrives, all reading extents have already been submitted and put into the list @oo_reading_exts of the OSC object. In the blocking AST, the client can then check this list to find the conflicting outstanding extents when all I/O RPC slots are used up by direct I/O.&lt;/p&gt;

&lt;p&gt;Otherwise, if we use the original way (match a DLM lock, add the read-ahead page to the queue list, and release the matched lock; repeat this for each read-ahead page and finally submit all I/Os via osc_io_submit), the conflicting lockdone extents may be added to the list @oo_reading_exts after the check in the blocking AST. On the client, the blocking AST for the server-side locking for DIO will then try to lock the pages in these lockdone extents (all pages in lockdone extents are PG_locked, and such an extent may be waiting for RPC slots while all RPC slots are used up by DIO). This may cause a deadlock.&lt;/p&gt;

&lt;p&gt;&#160;&lt;/p&gt;

&lt;p&gt;The second patch gives another solution: for each matched read-ahead DLM extent lock, tag the last read-ahead page (osc_page) and increase the tagged count on the OSC object. After submitting all I/Os and adding the lockdone extents for the read-ahead pages to the list @oo_reading_exts, decrease the tagged count. Once the tagged count becomes zero, wake up the waiters. In the lock blocking AST, we first wait until the tagged count becomes zero and then check the @oo_reading_exts list to avoid the deadlock.&lt;/p&gt;

&lt;p&gt;&#160;&lt;/p&gt;

&lt;p&gt;The first patch has passed the customized Maloo test (test_99b).&lt;/p&gt;

&lt;p&gt;The second patch has passed my local testing (I think it can achieve the same effect as solution 1).&lt;/p&gt;

&lt;p&gt;&#160;&lt;/p&gt;

&lt;p&gt;Could you please review these two solutions and advise which one is better?&lt;/p&gt;

&lt;p&gt;&#160;&lt;/p&gt;

&lt;p&gt;(BTW, test_99a sometimes fails due to a PCC mmap problem; I will fix it later.)&lt;/p&gt;

&lt;p&gt;&#160;&lt;/p&gt;

&lt;p&gt;Regards,&lt;/p&gt;

&lt;p&gt;Qian&lt;/p&gt;</comment>
                            <comment id="391060" author="paf0186" created="Mon, 30 Oct 2023 16:11:51 +0000"  >&lt;p&gt;I definitely prefer the first approach - I think readahead locking needs to be much more &apos;normal&apos;, taking and holding dlmlocks in a more regular fashion.&#160; So I strongly prefer the first option.&#160; The second one would work but it feels &apos;clever&apos; rather than &apos;right&apos;.&lt;/p&gt;

&lt;p&gt;Yingjin, does that answer your questions?&#160; I know you had some stuff in Gerrit as well, are there other issues to consider?&lt;/p&gt;</comment>
                            <comment id="391127" author="qian_wc" created="Tue, 31 Oct 2023 01:29:41 +0000"  >&lt;p&gt;Yes, I will refine the patch of the first solution. Thanks!&lt;/p&gt;</comment>
                            <comment id="392589" author="gerrit" created="Fri, 10 Nov 2023 03:20:53 +0000"  >&lt;p&gt;&quot;Qian Yingjin &amp;lt;qian@ddn.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/53065&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/53065&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-17190&quot; title=&quot;Client-side high priority I/O handling under lock blocking AST&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-17190&quot;&gt;LU-17190&lt;/a&gt; test: parallel DIO should not cause deadlock&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: a14b0d9b1e7e4ea42957799d1d5596fe94ce3c4f&lt;/p&gt;</comment>
                            <comment id="392590" author="qian_wc" created="Fri, 10 Nov 2023 03:22:47 +0000"  >&lt;p&gt;The test scripts sanity/test_441 (&lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/53065&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/53065&lt;/a&gt;) can reproduce the deadlock problem easily on master branch without PCC-RO locally.&lt;/p&gt;

&lt;p&gt;So this is a general deadlock bug in both b_es6_0 and master.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i03y8v:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>