<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:47:21 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-11836] DOM read-open resend vs getattr deadlock</title>
                <link>https://jira.whamcloud.com/browse/LU-11836</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;DOM read-on-open may cause a resend when the reply buffer is larger than the client buffer. That is OK in general: the client just re-allocates the buffer and resends the request. The problem occurs when a new request on the same file, e.g. getattr, arrives between the first reply and the resend.&lt;br/&gt;
The whole scenario in that case:&lt;br/&gt;
1. OPEN takes the PARENT WRITE lock and a new CHILD PR/PW lock&lt;br/&gt;
2. The CHILD lock on the server gets the PARENT handle from the client as its remote handle (resource change)&lt;br/&gt;
3. Due to the resend condition in reply_in_callback(), the client did not finish that resource replacement, so its lock handle still refers to the PARENT lock, while on the server it is the CHILD one&lt;br/&gt;
4. Getattr on the server locks the CHILD and causes a BL AST to the PR/PW lock from OPEN&lt;br/&gt;
5. The client gets the BL AST, but the lock handle refers to the PARENT lock, so the CHILD lock on the server will never receive a cancel from that BL AST&lt;br/&gt;
6. Meanwhile the OPEN resend arrives on the server and tries to get the WRITE lock on PARENT, but it is blocked by the getattr process waiting for the CHILD cancel. The OPEN resend therefore waits on the PARENT lock and cannot complete OPEN to send the reply with the blocked CHILD lock. Deadlock.&lt;/p&gt;

&lt;p&gt;That specific combination exists only with DOM files (PR/PW modes cause conflicts with getattr) and only with the read-on-open feature, because it produces a resend without a reconnect.&lt;/p&gt;</description>
                <environment></environment>
        <key id="54439">LU-11836</key>
            <summary>DOM read-open resend vs getattr deadlock</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="tappro">Mikhail Pershin</assignee>
                                    <reporter username="tappro">Mikhail Pershin</reporter>
                        <labels>
                            <label>DoM2</label>
                    </labels>
                <created>Sun, 6 Jan 2019 21:13:10 +0000</created>
                <updated>Wed, 11 Sep 2019 14:25:39 +0000</updated>
                            <resolved>Wed, 11 Sep 2019 14:25:39 +0000</resolved>
                                                    <fixVersion>Lustre 2.13.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>5</watches>
                                                                            <comments>
                            <comment id="239440" author="tappro" created="Sun, 6 Jan 2019 21:15:01 +0000"  >&lt;p&gt;This issue happens from time to time in racer.sh with DOM files. I have a reproducer for that scenario and am working on a patch.&lt;/p&gt;</comment>
                            <comment id="240416" author="gerrit" created="Sun, 20 Jan 2019 17:57:54 +0000"  >&lt;p&gt;Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/34072&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/34072&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11836&quot; title=&quot;DOM read-open resend vs getattr deadlock&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11836&quot;&gt;&lt;del&gt;LU-11836&lt;/del&gt;&lt;/a&gt; ldlm: fix enqueue reply vs bl_ast race&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: e52819e9792f865391b280d1c6b2f862823d91e7&lt;/p&gt;</comment>
                            <comment id="242052" author="gerrit" created="Fri, 15 Feb 2019 09:21:23 +0000"  >&lt;p&gt;Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/34264&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/34264&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11836&quot; title=&quot;DOM read-open resend vs getattr deadlock&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11836&quot;&gt;&lt;del&gt;LU-11836&lt;/del&gt;&lt;/a&gt; ldlm: don&apos;t convert wrong resource&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 599247c27bc6f6351d7f2c4c1313e0686635893d&lt;/p&gt;</comment>
                            <comment id="242054" author="tappro" created="Fri, 15 Feb 2019 09:26:10 +0000"  >&lt;p&gt;This issue should be resolved with proper open resend/reconstruct handling. As noted by Vitaly, it is just not right to take the parent lock on the server again while we already hold the child lock, because that causes reverse lock ordering. Meanwhile, this intersects with &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11952&quot; title=&quot;open+create resend can recreate a file after unlink&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11952&quot;&gt;&lt;del&gt;LU-11952&lt;/del&gt;&lt;/a&gt;, which also requires similar fixes in OPEN reconstruct.&lt;/p&gt;</comment>
                            <comment id="244037" author="gerrit" created="Fri, 15 Mar 2019 23:46:00 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/34264/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/34264/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11836&quot; title=&quot;DOM read-open resend vs getattr deadlock&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11836&quot;&gt;&lt;del&gt;LU-11836&lt;/del&gt;&lt;/a&gt; ldlm: don&apos;t convert wrong resource&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 2bc71659db69335ba1c93dab44dc733dc0849d0c&lt;/p&gt;</comment>
                            <comment id="244047" author="pjones" created="Sat, 16 Mar 2019 00:25:10 +0000"  >&lt;p&gt;Landed for 2.13&lt;/p&gt;</comment>
                            <comment id="244061" author="tappro" created="Sat, 16 Mar 2019 06:21:56 +0000"  >&lt;p&gt;Re-open ticket, there are still things to resolve&lt;/p&gt;</comment>
                            <comment id="254523" author="pjones" created="Wed, 11 Sep 2019 14:25:39 +0000"  >&lt;p&gt;It looks like the remaining work would be landed under &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11952&quot; title=&quot;open+create resend can recreate a file after unlink&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11952&quot;&gt;&lt;del&gt;LU-11952&lt;/del&gt;&lt;/a&gt;. If a separate ticket is needed please open one and link to this ticket - thanks&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="54844">LU-11952</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i008x3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>