<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:59:52 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-6398] LDLM lock creation race condition</title>
                <link>https://jira.whamcloud.com/browse/LU-6398</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;As promised in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6397&quot; title=&quot;LDLM lock creation race condition on new object creation&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6397&quot;&gt;LU-6397&lt;/a&gt;, this ticket describes another race condition between acquiring new LDLM locks made possible by &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1669&quot; title=&quot;lli-&amp;gt;lli_write_mutex (single shared file performance)&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1669&quot;&gt;&lt;del&gt;LU-1669&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;After the kms_valid problem has been fixed for new objects (see &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6397&quot; title=&quot;LDLM lock creation race condition on new object creation&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6397&quot;&gt;LU-6397&lt;/a&gt;), a closely related race condition remains.&lt;/p&gt;

&lt;p&gt;Consider this sequence of events with two processes, P1 and P2:&lt;br/&gt;
P1 makes an IO request (FX, write to the first page of a file)&lt;br/&gt;
P1 creates an LDLM lock request&lt;br/&gt;
P1 calls osc_enqueue_base and ldlm_lock_match to check for existing locks; none found&lt;br/&gt;
P1 waits for a reply from the server&lt;br/&gt;
P2 makes an IO request (FX, read from the second page of the file)&lt;br/&gt;
P2 creates an LDLM lock request&lt;br/&gt;
P2 calls osc_enqueue_base and ldlm_lock_match to check for existing locks; none found&lt;br/&gt;
P2 waits for a reply from the server&lt;br/&gt;
P1 receives its reply; the lock is granted&lt;br/&gt;
(The lock is expanded beyond the requested extent, so it covers the area P2 wants to read)&lt;br/&gt;
P2 receives its reply; its lock is blocked by the lock granted to P1&lt;br/&gt;
The server calls back the lock granted to P1, even though it matches the request from P2&lt;/p&gt;
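&lt;p&gt;The sequence above can be modelled in a few lines. This is a deliberately simplified, hypothetical sketch: the Resource class, extent tuples, and lock_match here only illustrate the queue-visibility problem and are not the real Lustre data structures or API.&lt;/p&gt;

```python
# Hypothetical, simplified model of the race described above; lock_match
# and the queue names are illustrative, not the real Lustre code.

class Resource:
    """A lock resource with the queues a client can match against."""
    def __init__(self):
        self.granted = []   # locks granted by the server
        self.waiting = []   # locks queued on the resource

    def lock_match(self, extent):
        # A lock is only matchable once it is on a queue; a lock that is
        # still waiting for the server's reply is invisible here.
        for lock in self.granted + self.waiting:
            if lock[0] <= extent[0] and extent[1] <= lock[1]:
                return lock
        return None

res = Resource()

# P1: no match found, so it sends an enqueue RPC and waits for the reply.
assert res.lock_match((0, 4096)) is None    # P1's lock is in flight, on no queue

# P2: P1's lock is not on any queue yet, so P2 also finds no match and
# sends its own (ultimately conflicting) enqueue RPC.
assert res.lock_match((4096, 8192)) is None

# P1's reply arrives; the server expanded the lock beyond the requested extent.
res.granted.append((0, 2**64 - 1))

# Only now would P2's request have matched -- too late, its RPC is already
# on the wire, and the server must call back P1's granted lock.
assert res.lock_match((4096, 8192)) is not None
```

The point of the sketch is the window between "lock created" and "lock on a queue": both processes pass ldlm_lock_match during that window, so the client ends up conflicting with itself.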

&lt;p&gt;The problem is this:&lt;br/&gt;
The lock for P1&apos;s IO request is still waiting for a reply from the server, so it is not on any queue and cannot be found by the lock request from P2.&lt;/p&gt;

&lt;p&gt;Locks are currently added to the waiting or granted queue (at which point they can be matched by other lock requests) by the processing policy, which is called from ldlm_lock_enqueue, which is in turn called from ldlm_cli_enqueue_fini.&lt;/p&gt;

&lt;p&gt;ldlm_cli_enqueue_fini is not called until a reply has been received from the server. So while P1 is waiting on a reply from the server, the P2 lock request (which would match the P1 lock if it were available for matching) proceeds.&lt;/p&gt;

&lt;p&gt;This was previously prevented by &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1669&quot; title=&quot;lli-&amp;gt;lli_write_mutex (single shared file performance)&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1669&quot;&gt;&lt;del&gt;LU-1669&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I do not currently see a simple way to resolve this problem. It is made particularly difficult by async locks, such as lock ahead locks, which do not wait for a reply from the server. If we were concerned only with synchronous locks, we could either ignore this or conceivably hold a lock preventing a new lock request from calling into ldlm_lock_match until the other lock was issued. The problem with that idea is that it would prevent other IOs from using other existing locks.&lt;/p&gt;

&lt;p&gt;I have one partial idea:&lt;br/&gt;
Add another lock queue, &quot;lr_waiting_reply&quot; or &quot;lr_created&quot;, to which locks would be added when they are created, at the start of ldlm_cli_enqueue but before the request is sent to the server.&lt;/p&gt;

&lt;p&gt;This would only require blocking lock matching requests for the time it takes to allocate the ptlrpc request &amp;amp; create the lock. I am not sure how we would lock this, though: holding the resource spinlock for that long does not seem advisable.&lt;/p&gt;
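&lt;p&gt;The partial idea above can be sketched the same way. Again this is hypothetical: the queue name and the match-or-wait semantics come from the paragraphs above, not from the Lustre source, and the sketch also shows why the idea is only partial.&lt;/p&gt;

```python
# Sketch of the proposed fix: a new queue ("lr_created" / "lr_waiting_reply")
# that makes a lock visible to matching as soon as it is created, before the
# enqueue RPC is sent.  Hypothetical names and semantics.

class Resource:
    def __init__(self):
        self.granted = []
        self.created = []   # proposed queue: lock created, no reply yet

    def match_or_wait(self, extent):
        """Return ('use', lock) for a granted match, ('wait', lock) for an
        in-flight match, or ('enqueue', None) if nothing fits."""
        for lock in self.granted:
            if lock[0] <= extent[0] and extent[1] <= lock[1]:
                return ('use', lock)
        for lock in self.created:
            if lock[0] <= extent[0] and extent[1] <= lock[1]:
                return ('wait', lock)   # wait for the reply, don't enqueue
        return ('enqueue', None)

res = Resource()

# P1 puts its lock on the created queue before sending the RPC.  Only the
# *requested* extent is visible here; any server-side expansion is unknown
# until the reply arrives, which is why this fix is partial:
p1_lock = (0, 4096)
res.created.append(p1_lock)

# A request inside P1's requested extent can now wait instead of issuing
# a second, self-conflicting enqueue...
assert res.match_or_wait((0, 2048)) == ('wait', p1_lock)

# ...but the second-page read from the ticket still misses, since the
# expansion that would cover it has not happened yet.
assert res.match_or_wait((4096, 8192)) == ('enqueue', None)
```

This also makes the locking question concrete: something must keep the created queue consistent across the allocation of the ptlrpc request, which is the span the ticket says is awkward to cover with the resource spinlock.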

&lt;p&gt;This bug should be fairly low priority: it does not currently cause any actual failures, since valid locks are issued for the various IO requests involved. It&apos;s just inefficient.&lt;/p&gt;

&lt;p&gt;For asynchronous locks such as the proposed lock ahead locks, this problem is perhaps somewhat worse, since such a lock can conceivably be cancelled by a normal IO request that was intended to fit into it. Still, if there are multiple lock ahead requests on the server, the lock requested for the IO request will not be expanded, and as a result the server will only cancel one of the lock ahead locks.&lt;/p&gt;</description>
                <environment></environment>
        <key id="29227">LU-6398</key>
            <summary>LDLM lock creation race condition</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="2">Won&apos;t Fix</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="paf">Patrick Farrell</reporter>
                        <labels>
                    </labels>
                <created>Tue, 24 Mar 2015 22:39:23 +0000</created>
                <updated>Mon, 31 Jan 2022 16:16:33 +0000</updated>
                            <resolved>Mon, 31 Jan 2022 16:16:33 +0000</resolved>
                                                                        <due></due>
                            <votes>0</votes>
                                    <watches>2</watches>
                                                                            <comments>
                            <comment id="324482" author="adilger" created="Sun, 30 Jan 2022 10:48:11 +0000"  >&lt;p&gt;Patrick, is this still an issue?&lt;/p&gt;</comment>
                            <comment id="324597" author="paf0186" created="Mon, 31 Jan 2022 16:16:33 +0000"  >&lt;p&gt;Yes, but it&apos;s pretty minor and I think not worth fixing. Basically, it describes a situation where a client has a pending lock request (pending but not yet in the waiting queue, i.e. no reply from the server yet), then another thread on the client generates a second lock request which &lt;b&gt;could&lt;/b&gt; use that lock but does not match it because it is not granted yet. The second lock request is actually sent (when it could have just waited for the first lock to become available), and the client ends up self-conflicting and cancelling the first lock.&lt;/p&gt;

&lt;p&gt;But that all works just fine - no deadlock or other issue.&lt;/p&gt;

&lt;p&gt;Anyway, this is A) rare (there&apos;s provision in the cl/OSC locking that mostly catches this), and B) harmless - just a &lt;b&gt;very&lt;/b&gt; slight and probably transient performance cost (ie, after the first two compatible requests have fought it out, an LDLM lock is held in the normal way without further issue, so there&apos;s no ongoing cost paid).&lt;/p&gt;

&lt;p&gt;Also, this is technically possible for normal I/O, but really it&apos;s something I saw with lockahead when I&apos;d have lockahead start an async request, then the thread would go on to do I/O right away and would send a new request before the first one had received a reply.&#160; This was solved in practice by creating a way for userspace to check if a lock existed (via the lockahead/ladvise interface), so the lockahead library could spool off a bunch of async requests all at once, but could then wait to start I/O until at least the first lock had been granted.&lt;/p&gt;

&lt;p&gt;I&apos;m going to close this one out.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzx9a7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>