<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:59:52 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-6397] LDLM lock creation race condition on new object creation</title>
                <link>https://jira.whamcloud.com/browse/LU-6397</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1669&quot; title=&quot;lli-&amp;gt;lli_write_mutex (single shared file performance)&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1669&quot;&gt;&lt;del&gt;LU-1669&lt;/del&gt;&lt;/a&gt; has made possible a pair of race conditions between acquiring new LDLM locks.&lt;br/&gt;
This is the first of the two; I will report the other shortly.&lt;/p&gt;

&lt;p&gt;This applies to newly created objects, for which kms_valid is not yet set.&lt;/p&gt;

&lt;p&gt;If kms_valid is not set, osc_enqueue_base will not attempt to match existing ldlm locks:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;         /*
          * kms is not valid when either object is completely fresh (so that no
          * locks are cached), or object was evicted. In the latter &lt;span class=&quot;code-keyword&quot;&gt;case&lt;/span&gt; cached
          * lock cannot be used, because it would prime inode state with
          * potentially stale LVB.
          */
         &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (!kms_valid)
                 &lt;span class=&quot;code-keyword&quot;&gt;goto&lt;/span&gt; no_match;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;kms_valid is read from osc-&amp;gt;oo_oinfo-&amp;gt;loi_kms_valid in the osc_object.&lt;br/&gt;
kms_valid is set by loi_kms_set, which is called from osc_attr_update as part of cl_object_attr_update. That in turn is called from osc_lock_lvb_update, which is called from osc_enqueue_fini.&lt;br/&gt;
osc_enqueue_fini is not called until a reply has been received from the server, either in osc_enqueue_base (regular locks) or osc_enqueue_interpret (async locks).&lt;/p&gt;

&lt;p&gt;This results in a race when two IO requests are in flight at the same time.&lt;br/&gt;
Consider:&lt;/p&gt;

&lt;p&gt;P1 makes an IO request (e.g., a write to the first page of the file)&lt;br/&gt;
P1 creates an LDLM lock request&lt;br/&gt;
P1 waits for a reply from the server&lt;br/&gt;
P2 makes an IO request (e.g., a read from the second page of the file)&lt;br/&gt;
P2 creates an LDLM lock request&lt;br/&gt;
P2 does not check for existing LDLM locks (goto no_match in osc_enqueue_base, as described above)&lt;br/&gt;
P2 waits for a reply from the server&lt;br/&gt;
P1 receives its reply; the lock is granted&lt;br/&gt;
(The lock is expanded beyond the requested extent, so it covers the area P2 wants to read)&lt;br/&gt;
P2 receives its reply; its lock is blocked by the lock granted to P1&lt;br/&gt;
The lock granted to P1 is called back by the server, even though it matches the request from P2&lt;/p&gt;
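
&lt;p&gt;The server side of the sequence above can be replayed in a toy model. This is only a sketch, written in Python rather than Lustre C: server_enqueue, conflicts, the lock dictionaries, and the fixed WHOLE_OBJECT_END bound are all hypothetical names and simplifications, not Lustre APIs. It assumes the first lock is expanded to cover the whole object and treats pages 0 and 1 as the first and second pages of the file.&lt;/p&gt;

```python
# Toy model of the server side of the timeline. Hypothetical names throughout;
# this is not Lustre code.
WHOLE_OBJECT_END = 10**9  # assumed upper bound used when a lock is expanded


def conflicts(lock, mode, page):
    """A write lock conflicts with any other lock covering the same page."""
    overlaps = page in range(lock["start"], lock["end"] + 1)
    return overlaps and "write" in (mode, lock["mode"])


def server_enqueue(granted, owner, mode, page):
    """Grant (and expand) if nothing conflicts; otherwise report callbacks."""
    blockers = [lock["owner"] for lock in granted if conflicts(lock, mode, page)]
    if blockers:
        return "blocked, callback sent to: " + ", ".join(blockers)
    # Assumption: with no conflicting locks, the extent is expanded to the
    # whole object, as in the "lock is expanded" step of the timeline.
    granted.append({"owner": owner, "mode": mode,
                    "start": 0, "end": WHOLE_OBJECT_END})
    return "granted"


granted = []
print("P1:", server_enqueue(granted, "P1", "write", 0))
print("P2:", server_enqueue(granted, "P2", "read", 1))
```

&lt;p&gt;In this model P2&apos;s request is blocked only because it was sent as a separate lock request; the expanded lock already held for P1 covers the page P2 wants.&lt;/p&gt;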

&lt;p&gt;This is easier to see with async lock requests, since they do not wait (and do not take the range lock, which would prevent this race for truly overlapping IOs), but it also applies to regular lock requests.&lt;/p&gt;

&lt;p&gt;This can be solved by removing the usage of kms_valid in osc_enqueue_base.&lt;/p&gt;

&lt;p&gt;Per the comment on that usage, two cases must be handled before this usage of kms_valid can be removed:&lt;br/&gt;
newly created objects with no LDLM locks, and evicted objects. (&quot;Evicted objects&quot; refers to OSC objects removed from LRU due to memory pressure.)&lt;/p&gt;

&lt;p&gt;For newly created objects: If the object is new and no locks exist, then it&apos;s safe to try to match.&lt;br/&gt;
It will simply fail to match and request a new lock.&lt;/p&gt;
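
&lt;p&gt;A minimal sketch of that match-first behaviour, in Python and with hypothetical names (match_or_enqueue and the list of cached extents are illustrative only, not the osc_enqueue_base code): on a fresh object the match finds nothing and falls through to a new request, and once a lock is cached a later request can reuse it.&lt;/p&gt;

```python
# Toy client-side lock cache; hypothetical names, not Lustre code.

def match_or_enqueue(cached_extents, page):
    """Try to match a cached extent first; fall back to a new request."""
    for start, end in cached_extents:
        if page in range(start, end + 1):
            return "matched ({}, {})".format(start, end)
    return "no match, new lock requested"


fresh_object = []                  # new object: no locks cached yet
print(match_or_enqueue(fresh_object, 0))

fresh_object.append((0, 10**9))    # assumed extent of an expanded, granted lock
print(match_or_enqueue(fresh_object, 1))
```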

&lt;p&gt;For evicted objects, Jinshan suggested a solution:&lt;br/&gt;
&quot;[...] we can change the code [in osc_object_prune] to get rid of all cached dlm locks when the [osc] object is being destroyed. After this is done, we don&#8217;t need to worry about kms_valid any more.&quot;&lt;/p&gt;

&lt;p&gt;Jinshan also provided the patch for this, which I&apos;ve done basic testing on and will upload shortly.&lt;/p&gt;</description>
                <environment></environment>
        <key id="29224">LU-6397</key>
            <summary>LDLM lock creation race condition on new object creation</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="paf">Patrick Farrell</assignee>
                                    <reporter username="paf">Patrick Farrell</reporter>
                        <labels>
                            <label>patch</label>
                    </labels>
                <created>Tue, 24 Mar 2015 21:02:52 +0000</created>
                <updated>Wed, 7 Feb 2018 20:24:13 +0000</updated>
                                            <version>Lustre 2.7.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                            <comments>
                            <comment id="110555" author="gerrit" created="Tue, 24 Mar 2015 21:52:56 +0000"  >&lt;p&gt;Patrick Farrell (paf@cray.com) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/14167&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/14167&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6397&quot; title=&quot;LDLM lock creation race condition on new object creation&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6397&quot;&gt;LU-6397&lt;/a&gt; osc: Remove kms_valid check in osc_enqueue&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: a2703428a62f531c64dbef6d1a7a8c548d0ca91f&lt;/p&gt;</comment>
                            <comment id="113630" author="gerrit" created="Tue, 28 Apr 2015 15:43:39 +0000"  >&lt;p&gt;Patrick Farrell (paf@cray.com) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/14630&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/14630&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6397&quot; title=&quot;LDLM lock creation race condition on new object creation&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6397&quot;&gt;LU-6397&lt;/a&gt; osc: Remove kms_valid check in osc_enqueue&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 47a84ccb47b91b9c02d20136072d97024bde697c&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzx99j:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>