<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:35:57 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-17502] distribute flock locking across multiple MDS nodes</title>
                <link>https://jira.whamcloud.com/browse/LU-17502</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Lustre currently implements all of the flock locking only on MDT0000, with its MDS managing &lt;b&gt;all&lt;/b&gt; of the flocks in the filesystem. This can lead to performance bottlenecks and high memory usage on that MDS when large numbers of files are locked from many clients.&lt;/p&gt;

&lt;p&gt;It would be desirable for the flock management to scale across multiple MDS nodes for improved performance and reduced load on MDS0.  This would be fairly straightforward if clients only locked one file at a time (e.g. manage flocking for file FID NNN on the MDT where that FID is located).  It becomes slightly more complex if files are migrated to another MDT, which may cause imbalanced lock traffic (though it can&apos;t ever be worse than today, where 100% of the flock locking is done on a single node).&lt;/p&gt;
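A minimal sketch of the per-file distribution described above.  The FID sequence field and the modulo placement are illustrative assumptions, not Lustre&apos;s actual FID-to-MDT mapping code:

```python
# Hypothetical sketch: route flock management for a file to the MDT
# that is responsible for the file's FID, so that lock traffic for
# independent files spreads across servers instead of all landing on
# MDT0000.  The modulo scheme here is an assumption for illustration.

def mdt_for_fid(fid_seq, num_mdts):
    """Pick the MDT index responsible for flock state of this FID."""
    return fid_seq % num_mdts

# Independent files naturally land on different MDTs:
targets = [mdt_for_fid(seq, 4) for seq in (100, 101, 102, 103)]
```

As long as each client only ever holds flocks on a single FID, every lock request for that FID is resolved entirely on one MDT, with no cross-server communication.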

&lt;p&gt;The more serious issue is lock ordering when clients lock multiple FIDs at the same time (e.g. AB/BA deadlocks, both for extents within a single file and between different files, and more complex chain variants of the same).  Is it possible/practical/efficient to distribute the flock deadlock/dependency checking across multiple servers?  Lustre&apos;s internal (non-flock) DLM file consistency locking avoids this distributed ordering issue (most of the time) by avoiding holding multiple locks at the same time; when that is strictly necessary, the locks are taken in a pre-determined order to avoid deadlocks (or &quot;trylock and undo/restart&quot; is used for efficiency if lock(s) are already held and the next lock is not in the correct order).&lt;/p&gt;
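The &quot;pre-determined order&quot; rule mentioned above can be sketched as follows.  The helper name and the sort-by-identifier convention are hypothetical; they only illustrate why canonical ordering rules out AB/BA deadlocks:

```python
# Hypothetical sketch of ordered lock acquisition: when several
# resources must be locked at once, always acquire them in a canonical
# (sorted) order, so two nodes can never simultaneously hold
# A-waiting-for-B and B-waiting-for-A.

def acquire_in_order(lock_fn, resource_ids):
    """Lock every resource, sorted by id, to rule out AB/BA deadlock."""
    order = sorted(set(resource_ids))
    for rid in order:
        lock_fn(rid)
    return order

# Both callers below end up locking in the same order (A then B),
# regardless of the order the application asked for:
taken = []
acquire_in_order(taken.append, ["B", "A"])
```

This works for Lustre&apos;s internal DLM precisely because Lustre chooses the acquisition order itself; as the next paragraph notes, flock ordering is dictated by the application, so this technique cannot be applied there directly.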

&lt;p&gt;For flock locking, the lock ordering is provided by the userspace application (i.e. outside of Lustre&apos;s control), so the code must determine whether granting the lock would result in a possible deadlock and, in that case, return an error to the application.&lt;/p&gt;
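The grant-time deadlock check described above is conventionally done by searching a wait-for graph for a cycle.  This is a generic sketch under assumed data structures, not Lustre&apos;s flock implementation:

```python
# Hypothetical sketch of flock deadlock detection: before letting a
# requester block on a lock, walk the wait-for graph from the current
# holder.  If the requester is already reachable, blocking would close
# a cycle, so the request must fail instead (POSIX fcntl locking
# reports this case as EDEADLK).

def would_deadlock(wait_for, requester, holder):
    """Return True if 'holder' (transitively) waits on 'requester'."""
    seen = set()
    stack = [holder]
    while stack:
        node = stack.pop()
        if node == requester:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(wait_for.get(node, ()))
    return False

# P1 already waits on P2; if P2 now blocks on a lock P1 holds,
# that closes a cycle and must be refused:
graph = {"P1": ["P2"]}
```

On a single MDS this graph is local and the check is cheap; the open question in this ticket is how to perform the same reachability test efficiently when the edges of the graph are spread across multiple servers.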

&lt;p&gt;It would be necessary to determine if there are efficient algorithms for distributed locking with deadlock detection, where the &quot;common&quot; case of independent flock locks on individual files is distributed across MDTs, while cross-MDS communication is minimized.&lt;/p&gt;

&lt;p&gt;Another option to distribute the locking might be to determine the primary MDT for the lock management based on the JobID used by the application.  That should put most/all of the flocks for a single job on a single MDT, even if the job is distributed across many client nodes, and reduce/eliminate cross-MDS communication &lt;b&gt;for that job&lt;/b&gt;.  However, if multiple jobs are locking the same files, or if the JobID is structured to contain the hostname, then the JobIDs would map to different MDTs and this would likely only increase complexity.&lt;/p&gt;</description>
                <environment></environment>
        <key id="80634">LU-17502</key>
            <summary>distribute flock locking across multiple MDS nodes</summary>
                <type id="4" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11310&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="adilger">Andreas Dilger</reporter>
                        <labels>
                            <label>hard</label>
                    </labels>
                <created>Mon, 5 Feb 2024 00:48:50 +0000</created>
                <updated>Mon, 5 Feb 2024 01:23:30 +0000</updated>
                                            <version>Lustre 2.16.0</version>
                    <version>Lustre 2.17.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>2</watches>
                                                                                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="78854">LU-17276</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                    <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i04a9b:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>