<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:17:54 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-15391] lustre group lock usage and restrictions</title>
                <link>https://jira.whamcloud.com/browse/LU-15391</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Suppose one has two processes, one on each of two nodes, and each opens the same file and locks it via llapi_group_lock(3) using the same group, and the two processes then attempt to read and write non-overlapping extents.&lt;/p&gt;

&lt;p&gt;Since they&apos;re using a group lock, I understand that neither node will be aware of the writes on the other node. &#160;I would expect that this means that the extents that each node is &quot;assigned&quot; for writing need to be page-aligned, or maybe even stripe-aligned, and that not choosing extents this way could result in reads returning stale data.&lt;/p&gt;

&lt;p&gt;Is that correct? &#160;What is the required alignment? &#160;Are there other requirements?&lt;/p&gt;</description>
                <environment></environment>
        <key id="67719">LU-15391</key>
            <summary>lustre group lock usage and restrictions</summary>
                <type id="9" iconUrl="https://jira.whamcloud.com/images/icons/issuetypes/undefined.png">Question/Request</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="paf0186">Patrick Farrell</assignee>
                                    <reporter username="ofaaland">Olaf Faaland</reporter>
                        <labels>
                            <label>llnl</label>
                    </labels>
                <created>Wed, 22 Dec 2021 13:22:23 +0000</created>
                <updated>Mon, 22 Aug 2022 18:04:18 +0000</updated>
                                                                                <due></due>
                            <votes>0</votes>
                                    <watches>5</watches>
                                                                            <comments>
                            <comment id="321358" author="ofaaland" created="Wed, 22 Dec 2021 13:24:00 +0000"  >&lt;p&gt;I&apos;ve submitted this as an LU-ticket instead of LUDOC- because I would think the best place to document any requirements would be in the llapi_group_lock(3) man page.&#160; But if that&apos;s not the best place, then please modify the ticket appropriately.&lt;/p&gt;</comment>
                            <comment id="321375" author="pjones" created="Wed, 22 Dec 2021 18:31:09 +0000"  >&lt;p&gt;Patrick&lt;/p&gt;

&lt;p&gt;Could you please advise?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="321381" author="paf0186" created="Wed, 22 Dec 2021 20:25:25 +0000"  >&lt;p&gt;Olaf,&lt;/p&gt;

&lt;p&gt;You&apos;re absolutely right that we need to update the man page and this is a good context in which to do it.&lt;/p&gt;

&lt;p&gt;You&apos;re also largely right about group locks.&#160; The stale data issue is particularly significant: all nodes sharing a group lock hold a full read/write lock on the file, so each considers any data it has cached to be the authoritative copy (because it&apos;s under a write lock!).&lt;/p&gt;

&lt;p&gt;So if any part of the file is updated in one place and then read in another, or read and then updated elsewhere, stale data can result, as you&apos;d expect.&#160; The granularity for this is &lt;b&gt;page&lt;/b&gt; granularity, so if I/Os from different nodes touch the same page, even at non-overlapping byte ranges, they are subject to this stale data risk.&#160; And if two clients do partial updates of the same page, read-modify-write means various intermixings of new and old data are possible depending on timing.&lt;/p&gt;
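
&lt;p&gt;To make the page-granularity point concrete, here is a minimal Python sketch that checks whether two byte extents touch a common page. The function names and the 4096-byte default page size are assumptions for illustration only; they are not part of any Lustre API, and the real page size depends on the system.&lt;/p&gt;

```python
def page_span(offset, length, page_size=4096):
    """Return (first_page, last_page) indices touched by a byte extent.

    page_size=4096 is an assumed default, not a value taken from Lustre.
    """
    if not (offset >= 0 and length > 0):
        raise ValueError("offset must be non-negative and length positive")
    first_page = offset // page_size
    last_page = (offset + length - 1) // page_size
    return first_page, last_page


def share_a_page(extent_a, extent_b, page_size=4096):
    """True if two (offset, length) extents touch at least one common page."""
    a_first, a_last = page_span(*extent_a, page_size)
    b_first, b_last = page_span(*extent_b, page_size)
    # Two page ranges are disjoint only if one starts after the other ends.
    return not (a_first > b_last or b_first > a_last)


# Byte ranges [0, 6000) and [6000, 12000) do not overlap,
# but both touch page 1 (bytes 4096..8191).
print(share_a_page((0, 6000), (6000, 6000)))
# Page-aligned extents of two pages each touch no common page.
print(share_a_page((0, 8192), (8192, 8192)))
```

&lt;p&gt;In this sketch the first pair is flagged even though the byte ranges are disjoint, which is the case that carries the stale data risk described above.&lt;/p&gt;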

&lt;p&gt;So in general it&apos;s not safe to overlap any accesses from different nodes with a write while holding a group lock.&#160; (Local to one node, the page cache handles it and things work normally.)&#160; Read-read is safe, but write-read, read-write, and even write-write are not.&#160; In the case of write-write, the issue is that you do not know whether the first write is on disk when you start the second.&#160; If you do page-aligned writes, you can make write-write safe by using sync() calls.&#160; But non-page-aligned writes include a read, so the stale data issues apply: Lustre always writes full pages to disk, so if a partial page write is attempted, it first reads the remainder of the page from disk.&lt;/p&gt;

&lt;p&gt;One other thing: while a file is open with a group lock, the file size (the size shown by &apos;stat&apos;, or the one used if you&apos;re writing with O_APPEND) may be inaccurate if the file has been extended or truncated.&#160; Generally it would just be stale, but other outcomes are possible.&#160; In general, size cannot be trusted if you&apos;re using a group lock and writing to the file.&lt;/p&gt;

&lt;p&gt;One strategy I&apos;ve seen used is to detect overlapping accesses in a userspace library, and when they are detected, to release the group lock on all nodes, then re-acquire it.&#160; This syncs any dirty data to disk and clears the file from cache on all the nodes, so it&apos;s very expensive.&lt;/p&gt;
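
&lt;p&gt;A rough Python sketch of the detection half of that strategy follows. The class and method names are hypothetical, the 4096-byte page size is an assumption, and the sketch keeps all bookkeeping in one process; a real library would need some way to share this state between nodes, which is not shown. On a conflict, the caller would release and re-acquire the group lock on all nodes and then call reset().&lt;/p&gt;

```python
class PageConflictTracker:
    """Track which nodes touched which pages since the last lock cycle.

    Hypothetical helper for illustration; not part of any Lustre library.
    """

    def __init__(self, page_size=4096):
        self.page_size = page_size  # assumed page size
        self.readers = {}  # page index -> set of node names that read it
        self.writers = {}  # page index -> set of node names that wrote it

    def record(self, node, offset, length, is_write):
        """Record an access (length must be positive).

        Returns True if the access conflicts with an earlier access from a
        different node to the same page.
        """
        first = offset // self.page_size
        last = (offset + length - 1) // self.page_size
        conflict = False
        for page in range(first, last + 1):
            other_writers = self.writers.get(page, set()) - {node}
            other_readers = self.readers.get(page, set()) - {node}
            # Any access after a remote write is unsafe, and a write after
            # a remote read is unsafe. Read-read is fine.
            if other_writers or (is_write and other_readers):
                conflict = True
            table = self.writers if is_write else self.readers
            table.setdefault(page, set()).add(node)
        return conflict

    def reset(self):
        """Forget all accesses, e.g. after the group lock was re-acquired."""
        self.readers.clear()
        self.writers.clear()


tracker = PageConflictTracker()
tracker.record("node-a", 0, 4096, is_write=True)      # first writer of page 0
tracker.record("node-b", 4096, 4096, is_write=True)   # different page
print(tracker.record("node-b", 100, 10, is_write=False))  # reads page 0
```

&lt;p&gt;The last call reports a conflict because node-b reads a page that node-a wrote during the same lock cycle.&lt;/p&gt;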

&lt;p&gt;In general, if you page-align your writes and &lt;b&gt;only&lt;/b&gt; write to the file, you&apos;d be fine.&#160; Not much else is really safe except read-only access, and if your access is read-only, why use a group lock?&lt;/p&gt;
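
&lt;p&gt;One way to get page-aligned, write-only extents is to divide the file among writers in whole pages. The Python sketch below does that; the function name and the 4096-byte default page size are assumptions for illustration. Only the final extent can end off a page boundary, and that last partial page still belongs to a single writer.&lt;/p&gt;

```python
def page_aligned_extents(total_bytes, writers, page_size=4096):
    """Split [0, total_bytes) into one contiguous (offset, length) per writer.

    Every extent starts on a page boundary, so no page is shared between
    writers. page_size=4096 is an assumed default.
    """
    total_pages = -(-total_bytes // page_size)  # ceiling division
    base, extra = divmod(total_pages, writers)
    extents = []
    start_page = 0
    for rank in range(writers):
        # The first `extra` writers take one additional page each.
        pages = base + (1 if extra > rank else 0)
        start = start_page * page_size
        end = min((start_page + pages) * page_size, total_bytes)
        extents.append((start, max(end - start, 0)))
        start_page += pages
    return extents


for offset, length in page_aligned_extents(10000, 2):
    print(offset, length)
```

&lt;p&gt;With 10000 bytes and two writers, the first writer gets two full pages and the second gets the remaining partial page.&lt;/p&gt;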

&lt;p&gt;It would help me if you asked questions or requested clarification wherever this is unclear.&#160; Even if something is confusing but you can figure it out, please point that out; consider it a first review of some of the content and possible phrasing for the man page. :)&lt;/p&gt;</comment>
                            <comment id="321640" author="ofaaland" created="Tue, 28 Dec 2021 17:58:29 +0000"  >&lt;p&gt;Hi Patrick,&lt;/p&gt;

&lt;p&gt;Thanks, this is very helpful.&#160; What you wrote makes sense to me.&#160; I had forgotten about lockahead, so I suggested it, as a safer alternative, to the person who was interested in group lock.&lt;/p&gt;

&lt;p&gt;-Olaf&lt;/p&gt;</comment>
                            <comment id="344290" author="ofaaland" created="Mon, 22 Aug 2022 18:04:18 +0000"  >&lt;p&gt;Hi Patrick,&lt;/p&gt;

&lt;p&gt;&amp;gt; The granularity for this is page granularity&lt;/p&gt;

&lt;p&gt;In the case where the client and server have different page sizes, whose page size is the relevant one?&lt;/p&gt;

&lt;p&gt;thanks,&lt;br/&gt;
Olaf&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                                        </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i02d5b:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>