<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:03:24 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-68] write_disjoint: invalid file size</title>
                <link>https://jira.whamcloud.com/browse/LU-68</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;+ su mpiuser sh -c &quot;/opt/mpich/ch-p4/bin/mpirun  -np 12 -machinefile /tmp/parallel-scale.machines&lt;br/&gt;
/usr/lib64/lustre/tests/write_disjoint -f /mnt/lustre/d0.write_disjoint/file -n 10000 &quot;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;0&amp;#93;&lt;/span&gt; MPI Abort by user Aborting program !&lt;br/&gt;
loop 0: chunk_size 103399&lt;br/&gt;
loop 544: chunk_size 113838, file size was 1366056&lt;br/&gt;
rank 0, loop 545: invalid file size 528737 instead of 576804 = 48067 * 12&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;0&amp;#93;&lt;/span&gt; Aborting program!&lt;br/&gt;
p4_error: latest msg from perror: Resource temporarily unavailable&lt;/p&gt;

&lt;p&gt;Reproduced at Oracle, and I have also seen similar failures locally.&lt;br/&gt;
Could be related to the &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-67&quot; title=&quot;write_disjoint: data corruption&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-67&quot;&gt;&lt;del&gt;LU-67&lt;/del&gt;&lt;/a&gt; issue&lt;/p&gt;</description>
                <environment></environment>
        <key id="10346">LU-68</key>
            <summary>write_disjoint: invalid file size</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="1" iconUrl="https://jira.whamcloud.com/images/icons/priorities/blocker.svg">Blocker</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="green">Oleg Drokin</assignee>
                                    <reporter username="green">Oleg Drokin</reporter>
                        <labels>
                    </labels>
                <created>Wed, 9 Feb 2011 12:29:18 +0000</created>
                <updated>Tue, 29 Mar 2011 11:12:41 +0000</updated>
                            <resolved>Tue, 29 Mar 2011 11:12:41 +0000</resolved>
                                    <version>Lustre 2.0.0</version>
                    <version>Lustre 2.1.0</version>
                                    <fixVersion>Lustre 2.1.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>2</watches>
                                                                            <comments>
                            <comment id="10595" author="adilger" created="Wed, 9 Feb 2011 17:06:19 +0000"  >&lt;p&gt;This looks very similar to &lt;a href=&quot;https://bugzilla.lustre.org/show_bug.cgi?id=3523&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://bugzilla.lustre.org/show_bug.cgi?id=3523&lt;/a&gt;.&lt;/p&gt;</comment>
                            <comment id="10613" author="green" created="Thu, 10 Feb 2011 13:46:34 +0000"  >&lt;p&gt;I have a log for this one now from my local testing.&lt;br/&gt;
This is a case of a page not being sent to the server, so it might cause this or &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-67&quot; title=&quot;write_disjoint: data corruption&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-67&quot;&gt;&lt;del&gt;LU-67&lt;/del&gt;&lt;/a&gt;, depending on whether the missing page was at the end of the file or in the middle.&lt;br/&gt;
It seems to be related to an incorrect kms; still digging.&lt;/p&gt;</comment>
                            <comment id="11065" author="green" created="Mon, 14 Mar 2011 12:17:37 +0000"  >&lt;p&gt;I think I have finally found what the problem is,&lt;br/&gt;
in osc_lock_detach&lt;br/&gt;
 &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160;/* Update the kms. Need to loop all granted locks.&lt;br/&gt;
 &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; * Not a problem for the client */&lt;br/&gt;
 &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160;attr-&amp;gt;cat_kms = ldlm_extent_shift_kms(dlmlock, old_kms);&lt;br/&gt;
 &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160;unlock_res_and_lock(dlmlock);&lt;br/&gt;
&amp;lt;&amp;lt;HERE&amp;gt;&amp;gt; &#160; &#160; &#160; &#160;&lt;br/&gt;
 &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160;cl_object_attr_lock(obj);&lt;br/&gt;
 &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160;cl_object_attr_set(env, obj, attr, CAT_KMS);&lt;br/&gt;
 &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160;cl_object_attr_unlock(obj);&lt;/p&gt;

&lt;p&gt;In the case discussed, ldlm_extent_shift_kms found an existing lock with a bigger offset and returned the old kms. As soon as we unlock, another thread comes in and updates the kms (in our case, a write updating the size in commit_write). The original thread then proceeds to write its stale kms, so the last page of the write is not reflected in the kms and is lost.&lt;/p&gt;

&lt;p&gt;The problem did not happen in 1.8 because there ldlm_extent_shift_kms was called under the lov lock, which is no longer the case.&lt;/p&gt;
</comment>
                            <comment id="11067" author="hudson" created="Mon, 14 Mar 2011 17:12:03 +0000"  >&lt;p&gt;Integrated in &lt;span class=&quot;image-wrap&quot; style=&quot;&quot;&gt;&lt;img src=&quot;http://build.whamcloud.com/images/16x16/blue.png&quot; style=&quot;border: 0px solid black&quot; /&gt;&lt;/span&gt; &lt;a href=&quot;http://build.whamcloud.com/job/reviews-centos5/448/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;reviews-centos5 #448&lt;/a&gt;&lt;br/&gt;
     &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-68&quot; title=&quot;write_disjoint: invalid file size&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-68&quot;&gt;&lt;del&gt;LU-68&lt;/del&gt;&lt;/a&gt; Fix a race between lock cancel and write&lt;/p&gt;

&lt;p&gt;Oleg Drokin : &lt;a href=&quot;http://git.whamcloud.com/gitweb/?p=fs/lustre-release.git&amp;amp;a=commit&amp;amp;h=186df50693a7a0fd9e20b4ac0ac08d523f5473be&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;186df50693a7a0fd9e20b4ac0ac08d523f5473be&lt;/a&gt;&lt;br/&gt;
Files : &lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;lustre/osc/osc_lock.c&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="11149" author="hudson" created="Wed, 16 Mar 2011 09:36:38 +0000"  >&lt;p&gt;Integrated in &lt;span class=&quot;image-wrap&quot; style=&quot;&quot;&gt;&lt;img src=&quot;http://build.whamcloud.com/images/16x16/blue.png&quot; style=&quot;border: 0px solid black&quot; /&gt;&lt;/span&gt; &lt;a href=&quot;http://build.whamcloud.com/job/lustre-master-centos5/151/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;lustre-master-centos5 #151&lt;/a&gt;&lt;br/&gt;
     &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-68&quot; title=&quot;write_disjoint: invalid file size&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-68&quot;&gt;&lt;del&gt;LU-68&lt;/del&gt;&lt;/a&gt; Fix a race between lock cancel and write&lt;/p&gt;

&lt;p&gt;Oleg Drokin : &lt;a href=&quot;http://git.whamcloud.com/gitweb/?p=fs/lustre-release.git&amp;amp;a=commit&amp;amp;h=d2dbff42e78d7ebca4db534df7e1c19f6b674a22&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;d2dbff42e78d7ebca4db534df7e1c19f6b674a22&lt;/a&gt;&lt;br/&gt;
Files : &lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;lustre/osc/osc_lock.c&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="11156" author="hudson" created="Wed, 16 Mar 2011 12:17:00 +0000"  >&lt;p&gt;Integrated in &lt;span class=&quot;image-wrap&quot; style=&quot;&quot;&gt;&lt;img src=&quot;http://build.whamcloud.com/images/16x16/red.png&quot; style=&quot;border: 0px solid black&quot; /&gt;&lt;/span&gt; &lt;a href=&quot;http://build.whamcloud.com/job/reviews-rhel6/33/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;reviews-rhel6 #33&lt;/a&gt;&lt;br/&gt;
     &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-68&quot; title=&quot;write_disjoint: invalid file size&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-68&quot;&gt;&lt;del&gt;LU-68&lt;/del&gt;&lt;/a&gt; Fix a race between lock cancel and write&lt;/p&gt;

&lt;p&gt;Oleg Drokin : &lt;a href=&quot;http://git.whamcloud.com/gitweb/?p=fs/lustre-dev.git&amp;amp;a=commit&amp;amp;h=d2dbff42e78d7ebca4db534df7e1c19f6b674a22&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;d2dbff42e78d7ebca4db534df7e1c19f6b674a22&lt;/a&gt;&lt;br/&gt;
Files : &lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;lustre/osc/osc_lock.c&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="11164" author="hudson" created="Wed, 16 Mar 2011 12:56:40 +0000"  >&lt;p&gt;Integrated in &lt;span class=&quot;image-wrap&quot; style=&quot;&quot;&gt;&lt;img src=&quot;http://build.whamcloud.com/images/16x16/blue.png&quot; style=&quot;border: 0px solid black&quot; /&gt;&lt;/span&gt; &lt;a href=&quot;http://build.whamcloud.com/job/reviews-centos5/483/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;reviews-centos5 #483&lt;/a&gt;&lt;br/&gt;
     &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-68&quot; title=&quot;write_disjoint: invalid file size&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-68&quot;&gt;&lt;del&gt;LU-68&lt;/del&gt;&lt;/a&gt; Fix a race between lock cancel and write&lt;/p&gt;

&lt;p&gt;Oleg Drokin : &lt;a href=&quot;http://git.whamcloud.com/gitweb/?p=fs/lustre-release.git&amp;amp;a=commit&amp;amp;h=d2dbff42e78d7ebca4db534df7e1c19f6b674a22&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;d2dbff42e78d7ebca4db534df7e1c19f6b674a22&lt;/a&gt;&lt;br/&gt;
Files : &lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;lustre/osc/osc_lock.c&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="11338" author="pjones" created="Thu, 24 Mar 2011 11:26:33 +0000"  >&lt;p&gt;James&lt;/p&gt;

&lt;p&gt;When do you think that you might be able to try out your reproducer with the latest code?&lt;/p&gt;

&lt;p&gt;Please advise&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="11547" author="pjones" created="Tue, 29 Mar 2011 11:12:41 +0000"  >&lt;p&gt;Believed resolved. ORNL will reopen or open a new ticket if their reproducer still has issues&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                    <customfield id="customfield_10020" key="com.atlassian.jira.plugin.system.customfieldtypes:float">
                        <customfieldname>Bugzilla ID</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>23175.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzv9n3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>5096</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>