<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:03:24 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-67] write_disjoint: data corruption</title>
                <link>https://jira.whamcloud.com/browse/LU-67</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Write disjoint occasionally fails with a data corruption pattern like this:&lt;br/&gt;
loop 0: chunk_size 103399&lt;br/&gt;
loop 1000: chunk_size 69125&lt;br/&gt;
loop 2000: chunk_size 104360&lt;br/&gt;
loop 3000: chunk_size 11295&lt;br/&gt;
loop 4000: chunk_size 77918&lt;br/&gt;
loop 4370: chunk_size 51125&lt;br/&gt;
loop 4371: chunk 3 corrupted with chunk_size 93369, page_size 4096&lt;br/&gt;
ranks:	page boundry	chunk boundry	page boundry&lt;br/&gt;
A -&amp;gt; B:	90112	93369	94208&lt;br/&gt;
B -&amp;gt; C:	184320	186738	188416&lt;br/&gt;
C -&amp;gt; D:	278528	280107	282624&lt;br/&gt;
D -&amp;gt; E:	372736	373476	376832&lt;br/&gt;
E -&amp;gt; F:	462848	466845	466944&lt;br/&gt;
F -&amp;gt; G:	557056	560214	561152&lt;br/&gt;
G -&amp;gt; H:	651264	653583	655360&lt;br/&gt;
H -&amp;gt; I:	745472	746952	749568&lt;br/&gt;
I -&amp;gt; J:	839680	840321	843776&lt;br/&gt;
J -&amp;gt; K:	929792	933690	933888&lt;br/&gt;
K -&amp;gt; L:	1024000	1027059	1028096&lt;br/&gt;
0000000   A   A   A   A   A   A   A   A   A   A   A   A   A   A   A   A&lt;br/&gt;
*&lt;br/&gt;
0093360   A   A   A   A   A   A   A   A   A   B   B   B   B   B   B   B&lt;br/&gt;
0093376   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B&lt;br/&gt;
*&lt;br/&gt;
0186736   B   B   C   C   C   C   C   C   C   C   C   C   C   C   C   C&lt;br/&gt;
0186752   C   C   C   C   C   C   C   C   C   C   C   C   C   C   C   C&lt;br/&gt;
*&lt;br/&gt;
0280096   C   C   C   C   C   C   C   C   C   C   C   D   D   D   D   D&lt;br/&gt;
0280112   D   D   D   D   D   D   D   D   D   D   D   D   D   D   D   D&lt;br/&gt;
*&lt;br/&gt;
0372736 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul&lt;br/&gt;
*&lt;br/&gt;
0373472 nul nul nul nul   E   E   E   E   E   E   E   E   E   E   E   E&lt;br/&gt;
0373488   E   E   E   E   E   E   E   E   E   E   E   E   E   E   E   E&lt;br/&gt;
*&lt;br/&gt;
0466832   E   E   E   E   E   E   E   E   E   E   E   E   E   F   F   F&lt;br/&gt;
0466848   F   F   F   F   F   F   F   F   F   F   F   F   F   F   F   F&lt;br/&gt;
*&lt;br/&gt;
0560208   F   F   F   F   F   F   G   G   G   G   G   G   G   G   G   G&lt;br/&gt;
0560224   G   G   G   G   G   G   G   G   G   G   G   G   G   G   G   G&lt;br/&gt;
*&lt;br/&gt;
0653568   G   G   G   G   G   G   G   G   G   G   G   G   G   G   G   H&lt;br/&gt;
0653584   H   H   H   H   H   H   H   H   H   H   H   H   H   H   H   H&lt;br/&gt;
*&lt;br/&gt;
0746944   H   H   H   H   H   H   H   H   I   I   I   I   I   I   I   I&lt;br/&gt;
0746960   I   I   I   I   I   I   I   I   I   I   I   I   I   I   I   I&lt;br/&gt;
*&lt;br/&gt;
0840320   I   J   J   J   J   J   J   J   J   J   J   J   J   J   J   J&lt;br/&gt;
0840336   J   J   J   J   J   J   J   J   J   J   J   J   J   J   J   J&lt;br/&gt;
*&lt;br/&gt;
0933680   J   J   J   J   J   J   J   J   J   J   K   K   K   K   K   K&lt;br/&gt;
0933696   K   K   K   K   K   K   K   K   K   K   K   K   K   K   K   K&lt;br/&gt;
*&lt;br/&gt;
1027056   K   K   K   L   L   L   L   L   L   L   L   L   L   L   L   L&lt;br/&gt;
1027072   L   L   L   L   L   L   L   L   L   L   L   L   L   L   L   L&lt;br/&gt;
*&lt;br/&gt;
1120416   L   L   L   L   L   L   L   L   L   L   L   L&lt;br/&gt;
1120428&lt;br/&gt;
rank 0, loop 4371: data check error - exiting&lt;br/&gt;
--------------------------------------------------------------------------&lt;br/&gt;
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD &lt;br/&gt;
with errorcode -1.&lt;/p&gt;

&lt;p&gt;This was originally reproduced at Oracle (see the bug and the attachment for the logs).&lt;br/&gt;
I have now also reproduced the issue locally, but after examining the logs I believe there may be two separate issues, since my logs differ substantially from the Oracle logs.&lt;br/&gt;
The issue that I can reproduce also affects Lustre 1.8.&lt;/p&gt;</description>
                <environment></environment>
        <key id="10345">LU-67</key>
            <summary>write_disjoint: data corruption</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="1" iconUrl="https://jira.whamcloud.com/images/icons/priorities/blocker.svg">Blocker</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="green">Oleg Drokin</assignee>
                                    <reporter username="green">Oleg Drokin</reporter>
                        <labels>
                    </labels>
                <created>Wed, 9 Feb 2011 12:25:32 +0000</created>
                <updated>Tue, 28 Jun 2011 15:01:37 +0000</updated>
                            <resolved>Mon, 14 Mar 2011 12:10:51 +0000</resolved>
                                    <version>Lustre 2.0.0</version>
                    <version>Lustre 2.1.0</version>
                    <version>Lustre 1.8.6</version>
                                    <fixVersion>Lustre 2.1.0</fixVersion>
                    <fixVersion>Lustre 1.8.6</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>1</watches>
                                                                            <comments>
                            <comment id="10594" author="adilger" created="Wed, 9 Feb 2011 17:05:54 +0000"  >&lt;p&gt;Note that there has been a similar problem with write_disjoint for ages on 1.6 and 1.8; I think it is &lt;a href=&quot;https://bugzilla.lustre.org/show_bug.cgi?id=3654&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://bugzilla.lustre.org/show_bug.cgi?id=3654&lt;/a&gt;.&lt;/p&gt;</comment>
                            <comment id="10599" author="green" created="Wed, 9 Feb 2011 20:10:08 +0000"  >&lt;p&gt;Could be. The issue that I have found has existed forever. It&apos;s a race between the enqueue reply and the completion AST: the completion AST happens first with the correct LVB, and then the RPC reply overwrites the correct LVB with an incorrect one (I know there is a fix for this race, but that fix is racy itself).&lt;br/&gt;
I have been testing a patch for most of today and it seems to be holding up well, so I plan to give it to ORNL tomorrow for more testing and also submit it for inspection.&lt;/p&gt;</comment>
                            <comment id="10603" author="chris" created="Thu, 10 Feb 2011 01:24:02 +0000"  >&lt;p&gt;Is severity 3 the highest or lowest? I ask because data corruption would seem to me to always be highest severity. An important point, I guess, is whether this is silent data corruption or not. I&apos;m not knowledgeable enough to know what is detecting the error in your log.&lt;/p&gt;</comment>
                            <comment id="10610" author="pjones" created="Thu, 10 Feb 2011 07:40:52 +0000"  >&lt;p&gt;Severity 3 is the default and means a minor issue. Bumping the severity to major issue (2).&lt;/p&gt;</comment>
                            <comment id="11055" author="pjones" created="Mon, 14 Mar 2011 08:14:35 +0000"  >&lt;p&gt;This fix has landed upstream for 1.8.6.&lt;/p&gt;</comment>
                            <comment id="11064" author="green" created="Mon, 14 Mar 2011 12:10:51 +0000"  >&lt;p&gt;Landed on the 2.1 and 1.8.6 branches.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                            <attachment id="10111" name="24375.tar.bz2" size="2188225" author="green" created="Wed, 9 Feb 2011 12:25:32 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                    <customfield id="customfield_10020" key="com.atlassian.jira.plugin.system.customfieldtypes:float">
                        <customfieldname>Bugzilla ID</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>24375.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzw18n:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>10282</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10021"><![CDATA[2]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>