<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:40:14 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-11020] Data corruption during MDT&lt;&gt;OST recovery</title>
                <link>https://jira.whamcloud.com/browse/LU-11020</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Data corruption was found during an mdtest run with MDT failover testing in parallel.&lt;br/&gt;
The visible effect is that some files lack their own OST objects after the orphan deletion phase.&lt;br/&gt;
This can be the result of unreplayed requests or of an incorrect OST read after failover.&lt;br/&gt;
Testing originally started with power-loss failover tests, but the problem was also reproduced with a normal shutdown plus failover, which made it possible to capture logs during failover and recovery.&lt;br/&gt;
Excerpts from the logs follow.&lt;br/&gt;
The server takes a request from the client:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000100:00100000:3.0:1524592151.996038:0:18109:0:(service.c:1955:ptlrpc_server_handle_req_in()) got req x1598582492486880
00000100:00100000:3.0:1524592151.996042:0:18109:0:(nrs_fifo.c:179:nrs_fifo_req_get()) NRS start fifo request from 12345-54@gni, seq: 1564278792
00000100:00100000:3.0:1524592151.996043:0:18109:0:(service.c:2103:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc mdt01_003:061cd4a0-cd22-5048-946a-d8f4ec53917a+6881:15972:x1598582492486880:12345-54@gni:101
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;The server starts returning ENODEV once obd_fail is set during unmount.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000100:00080000:1.0:1524592151.996485:0:2924:0:(niobuf.c:583:ptlrpc_send_reply()) sending ENODEV from failed obd 16
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;As the next step of failover the server issues a sync, and last_committed is updated.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00010000:00080000:3.0:1524592151.996589:0:5591:0:(ldlm_lib.c:2492:target_committed_to_req()) 0c6ce748-6618-ff9d-ad84-6e6f761558b2 last_committed 2937774893859, transno 2937774968804, xid 1598584386635888
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This create is on disk:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000004:00080000:13.0:1524592151.996630:0:8166:0:(osp_precreate.c:1476:osp_precreate_get_fid()) next [0x240000400:0x91561aae:0x0] last created [0x240000400:0x91562c9d:0x0]
00000004:00080000:13.0:1524592151.996632:0:8166:0:(osp_object.c:1547:osp_object_create()) last fid for osp_object is [0x240000400:0x91561aae:0x0]
00000004:00080000:13.0:1524592151.996867:0:8166:0:(osp_object.c:1601:osp_object_create()) snx11205-OST0000-osc-MDT0001: Wrote last used FID: [0x240000400:0x91561aae:0x0], index 0: 0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;But this one is not:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000004:00080000:3.0:1524592151.996637:0:18109:0:(osp_precreate.c:1476:osp_precreate_get_fid()) next [0x240000400:0x91561aaf:0x0] last created [0x240000400:0x91562c9d:0x0]
00000004:00080000:3.0:1524592151.996639:0:18109:0:(osp_object.c:1547:osp_object_create()) last fid for osp_object is [0x240000400:0x91561aaf:0x0]
00000004:00080000:3.0:1524592151.996641:0:18109:0:(osp_object.c:1601:osp_object_create()) snx11205-OST0000-osc-MDT0001: Wrote last used FID: [0x240000400:0x91561aaf:0x0], index 0: 0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Looking at the timestamps of the osp_object_create calls, I found that the writes completed in reverse order: the first write (0x91561aae) finished after the second (0x91561aaf), so stale data overwrote the more recent value.&lt;/p&gt;
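&lt;p&gt;The reordering can be replayed with a small model built only from the timestamps and object ids in the log excerpts above. This is a sketch, not Lustre code: the struct, the function names and the &quot;guarded&quot; mode (a write may only advance the stored id) are hypothetical and are not meant to describe any actual fix.&lt;/p&gt;
<![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* One create as seen in the debug log (microsecond part of the timestamp). */
struct create_event {
	uint32_t alloc_us;  /* osp_precreate_get_fid() timestamp */
	uint32_t write_us;  /* "Wrote last used FID" timestamp */
	uint64_t oid;       /* object id assigned to this create */
};

/* The two creates from the logs: thread 8166 first, then thread 18109. */
const struct create_event logged[2] = {
	{ 996630, 996867, 0x91561aaeULL },
	{ 996637, 996641, 0x91561aafULL },
};

/* qsort comparator: order events by the time their write completed. */
static int by_write_time(const void *a, const void *b)
{
	const struct create_event *x = a, *y = b;

	return (x->write_us > y->write_us) - (x->write_us < y->write_us);
}

/*
 * Replay the writes in completion order and return the final stored id.
 * guarded == 0: every write stores its own id (last writer wins).
 * guarded != 0: a write only ever advances the stored id (hypothetical).
 */
uint64_t replay_last_used(const struct create_event *ev, size_t n, int guarded)
{
	struct create_event sorted[8];
	uint64_t on_disk = 0;
	size_t i;

	assert(n <= 8);
	for (i = 0; i < n; i++)
		sorted[i] = ev[i];
	qsort(sorted, n, sizeof(sorted[0]), by_write_time);

	for (i = 0; i < n; i++) {
		if (!guarded || sorted[i].oid > on_disk)
			on_disk = sorted[i].oid;
	}
	return on_disk;
}
```
]]>
&lt;p&gt;Calling replay_last_used(logged, 2, 0) models the unprotected case and ends with the older id stored, because the write by thread 8166 completes last. Calling it with guarded set to 1 keeps the newer id.&lt;/p&gt;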
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000100:00100000:3.0:1524592151.996703:0:18109:0:(service.c:2153:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc mdt01_003:061cd4a0-cd22-5048-946a-d8f4ec53917a+6888:15972:x1598582492486880:12345-54@gni:101 Request procesed in 660us (669us total) trans 2937774968811 rc -19/-19
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Because the affected create was processed after obd_fail was set, it finished with ENODEV. ptlrpc treats this as a recoverable error, so the request was resent after the connection was restored and recovery finished.&lt;/p&gt;

</description>
                <environment>Found in IEEL3 testing, but the bug appears to exist from 2.5.0 through master</environment>
        <key id="52236">LU-11020</key>
            <summary>Data corruption during MDT&lt;&gt;OST recovery</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="shadow">Alexey Lyashkov</assignee>
                                    <reporter username="shadow">Alexey Lyashkov</reporter>
                        <labels>
                    </labels>
                <created>Tue, 15 May 2018 08:51:57 +0000</created>
                <updated>Tue, 15 Oct 2019 21:23:58 +0000</updated>
                            <resolved>Sat, 17 Nov 2018 06:42:39 +0000</resolved>
                                    <version>Lustre 2.5.0</version>
                    <version>Lustre 2.11.0</version>
                                    <fixVersion>Lustre 2.12.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                            <comments>
                            <comment id="227866" author="shadow" created="Tue, 15 May 2018 08:58:49 +0000"  >&lt;p&gt;Some additional info:&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;root@snx11205n000 #test-dir.0&amp;#93;&lt;/span&gt;# lfs getstripe mdtest_tree.19.0/file.mdtest.19.2004&lt;br/&gt;
mdtest_tree.19.0/file.mdtest.19.2004&lt;br/&gt;
lmm_stripe_count:   4&lt;br/&gt;
lmm_stripe_size:    1048576&lt;br/&gt;
lmm_pattern:        1&lt;br/&gt;
lmm_layout_gen:     0&lt;br/&gt;
lmm_stripe_offset:  0&lt;br/&gt;
        obdidx           objid           objid           group&lt;br/&gt;
             0      2438339247     0x91561aaf      0x240000400&lt;br/&gt;
             1      2443066678     0x919e3d36      0x2c0000400&lt;br/&gt;
             3      2429947051     0x90d60cab      0x280000400&lt;br/&gt;
             2      2428369630     0x90bdfade      0x200000400&lt;/p&gt;

&lt;p&gt;From the kernel log:&lt;/p&gt;

&lt;p&gt;Apr 24 12:52:23 snx11205n007 kernel: Lustre: snx11205-OST0003: deleting orphan objects from 0x280000400:2429947060 to 0x280000400:2429962999&lt;br/&gt;
Apr 24 12:52:23 snx11205n005 kernel: Lustre: snx11205-OST0001: deleting orphan objects from 0x2c0000400:2443066688 to 0x2c0000400:2443071739&lt;br/&gt;
Apr 24 12:52:23 snx11205n004 kernel: Lustre: snx11205-OST0000: deleting orphan objects from 0x240000400:2438339247 to 0x240000400:2438353837&lt;br/&gt;
Apr 24 12:52:23 snx11205n006 kernel: Lustre: snx11205-OST0002: deleting orphan objects from 0x200000400:2428369639 to 0x200000400:2428385256&lt;/p&gt;


&lt;p&gt;Objects of the problem file:&lt;br/&gt;
OST0 objects 2438339247 - 2438339257    &amp;lt;&amp;lt;&amp;lt; these objects were deleted&lt;/p&gt;

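&lt;p&gt;The overlap between the file&apos;s stripe objects and the deleted orphan ranges can be checked with plain arithmetic. The sketch below uses only the numbers from the getstripe output and the kernel log above; the struct and function names are hypothetical and exist only for this illustration.&lt;/p&gt;
<![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One stripe of the file plus the orphan range its OST reported. */
struct stripe_check {
	unsigned int ost_idx;
	uint64_t objid;         /* objid column from lfs getstripe */
	uint64_t orphan_first;  /* "deleting orphan objects from ..." */
	uint64_t orphan_last;   /* "... to ..." */
};

/* Values copied from the getstripe output and the kernel log. */
const struct stripe_check stripes[4] = {
	{ 0, 2438339247ULL, 2438339247ULL, 2438353837ULL },
	{ 1, 2443066678ULL, 2443066688ULL, 2443071739ULL },
	{ 3, 2429947051ULL, 2429947060ULL, 2429962999ULL },
	{ 2, 2428369630ULL, 2428369639ULL, 2428385256ULL },
};

/* Return 1 if the stripe object falls inside the deleted orphan range. */
int stripe_was_deleted(const struct stripe_check *s)
{
	return s->objid >= s->orphan_first && s->objid <= s->orphan_last;
}

/* Count how many stripes of the file lost their object. */
size_t count_deleted(const struct stripe_check *s, size_t n)
{
	size_t i, lost = 0;

	assert(s != NULL);
	for (i = 0; i < n; i++)
		lost += (size_t)stripe_was_deleted(&s[i]);
	return lost;
}
```
]]>
&lt;p&gt;With these inputs only the OST0000 stripe (objid 2438339247, 0x91561aaf) falls inside its orphan range; the objects on the other three OSTs sit just below the start of theirs.&lt;/p&gt;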
&lt;p&gt;$ ./a.out mdt1.pre_mnt.lov_objid&lt;br/&gt;
91561aae&lt;br/&gt;
919e3d3f&lt;br/&gt;
90bdfae6&lt;br/&gt;
90d60cb3&lt;/p&gt;</comment>
                            <comment id="227867" author="shadow" created="Tue, 15 May 2018 09:13:23 +0000"  >&lt;p&gt;I think the problem is that osp_create_object has no protection against parallel execution: the first thread can be delayed waiting for a read from disk while a second thread takes the fast path for the same read.&lt;/p&gt;

&lt;p&gt;One solution is simple: use a mutex to protect the whole write, as i_mutex does, but that is a problem from a performance perspective. Another solution is to move the write buffer into the osd layer and provide a callback to fill the buffer.&lt;br/&gt;
That allows less locking (especially in the ldiskfs case; a mutex will be needed for zfs anyway), but it needs more work and a change to the dt_record_write prototype.&lt;/p&gt;</comment>
                            <comment id="227870" author="bzzz" created="Tue, 15 May 2018 11:47:10 +0000"  >&lt;p&gt;&amp;gt;&#160; I think problem is osp_create_object isn&apos;t have protection against parallel execution&lt;/p&gt;

&lt;p&gt;Not sure exactly what you mean, but osp_precreate_get_fid() does serialise FID assignment using opd_pre_lock.&lt;/p&gt;</comment>
                            <comment id="227891" author="shadow" created="Tue, 15 May 2018 16:22:36 +0000"  >&lt;p&gt;Alex,&lt;/p&gt;

&lt;p&gt;Yes, osp_precreate_get_fid assigns the right FID to the osp object. But what about storing the last used FID to disk? That happens at the bottom of the osp_object_create (osp_create) function.&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;        /* Only need update last_used oid file, seq file will only be update
         * during seq rollover */
        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (fid_is_idif((last_fid)))
                osi-&amp;gt;osi_id = fid_idif_id(fid_seq(last_fid),
                                          fid_oid(last_fid), fid_ver(last_fid));
        &lt;span class=&quot;code-keyword&quot;&gt;else&lt;/span&gt;
                osi-&amp;gt;osi_id = fid_oid(last_fid);
        osp_objid_buf_prep(&amp;amp;osi-&amp;gt;osi_lb, &amp;amp;osi-&amp;gt;osi_off,
                           &amp;amp;osi-&amp;gt;osi_id, d-&amp;gt;opd_index);

        rc = dt_record_write(env, d-&amp;gt;opd_last_used_oid_file, &amp;amp;osi-&amp;gt;osi_lb,
                             &amp;amp;osi-&amp;gt;osi_off, th);
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This code takes last_fid and stores it into the buffer, but the dt_record_write calls can be reordered for various reasons, such as the fast versus slow path in block reading or in the get_write_access code.&lt;/p&gt;</comment>
                            <comment id="230799" author="gerrit" created="Tue, 24 Jul 2018 08:30:50 +0000"  >&lt;p&gt;Alexey Lyashkov (c17817@cray.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/32866&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/32866&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11020&quot; title=&quot;Data corruption during MDT&amp;lt;&amp;gt;OST recovery&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11020&quot;&gt;&lt;del&gt;LU-11020&lt;/del&gt;&lt;/a&gt; osd: use right sync&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: b62e159963b02957f18e35cb9cb7997ed9d4dede&lt;/p&gt;</comment>
                            <comment id="230800" author="gerrit" created="Tue, 24 Jul 2018 08:30:51 +0000"  >&lt;p&gt;Alexey Lyashkov (c17817@cray.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/32867&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/32867&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11020&quot; title=&quot;Data corruption during MDT&amp;lt;&amp;gt;OST recovery&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11020&quot;&gt;&lt;del&gt;LU-11020&lt;/del&gt;&lt;/a&gt; osp: fix race when lov objid update&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: fc4c31ff634fd10a1d5a805426a1879e4b27bd3c&lt;/p&gt;</comment>
                            <comment id="234161" author="gerrit" created="Mon, 1 Oct 2018 14:00:36 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/32866/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/32866/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11020&quot; title=&quot;Data corruption during MDT&amp;lt;&amp;gt;OST recovery&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11020&quot;&gt;&lt;del&gt;LU-11020&lt;/del&gt;&lt;/a&gt; osd: use right sync&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: e2b4a521e260cb7b121dc51d4c29d4d47b7c1e1e&lt;/p&gt;</comment>
                            <comment id="237128" author="gerrit" created="Sat, 17 Nov 2018 01:25:50 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/32867/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/32867/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11020&quot; title=&quot;Data corruption during MDT&amp;lt;&amp;gt;OST recovery&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11020&quot;&gt;&lt;del&gt;LU-11020&lt;/del&gt;&lt;/a&gt; osp: fix race during lov_objids update&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 8cd4760536d7f423db87c67bdc8214f13ede3ca8&lt;/p&gt;</comment>
                            <comment id="237141" author="pjones" created="Sat, 17 Nov 2018 06:42:39 +0000"  >&lt;p&gt;Landed for 2.12&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="30185" name="lost1" size="11298" author="shadow" created="Tue, 15 May 2018 08:52:28 +0000"/>
                            <attachment id="30186" name="mdt1.pre_mnt.lov_objid" size="32" author="shadow" created="Tue, 15 May 2018 08:55:37 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzzx6v:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>