<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:13:46 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92">
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-1129] filter_handle_precreate()) ASSERTION(diff &gt;= 0) failed</title>
                <link>https://jira.whamcloud.com/browse/LU-1129</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;It seems my async journal code for 2.0 port had a bug in it regarding recovery.&lt;/p&gt;

&lt;p&gt;When the object is not created we try to recreate it, but we also need to set a recreate flag.&lt;/p&gt;

&lt;p&gt;The solution is something like this. In filter_preprw_write:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;                        if (oa == NULL) {
                                OBDO_ALLOC(noa);
                                if (noa == NULL)
                                        GOTO(recreate_out, rc = -ENOMEM);
                                noa-&amp;gt;o_id = obj-&amp;gt;ioo_id;
                                noa-&amp;gt;o_valid = OBD_MD_FLID;
                        }
+                if ((oa-&amp;gt;o_valid &amp;amp; OBD_MD_FLFLAGS) == 0) {
+                        oa-&amp;gt;o_valid |= OBD_MD_FLFLAGS;
+                        oa-&amp;gt;o_flags = OBD_FL_RECREATE_OBJS;
+                } else {
+                        oa-&amp;gt;o_flags |= OBD_FL_RECREATE_OBJS;
+                }

                        if (filter_create(exp, noa, NULL, oti) == 0) {
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Without it we always end up in the precreate code that tries to allocate multiple objects, and if we happen to allocate below last_id (which should be pretty rare, I guess? I cannot come up with any very plausible scenarios, but it apparently does happen), this assertion triggers as a result.&lt;/p&gt;</description>
                <environment></environment>
        <key id="13269">LU-1129</key>
            <summary>filter_handle_precreate()) ASSERTION(diff &gt;= 0) failed</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="yujian">Jian Yu</assignee>
                                    <reporter username="green">Oleg Drokin</reporter>
                        <labels>
                    </labels>
                <created>Wed, 22 Feb 2012 23:57:29 +0000</created>
                <updated>Fri, 22 Feb 2013 11:12:58 +0000</updated>
                            <resolved>Fri, 22 Feb 2013 11:12:58 +0000</resolved>
                                    <version>Lustre 2.2.0</version>
                    <version>Lustre 2.1.1</version>
                                    <fixVersion>Lustre 2.3.0</fixVersion>
                    <fixVersion>Lustre 2.1.3</fixVersion>
                    <fixVersion>Lustre 1.8.9</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>5</watches>
                                                                            <comments>
                            <comment id="41155" author="jaylan" created="Tue, 26 Jun 2012 14:21:28 +0000"  >&lt;p&gt;I think we hit this LBUG today at NASA Ames after recovery.&lt;/p&gt;


&lt;p&gt;LustreError: 7249:0:(ost_handler.c:1067:ost_brw_write()) client csum 3b637596, original server csum 7f5de8f4, server csum now 7f5de8f4^M&lt;br/&gt;
LustreError: 7250:0:(filter.c:3685:filter_handle_precreate()) ASSERTION(diff &amp;gt;= 0) failed: nbp1-OST0025: 44069624 - 44069830 = -206^M&lt;br/&gt;
LustreError: 7249:0:(filter.c:3685:filter_handle_precreate()) ASSERTION(diff &amp;gt;= 0) failed: nbp1-OST0065: 42271430 - 42271591 = -161^M&lt;br/&gt;
LustreError: 7249:0:(filter.c:3685:filter_handle_precreate()) LBUG^M&lt;br/&gt;
Pid: 7249, comm: ll_ost_io_126^M&lt;br/&gt;
^M&lt;br/&gt;
Call Trace:^M&lt;br/&gt;
 &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0568855&amp;gt;&amp;#93;&lt;/span&gt; libcfs_debug_dumpstack+0x55/0x80 &lt;span class=&quot;error&quot;&gt;&amp;#91;libcfs&amp;#93;&lt;/span&gt;^M&lt;br/&gt;
23 out of 24 cpus in kdb, waiting for the rest, timeout in 10 second(s)^M&lt;br/&gt;
 &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0568e95&amp;gt;&amp;#93;&lt;/span&gt; lbug_with_loc+0x75/0xe0 &lt;span class=&quot;error&quot;&gt;&amp;#91;libcfs&amp;#93;&lt;/span&gt;^M&lt;br/&gt;
 &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa082dc9d&amp;gt;&amp;#93;&lt;/span&gt; filter_create+0x160d/0x1640 &lt;span class=&quot;error&quot;&gt;&amp;#91;obdfilter&amp;#93;&lt;/span&gt;^M&lt;br/&gt;
 &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0844550&amp;gt;&amp;#93;&lt;/span&gt; ? filter_alloc_iobuf+0x170/0x850 &lt;span class=&quot;error&quot;&gt;&amp;#91;obdfilter&amp;#93;&lt;/span&gt;^M&lt;br/&gt;
 &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0838b51&amp;gt;&amp;#93;&lt;/span&gt; filter_preprw_write+0x6c1/0x1f10 &lt;span class=&quot;error&quot;&gt;&amp;#91;obdfilter&amp;#93;&lt;/span&gt;^M&lt;br/&gt;
 &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0569a13&amp;gt;&amp;#93;&lt;/span&gt; ? cfs_alloc+0x63/0x90 &lt;span class=&quot;error&quot;&gt;&amp;#91;libcfs&amp;#93;&lt;/span&gt;^M&lt;br/&gt;
 &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa079eacd&amp;gt;&amp;#93;&lt;/span&gt; ? null_alloc_rs+0x19d/0x320 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;^M&lt;br/&gt;
 &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa05e556f&amp;gt;&amp;#93;&lt;/span&gt; ? LNetPut+0x2bf/0x7f0 &lt;span class=&quot;error&quot;&gt;&amp;#91;lnet&amp;#93;&lt;/span&gt;^M&lt;br/&gt;
 &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa078bd54&amp;gt;&amp;#93;&lt;/span&gt; ? sptlrpc_svc_alloc_rs+0x74/0x2d0 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;^M&lt;br/&gt;
 &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa075df74&amp;gt;&amp;#93;&lt;/span&gt; ? lustre_msg_add_version+0x94/0x110 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;^M&lt;br/&gt;
 &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa075e205&amp;gt;&amp;#93;&lt;/span&gt; ? lustre_pack_reply_v2+0x215/0x2e0 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;^M&lt;br/&gt;
 &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa075d085&amp;gt;&amp;#93;&lt;/span&gt; ? lustre_msg_buf+0x85/0x90 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;^M&lt;br/&gt;
 &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa083b298&amp;gt;&amp;#93;&lt;/span&gt; filter_preprw+0x68/0x90 &lt;span class=&quot;error&quot;&gt;&amp;#91;obdfilter&amp;#93;&lt;/span&gt;^M&lt;br/&gt;
 &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa06e17fe&amp;gt;&amp;#93;&lt;/span&gt; obd_preprw+0x11e/0x420 &lt;span class=&quot;error&quot;&gt;&amp;#91;ost&amp;#93;&lt;/span&gt;^M&lt;br/&gt;
 &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa06ebc3e&amp;gt;&amp;#93;&lt;/span&gt; ost_brw_write+0x9fe/0x1850 &lt;span class=&quot;error&quot;&gt;&amp;#91;ost&amp;#93;&lt;/span&gt;^M&lt;br/&gt;
 &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa075cce4&amp;gt;&amp;#93;&lt;/span&gt; ? lustre_msg_get_opc+0x94/0x100 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;^M&lt;br/&gt;
 &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa06f06e5&amp;gt;&amp;#93;&lt;/span&gt; ost_handle+0x3325/0x4ba0 &lt;span class=&quot;error&quot;&gt;&amp;#91;ost&amp;#93;&lt;/span&gt;^M&lt;br/&gt;
 &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0a28bd6&amp;gt;&amp;#93;&lt;/span&gt; ? vvp_session_key_init+0x76/0x1d0 &lt;span class=&quot;error&quot;&gt;&amp;#91;lustre&amp;#93;&lt;/span&gt;^M&lt;br/&gt;
 &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa075cce4&amp;gt;&amp;#93;&lt;/span&gt; ? lustre_msg_get_opc+0x94/0x100 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;^M&lt;br/&gt;
 &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa076d7be&amp;gt;&amp;#93;&lt;/span&gt; ptlrpc_main+0xb7e/0x18f0 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;^M&lt;br/&gt;
 &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa076cc40&amp;gt;&amp;#93;&lt;/span&gt; ? ptlrpc_main+0x0/0x18f0 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;^M&lt;br/&gt;
 &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8100c14a&amp;gt;&amp;#93;&lt;/span&gt; child_rip+0xa/0x20^M&lt;br/&gt;
 &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa076cc40&amp;gt;&amp;#93;&lt;/span&gt; ? ptlrpc_main+0x0/0x18f0 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;^M&lt;br/&gt;
 &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa076cc40&amp;gt;&amp;#93;&lt;/span&gt; ? ptlrpc_main+0x0/0x18f0 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;^M&lt;br/&gt;
 &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8100c140&amp;gt;&amp;#93;&lt;/span&gt; ? child_rip+0x0/0x20^M&lt;br/&gt;
^M&lt;br/&gt;
Kernel panic - not syncing: LBUG^M&lt;br/&gt;
Pid: 7249, comm: ll_ost_io_126 Not tainted 2.6.32-220.4.1.el6.20120130.x86_64.lustre211 #1^M&lt;/p&gt;
</comment>
                            <comment id="41167" author="jaylan" created="Tue, 26 Jun 2012 18:29:17 +0000"  >&lt;p&gt;BTW, the OSS was running 2.1.1 with some LU patches that fixed LBUG problems:&lt;br/&gt;
   LU 685, 1092, 1095, 1098, 1166, 1350, 1467.&lt;/p&gt;

&lt;p&gt;Could you put your patch through the testing and review cycle if you have completed it? Thanks!&lt;/p&gt;</comment>
                            <comment id="41297" author="pjones" created="Fri, 29 Jun 2012 00:43:37 +0000"  >&lt;p&gt;Yujian&lt;/p&gt;

&lt;p&gt;Could you please look into this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="41401" author="yujian" created="Tue, 3 Jul 2012 10:31:11 +0000"  >&lt;p&gt;Patch for b2_1 branch is in &lt;a href=&quot;http://review.whamcloud.com/3264&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/3264&lt;/a&gt;.&lt;/p&gt;</comment>
                            <comment id="41418" author="adilger" created="Tue, 3 Jul 2012 16:45:22 +0000"  >&lt;p&gt;Yu Jian,&lt;br/&gt;
can you please also submit a patch for master, since it can get much more testing after lustre-review testing, by being landed to the master branch and undergoing full/scale/recovery testing.&lt;/p&gt;

&lt;p&gt;Also, this change likely affects OFD, so the master version of this patch should include a fix for the OFD code.&lt;/p&gt;</comment>
                            <comment id="41419" author="adilger" created="Tue, 3 Jul 2012 17:03:19 +0000"  >&lt;p&gt;Looking at the patch more closely, I&apos;m not sure the problem analysis of this bug is correct.&lt;/p&gt;

&lt;p&gt;The code that is being patched is intended to be run in the case where a very recent object precreate (which is currently asynchronous) is lost because of no journal commit.  After the OSS crash/recovery, the client may submit its writes before the MDS-&amp;gt;OSS object recovery has completed, so the async write needs to recreate the object.&lt;/p&gt;

&lt;p&gt;However, in the error case being hit here the precreate was NOT lost, since LAST_ID is written to disk and is triggering the assertion failure.  This instead implies that the object being written to was missing for some other reason, such as filesystem corruption/e2fsck, or possibly being unlinked/destroyed while the write was still in progress.  This latter case &lt;em&gt;shouldn&apos;t&lt;/em&gt; happen, since the OST will cancel all extent locks on the object before it is destroyed.&lt;/p&gt;

&lt;p&gt;Rather, I suspect the root of this problem may be that two threads are racing to &quot;recreate&quot; the missing objects during recovery?  I &lt;em&gt;think&lt;/em&gt; that recovery is now multi-threaded in 2.x, and the fo_create_lock on the precreate call is only taken inside filter_handle_precreate(), so the check done in filter_preprw_write() may be outdated by the time filter_handle_precreate() is called.&lt;/p&gt;

&lt;p&gt;Is this problem reproducible?  It would be good to have a test case for this.&lt;/p&gt;

&lt;p&gt;If yes, it would be useful to print in filter_preprw_write() what the current LAST_ID value is at the time the object is missing.  This will allow us to determine whether the precreate and write are racing (if LAST_ID &amp;lt; o_id, but it becomes &amp;gt; o_id inside filter_handle_precreate()), or if the object is missing for some other reason (if LAST_ID &amp;gt;= o_id in filter_preprw_write()).&lt;/p&gt;
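
&lt;p&gt;(For illustration, the instrumentation might look like the following C-style sketch; the exact variables available at that point in filter_preprw_write() are assumptions, not the actual code:)&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;        /* filter_preprw_write(): log LAST_ID when the object is missing */
        if (dentry-&amp;gt;d_inode == NULL)
                CERROR(&quot;%s: missing object &quot;LPU64&quot;, LAST_ID is &quot;LPU64&quot;\n&quot;,
                       exp-&amp;gt;exp_obd-&amp;gt;obd_name, obj-&amp;gt;ioo_id,
                       filter_last_id(&amp;amp;exp-&amp;gt;exp_obd-&amp;gt;u.filter,
                                      oa ? oa-&amp;gt;o_seq : 0));
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;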
</comment>
                            <comment id="41425" author="jaylan" created="Tue, 3 Jul 2012 17:37:30 +0000"  >&lt;p&gt;I have not seen this again yet. Those are production systems, so the focus is to take the dump and get the systems back in service ASAP.&lt;br/&gt;
Certainly we would not encourage our users to try to reproduce the problem. &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/smile.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;br/&gt;
A lot of people lose productivity when a Lustre server crashes.&lt;/p&gt;</comment>
                            <comment id="41453" author="yujian" created="Wed, 4 Jul 2012 05:12:55 +0000"  >&lt;blockquote&gt;
&lt;p&gt;Yu Jian,&lt;br/&gt;
can you please also submit a patch for master, since it can get much more testing after lustre-review testing, by being landed to the master branch and undergoing full/scale/recovery testing.&lt;/p&gt;

&lt;p&gt;Also, this change likely affects OFD, so the master version of this patch should include a fix for the OFD code.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Sure, I&apos;ll run the recovery-mds-scale subtest failover_ost to try to reproduce this issue (I hit it before in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1121&quot; title=&quot;recovery-mds-scale (FLAVOR=OSS): tar: Wrote only 4096 of 7168 bytes&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1121&quot;&gt;&lt;del&gt;LU-1121&lt;/del&gt;&lt;/a&gt;). Once I understand the root cause of this problem, I&apos;ll also create a patch for the master branch.&lt;/p&gt;</comment>
                            <comment id="41661" author="yujian" created="Tue, 10 Jul 2012 11:37:40 +0000"  >&lt;p&gt;I&apos;ve run the recovery-mds-scale (FLAVOR=OSS) test about 20 times on Lustre 2.1.1 but failed to reproduce the issue.&lt;/p&gt;

&lt;p&gt;By looking into filter_preprw_write() again, I found that patch set 1 in &lt;a href=&quot;http://review.whamcloud.com/#change,3264&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#change,3264&lt;/a&gt; was incorrect. Since the code inside &quot;if (exp-&amp;gt;exp_obd-&amp;gt;obd_recovering) {}&quot; is intended to recreate the missing precreated object, the o_id will be &amp;gt; LAST_ID. If we set the OBD_FL_RECREATE_OBJS flag there, then after the code goes into filter_create(), it will hit the following error:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;                if (!obd-&amp;gt;obd_recovering ||
                    oa-&amp;gt;o_id &amp;gt; filter_last_id(filter, oa-&amp;gt;o_seq)) {
                        CERROR(&quot;recreate objid &quot;LPU64&quot; &amp;gt; last id &quot;LPU64&quot;\n&quot;,
                               oa-&amp;gt;o_id, filter_last_id(filter, oa-&amp;gt;o_seq));
                        rc = -EINVAL;
                    ......
                }
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;So, we should not apply that patch set.&lt;/p&gt;

&lt;p&gt;In addition, I found the OBD_FL_RECREATE_OBJS flag was only set in ll_lov_recreate(), which was ultimately only used by ll_file_ioctl(). It seems all of the code under the (oa-&amp;gt;o_flags &amp;amp; OBD_FL_RECREATE_OBJS) conditions in filter_create() and filter_precreate() will never be executed unless the LL_IOC_RECREATE_OBJ and LL_IOC_RECREATE_FID ioctl commands are used.&lt;/p&gt;

&lt;p&gt;Now, I also think that a race condition while recreating missing precreated objects is more likely the cause of the issue than recreating a missing o_id &amp;lt;= LAST_ID object for some unknown reason.&lt;/p&gt;

&lt;p&gt;While different ll_ost_io threads handle different OST_WRITE requests (which all need to recreate missing precreated objects) in parallel during recovery, the &quot;if (dentry-&amp;gt;d_inode == NULL) {}&quot; check in filter_preprw_write() is subject to a race condition and needs to be protected. It is likely that one request enters that branch, recreates the missing precreated objects and sets a new LAST_ID, while the other request also enters that branch but finally finds the o_id has become &amp;lt; the new LAST_ID.&lt;/p&gt;

&lt;p&gt;I&apos;ll look more and try to create a patch.&lt;/p&gt;</comment>
                            <comment id="41736" author="yujian" created="Thu, 12 Jul 2012 04:57:15 +0000"  >&lt;blockquote&gt;&lt;p&gt;While different ll_ost_io threads handle different OST_WRITE requests (which all need to recreate missing precreated objects) in parallel during recovery, the &quot;if (dentry-&amp;gt;d_inode == NULL) {}&quot; check in filter_preprw_write() is subject to a race condition and needs to be protected. It is likely that one request enters that branch, recreates the missing precreated objects and sets a new LAST_ID, while the other request also enters that branch but finally finds the o_id has become &amp;lt; the new LAST_ID.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;The above analysis is incorrect. By looking into filter_prep()-&amp;gt;target_recovery_init(), we can see that there is still only one target_recovery_thread (tgt_recov), which handles replayed requests from all of the clients serially, in transaction number order. There is no way for the recovery thread to handle different replayed OST_WRITE requests in parallel, so the above race cannot happen.&lt;/p&gt;

&lt;p&gt;Instead, the race can happen while handling a replayed OST_WRITE request during the MDS-&amp;gt;OST orphan recovery period:&lt;/p&gt;

&lt;p&gt;After an OST is restarted and re-establishes communication with the MDS, the MDS and OST automatically perform orphan recovery to destroy any objects that belong to files that were deleted while the OST was unavailable. This is done by mds_lov_clear_orphans(), which sends an OST_CREATE request with the OBD_FL_DELORPHAN flag to the OST. This create request, handled inside filter_handle_precreate(), will in fact either create or destroy: if the LAST_ID on the OST is less than the record on the MDS, the missing precreated objects will be recreated; if the LAST_ID on the OST is greater than the record on the MDS, the orphan objects on the OST will be deleted.&lt;/p&gt;
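
&lt;p&gt;(In C-style pseudocode, the create-or-destroy decision described above is roughly the following; the helper names are illustrative, not a quote of the actual code:)&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;        /* filter_handle_precreate(), OBD_FL_DELORPHAN case (sketch) */
        diff = oa-&amp;gt;o_id - filter_last_id(filter, oa-&amp;gt;o_seq);
        if (diff &amp;gt;= 0)
                /* MDS record is ahead: recreate the missing
                 * precreated objects up to the MDS record */
                rc = filter_precreate(obd, oa, oa-&amp;gt;o_seq, &amp;amp;diff);
        else
                /* OST LAST_ID is ahead: destroy the orphan
                 * objects above the MDS record */
                rc = filter_destroy_precreated(exp, oa, filter);
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;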

&lt;p&gt;So, while the OST recovery thread is handling a replayed OST_WRITE request from one client, it finds the precreated object missing and goes into filter_handle_precreate() to recreate it; at the same time, the missing precreated objects are also being recreated by the above MDS-&amp;gt;OST synchronization mechanism. If the latter holds the fo_create_locks first, the LAST_ID will be updated with a new value, and when the former then acquires the fo_create_locks, it finds the o_id has become less than the LAST_ID, which causes the assertion failure.&lt;/p&gt;

&lt;p&gt;To handle the above race condition, we can simply add a check to the LASSERTF(diff &amp;gt;= 0, ...) to see whether the OST is in its recovery period. If it is, no assertion failure occurs and filter_handle_precreate() just returns to filter_preprw_write(), which will call filter_fid2dentry() again to find the recreated object. If the OST is not in recovery, the assertion fails as before.&lt;/p&gt;
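
&lt;p&gt;(Roughly, the proposed check might look like the following C-style sketch; the variable names and message text here are illustrative, not the actual patch:)&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;        /* filter_handle_precreate(): sketch of the proposed check */
        diff = oa-&amp;gt;o_id - filter_last_id(filter, oa-&amp;gt;o_seq);
        if (diff &amp;lt; 0 &amp;amp;&amp;amp; obd-&amp;gt;obd_recovering) {
                /* MDS-&amp;gt;OST orphan recovery already recreated the
                 * objects and advanced LAST_ID; return so the caller
                 * can look the object up again instead of asserting */
                GOTO(out, rc = 0);
        }
        LASSERTF(diff &amp;gt;= 0, &quot;%s: &quot;LPU64&quot; - &quot;LPU64&quot; = &quot;LPD64&quot;\n&quot;,
                 obd-&amp;gt;obd_name, oa-&amp;gt;o_id,
                 filter_last_id(filter, oa-&amp;gt;o_seq), diff);
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;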

&lt;p&gt;Patch for b2_1 branch is updated in &lt;a href=&quot;http://review.whamcloud.com/3264&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/3264&lt;/a&gt;.&lt;/p&gt;</comment>
                            <comment id="41793" author="yujian" created="Fri, 13 Jul 2012 08:39:04 +0000"  >&lt;p&gt;Patch for master branch is in &lt;a href=&quot;http://review.whamcloud.com/3391&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/3391&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In the OFD code, ofd_preprw_write() uses dt_create() instead of ofd_create() to recreate the missing object. dt_write_lock() is used to avoid racing with the recreation of missing objects during the MDS-&amp;gt;OST orphan recovery period. In addition, in ofd_precreate_object(), which is called by ofd_create(), the following code also exists to handle the race condition:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;        if (unlikely(ofd_object_exists(fo))) {
                /* object may exist being re-created by write replay */
                CDEBUG(D_INODE, &quot;object %u/&quot;LPD64&quot; exists: &quot;DFID&quot;\n&quot;,
                       (unsigned) group, id, PFID(&amp;amp;info-&amp;gt;fti_fid));
                rc = dt_trans_start_local(env, ofd-&amp;gt;ofd_osd, th);
                if (rc)
                        GOTO(trans_stop, rc);
                GOTO(last_id_write, rc);
        }
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;So, we do not need to patch the OFD code.&lt;/p&gt;</comment>
                            <comment id="42024" author="yujian" created="Thu, 19 Jul 2012 19:20:22 +0000"  >&lt;p&gt;Patches were landed on master and b2_1 branches.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="13238">LU-1121</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzv687:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>4543</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>