<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:47:19 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-4957] :osp_precreate_send()) ASSERTION( osp_fid_diff(fid, &amp;d-&gt;opd_pre_used_fid) &gt; 0 ) failed: reply fid [0x100000001:0x0:0x0] pre used fid [0x100000000:0x1e8:0x0]</title>
                <link>https://jira.whamcloud.com/browse/LU-4957</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Hit this during a rolling upgrade. Please see &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4653&quot; title=&quot;Hit LBUG ASSERTION( fid_seq(fid1) == fid_seq(fid2) ) failed after upgrade OST from 2.5.0 to 2.6&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4653&quot;&gt;&lt;del&gt;LU-4653&lt;/del&gt;&lt;/a&gt; for detailed steps.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;fat-amd-1.lab.whamcloud.com login: root
Password: 
Lustre: lustre-MDT0000: Recovery over after 0:10, of 2 clients 2 recovered and 0 were evicted.
LustreError: 2166:0:(osp_precreate.c:476:osp_precreate_send()) ASSERTION( osp_fid_diff(fid, &amp;amp;d-&amp;gt;opd_pre_used_fid) &amp;gt; 0 ) failed: reply fid [0x100000001:0x0:0x0] pre used fid [0x100000000:0x1e8:0x0]
LustreError: 2166:0:(osp_precreate.c:476:osp_precreate_send()) LBUG
Pid: 2166, comm: osp-pre-0

Call Trace:
 [&amp;lt;ffffffffa0209895&amp;gt;] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 [&amp;lt;ffffffffa0209e97&amp;gt;] lbug_with_loc+0x47/0xb0 [libcfs]
 [&amp;lt;ffffffffa0b119d7&amp;gt;] osp_precreate_send+0x1a47/0x1b00 [osp]
 [&amp;lt;ffffffffa0491304&amp;gt;] ? lustre_msg_set_timeout+0x74/0xc0 [ptlrpc]
 [&amp;lt;ffffffffa0b11f79&amp;gt;] osp_precreate_thread+0x4e9/0xc50 [osp]
 [&amp;lt;ffffffff810096f0&amp;gt;] ? __switch_to+0xd0/0x320
 [&amp;lt;ffffffff81065df0&amp;gt;] ? default_wake_function+0x0/0x20
 [&amp;lt;ffffffffa0b11a90&amp;gt;] ? osp_precreate_thread+0x0/0xc50 [osp]
 [&amp;lt;ffffffff8109aee6&amp;gt;] kthread+0x96/0xa0
 [&amp;lt;ffffffff8100c20a&amp;gt;] child_rip+0xa/0x20
 [&amp;lt;ffffffff8109ae50&amp;gt;] ? kthread+0x0/0xa0
 [&amp;lt;ffffffff8100c200&amp;gt;] ? child_rip+0x0/0x20
Last login: Wed Apr 23 17:31:38 on ttyS0

Kernel panic - not syncing: LBUG
Pid: 2166, comm: osp-pre-0 Not tainted 2.6.32-431.5.1.el6_lustre.x86_64 #1
Call Trace:
 [&amp;lt;ffffffff81527983&amp;gt;] ? panic+0xa7/0x16f
 [&amp;lt;ffffffffa0209eeb&amp;gt;] ? lbug_with_loc+0x9b/0xb0 [libcfs]
 [&amp;lt;ffffffffa0b119d7&amp;gt;] ? osp_precreate_send+0x1a47/0x1b00 [osp]
 [&amp;lt;ffffffffa0491304&amp;gt;] ? lustre_msg_set_timeout+0x74/0xc0 [ptlrpc]
 [&amp;lt;ffffffffa0b11f79&amp;gt;] ? osp_precreate_thread+0x4e9/0xc50 [osp]
 [&amp;lt;ffffffff810096f0&amp;gt;] ? __switch_to+0xd0/0x320
 [&amp;lt;ffffffff81065df0&amp;gt;] ? default_wake_function+0x0/0x20
 [&amp;lt;ffffffffa0b11a90&amp;gt;] ? osp_precreate_thread+0x0/0xc50 [osp]
 [&amp;lt;ffffffff8109aee6&amp;gt;] ? kthread+0x96/0xa0
 [&amp;lt;ffffffff8100c20a&amp;gt;] ? child_rip+0xa/0x20
 [&amp;lt;ffffffff8109ae50&amp;gt;] ? kthread+0x0/0xa0
 [&amp;lt;ffffffff8100c200&amp;gt;] ? child_rip+0x0/0x20
Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</description>
                <environment></environment>
        <key id="24388">LU-4957</key>
            <summary>:osp_precreate_send()) ASSERTION( osp_fid_diff(fid, &amp;d-&gt;opd_pre_used_fid) &gt; 0 ) failed: reply fid [0x100000001:0x0:0x0] pre used fid [0x100000000:0x1e8:0x0]</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="1" iconUrl="https://jira.whamcloud.com/images/icons/priorities/blocker.svg">Blocker</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="di.wang">Di Wang</assignee>
                                    <reporter username="di.wang">Di Wang</reporter>
                        <labels>
                            <label>dne</label>
                    </labels>
                <created>Fri, 25 Apr 2014 06:15:18 +0000</created>
                <updated>Wed, 11 Jun 2014 13:40:28 +0000</updated>
                            <resolved>Wed, 11 Jun 2014 13:40:28 +0000</resolved>
                                    <version>Lustre 2.6.0</version>
                                    <fixVersion>Lustre 2.6.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="82508" author="jlevi" created="Fri, 25 Apr 2014 17:30:13 +0000"  >&lt;p&gt;Alex,&lt;br/&gt;
Could you please look into this one?&lt;br/&gt;
Thank you!&lt;/p&gt;</comment>
                            <comment id="85060" author="sarah" created="Wed, 28 May 2014 19:03:12 +0000"  >&lt;p&gt;debug dump log from MDS&lt;/p&gt;</comment>
                            <comment id="85492" author="jlevi" created="Mon, 2 Jun 2014 18:23:32 +0000"  >&lt;p&gt;Di,&lt;br/&gt;
Could you please take a look at this?&lt;br/&gt;
Thank you!&lt;/p&gt;</comment>
                            <comment id="85543" author="di.wang" created="Tue, 3 Jun 2014 06:26:18 +0000"  >&lt;p&gt;Ok, I will check.&lt;/p&gt;</comment>
                            <comment id="85554" author="bzzz" created="Tue, 3 Jun 2014 10:41:44 +0000"  >&lt;p&gt;Unfortunately, the log isn&apos;t complete (was echo -1 &amp;gt;/proc/sys/lnet/debug missing?)&lt;/p&gt;</comment>
                            <comment id="85668" author="di.wang" created="Tue, 3 Jun 2014 23:18:59 +0000"  >&lt;p&gt;It seems that in 2.6, OFD returns an IDIF to the MDT during precreate, which the old MDT cannot handle. Here is the fix:&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/10580&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/10580&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="85973" author="adilger" created="Fri, 6 Jun 2014 05:37:10 +0000"  >&lt;p&gt;There is still something strange here.  Sequence 0x100000001 is not the IDIF sequence for OST1 (which would be 0x100010000 with the low 16 bits encoding the OST index).  This LASSERT() was not hit on the OSP#1 precreate thread, but rather process &quot;osp-pre-0&quot; which is OSP#0.&lt;/p&gt;

&lt;p&gt;In any case, I don&apos;t think the OSP should LASSERT() on something it got over the network, just return an error and refuse to precreate objects on this OST.&lt;/p&gt;</comment>
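<!--
A minimal sketch of the IDIF sequence layout discussed in the comment above. It assumes the IDIF range starts at 0x100000000, with the OST index in bits 16 to 31 of the sequence and the upper object id bits in bits 0 to 15. The names idif_seq and idif_ost_idx are hypothetical helpers for illustration, not Lustre functions. Under those assumptions, 0x100000001 still decodes to OST index 0, which would be consistent with the assertion firing in osp-pre-0.

```python
FID_SEQ_IDIF = 0x100000000  # assumed base of the IDIF sequence range


def idif_seq(ost_idx: int, objid: int = 0) -> int:
    """Build an IDIF sequence from an OST index and a 48-bit object id (assumed layout)."""
    return FID_SEQ_IDIF | ((ost_idx & 0xFFFF) << 16) | ((objid >> 32) & 0xFFFF)


def idif_ost_idx(seq: int) -> int:
    """Extract the OST index from an IDIF sequence (assumed layout)."""
    return (seq >> 16) & 0xFFFF


print(hex(idif_seq(1)))          # 0x100010000
print(idif_ost_idx(0x100000001)) # 0
```

Under this layout, the trailing 1 in 0x100000001 lands in the object id bits rather than in the OST index field.
-->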
                            <comment id="85976" author="adilger" created="Fri, 6 Jun 2014 05:49:33 +0000"  >&lt;p&gt;In fact, it looks like there is a separate bug in &lt;tt&gt;osp_fid_diff()&lt;/tt&gt; that is being hit here.  Because both of these FID sequences are for OST0000 then it continues on to subtract (0x100000000 - 0x000001e8) but return it as an int, so it returns -0x1e8 and this is what triggers the LASSERT().  Otherwise, osp_fid_diff() would itself have triggered the LASSERT:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;                LASSERTF(ost_idx1 == 0 || ost_idx2 == 0 || ost_idx1 == ost_idx2,
                         &lt;span class=&quot;code-quote&quot;&gt;&quot;fid1: &quot;&lt;/span&gt;DFID&lt;span class=&quot;code-quote&quot;&gt;&quot;, fid2: &quot;&lt;/span&gt;DFID&lt;span class=&quot;code-quote&quot;&gt;&quot;\n&quot;&lt;/span&gt;, PFID(fid1),
                         PFID(fid2));

                &lt;span class=&quot;code-keyword&quot;&gt;return&lt;/span&gt; fid_idif_id(fid1-&amp;gt;f_seq, fid1-&amp;gt;f_oid, 0) -
                       fid_idif_id(fid2-&amp;gt;f_seq, fid2-&amp;gt;f_oid, 0);
        }

        LASSERTF(fid_seq(fid1) == fid_seq(fid2), &lt;span class=&quot;code-quote&quot;&gt;&quot;fid1:&quot;&lt;/span&gt;DFID
                 &lt;span class=&quot;code-quote&quot;&gt;&quot;, fid2:&quot;&lt;/span&gt;DFID&lt;span class=&quot;code-quote&quot;&gt;&quot;\n&quot;&lt;/span&gt;, PFID(fid1), PFID(fid2));
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Sadly, LASSERT() is not a robust form of error handling here either, and this should also be fixed.  What happens in the precreate code if the number of OST objects exceeds 4B and the sequence increases as it does here?  Is there some code path on the OSTs that bumps the OST LAST_ID a lot that may cause a large difference in returned ostids?&lt;/p&gt;</comment>
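<!--
A minimal sketch of the truncation described in the comment above. It assumes fid_idif_id() packs the low 16 bits of the sequence above a 32-bit object id, and that the difference is returned as a 32-bit signed int. The Python helpers only model the C arithmetic; to_int32 and fid_diff_as_int are hypothetical names used for illustration.

```python
def fid_idif_id(seq: int, oid: int, ver: int = 0) -> int:
    """Pack an IDIF FID into a 64-bit object id (assumed layout)."""
    return ((ver << 48) | ((seq & 0xFFFF) << 32) | oid) & 0xFFFFFFFFFFFFFFFF


def to_int32(value: int) -> int:
    """Keep the low 32 bits and reinterpret them as a signed C int."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def fid_diff_as_int(fid1, fid2) -> int:
    """Model a 64-bit subtraction whose result is narrowed to int."""
    return to_int32(fid_idif_id(*fid1) - fid_idif_id(*fid2))


reply_fid = (0x100000001, 0x0, 0x0)       # FIDs taken from the assertion message
pre_used_fid = (0x100000000, 0x1E8, 0x0)

wide = fid_idif_id(*reply_fid) - fid_idif_id(*pre_used_fid)
narrow = fid_diff_as_int(reply_fid, pre_used_fid)
print(hex(wide), narrow)  # 0xfffffe18 -488
```

The full-width difference is positive, but the narrowed value is -0x1e8, so a check that the difference is greater than 0 fails.
-->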
                            <comment id="85984" author="di.wang" created="Fri, 6 Jun 2014 07:09:41 +0000"  >&lt;p&gt;Actually, this bug occurs because OFD (2.6) returns an IDIF directly to OSP (2.5), but the old OSP cannot handle this well: in 2.5, during precreate, OSP and OFD convert the IDIF to the old OSTID (with the MDT0 seq). So when the old OSP receives the IDIF from the new OFD, it treats it as an OSTID and converts it to a wrong FID (2.5 might have a bug as well). That is why we see the strange reply FID &quot;&lt;span class=&quot;error&quot;&gt;&amp;#91;0x100000001:0x0:0x0&amp;#93;&lt;/span&gt;&quot; here, which triggers the LBUG. Hmm, I probably need to fix 2.5 and 2.4 as well.&lt;/p&gt;

&lt;p&gt;Yes, if the number of OST objects exceeds 4B (it will be more for IDIF, 45 bits), the OST will request a new sequence. Hmm, I probably need to add a test case for this.&lt;/p&gt;

&lt;p&gt;Hmm, it is the OSP that tells the OFD how many objects to precreate in one RPC (max 10k). Normally the OSP only requests precreate when its pool is nearly empty, and OST precreate will not exceed 10k per RPC, so the diff will be at most 10k. Even if the OSP and OST are out of sync because of failover, I do not see how the diff could grow beyond 10k, but I need to think about it a bit more.&lt;/p&gt;

</comment>
                            <comment id="86313" author="jlevi" created="Wed, 11 Jun 2014 13:40:28 +0000"  >&lt;p&gt;Patch landed to Master. Please reopen ticket if more work is needed.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="23216">LU-4653</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="15005" name="lustre-log.1401303426.3299" size="2731218" author="sarah" created="Wed, 28 May 2014 19:03:12 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzwkzr:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>13716</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>