<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:46:38 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92">
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-4878] fld_server_lookup() ASSERTION( fld-&gt;lsf_control_exp ) failed</title>
                <link>https://jira.whamcloud.com/browse/LU-4878</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;The following LBUG appeared at a customer site during the mount process on all OSSs, running Lustre version 2.4.3.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;LustreError: 12838:0:(fld_handler.c:172:fld_server_lookup()) ASSERTION(fld-&amp;gt;lsf_control_exp ) failed:
LustreError: 12838:0:(fld_handler.c:172:fld_server_lookup()) LBUG

Pid: 12838, comm: mount.lustre

Call Trace:
 libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 lbug_with_loc+0x47/0xb0 [libcfs]
 fld_server_lookup+0x2f7/0x3d0 [fld]
 osd_fld_lookup+0x71/0x1d0 [osd_ldiskfs]
 osd_remote_fid+0x9a/0x280 [osd_ldiskfs]
 osd_index_ea_lookup+0x521/0x850 [osd_ldiskfs]
 dt_lookup_dir+0x6f/0x130 [obdclass]
 llog_osd_open+0x485/0xc00 [obdclass]
 llog_open+0xba/0x2c0 [obdclass]
 mgc_process_log [mgc]
 mgc_process_config [mgc]
 lustre_process_log [obdclass]
 server_start_targets [obdclass]
 server_fill_super [obdclass]
 lustre_fill_super [obdclass]
 get_sb_nodev
 lustre_get_sb
 vfs_kern_mount
 do_kern_mount
 do_mount
 sys_mount
 system_call_fastpath
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This issue seems to be the same as &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3126&quot; title=&quot;conf-sanity test_41b: fld_server_lookup()) ASSERTION( fld-&amp;gt;lsf_control_exp ) failed&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3126&quot;&gt;&lt;del&gt;LU-3126&lt;/del&gt;&lt;/a&gt;, for which a patch has landed in Lustre 2.5. Unfortunately, no patch has been provided for the Lustre 2.4 release.&lt;/p&gt;</description>
                <environment></environment>
        <key id="24149">LU-4878</key>
            <summary>fld_server_lookup() ASSERTION( fld-&gt;lsf_control_exp ) failed</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="bfaccini">Bruno Faccini</assignee>
                                    <reporter username="pichong">Gregoire Pichon</reporter>
                        <labels>
                            <label>mn4</label>
                    </labels>
                <created>Thu, 10 Apr 2014 14:07:58 +0000</created>
                <updated>Thu, 15 May 2014 14:03:10 +0000</updated>
                            <resolved>Thu, 15 May 2014 14:03:10 +0000</resolved>
                                    <version>Lustre 2.4.3</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>8</watches>
                                                                            <comments>
                            <comment id="81369" author="bfaccini" created="Thu, 10 Apr 2014 14:27:54 +0000"  >&lt;p&gt;Hello Gregoire,&lt;br/&gt;
Did you mean that this same LBUG occurred on all OSSs of the same FS at OST mount time?&lt;br/&gt;
And yes, you are right, the &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3126&quot; title=&quot;conf-sanity test_41b: fld_server_lookup()) ASSERTION( fld-&amp;gt;lsf_control_exp ) failed&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3126&quot;&gt;&lt;del&gt;LU-3126&lt;/del&gt;&lt;/a&gt; patch has not been back-ported to b2_4, but that is mainly because the issue was unlikely to happen ...&lt;/p&gt;</comment>
                            <comment id="81382" author="bogl" created="Thu, 10 Apr 2014 16:08:55 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3126&quot; title=&quot;conf-sanity test_41b: fld_server_lookup()) ASSERTION( fld-&amp;gt;lsf_control_exp ) failed&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3126&quot;&gt;&lt;del&gt;LU-3126&lt;/del&gt;&lt;/a&gt; backport to b2_4:&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/9929&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/9929&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="81383" author="apercher" created="Thu, 10 Apr 2014 16:33:47 +0000"  >&lt;p&gt;Yes, we hit the same LBUG on all OSSs, with or without abort-recov,&lt;br/&gt;
whether mounting all OSTs at the same time or trying to mount&lt;br/&gt;
just one OST manually.&lt;br/&gt;
It happens only on 2.4.3 with some additional patches, and works&lt;br/&gt;
fine on 2.4.2 with some other additional patches.&lt;/p&gt;</comment>
                            <comment id="81530" author="pichong" created="Mon, 14 Apr 2014 15:25:26 +0000"  >&lt;p&gt;I have tested patch #9929 posted in Gerrit. Unfortunately, the OSS still crashes when mounting an OST.&lt;/p&gt;

&lt;p&gt;Here is the stack trace:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;&amp;lt;3&amp;gt;LustreError: 6577:0:(fld_handler.c:174:fld_server_lookup()) srv-fs_pv-OST0000: lookup 0x7d, but not connects to MDT0yet: rc = -5.
&amp;lt;3&amp;gt;LustreError: 6577:0:(osd_handler.c:2135:osd_fld_lookup()) fs_pv-OST0000-osd: cannot find FLD range for 0x7d: rc = -5
&amp;lt;3&amp;gt;LustreError: 6577:0:(osd_handler.c:3344:osd_mdt_seq_exists()) fs_pv-OST0000-osd: Can not lookup fld for 0x7d
&amp;lt;0&amp;gt;LustreError: 6577:0:(osd_handler.c:2651:osd_object_ref_del()) ASSERTION( inode-&amp;gt;i_nlink &amp;gt; 0 ) failed: 
&amp;lt;0&amp;gt;LustreError: 6577:0:(osd_handler.c:2651:osd_object_ref_del()) LBUG
&amp;lt;4&amp;gt;Pid: 6577, comm: mount.lustre
&amp;lt;4&amp;gt;
&amp;lt;4&amp;gt;Call Trace:
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0d57895&amp;gt;] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0d57e97&amp;gt;] lbug_with_loc+0x47/0xb0 [libcfs]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa04197e7&amp;gt;] osd_object_ref_del+0x1e7/0x220 [osd_ldiskfs]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0ec1fee&amp;gt;] llog_osd_destroy+0x48e/0xb20 [obdclass]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0e91d61&amp;gt;] llog_destroy+0x51/0x170 [obdclass]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0e96b34&amp;gt;] llog_erase+0x1c4/0x1e0 [obdclass]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0e97401&amp;gt;] llog_backup+0x231/0x500 [obdclass]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa049ad66&amp;gt;] mgc_process_log+0x1636/0x18f0 [mgc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa049c514&amp;gt;] mgc_process_config+0x594/0xed0 [mgc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0ede64c&amp;gt;] lustre_process_log+0x25c/0xaa0 [obdclass]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0f126d3&amp;gt;] server_start_targets+0x1833/0x19c0 [obdclass]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0f1340c&amp;gt;] server_fill_super+0xbac/0x1660 [obdclass]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0ee3d68&amp;gt;] lustre_fill_super+0x1d8/0x530 [obdclass]
&amp;lt;4&amp;gt; [&amp;lt;ffffffff8118c7df&amp;gt;] get_sb_nodev+0x5f/0xa0
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0edb3b5&amp;gt;] lustre_get_sb+0x25/0x30 [obdclass]
&amp;lt;4&amp;gt; [&amp;lt;ffffffff8118be3b&amp;gt;] vfs_kern_mount+0x7b/0x1b0
&amp;lt;4&amp;gt; [&amp;lt;ffffffff8118bfe2&amp;gt;] do_kern_mount+0x52/0x130
&amp;lt;4&amp;gt; [&amp;lt;ffffffff811acfeb&amp;gt;] do_mount+0x2fb/0x930
&amp;lt;4&amp;gt; [&amp;lt;ffffffff811ad6b0&amp;gt;] sys_mount+0x90/0xe0
&amp;lt;4&amp;gt; [&amp;lt;ffffffff8100b072&amp;gt;] system_call_fastpath+0x16/0x1b
&amp;lt;4&amp;gt;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="81583" author="bfaccini" created="Tue, 15 Apr 2014 07:42:38 +0000"  >&lt;p&gt;Hello Gregoire,&lt;br/&gt;
Thanks for the update.&lt;br/&gt;
But concerning the new and different stack trace you reported, in my opinion it looks more like a different problem, one that has already been reported in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3915&quot; title=&quot;After upgrade from 2.4.0 to 2.5, can not mount OST, (osd_handler.c:2668:osd_object_ref_del()) LBUG Pid: 9537, comm: mount.lustre&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3915&quot;&gt;&lt;del&gt;LU-3915&lt;/del&gt;&lt;/a&gt;.&lt;br/&gt;
So you may now try my b2_4 back-port of master patch #7673 from &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3915&quot; title=&quot;After upgrade from 2.4.0 to 2.5, can not mount OST, (osd_handler.c:2668:osd_object_ref_del()) LBUG Pid: 9537, comm: mount.lustre&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3915&quot;&gt;&lt;del&gt;LU-3915&lt;/del&gt;&lt;/a&gt;, which I just pushed at &lt;a href=&quot;http://review.whamcloud.com/9958&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/9958&lt;/a&gt;.&lt;/p&gt;</comment>
                            <comment id="81585" author="pichong" created="Tue, 15 Apr 2014 07:47:55 +0000"  >&lt;p&gt;The LBUG I hit when testing patch #9929 has the same stack trace as &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3915&quot; title=&quot;After upgrade from 2.4.0 to 2.5, can not mount OST, (osd_handler.c:2668:osd_object_ref_del()) LBUG Pid: 9537, comm: mount.lustre&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3915&quot;&gt;&lt;del&gt;LU-3915&lt;/del&gt;&lt;/a&gt;, which reports an OSS crash when mounting OSTs after an upgrade from 2.4.0 to 2.5.&lt;/p&gt;

&lt;p&gt;In my case, I was upgrading from 2.4.2 to 2.4.3 with a few additional patches including #5049 &quot;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2059&quot; title=&quot;mgc to backup configuration on osd-based llogs&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2059&quot;&gt;&lt;del&gt;LU-2059&lt;/del&gt;&lt;/a&gt; llog: MGC to use OSD API for backup logs&quot;.&lt;/p&gt;

&lt;p&gt;Patch #7673 &quot;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3915&quot; title=&quot;After upgrade from 2.4.0 to 2.5, can not mount OST, (osd_handler.c:2668:osd_object_ref_del()) LBUG Pid: 9537, comm: mount.lustre&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3915&quot;&gt;&lt;del&gt;LU-3915&lt;/del&gt;&lt;/a&gt; osd-ldiskfs: don&apos;t assert on possible upgrade&quot; seems to fix the upgrade issue caused by #5049. So I probably need to add #7673 to the list of patches on top of 2.4.3, do you agree?&lt;/p&gt;

</comment>
                            <comment id="81586" author="bfaccini" created="Tue, 15 Apr 2014 08:01:48 +0000"  >&lt;p&gt;You may have missed my previous update that already confirmed what you finally found!&lt;br/&gt;
So yes, I agree that you can add #7673 or its back-port on top of your 2.4.3 version that also includes #5049 ...&lt;/p&gt;</comment>
                            <comment id="81594" author="pichong" created="Tue, 15 Apr 2014 09:43:48 +0000"  >&lt;p&gt;Thanks for the backport, Bruno. Our comments interleaved!&lt;/p&gt;

&lt;p&gt;I have tested Lustre version 2.4.3 with both additional patches:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;#9929 &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3126&quot; title=&quot;conf-sanity test_41b: fld_server_lookup()) ASSERTION( fld-&amp;gt;lsf_control_exp ) failed&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3126&quot;&gt;&lt;del&gt;LU-3126&lt;/del&gt;&lt;/a&gt; osd: remove fld lookup during configuration&lt;/li&gt;
	&lt;li&gt;#9958 &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4878&quot; title=&quot;fld_server_lookup() ASSERTION( fld-&amp;gt;lsf_control_exp ) failed&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4878&quot;&gt;&lt;del&gt;LU-4878&lt;/del&gt;&lt;/a&gt; osd-ldiskfs: don&apos;t assert on possible upgrade (backport of &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3915&quot; title=&quot;After upgrade from 2.4.0 to 2.5, can not mount OST, (osd_handler.c:2668:osd_object_ref_del()) LBUG Pid: 9537, comm: mount.lustre&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3915&quot;&gt;&lt;del&gt;LU-3915&lt;/del&gt;&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;The OSS is able to start without any problem, and the filesystem is operational.&lt;/p&gt;

&lt;p&gt;I am now waiting for these patches to be fully approved and Maloo tested so they can be delivered to the customer.&lt;/p&gt;</comment>
                            <comment id="81711" author="bfaccini" created="Wed, 16 Apr 2014 08:09:54 +0000"  >&lt;p&gt;Hello Gregoire,&lt;br/&gt;
I am not sure that my patch #9958 will finally be fully accepted and landed in b2_4 ... The main reason is that #5049 is itself still not in b2_4, and maybe never will be, in which case #9958 would be unnecessary, as Mike rightly commented in the patch.&lt;br/&gt;
This points to a limit in the process by which people decide to add more patches on top of releases we have tested for regressions and interoperability...&lt;/p&gt;</comment>
                            <comment id="81712" author="pichong" created="Wed, 16 Apr 2014 08:25:25 +0000"  >&lt;p&gt;Hello Bruno,&lt;/p&gt;

&lt;p&gt;Actually, patch #5049 &quot;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2059&quot; title=&quot;mgc to backup configuration on osd-based llogs&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2059&quot;&gt;&lt;del&gt;LU-2059&lt;/del&gt;&lt;/a&gt; llog: MGC to use OSD API for backup logs&quot; has been integrated by Bull on top of release 2.4.x because the customer hit the LBUG ASSERTION(cli-&amp;gt;cl_mgc_configs_dir) described in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2959&quot; title=&quot;ASSERTION( cli-&amp;gt;cl_mgc_configs_dir ) for 200 osts x 2 oss&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2959&quot;&gt;&lt;del&gt;LU-2959&lt;/del&gt;&lt;/a&gt;. As mentioned in that ticket, the LBUG is fixed by patch #5049.&lt;/p&gt;

&lt;p&gt;These problems occurred in the Lustre 2.4.x release and need to be addressed.&lt;/p&gt;</comment>
                            <comment id="81714" author="bfaccini" created="Wed, 16 Apr 2014 08:35:15 +0000"  >&lt;p&gt;Gregoire, don&apos;t misunderstand me: I did not mean that you added patches without good reason, only that by doing so you fall outside our regression/interop testing process.&lt;br/&gt;
Concerning the fact that you added #5049 due to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2959&quot; title=&quot;ASSERTION( cli-&amp;gt;cl_mgc_configs_dir ) for 200 osts x 2 oss&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2959&quot;&gt;&lt;del&gt;LU-2959&lt;/del&gt;&lt;/a&gt;, that may help #5049 and #9958 finally land ...&lt;/p&gt;</comment>
                            <comment id="83209" author="bfaccini" created="Mon, 5 May 2014 16:58:34 +0000"  >&lt;p&gt;Hello Gregoire,&lt;br/&gt;
Since patch #9958 is planned for 2.4 integration, do you agree to close/resolve this issue as fixed?&lt;/p&gt;</comment>
                            <comment id="83266" author="pichong" created="Tue, 6 May 2014 06:48:19 +0000"  >&lt;p&gt;Hi Bruno,&lt;/p&gt;

&lt;p&gt;Yes, this ticket can be closed, since our tests have shown the issue is fixed by patches #9929 and #9958.&lt;br/&gt;
I hope both patches are planned for integration in 2.4 if a new version is released.&lt;/p&gt;</comment>
                            <comment id="84172" author="pjones" created="Thu, 15 May 2014 14:03:10 +0000"  >&lt;p&gt;Yes, this would be under consideration for 2.4.4.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="18285">LU-3126</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="20872">LU-3915</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzwjqn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>13490</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>