<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:46:13 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92">
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-4829] LBUG: ASSERTION( !fid_is_idif(fid) )</title>
                <link>https://jira.whamcloud.com/browse/LU-4829</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;We have our TDS system set up in wide-stripe mode.  Each OSS is mounting over 100 OSTs.  On mount the other day, we hit an assertion when scrub started.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[12319.230157] LustreError: 54554:0:(osd_internal.h:752:osd_fid2oi()) ASSERTION( !fid_is_idif(fid) ) failed: [0x100000000:0x1:0x0]
[12319.242502] LustreError: 54554:0:(osd_internal.h:752:osd_fid2oi()) LBUG
[12319.249538] Pid: 54554, comm: OI_scrub
[12319.253707] 
[12319.253707] Call Trace:
[12319.258529]  [&amp;lt;ffffffffa03dd895&amp;gt;] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[12319.265837]  [&amp;lt;ffffffffa03dde97&amp;gt;] lbug_with_loc+0x47/0xb0 [libcfs]
[12319.272395]  [&amp;lt;ffffffffa0be40f5&amp;gt;] __osd_oi_lookup+0x3a5/0x3b0 [osd_ldiskfs]
[12319.279770]  [&amp;lt;ffffffff8119dfcd&amp;gt;] ? generic_drop_inode+0x1d/0x80
[12319.286133]  [&amp;lt;ffffffffa0be4174&amp;gt;] osd_oi_lookup+0x74/0x140 [osd_ldiskfs]
[12319.293197]  [&amp;lt;ffffffffa0bf8fbf&amp;gt;] osd_scrub_exec+0x1af/0xf30 [osd_ldiskfs]
[12319.300553]  [&amp;lt;ffffffffa0bfa5f2&amp;gt;] ? osd_scrub_next+0x142/0x4b0 [osd_ldiskfs]
[12319.308061]  [&amp;lt;ffffffffa0b71432&amp;gt;] ? ldiskfs_read_inode_bitmap+0x172/0x2c0 [ldiskfs]
[12319.316454]  [&amp;lt;ffffffffa0bf4d4f&amp;gt;] osd_inode_iteration+0x1cf/0x570 [osd_ldiskfs]
[12319.324461]  [&amp;lt;ffffffff810516b9&amp;gt;] ? __wake_up_common+0x59/0x90
[12319.330764]  [&amp;lt;ffffffffa0bf8e10&amp;gt;] ? osd_scrub_exec+0x0/0xf30 [osd_ldiskfs]
[12319.337941]  [&amp;lt;ffffffffa0bfa4b0&amp;gt;] ? osd_scrub_next+0x0/0x4b0 [osd_ldiskfs]
[12319.345300]  [&amp;lt;ffffffffa0bf732a&amp;gt;] osd_scrub_main+0x59a/0xd00 [osd_ldiskfs]
[12319.352591]  [&amp;lt;ffffffff810097cc&amp;gt;] ? __switch_to+0x1ac/0x320
[12319.358585]  [&amp;lt;ffffffffa0bf6d90&amp;gt;] ? osd_scrub_main+0x0/0xd00 [osd_ldiskfs]
[12319.365881]  [&amp;lt;ffffffff8100c0ca&amp;gt;] child_rip+0xa/0x20
[12319.371174]  [&amp;lt;ffffffffa0bf6d90&amp;gt;] ? osd_scrub_main+0x0/0xd00 [osd_ldiskfs]
[12319.378509]  [&amp;lt;ffffffffa0bf6d90&amp;gt;] ? osd_scrub_main+0x0/0xd00 [osd_ldiskfs]
[12319.385799]  [&amp;lt;ffffffff8100c0c0&amp;gt;] ? child_rip+0x0/0x20
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We had panic_on_lbug off, so we don&apos;t have a crash dump.  But the system is still running, so if there&apos;s anything useful, we can try to grab it.  I tried to cat /proc/fs/lustre/osd-ldiskfs/atlastds-OST00f3/oi_scrub, but it just hangs.  That &apos;cat&apos; process is stuck on the following:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# cat /proc/83715/stack
[&amp;lt;ffffffff81281f34&amp;gt;] call_rwsem_down_read_failed+0x14/0x30
[&amp;lt;ffffffffa0bf630d&amp;gt;] osd_scrub_dump+0x3d/0x320 [osd_ldiskfs]
[&amp;lt;ffffffffa0be6055&amp;gt;] lprocfs_osd_rd_oi_scrub+0x75/0xb0 [osd_ldiskfs]
[&amp;lt;ffffffffa054f563&amp;gt;] lprocfs_fops_read+0xf3/0x1f0 [obdclass]
[&amp;lt;ffffffff811e9fee&amp;gt;] proc_reg_read+0x7e/0xc0
[&amp;lt;ffffffff81181f05&amp;gt;] vfs_read+0xb5/0x1a0
[&amp;lt;ffffffff81182041&amp;gt;] sys_read+0x51/0x90
[&amp;lt;ffffffff8100b072&amp;gt;] system_call_fastpath+0x16/0x1b
[&amp;lt;ffffffffffffffff&amp;gt;] 0xffffffffffffffff
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The FID it&apos;s complaining about &lt;span class=&quot;error&quot;&gt;&amp;#91;0x100000000:0x1:0x0&amp;#93;&lt;/span&gt; looks suspect.  The sequence is FID_SEQ_IDIF and the ObjID is 1.  I know on ext4 inode 1 stores the bad-blocks information, but I don&apos;t think that&apos;s what we&apos;re seeing here.&lt;/p&gt;

&lt;p&gt;We haven&apos;t yet tried to re-mount to see if the issue is persistent, since there may be something on the running system that you want us to provide.  But we can do that if it&apos;s helpful.&lt;/p&gt;</description>
                <environment></environment>
        <key id="23932">LU-4829</key>
            <summary>LBUG: ASSERTION( !fid_is_idif(fid) )</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="yong.fan">nasf</assignee>
                                    <reporter username="ezell">Matt Ezell</reporter>
                        <labels>
                            <label>mn4</label>
                    </labels>
                <created>Fri, 28 Mar 2014 14:38:58 +0000</created>
                <updated>Sun, 10 Aug 2014 12:32:07 +0000</updated>
                            <resolved>Thu, 7 Aug 2014 14:22:15 +0000</resolved>
                                    <version>Lustre 2.4.3</version>
                                    <fixVersion>Lustre 2.5.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>7</watches>
                                                                            <comments>
                            <comment id="80497" author="green" created="Fri, 28 Mar 2014 17:55:58 +0000"  >&lt;p&gt;Is this a 2.4 formatted filesystem or was it created in the past with some other version and then upgraded?&lt;/p&gt;</comment>
                            <comment id="80512" author="ezell" created="Fri, 28 Mar 2014 18:51:50 +0000"  >&lt;p&gt;Sorry, I should have included that in the original report.  It was formatted with 2.4, so there shouldn&apos;t be any IGIF/IDIF files.&lt;/p&gt;</comment>
                            <comment id="80513" author="pjones" created="Fri, 28 Mar 2014 18:52:09 +0000"  >&lt;p&gt;Fan Yong&lt;/p&gt;

&lt;p&gt;Could you please advise on this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="80579" author="yong.fan" created="Mon, 31 Mar 2014 02:45:16 +0000"  >&lt;p&gt;To enable LMA on OST-objects for lustre-2.4.3, we need to back-port the patch &lt;a href=&quot;http://review.whamcloud.com/#/c/6669/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/6669/&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="80616" author="pjones" created="Mon, 31 Mar 2014 15:24:14 +0000"  >&lt;p&gt;Matt&lt;/p&gt;

&lt;p&gt;This fix is included in all 2.5.x releases. It would be possible to back-port it to 2.4.x, but there would be quite a few dependencies to pick up, so how we proceed will depend on the timeline for your move to 2.5.x.&lt;/p&gt;

&lt;p&gt;Regards&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="80650" author="ezell" created="Mon, 31 Mar 2014 18:59:46 +0000"  >&lt;p&gt;We will need to take several test shots between now and putting 2.5 into production, but I think it&apos;s reasonable for us to target that (and avoid the work of backporting).  We haven&apos;t seen this in production yet.&lt;/p&gt;

&lt;p&gt;I guess my only question is:&lt;br/&gt;
If we hit this in production, would a reboot and re-mount hit the problem again? Or is it intermittent?&lt;/p&gt;</comment>
                            <comment id="80668" author="yong.fan" created="Tue, 1 Apr 2014 02:50:37 +0000"  >&lt;p&gt;A reboot or re-mount alone, without the 6669 patch applied, can NOT resolve the issue; even though it may work for a while, you will still hit it again some time later.&lt;/p&gt;</comment>
                            <comment id="81437" author="yong.fan" created="Fri, 11 Apr 2014 16:37:26 +0000"  >&lt;p&gt;Matt, how is this progressing? What would you like us to do as a next step?&lt;/p&gt;</comment>
                            <comment id="82241" author="laisiyao" created="Wed, 23 Apr 2014 07:23:29 +0000"  >&lt;p&gt;The back-port patch for 2.4.x is at &lt;a href=&quot;http://review.whamcloud.com/#/c/10061/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/10061/&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="91024" author="jamesanunez" created="Wed, 6 Aug 2014 22:25:43 +0000"  >&lt;p&gt;Matt, &lt;/p&gt;

&lt;p&gt;Since the fix is in b2_5 and later releases and there is a patch for b2_4, should we close this ticket or is there something else you need from us?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;</comment>
                            <comment id="91033" author="ezell" created="Thu, 7 Aug 2014 02:26:13 +0000"  >&lt;p&gt;Yes, we can close it.  Thanks.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="19004">LU-3335</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                    <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzwilb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>13287</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>