<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:58:02 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-13061] osd_fid_lookup()) ASSERTION( fid_is_sane(fid) || fid_is_idif(fid) ) failed: [0x0:0x68:0x0]</title>
                <link>https://jira.whamcloud.com/browse/LU-13061</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;If a system has hit &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12593&quot; title=&quot;update_log corruption&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12593&quot;&gt;&lt;del&gt;LU-12593&lt;/del&gt;&lt;/a&gt; with a corrupt block in the llog file, it may trigger an &lt;tt&gt;LASSERT()&lt;/tt&gt; because of a bad FID found in the uninitialized part of the block.  Applying the patch from that ticket after the corruption has happened is too late to fix the problem.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;LustreError: 17438:0:(osd_handler.c:1077:osd_fid_lookup()) ASSERTION( fid_is_sane(fid) || fid_is_idif(fid) ) failed: [0x0:0x68:0x0]
LustreError: 17438:0:(osd_handler.c:1077:osd_fid_lookup()) LBUG
Pid: 17438, comm: llog_process_th 3.10.0-1062.1.1.el7_lustre.x86_64
Call Trace:
 libcfs_call_trace+0x8c/0xc0 [libcfs]
 lbug_with_loc+0x4c/0xa0 [libcfs]
 osd_fid_lookup+0xc8/0x1c60 [osd_ldiskfs]
 osd_object_init+0x61/0x110 [osd_ldiskfs]
 lu_object_start.isra.35+0x8b/0x120 [obdclass]  
 lu_object_find_at+0x1e1/0xa60 [obdclass]
 dt_locate_at+0x1d/0xb0 [obdclass]
 llog_osd_open+0x50e/0xf30 [obdclass]
 llog_open+0x15a/0x3e0 [obdclass]
 osp_sync_init+0x44a/0xe20 [osp]
 osp_init0.isra.19+0x1aed/0x1f60 [osp]
 osp_device_alloc+0x86/0x130 [osp]
 obd_setup+0x119/0x280 [obdclass]
 class_setup+0x2a8/0x840 [obdclass]
 class_process_config+0x1726/0x2830 [obdclass]
 class_config_llog_handler+0x819/0x1520 [obdclass]
 llog_process_thread+0x82f/0x18e0 [obdclass]
 llog_process_thread_daemonize+0x9f/0xe0 [obdclass]
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Lustre should avoid ever triggering an LASSERT() on data read from disk or from the network.  In this case, it probably makes sense to add a check in &lt;tt&gt;llog_osd_open()&lt;/tt&gt; with &lt;tt&gt;fid_is_sane()&lt;/tt&gt; before it uses the FID, and just return an error rather than crashing.&lt;/p&gt;</description>
                <environment></environment>
        <key id="57594">LU-13061</key>
            <summary>osd_fid_lookup()) ASSERTION( fid_is_sane(fid) || fid_is_idif(fid) ) failed: [0x0:0x68:0x0]</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="hongchao.zhang">Hongchao Zhang</assignee>
                                    <reporter username="adilger">Andreas Dilger</reporter>
                        <labels>
                    </labels>
                <created>Tue, 10 Dec 2019 23:40:35 +0000</created>
                <updated>Wed, 22 Jan 2020 12:56:14 +0000</updated>
                            <resolved>Fri, 10 Jan 2020 13:07:03 +0000</resolved>
                                    <version>Lustre 2.12.3</version>
                                    <fixVersion>Lustre 2.14.0</fixVersion>
                    <fixVersion>Lustre 2.12.4</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>4</watches>
                                                                            <comments>
                            <comment id="259554" author="adilger" created="Tue, 10 Dec 2019 23:42:04 +0000"  >&lt;p&gt;HongChao, can you please make a patch for this, preferably with a test case.  You can add a &lt;tt&gt;CFS_FAIL_LOC()&lt;/tt&gt; check to set &lt;tt&gt;fid_seq=0&lt;/tt&gt; or similar to trigger the &lt;tt&gt;LASSERT()&lt;/tt&gt;, then verify your patch fixes it.&lt;/p&gt;</comment>
                            <comment id="259568" author="adilger" created="Wed, 11 Dec 2019 03:59:05 +0000"  >&lt;p&gt;Based on further analysis of the problem, it seems we can&apos;t call &lt;tt&gt;fid_is_sane()&lt;/tt&gt; directly in &lt;tt&gt;osp_sync_llog_init()&lt;/tt&gt; since the logid has not yet been converted to a FID there.  If &lt;tt&gt;llog_osd_open()&lt;/tt&gt; instead checks &lt;tt&gt;fid_is_sane()&lt;/tt&gt; and returns &lt;tt&gt;-ENOENT&lt;/tt&gt; when the FID isn&apos;t sane:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
               &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (logid != NULL) {
                        logid_to_fid(logid, &amp;amp;lgi-&amp;gt;lgi_fid);
                        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (!fid_is_sane(&amp;amp;lgi-&amp;gt;lgi_fid)) {
                                CERROR(&lt;span class=&quot;code-quote&quot;&gt;&quot;%s: bad FID &quot;&lt;/span&gt;DFID&lt;span class=&quot;code-quote&quot;&gt;&quot;: rc = %d\n&quot;&lt;/span&gt;,
                                       ctxt-&amp;gt;loc_exp-&amp;gt;exp_obd-&amp;gt;obd_name,
                                       PFID(&amp;amp;lgi-&amp;gt;lgi_fid), -ENOENT);
                                RETURN(-ENOENT);
                        }
                } &lt;span class=&quot;code-keyword&quot;&gt;else&lt;/span&gt; {
                        /* If logid == NULL, then it means the caller needs
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;then this return code is checked in &lt;tt&gt;osp_sync_llog_init()&lt;/tt&gt; and the llog will be recreated:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
                rc = llog_open(env, ctxt, &amp;amp;lgh, &amp;amp;osi-&amp;gt;osi_cid.lci_logid, NULL,
                               LLOG_OPEN_EXISTS);
                &lt;span class=&quot;code-comment&quot;&gt;/* re-create llog &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; it is missing */&lt;/span&gt;
                &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (rc == -ENOENT)
                        logid_set_id(&amp;amp;osi-&amp;gt;osi_cid.lci_logid, 0);
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
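<!--
The error-return-and-recreate flow proposed above can be sketched in plain
userspace C. This is only an illustration under stated assumptions: the
sketch_ names, the reduced sanity check (it rejects only a zero sequence,
unlike the real fid_is_sane()), and the return convention of
sketch_sync_llog_init() are hypothetical and are not the Lustre
implementation or the landed patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical catalog slot holding the log ID as read from disk. */
struct sketch_catid {
	uint64_t oi_id;
	uint64_t oi_seq;
};

/* Reduced check (assumption): reject only the zero sequence seen here. */
bool sketch_logid_is_sane(const struct sketch_catid *cid)
{
	return cid->oi_seq != 0;
}

/* Open an existing llog; a corrupt on-disk ID yields -ENOENT, not a crash. */
int sketch_llog_open_existing(const struct sketch_catid *cid)
{
	if (!sketch_logid_is_sane(cid))
		return -ENOENT;
	return 0;
}

/*
 * Caller side: treat -ENOENT as "llog missing" and clear the stale ID so
 * that a new log can be allocated.
 * Returns 1 if the ID was reset, 0 if the llog opened, negative on error.
 */
int sketch_sync_llog_init(struct sketch_catid *cid)
{
	int rc = sketch_llog_open_existing(cid);

	if (rc == -ENOENT) {
		cid->oi_id = 0;
		return 1;
	}
	return rc;
}
```

Calling sketch_sync_llog_init() on a record with oi_seq == 0 takes the
-ENOENT branch and clears oi_id, mirroring the logid_set_id(..., 0) call
quoted above, instead of reaching an assertion deeper in the stack.
-->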
                            <comment id="259569" author="hongchao.zhang" created="Wed, 11 Dec 2019 04:04:36 +0000"  >&lt;p&gt;Hi Andreas, &lt;br/&gt;
How about calling &quot;logid_to_fid&quot; in osp_sync_llog_init after getting the logid from the &quot;CATALOGS&quot; file, and then checking whether it is sane?&lt;/p&gt;</comment>
                            <comment id="259570" author="adilger" created="Wed, 11 Dec 2019 04:07:21 +0000"  >&lt;p&gt;Sure, that would also work, and is also much more direct.&lt;/p&gt;</comment>
                            <comment id="259573" author="adilger" created="Wed, 11 Dec 2019 04:30:48 +0000"  >&lt;p&gt;There is a bit of a question in my mind whether &lt;tt&gt;logid&lt;/tt&gt; may &lt;em&gt;not&lt;/em&gt; be a FID in some cases (e.g. upgrade from some very old system), but I don&apos;t &lt;em&gt;think&lt;/em&gt; that is true anymore.  This compatibility was needed for old filesystems when they were upgraded from pre-2.4(?) Lustre, but all of these old systems should be gone at this point, or will never be upgraded.&lt;/p&gt;

&lt;p&gt;After fixing the current LASSERT issue, it might make sense to look at the use of &lt;tt&gt;llog_logid&lt;/tt&gt; in the code and update it to use &lt;tt&gt;lu_fid&lt;/tt&gt; wherever possible, by converting the &lt;tt&gt;llog_logid&lt;/tt&gt; into &lt;tt&gt;lu_fid&lt;/tt&gt; immediately and only passing the FID around.  The next step would be to add a new &lt;tt&gt;LLOG_LOGFID_MAGIC&lt;/tt&gt; that only stores the &lt;tt&gt;lu_fid&lt;/tt&gt; to the catalog, start using it after some delay (with &lt;tt&gt;OBD_VERSION_CHECK()&lt;/tt&gt; and/or backport code to LTS to read &lt;tt&gt;LLOG_LOGFID_MAGIC&lt;/tt&gt; records and convert into &lt;tt&gt;llog_logid&lt;/tt&gt; to minimize code change but still allow upgrade/downgrade to work) and then eventually we can remove &lt;tt&gt;llog_logid&lt;/tt&gt;.&lt;/p&gt;</comment>
                            <comment id="259576" author="hongchao.zhang" created="Wed, 11 Dec 2019 04:56:03 +0000"  >&lt;p&gt;As per the dump of the &quot;CATALOGS&quot; file, the content of the &quot;llog_catid&quot; is:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;000020 0000000400000068 0000000000000000
000030 0000000000000000 0000000000000000
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The OID field is &quot;0x400000068&quot;, but it is truncated to &quot;0x68&quot; when it is assigned to lu_fid.f_oid (32 bits).&lt;br/&gt;
The data seems to be a normal FID (f_seq=0x400000068, f_oid=0, f_ver=0).&lt;/p&gt;</comment>
                            <comment id="259577" author="adilger" created="Wed, 11 Dec 2019 05:09:30 +0000"  >&lt;blockquote&gt;
&lt;p&gt;The OID field is &quot;0x400000068&quot;, but it is truncated to &quot;0x68&quot; when it is assigned to lu_fid.f_oid (32 bits).&lt;br/&gt;
The data seems to be a normal FID (f_seq=0x400000068, f_oid=0, f_ver=0).&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;No, the &lt;tt&gt;0x400000068&lt;/tt&gt; field is the &lt;tt&gt;oi_id&lt;/tt&gt; field (mapped to &lt;tt&gt;f_oid&lt;/tt&gt;), and the &lt;tt&gt;f_seq&lt;/tt&gt; field is &lt;tt&gt;0x0&lt;/tt&gt;, which is what triggers the LASSERT.  Valid records look like the following with &lt;tt&gt;oi_seq = f_seq = 0x1 = FID_SEQ_LLOG&lt;/tt&gt;:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;000080 00000000000004ef 0000000000000001
000090 0000000000000000 0000000000000000
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;In this case, the FID is &lt;tt&gt;[0x1:0x4ef:0x0]&lt;/tt&gt; and would map to object &lt;tt&gt;O/1/d15/1263&lt;/tt&gt; on the MDT.&lt;/p&gt;</comment>
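<!--
The mapping described above (oi_id to f_oid, oi_seq to f_seq, then FID to
object path) can be sketched as follows. Assumptions: the struct layouts and
sketch_ names are hypothetical simplifications, the sanity check only
rejects a zero sequence or a non-zero version rather than reproducing
fid_is_sane(), and the d15 directory is assumed to come from the object ID
modulo 32, which matches the example but is not stated in the ticket.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified log ID: the two 64-bit words shown in the CATALOGS dump. */
struct sketch_logid {
	uint64_t oi_id;   /* first word of the record */
	uint64_t oi_seq;  /* second word of the record */
};

/* Simplified FID with a 64-bit sequence and 32-bit object ID. */
struct sketch_fid {
	uint64_t f_seq;
	uint32_t f_oid;
	uint32_t f_ver;
};

/* Copy the log ID into a FID; oi_id is narrowed to the 32-bit f_oid. */
struct sketch_fid sketch_logid_to_fid(struct sketch_logid id)
{
	struct sketch_fid fid = {
		.f_seq = id.oi_seq,
		.f_oid = (uint32_t)id.oi_id,
		.f_ver = 0,
	};
	return fid;
}

/* Reduced check (assumption): a zero sequence is never valid here. */
bool sketch_fid_is_sane(struct sketch_fid fid)
{
	return fid.f_seq != 0 && fid.f_ver == 0;
}

/* Format the object path, assuming 32 hashed subdirectories per sequence. */
int sketch_llog_path(struct sketch_fid fid, char *buf, size_t len)
{
	return snprintf(buf, len, "O/%llu/d%u/%u",
			(unsigned long long)fid.f_seq,
			fid.f_oid % 32, fid.f_oid);
}
```

With the corrupt record (oi_id = 0x400000068, oi_seq = 0) the converted FID
is [0x0:0x68:0x0] and fails the check; with the valid record (oi_id = 0x4ef,
oi_seq = 0x1) it passes and the path formats as O/1/d15/1263.
-->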
                            <comment id="259687" author="gerrit" created="Thu, 12 Dec 2019 13:51:35 +0000"  >&lt;p&gt;Hongchao Zhang (hongchao@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/36998&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/36998&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-13061&quot; title=&quot;osd_fid_lookup()) ASSERTION( fid_is_sane(fid) || fid_is_idif(fid) ) failed: [0x0:0x68:0x0]&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-13061&quot;&gt;&lt;del&gt;LU-13061&lt;/del&gt;&lt;/a&gt; osp: check catlog FID after reading in&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 7a5df8c9c0fa2cab2135e68e305a9a850263d720&lt;/p&gt;</comment>
                            <comment id="260967" author="gerrit" created="Fri, 10 Jan 2020 07:40:25 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/36998/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/36998/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-13061&quot; title=&quot;osd_fid_lookup()) ASSERTION( fid_is_sane(fid) || fid_is_idif(fid) ) failed: [0x0:0x68:0x0]&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-13061&quot;&gt;&lt;del&gt;LU-13061&lt;/del&gt;&lt;/a&gt; osp: check catlog FID after reading in&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 4597fa7d884de0f1a1b030052d4d34983fed6109&lt;/p&gt;</comment>
                            <comment id="261009" author="pjones" created="Fri, 10 Jan 2020 13:07:03 +0000"  >&lt;p&gt;Landed for 2.14&lt;/p&gt;</comment>
                            <comment id="261031" author="gerrit" created="Fri, 10 Jan 2020 14:57:41 +0000"  >&lt;p&gt;Minh Diep (mdiep@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/37185&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/37185&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-13061&quot; title=&quot;osd_fid_lookup()) ASSERTION( fid_is_sane(fid) || fid_is_idif(fid) ) failed: [0x0:0x68:0x0]&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-13061&quot;&gt;&lt;del&gt;LU-13061&lt;/del&gt;&lt;/a&gt; osp: check catlog FID after reading in&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 0d2e23757ef6663f8b06b935253835937d8ca1f3&lt;/p&gt;</comment>
                            <comment id="261603" author="gerrit" created="Wed, 22 Jan 2020 02:25:41 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/37185/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/37185/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-13061&quot; title=&quot;osd_fid_lookup()) ASSERTION( fid_is_sane(fid) || fid_is_idif(fid) ) failed: [0x0:0x68:0x0]&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-13061&quot;&gt;&lt;del&gt;LU-13061&lt;/del&gt;&lt;/a&gt; osp: check catlog FID after reading in&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 055eab6bd4c29bc961a10824ffa44323cce7640c&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                                        </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="25650">LU-5369</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="56498">LU-12593</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i00qr3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>