<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:02:20 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-6683] OSS crash when starting lfsck layout check</title>
                <link>https://jira.whamcloud.com/browse/LU-6683</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;When starting the lfsck layout check on our test file system, the OSS servers immediately crash with something like the following on the console (or in vmcore-dmesg.txt). I also discovered that I can&apos;t stop the lfsck (lctl lfsck_stop just hangs) at this stage (after recovering the OSTs). When failing over the MDT in this state, the lfsck is restarted as soon as the MDT is mounted on the other MDS, crashing the OSS nodes again. The output below was collected after the crash triggered by mounting the MDT during failover.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;------------[ cut here ]------------
kernel BUG at fs/jbd2/transaction.c:1030!
Lustre: play01-OST0001: deleting orphan objects from 0x0:51613818 to 0x0:5161388
Lustre: play01-OST0003: deleting orphan objects from 0x0:77539134 to 0x0:7753920
Lustre: play01-OST0005: deleting orphan objects from 0x0:44598982 to 0x0:4459905
invalid opcode: 0000 [#1] SMP 
last sysfs file: /sys/devices/pci0000:00/0000:00:07.0/0000:0c:00.0/host7/target7
CPU 2 
Modules linked in: osp(U) ofd(U) lfsck(U) ipmi_si ost(U) mgc(U) osd_ldiskfs(U) a

Pid: 25013, comm: lfsck Not tainted 2.6.32-504.8.1.el6_lustre.x86_64 #1 Dell Inc
RIP: 0010:[&amp;lt;ffffffffa039179d&amp;gt;]  [&amp;lt;ffffffffa039179d&amp;gt;] jbd2_journal_dirty_metadata
RSP: 0018:ffff8801fa26da00  EFLAGS: 00010246
RAX: ffff88043b4aa680 RBX: ffff880202e1f498 RCX: ffff880226a866e0
RDX: 0000000000000000 RSI: ffff880226a866e0 RDI: 0000000000000000
RBP: ffff8801fa26da20 R08: ffff880226a866e0 R09: 0000000000000018
R10: 0000000000480403 R11: 0000000000000001 R12: ffff880202e386d8
R13: ffff880226a866e0 R14: ffff880239208800 R15: 0000000000000000
FS:  00007fdff3fff700(0000) GS:ffff880028240000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00007feb2ce760a0 CR3: 000000043b4d1000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process lfsck (pid: 25013, threadinfo ffff8801fa26c000, task ffff8801f78b3540)
Stack:
 ffff880202e1f498 ffffffffa0fd7710 ffff880226a866e0 0000000000000000
&amp;lt;d&amp;gt; ffff8801fa26da60 ffffffffa0f9600b ffff8801fa26daa0 ffffffffa0fd2af3
&amp;lt;d&amp;gt; ffff8802159f3000 ffff8803f12396e0 ffff8803f1239610 ffff8801fa26db28
Call Trace:
 [&amp;lt;ffffffffa0f9600b&amp;gt;] __ldiskfs_handle_dirty_metadata+0x7b/0x100 [ldiskfs]
 [&amp;lt;ffffffffa0fd2af3&amp;gt;] ? ldiskfs_xattr_set_entry+0x4e3/0x4f0 [ldiskfs]
 [&amp;lt;ffffffffa0fa1d9a&amp;gt;] ldiskfs_mark_iloc_dirty+0x52a/0x630 [ldiskfs]
 [&amp;lt;ffffffffa0fd4abc&amp;gt;] ldiskfs_xattr_set_handle+0x33c/0x560 [ldiskfs]
 [&amp;lt;ffffffffa0fd4ddc&amp;gt;] ldiskfs_xattr_set+0xfc/0x1a0 [ldiskfs]
 [&amp;lt;ffffffffa0fd500e&amp;gt;] ldiskfs_xattr_trusted_set+0x2e/0x30 [ldiskfs]
 [&amp;lt;ffffffff811b4722&amp;gt;] generic_setxattr+0xa2/0xb0
 [&amp;lt;ffffffffa0d4690d&amp;gt;] __osd_xattr_set+0x8d/0xe0 [osd_ldiskfs]
 [&amp;lt;ffffffffa0d4e005&amp;gt;] osd_xattr_set+0x3a5/0x4b0 [osd_ldiskfs]
 [&amp;lt;ffffffffa0a3f446&amp;gt;] lfsck_master_oit_engine+0x14c6/0x1ef0 [lfsck]
 [&amp;lt;ffffffffa0a4094e&amp;gt;] lfsck_master_engine+0xade/0x13e0 [lfsck]
 [&amp;lt;ffffffff81064b90&amp;gt;] ? default_wake_function+0x0/0x20
 [&amp;lt;ffffffffa0a3fe70&amp;gt;] ? lfsck_master_engine+0x0/0x13e0 [lfsck]
 [&amp;lt;ffffffff8109e66e&amp;gt;] kthread+0x9e/0xc0
 [&amp;lt;ffffffff8100c20a&amp;gt;] child_rip+0xa/0x20
 [&amp;lt;ffffffff8109e5d0&amp;gt;] ? kthread+0x0/0xc0
 [&amp;lt;ffffffff8100c200&amp;gt;] ? child_rip+0x0/0x20
Code: c6 9c 03 00 00 4c 89 f7 e8 91 bf 19 e1 48 8b 33 ba 01 00 00 00 4c 89 e7 e 
RIP  [&amp;lt;ffffffffa039179d&amp;gt;] jbd2_journal_dirty_metadata+0x10d/0x150 [jbd2]
 RSP &amp;lt;ffff8801fa26da00&amp;gt;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We&apos;ve got a vmcore file on one of the servers, which we can upload if this is required.&lt;/p&gt;

&lt;p&gt;After failing over the MDT and recovering the OSTs, I can stop the lfsck layout check.&lt;/p&gt;</description>
                <environment>File system with 1 MDT, 6 OSTs, 2 OSSes; installed as 1.6, upgraded to 1.8, then 2.5, now 2.7</environment>
        <key id="30486">LU-6683</key>
            <summary>OSS crash when starting lfsck layout check</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="yong.fan">nasf</assignee>
                                    <reporter username="ferner">Frederik Ferner</reporter>
                        <labels>
                    </labels>
                <created>Wed, 3 Jun 2015 16:34:59 +0000</created>
                <updated>Mon, 9 Nov 2015 18:36:05 +0000</updated>
                            <resolved>Tue, 21 Jul 2015 16:23:21 +0000</resolved>
                                    <version>Lustre 2.7.0</version>
                                    <fixVersion>Lustre 2.8.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="117332" author="pjones" created="Wed, 3 Jun 2015 18:54:11 +0000"  >&lt;p&gt;Fan Yong&lt;/p&gt;

&lt;p&gt;Could you please advise on this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="117389" author="yong.fan" created="Thu, 4 Jun 2015 04:59:01 +0000"  >&lt;p&gt;The reason is that osd_declare_xattr_set() does not reserve enough journal credits for the subsequent osd_xattr_set() that LFSCK triggers to upgrade the object&apos;s FID-in-LMA.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;static int osd_declare_xattr_set(const struct lu_env *env,
                                 struct dt_object *dt,
                                 const struct lu_buf *buf, const char *name,
                                 int fl, struct thandle *handle)
{
...
        /* optimistic optimization: LMA is set first and usually fit inode */
        if (strcmp(name, XATTR_NAME_LMA) == 0) {
                if (dt_object_exists(dt))
                        credits = 0;
                else
                        credits = 1;
        } else if (strcmp(name, XATTR_NAME_VERSION) == 0) {
                credits = 1;
...
}
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The optimisation above does not consider the upgrade case, in which the object already exists but credits are still needed to set its LMA, and should be improved.&lt;/p&gt;</comment>
                            <comment id="117402" author="gerrit" created="Thu, 4 Jun 2015 06:12:11 +0000"  >&lt;p&gt;Fan Yong (fan.yong@intel.com) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/15132&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/15132&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6683&quot; title=&quot;OSS crash when starting lfsck layout check&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6683&quot;&gt;&lt;del&gt;LU-6683&lt;/del&gt;&lt;/a&gt; osd: declare enough credits for generating LMA&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_7&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: b22c236872dbd5e585ff1b3ff7cf08e00967f6b6&lt;/p&gt;</comment>
                            <comment id="117410" author="yong.fan" created="Thu, 4 Jun 2015 08:25:10 +0000"  >&lt;p&gt;Frederik, would you please try the patch above? Thanks!&lt;/p&gt;</comment>
                            <comment id="117415" author="ferner" created="Thu, 4 Jun 2015 09:22:23 +0000"  >&lt;p&gt;I noticed on the review page that the builds are marked as failed, but this seems to affect RHEL7 only. I&apos;ll certainly try the patch ASAP.&lt;/p&gt;</comment>
                            <comment id="117416" author="yong.fan" created="Thu, 4 Jun 2015 09:26:33 +0000"  >&lt;p&gt;The failure is related to the build system, not the patch. So please go ahead with the patch. Thanks!&lt;/p&gt;</comment>
                            <comment id="117427" author="ferner" created="Thu, 4 Jun 2015 12:19:50 +0000"  >&lt;p&gt;Thanks for confirming regarding the build failure.&lt;/p&gt;

&lt;p&gt;I have now updated our test file system to include the patch and can confirm that this fixed the crash for us.&lt;/p&gt;</comment>
                            <comment id="117433" author="yong.fan" created="Thu, 4 Jun 2015 13:33:01 +0000"  >&lt;p&gt;Thanks, Frederik, for the update. The patch for b2_7 has been replaced by the patch for b2_7_fe: &lt;a href=&quot;http://review.whamcloud.com/#/c/15133/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/15133/&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="119161" author="adilger" created="Fri, 19 Jun 2015 22:58:13 +0000"  >&lt;p&gt;Is this patch needed for master?&lt;/p&gt;</comment>
                            <comment id="119162" author="yong.fan" created="Fri, 19 Jun 2015 23:12:55 +0000"  >&lt;p&gt;Yes, master needs the patch also.&lt;/p&gt;</comment>
                            <comment id="119163" author="gerrit" created="Fri, 19 Jun 2015 23:22:09 +0000"  >&lt;p&gt;Fan Yong (fan.yong@intel.com) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/15361&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/15361&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6683&quot; title=&quot;OSS crash when starting lfsck layout check&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6683&quot;&gt;&lt;del&gt;LU-6683&lt;/del&gt;&lt;/a&gt; osd: declare enough credits for generating LMA&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 71d360f5a2201aa666382a0b7da1b89860596777&lt;/p&gt;</comment>
                            <comment id="121819" author="gerrit" created="Tue, 21 Jul 2015 16:09:46 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) merged in patch &lt;a href=&quot;http://review.whamcloud.com/15361/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/15361/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6683&quot; title=&quot;OSS crash when starting lfsck layout check&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6683&quot;&gt;&lt;del&gt;LU-6683&lt;/del&gt;&lt;/a&gt; osd: declare enough credits for generating LMA&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 3675d14de7ffcd761eca1448aab950f80773412a&lt;/p&gt;</comment>
                            <comment id="121831" author="pjones" created="Tue, 21 Jul 2015 16:23:21 +0000"  >&lt;p&gt;Landed for 2.8&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                                        </outwardlinks>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzxetz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>