<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:13:06 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-14822] Panic at dnode.c leading to LNet service thread inactive</title>
                <link>https://jira.whamcloud.com/browse/LU-14822</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;On our ZFS-backed MDS, we&apos;re seeing a frequent issue that hangs the clients and produces the attached stack trace.&#160; I&apos;ll also attach the lustre-logs for the relevant time period of this morning&apos;s instance.&lt;/p&gt;

&lt;p&gt;Once this has happened, it seems the only option to reconnect the client is a reboot of the MDS.&lt;/p&gt;

&lt;p&gt;This may be related to the MDT filling up - we changed the zpool topology to increase the size of the MDT and all seemed well for a few days before these issues started to occur.&lt;/p&gt;

&lt;p&gt;I&apos;m running an lfsck which has so far repaired a large number of namespaces, but the problem has occurred again while that was running.&lt;/p&gt;

&lt;p&gt;Any help is, as always, much appreciated.&#160;&lt;/p&gt;</description>
                <environment>EL7</environment>
        <key id="65000">LU-14822</key>
            <summary>Panic at dnode.c leading to LNet service thread inactive</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="6">Not a Bug</resolution>
                                        <assignee username="pjones">Peter Jones</assignee>
                                    <reporter username="dneg">Dneg</reporter>
                        <labels>
                    </labels>
                <created>Tue, 6 Jul 2021 16:32:21 +0000</created>
                <updated>Mon, 12 Jul 2021 16:02:06 +0000</updated>
                            <resolved>Mon, 12 Jul 2021 16:02:06 +0000</resolved>
                                    <version>Lustre 2.12.6</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                            <comments>
                            <comment id="306336" author="dneg" created="Tue, 6 Jul 2021 16:33:38 +0000"  >&lt;p&gt;and so you&apos;re not just talking to a generic company name, Stephen here &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/smile.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/p&gt;</comment>
                            <comment id="306339" author="pjones" created="Tue, 6 Jul 2021 16:41:38 +0000"  >&lt;p&gt;Stephen&lt;/p&gt;

&lt;p&gt;What ZFS version are you running? Could this be &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-13536&quot; title=&quot;Lustre ZFS dnode kernel panic&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-13536&quot;&gt;&lt;del&gt;LU-13536&lt;/del&gt;&lt;/a&gt;? If so, moving to a newer ZFS version could address this issue for you.&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="306340" author="dneg" created="Tue, 6 Jul 2021 16:45:38 +0000"  >&lt;p&gt;Browsing around it seems this might be related to the issue seen in&#160;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-13536&quot; title=&quot;Lustre ZFS dnode kernel panic&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-13536&quot;&gt;&lt;del&gt;LU-13536&lt;/del&gt;&lt;/a&gt; ?&lt;/p&gt;

&lt;p&gt;Is there a recommended package set to upgrade to ZFS 0.8.3 or newer/different?&#160; I may of course be barking up the wrong tree here.&lt;/p&gt;</comment>
                            <comment id="306341" author="dneg" created="Tue, 6 Jul 2021 16:46:38 +0000"  >&lt;p&gt;Hi Peter.&#160; Your comment showed up as I clicked &apos;Add&apos; &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/smile.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&#160; As mentioned, happy to upgrade ZFS.&#160; Do you have a recommended method for that or just grab them from the OpenZFS project?&lt;/p&gt;</comment>
                            <comment id="306342" author="dneg" created="Tue, 6 Jul 2021 16:48:57 +0000"  >&lt;p&gt;For reference:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;emds1 /tmp # rpm -qa | grep zfs
lustre-osd-zfs-mount-2.12.6-1.el7.x86_64
zfs-0.7.13-1.el7.x86_64
kmod-zfs-3.10.0-1160.2.1.el7_lustre.x86_64-0.7.13-1.el7.x86_64
libzfs2-0.7.13-1.el7.x86_64
kmod-lustre-osd-zfs-2.12.6-1.el7.x86_64 &lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="306344" author="pjones" created="Tue, 6 Jul 2021 17:00:34 +0000"  >&lt;p&gt;Yes - just grab the updated ZFS version from the ZoL site and rebuild. Several sites have done this successfully.&lt;/p&gt;</comment>
                            <comment id="306345" author="dneg" created="Tue, 6 Jul 2021 17:04:08 +0000"  >&lt;p&gt;Would I be correct in assuming that there&apos;s nothing particularly version specific about the two lustre packages in that list (lustre-osd-zfs-mount and kmod-lustre-osd-zfs)?&lt;/p&gt;

&lt;p&gt;So I&apos;ll just replace zfs and libzfs2 with the newer ones and swap out the Lustre-provided kmod-zfs for zfs-dkms?&lt;/p&gt;</comment>
                            <comment id="306348" author="dneg" created="Tue, 6 Jul 2021 17:25:57 +0000"  >&lt;p&gt;Had a first pass at this and I&apos;m afraid I think I&apos;m gonna have to ask for some step-by-step here.&lt;/p&gt;

&lt;p&gt;A quick rpmbuild -ba of the ZFS spec with a simple swap out of the spec file version number and ZFS tar.gz source is taking me down a road of missing files and so on.&#160; Do you have appropriate spec files for the newer versions?&#160; I notice also that the SPL source is no longer separate.&#160; Haven&apos;t tried rebuilding that yet.&lt;/p&gt;</comment>
                            <comment id="306351" author="adilger" created="Tue, 6 Jul 2021 17:45:13 +0000"  >&lt;p&gt;Per the comments in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-13536&quot; title=&quot;Lustre ZFS dnode kernel panic&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-13536&quot;&gt;&lt;del&gt;LU-13536&lt;/del&gt;&lt;/a&gt;, there are two approaches to solving the crashes in that ticket - updating to ZFS 0.8.x, or patching ZFS 0.7.13 with the two referenced patches:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;    78e213946 Fix dnode_hold() freeing dnode behavior
    58769a4eb Don&#8217;t allow dnode allocation if dn_holds != 0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Due to changes in ZFS between 0.7 and 0.8, if you do a ZFS upgrade you would need to rebuild all of the Lustre RPMs to get a new &lt;tt&gt;kmod-lustre-osd-zfs&lt;/tt&gt;, since it links directly with the ZFS module. If you apply the patches to ZFS 0.7.13, you could very likely keep the existing Lustre RPMs, since that change is internal only.&lt;/p&gt;</comment>
                            <comment id="306352" author="dneg" created="Tue, 6 Jul 2021 17:49:10 +0000"  >&lt;p&gt;Right.&#160; Thanks.&#160; Patching for now then.&lt;/p&gt;</comment>
                            <comment id="306376" author="dneg" created="Tue, 6 Jul 2021 20:27:08 +0000"  >&lt;p&gt;Rebuilt zfs and zfs-dkms with those two patches.&#160; I&apos;ll schedule a reboot soon to pick up the patched versions.&#160; Happy for you to close this and I can re-open if necessary or leave it as is for a few days while it&apos;s soak tested - whichever is your preference.&lt;/p&gt;</comment>
                            <comment id="306377" author="pjones" created="Tue, 6 Jul 2021 20:30:11 +0000"  >&lt;p&gt;Great. Why not just let us know after the weekend if there is a noticeable improvement (or sooner of course if there are still problems)?&lt;/p&gt;</comment>
                            <comment id="306965" author="dneg" created="Mon, 12 Jul 2021 15:58:02 +0000"  >&lt;p&gt;Looking good.&#160; Thanks for your help.&lt;/p&gt;</comment>
                            <comment id="306966" author="pjones" created="Mon, 12 Jul 2021 16:02:06 +0000"  >&lt;p&gt;Excellent - thanks for the update&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                            <attachment id="39523" name="lustre-logs.tar.gz" size="20666072" author="dneg" created="Tue, 6 Jul 2021 16:31:15 +0000"/>
                            <attachment id="39522" name="stack-trace" size="51994" author="dneg" created="Tue, 6 Jul 2021 16:31:51 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i01yl3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>