<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:11:28 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-14638] crash after lnet_inet_enumerate() failed</title>
                <link>https://jira.whamcloud.com/browse/LU-14638</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;I tried starting 2.14.0 in my local VM (just &lt;tt&gt;llmount.sh&lt;/tt&gt; on kernel  &lt;tt&gt;3.10.0-1160.21.1.el7_lustre.ddn13.x86_64&lt;/tt&gt; after a clean build) and twice got a crash after &lt;tt&gt;lnet_inet_enumerate()&lt;/tt&gt; reported a down interface:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[52067.917105] LNet: 8128:0:(config.c:1565:lnet_inet_enumerate()) lnet: Ignoring interface : it&apos;s down
[52067.926616] BUG: unable to handle kernel NULL pointer dereference at 0000000000000168
[52067.933558] IP: [&amp;lt;ffffffff8464f1f6&amp;gt;] dev_get_flags+0x6/0x70
[52068.014929] CPU: 1 PID: 8128 Comm: insmod Kdump: loaded Tainted: G           OE  ------------   3.10.0-1160.21.1.el7_lustre.ddn13.x86_64 #1
[52068.024565] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[52068.098650] Call Trace:
[52068.101291]  [&amp;lt;ffffffffc0727149&amp;gt;] ? lnet_inet_enumerate+0x59/0x2d0 [lnet]
[52068.112194]  [&amp;lt;ffffffffc07ce9ea&amp;gt;] ksocknal_startup+0x12a/0xf90 [ksocklnd]
[52068.117704]  [&amp;lt;ffffffffc0721de5&amp;gt;] lnet_startup_lndnet+0x135/0x800 [lnet]
[52068.123269]  [&amp;lt;ffffffffc0724085&amp;gt;] LNetNIInit+0x735/0xcf0 [lnet]
[52068.133225]  [&amp;lt;ffffffffc0b3d1aa&amp;gt;] ptlrpc_ni_init+0x2a/0x1a0 [ptlrpc]
[52068.138628]  [&amp;lt;ffffffffc0b3d331&amp;gt;] ptlrpc_init_portals+0x11/0xf0 [ptlrpc]
[52068.144344]  [&amp;lt;ffffffffc0d131ae&amp;gt;] ptlrpc_init+0x1ae/0x1000 [ptlrpc]
[52068.148379]  [&amp;lt;ffffffff8400210a&amp;gt;] do_one_initcall+0xba/0x240
[52068.153886]  [&amp;lt;ffffffff8411e62a&amp;gt;] load_module+0x271a/0x2bb0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It would be useful for the &lt;tt&gt;lnet_inet_enumerate()&lt;/tt&gt; message to print &lt;b&gt;which&lt;/b&gt; interface is down. Looking at the code, it is &lt;em&gt;trying&lt;/em&gt; to do that, so the interface name must be empty.  Using &lt;tt&gt;&apos;%s&apos;&lt;/tt&gt; around the name would make that clearer.  Adding a bit of extra debugging shows that it is failing right away, without checking any other interfaces:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[ 1297.032731] LNet: 4540:0:(config.c:1560:lnet_inet_enumerate()) lnet: checking interface &apos;&apos;
[ 1297.040231] LNet: 4540:0:(config.c:1566:lnet_inet_enumerate()) lnet: Ignoring interface &apos;&apos;: it&apos;s down
[ 1297.048456] BUG: unable to handle kernel NULL pointer dereference at 0000000000000168
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It seems like it is somehow trying to start with an empty device list, but the VM definitely has an interface that is up (I was logged into the VM via SSH when running the &lt;tt&gt;llmount.sh&lt;/tt&gt; command) and I haven&apos;t had any issues running other releases (I don&apos;t recall if I&apos;ve ever run vanilla 2.14.0 in this VM):&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[   38.761576] e1000: enp0s3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[   38.771906] IPv6: ADDRCONF(NETDEV_UP): enp0s3: link is not ready
[   38.781866] IPv6: ADDRCONF(NETDEV_CHANGE): enp0s3: link becomes ready
:
:
# ifconfig
enp0s3: flags=4163&amp;lt;UP,BROADCAST,RUNNING,MULTICAST&amp;gt;  mtu 1500
        inet 192.168.10.99  netmask 255.255.255.0  broadcast 192.168.10.255
        inet6 fe80::e9c4:7d8c:e641:5e6e  prefixlen 64  scopeid 0x20&amp;lt;link&amp;gt;
        ether 08:00:27:1d:4b:97  txqueuelen 1000  (Ethernet)
        RX packets 1594  bytes 272650 (266.2 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 441  bytes 79324 (77.4 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp0s3:0: flags=4163&amp;lt;UP,BROADCAST,RUNNING,MULTICAST&amp;gt;  mtu 1500
        inet 192.168.20.99  netmask 255.255.255.0  broadcast 192.168.20.255
        ether 08:00:27:1d:4b:97  txqueuelen 1000  (Ethernet)
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;There are no lnet or socklnd module options in use:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# cat /etc/modprobe.d/lustre.conf
options mdt max_mod_rpcs_per_client=16
options ptlrpc at_min=10 at_max=900
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I&apos;ll try with the tip of master next (&lt;tt&gt;v2_14_51-85-ga2b5290d4284&lt;/tt&gt;) in case this has already been fixed. I am filing this ticket to capture the details while I have them, and to provide breadcrumbs to the fix for anyone else running vanilla 2.14.0 who hits the same problem.&lt;/p&gt;</description>
                <environment></environment>
        <key id="63921">LU-14638</key>
            <summary>crash after lnet_inet_enumerate() failed</summary>
                <type id="3" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11318&amp;avatarType=issuetype">Task</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="6">Not a Bug</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="adilger">Andreas Dilger</reporter>
                        <labels>
                    </labels>
                <created>Fri, 23 Apr 2021 18:27:58 +0000</created>
                <updated>Sat, 24 Apr 2021 16:09:51 +0000</updated>
                            <resolved>Sat, 24 Apr 2021 16:09:51 +0000</resolved>
                                    <version>Lustre 2.14.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>2</watches>
                                                                            <comments>
                            <comment id="299610" author="simmonsja" created="Fri, 23 Apr 2021 19:22:34 +0000"  >&lt;p&gt;I noticed you have both IPv4 and IPv6 addresses assigned to enp0s3. Can you try removing the inet6 address and see if it stops crashing?&lt;/p&gt;</comment>
                            <comment id="299660" author="adilger" created="Sat, 24 Apr 2021 16:09:51 +0000"  >&lt;p&gt;After doing a full clean build I am not able to reproduce this, with or without IPv6 addresses on the interfaces.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i01t07:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>