<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:33:45 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-17235] kernel panic on kiblnd_startup with logical interfaces</title>
                <link>https://jira.whamcloud.com/browse/LU-17235</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;The kernel crashes if LNet Multi-Rail (MR) is enabled with logical interfaces.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;options lnet networks=&quot;o2ib12(ib0,ib0:1)&quot;
# modprobe lustre
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[  143.167995] LNet: Using FastReg for registration
[  143.648939] LNet: Added LNI 10.0.0.1@o2ib12 [8/512/0/180]
[  144.128240] BUG: unable to handle kernel NULL pointer dereference at 00000000000004f0
[  144.136091] PGD 0 P4D 0
[  144.138631] Oops: 0000 [#1] SMP NOPTI
[  144.142299] CPU: 7 PID: 2739 Comm: modprobe Kdump: loaded Tainted: G           OE    --------- -  - 4.18.0-425.13.1.el8_7.x86_64 #1
[  144.154133] Hardware name: Intel Corporation S2600BPB/S2600BPB, BIOS SE5C620.86B.02.01.0015.032120220358 03/21/2022
[  144.164573] RIP: 0010:kiblnd_startup+0x1194/0x1720 [ko2iblnd]
[  144.170338] Code: 44 24 08 4c 8b a8 50 01 00 00 41 8b 4d 68 85 c9 0f 84 7d 02 00 00 49 8b 47 38 48 8b bb a0 01 00 00 48 8d 70 24 e8 9c e1 d5 e7 &amp;lt;80&amp;gt; b8 f0 04 00 00 02 74 0d 80 b8 34 02 00 00 06 0f 84 63 04 00 00
[  144.189114] RSP: 0018:ffffac2a0952fb30 EFLAGS: 00010046
[  144.194352] RAX: 0000000000000000 RBX: ffff9f37ac0a3400 RCX: 0000000000000028
[  144.201494] RDX: ffff9f37582aa800 RSI: ffff9f3775d1d224 RDI: 00000000cbaad2c9
[  144.208638] RBP: ffff9f3775d1d340 R08: a7c5921741031163 R09: 0000000000000005
[  144.215776] R10: ffff9f3775e3fd80 R11: ffff9f37553718f0 R12: ffff9f3775d1d340
[  144.222919] R13: ffff9f376d402a00 R14: 0000000000000007 R15: ffff9f3759f07600
[  144.230061] FS:  00007fbb91ba0740(0000) GS:ffff9f4e20bc0000(0000) knlGS:0000000000000000
[  144.238156] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  144.243906] CR2: 00000000000004f0 CR3: 0000000135c02005 CR4: 00000000007706e0
[  144.251049] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  144.258189] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  144.265592] PKRU: 55555554
[  144.268521] Call Trace:
[  144.271144]  lnet_startup_lndnet+0x14f/0x7e0 [lnet]
[  144.276207]  LNetNIInit+0x6e1/0xd70 [lnet]
[  144.280486]  ? 0xffffffffc0c58000
[  144.283964]  ptlrpc_init_portals+0x27/0x250 [ptlrpc]
[  144.289160]  ? 0xffffffffc0c58000
[  144.292646]  ptlrpc_init+0x196/0x1000 [ptlrpc]
[  144.297307]  do_one_initcall+0x46/0x1d0
[  144.301306]  ? do_init_module+0x22/0x230
[  144.305386]  ? kmem_cache_alloc_trace+0x142/0x280
[  144.310246]  do_init_module+0x5a/0x230
[  144.314149]  load_module+0x14bf/0x17f0
[  144.318053]  ? __do_sys_finit_module+0xb1/0x110
[  144.322745]  __do_sys_finit_module+0xb1/0x110
[  144.327259]  do_syscall_64+0x5b/0x1b0
[  144.331081]  entry_SYSCALL_64_after_hwframe+0x61/0xc6
[  144.336289] RIP: 0033:0x7fbb90ab59bd
[  144.340020] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 &amp;lt;48&amp;gt; 3d 01 f0 ff ff 73 01 c3 48 8b 0d 9b 64 38 00 f7 d8 64 89 01 48
[  144.359098] RSP: 002b:00007ffe07f8ec88 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[  144.366826] RAX: ffffffffffffffda RBX: 000055c022b91800 RCX: 00007fbb90ab59bd
[  144.374125] RDX: 0000000000000000 RSI: 000055c021ebd8b6 RDI: 0000000000000006
[  144.381423] RBP: 000055c021ebd8b6 R08: 0000000000000000 R09: 0000000000000000
[  144.388718] R10: 0000000000000006 R11: 0000000000000246 R12: 0000000000000000
[  144.396011] R13: 000055c022b917a0 R14: 0000000000040000 R15: 0000000000000000
[  144.403304] Modules linked in: ko2iblnd(OE) ptlrpc(OE+) obdclass(OE) lnet(OE) libcfs(OE) beegfs(OE) uio_pci_generic uio vfio_pci vfio_virqfd vfio_iommu_type1 vfio irqbypass cuse rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) sunrpc vfat fat intel_rapl_msr intel_rapl_common isst_if_common ipmi_ssif skx_edac iTCO_wdt nfit iTCO_vendor_support libnvdimm ast i2c_algo_bit drm_vram_helper drm_ttm_helper ttm x86_pkg_temp_thermal intel_powerclamp coretemp drm_kms_helper crct10dif_pclmul crc32_pclmul syscopyarea ghash_clmulni_intel sysfillrect rapl sysimgblt acpi_ipmi intel_cstate fb_sys_fops ipmi_si mei_me drm joydev pcspkr ipmi_devintf mei intel_uncore ioatdma i2c_i801 lpc_ich wmi ipmi_msghandler acpi_power_meter acpi_pad binfmt_misc knem(OE) ext4 mbcache jbd2 mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) sd_mod t10_pi sg mlx5_core(OE) mlxfw(OE) pci_hyperv_intf ixgbe tls ahci libahci psample mlxdevm(OE) mdio libata crc32c_intel mlx_compat(OE) dca xpmem(OE) fuse
[  144.403358]  [last unloaded: libcfs]
[  144.494056] CR2: 00000000000004f0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If LNet is configured without logical interfaces, for example &apos;options lnet networks=&quot;o2ib12(ib0)&quot;&apos;, it starts normally.&lt;/p&gt;</description>
                <environment>Rockylinux8.7 (4.18.0-425.13.1.el8_7.x86_64)&lt;br/&gt;
OFED 5.8-1.1.2.1&lt;br/&gt;
master (commit:d7d1644)</environment>
        <key id="78643">LU-17235</key>
            <summary>kernel panic on kiblnd_startup with logical interfaces</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="ssmirnov">Serguei Smirnov</assignee>
                                    <reporter username="sihara">Shuichi Ihara</reporter>
                        <labels>
                    </labels>
                <created>Sat, 28 Oct 2023 05:31:44 +0000</created>
                <updated>Fri, 10 Nov 2023 21:27:47 +0000</updated>
                            <resolved>Fri, 10 Nov 2023 21:27:47 +0000</resolved>
                                                    <fixVersion>Lustre 2.16.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>5</watches>
                                                                            <comments>
                            <comment id="390964" author="ssmirnov" created="Sat, 28 Oct 2023 14:35:11 +0000"  >&lt;p&gt;Hi,&lt;/p&gt;

&lt;p&gt;Could you please add details on how the two interfaces are configured, for example please share &quot;ip a&quot; output, and other steps-to-reproduce.&lt;/p&gt;

&lt;p&gt;Did o2iblnd use to work with logical interfaces before? Crashing is definitely bad in this case, but reusing the same device does appear problematic without special configuration in IB layer, e.g. PKEYs.&lt;/p&gt;

&lt;p&gt;Thanks,&lt;/p&gt;

&lt;p&gt;Serguei.&lt;/p&gt;</comment>
                            <comment id="390965" author="sihara" created="Sat, 28 Oct 2023 14:43:43 +0000"  >&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@ec01 io500.git]# ip a
1: lo: &amp;lt;LOOPBACK,UP,LOWER_UP&amp;gt; mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp179s0f0np0: &amp;lt;NO-CARRIER,BROADCAST,MULTICAST,UP&amp;gt; mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether b8:59:9f:f6:89:98 brd ff:ff:ff:ff:ff:ff
    altname ens801f0np0
3: eno1: &amp;lt;BROADCAST,MULTICAST,UP,LOWER_UP&amp;gt; mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether a4:bf:01:5d:e3:d0 brd ff:ff:ff:ff:ff:ff
    altname enp23s0f0
    inet 10.128.11.1/21 brd 10.128.15.255 scope global noprefixroute eno1
       valid_lft forever preferred_lft forever
    inet6 fe80::a6bf:1ff:fe5d:e3d0/64 scope link 
       valid_lft forever preferred_lft forever
4: eno2: &amp;lt;NO-CARRIER,BROADCAST,MULTICAST,UP&amp;gt; mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether a4:bf:01:5d:e3:d1 brd ff:ff:ff:ff:ff:ff
    altname enp23s0f1
5: ib0: &amp;lt;BROADCAST,MULTICAST,UP,LOWER_UP&amp;gt; mtu 2044 qdisc mq state UP group default qlen 256
    link/infiniband 00:00:11:49:fe:80:00:00:00:00:00:00:b8:59:9f:03:00:f6:89:99 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
    inet 10.0.0.1/12 brd 10.15.255.255 scope global noprefixroute ib0
       valid_lft forever preferred_lft forever
    inet 10.0.2.1/12 brd 10.15.255.255 scope global secondary noprefixroute ib0:2
       valid_lft forever preferred_lft forever
    inet 10.0.4.1/12 brd 10.15.255.255 scope global secondary noprefixroute ib0:4
       valid_lft forever preferred_lft forever
    inet 10.0.1.1/12 brd 10.15.255.255 scope global secondary noprefixroute ib0:1
       valid_lft forever preferred_lft forever
    inet 10.0.7.1/12 brd 10.15.255.255 scope global secondary noprefixroute ib0:7
       valid_lft forever preferred_lft forever
    inet 10.0.5.1/12 brd 10.15.255.255 scope global secondary noprefixroute ib0:5
       valid_lft forever preferred_lft forever
    inet 10.0.3.1/12 brd 10.15.255.255 scope global secondary noprefixroute ib0:3
       valid_lft forever preferred_lft forever
    inet 10.0.6.1/12 brd 10.15.255.255 scope global secondary noprefixroute ib0:6
       valid_lft forever preferred_lft forever
    inet6 fe80::ba59:9f03:f6:8999/64 scope link 
       valid_lft forever preferred_lft forever
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;blockquote&gt;&lt;p&gt;Did o2iblnd use to work with logical interfaces before? Crashing is definitely bad in this case, but reusing the same device does appear problematic without special configuration in IB layer, e.g. PKEYs.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;Yes, this configuration has been working for more than 3 years, ever since I started using it.&lt;/p&gt;

&lt;p&gt;The reason for this setting is improved metadata performance. I thought increasing conns_per_peer would help, but it didn&apos;t. Using many NIDs still gives better metadata performance.&lt;/p&gt;

&lt;p&gt;We still need to investigate how to get the same performance improvements without this workaround, though.&lt;/p&gt;</comment>
                            <comment id="390977" author="sihara" created="Sun, 29 Oct 2023 03:50:27 +0000"  >&lt;p&gt;I found that &quot;commit: 09c6e2b872 &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16836&quot; title=&quot;LNet: initial ni status is &amp;quot;up&amp;quot; if starting with link disconnected&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16836&quot;&gt;&lt;del&gt;LU-16836&lt;/del&gt;&lt;/a&gt; lnet: ensure dev notification on lnd startup&quot; is the commit that first causes this problem.&lt;br/&gt;
Before this commit landed, the configuration with logical interfaces had been working well.&lt;/p&gt;

&lt;p&gt;&#160;&lt;/p&gt;</comment>
                            <comment id="390978" author="sihara" created="Sun, 29 Oct 2023 04:01:39 +0000"  >&lt;p&gt;In fact, this problem is not specific to MR with logical interfaces: the kernel also crashes when LNet is started on a single logical interface.&lt;br/&gt;
So the configuration below hits the problem too.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;options lnet networks=&quot;o2ib12(ib0:1)&quot; in modprobe.conf
# modprobe lustre
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&#160;&lt;/p&gt;</comment>
                            <comment id="391106" author="gerrit" created="Mon, 30 Oct 2023 19:15:51 +0000"  >&lt;p&gt;&quot;Serguei Smirnov &amp;lt;ssmirnov@whamcloud.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/52894&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/52894&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-17235&quot; title=&quot;kernel panic on kiblnd_startup with logical interfaces&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-17235&quot;&gt;&lt;del&gt;LU-17235&lt;/del&gt;&lt;/a&gt; o2iblnd: adding alias ib interface causes crash&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 4dc9164b23eb275b4050f9c013d13a469fd662fb&lt;/p&gt;</comment>
                            <comment id="392297" author="gerrit" created="Wed, 8 Nov 2023 22:00:39 +0000"  >&lt;p&gt;&quot;Oleg Drokin &amp;lt;green@whamcloud.com&amp;gt;&quot; merged in patch &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/52894/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/52894/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-17235&quot; title=&quot;kernel panic on kiblnd_startup with logical interfaces&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-17235&quot;&gt;&lt;del&gt;LU-17235&lt;/del&gt;&lt;/a&gt; o2iblnd: adding alias ib interface causes crash&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 02b22df6431a764c00ed0fbbc3286c2ed4dfbab0&lt;/p&gt;</comment>
                            <comment id="392769" author="pjones" created="Fri, 10 Nov 2023 21:27:47 +0000"  >&lt;p&gt;Landed for 2.16&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="76128">LU-16836</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i03zvj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>