<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:08:08 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-7351] LNet router crash during bring up of infiniband interface.</title>
                <link>https://jira.whamcloud.com/browse/LU-7351</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;In testing on our medium-sized Cray system, we encountered the following crash while attempting to bring up LNet on the routers.&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:14&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;Lustre: kgnilnd build version: 2.7.61-DNE2-1.0502.0.2.7-jsimmons-Unknown-2015-10-21-11:16&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:14&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;LNet: Added LNI 12@gni2 &lt;span class=&quot;error&quot;&gt;&amp;#91;16/8192/0/0&amp;#93;&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;LNetError: 149:0:(o2iblnd_cb.c:2239:kiblnd_passive_connect()) Can&apos;t accept conn from 10.36.226.4@o2ib on NA (ib0:0:10.36.223.1): bad dst nid 10.36.223.1@o2ib&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;BUG: unable to handle kernel NULL pointer dereference at 0000000000000080&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;IP: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa03e3e5b&amp;gt;&amp;#93;&lt;/span&gt; kiblnd_passive_connect+0xfb/0x16b0 &lt;span class=&quot;error&quot;&gt;&amp;#91;ko2iblnd&amp;#93;&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;PGD 3dc9ae067 PUD 3ddc0c067 PMD 0 &lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;Oops: 0000 &amp;#91;#1&amp;#93; SMP &lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;CPU 5 &lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;Modules linked in: ko2iblnd kgnilnd lnet crc32c libcfs binfmt_misc rdma_ucm ib_ucm rdma_cm iw_cm ib_addr ib_ipoib ib_cm ib_uverbs ib_umad mlx5_ib mlx5_core mlx4_en mlx4_ib ib_sa ib_mad ib_core mlx4_core compat nic_compat dm_mod kdreg gpcd_gem ipogif_gem kgni_gem hwerr(P) rca hss_os(P) heartbeat simplex(P) ghal_gem cgm craytrace&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;Pid: 149, comm: kworker/5:1 Tainted: P             3.0.101-0.46.1_1.0502.8871-cray_gem_s #1  &lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;RIP: 0010:&lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa03e3e5b&amp;gt;&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa03e3e5b&amp;gt;&amp;#93;&lt;/span&gt; kiblnd_passive_connect+0xfb/0x16b0 &lt;span class=&quot;error&quot;&gt;&amp;#91;ko2iblnd&amp;#93;&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;RSP: 0018:ffff8803f1b0fb10  EFLAGS: 00010246&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;RAX: 000000000000003f RBX: ffffffffa03ee513 RCX: ffffffff81368c50&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff8803e8ac7680&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;RBP: ffff8803f1b0fbd0 R08: 0000000000000005 R09: 0000000000000005&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;R10: 0000000000000003 R11: 00000000ffffffff R12: 0000000000000012&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;R13: 0000000000000000 R14: 0000000000000000 R15: ffff8803c0e58620&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;FS:  00007f9e343b7700(0000) GS:ffff880407d40000(0000) knlGS:0000000000000000&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;CR2: 0000000000000080 CR3: 00000003dc9bd000 CR4: 00000000000007e0&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;LNet: Added LNI 10.36.223.1@o2ib &lt;span class=&quot;error&quot;&gt;&amp;#91;63/2560/0/180&amp;#93;&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;Process kworker/5:1 (pid: 149, threadinfo ffff8803f1b0c000, task ffff8803f1b09040)&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;Stack:&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; ffff8803c0e58620 ffffffffa02f6080 0000000000000000 ffff8803ea8bcc00&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; ffffffffa02f6080 000500000a24e204 0000000000000001 ffff8803e4740000&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; 000300120be91b91 0000000000000000 000010000000003f ffffffffa017b636&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;Call Trace:&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa03e59bd&amp;gt;&amp;#93;&lt;/span&gt; kiblnd_cm_callback+0x5ad/0x2070 &lt;span class=&quot;error&quot;&gt;&amp;#91;ko2iblnd&amp;#93;&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa024145b&amp;gt;&amp;#93;&lt;/span&gt; cma_req_handler+0x1eb/0x550 &lt;span class=&quot;error&quot;&gt;&amp;#91;rdma_cm&amp;#93;&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa016ff57&amp;gt;&amp;#93;&lt;/span&gt; cm_process_work+0x27/0x130 &lt;span class=&quot;error&quot;&gt;&amp;#91;ib_cm&amp;#93;&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0171fb0&amp;gt;&amp;#93;&lt;/span&gt; cm_req_handler+0x750/0xa00 &lt;span class=&quot;error&quot;&gt;&amp;#91;ib_cm&amp;#93;&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0172385&amp;gt;&amp;#93;&lt;/span&gt; cm_work_handler+0x125/0xf4c &lt;span class=&quot;error&quot;&gt;&amp;#91;ib_cm&amp;#93;&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81060953&amp;gt;&amp;#93;&lt;/span&gt; process_one_work+0x163/0x440&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81063473&amp;gt;&amp;#93;&lt;/span&gt; worker_thread+0x183/0x400&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81067ace&amp;gt;&amp;#93;&lt;/span&gt; kthread+0x9e/0xb0&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81490074&amp;gt;&amp;#93;&lt;/span&gt; kernel_thread_helper+0x4/0x10&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;Code: 0f 84 da 01 00 00 66 3d 00 11 0f 84 d0 01 00 00 66 c7 45 84 12 00 45 31 f6 48 8b 05 38 03 01 00 ba 00 01 00 00 8b 00 66 89 45 90 &lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; 8b 86 80 00 00 00 85 c0 0f 45 d0 48 8b bd 58 ff ff ff 48 8d &lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;RIP  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa03e3e5b&amp;gt;&amp;#93;&lt;/span&gt; kiblnd_passive_connect+0xfb/0x16b0 &lt;span class=&quot;error&quot;&gt;&amp;#91;ko2iblnd&amp;#93;&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; RSP &amp;lt;ffff8803f1b0fb10&amp;gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;CR2: 0000000000000080&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;---[ end trace 311d9fd8dd61b1cf ]---&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;Kernel panic - not syncing: Fatal exception&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;Pid: 149, comm: kworker/5:1 Tainted: P      D      3.0.101-0.46.1_1.0502.8871-cray_gem_s #1&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt;Call Trace:&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81006651&amp;gt;&amp;#93;&lt;/span&gt; try_stack_unwind+0x161/0x1a0&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81004eb9&amp;gt;&amp;#93;&lt;/span&gt; dump_trace+0x89/0x430&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff810060bc&amp;gt;&amp;#93;&lt;/span&gt; show_trace_log_lvl+0x5c/0x80&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff810060f5&amp;gt;&amp;#93;&lt;/span&gt; show_trace+0x15/0x20&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8148b31c&amp;gt;&amp;#93;&lt;/span&gt; dump_stack+0x79/0x84&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8148b3bb&amp;gt;&amp;#93;&lt;/span&gt; panic+0x94/0x1da&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81005ed8&amp;gt;&amp;#93;&lt;/span&gt; oops_end+0xa8/0xe0&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81027589&amp;gt;&amp;#93;&lt;/span&gt; no_context+0xf9/0x260&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81027855&amp;gt;&amp;#93;&lt;/span&gt; __bad_area_nosemaphore+0x165/0x1f0&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff810278f3&amp;gt;&amp;#93;&lt;/span&gt; bad_area_nosemaphore+0x13/0x20&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81027e4e&amp;gt;&amp;#93;&lt;/span&gt; do_page_fault+0x2fe/0x440&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8148e8cf&amp;gt;&amp;#93;&lt;/span&gt; page_fault+0x1f/0x30&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa03e3e5b&amp;gt;&amp;#93;&lt;/span&gt; kiblnd_passive_connect+0xfb/0x16b0 &lt;span class=&quot;error&quot;&gt;&amp;#91;ko2iblnd&amp;#93;&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa03e59bd&amp;gt;&amp;#93;&lt;/span&gt; kiblnd_cm_callback+0x5ad/0x2070 &lt;span class=&quot;error&quot;&gt;&amp;#91;ko2iblnd&amp;#93;&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa024145b&amp;gt;&amp;#93;&lt;/span&gt; cma_req_handler+0x1eb/0x550 &lt;span class=&quot;error&quot;&gt;&amp;#91;rdma_cm&amp;#93;&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa016ff57&amp;gt;&amp;#93;&lt;/span&gt; cm_process_work+0x27/0x130 &lt;span class=&quot;error&quot;&gt;&amp;#91;ib_cm&amp;#93;&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0171fb0&amp;gt;&amp;#93;&lt;/span&gt; cm_req_handler+0x750/0xa00 &lt;span class=&quot;error&quot;&gt;&amp;#91;ib_cm&amp;#93;&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0172385&amp;gt;&amp;#93;&lt;/span&gt; cm_work_handler+0x125/0xf4c &lt;span class=&quot;error&quot;&gt;&amp;#91;ib_cm&amp;#93;&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81060953&amp;gt;&amp;#93;&lt;/span&gt; process_one_work+0x163/0x440&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81063473&amp;gt;&amp;#93;&lt;/span&gt; worker_thread+0x183/0x400&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81067ace&amp;gt;&amp;#93;&lt;/span&gt; kthread+0x9e/0xb0&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;2015-10-26 15:51:15&amp;#93;&lt;/span&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;c0-0c0s6n0&amp;#93;&lt;/span&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81490074&amp;gt;&amp;#93;&lt;/span&gt; kernel_thread_helper+0x4/0x10&lt;/p&gt;</description>
                <environment>Cray routers running Lustre 2.7.61 in an SLES11 SP3 environment.</environment>
        <key id="32878">LU-7351</key>
            <summary>LNet router crash during bring up of infiniband interface.</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="1" iconUrl="https://jira.whamcloud.com/images/icons/priorities/blocker.svg">Blocker</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="3">Duplicate</resolution>
                                        <assignee username="doug">Doug Oucharek</assignee>
                                    <reporter username="simmonsja">James A Simmons</reporter>
                        <labels>
                    </labels>
                <created>Wed, 28 Oct 2015 17:43:53 +0000</created>
                <updated>Fri, 20 Nov 2015 20:55:26 +0000</updated>
                            <resolved>Fri, 20 Nov 2015 18:45:51 +0000</resolved>
                                    <version>Lustre 2.8.0</version>
                                    <fixVersion>Lustre 2.8.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>11</watches>
                                                                            <comments>
                            <comment id="131879" author="doug" created="Wed, 28 Oct 2015 17:57:36 +0000"  >&lt;p&gt;James, what IB card is in the router (FDR, EDR)?  Looks like mlx5 is being used.  Is this upstream OFED or MOFED?&lt;/p&gt;</comment>
                            <comment id="131880" author="simmonsja" created="Wed, 28 Oct 2015 17:59:35 +0000"  >&lt;p&gt;Its a mlx5 FDR card using the OFED 3.12 stack.&lt;/p&gt;</comment>
                            <comment id="131886" author="doug" created="Wed, 28 Oct 2015 18:17:13 +0000"  >&lt;p&gt;Is this a production system or a test system?  Should it be sev 1 (highest priority)?&lt;/p&gt;</comment>
                            <comment id="131891" author="simmonsja" created="Wed, 28 Oct 2015 18:32:03 +0000"  >&lt;p&gt;This was tested on production system. We had to roll back to 2.5 version to have it work again. I made it a blocker since it prevents Lustre bring up for anyone attempting to the latest pre-2.8 clients.&lt;/p&gt;</comment>
                            <comment id="131893" author="doug" created="Wed, 28 Oct 2015 18:35:23 +0000"  >&lt;p&gt;I agree this is a blocker.  Adding sev 1 means a production system is down and needs immediate attention to get it back up again.  If the system in question is back up and running, can we change this to a sev 2?&lt;/p&gt;</comment>
                            <comment id="131899" author="simmonsja" created="Wed, 28 Oct 2015 18:41:33 +0000"  >&lt;p&gt;Sure you can change it to 2.&lt;/p&gt;</comment>
                            <comment id="131900" author="doug" created="Wed, 28 Oct 2015 18:44:23 +0000"  >&lt;p&gt;The o2iblnd_cb.c I have in 2.7.61 does not seem to match yours.  Can you attach your copy of o2iblnd_cb.c so I can see the differences?&lt;/p&gt;</comment>
                            <comment id="131927" author="ashehata" created="Wed, 28 Oct 2015 21:15:17 +0000"  >&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;(gdb) l *kiblnd_cm_callback+0x5ad
0x1661d is in kiblnd_cm_callback (/home/ashehata/lustre-master/lnet/klnds/o2iblnd/o2iblnd_cb.c:2889).
2884                    kiblnd_peer_decref(peer);
2885                    &lt;span class=&quot;code-keyword&quot;&gt;return&lt;/span&gt; rc;                      &lt;span class=&quot;code-comment&quot;&gt;/* rc != 0 destroys cmid */&lt;/span&gt;
2886
2887            &lt;span class=&quot;code-keyword&quot;&gt;case&lt;/span&gt; RDMA_CM_EVENT_ROUTE_ERROR:
2888                    peer = (kib_peer_t *)cmid-&amp;gt;context;
2889                    CNETERR(&lt;span class=&quot;code-quote&quot;&gt;&quot;%s: ROUTE ERROR %d\n&quot;&lt;/span&gt;,
2890                            libcfs_nid2str(peer-&amp;gt;ibp_nid), event-&amp;gt;status);
2891                    kiblnd_peer_connect_failed(peer, 1, -EHOSTUNREACH);
2892                    kiblnd_peer_decref(peer);
2893                    &lt;span class=&quot;code-keyword&quot;&gt;return&lt;/span&gt; -EHOSTUNREACH;           &lt;span class=&quot;code-comment&quot;&gt;/* rc != 0 destroys cmid */&lt;/span&gt;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;It looks like peer might be NULL.&lt;br/&gt;
If there is a change in the way OFED works, that might explain this error.&lt;br/&gt;
Can you put a few debug statements around this part to see if this is indeed the case? A check for whether cmid or peer is NULL would do.&lt;/p&gt;</comment>
                            <comment id="131941" author="doug" created="Wed, 28 Oct 2015 23:00:53 +0000"  >&lt;p&gt;Amir: the previous line is a call to kiblnd_peer_connect_failed() where peer is dereferenced.  I would have thought the crash would have happened there.  Are you looking at the proper binary?  The files James sent us are different than 2.7.61.&lt;/p&gt;</comment>
                            <comment id="131944" author="doug" created="Wed, 28 Oct 2015 23:25:46 +0000"  >&lt;p&gt;James: we suspect that cmid-&amp;gt;context is coming back NULL when we don&apos;t expect such a thing to happen.  Can you verify the line of the crash with your binary?  I&apos;m not sure that the binary Amir used is equivalent to yours.  Once we know for sure that a NULL cmid-&amp;gt;context is the cause, we can start to figure out how such a thing can happen.&lt;/p&gt;</comment>
                            <comment id="132120" author="morrone" created="Fri, 30 Oct 2015 01:00:02 +0000"  >&lt;p&gt;This sounds like it is a blocker for the 2.8 release as stated.  I marked it as such, but you can correct it if I&apos;m wrong.&lt;/p&gt;</comment>
                            <comment id="133012" author="simmonsja" created="Mon, 9 Nov 2015 18:12:20 +0000"  >&lt;p&gt;Nope the problem is not cmd-&amp;gt;context being NULL. I&apos;m going to give it another run to see what it is.&lt;/p&gt;</comment>
                            <comment id="133069" author="simmonsja" created="Mon, 9 Nov 2015 22:31:59 +0000"  >&lt;p&gt;I added the latest &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3322&quot; title=&quot;ko2iblnd support for different map_on_demand and peer_credits between systems&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3322&quot;&gt;&lt;del&gt;LU-3322&lt;/del&gt;&lt;/a&gt; patch and I&apos;m not seeing the crashes anymore. This is at a smallest scale that I saw this problem before but we don&apos;t know if it resolves this at titan scales.&lt;/p&gt;</comment>
                            <comment id="133383" author="doug" created="Thu, 12 Nov 2015 18:48:55 +0000"  >&lt;p&gt;James: Can I close this linked to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3322&quot; title=&quot;ko2iblnd support for different map_on_demand and peer_credits between systems&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3322&quot;&gt;&lt;del&gt;LU-3322&lt;/del&gt;&lt;/a&gt;?&lt;/p&gt;</comment>
                            <comment id="133402" author="simmonsja" created="Thu, 12 Nov 2015 19:57:08 +0000"  >&lt;p&gt;Can we wait until &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3322&quot; title=&quot;ko2iblnd support for different map_on_demand and peer_credits between systems&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3322&quot;&gt;&lt;del&gt;LU-3322&lt;/del&gt;&lt;/a&gt; is settled. I noticed that patch has changed again but is now disliked. I want to wait until a agreed on solution is presented before I will try to test it again. Is that okay?&lt;/p&gt;</comment>
                            <comment id="133882" author="doug" created="Wed, 18 Nov 2015 23:31:06 +0000"  >&lt;p&gt;Hi James.  I see you successfully tested &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3322&quot; title=&quot;ko2iblnd support for different map_on_demand and peer_credits between systems&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3322&quot;&gt;&lt;del&gt;LU-3322&lt;/del&gt;&lt;/a&gt;.  Does that mean this issue has been resolved?&lt;/p&gt;</comment>
                            <comment id="134103" author="pjones" created="Fri, 20 Nov 2015 18:45:51 +0000"  >&lt;p&gt;Seems like this is a duplicate &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3322&quot; title=&quot;ko2iblnd support for different map_on_demand and peer_credits between systems&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3322&quot;&gt;&lt;del&gt;LU-3322&lt;/del&gt;&lt;/a&gt;. James, please speak up if you think otherwise&lt;/p&gt;</comment>
                            <comment id="134113" author="simmonsja" created="Fri, 20 Nov 2015 20:42:32 +0000"  >&lt;p&gt;I agree. The patch from &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3322&quot; title=&quot;ko2iblnd support for different map_on_demand and peer_credits between systems&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3322&quot;&gt;&lt;del&gt;LU-3322&lt;/del&gt;&lt;/a&gt; resolves this.&lt;/p&gt;</comment>
                            <comment id="134115" author="pjones" created="Fri, 20 Nov 2015 20:55:26 +0000"  >&lt;p&gt;Great - thanks for confirming James&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="18907">LU-3322</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10030" key="com.atlassian.jira.plugin.system.customfieldtypes:labels">
                        <customfieldname>Epic/Theme</customfieldname>
                        <customfieldvalues>
                                        <label>lnet</label>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzxrpb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10021"><![CDATA[2]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>