<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:25:06 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-2427] Hit &quot;kernel BUG&quot; when running on debug kernel during recovery</title>
                <link>https://jira.whamcloud.com/browse/LU-2427</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;I&apos;m trying to run Lustre-Orion against a debug kernel on the MDS and hit this BUG twice yesterday. There are a couple of &quot;&lt;span class=&quot;error&quot;&gt;&amp;#91;...&amp;#93;&lt;/span&gt; used greatest stack depth&quot; messages, so I&apos;m curious whether the stack was stomped on, causing the crash.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;zpool used greatest stack depth: 1552 bytes left
Lustre: Lustre: Build Version: 2.0.59-llnl3-base-DEBUG--CHANGED-2.6.32-220.23.1.1chaos.ch5.x86_64.debug
Lustre: MGS: Mounted grove-mds2/mgs
mount.lustre used greatest stack depth: 1280 bytes left
LustreError: 11-0: MGC172.20.5.2@o2ib500: Communicating with 0@lo, operation llog_origin_handle_create failed with -2
LustreError: 20904:0:(mgc_request.c:248:do_config_log_add()) failed processing sptlrpc log: -2
Lustre: 20909:0:(fld_index.c:354:fld_index_init()) srv-lstest-MDT0000: File &quot;fld&quot; doesn&apos;t support range lookup, using stub. DNE and FIDs on OST will not work with this backend
ib0: no IPv6 routers present
Lustre: lstest-MDT0000: Temporarily refusing client connection from 172.20.3.154@o2ib500
Lustre: lstest-MDT0000: Temporarily refusing client connection from 172.20.3.191@o2ib500
Lustre: lstest-MDT0000: Mounted grove-mds2/mdt0
Lustre: lstest-MDT0000: Will be in recovery for at least 5:00, or until 256 clients reconnect.
------------[ cut here ]------------
kernel BUG at /usr/src/kernels/2.6.32-220.23.1.1chaos.ch5.x86_64.debug/include/linux/scatterlist.h:65!
invalid opcode: 0000 [#1] SMP 
last sysfs file: /sys/module/ptlrpc/initstate
CPU 20 
Modules linked in: osp(U) mdt(U) mdd(U) lod(U) mgs(U) mgc(U) osd_zfs(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) acpi_cpufreq freq_table mperf ko2iblnd(U) lnet(U) libcfs(U) ib_ipoib ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ib_sa mlx4_ib ib_mad ib_core dm_mirror dm_region_hash dm_log dm_round_robin dm_multipath dm_mod vhost_net macvtap macvlan tun kvm_intel kvm zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) spl(U) zlib_deflate ses enclosure sg sd_mod crc_t10dif mpt2sas scsi_transport_sas raid_class serio_raw i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma i7core_edac edac_core shpchp ipv6 nfs lockd fscache nfs_acl auth_rpcgss sunrpc mlx4_en mlx4_core igb dca [last unloaded: cpufreq_ondemand]

Pid: 20891, comm: ll_mgs_02 Tainted: P        W  ----------------   2.6.32-220.23.1.1chaos.ch5.x86_64.debug #1 Supermicro X8DTH-i/6/iF/6F/X8DTH
RIP: 0010:[&amp;lt;ffffffffa0693ca6&amp;gt;]  [&amp;lt;ffffffffa0693ca6&amp;gt;] kiblnd_setup_rd_iov+0x1f6/0x2f0 [ko2iblnd]
RSP: 0018:ffff88178f87d960  EFLAGS: 00010293
RAX: ffffea00a792e280 RBX: ffff882fdddbe408 RCX: 0000000000000000
RDX: 00000000000020c0 RSI: 0000000087654321 RDI: ffff882fe0d30148
RBP: ffff88178f87d9b0 R08: ffff882fdddbe408 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff882fe0d30148
R13: ffffc9004ed47000 R14: 00000000000020c0 R15: 0000000000000000
FS:  00007ffff7fdc700(0000) GS:ffff881895800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000006d3e70 CR3: 0000002ffa068000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process ll_mgs_02 (pid: 20891, threadinfo ffff88178f87c000, task ffff88178f878ac0)
Stack:
 ffffc9002b567748 ffff8817b7ee9818 ffff8817b5004000 00000001dddbc058
&amp;lt;0&amp;gt; 00000000000020c0 ffff882fdddbc058 00000000000020c0 ffffc9002b567748
&amp;lt;0&amp;gt; 0000000000000001 000501f4ac14043d ffff88178f87da50 ffffffffa069892a
Call Trace:
 [&amp;lt;ffffffffa069892a&amp;gt;] kiblnd_send+0x59a/0x870 [ko2iblnd]
 [&amp;lt;ffffffffa062e359&amp;gt;] ? lnet_send+0x59/0x9f0 [lnet]
 [&amp;lt;ffffffffa062a14b&amp;gt;] lnet_ni_send+0x4b/0x110 [lnet]
 [&amp;lt;ffffffffa062e55b&amp;gt;] lnet_send+0x25b/0x9f0 [lnet]
 [&amp;lt;ffffffffa062f5bb&amp;gt;] LNetPut+0x2ab/0x670 [lnet]
 [&amp;lt;ffffffffa086a71e&amp;gt;] ptl_send_buf+0x18e/0x440 [ptlrpc]
 [&amp;lt;ffffffffa08875f0&amp;gt;] ? at_measured+0x1e0/0x320 [ptlrpc]
 [&amp;lt;ffffffffa08a2285&amp;gt;] ? null_authorize+0x75/0x110 [ptlrpc]
 [&amp;lt;ffffffffa086ac2f&amp;gt;] ptlrpc_send_reply+0x25f/0x770 [ptlrpc]
 [&amp;lt;ffffffffa08425e4&amp;gt;] target_send_reply_msg+0x54/0x160 [ptlrpc]
 [&amp;lt;ffffffffa0842a3e&amp;gt;] target_send_reply+0x34e/0x680 [ptlrpc]
 [&amp;lt;ffffffffa08868d3&amp;gt;] ? llog_origin_handle_read_header+0x193/0x520 [ptlrpc]
 [&amp;lt;ffffffffa0c8cd16&amp;gt;] mgs_handle+0xd6/0x1020 [mgs]
 [&amp;lt;ffffffffa0706a0f&amp;gt;] ? keys_fill+0x6f/0x1a0 [obdclass]
 [&amp;lt;ffffffffa08717f4&amp;gt;] ? lustre_msg_get_transno+0x54/0x90 [ptlrpc]
 [&amp;lt;ffffffffa087bc6c&amp;gt;] ptlrpc_server_handle_request+0x3fc/0xce0 [ptlrpc]
 [&amp;lt;ffffffffa059256e&amp;gt;] ? cfs_timer_arm+0xe/0x10 [libcfs]
 [&amp;lt;ffffffffa059ff09&amp;gt;] ? lc_watchdog_touch+0x79/0x110 [libcfs]
 [&amp;lt;ffffffffa0876e20&amp;gt;] ? ptlrpc_wait_event+0xb0/0x2b0 [ptlrpc]
 [&amp;lt;ffffffff810aeb6d&amp;gt;] ? trace_hardirqs_on+0xd/0x10
 [&amp;lt;ffffffff81055043&amp;gt;] ? __wake_up+0x53/0x70
 [&amp;lt;ffffffffa087df00&amp;gt;] ptlrpc_main+0x710/0x1190 [ptlrpc]
 [&amp;lt;ffffffff810aeb6d&amp;gt;] ? trace_hardirqs_on+0xd/0x10
 [&amp;lt;ffffffffa087d7f0&amp;gt;] ? ptlrpc_main+0x0/0x1190 [ptlrpc]
 [&amp;lt;ffffffff8100c20a&amp;gt;] child_rip+0xa/0x20
 [&amp;lt;ffffffff815231f0&amp;gt;] ? _spin_unlock_irq+0x30/0x40
 [&amp;lt;ffffffff8100bb50&amp;gt;] ? restore_args+0x0/0x30
 [&amp;lt;ffffffffa087d7f0&amp;gt;] ? ptlrpc_main+0x0/0x1190 [ptlrpc]
 [&amp;lt;ffffffff8100c200&amp;gt;] ? child_rip+0x0/0x20
Code: 35 f2 01 00 00 00 04 00 e8 28 a8 f0 ff 48 c7 c7 60 2e 6b a0 c7 05 df f1 01 00 00 00 04 00 e8 02 e2 ef ff 0f 0b eb fe 0f 0b eb fe &amp;lt;0f&amp;gt; 0b 0f 1f 84 00 00 00 00 00 eb f6 48 c7 c7 20 2e 6b a0 48 c7 
RIP  [&amp;lt;ffffffffa0693ca6&amp;gt;] kiblnd_setup_rd_iov+0x1f6/0x2f0 [ko2iblnd]
 RSP &amp;lt;ffff88178f87d960&amp;gt;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
crash&amp;gt; bt
PID: 20891  TASK: ffff88178f878ac0  CPU: 20  COMMAND: &quot;ll_mgs_02&quot;
 #0 [ffff88178f87d620] machine_kexec at ffffffff81032ad0
 #1 [ffff88178f87d680] crash_kexec at ffffffff810cab52
 #2 [ffff88178f87d750] oops_end at ffffffff81524c20
 #3 [ffff88178f87d780] die at ffffffff8100f3bb
 #4 [ffff88178f87d7b0] do_trap at ffffffff81524334
 #5 [ffff88178f87d810] do_invalid_op at ffffffff8100cff5
 #6 [ffff88178f87d8b0] invalid_op at ffffffff8100bf9b
    [exception RIP: kiblnd_setup_rd_iov+502]
    RIP: ffffffffa0693ca6  RSP: ffff88178f87d960  RFLAGS: 00010293
    RAX: ffffea00a792e280  RBX: ffff882fdddbe408  RCX: 0000000000000000
    RDX: 00000000000020c0  RSI: 0000000087654321  RDI: ffff882fe0d30148
    RBP: ffff88178f87d9b0   R8: ffff882fdddbe408   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000000  R12: ffff882fe0d30148
    R13: ffffc9004ed47000  R14: 00000000000020c0  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffff88178f87d9b8] kiblnd_send at ffffffffa069892a [ko2iblnd]
 #8 [ffff88178f87da58] lnet_ni_send at ffffffffa062a14b [lnet]
 #9 [ffff88178f87da78] lnet_send at ffffffffa062e55b [lnet]
#10 [ffff88178f87dae8] LNetPut at ffffffffa062f5bb [lnet]
#11 [ffff88178f87db48] ptl_send_buf at ffffffffa086a71e [ptlrpc]
#12 [ffff88178f87dbf8] ptlrpc_send_reply at ffffffffa086ac2f [ptlrpc]
#13 [ffff88178f87dc78] target_send_reply_msg at ffffffffa08425e4 [ptlrpc]
#14 [ffff88178f87dca8] target_send_reply at ffffffffa0842a3e [ptlrpc]
#15 [ffff88178f87dd18] mgs_handle at ffffffffa0c8cd16 [mgs]
#16 [ffff88178f87dda8] ptlrpc_server_handle_request at ffffffffa087bc6c [ptlrpc]
#17 [ffff88178f87de98] ptlrpc_main at ffffffffa087df00 [ptlrpc]
#18 [ffff88178f87df48] kernel_thread at ffffffff8100c20a
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;From scatterlist.h:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt; 55 static inline void sg_assign_page(struct scatterlist *sg, struct page *page)
 56 {
 57         unsigned long page_link = sg-&amp;gt;page_link &amp;amp; 0x3;
 58 
 59         /*
 60          * In order for the low bit stealing approach to work, pages
 61          * must be aligned at a 32-bit boundary as a minimum.
 62          */
 63         BUG_ON((unsigned long) page &amp;amp; 0x03);
 64 #ifdef CONFIG_DEBUG_SG
 65         BUG_ON(sg-&amp;gt;sg_magic != SG_MAGIC);
 66         BUG_ON(sg_is_chain(sg));
 67 #endif
 68         sg-&amp;gt;page_link = page_link | (unsigned long) page;
 69 }
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</description>
                <environment>kernel: 2.6.32-220.23.1.1chaos.ch5.x86_64.debug&lt;br/&gt;
lustre: orion-2_3_49_54_1-55chaos + &lt;a href=&quot;http://review.whamcloud.com/3355&quot;&gt;http://review.whamcloud.com/3355&lt;/a&gt;</environment>
        <key id="15246">LU-2427</key>
            <summary>Hit &quot;kernel BUG&quot; when running on debug kernel during recovery</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="liang">Liang Zhen</assignee>
                                    <reporter username="prakash">Prakash Surya</reporter>
                        <labels>
                    </labels>
                <created>Tue, 17 Jul 2012 12:15:51 +0000</created>
                <updated>Tue, 4 Dec 2012 12:40:33 +0000</updated>
                            <resolved>Tue, 4 Dec 2012 12:40:33 +0000</resolved>
                                                                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                            <comments>
                            <comment id="41936" author="behlendorf" created="Tue, 17 Jul 2012 13:14:17 +0000"  >&lt;p&gt;This appears to be caused by a missing call to sg_init_table() in the o2iblnd .  With the debug kernel CONFIG_DEBUG_SG is set which causes us to check a magic value in the scatterlist struct... a magic value we are not setting.&lt;/p&gt;

&lt;p&gt;This looks like a pretty straight forward issue for someone more familiar with the LNET code.&lt;/p&gt;</comment>
                            <comment id="41944" author="prakash" created="Tue, 17 Jul 2012 17:37:01 +0000"  >&lt;p&gt;See: &lt;a href=&quot;http://review.whamcloud.com/3423&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/3423&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="48735" author="liang" created="Tue, 4 Dec 2012 12:40:33 +0000"  >&lt;p&gt;this patch should have fixed this issue: &lt;a href=&quot;http://review.whamcloud.com/#change,3709&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#change,3709&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                        <issuelink>
            <issuekey id="15427">LU-1714</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzuxdz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>3071</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>