<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:23:47 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-2268] SMP scalability enhancements break FMR pools</title>
                <link>https://jira.whamcloud.com/browse/LU-2268</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;The SMP scalability patch never uses FMR because of the following code from master: once the loop finishes, execution always falls through to line 2299, which frees the FMR pool sets:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;   2286         }
   2287 
   2288         &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; (i = 0; i &amp;lt; ncpts; i++) {
   2289                 cpt = (cpts == NULL) ? i : cpts[i];
   2290                 rc = kiblnd_init_fmr_poolset(net-&amp;gt;ibn_fmr_ps[cpt], cpt, net,
   2291                                              kiblnd_fmr_pool_size(ncpts),
   2292                                              kiblnd_fmr_flush_trigger(ncpts));
   2293                 &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (rc == -ENOSYS &amp;amp;&amp;amp; i == 0) &lt;span class=&quot;code-comment&quot;&gt;/* no FMR */&lt;/span&gt;
   2294                         &lt;span class=&quot;code-keyword&quot;&gt;break&lt;/span&gt;; &lt;span class=&quot;code-comment&quot;&gt;/* create PMR pool */&lt;/span&gt;
   2295                 &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (rc != 0)
   2296                         &lt;span class=&quot;code-keyword&quot;&gt;goto&lt;/span&gt; failed; &lt;span class=&quot;code-comment&quot;&gt;/* a real error */&lt;/span&gt;
   2297         }
   2298 
   2299         cfs_percpt_free(net-&amp;gt;ibn_fmr_ps);
   2300         net-&amp;gt;ibn_fmr_ps = NULL;
   2301 
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I had hoped that just adding the following would be sufficient:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;&lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (rc &amp;gt; 0)
         &lt;span class=&quot;code-keyword&quot;&gt;return&lt;/span&gt; 0; &lt;span class=&quot;code-comment&quot;&gt;/* FMR success */&lt;/span&gt;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;However, when running with that change I see a kernel panic. Right now I have the patch for &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1757&quot; title=&quot;Short I/O support&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1757&quot;&gt;&lt;del&gt;LU-1757&lt;/del&gt;&lt;/a&gt; in the code; I will remove it and test again to make sure it&apos;s not that patch.&lt;/p&gt;</description>
                <environment>RHEL 6.3 with Lustre 2.3</environment>
        <key id="16557">LU-2268</key>
            <summary>SMP scalability enhancements break FMR pools</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="1" iconUrl="https://jira.whamcloud.com/images/icons/priorities/blocker.svg">Blocker</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="liang">Liang Zhen</assignee>
                                    <reporter username="jfilizetti">Jeremy Filizetti</reporter>
                        <labels>
                    </labels>
                <created>Fri, 2 Nov 2012 22:05:37 +0000</created>
                <updated>Fri, 19 Apr 2013 20:47:11 +0000</updated>
                            <resolved>Tue, 27 Nov 2012 21:44:29 +0000</resolved>
                                    <version>Lustre 2.3.0</version>
                                    <fixVersion>Lustre 2.4.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                            <comments>
                            <comment id="47350" author="jfilizetti" created="Sat, 3 Nov 2012 00:14:51 +0000"  >&lt;p&gt;Correction on the fix for FMR:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;&lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (i &amp;gt; 0)
         &lt;span class=&quot;code-keyword&quot;&gt;return&lt;/span&gt; 0; &lt;span class=&quot;code-comment&quot;&gt;/* FMR success */&lt;/span&gt;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
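
&lt;p&gt;To illustrate why the first attempt (rc &amp;gt; 0) cannot work while the corrected check (i &amp;gt; 0) does, here is a minimal standalone sketch of the control flow around lines 2288-2300. It is not the o2iblnd code: every identifier prefixed with sketch_ or SKETCH_, the USED_FMR and USED_PMR return codes, and the guard parameter are assumptions made up for illustration only.&lt;/p&gt;

```c
/* Standalone sketch; every name here is hypothetical, not the o2iblnd API. */

enum { SKETCH_ENOSYS = 38 };            /* matches the -38 in the LNetError lines */
enum { USED_FMR = 1, USED_PMR = 2 };    /* hypothetical result codes */

/* Stand-in for kiblnd_init_fmr_poolset(): returns 0 on success, or
 * -SKETCH_ENOSYS when the device is assumed to have no FMR support. */
static int sketch_init_fmr_poolset(int fmr_supported)
{
        return fmr_supported ? 0 : -SKETCH_ENOSYS;
}

/* Models the loop and the code that follows it. 'guard' selects which
 * check sits between the loop and the code that frees the FMR pool sets:
 * 0 = none (original), 1 = "rc > 0", 2 = "i > 0". */
static int sketch_net_init_pools(int ncpts, int fmr_supported, int guard)
{
        int i;
        int rc = 0;

        for (i = 0; i != ncpts; i++) {
                rc = sketch_init_fmr_poolset(fmr_supported);
                /* nested if mirrors the combined -ENOSYS / i == 0 test */
                if (rc == -SKETCH_ENOSYS) {
                        if (i == 0)
                                break;          /* no FMR, create PMR pool */
                }
                if (rc != 0)
                        return rc;              /* a real error */
        }

        if (guard == 1) {
                if (rc > 0)
                        return USED_FMR;        /* never taken: success is rc == 0 */
        } else if (guard == 2) {
                if (i > 0)
                        return USED_FMR;        /* loop completed for every CPT */
        }

        /* In the real code, cfs_percpt_free(net->ibn_fmr_ps) runs here,
         * and the PMR pools are created afterwards. */
        return USED_PMR;
}
```

&lt;p&gt;With four CPTs and FMR available, sketch_net_init_pools(4, 1, 0) and sketch_net_init_pools(4, 1, 1) both fall through to the PMR path, while sketch_net_init_pools(4, 1, 2) keeps the FMR pool sets. Without FMR support, sketch_net_init_pools(4, 0, 2) still falls back to PMR as intended.&lt;/p&gt;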

&lt;p&gt;One other side effect I failed to mention: since FMR always fails, the code falls back to PMR and calls ib_reg_phys_mr, which is not supported on mlx4 cards.&lt;/p&gt;

&lt;p&gt;LNetError: 3496:0:(o2iblnd.c:1952:kiblnd_pmr_pool_map()) Failed ib_reg_phys_mr: -38&lt;br/&gt;
LNetError: 3496:0:(o2iblnd_cb.c:611:kiblnd_pmr_map_tx()) Failed to create MR by phybuf: -38&lt;/p&gt;

&lt;p&gt;After testing with master I still see a kernel panic. Here is the stack trace:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;BUG: unable to handle kernel NULL pointer dereference at (&lt;span class=&quot;code-keyword&quot;&gt;null&lt;/span&gt;)
IP: [&amp;lt;ffffffffa08dd64f&amp;gt;] kiblnd_map_tx+0x21f/0x550 [ko2iblnd]
PGD 1819af8067 PUD 181f82a067 PMD 0 
Oops: 0002 [#1] SMP 
last sysfs file: /sys/devices/system/cpu/cpu23/cache/index2/shared_cpu_map
CPU 3 
Modules linked in: osp(U) ofd(U) ost(U) mgc(U) fsfilt_ldiskfs(U) exportfs osd_ldiskfs(U) lquota(U) mdd(U) fid(U) fld(U) ksocklnd(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic sha256_generic libcfs(U) ldiskfs(U) autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa mlx4_ib ib_mad ib_core mlx4_en mlx4_core igb sg microcode serio_raw i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma dca i7core_edac edac_core shpchp ext4 mbcache jbd2 sd_mod crc_t10dif ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]

Pid: 15459, comm: ll_ost_io01_002 Not tainted 2.6.32-279.5.1.el6_lustre.gb16fe80.x86_64 #1 SGI.COM C1104-2TY9/X8DTT-IBQF
RIP: 0010:[&amp;lt;ffffffffa08dd64f&amp;gt;]  [&amp;lt;ffffffffa08dd64f&amp;gt;] kiblnd_map_tx+0x21f/0x550 [ko2iblnd]
RSP: 0018:ffff88119f5ef710  EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff88044333c080 RCX: 0000000000000000
RDX: 0000000574cca000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff88119f5ef780 R08: 0000000000000000 R09: 0000000000000000
R10: ffff88044333c080 R11: ffff88182143e940 R12: 0000000000000001
R13: ffff88182151c7c0 R14: ffffc900230ccf48 R15: 0000000000100000
FS:  00007f451c390700(0000) GS:ffff88002c260000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 00000018201a0000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
&lt;span class=&quot;code-object&quot;&gt;Process&lt;/span&gt; ll_ost_io01_002 (pid: 15459, threadinfo ffff88119f5ee000, task ffff88119f5eaae0)
Stack:
 ffff88000006d4a8 0000000000000002 0000001000000100 ffffffff81a945c0
&amp;lt;d&amp;gt; ffff88182143e940 0000010000000002 ffff880000053600 0006125000000001
&amp;lt;d&amp;gt; 0000000000000246 ffff880423589060 0000000000000001 0000000000000000
Call Trace:
 [&amp;lt;ffffffffa08dda91&amp;gt;] kiblnd_setup_rd_kiov+0x111/0x2d0 [ko2iblnd]
 [&amp;lt;ffffffffa08e39a3&amp;gt;] kiblnd_send+0x5b3/0x9f0 [ko2iblnd]
 [&amp;lt;ffffffffa04d7edb&amp;gt;] lnet_ni_send+0x4b/0x110 [lnet]
 [&amp;lt;ffffffffa04dc486&amp;gt;] lnet_send+0x6e6/0xc10 [lnet]
 [&amp;lt;ffffffffa04dcc94&amp;gt;] LNetGet+0x2e4/0x830 [lnet]
 [&amp;lt;ffffffffa0732b21&amp;gt;] ptlrpc_start_bulk_transfer+0x151/0x640 [ptlrpc]
 [&amp;lt;ffffffffa0703dd0&amp;gt;] target_bulk_io+0x180/0x950 [ptlrpc]
 [&amp;lt;ffffffffa042bbe0&amp;gt;] ? cfs_alloc+0x30/0x60 [libcfs]
 [&amp;lt;ffffffffa042b885&amp;gt;] ? cfs_waitq_init+0x15/0x20 [libcfs]
 [&amp;lt;ffffffffa0729926&amp;gt;] ? new_bulk+0x106/0x210 [ptlrpc]
 [&amp;lt;ffffffffa0726618&amp;gt;] ? __ptlrpc_prep_bulk_page+0x68/0x1a0 [ptlrpc]
 [&amp;lt;ffffffffa0b06627&amp;gt;] ost_brw_write+0x1327/0x15d0 [ost]
 [&amp;lt;ffffffffa0738e6c&amp;gt;] ? lustre_msg_get_version+0x8c/0x100 [ptlrpc]
 [&amp;lt;ffffffffa0738fc8&amp;gt;] ? lustre_msg_check_version+0xe8/0x100 [ptlrpc]
 [&amp;lt;ffffffffa0b0bec2&amp;gt;] ost_handle+0x32e2/0x4690 [ost]
 [&amp;lt;ffffffffa042bbe0&amp;gt;] ? cfs_alloc+0x30/0x60 [libcfs]
 [&amp;lt;ffffffffa074015b&amp;gt;] ? ptlrpc_update_export_timer+0x4b/0x470 [ptlrpc]
 [&amp;lt;ffffffffa07485cc&amp;gt;] ptlrpc_server_handle_request+0x41c/0xe00 [ptlrpc]
 [&amp;lt;ffffffffa042b65e&amp;gt;] ? cfs_timer_arm+0xe/0x10 [libcfs]
 [&amp;lt;ffffffffa043d17f&amp;gt;] ? lc_watchdog_touch+0x6f/0x180 [libcfs]
 [&amp;lt;ffffffffa073f999&amp;gt;] ? ptlrpc_wait_event+0xa9/0x2a0 [ptlrpc]
 [&amp;lt;ffffffff810533f3&amp;gt;] ? __wake_up+0x53/0x70
 [&amp;lt;ffffffffa0749bbc&amp;gt;] ptlrpc_main+0xc0c/0x19f0 [ptlrpc]
 [&amp;lt;ffffffffa0748fb0&amp;gt;] ? ptlrpc_main+0x0/0x19f0 [ptlrpc]
 [&amp;lt;ffffffff8100c14a&amp;gt;] child_rip+0xa/0x20
 [&amp;lt;ffffffffa0748fb0&amp;gt;] ? ptlrpc_main+0x0/0x19f0 [ptlrpc]
 [&amp;lt;ffffffffa0748fb0&amp;gt;] ? ptlrpc_main+0x0/0x19f0 [ptlrpc]
 [&amp;lt;ffffffff8100c140&amp;gt;] ? child_rip+0x0/0x20
Code: 8d 0c 40 31 c0 48 c1 e1 02 8b 7c 0b 08 85 ff 74 28 0f 1f 00 49 8b 55 18 48 23 54 0b 0c 4c 63 c0 49 63 fc 41 83 c4 01 49 8d 14 10 &amp;lt;48&amp;gt; 89 14 fe 41 03 45 14 3b 44 0b 08 72 db 41 83 c1 01 44 3b 4b 
RIP  [&amp;lt;ffffffffa08dd64f&amp;gt;] kiblnd_map_tx+0x21f/0x550 [ko2iblnd]
 RSP &amp;lt;ffff88119f5ef710&amp;gt;
CR2: 0000000000000000
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="47352" author="pjones" created="Sat, 3 Nov 2012 01:08:56 +0000"  >&lt;p&gt;Liang&lt;/p&gt;

&lt;p&gt;Could you please comment?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="47392" author="liang" created="Mon, 5 Nov 2012 07:40:30 +0000"  >&lt;p&gt;Hi Jeremy, I think it&apos;s because kiblnd_create_tx_pool() should be called after the FMR pool is created; otherwise it will not allocate tx_pages for kib_tx_t.&lt;br/&gt;
I&apos;ve posted a patch for this: &lt;a href=&quot;http://review.whamcloud.com/#change,4462&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#change,4462&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="48459" author="liang" created="Tue, 27 Nov 2012 21:44:29 +0000"  >&lt;p&gt;patch landed&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvbmn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>5427</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>