<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:03:02 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary, append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-29] obdfilter-survey doesn&apos;t work well if cpu_cores (w/ hyperT) &gt; 16</title>
                <link>https://jira.whamcloud.com/browse/LU-29</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;It seems obdfilter-survey is not working well on a 12-core system (the OSS sees 24 cores when hyper_thread=on).&lt;br/&gt;
Here are quick results with 12, 6, and 8 cores on the same OSSs. For the 6- and 8-core runs, I turned CPUs off with &quot;echo 0 &amp;gt; /sys/devices/system/cpu/cpuX/online&quot; on the 12-core system (X5670, Westmere, 6 cores x 2 sockets).&lt;br/&gt;
Testing with &quot;# of cpu cores &amp;lt;= 16&quot; seems to be no problem, but with 24 cores it does not work well.&lt;br/&gt;
This has been discussed in bug 22980, but there is still no solution for running obdfilter-survey on a current Westmere box.&lt;/p&gt;
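
&lt;p&gt;(For reference, a minimal sketch of offlining cores through sysfs; the CPU numbers below are illustrative, not the exact set used in these tests. Run as root; cpu0 typically cannot be offlined.)&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;#!/bin/sh
# Take CPUs 12-23 offline (e.g. the hyper-thread siblings on a 2x6-core box).
for cpu in $(seq 12 23); do
    echo 0 &amp;gt; /sys/devices/system/cpu/cpu${cpu}/online
done
# Verify which CPUs remain online.
cat /sys/devices/system/cpu/online&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;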

&lt;p&gt;#TEST-1 4xOSSs, 56OSTs(14 OSTs per OSS), 12 cores (# of CPU cores is 24)&lt;br/&gt;
ost 56 sz 469762048K rsz 1024K obj   56 thr   56 write 3323.91 [  39.96,  71.93] read 5967.91 [  94.91, 127.93] &lt;br/&gt;
ost 56 sz 469762048K rsz 1024K obj   56 thr  112 write 5807.10 [  72.93, 120.77] read 6182.79 [  96.91, 140.86] &lt;br/&gt;
ost 56 sz 469762048K rsz 1024K obj   56 thr  224 write 6377.41 [  75.93, 176.83] read 6193.18 [  81.98, 139.86] &lt;br/&gt;
ost 56 sz 469762048K rsz 1024K obj   56 thr  448 write 6279.64 [  69.93, 185.83] read 6162.43 [  77.88, 162.86] &lt;br/&gt;
ost 56 sz 469762048K rsz 1024K obj   56 thr  896 write 6114.28 [   9.99, 226.79] read 6017.08 [  14.98, 220.80] &lt;br/&gt;
ost 56 sz 469762048K rsz 1024K obj   56 thr 1792 write 6078.08 [   8.99, 285.73] read 5923.64 [  16.98, 161.85] &lt;br/&gt;
ost 56 sz 469762048K rsz 1024K obj   56 thr 3584 write 6168.36 [  76.92, 250.75] read 5828.33 [  85.95, 174.77] &lt;/p&gt;


&lt;p&gt;#TEST-2 4xOSSs, 56OSTs(14 OSTs per OSS), 6 cores (# of CPU cores is 12, all physical cpu_id=1 are turned off)&lt;br/&gt;
ost 56 sz 469762048K rsz 1024K obj   56 thr   56 write 3677.43 [  36.97,  75.93] read 8355.91 [ 137.87, 168.85] &lt;br/&gt;
ost 56 sz 469762048K rsz 1024K obj   56 thr  112 write 7045.25 [  89.92, 141.87] read 10672.33 [ 153.87, 212.80] &lt;br/&gt;
ost 56 sz 469762048K rsz 1024K obj   56 thr  224 write 9909.58 [ 116.88, 217.78] read 10235.82 [ 140.87, 203.83] &lt;br/&gt;
ost 56 sz 469762048K rsz 1024K obj   56 thr  448 write 9796.21 [ 106.90, 214.80] read 10803.78 [ 142.87, 348.93] &lt;br/&gt;
ost 56 sz 469762048K rsz 1024K obj   56 thr  896 write 9377.85 [  54.95, 265.75] read 10700.27 [ 126.76, 279.74] &lt;br/&gt;
ost 56 sz 469762048K rsz 1024K obj   56 thr 1792 write 9257.48 [   0.00, 384.63] read 10726.18 [ 121.87, 291.74] &lt;br/&gt;
ost 56 sz 469762048K rsz 1024K obj   56 thr 3584 write 9162.01 [   0.00, 242.78] read 10627.94 [ 115.89, 271.74] &lt;/p&gt;


&lt;p&gt;#TEST-3 4xOSSs, 56OSTs(14 OSTs per OSS), 8 cores (# of CPU cores is 16, core_id={2, 10} from both sockets are turned off)&lt;br/&gt;
ost 56 sz 469762048K rsz 1024K obj   56 thr   56 write 3614.92 [  43.96,  75.93] read 7919.40 [ 122.88, 169.84] &lt;br/&gt;
ost 56 sz 469762048K rsz 1024K obj   56 thr  112 write 6703.91 [  71.94, 135.87] read 9899.53 [ 156.87, 201.81] &lt;br/&gt;
ost 56 sz 469762048K rsz 1024K obj   56 thr  224 write 9901.78 [ 123.88, 233.78] read 10401.05 [ 151.85, 202.81] &lt;br/&gt;
ost 56 sz 469762048K rsz 1024K obj   56 thr  448 write 9721.29 [ 115.89, 212.80] read 10812.26 [ 151.86, 241.54] &lt;br/&gt;
ost 56 sz 469762048K rsz 1024K obj   56 thr  896 write 9330.51 [  94.91, 257.50] read 10672.22 [ 112.90, 342.66] &lt;br/&gt;
ost 56 sz 469762048K rsz 1024K obj   56 thr 1792 write 9053.42 [  22.98, 263.75] read 10657.08 [  95.91, 286.73] &lt;br/&gt;
ost 56 sz 469762048K rsz 1024K obj   56 thr 3584 write 9081.75 [  45.96, 239.57] read 10562.43 [  78.93, 270.75] &lt;/p&gt;</description>
                <environment></environment>
        <key id="10138">LU-29</key>
            <summary>obdfilter-survey doesn&apos;t work well if cpu_cores (w/ hyperT) &gt; 16</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="niu">Niu Yawei</assignee>
                                    <reporter username="ihara">Shuichi Ihara</reporter>
                        <labels>
                    </labels>
                <created>Wed, 22 Dec 2010 06:54:53 +0000</created>
                <updated>Tue, 28 Jun 2011 15:01:37 +0000</updated>
                            <resolved>Mon, 14 Feb 2011 06:06:29 +0000</resolved>
                                    <version>Lustre 1.8.6</version>
                                    <fixVersion>Lustre 1.8.6</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                            <comments>
                            <comment id="10336" author="liang" created="Wed, 22 Dec 2010 07:29:15 +0000"  >&lt;p&gt;Hi Ihara,&lt;/p&gt;

&lt;p&gt;I&apos;m a little confused by this data. I think your box has 2 * 6 cores (seen as 24 cores with Hyper-Threading), right? Could you please give me a simple list of performance data like this:&lt;/p&gt;

&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;Hyper-threading OFF&lt;br/&gt;
  1) 2 cores&lt;br/&gt;
  2) 4 cores&lt;br/&gt;
  3) 6 cores&lt;br/&gt;
  4) 8 cores&lt;br/&gt;
  5) 10 cores&lt;br/&gt;
  6) 12 cores&lt;/li&gt;
&lt;/ul&gt;


&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;Hyper-threading ON&lt;br/&gt;
  1) 4 cores&lt;br/&gt;
  2) 6 cores&lt;br/&gt;
  3) 8 cores&lt;br/&gt;
  4) 12 cores&lt;br/&gt;
  5) 16 cores&lt;br/&gt;
  6) 18 cores&lt;br/&gt;
  7) 20 cores&lt;br/&gt;
  8) 24 cores&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;I don&apos;t need many data samples; just an average value should be good enough. I suspect it could be an issue in the utility. &lt;/p&gt;

&lt;p&gt;Thanks&lt;br/&gt;
Liang&lt;/p&gt;</comment>
                            <comment id="10337" author="ihara" created="Wed, 22 Dec 2010 08:24:59 +0000"  >&lt;p&gt;Liang, &lt;/p&gt;

&lt;p&gt;Yes, I&apos;m testing on an Intel 6-core x 2-socket box.&lt;br/&gt;
I can&apos;t turn Hyper-Threading off because I can&apos;t reboot right now, but I just collected data with 4, 6, 8, 12... cores while HT is enabled. Here are the results.&lt;/p&gt;

&lt;p&gt;Ran obdfilter-survey on single OSS with 14 OSTs. (obj=1, thr=64)&lt;/p&gt;

&lt;p&gt;#core write    read&lt;br/&gt;
 4    2728.94  2672.93&lt;br/&gt;
 6    2679.19  2669.31&lt;br/&gt;
 8    2677.86  2663.86&lt;br/&gt;
12    2658.00  2660.31&lt;br/&gt;
16    2633.77  2650.97&lt;br/&gt;
18    2626.06  2653.20&lt;br/&gt;
20    2618.16  2649.80&lt;br/&gt;
22    2586.50  2620.15&lt;br/&gt;
24    1685.99  1575.12&lt;/p&gt;

&lt;p&gt;The numbers drop rapidly only at 24 cores.&lt;br/&gt;
Let me try to run the same test with HT=off later.&lt;/p&gt;</comment>
                            <comment id="10338" author="ihara" created="Wed, 22 Dec 2010 18:03:38 +0000"  >
&lt;p&gt;Here are the same test results on the same box, but with HT=off.&lt;/p&gt;

&lt;p&gt;#core write    read&lt;br/&gt;
 2    3019.27  2798.00&lt;br/&gt;
 4    2983.61  2754.41&lt;br/&gt;
 6    2914.59  2748.15&lt;br/&gt;
 8    2897.25  2731.10&lt;br/&gt;
10    2877.80  2724.94&lt;br/&gt;
12    2896.62  2711.43&lt;/p&gt;

&lt;p&gt;No big changes with the number of cores, but another interesting thing is that the write numbers are better than the results with HT=on. Does HT basically harm Lustre performance?&lt;/p&gt;</comment>
                            <comment id="10344" author="liang" created="Thu, 23 Dec 2010 00:47:37 +0000"  >&lt;p&gt;Ihara, thanks for this data. Yes, I think hyper-threading will not help Lustre performance (on the server side), at least in any of the current releases.&lt;/p&gt;

&lt;p&gt;I actually have a stupid question: I assume you disabled those cores symmetrically, right? (i.e., for the 8-core tests, 2 cores disabled on the first socket and 2 on the second socket)&lt;/p&gt;
</comment>
                            <comment id="10345" author="liang" created="Thu, 23 Dec 2010 01:59:39 +0000"  >&lt;p&gt;reassign to Niu for the next step survey&lt;/p&gt;</comment>
                            <comment id="10347" author="niu" created="Fri, 24 Dec 2010 00:30:42 +0000"  >&lt;p&gt;obdfilter-survey calls &apos;lctl test_brw&apos;, which issues an ioctl to the kernel; however, ioctl needs the BKL! Though we release the BKL in echo_client_iocontrol() before I/O starts (and reacquire it after I/O is done), the overhead of lock contention could be huge in our test scenario (dozens of cores and hundreds of processes).&lt;/p&gt;

&lt;p&gt;I think we&apos;d better support &apos;unlocked_ioctl&apos; for the Lustre file_operations, then move all the performance-sensitive ioctls into &apos;unlocked_ioctl&apos;, OBD_IOC_BRW_READ/WRITE for example.&lt;/p&gt;
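
&lt;p&gt;As a rough sketch of the idea (illustrative only, not the actual patch; the helper names here are made up): register the handler through the BKL-free entry point when the kernel provides one.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;/* .ioctl is entered with the Big Kernel Lock held; .unlocked_ioctl is not.
 * Note the different signatures: unlocked_ioctl takes no inode argument. */
static long obd_class_unlocked_ioctl(struct file *filp, unsigned int cmd,
                                     unsigned long arg)
{
        /* dispatch OBD_IOC_BRW_READ/WRITE etc. without holding the BKL */
        return obd_class_do_ioctl(filp, cmd, arg);   /* hypothetical helper */
}

static struct file_operations obd_psdev_fops = {
        .owner          = THIS_MODULE,
#ifdef HAVE_UNLOCKED_IOCTL
        .unlocked_ioctl = obd_class_unlocked_ioctl,
#else
        .ioctl          = obd_class_ioctl,           /* legacy BKL path */
#endif
};&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;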

&lt;p&gt;Hi, Shuichi&lt;/p&gt;

&lt;p&gt;I&apos;ll make a patch as per my analysis above. If it&apos;s handy for you, could you collect some statistics with oprofile (or lockmeter, even better) to confirm my analysis? Thank you. &lt;/p&gt;</comment>
                            <comment id="10348" author="ihara" created="Fri, 24 Dec 2010 03:29:58 +0000"  >&lt;p&gt;Yes, for the 8-core testing, I killed two cores from each socket.&lt;br/&gt;
Niu, I&apos;m happy to test your patches on our test box; please let me know what profile you want.&lt;/p&gt;

&lt;p&gt;Ihara&lt;/p&gt;</comment>
                            <comment id="10362" author="niu" created="Wed, 29 Dec 2010 18:39:10 +0000"  >&lt;p&gt;Adding &quot;unlocked_ioctl&quot; for performance-sensitive ioctls, such as &quot;OBD_IOC_BRW_READ/WRITE&quot;&lt;/p&gt;</comment>
                            <comment id="10363" author="niu" created="Wed, 29 Dec 2010 18:45:34 +0000"  >&lt;p&gt;Hi, Ihara&lt;/p&gt;

&lt;p&gt;Sorry for the late response; I took a few days off for some personal issues.&lt;/p&gt;

&lt;p&gt;I&apos;ve made a patch which tries to resolve this problem; it&apos;s available at &lt;a href=&quot;http://review.whamcloud.com/163&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/163&lt;/a&gt; (I also attached it here for your convenience). Please try this patch. &lt;/p&gt;</comment>
                            <comment id="10374" author="ihara" created="Thu, 30 Dec 2010 05:54:35 +0000"  >&lt;p&gt;Niu, thanks for the patch. Let me try it at the end of next week, due to the New Year holiday.&lt;br/&gt;
I will let you know whether it works well or not. &lt;/p&gt;</comment>
                            <comment id="10378" author="niu" created="Thu, 30 Dec 2010 21:35:56 +0000"  >&lt;p&gt;The original patch had a defect; I&apos;ve updated it with a new one.&lt;/p&gt;</comment>
                            <comment id="10386" author="ihara" created="Thu, 6 Jan 2011 08:09:52 +0000"  >&lt;p&gt;Niu, &lt;/p&gt;

&lt;p&gt;I just tested your latest patch, but the obdfilter-survey result is still low on 24 cores. Here are the results.&lt;/p&gt;

&lt;p&gt;12 cores (HT=disabled)&lt;br/&gt;
ost 56 sz 469762048K rsz 1024K obj   56 thr  896 write 9871.99 [  85.75, 229.56] read 10802.02 [ 125.88, 309.74] &lt;/p&gt;

&lt;p&gt;24 cores (HT=enabled)&lt;br/&gt;
ost 56 sz 469762048K rsz 1024K obj   56 thr  896 write 6076.08 [  21.98, 557.93] read 5614.03 [  12.98, 748.07]&lt;/p&gt;</comment>
                            <comment id="10388" author="ihara" created="Thu, 6 Jan 2011 14:33:17 +0000"  >&lt;p&gt;btw, I&apos;ve been testing this on lustre-1.8.4. So, I did some code adjustments from &lt;a href=&quot;http://review.whamcloud.com/163&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/163&lt;/a&gt; for lustre-1.8.&lt;/p&gt;

&lt;p&gt;Ihara&lt;/p&gt;</comment>
                            <comment id="10389" author="ihara" created="Thu, 6 Jan 2011 17:12:30 +0000"  >&lt;p&gt;adjusted patch for 1.8.x&lt;/p&gt;</comment>
                            <comment id="10391" author="niu" created="Thu, 6 Jan 2011 18:46:25 +0000"  >&lt;p&gt;Thank you, Ihara.&lt;/p&gt;

&lt;p&gt;Could you run a full test and post all the output (like what you did in the first comment) to see if there are any differences?&lt;/p&gt;

&lt;p&gt;I suspect there is some other contention dragging down the performance; could you use oprofile to collect some data while running the test?&lt;/p&gt;

&lt;p&gt;btw, what&apos;s the kernel version?&lt;/p&gt;</comment>
                            <comment id="10400" author="adilger" created="Fri, 7 Jan 2011 16:59:55 +0000"  >&lt;p&gt;See also &lt;a href=&quot;https://bugzilla.lustre.org/show_bug.cgi?id=22980#c18&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://bugzilla.lustre.org/show_bug.cgi?id=22980#c18&lt;/a&gt; for a similar issue.  I suspect that the performance bottleneck may be in userspace, but we can only find out with some oprofile and/or lockmeter data.&lt;/p&gt;</comment>
                            <comment id="10401" author="ihara" created="Fri, 7 Jan 2011 17:24:21 +0000"  >&lt;p&gt;Niu, sorry, it looked like something was wrong on the storage side when I ran the benchmark yesterday. Once I fixed the storage, I tried obdfilter-survey with your patches applied. The patches seem to fix the problem on the 24-core system, with numbers getting close to those with HT=off. Here are results on 12 cores (HT=off) and 24 cores (HT=on).&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# 12 cores (HT=off), 4 OSSs, 56 OSTs (14OSTs per OSS)
ost 56 sz 469762048K rsz 1024K obj   56 thr   56 write 3546.88 [  37.96,  70.86] read 7633.11 [ 124.88, 156.85] 
ost 56 sz 469762048K rsz 1024K obj   56 thr  112 write 6420.31 [  91.91, 130.75] read 10121.79 [ 159.70, 202.60] 
ost 56 sz 469762048K rsz 1024K obj   56 thr  224 write 9576.76 [ 125.84, 216.80] read 10444.91 [ 167.84, 216.79] 
ost 56 sz 469762048K rsz 1024K obj   56 thr  448 write 10264.63 [  98.95, 207.61] read 10972.26 [ 150.68, 232.78] 
ost 56 sz 469762048K rsz 1024K obj   56 thr  896 write 9842.69 [  91.91, 305.69] read 10896.16 [ 121.89, 330.57] 
ost 56 sz 469762048K rsz 1024K obj   56 thr 1792 write 9613.51 [  28.96, 251.70] read 10792.37 [ 123.88, 277.50] 
ost 56 sz 469762048K rsz 1024K obj   56 thr 3584 write 9597.46 [   0.00, 253.78] read 10698.87 [ 118.89, 271.75] 

# 24 cores (HT=on), 4 OSSs, 56 OSTs (14OSTs per OSS)
ost 56 sz 469762048K rsz 1024K obj   56 thr   56 write 3345.48 [  42.96,  66.94] read 6981.70 [ 102.91, 153.86] 
ost 56 sz 469762048K rsz 1024K obj   56 thr  112 write 6327.40 [  88.92, 128.89] read 9826.28 [ 156.85, 208.80] 
ost 56 sz 469762048K rsz 1024K obj   56 thr  224 write 9792.45 [ 139.87, 218.77] read 10409.23 [ 173.84, 303.70] 
ost 56 sz 469762048K rsz 1024K obj   56 thr  448 write 10262.20 [ 106.90, 235.78] read 10903.93 [ 157.86, 253.79] 
ost 56 sz 469762048K rsz 1024K obj   56 thr  896 write 9905.94 [  98.91, 233.78] read 10829.35 [ 127.88, 266.75] 
ost 56 sz 469762048K rsz 1024K obj   56 thr 1792 write 9656.78 [   6.99, 251.79] read 10761.36 [ 115.89, 333.68] 
ost 56 sz 469762048K rsz 1024K obj   56 thr 3584 write 9596.28 [   0.00, 261.76] read 10742.13 [ 119.89, 324.68] 

&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="10403" author="niu" created="Fri, 7 Jan 2011 19:40:07 +0000"  >&lt;p&gt;Thanks for the good news, Ihara. Looks like the patch works as we expected. &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/smile.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/p&gt;

&lt;p&gt;Hi, Andreas&lt;/p&gt;

&lt;p&gt;The user-space semaphore used to protect the shmem is another contention source; however, it doesn&apos;t look as severe as the BKL on each ioctl. Should we post the patch to bug 22890 to see if it resolves the problem? &lt;/p&gt;

&lt;p&gt;BTW, I thought there aren&apos;t any Lustre ioctls that depend on the BKL, and that it&apos;s safe to introduce &apos;unlocked_ioctl&apos;. Could you confirm that?  &lt;/p&gt;</comment>
                            <comment id="10446" author="pjones" created="Wed, 19 Jan 2011 11:13:19 +0000"  >&lt;p&gt;As per Andreas, you probably mean bz 22980, rather than 22890. Yes please, can you attach your patch to the bz - thanks!&lt;/p&gt;</comment>
                            <comment id="10447" author="adilger" created="Wed, 19 Jan 2011 13:36:18 +0000"  >&lt;p&gt;I don&apos;t think there are any ioctls that depend on BKL, but I haven&apos;t looked through them closely.  In particular, I&apos;m not sure if there is proper serialization around the configuration ioctls or not.&lt;/p&gt;

&lt;p&gt;That said, since the configuration is almost always done by mount/unmount and not by the old lctl commands, I don&apos;t think this will be a serious risk, so I think it makes sense to move the Lustre ioctl handling over to -&amp;gt;unlocked_ioctl().  That should be done only for kernels which support the -&amp;gt;unlocked_ioctl() method, which means a configure check is needed to set HAVE_UNLOCKED_IOCTL if that method is present in struct file_operations.&lt;/p&gt;</comment>
                            <comment id="10448" author="niu" created="Wed, 19 Jan 2011 18:18:57 +0000"  >&lt;p&gt;Yes, I meant 22980, thanks Peter.&lt;/p&gt;

&lt;p&gt;Andreas, the HAVE_UNLOCKED_IOCTL is defined by the kernel which has the &apos;unlocked_ioctl&apos; method.&lt;/p&gt;</comment>
                            <comment id="10488" author="ihara" created="Tue, 25 Jan 2011 08:19:14 +0000"  >&lt;p&gt;Niu, I&apos;m investigating a test infrastructure on VMs (KVM: Kernel-based Virtual Machine). Once I apply your patch and run obdfilter-survey on a VM, the performance goes bad. Without the patch, I get reasonable numbers even on VMs. So the patch seems to have some impact when I run obdfilter-survey on a VM.&lt;/p&gt;

&lt;p&gt;I will file results and more information (I will get oprofile data on the VM) in a couple of days.&lt;/p&gt;

&lt;p&gt;Ihara&lt;/p&gt;</comment>
                            <comment id="10493" author="niu" created="Tue, 25 Jan 2011 21:00:10 +0000"  >&lt;p&gt;Ihara, that&apos;s interesting. I don&apos;t have any ideas on that so far; let&apos;s see what happened once your results come out. Thanks.&lt;/p&gt;</comment>
                            <comment id="10540" author="ihara" created="Mon, 7 Feb 2011 05:07:58 +0000"  >&lt;p&gt;Niu, it was my fault, sorry. The problem was NOT caused by these patches; it came from the CPU affinity settings on the VM. Maybe many context switches happened?&lt;br/&gt;
Anyway, once I set the correct CPU affinity on the VM, the patches work as well as on the 24-core system.&lt;/p&gt;

&lt;p&gt;Thanks!&lt;/p&gt;</comment>
                            <comment id="10564" author="adilger" created="Wed, 9 Feb 2011 00:40:53 +0000"  >&lt;p&gt;I&apos;m currently unable to post to bugzilla...&lt;/p&gt;

&lt;p&gt;Inspection template(s):&lt;br/&gt;
Bug:       22980 &lt;br/&gt;
Developer: niu@whamcloud.com &lt;br/&gt;
Size:      7 Lines of Change &lt;br/&gt;
Date:      2011-2-8&lt;br/&gt;
Defects:   1&lt;br/&gt;
Type:      CODE&lt;br/&gt;
Inspector: adilger@whamcloud.com&lt;/p&gt;

&lt;p&gt;--------------&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;&amp;gt;@@ -1681,24 +1680,15 @@ int jt_obd_test_brw(int argc, char **argv)
&amp;gt;                 } else if (be_verbose(verbose, &amp;amp;next_time,i, &amp;amp;next_count,count)) {
&amp;gt;-                        shmem_lock ();
&amp;gt;                         printf(&quot;%s: %s number %d @ &quot;LPD64&quot;:&quot;LPU64&quot; for %d\n&quot;,
&amp;gt;                                jt_cmdname(argv[0]), write ? &quot;write&quot; : &quot;read&quot;, i,
&amp;gt;                                data.ioc_obdo1.o_id, data.ioc_offset,
&amp;gt;                                (int)(pages * getpagesize()));
&amp;gt;-                        shmem_unlock ();&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I would be surprised if the locking here affects the performance.  be_verbose()&lt;br/&gt;
should be true at most every few seconds, and otherwise the shmem_lock/unlock()&lt;br/&gt;
is never hit.  I think this was put in place to avoid all of the printf()&lt;br/&gt;
statements from overlapping, which ruins the whole result from the test.  If&lt;br/&gt;
there actually IS overhead from this locking, it just means that the message&lt;br/&gt;
rate is too high and needs to be reduced.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;&amp;gt;@@ -1622,20 +1622,19 @@ int jt_obd_test_brw(int argc, char **argv)
&amp;gt; 
&amp;gt; #ifdef MAX_THREADS
&amp;gt;         if (thread) {
&amp;gt;-                shmem_lock ();
&amp;gt;                 if (nthr_per_obj != 0) {
&amp;gt;                         /* threads interleave */
&amp;gt;                         obj_idx = (thread - 1)/nthr_per_obj;
&amp;gt;                         objid += obj_idx;
&amp;gt;                         stride *= nthr_per_obj;
&amp;gt;-                        if ((thread - 1) % nthr_per_obj == 0)
&amp;gt;-                                shared_data-&amp;gt;offsets[obj_idx] = stride + thr_offset;
&amp;gt;                         thr_offset += ((thread - 1) % nthr_per_obj) * len;
&amp;gt;                 } else {
&amp;gt;                         /* threads disjoint */
&amp;gt;                         thr_offset += (thread - 1) * len;
&amp;gt;                 }
&amp;gt; 
&amp;gt;+                shmem_lock ();
&amp;gt;+
&amp;gt;                 shared_data-&amp;gt;barrier--;
&amp;gt;                 if (shared_data-&amp;gt;barrier == 0)
&amp;gt;                         l_cond_broadcast(&amp;amp;shared_data-&amp;gt;cond);
&amp;gt;                 if (!repeat_offset) {
&amp;gt; #ifdef MAX_THREADS
&amp;gt;-                        if (stride == len) {
&amp;gt;-                                data.ioc_offset += stride;
&amp;gt;-                        } else if (i &amp;lt; count) {
&amp;gt;-                                shmem_lock ();
&amp;gt;-                                data.ioc_offset = shared_data-&amp;gt;offsets[obj_idx];
&amp;gt;-                                shared_data-&amp;gt;offsets[obj_idx] += len;
&amp;gt;-                                shmem_unlock ();
&amp;gt;-                        }
&amp;gt;+                        data.ioc_offset += stride;&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;


&lt;p&gt;(defect) I don&apos;t think this is going to result in the same test load at all. It means that only a &quot;len/stride&quot; fraction of each object is written, and in fact it looks like there will be holes in every object, because the common data-&amp;gt;ioc_offset is being incremented by every thread in a racy manner, so the offset will grow too quickly.&lt;/p&gt;

&lt;p&gt;What about adding per-object shmem_locks to protect the offsets[] values? That would avoid most of the contention on this lock, if that is the overhead. However, as previously stated, I think it is best to get some real data (e.g. oprofile for the kernel and gprof for userspace, collected over two different test runs to avoid too much overhead).&lt;/p&gt;</comment>
                            <comment id="10566" author="niu" created="Wed, 9 Feb 2011 02:08:07 +0000"  >&lt;p&gt;Thank you, Andreas. &lt;/p&gt;

&lt;p&gt;Yes, I agree with you that we&apos;d better collect some oprofile data as a next step; we will have a meeting with Bull&apos;s people tonight about b22980.&lt;/p&gt;

&lt;p&gt;The test result shows that the shmem lock isn&apos;t a major factor, so I think we can put it aside for a while. One thing that confuses me is that the &apos;unlocked_ioctl&apos; patch works for Ihara&apos;s test, but doesn&apos;t work well for b22980. Liang discussed this issue with me yesterday, and we identified three differences between the two tests:&lt;/p&gt;

&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;Ihara&apos;s test is against 1.8, b22980 is against 2.0; (I checked the obdecho code; there seems to be no major difference between 1.8 and 2.0 in &apos;case=disk&apos; mode)&lt;/li&gt;
&lt;/ul&gt;


&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;With/without patch applied, Ihara compared 12 cores with 24 cores, b22980 compared 8 cores with 32 cores; (Will ask Bull&apos;s people to do more tests with patch applied, 16 cores, 24 cores...)&lt;/li&gt;
&lt;/ul&gt;


&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;This test is running on SMP (Ihara, please correct me if I&apos;m wrong), but b22980 is running on NUMA; (Liang mentioned that without CPU affinity, the performance degradation could be huge on a NUMA architecture; we will ask them to do more tests to see if it&apos;s NUMA-dependent).&lt;/li&gt;
&lt;/ul&gt;
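
&lt;p&gt;(As a quick, illustrative way to test the affinity theory: compare a survey run pinned to one NUMA node against an unpinned run; the invocation below is a sketch, not a prescribed procedure.)&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# Show the NUMA topology (nodes, per-node CPUs and memory).
numactl --hardware

# Pin both CPUs and memory of the survey to node 0, then repeat
# unpinned and compare the throughput numbers.
numactl --cpunodebind=0 --membind=0 sh obdfilter-survey&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;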


&lt;p&gt;We will talk to Bull&apos;s people about what we found, and ask them to supply some oprofile data in the next test. If you have any comments, please let me know.&lt;/p&gt;</comment>
                            <comment id="10568" author="liang" created="Wed, 9 Feb 2011 03:34:31 +0000"  >&lt;p&gt;As Niu said, the major difference between this issue and b22980 is that b22980 is running on a NUMIOA system.&lt;br/&gt;
Diego mentioned (on b22980) that a Lustre client can drive the OSS harder than obdfilter-survey. I think it could be because the IO threads of the OST are NUMA-affine (ptlrpc_service::srv_cpu_affinity), but the lctl threads of obdfilter-survey don&apos;t have any affinity, so they probably generate much more cross-node traffic.&lt;/p&gt;

&lt;p&gt;I hope Bull can collect some data for us (i.e., if each NUMA node has 8 cores):&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;only enable 1/2/3/4 NUMA nodes and run obdfilter-survey&lt;/li&gt;
	&lt;li&gt;enable 8/16/24/32 cores, but with these cores distributed across different NUMA nodes&lt;br/&gt;
If we get quite different performance, then I think our assumption is correct; otherwise we are pointing at the wrong place. Of course, we should collect oprofiles while running these tests.&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="10586" author="pjones" created="Wed, 9 Feb 2011 10:50:31 +0000"  >&lt;p&gt;Am I right in thinking that the NUMIOA issue is being tracked under &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-66&quot; title=&quot;obdfilter-survey performance issue on NUMA system&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-66&quot;&gt;&lt;del&gt;LU-66&lt;/del&gt;&lt;/a&gt; and this ticket can be marked as resolved? Bugzilla seems to be available again btw.&lt;/p&gt;</comment>
                            <comment id="10592" author="ihara" created="Wed, 9 Feb 2011 14:46:07 +0000"  >&lt;p&gt;I believe so. In my case, obdfilter-survey didn&apos;t work well on a 24-core system (actually it was 12 cores, but 24 cores with HT=on), and this is NOT a NUMIOA platform.&lt;br/&gt;
I confirmed this problem was fixed by Niu&apos;s patches here, and obdfilter-survey worked well with 24 cores on a non-NUMIOA platform.&lt;/p&gt;

&lt;p&gt;btw, I also have a NUMIOA system, but it is currently used with VMs, splitting the CPU and IO chips via KVM (Kernel-based Virtual Machine). However, I&apos;m really interested in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-66&quot; title=&quot;obdfilter-survey performance issue on NUMA system&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-66&quot;&gt;&lt;del&gt;LU-66&lt;/del&gt;&lt;/a&gt;. I will keep an eye on this and want to test patches on my test system when they are available.&lt;/p&gt;

&lt;p&gt;Thanks again!&lt;/p&gt;
</comment>
                            <comment id="10636" author="pjones" created="Mon, 14 Feb 2011 06:06:29 +0000"  >&lt;p&gt;Marking as resolved, as the fix landed upstream for Oracle Lustre 1.8.6&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                            <attachment id="10081" name="LU-29.patch" size="4125" author="niu" created="Thu, 30 Dec 2010 21:33:57 +0000"/>
                            <attachment id="10082" name="bug22980-for-1.8.x.patch" size="3594" author="ihara" created="Thu, 6 Jan 2011 17:12:30 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                    <customfield id="customfield_10020" key="com.atlassian.jira.plugin.system.customfieldtypes:float">
                        <customfieldname>Bugzilla ID</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>22980.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvsof:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>8550</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>