<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:56:53 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-6063] conf-sanity test_76a fails on RHEL7, SLES12</title>
                <link>https://jira.whamcloud.com/browse/LU-6063</link>
                <project id="10000" key="LU">Lustre</project>
<description>&lt;p&gt;conf-sanity test_76a fails every time on any el7 client, as far as I can tell.  This test attempts to verify permanent param changes made with &apos;lctl set_param -P&apos;.  That mechanism doesn&apos;t seem to work at all when the client is el7.&lt;/p&gt;

&lt;p&gt;I can reproduce the problem manually by mounting a lustre filesystem, observing the &apos;max_dirty_mb&apos; param on the client with &apos;lctl get_param osc.&amp;#42;.max_dirty_mb&apos;, then altering that param by executing &apos;lctl set_param -P osc.*.max_dirty_mb=64&apos; from the command line on the mgs.   If I have the lustre filesystem mounted on both an el6 and an el7 client, I see the change from 32 (the default) up to 64 in the get_param output on the el6 client after a few seconds.   The value never changes on the el7 client; it stays at the default of 32 indefinitely.&lt;/p&gt;

&lt;p&gt;The fact that the change can be observed on an el6 client indicates the change on the mgs is really happening and eventually reaches the el6 client, but it is somehow never reflected on the el7 client.&lt;/p&gt;

&lt;p&gt;There must be some significant difference on el7 causing the failure there, but I&apos;m at a loss to explain it.  I think I need a higher-level expert to help with this problem.  Without a solution I don&apos;t think we will ever get a 100% test run on an el7 client.&lt;/p&gt;</description>
                <environment>el7 client, sles12 client</environment>
        <key id="28001">LU-6063</key>
            <summary>conf-sanity test_76a fails on RHEL7, SLES12</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="bogl">Bob Glossman</assignee>
                                    <reporter username="bogl">Bob Glossman</reporter>
                        <labels>
                            <label>HB</label>
                    </labels>
                <created>Sun, 21 Dec 2014 22:51:10 +0000</created>
                <updated>Sun, 1 Apr 2018 03:22:25 +0000</updated>
                            <resolved>Sun, 8 Feb 2015 04:50:56 +0000</resolved>
                                    <version>Lustre 2.7.0</version>
                                    <fixVersion>Lustre 2.7.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>7</watches>
                                                                            <comments>
                            <comment id="102173" author="pjones" created="Mon, 22 Dec 2014 15:00:47 +0000"  >&lt;p&gt;Mike&lt;/p&gt;

&lt;p&gt;Could you please look into this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="102176" author="bogl" created="Mon, 22 Dec 2014 15:20:20 +0000"  >&lt;p&gt;This problem seems not to be exclusive to el7.  I see similar behavior on sles12 clients.&lt;/p&gt;

&lt;p&gt;btw, all cases where I can reproduce the problem are with el6 servers.&lt;/p&gt;</comment>
                            <comment id="103631" author="adilger" created="Thu, 15 Jan 2015 18:43:34 +0000"  >&lt;p&gt;James S., any ideas on this?  I&apos;d guess that the RHEL7 and SLES12 kernels are using a new /proc implementation, and this isn&apos;t working properly with the MGS/MGC-driven tunables?&lt;/p&gt;</comment>
                            <comment id="103637" author="simmonsja" created="Thu, 15 Jan 2015 18:52:06 +0000"  >&lt;p&gt;Actually no one is using the old proc handling methods. It all has been ported over to seq_file. I can take a look at why it is failing.&lt;/p&gt;</comment>
                            <comment id="103765" author="simmonsja" created="Fri, 16 Jan 2015 18:33:24 +0000"  >&lt;p&gt;Bob does the server back end need to be RHEL7 or does this problem show up with just upgraded clients?&lt;/p&gt;</comment>
                            <comment id="103773" author="bogl" created="Fri, 16 Jan 2015 19:26:23 +0000"  >&lt;p&gt;James, see previous comment:&lt;/p&gt;

&lt;p&gt;&quot;btw, all cases where I can reproduce the problem are with el6 servers.&quot;&lt;/p&gt;</comment>
                            <comment id="103863" author="bogl" created="Mon, 19 Jan 2015 17:03:11 +0000"  >&lt;p&gt;Randomly casting around for things RHEL 7 and SLES 12 have in common, I notice that they both have systemd while older versions don&apos;t.  Not saying this has anything to do with anything, but it is a significant difference in runtime environment.&lt;/p&gt;</comment>
                            <comment id="104120" author="simmonsja" created="Tue, 20 Jan 2015 23:28:37 +0000"  >&lt;p&gt;Looked more closely at this problem and it reminds me of when ORNL encountered &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1014&quot; title=&quot;MGS with sys.timeout is ignored and if sys.timeout is changed its not synced across the file system.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1014&quot;&gt;&lt;del&gt;LU-1014&lt;/del&gt;&lt;/a&gt;. It&apos;s a case of class_process_config either not working or not being called. Due to travel I might not get to it this week. I will see if I can duplicate the problem as soon as I can.&lt;/p&gt;</comment>
                            <comment id="104771" author="simmonsja" created="Mon, 26 Jan 2015 22:40:17 +0000"  >&lt;p&gt;Just got back today. I can easily reproduce the problem. For that test have you tried to see if the obdfilter.*.client_cache_count is also broken on either RHEL6.6 or RHEL7 servers?&lt;/p&gt;</comment>
                            <comment id="104778" author="bogl" created="Mon, 26 Jan 2015 22:49:25 +0000"  >&lt;p&gt;No, I haven&apos;t looked at anything beyond the first failure related to max_dirty_mb.  Running the test as-is, it never gets beyond that.  In trying to reproduce the failure manually I focused on max_dirty_mb only.&lt;/p&gt;</comment>
                            <comment id="105491" author="adilger" created="Tue, 3 Feb 2015 12:02:12 +0000"  >&lt;p&gt;James, any suggestions on how to &quot;re-hook&quot; the /proc entries to the handler functions?&lt;/p&gt;</comment>
                            <comment id="105756" author="simmonsja" created="Wed, 4 Feb 2015 23:39:56 +0000"  >&lt;p&gt;Examining the logs I see the MGS is doing the right thing and sending the llog changes to the client. I&apos;m thinking the bug is in the class_config_llog_handler code.&lt;/p&gt;</comment>
                            <comment id="106036" author="simmonsja" created="Fri, 6 Feb 2015 16:24:50 +0000"  >&lt;p&gt;I finished examining the logs and have determined that the client side is doing the right thing. Once the client receives the packet so it can sync its llog with the MGS, it does an upcall to lctl using the call_usermodehelper code. For some reason lctl fails to update the proc parameters. IMNSHO calling a userland utility from the kernel to change proc entries is ugly. I will see if any changes have happened to the usermodehelper api.&lt;/p&gt;</comment>
                            <comment id="106047" author="simmonsja" created="Fri, 6 Feb 2015 17:11:34 +0000"  >&lt;p&gt;Bob have you had any problems with the up call functionality on the MDS with RHEL7 testing? Looking at the source it seems that call_usermodehelper passes the right flag.&lt;/p&gt;</comment>
                            <comment id="106048" author="bogl" created="Fri, 6 Feb 2015 17:16:56 +0000"  >&lt;p&gt;James, I haven&apos;t noticed any problems with upcalls (besides possibly this one), but I haven&apos;t been looking carefully.  Doesn&apos;t extended group membership use it some?  I think there are some sanity tests for that.&lt;/p&gt;</comment>
                            <comment id="106049" author="bogl" created="Fri, 6 Feb 2015 17:26:20 +0000"  >&lt;p&gt;Not sure how this maps to upcall problems on the MDS or MGS.  The problem is seen with an el6 MDS; el7 (or sles12) is only on the clients.&lt;/p&gt;</comment>
                            <comment id="106050" author="gerrit" created="Fri, 6 Feb 2015 17:43:37 +0000"  >&lt;p&gt;James Simmons (uja.ornl@gmail.com) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/13677&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/13677&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6063&quot; title=&quot;conf-sanity test_76a fails on RHEL7, SLES12&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6063&quot;&gt;&lt;del&gt;LU-6063&lt;/del&gt;&lt;/a&gt; kernel: use proper flags for call_usermodehelper&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 15139bcaefc1e3d222b86ed6077eea89eee1136c&lt;/p&gt;</comment>
                            <comment id="106114" author="bogl" created="Fri, 6 Feb 2015 21:28:09 +0000"  >&lt;p&gt;If I&apos;m understanding the commit header, this problem is due to the fact that UMH_WAIT_PROC was 1 in el6, but is 2 in el7 and later.  If we had used the #define&apos;d name it would have been right in all builds, but using a literal number instead made it wrong in newer kernels.&lt;/p&gt;</comment>
                            <comment id="106120" author="simmonsja" created="Fri, 6 Feb 2015 21:47:07 +0000"  >&lt;p&gt;Correct. Also the logic for UMH_WAIT_PROC and UMH_NO_WAIT was the same at one time. See &lt;a href=&quot;https://lkml.org/lkml/2010/3/9/368&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://lkml.org/lkml/2010/3/9/368&lt;/a&gt;.&lt;/p&gt;</comment>
                            <comment id="106122" author="bogl" created="Fri, 6 Feb 2015 21:54:32 +0000"  >&lt;p&gt;Verified the mod does indeed fix the problem, at least for el7 clients.  The problem can no longer be reproduced either by manual command line commands or by conf-sanity, test 76a.&lt;/p&gt;

&lt;p&gt;Good call, James!&lt;/p&gt;</comment>
                            <comment id="106148" author="adilger" created="Sat, 7 Feb 2015 05:32:57 +0000"  >&lt;p&gt;James, the details of the current &lt;tt&gt;lctl set_param -P&lt;/tt&gt; implementation are in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2629&quot; title=&quot;set_param and conf_param have different syntaxes&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2629&quot;&gt;&lt;del&gt;LU-2629&lt;/del&gt;&lt;/a&gt; if you are interested. It isn&apos;t really a performance-critical operation, but like anything there is probably room for improvement.&lt;/p&gt;</comment>
                            <comment id="106165" author="gerrit" created="Sun, 8 Feb 2015 02:14:31 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) merged in patch &lt;a href=&quot;http://review.whamcloud.com/13677/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/13677/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6063&quot; title=&quot;conf-sanity test_76a fails on RHEL7, SLES12&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6063&quot;&gt;&lt;del&gt;LU-6063&lt;/del&gt;&lt;/a&gt; kernel: use proper flags for call_usermodehelper&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 8febfe0e30c5febdf716e4591c355199de4a6ab8&lt;/p&gt;</comment>
                            <comment id="106182" author="pjones" created="Sun, 8 Feb 2015 04:50:56 +0000"  >&lt;p&gt;Landed for 2.7&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="28198">LU-6123</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="27972">LU-6048</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="22582">LU-4416</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="51620">LU-10869</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzx2z3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>16882</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>