<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:51:23 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-12303] Use lnet_health_sensitivity for restoring health for each lnet_recovery_internal</title>
                <link>https://jira.whamcloud.com/browse/LU-12303</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Currently, for each lnet_health_interval the LNet health is incremented by 1. The maximum LNet health value is 1000, so it can take up to 1000 seconds to recover, depending on the setup. A better way to handle this is to restore the health on each interval by the same amount by which it was decremented, i.e. lnet_health_sensitivity.&lt;/p&gt;</description>
                <environment>Any lustre 2.12 system with LNet health enabled</environment>
        <key id="55649">LU-12303</key>
            <summary>Use lnet_health_sensitivity for restoring health for each lnet_recovery_internal</summary>
                <type id="4" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11310&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="ashehata">Amir Shehata</assignee>
                                    <reporter username="simmonsja">James A Simmons</reporter>
                        <labels>
                    </labels>
                <created>Wed, 15 May 2019 20:37:59 +0000</created>
                <updated>Fri, 16 Oct 2020 06:21:54 +0000</updated>
                            <resolved>Tue, 31 Mar 2020 11:37:59 +0000</resolved>
                                    <version>Lustre 2.13.0</version>
                                    <fixVersion>Lustre 2.14.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>5</watches>
                                                                            <comments>
                            <comment id="247189" author="adilger" created="Wed, 15 May 2019 22:29:55 +0000"  >&lt;p&gt;If the &lt;tt&gt;lnet_health_sensitivity&lt;/tt&gt; was also used to increment the interface health, then there would be no point in having a variable &lt;tt&gt;lnet_health_sensitivity&lt;/tt&gt; value.  It would mean that there are N health decrements and the same N health increments for an interface.  Also note (AFAIK, but Amir to confirm) that while the retry interval is 1s, it will increment the health for every successful RPC sent/received.&lt;/p&gt;

&lt;p&gt;With &lt;tt&gt;lnet_health_sensitivity=100&lt;/tt&gt; for decrements but 1 for increments, the interface can fail at most 1/100 = 1% of the messages sent on it and still remain in use.  If it fails RPCs more than 1% of the time, its health will decrement faster than it increments, which is good because you don&apos;t want to be using that interface.  If it fails less than 1% of the time, it will generally remain in use.  This &quot;minimum acceptable failure ratio&quot; is tunable via &lt;tt&gt;lnet_health_sensitivity&lt;/tt&gt;.&lt;/p&gt;</comment>
                            <comment id="247286" author="simmonsja" created="Thu, 16 May 2019 17:05:05 +0000"  >&lt;p&gt;Your docs are wrong &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/smile.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&#160;They state at&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://wiki.whamcloud.com/display/LNet/LNet+Health+User+Documentation&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://wiki.whamcloud.com/display/LNet/LNet+Health+User+Documentation&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;that the health increments once every time interval, which is one second. By those docs it could take 1000 seconds before the interface is seen as healthy.&lt;/p&gt;</comment>
                            <comment id="259116" author="gerrit" created="Wed, 4 Dec 2019 01:00:44 +0000"  >&lt;p&gt;Amir Shehata (ashehata@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/36920&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/36920&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12303&quot; title=&quot;Use lnet_health_sensitivity for restoring health for each lnet_recovery_internal&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12303&quot;&gt;&lt;del&gt;LU-12303&lt;/del&gt;&lt;/a&gt; lnet: recover health at same rate as dec&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 601be615b8409dc74f2f6e5c49fe0810bc443a73&lt;/p&gt;</comment>
                            <comment id="266386" author="gerrit" created="Tue, 31 Mar 2020 07:00:08 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/36920/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/36920/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12303&quot; title=&quot;Use lnet_health_sensitivity for restoring health for each lnet_recovery_internal&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12303&quot;&gt;&lt;del&gt;LU-12303&lt;/del&gt;&lt;/a&gt; lnet: recover health at same rate as dec&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 1d94a29dbc018fd00aa1c8a7a7ae343e0c9a4b83&lt;/p&gt;</comment>
                            <comment id="266410" author="pjones" created="Tue, 31 Mar 2020 11:37:59 +0000"  >&lt;p&gt;Landed for 2.14&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="55620">LU-12292</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i00gcf:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>