<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:43:02 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-4473] Disable LNET routes without disrupting ongoing filesystem operations</title>
                <link>https://jira.whamcloud.com/browse/LU-4473</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;It is desirable to be able to gracefully take an LNET router out of service without disrupting ongoing filesystem operations. Since not all RPCs are re-sent, we need a way to prevent routes from being used for new traffic while existing buffered messages continue to drain. I have a patch implementing one approach to achieving this behavior.&lt;/p&gt;

&lt;p&gt;The patch creates a pair of lctl commands, down_interfaces and up_interfaces. The down_interfaces command, when executed on an LNET router, sets the ni-&amp;gt;ni_status-&amp;gt;ns_status of each lnet_ni_t in the global LND instance list (except for LOLND) to a new status introduced by this patch, LNET_NI_STATUS_ADMINDOWN. An admin would use this command to remove an LNET router node from service in the following way:&lt;/p&gt;

&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;Admin executes &apos;lctl down_interfaces&apos; on the router node being removed.&lt;/li&gt;
	&lt;li&gt;After a short waiting period (on the order of router_ping_timeout + max(dead_router_check_interval, live_router_check_interval)), all clients and servers should have pinged this router and received a response.&lt;/li&gt;
	&lt;li&gt;The response payload should show that all of this router&apos;s NIs are down (lnet_parse_rc_info() is modified so LNET_NI_STATUS_ADMINDOWN is treated the same as LNET_NI_STATUS_DOWN).&lt;/li&gt;
	&lt;li&gt;Now, when a client or server attempts to send a new message to a remote network and this router&apos;s routes are considered for the next hop, the routes are discarded because the servers and clients know that the router&apos;s NIs for the remote networks are down (see lnet_send()-&amp;gt;lnet_find_route_locked()).&lt;/li&gt;
	&lt;li&gt;At this point the router should not be receiving any new incoming traffic other than router_checker pings.&lt;/li&gt;
	&lt;li&gt;The administrator can watch for any queued messages on the router node to drain via appropriate /proc interface.&lt;/li&gt;
	&lt;li&gt;Once the router no longer has any messages to send, LNET can be stopped and unloaded.&lt;/li&gt;
&lt;/ul&gt;
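&lt;p&gt;The steps above can be sketched as a toy Python model; this is not the LNET code. The Router class, the find_route(), down_interfaces() and drain_wait_estimate() helpers, the string status values, and the example numbers are all hypothetical. Only the wait formula and the rule that ADMINDOWN is treated like DOWN come from the description, and taking the first usable router is a simplification of real route selection.&lt;/p&gt;

```python
from dataclasses import dataclass, field

# Status names follow the ticket; the string values are illustrative only.
NI_STATUS_UP = "up"
NI_STATUS_DOWN = "down"
NI_STATUS_ADMINDOWN = "admindown"


@dataclass
class Router:
    name: str
    # Maps a network name to the NI status last reported in a ping reply.
    ni_status: dict = field(default_factory=dict)


def drain_wait_estimate(router_ping_timeout,
                        dead_router_check_interval,
                        live_router_check_interval):
    """Rough time, in seconds, before all peers have seen the new status."""
    check_interval = max(dead_router_check_interval, live_router_check_interval)
    return router_ping_timeout + check_interval


def ni_is_usable(status):
    # ADMINDOWN is treated the same as DOWN, so only UP is usable.
    return status == NI_STATUS_UP


def find_route(routers, remote_net):
    """Return the first router whose NI on remote_net is up, else None."""
    for router in routers:
        if ni_is_usable(router.ni_status.get(remote_net, NI_STATUS_DOWN)):
            return router
    return None


def down_interfaces(router):
    # Hypothetical stand-in for running 'lctl down_interfaces' on the router.
    for net in router.ni_status:
        router.ni_status[net] = NI_STATUS_ADMINDOWN


rtr_a = Router("rtr-a", {"o2ib0": NI_STATUS_UP})
rtr_b = Router("rtr-b", {"o2ib0": NI_STATUS_UP})

before = find_route([rtr_a, rtr_b], "o2ib0").name
down_interfaces(rtr_a)
after = find_route([rtr_a, rtr_b], "o2ib0").name

print(before, after)                    # prints: rtr-a rtr-b
print(drain_wait_estimate(50, 60, 60))  # prints: 110
```

&lt;p&gt;In this model, new traffic moves to rtr-b as soon as peers learn that rtr-a is admindown, while anything already queued on rtr-a is left to drain.&lt;/p&gt;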


&lt;p&gt;The up_interfaces command simply sets the ni-&amp;gt;ni_status-&amp;gt;ns_status of each lnet_ni_t in the global LND instance list (except for LOLND) to LNET_NI_STATUS_UP.&lt;/p&gt;</description>
                <environment></environment>
        <key id="22710">LU-4473</key>
            <summary>Disable LNET routes without disrupting ongoing filesystem operations</summary>
                <type id="2" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11311&amp;avatarType=issuetype">New Feature</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="2">Won&apos;t Fix</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="hornc">Chris Horn</reporter>
                        <labels>
                            <label>patch</label>
                    </labels>
                <created>Fri, 10 Jan 2014 19:55:59 +0000</created>
                <updated>Mon, 13 Jan 2014 20:27:59 +0000</updated>
                            <resolved>Mon, 13 Jan 2014 20:27:59 +0000</resolved>
                                                                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="74748" author="hornc" created="Fri, 10 Jan 2014 20:18:19 +0000"  >&lt;p&gt;For your consideration:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://review.whamcloud.com/8803&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/8803&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="74750" author="hornc" created="Fri, 10 Jan 2014 20:26:56 +0000"  >&lt;p&gt;One thing I forgot to mention is that the patch also modifies lnet_update_ni_status_locked() so that the router_checker will not mark &quot;admindown&quot; routes as &quot;down&quot;. This prevents a situation where the router_checker might mark the NI as &quot;down&quot; (which is fine in itself, since this also prevents new traffic) but then later get a response and want to mark the NI &quot;up&quot;, which defeats the purpose of the admindown status.&lt;/p&gt;</comment>
                            <comment id="74848" author="ashehata" created="Mon, 13 Jan 2014 20:04:20 +0000"  >&lt;p&gt;This functionality is being added as part of the Dynamic LNet Configuration (DLC) Project.  The same feature you&apos;re requesting is being implemented in a slightly different way.  &lt;/p&gt;

&lt;p&gt;Instead of bringing the interfaces up and down, routing is turned on and off. When routing is turned on, all routing buffers are allocated. When routing is turned off, the unused buffers are freed, and the in-use buffers are drained and then freed once they are no longer in use.&lt;/p&gt;
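&lt;p&gt;As a rough illustration of that buffer lifecycle, here is a toy Python model. The RouterBufferPool class and its method names are hypothetical and do not reflect the DLC patches; the sketch only mirrors the allocate, free-unused, and drain-then-free behavior described above.&lt;/p&gt;

```python
class RouterBufferPool:
    """Toy model of a routing buffer pool; not the LNet data structure."""

    def __init__(self, nbuffers):
        self.nbuffers = nbuffers
        self.free = 0
        self.in_use = 0
        self.routing = False

    def enable_routing(self):
        # Turning routing on allocates all routing buffers.
        self.routing = True
        self.free = self.nbuffers - self.in_use

    def disable_routing(self):
        # Turning routing off frees the unused buffers right away.
        self.routing = False
        self.free = 0

    def take(self):
        # A message being forwarded claims one buffer, if routing allows it.
        if not self.routing or self.free == 0:
            return False
        self.free -= 1
        self.in_use += 1
        return True

    def release(self):
        # A forwarded message completes and gives its buffer back.
        if self.in_use == 0:
            raise RuntimeError("no buffer in use")
        self.in_use -= 1
        if self.routing:
            self.free += 1
        # With routing off, the buffer is freed instead of being reused.

    def drained(self):
        return not self.routing and self.in_use == 0 and self.free == 0


pool = RouterBufferPool(4)
pool.enable_routing()
pool.take()
pool.take()
pool.disable_routing()
print(pool.drained())  # prints: False (two buffers still in use)
pool.release()
pool.release()
print(pool.drained())  # prints: True
```

&lt;p&gt;In this model, the router is safe to stop once drained() returns True.&lt;/p&gt;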

&lt;p&gt;When clients ping a node which has routing turned off, the node responds with a flag that states that routing is turned off and the client then skips routes which use this router as a next-hop.&lt;/p&gt;

&lt;p&gt;This implies that both clients and servers must be running a DLC build.&lt;/p&gt;

&lt;p&gt;However, in your description, you have:&lt;br/&gt;
The administrator can watch for any queued messages on the router node to drain via appropriate /proc interface.&lt;/p&gt;

&lt;p&gt;I&apos;m not sure how that is done. Can you please elaborate?&lt;/p&gt;

&lt;p&gt;Below are the DLC patches:&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/8020&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/8020&lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/8021&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/8021&lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/8022&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/8022&lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/8023&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/8023&lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/8025&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/8025&lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/8026&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/8026&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="74850" author="hornc" created="Mon, 13 Jan 2014 20:17:56 +0000"  >&lt;p&gt;Ah, this is good to know. I will abandon my patchset, and this ticket can be closed.&lt;/p&gt;

&lt;p&gt;&quot;However, in your description, you have:&lt;br/&gt;
The administrator can watch for any queued messages on the router node to drain via appropriate /proc interface.&lt;br/&gt;
I&apos;m not sure how that is done. Can you please elaborate?&quot;&lt;/p&gt;

&lt;p&gt;I just meant that an admin could look at, for example, /proc/sys/lnet/buffers to see when all the credits are free.&lt;/p&gt;</comment>
                            <comment id="74852" author="pjones" created="Mon, 13 Jan 2014 20:27:59 +0000"  >&lt;p&gt;ok - thanks Chris!&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzwcnj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>12253</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>