[LU-4473] Disable LNET routes without disrupting ongoing filesystem operations Created: 10/Jan/14  Updated: 13/Jan/14  Resolved: 13/Jan/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: New Feature Priority: Minor
Reporter: Chris Horn Assignee: WC Triage
Resolution: Won't Fix Votes: 0
Labels: patch

Rank (Obsolete): 12253

 Description   

It is desirable to be able to gracefully take an LNET router out of service without disrupting ongoing filesystem operations. Since not all RPCs are re-sent, we need a way to prevent routes from being used for new traffic while existing buffered messages continue to drain. I have a patch implementing one approach to achieving this behavior.

The patch creates a pair of lctl commands, down_interfaces and up_interfaces. The down_interfaces command, when executed on an LNET router, sets the ni->ni_status->ns_status of each lnet_ni_t in the global LND instance list (except for LOLND) to a new status introduced by this patch, LNET_NI_STATUS_ADMINDOWN. An admin would use this command to remove an LNET router node from service in the following way:

  • Admin executes 'lctl down_interfaces' on the router node being removed.
  • After a short waiting period (on the order of router_ping_timeout + max(dead_router_check_interval, live_router_check_interval)), all clients and servers should have pinged this router and received a response.
  • The response payload should show that all of this router's NIs are down (lnet_parse_rc_info() is modified so LNET_NI_STATUS_ADMINDOWN is treated the same as LNET_NI_STATUS_DOWN).
  • Now, when a client or server attempts to send a new message to a remote network and this router's routes are considered for the next hop, the routes are discarded because the clients and servers know that the router's NIs for the remote networks are down (see lnet_send()->lnet_find_route_locked()).
  • At this point the router should not be receiving any new incoming traffic other than router_checker pings.
  • The administrator can watch for any queued messages on the router node to drain via the appropriate /proc interface.
  • Once the router no longer has any messages to send, LNET can be stopped and unloaded.
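
The waiting period in the second step can be estimated from the three tunables named there. The following is a minimal sketch, not part of the patch: the function name is hypothetical, and the values passed in the example are illustrative only, not claimed defaults.

```python
def drain_wait_seconds(router_ping_timeout: int,
                       dead_router_check_interval: int,
                       live_router_check_interval: int) -> int:
    """Estimate how long to wait after 'lctl down_interfaces'.

    Mirrors the expression from the description:
    router_ping_timeout + max(dead_router_check_interval,
                              live_router_check_interval)
    All arguments are in seconds.
    """
    return router_ping_timeout + max(dead_router_check_interval,
                                     live_router_check_interval)


# Illustrative values only; read the real ones from the peers' LNet settings.
print(drain_wait_seconds(50, 60, 60))
```

Because the intervals are configured on the clients and servers, not on the router, the largest values in use across the cluster are the ones that matter.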

The up_interfaces command simply sets the ni->ni_status->ns_status of each lnet_ni_t in the global LND instance list (except for LOLND) to LNET_NI_STATUS_UP.



 Comments   
Comment by Chris Horn [ 10/Jan/14 ]

For your consideration:

http://review.whamcloud.com/8803

Comment by Chris Horn [ 10/Jan/14 ]

One thing I forgot to mention is that the patch also modifies lnet_update_ni_status_locked() so that the router_checker will not change an "admindown" NI to "down". This prevents a situation where the router_checker marks the NI as "down" (which is fine in itself, since this also prevents new traffic) but later gets a response and marks the NI "up", which would defeat the purpose of the admindown status.
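
The status handling described here can be modelled in a few lines. This is a simplified sketch of the intended behavior, not the kernel code: the enum, both function names, and the boolean `alive` input are assumptions made for illustration. The real lnet_update_ni_status_locked() works on lnet_ni_t structures under a lock.

```python
from enum import Enum


class NIStatus(Enum):
    UP = "up"
    DOWN = "down"
    ADMINDOWN = "admindown"  # models LNET_NI_STATUS_ADMINDOWN from the patch


def router_checker_update(current: NIStatus, alive: bool) -> NIStatus:
    """Model of the guard: the router_checker never overrides ADMINDOWN."""
    if current is NIStatus.ADMINDOWN:
        return current
    return NIStatus.UP if alive else NIStatus.DOWN


def peer_sees_down(status: NIStatus) -> bool:
    """Model of the lnet_parse_rc_info() change: ADMINDOWN counts as DOWN."""
    return status in (NIStatus.DOWN, NIStatus.ADMINDOWN)


# An admin-downed NI stays that way even when the checker sees it alive.
print(router_checker_update(NIStatus.ADMINDOWN, alive=True))
```

Only an explicit up_interfaces would move the NI out of ADMINDOWN in this model.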

Comment by Amir Shehata (Inactive) [ 13/Jan/14 ]

This functionality is being added as part of the Dynamic LNet Configuration (DLC) Project. The same feature you're requesting is being implemented in a slightly different way.

Instead of bringing the interface up and down, routing is turned on and off. When routing is turned on, all routing buffers are allocated. When routing is turned off, the unused buffers are freed, and the in-use buffers are drained and then freed once they are no longer in use.

When clients ping a node which has routing turned off, the node responds with a flag that states that routing is turned off and the client then skips routes which use this router as a next-hop.

This implies that both clients and servers must be running a DLC build.
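
The client-side effect of the routing-off flag can be sketched as a filter over candidate routes. This is an illustrative model only: the `Route` type, the `usable_routes` name, and the set of flagged gateways are hypothetical and do not reflect the DLC data structures.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Route:
    remote_net: str  # network reached through this route, e.g. "o2ib1"
    gateway: str     # NID of the router used as next hop


def usable_routes(routes: Iterable[Route], remote_net: str,
                  routing_disabled: Set[str]) -> List[Route]:
    """Return routes to remote_net, skipping flagged gateways.

    routing_disabled holds the NIDs of routers whose most recent ping
    reply reported that routing is turned off.
    """
    return [r for r in routes
            if r.remote_net == remote_net
            and r.gateway not in routing_disabled]


routes = [Route("o2ib1", "10.0.0.1@tcp"), Route("o2ib1", "10.0.0.2@tcp")]
print(usable_routes(routes, "o2ib1", {"10.0.0.1@tcp"}))
```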

However, in your description, you have:
The administrator can watch for any queued messages on the router node to drain via appropriate /proc interface.

I'm not sure how that is done. Can you please elaborate?

Below are the DLC patches:
http://review.whamcloud.com/8020
http://review.whamcloud.com/8021
http://review.whamcloud.com/8022
http://review.whamcloud.com/8023
http://review.whamcloud.com/8025
http://review.whamcloud.com/8026

Comment by Chris Horn [ 13/Jan/14 ]

Ah, this is good to know. I will abandon my patchset, and this ticket can be closed.

"However, in your description, you have:
The administrator can watch for any queued messages on the router node to drain via appropriate /proc interface.
I'm not sure how that is done. can you please elaborate."

I just meant that an admin could look at, for example, /proc/sys/lnet/buffers to see when all the credits are free.
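
A check along these lines could be scripted. The sketch below assumes that /proc/sys/lnet/buffers prints a header row containing "count" and "credits" columns followed by one row per buffer pool. That layout is an assumption and should be verified against the running LNet version. The function name is hypothetical.

```python
def all_credits_free(buffers_text: str) -> bool:
    """Return True when every buffer pool has all of its credits back.

    Assumes a whitespace-separated table whose header names include
    'count' and 'credits'; a pool is idle when credits >= count.
    """
    rows = [line.split() for line in buffers_text.splitlines() if line.strip()]
    if not rows:
        raise ValueError("empty buffers listing")
    header, data = rows[0], rows[1:]
    count_idx = header.index("count")
    credits_idx = header.index("credits")
    return all(int(r[credits_idx]) >= int(r[count_idx]) for r in data)


# Sample text in the assumed layout; the second pool still has 2 buffers in use.
SAMPLE = """pages count credits min
    0   512     512   503
    1   512     510   497
  256   256     256   249
"""
print(all_credits_free(SAMPLE))  # False
```

On a router, the text would come from reading /proc/sys/lnet/buffers in a polling loop until the function returns True.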

Comment by Peter Jones [ 13/Jan/14 ]

ok - thanks Chris!

Generated at Sat Feb 10 01:43:02 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.