[LU-8271] nodemap: retrying a large configuration transfer should have a delay Created: 14/Jun/16  Updated: 04/Jan/18  Resolved: 04/Jan/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.9.0
Fix Version/s: Lustre 2.11.0

Type: Improvement Priority: Major
Reporter: Kit Westneat Assignee: Emoly Liu
Resolution: Fixed Votes: 0
Labels: None


 Description   

In order to avoid thrashing the MGS during a bulk configuration update, the nodemap config clients should delay before retrying a config get.

When a nodemap config is larger than a single RPC, clients need to use multiple RPCs to fetch it. If the config changes between RPCs, the client has to discard the data received from the previous RPCs and restart the transfer. If many configuration changes occur in quick succession, a config get could be restarted multiple times, causing unnecessary load. The config get clients should wait some time before restarting the transfer, to allow the server to finish updating its config.

It may be possible to re-enqueue the config lock so that the main MGC lock thread restarts the transfer, which would add a random delay of 5-10s.
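
The behaviour requested above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the Lustre implementation: FakeServer, read_page, fetch_config and the generation counter used to detect a change are all hypothetical names invented for the example, and the 5-10s default simply mirrors the delay range mentioned in this description.

```python
import random
import time


class FakeServer:
    """Hypothetical stand-in for the MGS: serves a config as a list of pages."""

    def __init__(self, pages):
        self.pages = list(pages)
        self.generation = 0  # assumed change marker, bumped on every update

    def update(self, pages):
        self.pages = list(pages)
        self.generation += 1

    def read_page(self, index):
        # One simulated RPC: returns (generation, page, is_last).
        if index >= len(self.pages):
            return self.generation, None, True
        return self.generation, self.pages[index], index == len(self.pages) - 1


def fetch_config(server, delay=lambda: random.uniform(5.0, 10.0),
                 sleep=time.sleep, max_restarts=10):
    """Fetch every page; if the config changes mid-transfer, discard the
    partial result, wait, and restart from the first page."""
    restarts = 0
    while True:
        pages, first_gen, index = [], None, 0
        while True:
            gen, page, is_last = server.read_page(index)
            if first_gen is None:
                first_gen = gen
            if gen != first_gen:
                break  # config changed between RPCs: partial data is stale
            pages.append(page)
            if is_last:
                return pages, restarts
            index += 1
        restarts += 1
        if restarts > max_restarts:
            raise RuntimeError("config kept changing; giving up")
        # Back off so the server can finish updating before the retry.
        sleep(delay())
```

Passing delay and sleep as parameters keeps the sketch testable without real waiting; a caller could, for example, pass delay=lambda: 0.0 in a unit test.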



 Comments   
Comment by Andreas Dilger [ 19/Apr/17 ]

Kit, do you have any cycles to look into this?

Comment by Kit Westneat [ 21/Apr/17 ]

Sure, I'll get a patch together.

Comment by Gerrit Updater [ 21/Apr/17 ]

Kit Westneat (kit.westneat@gmail.com) uploaded a new patch: https://review.whamcloud.com/26781
Subject: LU-8271 nodemap: wait before getting large conf if changed
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: b453a08f9604bc5f47f022b94da52bbf77480cbb

Comment by Peter Jones [ 15/Dec/17 ]

Emoly

Can you please follow up to get this patch landed?

Thanks

Peter

Comment by Gerrit Updater [ 04/Jan/18 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/26781/
Subject: LU-8271 nodemap: wait before getting large conf if changed
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: f75631655890260549b12233589ee4b2074f20ce

Comment by Peter Jones [ 04/Jan/18 ]

Landed for 2.11
