[LU-14657] nodemap asynchronous update creates inconsistent entries. Created: 30/Apr/21  Updated: 18/May/21

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.12.3
Fix Version/s: None

Type: Improvement Priority: Minor
Reporter: James Beal Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Related
is related to LU-14516 make mgc's wait-before-reprocess conf... Resolved

 Description   

One issue we had to address was that if you configure all pieces of the nodemap policy at once, the system does not apply the configuration consistently across all elements of the filesystem. As a workaround, our script creates the nodemap and sets the network range, deny_unknown, trusted and admin policies, then waits until these are set on the other MDSs before setting the fileset and identity mapping policies.

 

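# Poll the partner MDS once per second until the admin and trusted
# properties match the values just set locally.
# Note: this loops indefinitely if the update never arrives.
function wait_for_sync ()
{
  local admin trusted
  while :
  do
    sleep 1
    admin=$(ssh "$PARTNER" lctl get_param -n "nodemap.${TENANT_NAME}.admin_nodemap" 2>/dev/null)
    trusted=$(ssh "$PARTNER" lctl get_param -n "nodemap.${TENANT_NAME}.trusted_nodemap" 2>/dev/null)
    [ "$admin" = "$ADMIN" ] && [ "$trusted" = "$TRUSTED" ] && break
  done
}

function create_nodemap () {
  # Do not use any ssh agent socket inherited from the environment.
  SSH_AUTH_SOCK=""
  # Create the nodemap and its NID range only if the nodemap does not exist yet.
  if ! lctl nodemap_info "${TENANT_NAME}" > /dev/null 2>&1 ; then
    lctl nodemap_add "${TENANT_NAME}"
    lctl nodemap_add_range --name "${TENANT_NAME}" --range "${TENANT_RANGE}"
  fi
  # First batch: basic policy flags.
  lctl nodemap_modify --name "${TENANT_NAME}" --property deny_unknown --value "$DENY"
  lctl nodemap_modify --name "${TENANT_NAME}" --property trusted --value "$TRUSTED"
  lctl nodemap_modify --name "${TENANT_NAME}" --property admin --value "$ADMIN"
  # Wait for the first batch to reach the partner MDS before continuing.
  wait_for_sync
  # Second batch: fileset, then the identity mappings.
  lctl nodemap_set_fileset --name "${TENANT_NAME}" --fileset "${TENANT_DIR}"
  # and the rest
}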
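
The polling loop in wait_for_sync never gives up, so a partner that is unreachable or never receives the update would hang the script. A minimal sketch of a bounded variant follows. The helper name wait_for_value and its argument order are hypothetical and not part of the original script; it relies only on bash's built-in SECONDS counter.

```shell
#!/bin/bash
# Hypothetical helper (not from the original script): run a command
# repeatedly until its output equals an expected value, or give up after
# a timeout.
# Usage: wait_for_value EXPECTED TIMEOUT_SECONDS COMMAND [ARGS...]
wait_for_value () {
    local expected=$1 timeout=$2
    shift 2
    # SECONDS is a bash built-in counting seconds since the shell started.
    local deadline=$(( SECONDS + timeout ))
    local actual
    while :; do
        actual=$("$@" 2>/dev/null)
        [ "$actual" = "$expected" ] && return 0      # value has arrived
        [ "$SECONDS" -ge "$deadline" ] && return 1   # timed out
        sleep 1
    done
}
```

Under these assumptions, the wait in create_nodemap could be written as one call per property, for example: wait_for_value "$ADMIN" 60 ssh "$PARTNER" lctl get_param -n "nodemap.${TENANT_NAME}.admin_nodemap" || exit 1. The 60 second limit is an arbitrary illustrative choice.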


 Comments   
Comment by Andreas Dilger [ 18/May/21 ]

LU-14516 is tracking a tunable parameter to reduce the "wait to reprocess" time on the clients, so that configuration updates are visible on clients more quickly. The current default is 5s+rand(5s) to avoid a "thundering herd" on the MGS as (potentially) tens of thousands of clients try to fetch new configuration updates. For smaller clusters there is no reason to wait so long.
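
To illustrate the arithmetic of the 5s+rand(5s) default, the sketch below draws such a delay in bash. It is not Lustre's implementation: the function name requeue_delay is hypothetical, and whole-second granularity is a simplifying assumption. The point is only that each delay falls between 5 and 10 seconds, so a script that waits for propagation should allow for at least the upper bound.

```shell
#!/bin/bash
# Hypothetical illustration of a "base + random spread" delay.
# Usage: requeue_delay [BASE_SECONDS] [SPREAD_SECONDS]
requeue_delay () {
    local base=${1:-5} spread=${2:-5}
    # RANDOM % (spread + 1) yields an integer in 0..spread inclusive,
    # so the result lies in base..base+spread.
    echo $(( base + RANDOM % (spread + 1) ))
}

# Print a few sample delays with the defaults described above.
for _ in 1 2 3; do
    echo "sample delay: $(requeue_delay)s"
done
```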
