[LU-17250] Add new MDT to existing filesystem misses OST pools, nodemaps, and other config
Created: 01/Nov/23  Updated: 30/Jan/24

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.15.3
Fix Version/s: None

Type: Bug
Priority: Major
Reporter: Andreas Dilger
Assignee: Etienne Aujames
Resolution: Unresolved
Votes: 1
Labels: None

Issue Links:
Related
is related to LU-17308 makes "lctl pool_*" more reliable for... Resolved
Severity: 3

 Description   

If new MDTs are added to an existing filesystem that has OST Pools configured, the new MDTs are missing all of the configuration parameters that describe the pools. In fact, the new MDTs will also be missing any other persistent config parameters that are stored in the per-target config logs by "lctl conf_param", "lctl pool_*", "lctl nodemap_*", etc.

This issue does not apply to persistent parameters set with "lctl set_param -P", since those are stored in a generic config log that is applied to all nodes, rather than as specific configuration records in a per-target log.

It probably makes sense for the MGS to process the config llog of an existing target (e.g. fsname-MDT0000) and add these parameters to newly added MDTs (and OSTs, for that matter). The "regular" configuration records (attach, setup, add_uuid, new_profile, add_osc) should be skipped along with SKIP records, while conf_param, new_pool, add_pool, etc. records should be copied into the new target device log, with suitable replacements (e.g. "s/MDT0000/MDTxxxx/").
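
The copy-with-substitution idea above can be sketched in a few lines of Python. This is only an illustration of the filtering logic, not the MGS implementation: the Record type, the textual payload format, the "testfs" filesystem name, and the copy_records() helper are all assumptions made up for the example; real llog records are binary structures handled in the kernel.

```python
import re
from typing import List, NamedTuple


class Record(NamedTuple):
    """Hypothetical, simplified stand-in for one config llog record."""
    op: str    # record type, e.g. "conf_param" or "attach"
    text: str  # textual payload (illustrative format only)


# Record types the description says should NOT be copied.
SKIPPED_OPS = {"attach", "setup", "add_uuid", "new_profile", "add_osc", "SKIP"}


def copy_records(source: List[Record], src_target: str, dst_target: str) -> List[Record]:
    """Copy non-regular records, renaming src_target to dst_target."""
    pattern = re.compile(re.escape(src_target))
    copied = []
    for rec in source:
        if rec.op in SKIPPED_OPS:
            continue  # regular setup records and SKIP records are dropped
        copied.append(Record(rec.op, pattern.sub(dst_target, rec.text)))
    return copied


# Made-up source log for an existing MDT0000.
source_log = [
    Record("attach", "testfs-MDT0000 mdt"),
    Record("setup", "testfs-MDT0000"),
    Record("new_pool", "testfs flash"),
    Record("add_pool", "testfs flash testfs-OST0000_UUID"),
    Record("conf_param", "testfs-MDT0000.mdt.identity_upcall=NONE"),
    Record("SKIP", "testfs-MDT0000.mdt.old=1"),
]

new_log = copy_records(source_log, "MDT0000", "MDT0001")
for rec in new_log:
    print(rec.op, rec.text)
```

In this sketch only the new_pool, add_pool, and conf_param records survive, and the conf_param payload is rewritten to name MDT0001. Pool records that do not mention the source target pass through unchanged.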



 Comments   
Comment by Andreas Dilger [ 01/Nov/23 ]

I'm not sure whether it is better to exclude some records and copy everything else, or to include only specific records and skip everything else. The former is more forward compatible and less likely to need constant attention, but the latter is a bit safer and would not unintentionally copy settings that are not needed.

There would be some risk of copying a device-specific setting (e.g. disable DoM on MDT0000), but that could be addressed afterward by cancelling that record from the llog.
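
To make the trade-off between the two policies concrete, here is a small Python sketch. It assumes a record type named "future_op" that does not exist today and stands in for any record type added in a later release; the set names and helper functions are likewise hypothetical and exist only for this comparison.

```python
# Record types named in the description as "regular" (never copied).
REGULAR_OPS = {"attach", "setup", "add_uuid", "new_profile", "add_osc", "SKIP"}

# Record types named in the description as ones to copy.
COPIED_OPS = {"conf_param", "new_pool", "add_pool"}


def copy_by_exclusion(ops):
    """Exclude known regular records, copy everything else."""
    return [op for op in ops if op not in REGULAR_OPS]


def copy_by_inclusion(ops):
    """Copy only explicitly listed records, skip everything else."""
    return [op for op in ops if op in COPIED_OPS]


# "future_op" is a made-up record type standing in for a later addition.
ops = ["attach", "setup", "conf_param", "new_pool", "future_op"]

excluded_result = copy_by_exclusion(ops)
included_result = copy_by_inclusion(ops)
print("exclusion:", excluded_result)
print("inclusion:", included_result)
```

The exclusion policy copies "future_op" without any code change, which is the forward-compatible behaviour; the inclusion policy drops it until the list is updated, which is the safer behaviour.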

Comment by Gerrit Updater [ 08/Jan/24 ]

"Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/53614
Subject: LU-17250 mgs: generate a new MDT configuration by copy
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 8bc2ec1ae80519a9b9184e6149f662402118e3df
