[LU-12998] DNE3: tunable to disable directory creation on MDT Created: 22/Nov/19  Updated: 25/Jan/24  Resolved: 19/Nov/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.14.0, Lustre 2.12.4
Fix Version/s: Lustre 2.16.0

Type: Improvement Priority: Major
Reporter: Andreas Dilger Assignee: Andreas Dilger
Resolution: Fixed Votes: 0
Labels: medium

Issue Links:
Cloners
is cloned by LU-17299 DNE3: disable new regular file creati... Open
Related
is related to LU-12025 Adding OST may cause EIO - delay acti... Resolved
is related to LU-16024 Allow permanently removing an MDT fro... Open
is related to LU-7668 permanently remove deactivated OSTs f... Resolved
is related to LU-17300 Avoid creating new dir/file/object on... Open
is related to LU-17334 Client should handle dir/file/object ... In Progress
Rank (Obsolete): 9223372036854775807
Epic Link: MDT rebalance v3

 Description   

In order to allow draining an MDT for removal from the filesystem, or to disable it temporarily, it makes sense to add an "mdt.*.no_create" parameter on the MDT similar to "obdfilter.*.no_precreate" on the OST, and a matching mount option "-o no_create". The "OS_STATE_NOPRECREATE" flag should be set in obd_statfs from the MDT to let clients/MDS know that it should be skipped.

This should result in new directories and directory stripes skipping the MDT during selection for "lfs mkdir -i -1", auto striping, etc.
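
The selection behaviour described above can be modelled with a small sketch. This is not Lustre code: the MdtStatfs record, the pick_mdt() helper, and the "most free inodes wins" policy are assumptions made for illustration (the real allocator uses its own space-balancing logic), and the flag's bit value is illustrative rather than copied from the Lustre headers.

```python
from dataclasses import dataclass

# Illustrative bit value only; the authoritative definition is in the Lustre headers.
OS_STATE_NOPRECREATE = 0x4


@dataclass
class MdtStatfs:
    """Hypothetical, simplified view of the statfs data one MDT reports."""
    index: int
    free_inodes: int
    state: int = 0


def eligible_mdts(mdts):
    """Return only the MDTs that have not asked to be skipped for creates."""
    return [m for m in mdts if not (m.state & OS_STATE_NOPRECREATE)]


def pick_mdt(mdts):
    """Pick an MDT index for a new directory, or None if every MDT is disabled.

    Assumption: choose the eligible MDT with the most free inodes. This stands
    in for the real space-balancing policy and is not how Lustre weighs MDTs.
    """
    candidates = eligible_mdts(mdts)
    if not candidates:
        return None
    return max(candidates, key=lambda m: m.free_inodes).index
```

With three MDTs where the emptiest one has the flag set, pick_mdt() falls back to the next-best MDT instead of the flagged one.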



 Comments   
Comment by Andreas Dilger [ 18/May/21 ]

The obdfilter.*.no_precreate functionality was added for OSTs in patch https://review.whamcloud.com/35029 "LU-12025 osp: allow OS_STATE_* flags from OSTs" and patch https://review.whamcloud.com/36716 "LU-12036 ofd: add 'no_precreate' mount option".

Comment by Andreas Dilger [ 24/Nov/21 ]

It would probably be best to use "no_create" for the mount option and parameter on the MDT, since "no_precreate" doesn't make much sense for the MDT. For consistency, the OST code should also add a "no_create" mount option and parameter, rename OS_STATFS_NOPRECREATE to OS_STATFS_NOCREATE, and then deprecate the "no_precreate" option. Deprecating it should be straightforward, since it was only added to master in 2.14.0, and the b2_12 backport patch https://review.whamcloud.com/37133 "LU-12036 ofd: add "no_precreate" mount option" has not landed, though it appears the backport patch https://review.whamcloud.com/36872 "LU-12025 osp: allow OS_STATE_* flags from OSTs" was included in 2.12.4.

Comment by Andreas Dilger [ 24/Nov/21 ]

Another important issue that needs to be addressed is how the no_create state is returned to the clients. With patch "LU-13440 lmv: add default LMV inherit depth" having the clients use the MDT OBD_STATFS data to make space balancing decisions, returning OS_STATFS_NOCREATE to the client can definitely influence the client MDT selection, but I'd think that no_create=1 would be a hard restriction against MDT usage (e.g. draining the MDT).

As such, if a client does try to create a file or directory on the MDT, the MDT should return an error (-EREMOTE if that would work with the existing clients, or preferably a unique code like -ENOANO or -EBADRQC) that makes it clear the client should use a different MDT for creation. We shouldn't use -EROFS to indicate the MDT cannot be used, since that would be returned to the application and the create would fail. We might also consider -ENOSPC to have the client try a different MDT, since we would also want the client to do this if the MDT was actually out of space.
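
A minimal sketch of the client-side behaviour this implies, assuming the "try another MDT" codes end up being -EREMOTE and -ENOSPC (the comment leaves the final choice open). The create_with_fallback() helper and the try_create callback are hypothetical names, not Lustre APIs.

```python
import errno

# Assumption: these are the codes that mean "use a different MDT".
# The ticket also floats -ENOANO or -EBADRQC as possible dedicated codes.
RETRY_ELSEWHERE = {errno.EREMOTE, errno.ENOSPC}


def create_with_fallback(mdt_indices, try_create):
    """Try each MDT in order until one accepts the create.

    try_create(idx) is a hypothetical callback returning 0 on success or a
    negative errno, mirroring kernel-style return codes.
    Returns (mdt_index, 0) on success or (None, negative_errno) on failure.
    """
    rc = -errno.ENOSPC  # result if the candidate list is empty
    for idx in mdt_indices:
        rc = try_create(idx)
        if rc == 0:
            return idx, 0
        if -rc not in RETRY_ELSEWHERE:
            # Hard errors such as -EROFS go straight back to the application,
            # which is why -EROFS is a poor way to signal "MDT disabled".
            return None, rc
    return None, rc
```

The sketch shows why the error code matters: a retryable code keeps the create transparent to the application, while -EROFS would end the loop and fail the create.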

Creating remote subdirectories to avoid a specific MDT is fairly easy today, but avoiding the creation of files or directory entries in an existing directory is much harder, and we might consider that as a second stage. If the initial goal is to prevent "new" use of an MDT, the existing directory would not necessarily count, and if MDT evacuation is the goal then some external action would be needed to migrate the existing directories off the MDT, so the added files could also be migrated at that time.

A more complete solution would be to turn the directory into a striped directory with a shard on another MDT with the LMV_HASH_FLAG_MIGRATION flag set, and then create all new inodes on the new MDT shard (both files and directories). That wouldn't affect existing inodes in the directory, but would prevent all new inodes from being allocated on that MDT in that directory. Something similar would need to be done during file migration anyway, so a later "lfs migrate -m" would "resume" the migration and move any remaining inodes on the disabled MDT.
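
The idea can be illustrated with a toy model, assuming a directory is just a list of MDT shards plus a flags word. The Directory class, add_target_shard(), and create_entry() are hypothetical names for illustration, and the flag value is illustrative rather than taken from the Lustre headers.

```python
from dataclasses import dataclass, field

LMV_HASH_FLAG_MIGRATION = 0x80000000  # illustrative value only


@dataclass
class Directory:
    """Hypothetical model of a directory and the MDTs that hold its shards."""
    stripes: list                                 # MDT indices, oldest shard first
    hash_flags: int = 0
    entries: dict = field(default_factory=dict)   # entry name -> MDT index of its inode


def add_target_shard(d, target_mdt):
    """Add a shard on another MDT and mark the directory as migrating."""
    if target_mdt not in d.stripes:
        d.stripes.append(target_mdt)
    d.hash_flags |= LMV_HASH_FLAG_MIGRATION


def create_entry(d, name, no_create):
    """Allocate a new inode on the newest shard whose MDT still accepts creates.

    no_create is the set of MDT indices with creates disabled.
    """
    for idx in reversed(d.stripes):
        if idx not in no_create:
            d.entries[name] = idx
            return idx
    raise OSError("no shard of this directory accepts creates")
```

Existing entries keep their original MDT in this model, which matches the point that a later "lfs migrate -m" would still be needed to move them off the disabled MDT.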

Comment by Gerrit Updater [ 23/Apr/22 ]

"Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/47124
Subject: LU-12998 mds: add no_create parameter to stop creates
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 21b06ef06a84f248fd4a592d2cd46bc7874883f2

Comment by Gerrit Updater [ 18/Nov/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/47124/
Subject: LU-12998 mds: add no_create parameter to stop creates
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 1dbcd0bab881fac38d8a5e4ef1559f12618f8f0e

Comment by Andreas Dilger [ 19/Nov/23 ]

The disabling of new files on the MDT is tracked under LU-17299.

Comment by Gerrit Updater [ 13/Dec/23 ]

"Lai Siyao <lai.siyao@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/53437
Subject: LU-12998 lod: statfs upon nocreate check
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 066262a04cb8e0cbf49a20b7bf036d4484399afe

Generated at Sat Feb 10 02:57:29 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.