[LU-11234] Choose lustre pool to place file’s objects by user-defined policies - Whamcloud Community JIRA

Details

Type: New Feature
Resolution: Unresolved
Priority: Minor
Fix Version/s: None
Affects Version/s: None
Labels:
- TS


Description

Choose lustre pool to place file’s objects by user-defined policies

This feature suits heterogeneous Lustre environments in which the OSTs are built on different devices (such as disk and SSD) and may have different configurations (such as osd-zfs, osd-ldiskfs, RAID1, RAID6). It lets users control data placement at a finer granularity.

To use this feature, the Lustre OSTs are first divided into pools. Every pool is composed of homogeneous targets; that is, the targets in one pool use the same type of device and the same configuration. Users then define file creation rules that map files to pools. When the rules are enabled, each file create operation searches the user-defined rules, finds the corresponding pool, and allocates the file's data objects in the chosen pool.

We use compiling the Lustre source code to illustrate this feature.

When building the Lustre source code, the files fall into the following types:

1. Source code with the filename extension .c or .h, and configuration files that describe build rules, which usually have no filename extension.

2. Temporary files with the filename extension .o.

3. Kernel modules and other final files.

Of these 3 types, 1 and 3 need reliability more than I/O performance, while 2 needs I/O performance more than reliability.

First, build two pools. One (e.g. named pool_src) uses a disk array in a RAID1 configuration to provide high reliability. The other (e.g. named pool_temp) uses SSDs in RAID0 to provide high I/O performance.

Second, define the following pool policy:

lfs setstripe -p pool_src lustre-source
lctl set_param lod.*.policies="add post=\"o\" pool_temp"  # files with the filename extension .o are placed in pool_temp

When the policies are applied, files with the filename extension .o are placed in pool_temp, while all other files are placed in pool_src.
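The placement decision described above can be sketched in a few lines of Python. This is only an illustration of the intended behavior, not the code in the patch: the names `EXTENSION_POLICIES`, `DEFAULT_POOL`, and `choose_pool` are hypothetical, and the sketch assumes that a file without a matching rule falls back to the pool set on the directory.

```python
import os

# Hypothetical in-memory form of the proposed policy: extension -> pool name.
EXTENSION_POLICIES = {"o": "pool_temp"}

# Assumed fallback: the pool set on the directory with "lfs setstripe -p pool_src".
DEFAULT_POOL = "pool_src"


def choose_pool(filename, policies=EXTENSION_POLICIES, default_pool=DEFAULT_POOL):
    """Return the pool for a new file based on its filename extension."""
    _, ext = os.path.splitext(filename)
    # splitext returns ".o"; the policy key is stored without the dot.
    return policies.get(ext.lstrip("."), default_pool)


if __name__ == "__main__":
    for name in ("lod_pool.o", "lod_pool.c", "Makefile"):
        print(name, "->", choose_pool(name))
```

With the example policy, only object files are redirected to pool_temp; sources and files without an extension keep the directory's pool.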


Issue Links

is related to

LU-10345 allow specifying file layout by FID and filename extension

Open

is related to

LU-9982 Clients striping from mapped FID in nodemap

Open

Activity


Andreas Dilger added a comment - 04/Apr/19 8:36 PM

Instead of specifying just the OST pool for files matching the policy, it would be much more useful to base this on patch https://review.whamcloud.com/28972 "LU-9982 lustre: Clients striping from mapped FID in nodemap" and allow specifying the FID that is the source for the file layout. That allows specifying a pool, but more importantly allows specifying the rest of the layout (stripe count, PFL components with multiple different pools, FLR mirroring, etc.). Getting patch 28972 landed would itself be useful in any case for nodemap/fileset/isolation purposes. Once that patch lands, I think the changes to 33126 would be relatively straightforward, adding a fid=[SEQ:OID:VER] option to the matching rules.


Andreas Dilger added a comment - 21/Feb/19 7:52 PM

Rather than make all of these policies global, they should be implemented as part of a nodemap, even if it is the "default" nodemap to start with. This will allow different policies to be implemented for different clients if needed (e.g. prefer a local OST pool for clients connected over WAN).
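A rough sketch of the nodemap scoping suggested in this comment, assuming each nodemap carries its own extension-to-pool table and that clients in a nodemap without a table use the "default" nodemap's table. The function name, the nodemap name `wan_clients`, and the pool name `pool_wan_local` are hypothetical and only illustrate the lookup order.

```python
def policies_for_client(nodemap_policies, nodemap_name):
    """Return the policy table for a client's nodemap, else the default one."""
    if nodemap_name in nodemap_policies:
        return nodemap_policies[nodemap_name]
    # Assumed fallback: the "default" nodemap, or no policies at all.
    return nodemap_policies.get("default", {})


if __name__ == "__main__":
    nodemap_policies = {
        "default": {"o": "pool_temp"},
        # Hypothetical nodemap for WAN clients that prefers a local pool.
        "wan_clients": {"o": "pool_wan_local"},
    }
    print(policies_for_client(nodemap_policies, "wan_clients"))
    print(policies_for_client(nodemap_policies, "lab_clients"))
```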


Li Xi added a comment - 30/Oct/18 7:22 AM

As discussed with Teddy and Ihara, we want to implement something like:

1) Use Jobid/UID/GID to define the rule expressions (just like TBF rules).

2) Each rule has an operation. For now the operation is simple: create the file in a dedicated OST pool. That means a rule is a tuple of (expression, OST pool name).

3) Define an ordered list of rules. The file creation process walks through this list and uses the first rule that matches; if a rule matches, the file is created in the OST pool that the rule defines.

4) Rules can be defined through a /proc entry, just like for TBF.

A lot of code can be shared between TBF and this new feature. Maybe we can move some of the functions into a shared library rather than copying and pasting.
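The ordered rule list in points 1) to 3) can be sketched as follows. This is a minimal model, not the TBF or LOD implementation: the `Rule` class and `pool_for_create` function are hypothetical, and the sketch assumes a rule field left unset matches any value and that all set fields must match together.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """One (expression, OST pool name) tuple; a field left as None matches anything."""
    pool: str
    jobid: Optional[str] = None
    uid: Optional[int] = None
    gid: Optional[int] = None

    def matches(self, jobid, uid, gid):
        # Every field that is set must equal the value from the create request.
        return all(
            want is None or want == got
            for want, got in ((self.jobid, jobid), (self.uid, uid), (self.gid, gid))
        )


def pool_for_create(rules, jobid, uid, gid):
    """Walk the ordered rule list and return the pool of the first matching rule."""
    for rule in rules:
        if rule.matches(jobid, uid, gid):
            return rule.pool
    return None  # no rule matched; the caller keeps the normal layout


if __name__ == "__main__":
    rules = [Rule(pool="flash", uid=1000), Rule(pool="disk", gid=100)]
    print(pool_for_create(rules, "dd.1000", 1000, 100))
    print(pool_for_create(rules, "cp.2000", 2000, 100))
```

Because the list is ordered, a request that satisfies several rules is placed by the earliest one, so more specific rules belong at the front of the list.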


Gerrit Updater added a comment - 08/Sep/18 5:55 AM

Teddy Zheng (jjkky@yahoo.com) uploaded a new patch: https://review.whamcloud.com/33126
Subject: LU-11234 lod: Choose pools to place file's objects by filename extension policies
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 5615ef1ee1db30e362e61ecc29add9907637a1e2


Andreas Dilger added a comment - 18/Aug/18 3:04 AM

There is nothing preventing users from allocating all of their files on the flash OSTs today; they can just use "lfs setstripe -o idx,idx,idx" directly if they want. We need to have OST quotas to prevent users from using these OSTs.

In that respect, allowing users some mechanism to specify "pools" or filename extension policies doesn't affect this, but could make life a lot easier for some applications and users.

One simple way to enable this would be to allow "lfs setstripe -o" to specify a smaller number of stripes than the OSTs listed. This would be treated the same way as a stripe count in a pool - the specified number of stripes is allocated to the file from the listed OSTs.

After that point, it would be possible for users to create their own pools, which are just named lists of OSTs, and "lfs setstripe" can just get those lists from ~/.lustrerc or similar. It is a bit more tricky if the files are not being created via lfs, but e.g. we could store the FID for the user's ~/.lustrerc in the LOV EA so that the MDS can access it.
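The idea of giving "lfs setstripe -o" a stripe count smaller than the OST list can be illustrated with a short sketch. The selection order is an assumption made here for illustration (round-robin from a starting offset); the comment does not specify how the MDS would pick among the listed OSTs, and `pick_stripes` is a hypothetical name.

```python
def pick_stripes(ost_list, stripe_count, start=0):
    """Pick stripe_count OST indices from a user-supplied list of OSTs.

    The list is treated like a small pool. A stripe count that is not
    positive, or larger than the list, means "use every listed OST".
    """
    if stripe_count <= 0 or stripe_count > len(ost_list):
        stripe_count = len(ost_list)
    # Assumed selection order: round-robin through the list from `start`.
    return [ost_list[(start + i) % len(ost_list)] for i in range(stripe_count)]


if __name__ == "__main__":
    flash_osts = [2, 5, 7, 9]
    print(pick_stripes(flash_osts, 2))
    print(pick_stripes(flash_osts, 2, start=3))
```

Varying `start` between files would spread objects across the listed OSTs instead of always using the first ones.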


Teddy added a comment - 18/Aug/18 2:09 AM - edited

In the description, ‘user’ means ‘administrator’: application users may only propose requirements for rules, while applying the rules to the system should be restricted to an authorized administrator. Otherwise, if application users could define and apply rules themselves, most would choose to put their data on the faster devices. Besides, letting application users apply rules would increase the rate of misconfiguration.

Indeed, I think the rules for this feature deserve a more careful design. We can define rule conjunctions, just like the rules in TBF. For example, a rule combining a UID and a filename extension would control the layout only of files created by the user with that UID (in LU-9658, we already added the UID and GID to the create RPC).

I'll upload a basic patch that supports only filename extension rules for review.


Andreas Dilger added a comment - 12/Aug/18 7:39 AM

When you write "user" here, you really mean "administrator", since it typically is not possible for a regular user to log in on the MGS to set a policy. Also, it appears that this policy would be global to all users.

I think this proposal is a very useful one; we've been discussing it for years, but nobody has been working on it. It would be nice if we could work out a mechanism for regular users to be able to create OST pools by themselves, and have per-user or per-directory policies for file creation. With the patch https://review.whamcloud.com/32814 "LU-11146 lustre: fix setstripe for specific osts upon dir" it is possible to set a directory default layout that selects specific OSTs, which is almost like a pool. The main difference is that the specific-OST layout currently requires the file to be striped over all OSTs, but it would be nice to allow striping over only a subset of OSTs.

The patch https://review.whamcloud.com/28972 "LU-9982 lustre: Clients striping from mapped FID in nodemap" allows specifying an arbitrary source FID for the default layout for files in a subdirectory mount.

I think it would be possible to combine these ideas by having a $HOME/.lustre/defaults directory (nothing special there, just a regular directory) that users could create policies in or on (e.g. different layouts for filename extensions), and then the $HOME/.lustre directory FID can be set as the default layout for all user directories if they want to apply this policy to their directories (it would be inherited by subdirectories automatically, or fetched from the fs root if needed).

Since this is not a configuration parameter, the user does not need any special login permission. The layout is explicitly applied to user directories (there can be arbitrarily many different layouts for a user; they just need to supply the directory FID).


People

Assignee: Teddy

Reporter: Teddy

Votes: 1

Watchers: 11

Dates

Created: 12/Aug/18 4:38 AM

Updated: 19/Jun/20 7:27 PM