LU-18222: Implement quota aggregation limits by ID range

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor

    Description

      A number of sites have requested the ability to have hierarchical/nested project quotas, aggregating multiple project IDs for a tenant under a single limit in order to restrict the total usage across those projects.

      Rather than implement this at the level of the backing filesystem (which doesn't currently have space in the inode to store more IDs), one option would be for the MDS to add the quota usage from multiple project IDs into a single total, in a manner similar to OST pool quotas; in general, the MDS would be managing quota limits rather than quota accounting.

      The MDS would sum the quota usage across all IDs in the aggregate, and then decide whether those IDs should be granted new qunits once the OSTs have consumed the aggregate's total space.
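
      To make the qunit-grant decision concrete, here is a minimal sketch of the naive approach, assuming a contiguous ID range; the names used (projid_usage(), aggregate_can_grant()) are hypothetical, not existing Lustre functions:

          /*
           * Naive grant check: re-sum the accounted usage of every PROJID
           * in the aggregated range and refuse a new qunit once the
           * aggregate limit would be exceeded.  Hypothetical sketch only.
           */
          #include <stdint.h>
          #include <stdbool.h>

          /* hypothetical helper: usage already accounted for one PROJID */
          extern uint64_t projid_usage(uint32_t projid);

          static bool aggregate_can_grant(uint32_t start, uint32_t end,
                                          uint64_t limit, uint64_t qunit)
          {
                  uint64_t total = 0;
                  uint32_t id;

                  for (id = start; id <= end; id++)
                          total += projid_usage(id);

                  return total + qunit <= limit;
          }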

      Some open questions include:

      • What is a good name for this feature? "Aggregate quota", "parent quota", "collective quota", something else? "Group quota" is already used.
      • Should this allow aggregating quotas from multiple arbitrary, disjoint project IDs? Or would it be enough to allow a range of IDs (e.g. 1000-1999 or 100000-199999) to be aggregated into a single parent quota limit (see the sketch after this list)?
      • Should an actual project ID (e.g. 1000 or 100000) be allocated as the placeholder for the "aggregate" quota accounting/limits? Or should this grouping be "virtual", without an actual ID (which might cause confusion between the actual "p=1000" usage and the aggregate usage)?
      • Should it be possible to aggregate a hierarchy of projects (e.g. 1000-1999 into one parent A, 3000-3999 into a second parent B, then add "A" and "B" into a third aggregate)? It may be that we can add this afterward on an as-needed basis if it isn't easily done in the initial implementation.
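
      As a strawman for the range-based questions above, one possible shape for an aggregate descriptor; all names and fields here are hypothetical, not existing Lustre structures. The qa_holder field covers the placeholder question either way: it could hold an allocated PROJID, or 0 for a "virtual" grouping.

          #include <stdint.h>
          #include <stdbool.h>

          struct quota_aggregate {
                  uint32_t qa_start;   /* first PROJID in the range, e.g. 1000 */
                  uint32_t qa_end;     /* last PROJID in the range, e.g. 1999 */
                  uint32_t qa_holder;  /* placeholder PROJID, or 0 if "virtual" */
                  uint64_t qa_limit;   /* aggregate block hard limit */
          };

          /* a contiguous range keeps the membership test trivial */
          static bool aggr_contains(const struct quota_aggregate *qa,
                                    uint32_t projid)
          {
                  return projid >= qa->qa_start && projid <= qa->qa_end;
          }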

          Activity

            "Sergey Cheremencev <scherementsev@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/57269
            Subject: LU-18222 quota: project Pool Quotas POC
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: ad39a65d13a625d68f1a3c9e2c57a3f351d9f361

            mrasobarnett Matt Rásó-Barnett added a comment -

                So I hope it won't be a problem to have also "Project Pool Quotas" for inodes (number of files).

            Yes, that would be excellent. Having this for inodes as well would be very beneficial, as an admin often wants to set inode limits alongside capacity limits, so having a quota limit on both makes sense.

            scherementsev Sergey Cheremencev added a comment -

            This looks pretty similar to what we have already done in OST Pool Quotas. In the case of project ranges we should respect not only the separate project ID limits (i.e. apply the lowest one) but Pool Quotas as well.

            I haven't done any deep investigation, but at first look we need some universal Pools under which we would have OST Pool Quotas, MDT Pool Quotas, Project Pool Quotas, UID Pool Quotas, etc., looking at this problem from a technical point of view. If I'm right (it should be clear after a short investigation), we would just need to implement a configuration mechanism for quota project ranges. At the lower level we would handle project ranges together with the existing OST Pool Quotas. In other words, we would need to clone the existing OST pool configuration code for ranges of quota projects/UIDs/GIDs.

            Right now OST Pool Quotas look up the different OST pools and apply the minimum limit. Nothing needs to change in this code if we add "Project Pool Quotas" there - it would also be a list of pools, so no difference at first look. The only extra effort is to start supporting pool inode limits, which don't exist right now because we don't have MDT Pools. However, I designed OST Pool Quotas so that inode support could be added later as simply as possible. So I hope it won't be a problem to also have "Project Pool Quotas" for inodes (number of files).
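
            As a sketch of the "apply the minimum limit" lookup described above (hypothetical types, not the actual OST Pool Quota code), the effective limit for an ID would be the smallest non-zero limit among its own limit and the limits of every pool it belongs to:

                #include <stdint.h>
                #include <stddef.h>

                struct pool_limit {
                        const char *pl_name;      /* pool name */
                        uint64_t    pl_hardlimit; /* 0 means "no limit set" */
                };

                static uint64_t effective_limit(const struct pool_limit *pools,
                                                size_t npools, uint64_t id_limit)
                {
                        uint64_t min = id_limit;  /* per-ID limit, 0 if unset */
                        size_t i;

                        for (i = 0; i < npools; i++) {
                                uint64_t lim = pools[i].pl_hardlimit;

                                if (lim != 0 && (min == 0 || lim < min))
                                        min = lim;
                        }
                        return min;               /* 0 still means unlimited */
                }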

            adilger Andreas Dilger added a comment -

            It should still be possible to set lower limits on individual PROJIDs, and those would apply first. It is OK if the sum of the individual PROJID limits is larger than the aggregate range limit; whichever limit is hit first would apply.

            Possibly the overall limit would be set on the first PROJID in the range, so that it is compatible with the expected initial implementation of a single PROJID per nodemap; then, when this feature is enabled, the limit would apply to the whole range?

            One potential issue is that the number of IDs in a quota range can be relatively large (100k), so adding them up for each qunit grant may be too much overhead? It may only be necessary to accumulate the range quota occasionally, when the quota usage is getting close to the limit, or to accumulate all the PROJIDs in a range once at mount time and keep the aggregate value updated incrementally whenever a new qunit is granted for any PROJID in that range?
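
            A minimal sketch of that incremental idea (hypothetical names, not Lustre code): build the aggregate sum once at mount, then adjust it on every qunit grant or release instead of re-walking up to 100k PROJIDs each time:

                #include <stdint.h>
                #include <stdbool.h>

                struct aggr_quota {
                        uint32_t aq_start;  /* first PROJID in the range */
                        uint32_t aq_end;    /* last PROJID in the range */
                        uint64_t aq_limit;  /* aggregate hard limit */
                        uint64_t aq_usage;  /* cached sum, built at mount */
                };

                /* hypothetical helper: usage accounted for one PROJID */
                extern uint64_t projid_usage(uint32_t projid);

                static void aggr_init(struct aggr_quota *aq)
                {
                        uint32_t id;

                        aq->aq_usage = 0;
                        for (id = aq->aq_start; id <= aq->aq_end; id++)
                                aq->aq_usage += projid_usage(id);
                }

                /* qunit granted (@bytes > 0) or released (@bytes < 0) */
                static bool aggr_grant(struct aggr_quota *aq, uint32_t projid,
                                       int64_t bytes)
                {
                        if (projid < aq->aq_start || projid > aq->aq_end)
                                return true;  /* not part of this aggregate */

                        if (bytes > 0 &&
                            aq->aq_usage + (uint64_t)bytes > aq->aq_limit)
                                return false; /* aggregate limit reached */

                        aq->aq_usage += (uint64_t)bytes;
                        return true;
                }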

            mrasobarnett Matt Rásó-Barnett added a comment -

            Yes, agreed on this point - only using project ID quotas makes the management of this much simpler. My GID example was from a pre-project-quota era system, so it is not really relevant any more.

            adilger Andreas Dilger added a comment -

            mrasobarnett, if both PROJID and GID (or UID, or OST pool, or aggregate) quota limits are used at the same time, then the lowest quota limit of any ID will apply and prevent the user from writing any new data.

            As such, when there is a need for multiple independent quota limits (e.g. home/, scratch/, project/), I usually recommend that only project quota be used, with multiple independent project IDs for each part of the filesystem (e.g. PROJID=UID for home/, PROJID=(UID+100000) for scratch/, PROJID=(UID+200000) for project/).
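
            A tiny illustration of that PROJID convention, using the example offsets from the comment above (the enum and function are hypothetical): each filesystem area gets a disjoint PROJID namespace, giving every user three independent limits.

                #include <stdint.h>

                enum fs_area { AREA_HOME, AREA_SCRATCH, AREA_PROJECT };

                static uint32_t area_projid(enum fs_area area, uint32_t uid)
                {
                        switch (area) {
                        case AREA_HOME:
                                return uid;           /* home/: PROJID = UID */
                        case AREA_SCRATCH:
                                return uid + 100000;  /* scratch/ */
                        case AREA_PROJECT:
                                return uid + 200000;  /* project/ */
                        }
                        return 0;                     /* not reached */
                }
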
            mrasobarnett Matt Rásó-Barnett added a comment -

            • I personally prefer 'aggregate quota' or 'compound quota'.
            • Allowing disjoint IDs would be a nice feature if it is no more complex to implement. You could, for example, use this to create an ad-hoc 'departmental' quota over a set of disparate project directories created at different times (so the PROJIDs may not be in a range).
            • Could a project ID be part of multiple 'aggregate quotas'? This would be ideal; the aggregates could then represent different business logic, for example if a given project was part of two different departments for accounting purposes.
            • I'm not sure whether there should be a numeric aggregate ID or not - what other options could there be; could you have an alphanumeric ID for the aggregations? If a numeric ID, I think it should be a distinct ID completely unrelated to the individual project/user/group IDs within the aggregate.
            • A hierarchy of aggregates could be a very nice extension in the future (see the sketch after this comment).
            • Adding user/group ID aggregates could be useful - mostly, in my view, for helping an administrator model their organisational breakdown when allocating storage.

            An example of this from experience: an administrator could use a GID quota on each user's primary GID to allocate a private per-user scratch directory to each user, and then use project ID quotas to manage allocations for shared project directories (this assumes the project directories have a separate project GID and use the setgid bit, so that files created within them do not use the user's primary GID). An aggregate GID quota combining the primary GIDs of a set of users would make it possible to allocate a research group a portion of per-user scratch space, and then, separately, a project ID aggregation could set a limit on the department's shared research space.

            Aggregations are very useful conceptually to model the way storage is often used - to continue the example, the department's aggregate project ID allocation may then be broken down into multiple individual project ID quotas for the different research groups underneath. Perhaps the department has a 1PiB aggregate PROJID quota, made up of 100TiB for group A, 300TiB for group B, 50TiB for group C, and so on.
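
            A sketch of such a hierarchy (hypothetical structure, not Lustre code): each aggregate may have a parent, and a grant must fit at every level, e.g. within a group's 100TiB limit and within the department's 1PiB limit:

                #include <stdint.h>
                #include <stdbool.h>
                #include <stddef.h>

                struct aggregate {
                        struct aggregate *ag_parent; /* NULL at the top level */
                        uint64_t          ag_limit;
                        uint64_t          ag_usage;
                };

                static bool aggr_fits(const struct aggregate *ag, uint64_t bytes)
                {
                        /* walk up the hierarchy; every level needs headroom */
                        for (; ag != NULL; ag = ag->ag_parent)
                                if (ag->ag_usage + bytes > ag->ag_limit)
                                        return false;
                        return true;
                }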

            adilger Andreas Dilger added a comment -

            It could also be possible to aggregate quota by UID/GID for a given range of IDs within a nodemap. I'm not sure if that provides any additional useful functionality over PROJID aggregation, since the "total" limit would presumably be the same?

            People

              Assignee: scherementsev Sergey Cheremencev
              Reporter: adilger Andreas Dilger
              Votes: 0
              Watchers: 9