[LU-9555] "df /path/to/project" should return projid-specific values Created: 24/May/17  Updated: 06/Jan/24  Resolved: 03/Dec/20

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.10.0
Fix Version/s: Lustre 2.14.0

Type: Improvement Priority: Minor
Reporter: Andreas Dilger Assignee: Wang Shilong (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-15721 projid quota limit statfs() on direct... Resolved
is related to LU-4017 Add project quota support feature Resolved
is related to LU-12480 add STATX_PROJID to upstream kernel Open
is related to LU-7236 connections on demand Resolved
is related to LU-10018 MDT as a statfs proxy Resolved
is related to LU-17395 df -h is limited by project quota eve... Open
is related to LUDOC-460 document statfs functionality for pro... Resolved
Rank (Obsolete): 9223372036854775807

 Description   

With local ext4 and XFS filesystems, it is possible to use "df /path/to/directory" to return the current quota usage for the projid associated with that directory as "used", and min(projid quota limit, free space) as "total". This is a natural interface for users/applications, since it represents the used/maximum space for that subdirectory. Otherwise, the user will get EDQUOT back when the project quota runs out for that directory and applications will not be able to figure out how much data they could write into that directory.

static int ext4_statfs(struct dentry *dentry, struct kstatfs *buf)
{
        buf->f_type = EXT4_SUPER_MAGIC;
        buf->f_bsize = sb->s_blocksize;
        buf->f_blocks = ext4_blocks_count(es) - EXT4_C2B(sbi, overhead);
        buf->f_bfree = EXT4_C2B(sbi,
                percpu_counter_sum_positive(&sbi->s_freeclusters_counter) -
                percpu_counter_sum_positive(&sbi->s_dirtyclusters_counter));
        buf->f_bavail = buf->f_bfree -
                        (ext4_r_blocks_count(es) + resv_blocks);
        if (buf->f_bfree < (ext4_r_blocks_count(es) + resv_blocks))
                buf->f_bavail = 0;
        buf->f_files = le32_to_cpu(es->s_inodes_count);
        buf->f_ffree = percpu_counter_sum_positive(&sbi->s_freeinodes_counter);

#ifdef CONFIG_QUOTA
        if (ext4_test_inode_flag(dentry->d_inode, EXT4_INODE_PROJINHERIT) &&
            sb_has_quota_limits_enabled(sb, PRJQUOTA))
                ext4_statfs_project(sb, EXT4_I(dentry->d_inode)->i_projid, buf);
#endif
}
/*
 * Directory tree accounting is implemented using project quotas, where
 * the project identifier is inherited from parent directories.
 * A statvfs (df, etc.) of a directory that is using project quota should
 * return a statvfs of the project, not the entire filesystem.
 * This makes such trees appear as if they are filesystems in themselves.
 */
void xfs_fill_statvfs_from_dquot(struct kstatfs *statp, struct xfs_dquot *dqp)
{
        uint64_t                limit;

        limit = dqp->q_core.d_blk_softlimit ?
                be64_to_cpu(dqp->q_core.d_blk_softlimit) :
                be64_to_cpu(dqp->q_core.d_blk_hardlimit);
        if (limit && statp->f_blocks > limit) {
                statp->f_blocks = limit;
                statp->f_bfree = statp->f_bavail =
                        (statp->f_blocks > dqp->q_res_bcount) ?
                         (statp->f_blocks - dqp->q_res_bcount) : 0;
        }

        limit = dqp->q_core.d_ino_softlimit ?
                be64_to_cpu(dqp->q_core.d_ino_softlimit) :
                be64_to_cpu(dqp->q_core.d_ino_hardlimit);
        if (limit && statp->f_files > limit) {
                statp->f_files = limit;
                statp->f_ffree = (statp->f_files > dqp->q_res_icount) ?
                         (statp->f_ffree - dqp->q_res_icount) : 0;
        }
}

void xfs_qm_statvfs(xfs_inode_t *ip, struct kstatfs *statp)
{
        if (!xfs_qm_dqget(mp, NULL, xfs_get_projid(ip), XFS_DQ_PROJ, 0, &dqp)) {
                xfs_fill_statvfs_from_dquot(statp, dqp);
                xfs_qm_dqput(dqp);
        }
}
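
From userspace, the fields filled in by the functions above are what statvfs(3) returns and what "df" prints. The following is a minimal sketch of that view, assuming a POSIX system; the struct df_view type and the df_view_of() helper are hypothetical names used only for illustration.

```c
#include <stdint.h>
#include <sys/statvfs.h>

/* Hypothetical container for the three numbers "df" reports, in KiB. */
struct df_view {
	uint64_t total_kb;
	uint64_t used_kb;
	uint64_t avail_kb;
};

/* Fill *out from statvfs(path); returns 0 on success, -1 on failure. */
int df_view_of(const char *path, struct df_view *out)
{
	struct statvfs sv;
	uint64_t unit;

	if (statvfs(path, &sv) != 0)
		return -1;

	/* f_blocks, f_bfree and f_bavail are counted in f_frsize units. */
	unit = sv.f_frsize ? sv.f_frsize : sv.f_bsize;
	out->total_kb = (uint64_t)sv.f_blocks * unit / 1024;
	out->used_kb = (uint64_t)(sv.f_blocks - sv.f_bfree) * unit / 1024;
	out->avail_kb = (uint64_t)sv.f_bavail * unit / 1024;
	return 0;
}
```

On a directory with an inherited projid and a quota limit, ext4 and XFS would report the project limit as total_kb here rather than the size of the whole filesystem.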

It would be useful to be able to do the same with Lustre. One option would be to transfer the projid to the MDS and OSS in the OBD_STATFS request in order to get the space usage for that projid directly. Currently the client RPC request body is empty, so there is no place to put the projid, but a new body could be added. The other option would be to check for a projid on the inode in ll_statfs() and ll_obd_statfs(), and if projid != 0 send the equivalent of "lfs quota -p" for that projid, and use the result for the statfs "used" and "total" values.
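
The second option depends on knowing the projid of the inode. In the kernel this would come straight from the inode, but the same information can be read from userspace, which may help when testing. Below is a rough sketch using the generic Linux FS_IOC_FSGETXATTR ioctl; the get_projid() helper is a hypothetical name, and whether a given filesystem or client version supports the ioctl is an assumption to verify.

```c
#include <fcntl.h>
#include <linux/fs.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Read the project ID of `path` and whether new files inherit it.
 * Returns 0 on success, -1 if the path cannot be opened or the
 * filesystem does not support FS_IOC_FSGETXATTR.
 */
int get_projid(const char *path, uint32_t *projid, int *inherit)
{
	struct fsxattr fsx;
	int fd = open(path, O_RDONLY);
	int rc;

	if (fd < 0)
		return -1;

	rc = ioctl(fd, FS_IOC_FSGETXATTR, &fsx);
	close(fd);
	if (rc != 0)
		return -1;

	*projid = fsx.fsx_projid;
	*inherit = (fsx.fsx_xflags & FS_XFLAG_PROJINHERIT) != 0;
	return 0;
}
```

A projid of 0 would mean that no project is assigned, in which case the normal filesystem-wide statfs values apply.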



 Comments   
Comment by Andreas Dilger [ 16/Sep/17 ]

Another option that is being discussed in the context of LU-7236 (idle clients disconnect from servers) is to have the client send the STATFS RPCs only to MDT0000, and the MDS can aggregate the statfs data to send back to the client (not for lfs df, however).

The MDS already sends STATFS RPCs to the OSTs every few seconds for space balancing allocations, so it isn't much extra work to aggregate this for the client. The quota master is also on the MDS with MDT0000 so getting the projid usage at the same time would be relatively easy.

Comment by Andreas Dilger [ 16/Dec/17 ]

Sebastien, I think this would be useful for your virtualization work. If a projid is used for each container (subdir mount and nodemap) then df in that container can be controlled by the project quota directly, and the container will only see as much space as it is assigned.

It would also benefit from LU-9982, to be able to control the layout within a container.

Comment by Andreas Dilger [ 16/May/18 ]

Shilong, is there any plan for DDN to work on this?

I think it would be good for Lustre project quota to be consistent with ext4/XFS project quota, and it also could help virtualization so that "df" with a nodemap+project quota only shows space usage for that project.

Comment by Wang Shilong (Inactive) [ 16/May/18 ]

Hi Andreas,

I will try to work on this.

Thanks,
Shilong

Comment by Andreas Dilger [ 02/Nov/19 ]

Hi Shilong, is there any chance you could look into this? We are close to finishing 2.13 and will soon open master for 2.14 feature landings. I don't think it will be a huge amount of work, just doing quota lookups in the statfs() call on the client to limit the maximum used/free for that quota ID. This would have to be done at the llite level before the results are returned to userspace, since it can't be cached at lower layers due to other project IDs having different quota limits, unless we wanted to save one projid statfs result in cache for a second to avoid repeated quota lookups for the same user.
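
The single-entry cache mentioned above could look roughly like the following. This is only a sketch in plain C, not llite code: the struct and function names are hypothetical, the one-second lifetime is taken from the comment, and the caller passes in the current time.

```c
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

#define PROJ_STATFS_MAX_AGE 1	/* seconds a cached result stays valid */

/* Hypothetical subset of the statfs values worth caching. */
struct cached_statfs {
	uint64_t blocks;
	uint64_t bfree;
};

/* One cached result, keyed by the projid it was computed for. */
struct projid_statfs_cache {
	bool valid;
	uint32_t projid;
	time_t stamp;
	struct cached_statfs data;
};

/* Return true and copy the data out if the entry matches and is fresh. */
bool cache_lookup(const struct projid_statfs_cache *c, uint32_t projid,
		  time_t now, struct cached_statfs *out)
{
	if (!c->valid || c->projid != projid)
		return false;
	if (now < c->stamp || now - c->stamp > PROJ_STATFS_MAX_AGE)
		return false;
	*out = c->data;
	return true;
}

/* Replace whatever was cached with a new result for `projid`. */
void cache_store(struct projid_statfs_cache *c, uint32_t projid,
		 time_t now, const struct cached_statfs *in)
{
	c->valid = true;
	c->projid = projid;
	c->stamp = now;
	c->data = *in;
}
```

Keying the entry by projid keeps a lookup for one project from returning limits that belong to another, which is the concern raised about caching at lower layers.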

Comment by Wang Shilong (Inactive) [ 04/Nov/19 ]

We might need to pass the project ID in struct obd_statfs, for example using os_spare3, and then
send a quotactl in the LMV/LOV layer with os_projid and limit the space there.

This might be a little tricky, as obd_statfs is expected to get data, but we use it as arg input.

What do you think?

Comment by Wang Shilong (Inactive) [ 05/Nov/19 ]

Maybe we don't need to pass the project ID down. Just sending an extra RPC might be simpler and more direct, so let me try that.

Comment by Andreas Dilger [ 05/Nov/19 ]

I don't think we need to get the project quota info from the lower layers of osc_statfs() and mdc_statfs(). Instead, it should get the normal quota information in ll_statfs() via quotactl_ioctl() and then apply limits to osfs returned from ll_statfs_internal() in a similar manner to how ext4_statfs() calls ext4_statfs_project().

Comment by Wang Shilong (Inactive) [ 05/Nov/19 ]

Andreas, you missed the 'lfs df' case.

Comment by Andreas Dilger [ 05/Nov/19 ]

It should be possible for "lfs df" to pass a new flag like "LL_STATFS_PROJECT" (all the "LL_STATFS_*" flags should be moved into e.g. "enum ll_statfs_flags" to make it easier to find them), which asks the kernel to get the project ID for the inode in ll_obd_statfs(), and then do the quota call from the kernel for each user+OST (which I think is possible).  The user doesn't have to send an explicit project ID from userspace, this can be found in the kernel from the inode.

No projid-specific handling for "lfs df".

Comment by Wang Shilong (Inactive) [ 05/Nov/19 ]

I have a further question about the 'lfs df' case:

[root@server_el7_vm1 lustre]# lfs df /mnt/lustre
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      5825660       47240     5255588   1% /mnt/lustre[MDT:0]
lustre-OST0000_UUID      9662472       38040     9083760   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      9662472       38044     9083756   1% /mnt/lustre[OST:1]
lustre-OST0002_UUID      9662472       38044     9083756   1% /mnt/lustre[OST:2]
lustre-OST0003_UUID      9662472       38044     9083756   1% /mnt/lustre[OST:3]

filesystem_summary:     38649888      152172    36335028   1% /mnt/lustre

What exactly should we report as the 'avail' space and the total space for each OST/MDT?
I suppose we might need to get each slave's granted space as the total space and calculate
the avail space based on that?

Also, the existing quotactl doesn't report that information back, so we might need to extend it to support that?

Comment by Andreas Dilger [ 06/Nov/19 ]

I think to be consistent with ext4 and XFS the total space would be the quota limit for that projid, the free space is the remaining quota space for that projid, and the available space is based on the quota softlimit, and the same for inodes, something like:

        if (inode has project limit) {
                f_blocks = dqb_bhardlimit;
                f_bfree = min(dqb_bhardlimit - dqb_curspace, f_bfree);
                f_bavail = min(dqb_bsoftlimit - dqb_curspace, f_bavail);
                f_files = dqb_ihardlimit;
                f_ffree = min(dqb_ihardlimit - dqb_curinodes, f_ffree);
        }
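
The pseudocode above can be expanded into a compilable sketch. The types and names below are hypothetical stand-ins rather than Lustre or kernel structures. The sketch assumes block limits are in 1 KiB units and usage is in bytes (as in the Linux struct if_dqblk), treats a limit of 0 as "unset", falls back to the hard limit when no soft limit is set, and saturates at zero when usage exceeds a limit.

```c
#include <stdint.h>

/* Hypothetical project quota record. */
struct proj_quota {
	uint64_t bhardlimit;	/* block hard limit in KiB, 0 = unset */
	uint64_t bsoftlimit;	/* block soft limit in KiB, 0 = unset */
	uint64_t curspace;	/* current usage in bytes */
	uint64_t ihardlimit;	/* inode hard limit, 0 = unset */
	uint64_t curinodes;	/* current inode usage */
};

/* Hypothetical statfs result, with counts in units of bsize bytes. */
struct fs_stat {
	uint64_t bsize;
	uint64_t blocks, bfree, bavail;
	uint64_t files, ffree;
};

static uint64_t sub_sat(uint64_t a, uint64_t b) { return a > b ? a - b : 0; }
static uint64_t min_u64(uint64_t a, uint64_t b) { return a < b ? a : b; }

/* Limit the filesystem-wide values in *st to what the project allows. */
void clamp_statfs_to_project(struct fs_stat *st, const struct proj_quota *q)
{
	if (q->bhardlimit != 0 && st->bsize != 0) {
		uint64_t hard = q->bhardlimit * 1024 / st->bsize;
		uint64_t soft = q->bsoftlimit ?
				q->bsoftlimit * 1024 / st->bsize : hard;
		/* Round usage up so a partly used block counts as used. */
		uint64_t used = (q->curspace + st->bsize - 1) / st->bsize;

		st->blocks = hard;
		st->bfree = min_u64(sub_sat(hard, used), st->bfree);
		st->bavail = min_u64(sub_sat(soft, used), st->bavail);
	}
	if (q->ihardlimit != 0) {
		st->files = q->ihardlimit;
		st->ffree = min_u64(sub_sat(q->ihardlimit, q->curinodes),
				    st->ffree);
	}
}
```

For example, with a 4096-byte block size, a 4096 KiB hard limit, a 2048 KiB soft limit and 1 MiB in use, this reports 1024 total blocks, 768 free and 256 available.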

I thought that there is already a way for the "lfs quota" command to print per-OST usage amounts? If there isn't a "granted" field, then it would be possible to just take the unused quota space and divide it evenly across all of the OSTs, so that it adds up to the total unused quota space again at the end.
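
The even split suggested above can be sketched as follows, as an illustration only. The split_unused_quota() helper is a hypothetical name; it gives the remainder to the first few targets so that the per-OST shares add up exactly to the unused quota.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Divide `unused` quota units across `n` targets. Each target gets
 * unused / n, and the first unused % n targets get one extra unit,
 * so the shares always sum to `unused`.
 */
void split_unused_quota(uint64_t unused, uint64_t *share, size_t n)
{
	uint64_t base, extra;
	size_t i;

	if (n == 0)
		return;

	base = unused / n;
	extra = unused % n;
	for (i = 0; i < n; i++)
		share[i] = base + (i < extra ? 1 : 0);
}
```

With 10 unused units and 4 OSTs this yields shares of 3, 3, 2 and 2.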

Comment by Gerrit Updater [ 06/Nov/19 ]

Wang Shilong (wshilong@ddn.com) uploaded a new patch: https://review.whamcloud.com/36685
Subject: LU-9555 quota: df should return projid-specific values
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 844d2d2f519c9cffdf10ad637a194bdef1efd3e3

Comment by Andreas Dilger [ 14/Nov/19 ]

It would also be good to submit a patch to the upstream kernel to include projid into the statx() output. That is tracked under LU-12480.

Comment by Gerrit Updater [ 03/Dec/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36685/
Subject: LU-9555 quota: df should return projid-specific values
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: e5c8f6670fbeea9ec8c6092dfa4369508da54485

Comment by Peter Jones [ 03/Dec/20 ]

Landed for 2.14

Comment by Stephane Thiell [ 25/Feb/22 ]

Hello,
I tried to apply the patch on top of b2_12 but as some prototypes have changed, I'm a bit unsure on how to do it in a good way(tm). Would Whamcloud consider a backport to 2.12? We would LOVE to have this feature on our systems. Thanks for considering it!

Generated at Sat Feb 10 02:27:13 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.