[LU-3220] Quota doesn't work right with DNE Created: 24/Apr/13  Updated: 16/Oct/13  Resolved: 11/Jul/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: Lustre 2.5.0

Type: Bug Priority: Major
Reporter: Sarah Liu Assignee: Niu Yawei (Inactive)
Resolution: Fixed Votes: 0
Labels: dne

Severity: 3
Rank (Obsolete): 7862

 Description   

Here are the test steps:
1. setup the system with 2 MDTs and quota enabled.
2. setup inode quota for user/group quota_usr

/usr/bin/lfs setquota -u quota_usr -b 0 -B 10M -i 0 -I 5 /mnt/lustre
/usr/bin/lfs setquota -g quota_usr -b 0 -B 10M -i 0 -I 5 /mnt/lustre

3. check user quota; the output shows that MDT0 has limit 5 while MDT1 has limit 0

[root@client-5 tests]# lfs quota -v -u quota_usr /mnt/lustre
Disk quotas for user quota_usr (uid 60000):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
    /mnt/lustre       4       0   10240       -       1       0       5       -
lustre-MDT0000_UUID
                      4       -       0       -       1       -       5       -
lustre-MDT0001_UUID
                      0       -       0       -       0       -       0       -
lustre-OST0000_UUID
                      0       -       0       -       -       -       -       -

4. create 2 dirs "test" and "test2"; "test" is on the default MDT, "test2" is on the second MDT (created with "lfs mkdir -i 1 ...")
5. chown these 2 dirs to quota_usr
6. touching a file under dir "test" works, but touching one under "test2" fails with "Disk quota exceeded"

[root@client-5 tests]# chown quota_usr.quota_usr /mnt/lustre/test
[root@client-5 tests]# chown quota_usr.quota_usr /mnt/lustre/test2/
[root@client-5 tests]# lfs quota -v -u quota_usr /mnt/lustre
Disk quotas for user quota_usr (uid 60000):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
    /mnt/lustre       8       0   10240       -       2       0       5       -
lustre-MDT0000_UUID
                      4       -       0       -       1       -       5       -
lustre-MDT0001_UUID
                      4       -       0       -       1*      -       1       -
lustre-OST0000_UUID
                      0       -       0       -       -       -       -       -
[root@client-5 tests]# ./runas -u 60000 -g 60000 touch /mnt/lustre/test2/a1
running as uid/gid/euid/egid 60000/60000/60000/60000, groups:
 [touch] [/mnt/lustre/test2/a1]
touch: cannot touch `/mnt/lustre/test2/a1': Disk quota exceeded
[root@client-5 tests]# ./runas -u 60000 -g 60000 touch /mnt/lustre/test/a1
running as uid/gid/euid/egid 60000/60000/60000/60000, groups:
 [touch] [/mnt/lustre/test/a1]
[root@client-5 tests]# 
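The transcript above starts at step 5; the directory creation in step 4 is not captured. A sketch of those commands, assuming the mount point and layout from the report (these require a live Lustre mount with 2 MDTs, so they are illustrative only):

```shell
# Step 4: create one directory on each MDT
mkdir /mnt/lustre/test              # lands on MDT0, the default MDT
lfs mkdir -i 1 /mnt/lustre/test2    # -i 1 places the new dir on MDT1

# Step 5: hand both directories to the quota-limited user
chown quota_usr:quota_usr /mnt/lustre/test /mnt/lustre/test2
```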


 Comments   
Comment by Johann Lombardi (Inactive) [ 24/Apr/13 ]

Niu, any thoughts?

Comment by Niu Yawei (Inactive) [ 25/Apr/13 ]

I think this is expected, since our minimal iunit size is 1K inodes. The limit in this test is much smaller than one iunit, so the whole limit ends up held by a single MDT (the revoke glimpse won't be triggered in this case, because we've already reached the minimal iunit).

As long as we keep using a 1K minimal iunit, tests should use inode limits in units of K. (see the quota DNE tests in sanity-quota, test_7e & test_12b)
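The arithmetic behind this can be checked directly; a minimal shell sketch using the numbers from this report (min iunit of 1024 inodes per the 27/May/13 correction below, 2 MDTs, the -I 5 limit from step 2):

```shell
# Minimum inode qunit a single target can hold (1K = 1024 inodes).
MIN_IUNIT=1024
NUM_MDTS=2
LIMIT=5   # the -I 5 limit from step 2 of the report

# The entire limit fits inside one minimum-size iunit, so the first MDT to
# acquire quota holds all of it, and it is never shrunk below the minimum.
if [ "$LIMIT" -le "$MIN_IUNIT" ]; then
    echo "limit $LIMIT <= min iunit $MIN_IUNIT: one MDT can hold the entire limit"
fi

# For the limit to be spreadable across all MDTs, it should be at least:
echo "recommended minimum inode limit: $((MIN_IUNIT * NUM_MDTS))"
```

With 2 MDTs this gives 2048, which matches the advice to set inode limits in units of K.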

Comment by Peter Jones [ 25/Apr/13 ]

Niu

Is there anything that we can do to improve this test while running under DNE? Should we improve our documentation to make it easier to understand what to expect when using DNE in conjunction with quotas?

Peter

Comment by Niu Yawei (Inactive) [ 26/Apr/13 ]

Peter, this is mentioned in Lustre manual:

"To reduce quota requests, quota space is initially allocated to QSDs in very large chunks. How much unused quota space can be held by a target is controlled by the qunit size. When quota space for a given ID is close to exhaustion on the QMT, the qunit size is reduced and QSDs are notified of the new qunit size value via a glimpse callback. Slaves are then responsible for releasing quota space above the new qunit value. The qunit size isn't shrunk indefinitely and there is a minimal value of 1MB for blocks and 1,000 for inodes. This means that the quota space rebalancing process will stop when this minimum value is reached. As a result, quota exceeded can be returned while many slaves still have 1MB or 1,000 inodes of spare quota space."

Comment by Johann Lombardi (Inactive) [ 26/Apr/13 ]

I would advise adding a real-life example with DNE to the manual.
Besides, I wonder whether we should add a warning when someone sets a limit smaller than the [bi]unit size.

Niu, what do you think?

Comment by Niu Yawei (Inactive) [ 26/Apr/13 ]

The quota commands used for DNE are exactly the same as for non-DNE, so I don't think a DNE example would be any different from a non-DNE one.

OK, I will add a warning message for setting a small limit, and maybe we could extend the help description of 'lfs setquota' to explain this situation as well.
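Such a warning could work along these lines. A hypothetical shell sketch (the function name and wording are illustrative; this is not the actual change at review 6182):

```shell
# Hypothetical pre-check before calling 'lfs setquota': warn when the
# requested inode hard limit is non-zero but below the minimum qunit a
# single target can hold (1K inodes), since such a limit cannot be
# rebalanced across multiple MDTs.
MIN_IUNIT=1024

warn_small_ilimit() {
    ilimit=$1
    if [ "$ilimit" -gt 0 ] && [ "$ilimit" -lt "$MIN_IUNIT" ]; then
        echo "warning: inode hard limit $ilimit is smaller than the" \
             "minimum qunit ($MIN_IUNIT); quota may be exhausted on one" \
             "MDT while other MDTs still hold spare quota" >&2
    fi
}

warn_small_ilimit 5      # the -I 5 limit from this report: warns
warn_small_ilimit 2048   # large enough for 2 MDTs: silent
```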

Comment by Niu Yawei (Inactive) [ 27/Apr/13 ]

http://review.whamcloud.com/6182

Comment by Sarah Liu [ 24/May/13 ]

I reran this test with the inode limit set to 1005 and still hit the same issue.

[root@client-15 tests]# ./runas -u 60000 -g 60000 touch /mnt/lustre/test2/a1008
running as uid/gid/euid/egid 60000/60000/60000/60000, groups:
 [touch] [/mnt/lustre/test2/a1008]
touch: cannot touch `/mnt/lustre/test2/a1008': Disk quota exceeded
[root@client-15 tests]# ./runas -u 60000 -g 60000 touch /mnt/lustre/test2/a1008
running as uid/gid/euid/egid 60000/60000/60000/60000, groups:
 [touch] [/mnt/lustre/test2/a1008]
touch: cannot touch `/mnt/lustre/test2/a1008': Disk quota exceeded
[root@client-15 tests]# lfs quota -v -u quota_usr /mnt/lustre
Disk quotas for user quota_usr (uid 60000):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
    /mnt/lustre      52       0   10240       -       2       0    1005       -
lustre-MDT0000_UUID
                     48       -       0       -       1       -    1005       -
lustre-MDT0001_UUID
                      4       -       0       -       1*      -       1       -
lustre-OST0000_UUID
                      0       -       0       -       -       -       -       -
[root@client-15 tests]# ./runas -u 60000 -g 60000 touch /mnt/lustre/test/a1008
running as uid/gid/euid/egid 60000/60000/60000/60000, groups:
 [touch] [/mnt/lustre/test/a1008]
[root@client-15 tests]# 
Comment by Niu Yawei (Inactive) [ 27/May/13 ]

Sarah, sorry, the minimum inode qunit size is 1K (1024 inodes), not 1000; I'll update the patch and manual. Thanks.
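This correction explains the 24/May/13 retest: 1005 is still below 1024, so the whole limit still fits inside one MDT's minimum qunit. A quick shell check:

```shell
# Why a 1005-inode limit still fails: the minimum iunit is 1K = 1024
# inodes (not 1000), so 1005 still fits inside a single MDT's minimum
# qunit and MDT1 is left with nothing to spare.
MIN_IUNIT=1024
[ 1005 -lt "$MIN_IUNIT" ] && echo "1005 is below the $MIN_IUNIT minimum: same failure"
[ 2048 -ge $((MIN_IUNIT * 2)) ] && echo "with 2 MDTs, a limit of 2048 can be split"
```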

Comment by Niu Yawei (Inactive) [ 11/Jul/13 ]

patch landed for 2.5
