[LU-4465] Quota doesn't work right after MDT online addition Created: 10/Jan/14  Updated: 14/Jan/14  Resolved: 14/Jan/14

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.6.0
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Sarah Liu Assignee: WC Triage
Resolution: Not a Bug Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 12236

 Description   

During testing of MDT online addition, I found that quota doesn't work as expected. Here are the test steps:

1. Format a system (ldiskfs) with only one MDT.
2. Set the inode quota to 2048.
3. Touch some files.
4. Keep the system online, then format and add the second MDT; quota is enabled on the new MDT.
5. Create a directory on the second MDT; trying to touch a file under it reports that the quota is exceeded.

I also verified that if the system is set up with 2 MDTs from the beginning, quota works as expected.

[root@client-18 tests]# lfs quota -v -u quota_usr /mnt/lustre
Disk quotas for user quota_usr (uid 60000):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
    /mnt/lustre      56       0   10240       -    1025       0    2048       -
lustre-MDT0000_UUID
                     56       -       0       -    1025       -    2048       -
lustre-MDT0001_UUID
                      0       -       0       -       0       -       0       -
lustre-OST0000_UUID
                      0       -       0       -       -       -       -       -
Total allocated inode limit: 2048, total allocated block limit: 0
[root@client-18 tests]# lfs quota -v -g quota_usr /mnt/lustre
Disk quotas for group quota_usr (gid 60000):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
    /mnt/lustre      56       0   10240       -    1025       0    2048       -
lustre-MDT0000_UUID
                     56       -       0       -    1025       -    2048       -
lustre-MDT0001_UUID
                      0       -       0       -       0       -       0       -
lustre-OST0000_UUID
                      0       -       0       -       -       -       -       -
Total allocated inode limit: 2048, total allocated block limit: 0

[root@client-18 tests]# lfs mkdir -i 1 /mnt/lustre/test2
/mnt/lustre/test2
[root@client-18 tests]# chown quota_usr.quota_usr /mnt/lustre/test2
[root@client-18 tests]# lfs quota -v -g quota_usr /mnt/lustre
Disk quotas for group quota_usr (gid 60000):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
    /mnt/lustre      60       0   10240       -    1026       0    2048       -
lustre-MDT0000_UUID
                     56       -       0       -    1025       -    2048       -
lustre-MDT0001_UUID
                      4       -       0       -       1*      -       1       -
lustre-OST0000_UUID
                      0       -       0       -       -       -       -       -
Total allocated inode limit: 2049, total allocated block limit: 0
[root@client-18 tests]# touch /mnt/lustre/test2/b
[root@client-18 tests]# lfs quota -v -g quota_usr /mnt/lustre
Disk quotas for group quota_usr (gid 60000):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
    /mnt/lustre      60       0   10240       -    1026       0    2048       -
lustre-MDT0000_UUID
                     56       -       0       -    1025       -    2048       -
lustre-MDT0001_UUID
                      4       -       0       -       1*      -       1       -
lustre-OST0000_UUID
                      0       -       0       -       -       -       -       -
Total allocated inode limit: 2049, total allocated block limit: 0
[root@client-18 tests]# ./runas -u 60000 -g 60000 touch /mnt/lustre/test2/b
running as uid/gid/euid/egid 60000/60000/60000/60000, groups:
 [touch] [/mnt/lustre/test2/b]
touch: cannot touch `/mnt/lustre/test2/b': Disk quota exceeded


 Comments   
Comment by Niu Yawei (Inactive) [ 10/Jan/14 ]

The total limit is 2k inodes and the least qunit is 1k. 1025 files were created on MDT0, so MDT0 has already used up the whole limit (2 qunits).
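
The arithmetic behind this can be sketched as follows. This is only a simplified model, assuming that limits are handed out in whole multiples of the least qunit; the function name `granted_limit` is hypothetical and is not part of Lustre, and the sketch ignores the request expansion done by the quota master.

```python
# Simplified model of qunit rounding; not Lustre code.
LEAST_QUNIT = 1024  # "the least qunit is 1k" inodes, as stated in the comment


def granted_limit(used_inodes: int, least_qunit: int = LEAST_QUNIT) -> int:
    """Return the smallest whole-qunit grant that covers used_inodes."""
    qunits = -(-used_inodes // least_qunit)  # integer ceiling division
    return qunits * least_qunit


total_limit = 2048                      # inode hard limit set in the test
mdt0_grant = granted_limit(1025)        # files already owned on lustre-MDT0000
remaining = total_limit - mdt0_grant    # what is left for any other MDT

print(f"MDT0000 grant: {mdt0_grant}, left for other MDTs: {remaining}")
```

Under these assumptions the 1025 files need two qunits, which matches the 2048 limit shown for lustre-MDT0000_UUID in the "lfs quota -v" output and leaves nothing for an MDT added later.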

I'm wondering why the test passes when both MDTs are present from the beginning. Could you post the output of "lfs quota -v" for that test as well? Thanks.

Comment by Sarah Liu [ 10/Jan/14 ]

Hello Niu,

So you mean the case with MDT online addition is right?

And here is the "lfs quota -v" output with both MDTs present from the beginning:

[root@client-18 tests]# lfs quota -v -g quota_usr /mnt/lustre
Disk quotas for group quota_usr (gid 60000):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
    /mnt/lustre      60       0   10240       -    1027       0    2048       -
lustre-MDT0000_UUID
                     56       -       0       -    1025       -    1537       -
lustre-MDT0001_UUID
                      4       -       0       -       2       -     511       -
lustre-OST0000_UUID
                      0       -       0       -       -       -       -       -
Total allocated inode limit: 2048, total allocated block limit: 0
[root@client-18 tests]# lfs quota -v -u quota_usr /mnt/lustre
Disk quotas for user quota_usr (uid 60000):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
    /mnt/lustre      60       0   10240       -    1027       0    2048       -
lustre-MDT0000_UUID
                     56       -       0       -    1025       -    1537       -
lustre-MDT0001_UUID
                      4       -       0       -       2       -     511       -
lustre-OST0000_UUID
                      0       -       0       -       -       -       -       -
Total allocated inode limit: 2048, total allocated block limit: 0
[root@client-18 tests]# 
Comment by Niu Yawei (Inactive) [ 13/Jan/14 ]

"So you mean the case with MDT online addition is right?"

I think so. There are only 2 least-qunit limits, and they are both used by MDT0, so MDT1 won't be able to create files.

I see why the test can pass with 2 MDTs from the beginning: the quota master tries to expand each quota acquire/pre-acquire request, and the expanded count varies with the number of quota slaves. In your test case (2k limit, two MDTs), creating 1025 files on MDT0 happens to allocate a limit of 1537 to MDT0, so fortunately the limit is not used up. But if you try to create 1026 files on MDT0, the whole limit will be used up and the test will fail.

I think the test could be changed a little bit:

  • create fewer files (< 1k) on MDT0; or
  • set a higher limit (>= 3k).
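
As a rough check of the two suggestions, the sketch below compares them against the original test parameters. It assumes a simplified model in which each MDT holds a whole number of least qunits (1k inodes) and the quota master's request expansion is ignored; `spare_for_other_mdts` is a hypothetical helper, not a Lustre function, and the file count of 1000 is just one example of "fewer than 1k".

```python
# Simplified model of the suggested test changes; not Lustre code.
LEAST_QUNIT = 1024  # least inode qunit of 1k, as described in this ticket


def spare_for_other_mdts(total_limit: int, mdt0_files: int,
                         least_qunit: int = LEAST_QUNIT) -> int:
    """Inodes left for other MDTs after MDT0 takes whole qunits for its files."""
    mdt0_qunits = -(-mdt0_files // least_qunit)  # integer ceiling division
    return max(total_limit - mdt0_qunits * least_qunit, 0)


scenarios = {
    "original test (2k limit, 1025 files)": (2048, 1025),
    "fewer files (2k limit, 1000 files)": (2048, 1000),
    "higher limit (3k limit, 1025 files)": (3072, 1025),
}

for name, (limit, files) in scenarios.items():
    print(f"{name}: {spare_for_other_mdts(limit, files)} inodes spare")
```

In this model either change leaves at least one full qunit unallocated, so a second MDT would have something to acquire regardless of when it joined the filesystem.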
Comment by Sarah Liu [ 13/Jan/14 ]

Thanks for the explanation. I will try again.
