[LU-13353] quickly determine if a quota is exceeded Created: 10/Mar/20  Updated: 13/Jun/20

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor
Reporter: Gian-Carlo Defazio Assignee: Hongchao Zhang
Resolution: Unresolved Votes: 0
Labels: llnl

Issue Links:
Related
Epic/Theme: Performance
Rank (Obsolete): 9223372036854775807

 Description   

We run a script that checks the quota status of users and informs them if they are over quota. Currently we run lfs once for each user, which can take a long time on a large system with many users.
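
For reference, the per-user approach described above might look roughly like the sketch below. It is not the actual script: the function name check_all_users, the use of getent to enumerate users, and the reliance on the '*' that lfs quota prints next to a value that exceeds its limit are all assumptions made for illustration.

```shell
# Hypothetical baseline: one "lfs quota" invocation per user.
# $1 is the Lustre mount point (for example /mnt/lustre).
check_all_users() {
    mnt=$1
    # Enumerate user names from the passwd database (assumption:
    # every user of interest is visible through getent).
    getent passwd | cut -d: -f1 | while read -r user; do
        # Assumption: lfs quota marks an exceeded value with '*'.
        if lfs quota -q -u "$user" "$mnt" | grep -qF '*'; then
            echo "$user is over quota"
        fi
    done
}
```

Each loop iteration runs a separate lfs process and its own quota query, which is why the cost grows with the number of users.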

So far I've found that I can use llapi_quotactl() (which lfs also uses) to get the necessary information. However, to make llapi_quotactl() run faster, I set dqb_valid to non-zero per the man page for llapi_quotactl(). This seems to work correctly in that I don't cause communication between the QMT and QSDs, but I don't get some of the important information (dqb_curspace and dqb_curinodes) that's used to determine if a user is exceeding a quota (also as expected, per the man page).

However, I believe the struct lquota_entry.lqe_edquot flag has the information I want. It may not be completely up to date at all times, but that's ok because it's not being used in real time for this script. I can get a hold of it by passing it back to the llapi_quotactl() call in unused space in the struct obd_quotactl which comes out in user space as a struct if_quotactl. I'm just putting all the flags in the struct lquota_entry into the unused flags portion of the struct if_quotactl.

Is there an easier way to get this information (relatively) quickly?



 Comments   
Comment by Andreas Dilger [ 11/Mar/20 ]

Hongchao, could you please comment.

Comment by Hongchao Zhang [ 11/Mar/20 ]

Hi,
A relatively quick way to check whether users/groups are over their quota limits is to get the information directly from the QMT (at MDT0000).
For instance, to check the status of users:

At MDT0000

# cat /proc/fs/lustre/qmt/lustre-QMT0000/dt-0x0/glb-usr
global_pool0_dt_usr
- id:      0
  limits:  { hard:                    0, soft:                    0, granted:                    0, time:               604800 }
- id:      60000
  limits:  { hard:               102400, soft:               102400, granted:                32768, time:                    0 }
- id:      60001
  limits:  { hard:               102400, soft:               102400, granted:                 8192, time:                    0 }

Print the users that are over quota:

# cat /proc/fs/lustre/qmt/lustre-QMT0000/dt-0x0/glb-usr | grep -B 1 "limits" | sed '$!N;s/\n/,/' |  awk -F "[:,]" ' { if ($9 > $5) print $2 " is over quota hard limit" ; else if ($9 > $7) print $2 " is over quota soft limit" }'
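
A slightly more defensive variant of the one-liner above could look like the following sketch. It reads the same glb-usr format on stdin and, like the original command, treats "granted" as a proxy for usage. The function name over_quota_report is hypothetical, and skipping limits of 0 rests on the assumption that a zero limit means "no limit set".

```shell
# Hypothetical helper: report IDs whose granted space exceeds a
# non-zero hard or soft limit. Input is glb-usr-style text on stdin.
over_quota_report() {
    awk '
        # Remember the ID from lines such as "- id:      60000".
        /^- id:/ { id = $3; next }
        /limits:/ {
            hard = soft = granted = 0
            # Replace braces and commas with spaces so the line
            # splits into "name: value" pairs on whitespace.
            gsub(/[{},]/, " ")
            for (i = 1; i < NF; i++) {
                if ($i == "hard:")    hard = $(i + 1) + 0
                if ($i == "soft:")    soft = $(i + 1) + 0
                if ($i == "granted:") granted = $(i + 1) + 0
            }
            # Assumption: a limit of 0 means no limit is set.
            if (hard > 0 && granted > hard)
                print id " is over quota hard limit"
            else if (soft > 0 && granted > soft)
                print id " is over quota soft limit"
        }
    '
}
```

On MDT0000 it could be run as: over_quota_report < /proc/fs/lustre/qmt/lustre-QMT0000/dt-0x0/glb-usr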
Comment by Olaf Faaland [ 11/Mar/20 ]

Hongchao, thanks for your response.

Gian-Carlo is actually asking about an approach for an enhancement he is working on.  He's asking about sending the lquota_entry.lqe_edquot flag back from the QMT to the MDC and through to userspace.

Comment by Wang Shilong (Inactive) [ 12/Mar/20 ]

Another interesting feature would be to add 'lfs quota --list-all', an interface which tries to get all existing quota type records from MDS, possibly passing some flags to the MDS (only filter overquota), and then returns them to clients. I have heard several times that some people want this. It could save RPC calls in your case (most likely it would take just several RPCs to return all of the quota information).

Comment by Olaf Faaland [ 12/Mar/20 ]

Great!

an interface which tries to get all existing quota type records from MDS

and

only filter overquota

Those are the kind of options we have considered and would like your input on.

Gian-Carlo is extracting the key bits out of a patch he's used for testing, and he'll push that so we can use it to talk about the options.

Comment by James A Simmons [ 12/Mar/20 ]

Has anyone looked at quota_nld? The Linux kernel sends netlink packets about the quota state.

Comment by Hongchao Zhang [ 12/Mar/20 ]

Hi Olaf,

The "lquota_entry.lqe_edquot" flag can be packed into "dqi_flags" (struct obd_dqinfo) in the QMT and returned to the client side.
A more efficient way to get the quota information of all users/groups could be to define a new quota call type that
gets all quota information from the QMT (the processing is similar to "cat /proc/fs/lustre/qmt/lustre-QMT0000/dt-0x0/glb-usr")
and sends it to the client.

Comment by Gerrit Updater [ 12/Mar/20 ]

Gian-Carlo DeFazio (defazio1@llnl.gov) uploaded a new patch: https://review.whamcloud.com/37908
Subject: LU-13353 quota: quickly determine if quota exceeded
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 8bd14f3e1e7cfab53ad045465483c7c6373fcf13

Comment by Olaf Faaland [ 01/Apr/20 ]

Hi Gian-Carlo, can you change the "type" on this ticket to "Improvement"? For whatever reason I'm unable to do it. Thanks.

Comment by Andreas Dilger [ 01/Apr/20 ]

Olaf, I'm unable to do that either. It might be something with Jira. I think Peter has the most admin privilege here, so hopefully he can change it.

Comment by Olaf Faaland [ 01/Apr/20 ]

Thanks Andreas

Comment by Peter Jones [ 01/Apr/20 ]

Well, I can't edit that either. jgmitter I know that we have seen this kind of thing in the past - is the workaround to move the ticket to this same project? I did not want to do that without confirmation as it will mess up the commit message for the patch in flight.

Comment by Gian-Carlo Defazio [ 01/Apr/20 ]

I can't change the type either, but that's probably not a surprise at this point.

Comment by Joseph Gmitter (Inactive) [ 02/Apr/20 ]

Yes, that is correct Peter. That’s the only way to be able to reset the type from a question/request to either bug, improvement, or feature.

Comment by Peter Jones [ 02/Apr/20 ]

Thanks jgmitter. Ah good - I expected it to increment the id in the LU project when I moved it but it retained the original one so it's now fixed without any side-effects. Really, not intuitive but it works...

Comment by Gerrit Updater [ 08/Apr/20 ]

Gian-Carlo DeFazio (defazio1@llnl.gov) uploaded a new patch: https://review.whamcloud.com/38183
Subject: LU-13353 quota: add man page for lfs quota
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 7119836871013eb61b4f276a16a098786994283b

Comment by Gerrit Updater [ 19/Apr/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/38183/
Subject: LU-13353 quota: add man page for lfs quota
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: fba945ca0daa6a442a01f5174a0ba8b6d94294b9
