[LU-11303] slow chgrp as user when quotas are enabled Created: 30/Aug/18 Updated: 18/Jan/22 Resolved: 25/Aug/21 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.10.4 |
| Fix Version/s: | Lustre 2.15.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | SC Admin (Inactive) | Assignee: | Hongchao Zhang |
| Resolution: | Fixed | Votes: | 2 |
| Labels: | None |
| Issue Links: |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Hi,

we have had a user complain that chgrp of a directory tree of a few thousand files takes 3x longer than the untar of that data. it seems likely that this is due to

is there another way to do this which avoids the dt_sync?

in my experience most HPC sites use secondary (supplementary) groups extensively so that users can be members of several research projects. for various reasons this results in lots of files being created with the wrong group for the file's location. as root we periodically trawl the filesystem to correct the group ownership of files to match their physical location (i.e. poor man's directory/project quotas), but sometimes users still want to change the group ownership themselves to "do the right thing", and now this goes a lot slower for them.

so I suppose your expectation that unprivileged users doing chgrp is rare is sort of valid, because we do most of it manually and sporadically for them as root, but (again, in my experience) because of the extensive use of supplementary groups in HPC, users wanting to do a chgrp is perhaps more common than you might think.

project quotas would remove most of our reasons for using chgrp, but maybe not all. unfortunately we aren't likely to try any more new things like project quotas any time soon.

BTW it would be good to have lustre test users that had secondary groups in order to find problems like this. I don't see any at the moment. I was looking because I need one to make a regression test case for

cheers, |
| Comments |
| Comment by Andreas Dilger [ 30/Aug/18 ] |
|
Robin, do you also have quotas enabled on this filesystem? |
| Comment by SC Admin (Inactive) [ 30/Aug/18 ] |
|
yes, the big dagg filesystem has group quotas enforcing. we have user quotas enforcing on the /home Lustre filesystem. the other 2 small filesystems don't use quotas (/apps and /images). cheers, |
| Comment by Peter Jones [ 30/Aug/18 ] |
|
Hongchao, can you please investigate? Thanks, Peter |
| Comment by Gerrit Updater [ 04/Sep/18 ] |
|
Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33107 |
| Comment by Gerrit Updater [ 21/Sep/18 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33107/ |
| Comment by Peter Jones [ 21/Sep/18 ] |
|
Landed for 2.12 |
| Comment by Lukasz Flis [ 12/Oct/18 ] |
|
We can confirm the same problem with 2.10.5 on the HPC system at CYFRONET.

quota enforcement: enabled

A single chgrp on a single file to a secondary group, executed by a non-root user, can take 10-140 seconds on a busy filesystem. The chgrp command blocks on the fchownat syscall.

@Peter Jones: do you plan to include the fix in the next b2_10 release (2.10.6)? |
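A rough way to measure the delay described in this comment is to time the group change on a single file. The following is a minimal Python sketch and is not taken from the ticket: the helper name `time_chgrp` is hypothetical, and it assumes the caller owns the file and is a member of the target group. Passing `-1` as the uid leaves the owner unchanged, so only the group is modified, as with chgrp.

```python
import os
import time


def time_chgrp(path: str, gid: int) -> float:
    """Change the group of `path` to `gid` and return the elapsed seconds.

    Hypothetical helper for illustration only. The uid argument of -1
    tells os.chown to leave the file owner untouched.
    """
    start = time.perf_counter()
    os.chown(path, -1, gid)
    return time.perf_counter() - start


if __name__ == "__main__":
    import sys

    # Usage (assumed): python time_chgrp.py <file> <numeric gid>
    target_path, target_gid = sys.argv[1], int(sys.argv[2])
    print(f"chgrp took {time_chgrp(target_path, target_gid):.3f} s")
```

Running this as a non-root user against a file on the affected filesystem, with one of the user's supplementary group ids, should show whether a single call takes seconds rather than milliseconds.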
| Comment by Lukasz Flis [ 12/Oct/18 ] |
|
@adilger could you please comment on whether this patch solves the problem with slow chgrp introduced by

I have backported this patch (https://review.whamcloud.com/33107/) to b2_10 |
| Comment by Andreas Dilger [ 06/Nov/18 ] |
|
The landed patch was just a code cleanup and did not address the issue in this ticket. |
| Comment by Andreas Dilger [ 06/Nov/18 ] |
|
I see that patch https://review.whamcloud.com/16699 "

It possibly makes sense to do a simple check if the user is close to exceeding the quotas before enforcing the sync behaviour (e.g. quota free > file size). If they are not close to the quota limit there is no need to enforce the sync behaviour. |
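The check suggested in this comment could look roughly like the following. This is a Python sketch of the decision logic only, not Lustre code: `GroupQuota`, `needs_sync_before_chgrp`, and the convention that a limit of 0 means "no limit" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class GroupQuota:
    """Hypothetical view of a group's block quota, in bytes."""
    limit_bytes: int  # assumed convention: 0 means no limit is set
    used_bytes: int


def needs_sync_before_chgrp(quota: GroupQuota, file_size: int) -> bool:
    """Return True only when the target group is close to its limit.

    Implements the heuristic "quota free > file size": if the free
    quota exceeds the size of the file being moved into the group,
    the sync can be skipped.
    """
    if quota.limit_bytes == 0:
        return False  # no limit configured, nothing to enforce
    free_bytes = quota.limit_bytes - quota.used_bytes
    return not (free_bytes > file_size)
```

With this shape, the slow synchronous path is taken only for groups that could actually exceed their limit through the ownership change; every other chgrp would stay on the fast path.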
| Comment by Gerrit Updater [ 09/Jan/19 ] |
|
Hongchao Zhang (hongchao@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33996 |
| Comment by Gerrit Updater [ 19/Jan/19 ] |
|
https://review.whamcloud.com/33996 has been updated |
| Comment by Gerrit Updater [ 25/Aug/21 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/33996/ |
| Comment by Peter Jones [ 25/Aug/21 ] |
|
Landed for 2.15 |