[LU-1310] setfsuid() and quotas Created: 12/Apr/12  Updated: 07/May/12  Resolved: 07/May/12

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.1.0
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Thomas LEIBOVICI - CEA (Inactive) Assignee: Niu Yawei (Inactive)
Resolution: Won't Fix Votes: 1
Labels: None
Environment:

RedHat 6.0


Attachments: File reproducer_v2.c     File setfsuid.c    
Severity: 3
Rank (Obsolete): 6416

 Description   

The following code, run as root, does not raise an EDQUOT error on Lustre even when user "ME" has exceeded its data quota:

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/fsuid.h>

#define SIZE 10240
#define ME 500

int main(int argc, char *argv[])
{
    char *buff = NULL;
    int fd = 0;

    buff = malloc(SIZE);
    setfsuid(ME);
    fd = open(argv[1], O_CREAT | O_RDWR, 0644);
    printf("buff=%p, errno=%u\n", buff, errno);
    printf("fd = %d\n", fd);
    printf("bytes written = %d, errno=%u\n", (int)write(fd, buff, SIZE), errno);
    printf("fsync:%d errno=%u\n", fsync(fd), errno);
    printf("close:%d errno=%u\n", close(fd), errno);
    return 0;
}

Running it returns no error even though user 500 is out of data quota:
buff=0xee8010, errno=0
fd = 3
bytes written = 10240, errno=0
fsync:0 errno=0
close:0 errno=0

The issue only affects data quota; setfsuid() works fine for inode quota.



 Comments   
Comment by Peter Jones [ 12/Apr/12 ]

Niu

Could you please comment on this one?

Thanks

Peter

Comment by Thomas LEIBOVICI - CEA (Inactive) [ 12/Apr/12 ]

reproducer

Comment by Niu Yawei (Inactive) [ 13/Apr/12 ]

Hi, Thomas

Did you test it on any other fs? I ran it on my local ext4 fs, and got the same result.

Lustre checks CAP_SYS_RESOURCE on the client to determine whether a process may exceed data quota on the OST. This test program calls setfsuid() only but leaves CAP_SYS_RESOURCE intact, so the process is still allowed to exceed quota.

To prevent the process from overrunning quota, I think the application should both call setfsuid() and clear the corresponding capability. What's your opinion? Thanks.

Comment by Thomas LEIBOVICI - CEA (Inactive) [ 17/Apr/12 ]

Thanks, I have to check this with the CEA developer who reported this issue to me.

Comment by Philippe DENIEL [ 03/May/12 ]

Hi Niu,

I confirm that CAP_SYS_RESOURCE is on for the process. I'll try to remove it using prctl(), capset(), or capng_update() (whichever is the most portable and effective), then update this thread.

regards

Philippe

Comment by Philippe DENIEL [ 03/May/12 ]

Hi,

I have attached a new reproducer (reproducer_v2.c), which is the same as the former one but also calls capget and capset.
Those calls are used to remove the CAP_SYS_RESOURCE capability. The behavior does not change (the write operation is allowed, whereas errno=EDQUOT should be returned).

Comment by Niu Yawei (Inactive) [ 03/May/12 ]

Could you first run 'lfs quota -v -u $USR $MNT' to see if the user is really already over quota? If the user has run out of quota, could you run the reproducer twice to see if the second run gets EDQUOT? (For instance: ./reproducer_v2 /mnt/lustre/a; ./reproducer_v2 /mnt/lustre/b.) Getting -EDQUOT on the second run is expected behavior; otherwise, we have to collect more information to see what's wrong here.

Let me explain why the first write will succeed even if the user has run out of quota:
The write in the reproducer is asynchronous, so the first write goes into the client cache, and cache flushes always ignore quota in Lustre (otherwise the cached data would be lost). When the cache flush returns from the server, the client learns that the user is out of quota, so the second write is turned into a sync write internally in Lustre and ultimately fails with -EDQUOT.

Comment by Philippe DENIEL [ 03/May/12 ]

Hi,

lfs quota shows this (/gl is the mounted LUSTRE fs)

lfs quota -v -u 3051 /gl
Disk quotas for user 3051 (uid 3051):
Filesystem kbytes quota limit grace files quota limit grace
/gl 2876856* 1048576 2097152 - 78 100 150 -
gl-MDT0000_UUID 4 - 1024 - 78 - 80 -
gl-OST0000_UUID 698284* - 551936 - - - - -
gl-OST0001_UUID 1033972* - 770048 - - - - -
gl-OST0002_UUID 772192* - 699392 - - - - -
gl-OST0003_UUID 372404* - 74752 - - - - -

There is a "*" after every OST entry; I guess they have all exceeded their quotas. Can you confirm this?

I ran the reproducer (as root) with two different files as parameters. No -EDQUOT is returned, and two new files of 10240 bytes are created. But each call still increases the counter in lfs quota. Now lfs quota shows this:

lfs quota -u deniel /gl
Disk quotas for user deniel (uid 3051):
Filesystem kbytes quota limit grace files quota limit grace
/gl 2876880* 1048576 2097152 - 80 100 150 -

The "used" space has increased from 2876856 to 2876880, already well above the 2097152 hard limit.

What can I do to provide you with more information ?

Regards

Philippe

Comment by Niu Yawei (Inactive) [ 04/May/12 ]

There are "*" after every OST information, I guess they all have "exceeded quotas". Do you confirm this ?

Yes, all OSTs have exceeded quotas.

Seems there is a defect in your reproducer:

  capdata.effective &= ~CAP_SYS_RESOURCE;
  capdata.permitted &= ~CAP_SYS_RESOURCE;

should be changed to:

  capdata.effective &= ~CAP_TO_MASK(CAP_SYS_RESOURCE);
  capdata.permitted &= ~CAP_TO_MASK(CAP_SYS_RESOURCE);

And I'm not sure what your default stripe count is; if it is 1, you might need to repeat the reproducer 4 times. (There are 4 OSTs; you need to write to each OST once so the client retrieves the out-of-quota information for each of them.)

Comment by Philippe DENIEL [ 04/May/12 ]

Hi Niu,

you are perfectly right, CAP_TO_MASK is what was missing in my reproducer. Things look much better now, and I finally get the EDQUOT error that I expected. I will backport the logic from the fixed reproducer to my program.
Thanks a lot for your help.

Regards

Philippe

Comment by Niu Yawei (Inactive) [ 07/May/12 ]

Hi, Philippe/Thomas, can we close this ticket?

Comment by Philippe DENIEL [ 07/May/12 ]

Yes, you can

Generated at Sat Feb 10 01:15:31 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.