[LU-5006] chown/chgrp doesn't work for files created by lfs setstripe Created: 05/May/14 Updated: 14/Feb/18 Resolved: 12/Nov/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.6.0, Lustre 2.5.1, Lustre 2.4.3 |
| Fix Version/s: | Lustre 2.7.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Niu Yawei (Inactive) | Assignee: | Niu Yawei (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | HB | ||
| Issue Links: |
|
||||||||||||
| Severity: | 3 | ||||||||||||
| Rank (Obsolete): | 13855 | ||||||||||||
| Description |
|
This is easy to reproduce:
The uid on OST object will not be changed after chown. Look into the code osp_attr_set(): /* new object, the very first ->attr_set() * initializing attributes needs no logging * all subsequent one are subject to the * logging and synchronization with OST */ if (o->opo_new) { o->opo_new = 0; RETURN(0); } I don't quite see why this is necessary, but seems it'll break chown in above situation: OST object is created in 'dd' operation, and the first attr_set operation after create is the 'chown', and it's skipped. |
| Comments |
| Comment by Niu Yawei (Inactive) [ 05/May/14 ] |
|
Mike, could you comment on this one? Thanks. |
| Comment by Alex Zhuravlev [ 05/May/14 ] |
|
the idea behind this check is to skip the very fist ->attr_set() which is called right after ->do_create(). otherwise we'll be generating additional MDS-OST interaction on every file creation. probably the solution is to clear opo_new in the beginning of osp_attr_set() |
| Comment by Niu Yawei (Inactive) [ 05/May/14 ] |
Alex, you mean a very first attr_set() issued from kernel or lustre? I didn't see in lustre where attr_set() is issued after osp object created.
In the situation I described, the first attr_set() is from the chown, so I'm afraid this solution won't work. |
| Comment by Alex Zhuravlev [ 05/May/14 ] |
|
mdd_create() does: this is the first dt_attr_set() I meant. it's used to initialize attributes (including UID/GID), but we ignore that as those are supposed to be initialized via the client. |
| Comment by Niu Yawei (Inactive) [ 05/May/14 ] |
|
I see, so the problem is that for the create with MDS_OPEN_DELAY_CREATE flag, osp object will not be created in mdd_create(), and it'll be created later on next open (in mdd_create_data()), which means the very first attr_set() from mdd_create() won't be applied on those osp objects. Probably we can introduce a flag for lu_attr->la_valid to inform osp layer that the the attr_set is come from mdd_create()? |
| Comment by Alex Zhuravlev [ 05/May/14 ] |
|
no flags like that, please. Ideally we shouldn't be setting anything with dt_attr_set() in mdd_create() because dt_create() has attributes as an argument. though it's hard to change now. Instead we can call dt_attr_set() in mdd_create_data() like mdd_create() does. with a good comment, explaining this [censored]. |
| Comment by Andreas Dilger [ 30/May/14 ] |
|
Isn't it better to remove the special-case check in osp_attr_set() and instead have mdd_create() not call dt_attr_set() at all? That removes complexity instead if adding it. |
| Comment by Jodi Levi (Inactive) [ 07/Oct/14 ] |
|
Niu, |
| Comment by Niu Yawei (Inactive) [ 09/Oct/14 ] |
|
Hi, Alex/Andreas I'm wondering why the first attr_set call on object creation is necessary (mdd_object_initialize()), it seems to me that the object attribute has been initialized in OSD layer on create (__osd_object_create() -> osd_attr_init()). Did I miss anything? |
| Comment by Alex Zhuravlev [ 09/Oct/14 ] |
|
we did this way in Orion, but there was an issue related to ACLs. iirc, default ACLs being applied can change mode stored in the object so additonal dt_attr_set() was needed. but I'd welcome a fresh look at this part and even more I'd welcome removal of that dt_attr_set() and workarounds in OSP (opo_new flag). |
| Comment by Niu Yawei (Inactive) [ 09/Oct/14 ] |
|
Thanks, Alex. I didn't see how default ACLs can change the object mode from code. (seems default ACLs is set after the additional attr_set call in current code). I cooked a patch to remove this additional attr_set: http://review.whamcloud.com/#/c/12243/ |
| Comment by Alex Zhuravlev [ 09/Oct/14 ] |
|
please check mdd_acl_init() -> __mdd_fix_mode_acl(). it looks like the code was reorganized since that. the fingers are crossed for the testing |
| Comment by Alex Zhuravlev [ 09/Oct/14 ] |
|
forgot to mention - sanity/103 does check ACLs very well, please make sure it passes with ldiskfs and ZFS. |
| Comment by Jodi Levi (Inactive) [ 12/Nov/14 ] |
|
Patch landed to Master. Please reopen ticket if more work is needed. |
| Comment by Mahmoud Hanafi [ 06/May/15 ] |
|
Is this patch required for 2.5.3? |
| Comment by Niu Yawei (Inactive) [ 08/May/15 ] |
Yes, 2.5.3 doesn't have this fix. If your application perform the commands like the reproducer does, it'll hit the problem. |
| Comment by Jay Lan (Inactive) [ 08/May/15 ] |
|
Can you cook up patches for 2.5.3 and 2.4.3, Niu? Cherry-picking resulted in big difference, especially in lustre/osp/osp_object.c. |
| Comment by Niu Yawei (Inactive) [ 11/May/15 ] |
|
Ok, I'll port it soon. |
| Comment by Gerrit Updater [ 12/May/15 ] |
|
Niu Yawei (yawei.niu@intel.com) uploaded a new patch: http://review.whamcloud.com/14773 |
| Comment by Gerrit Updater [ 12/May/15 ] |
|
Niu Yawei (yawei.niu@intel.com) uploaded a new patch: http://review.whamcloud.com/14774 |