[LU-5006] chown/chgrp doesn't work for files created by lfs setstripe Created: 05/May/14  Updated: 14/Feb/18  Resolved: 12/Nov/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.6.0, Lustre 2.5.1, Lustre 2.4.3
Fix Version/s: Lustre 2.7.0

Type: Bug Priority: Critical
Reporter: Niu Yawei (Inactive) Assignee: Niu Yawei (Inactive)
Resolution: Fixed Votes: 0
Labels: HB

Issue Links:
Duplicate
Related
is related to LU-5951 sanity test_39k: mtime is lost on close Resolved
Severity: 3
Rank (Obsolete): 13855

 Description   

This is easy to reproduce:

  • lfs setstripe -c 1 -i 0 /mnt/lustre/a
  • dd if=/dev/zero of=/mnt/lustre/a bs=1M count=1 conv=notrunc
  • chown $ID /mnt/lustre/a

The uid on OST object will not be changed after chown.

Look into the code osp_attr_set():

        /* new object, the very first ->attr_set()
         * initializing attributes needs no logging
         * all subsequent one are subject to the
         * logging and synchronization with OST */
        if (o->opo_new) {
                o->opo_new = 0;
                RETURN(0);
        }

I don't quite see why this is necessary, but seems it'll break chown in above situation: OST object is created in 'dd' operation, and the first attr_set operation after create is the 'chown', and it's skipped.



 Comments   
Comment by Niu Yawei (Inactive) [ 05/May/14 ]

Mike, could you comment on this one? Thanks.

Comment by Alex Zhuravlev [ 05/May/14 ]

the idea behind this check is to skip the very fist ->attr_set() which is called right after ->do_create(). otherwise we'll be generating additional MDS-OST interaction on every file creation. probably the solution is to clear opo_new in the beginning of osp_attr_set()

Comment by Niu Yawei (Inactive) [ 05/May/14 ]

the idea behind this check is to skip the very fist ->attr_set() which is called right after ->do_create(). otherwise we'll be generating additional MDS-OST interaction on every file creation.

Alex, you mean a very first attr_set() issued from kernel or lustre? I didn't see in lustre where attr_set() is issued after osp object created.

probably the solution is to clear opo_new in the beginning of osp_attr_set()

In the situation I described, the first attr_set() is from the chown, so I'm afraid this solution won't work.

Comment by Alex Zhuravlev [ 05/May/14 ]

mdd_create() does:
dt_create()
dt_attr_set()

this is the first dt_attr_set() I meant. it's used to initialize attributes (including UID/GID), but we ignore that as those are supposed to be initialized via the client.

Comment by Niu Yawei (Inactive) [ 05/May/14 ]

I see, so the problem is that for the create with MDS_OPEN_DELAY_CREATE flag, osp object will not be created in mdd_create(), and it'll be created later on next open (in mdd_create_data()), which means the very first attr_set() from mdd_create() won't be applied on those osp objects.

Probably we can introduce a flag for lu_attr->la_valid to inform osp layer that the the attr_set is come from mdd_create()?

Comment by Alex Zhuravlev [ 05/May/14 ]

no flags like that, please. Ideally we shouldn't be setting anything with dt_attr_set() in mdd_create() because dt_create() has attributes as an argument. though it's hard to change now. Instead we can call dt_attr_set() in mdd_create_data() like mdd_create() does. with a good comment, explaining this [censored].

Comment by Andreas Dilger [ 30/May/14 ]

Isn't it better to remove the special-case check in osp_attr_set() and instead have mdd_create() not call dt_attr_set() at all? That removes complexity instead if adding it.

Comment by Jodi Levi (Inactive) [ 07/Oct/14 ]

Niu,
Would you be able to have a look at this one?
Thank you!

Comment by Niu Yawei (Inactive) [ 09/Oct/14 ]

Hi, Alex/Andreas

I'm wondering why the first attr_set call on object creation is necessary (mdd_object_initialize()), it seems to me that the object attribute has been initialized in OSD layer on create (__osd_object_create() -> osd_attr_init()). Did I miss anything?

Comment by Alex Zhuravlev [ 09/Oct/14 ]

we did this way in Orion, but there was an issue related to ACLs. iirc, default ACLs being applied can change mode stored in the object so additonal dt_attr_set() was needed. but I'd welcome a fresh look at this part and even more I'd welcome removal of that dt_attr_set() and workarounds in OSP (opo_new flag).

Comment by Niu Yawei (Inactive) [ 09/Oct/14 ]

Thanks, Alex. I didn't see how default ACLs can change the object mode from code. (seems default ACLs is set after the additional attr_set call in current code). I cooked a patch to remove this additional attr_set: http://review.whamcloud.com/#/c/12243/

Comment by Alex Zhuravlev [ 09/Oct/14 ]

please check mdd_acl_init() -> __mdd_fix_mode_acl(). it looks like the code was reorganized since that. the fingers are crossed for the testing

Comment by Alex Zhuravlev [ 09/Oct/14 ]

forgot to mention - sanity/103 does check ACLs very well, please make sure it passes with ldiskfs and ZFS.

Comment by Jodi Levi (Inactive) [ 12/Nov/14 ]

Patch landed to Master. Please reopen ticket if more work is needed.

Comment by Mahmoud Hanafi [ 06/May/15 ]

Is this patch required for 2.5.3?

Comment by Niu Yawei (Inactive) [ 08/May/15 ]

Is this patch required for 2.5.3?

Yes, 2.5.3 doesn't have this fix. If your application perform the commands like the reproducer does, it'll hit the problem.

Comment by Jay Lan (Inactive) [ 08/May/15 ]

Can you cook up patches for 2.5.3 and 2.4.3, Niu? Cherry-picking resulted in big difference, especially in lustre/osp/osp_object.c.

Comment by Niu Yawei (Inactive) [ 11/May/15 ]

Ok, I'll port it soon.

Comment by Gerrit Updater [ 12/May/15 ]

Niu Yawei (yawei.niu@intel.com) uploaded a new patch: http://review.whamcloud.com/14773
Subject: LU-5006 mdd: don't call attr_set on object create
Project: fs/lustre-release
Branch: b2_5
Current Patch Set: 1
Commit: c79136559617ffd23415dab3d995e8aec994fab0

Comment by Gerrit Updater [ 12/May/15 ]

Niu Yawei (yawei.niu@intel.com) uploaded a new patch: http://review.whamcloud.com/14774
Subject: LU-5006 mdd: don't call attr_set on object create
Project: fs/lustre-release
Branch: b2_4
Current Patch Set: 1
Commit: d0a2067981a16a1d536db3801d58328fbad2ff7e

Generated at Sat Feb 10 01:47:43 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.