[LU-9423] Wrong uid and gid for ost objects Created: 01/May/17  Updated: 20/Feb/18  Resolved: 20/Feb/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Mahmoud Hanafi Assignee: Peter Jones
Resolution: Fixed Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

We have discovered OST objects that have the wrong UID or GID. For example, on one filesystem more than 600000 files are affected.

 

Here are the details:

 ls -ln /nobackupp8/file_removed_for_security
-rw-r--r-- 1 10576 40770 0 Dec 8 21:40 /nobackupp8/file_removed_for_security


lfs path2fid /nobackupp8/file_removed_for_security
[0x36045b238:0x6f94:0x0]


lfs getstripe /nobackupp8/file_removed_for_security

/nobackupp8/file_removed_for_security
lmm_stripe_count: 1
lmm_stripe_size: 1048576
lmm_pattern: 1
lmm_layout_gen: 0
lmm_stripe_offset: 272
 obdidx objid objid group
 272 48704742 0x2e72ce6 0

# debugfs -c -R "stat /O/0/d$((48704742 % 32))/48704742" /dev/mapper/nbp8_13_OST272
debugfs 1.42.13.wc5 (15-Apr-2016)
/dev/mapper/nbp8_13_OST272: catastrophic mode - not reading inode or group bitmaps
Inode: 361442 Type: regular Mode: 0666 Flags: 0x80000
Generation: 2149307429 Version: 0x00000027:023c6769
User: 0 Group: 0 Size: 0
File ACL: 0 Directory ACL: 0
Links: 1 Blockcount: 0
Fragment: Address: 0 Number: 0 Size: 0
 ctime: 0x584a43d1:00000000 -- Thu Dec 8 21:40:33 2016
 atime: 0x00000000:00000000 -- Wed Dec 31 16:00:00 1969
 mtime: 0x584a43d1:00000000 -- Thu Dec 8 21:40:33 2016
crtime: 0x584a43c0:3db75e98 -- Thu Dec 8 21:40:16 2016
Size of extra inode fields: 28
Extended attributes stored in inode body: 
 lma = "08 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 e6 2c e7 02 00 00 00 00 " (24)
 lma: fid=[0x100000000:0x2e72ce6:0x0] compat=8 incompat=0
 fid = "38 b2 45 60 03 00 00 00 94 6f 00 00 00 00 00 00 " (16)
 fid: parent=[0x36045b238:0x6f94:0x0] stripe=0
EXTENTS:
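
The object path given to debugfs above is built from the objid that lfs getstripe reports: the subdirectory is the objid modulo 32. A minimal sketch of that arithmetic follows, assuming group 0 as in this ticket; ost_object_path is a hypothetical helper name, not a Lustre command.

```shell
#!/bin/bash
# Hypothetical helper: build the path used in the debugfs "stat" commands
# in this ticket from an OST objid (decimal, or hex with a 0x prefix).
# Assumes group 0, as shown in the getstripe output.
ost_object_path() {
    local objid=$1
    printf '/O/0/d%d/%d\n' "$((objid % 32))" "$objid"
}

ost_object_path 48704742    # prints /O/0/d6/48704742
```

The result can be substituted into the command used above, for example debugfs -c -R "stat $(ost_object_path 48704742)" followed by the OST device.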




 

What would cause this discrepancy, and how do we fix it?

 



 Comments   
Comment by Andreas Dilger [ 02/May/17 ]

The UID and GID are only set on objects when they are written by the client or when ownership is explicitly changed by chown/chmod. In this example you can see that the object size is zero, so it has no data in it.

Even if the OST object ownership is incorrect, this does not affect the access permissions to the file, only the quota accounting. The access to the file is controlled solely by the MDS.

Comment by Mahmoud Hanafi [ 08/Jun/17 ]

I found an example of a file with a non-zero object size and the wrong GID. There are others with the wrong UID. This is having an effect on our quota accounting. How should we try to correct this?

ls -l /xxxxxx//xxxxxx/ecco_orbit20_Kah-Song_sgi-wkload_SEE_ReadME
-rwx------ 1 xxxxxxx g1119 354 Jun  7  2010 /xxxxxxx//xxxxxx/ecco_orbit20_Kah-Song_sgi-wkload_SEE_ReadME

lfs getstripe /xxxxx/xxxxx/ecco_orbit20_Kah-Song_sgi-wkload_SEE_ReadME
/xxxxx/xxxxxx/ecco_orbit20_Kah-Song_sgi-wkload_SEE_ReadME
lmm_stripe_count:   1
lmm_stripe_size:    1048576
lmm_pattern:        1
lmm_layout_gen:     0
lmm_stripe_offset:  37
	obdidx		 objid		 objid		 group
	    37	       2943541	     0x2cea35	             0

# debugfs -c -R "stat /O/0/d$((2943541 %32))/2943541" /dev/mapper/nbp7-ost37
debugfs 1.42.13.wc5 (15-Apr-2016)
/dev/mapper/nbp7-ost37: catastrophic mode - not reading inode or group bitmaps
Inode: 7808805 Type: regular Mode: 0666 Flags: 0x80000
Generation: 3578042382 Version: 0x0000000c:0013b4c7
User: 1968 Group: 0 Size: 354
File ACL: 0 Directory ACL: 0
Links: 1 Blockcount: 8
Fragment: Address: 0 Number: 0 Size: 0
 ctime: 0x529619dd:00000000 -- Wed Nov 27 08:12:13 2013
 atime: 0x5293db12:00000000 -- Mon Nov 25 15:19:46 2013
 mtime: 0x4c0d292a:00000000 -- Mon Jun 7 10:15:22 2010
crtime: 0x529616bc:e828e8cc -- Wed Nov 27 07:58:52 2013
Size of extra inode fields: 28
Extended attributes stored in inode body: 
 lma = "00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 35 ea 2c 00 00 00 00 00 " (24)
 lma: fid=[0x100000000:0x2cea35:0x0] compat=0 incompat=0
 fid = "7c 4e 01 00 02 00 00 00 3a 01 00 00 00 00 00 00 " (16)
 fid: parent=[0x200014e7c:0x13a:0x0] stripe=0
EXTENTS:
(0):1999046356


As you can see, the OST object's GID is 0, while ls reports group g1119 for the file.
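
To confirm that an OST object belongs to the file being checked, the raw "fid" xattr bytes printed by debugfs can be decoded by hand and compared with the output of lfs path2fid. The sketch below assumes the layout implied by the two dumps in this ticket (8-byte little-endian sequence, 4-byte little-endian object ID, 4-byte stripe index); decode_fid_xattr is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: decode the 16 hex bytes of the "fid" xattr as printed
# by debugfs into the parent FID and stripe index.
# The byte layout is inferred from the debugfs output in this ticket.
decode_fid_xattr() {
    local seq oid stripe
    set -- $1                                  # split into one byte per argument
    [ "$#" -ge 16 ] || return 1
    seq="0x$8$7$6$5$4$3$2$1"                   # bytes 1-8, little-endian
    oid="0x${12}${11}${10}$9"                  # bytes 9-12, little-endian
    stripe="0x${16}${15}${14}${13}"            # bytes 13-16, little-endian
    printf '[0x%x:0x%x:0x0] stripe=%d\n' "$seq" "$oid" "$stripe"
}

decode_fid_xattr "7c 4e 01 00 02 00 00 00 3a 01 00 00 00 00 00 00"
# prints [0x200014e7c:0x13a:0x0] stripe=0
```

The decoded value matches the "fid: parent=" line that debugfs prints for the same object.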

Comment by Andreas Dilger [ 09/Jun/17 ]

Based on the age of this file (2013) it is entirely possible that this is from some older problem that has been fixed in the code already.

One way to fix this problem is to run an LFSCK "layout" scan and repair the OST objects. If that functionality is not available in your version of Lustre, the alternative is to run a "find" across the whole filesystem and "chown" the files back to the same owner and group. That will reset the ownership on the OST objects, but will also affect the ctime of every file.
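
The find and chown fallback could be sketched roughly as follows. This is an illustration under stated assumptions, not a tested procedure for a production filesystem: rechown_tree is a hypothetical helper name, it relies on GNU stat and find, and it is meant to be run as root on a Lustre client. It reads each regular file's current numeric owner and group and applies the same values again.

```shell
#!/bin/bash
# Hypothetical helper: re-apply each regular file's current UID:GID so that
# the ownership is sent to the OST objects again. As noted in the comment
# above, this also updates the ctime of every file it touches.
rechown_tree() {
    local root=$1
    find "$root" -xdev -type f -exec sh -c '
        for f in "$@"; do
            owner=$(stat -c "%u:%g" "$f") || continue   # numeric uid:gid
            chown -- "$owner" "$f"
        done' sh {} +
}

# Usage (the path is only an example): rechown_tree /nobackupp8/some_directory
```

Running it on one small directory first and then checking the object with debugfs, as done earlier in this ticket, would show whether the OST ownership was actually reset.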

Comment by Mahmoud Hanafi [ 20/Feb/18 ]

We can close this case.

 

Comment by Peter Jones [ 20/Feb/18 ]

ok - thanks Mahmoud

Generated at Sat Feb 10 02:26:03 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.