[LU-9423] Wrong uid and gid for ost objects Created: 01/May/17 Updated: 20/Feb/18 Resolved: 20/Feb/18 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.7.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Mahmoud Hanafi | Assignee: | Peter Jones |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
We have discovered OST objects that have the wrong UID or GID. For example, on one filesystem more than 600,000 files are affected.
Here are the details:

ls -ln /nobackupp8/file_removed_for_security
-rw-r--r-- 1 10576 40770 0 Dec 8 21:40 /nobackupp8/file_removed_for_security

lfs path2fid /nobackupp8/file_removed_for_security
[0x36045b238:0x6f94:0x0]

lfs getstripe /nobackupp8/file_removed_for_security
/nobackupp8/file_removed_for_security
lmm_stripe_count: 1
lmm_stripe_size: 1048576
lmm_pattern: 1
lmm_layout_gen: 0
lmm_stripe_offset: 272
    obdidx objid objid group
    272 48704742 0x2e72ce6 0

# debugfs -c -R "stat /O/0/d$((48704742 % 32))/48704742" /dev/mapper/nbp8_13_OST272
debugfs 1.42.13.wc5 (15-Apr-2016)
/dev/mapper/nbp8_13_OST272: catastrophic mode - not reading inode or group bitmaps
Inode: 361442 Type: regular Mode: 0666 Flags: 0x80000
Generation: 2149307429 Version: 0x00000027:023c6769
User: 0 Group: 0 Size: 0
File ACL: 0 Directory ACL: 0
Links: 1 Blockcount: 0
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x584a43d1:00000000 -- Thu Dec 8 21:40:33 2016
atime: 0x00000000:00000000 -- Wed Dec 31 16:00:00 1969
mtime: 0x584a43d1:00000000 -- Thu Dec 8 21:40:33 2016
crtime: 0x584a43c0:3db75e98 -- Thu Dec 8 21:40:16 2016
Size of extra inode fields: 28
Extended attributes stored in inode body:
  lma = "08 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 e6 2c e7 02 00 00 00 00 " (24)
  lma: fid=[0x100000000:0x2e72ce6:0x0] compat=8 incompat=0
  fid = "38 b2 45 60 03 00 00 00 94 6f 00 00 00 00 00 00 " (16)
  fid: parent=[0x36045b238:0x6f94:0x0] stripe=0
EXTENTS:

The file is owned by UID 10576 and GID 40770 according to the MDS, but the OST object shows User: 0 and Group: 0.
What would cause this discrepancy, and how do we fix the affected objects?
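
The debugfs command in the description finds the OST object at a path built from the object ID reported by lfs getstripe. A minimal sketch of that path computation follows. It assumes the 32-way directory hashing used in the command above (objid % 32) and assumes that the "group" column of the getstripe output is the second path component. The name ost_object_path is a hypothetical helper, not a Lustre tool.

```sh
# Hypothetical helper: build the object path used with "debugfs -R stat ...".
# Assumption: objects live under /O/<group>/d<objid % 32>/<objid>, as in the
# debugfs command shown in this ticket.
ost_object_path() (
    objid=$1
    group=${2:-0}          # "group" column from lfs getstripe, 0 in this ticket
    printf '/O/%s/d%s/%s\n' "$group" "$(( objid % 32 ))" "$objid"
)

# Example with the objid from the description:
#   debugfs -c -R "stat $(ost_object_path 48704742)" /dev/mapper/nbp8_13_OST272
```

Running debugfs with -c, as in the ticket, opens the device in catastrophic mode, which is read-only.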
|
| Comments |
| Comment by Andreas Dilger [ 02/May/17 ] |
|
The UID and GID are only set on OST objects when they are written by the client or when ownership is explicitly changed with chown/chgrp. In this example you can see that the object size is zero, so it has no data in it. Even if the OST object ownership is incorrect, this does not affect access permissions to the file, only the quota accounting. Access to the file is controlled solely by the MDS. |
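
The distinction drawn in this comment can be sketched as a small triage function. It compares the owner reported by the MDS (ls -ln) with the User/Group fields from debugfs and separates never-written objects (size 0, owner 0:0) from real mismatches. The function name classify_object and its three result labels are hypothetical and only illustrate the reasoning above.

```sh
# Hypothetical triage helper; arguments are plain numbers copied from
# "ls -ln" (MDS view) and "debugfs stat" (OST view).
classify_object() (
    mds_uid=$1 mds_gid=$2 ost_uid=$3 ost_gid=$4 ost_size=$5
    if [ "$mds_uid" -eq "$ost_uid" ] && [ "$mds_gid" -eq "$ost_gid" ]; then
        echo "ok"
    elif [ "$ost_size" -eq 0 ] && [ "$ost_uid" -eq 0 ] && [ "$ost_gid" -eq 0 ]; then
        # Object was never written, so its ownership was never set.
        echo "unwritten"
    else
        # Object holds data under the wrong owner: quota accounting is affected.
        echo "mismatch"
    fi
)

# Values from the description: MDS 10576:40770, OST 0:0, size 0
classify_object 10576 40770 0 0 0
```

For the file in the description this prints "unwritten", which matches the explanation that a zero-length object has not had its ownership set yet.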
| Comment by Mahmoud Hanafi [ 08/Jun/17 ] |
|
I found an example of a file with a non-zero object size and the wrong GID. There are others with the wrong UID. This is affecting our quota accounting. How should we correct this?

ls -l /xxxxxx//xxxxxx/ecco_orbit20_Kah-Song_sgi-wkload_SEE_ReadME
-rwx------ 1 xxxxxxx g1119 354 Jun 7 2010 /xxxxxxx//xxxxxx/ecco_orbit20_Kah-Song_sgi-wkload_SEE_ReadME

lfs getstripe /xxxxx/xxxxx/ecco_orbit20_Kah-Song_sgi-wkload_SEE_ReadME
/xxxxx/xxxxxx/ecco_orbit20_Kah-Song_sgi-wkload_SEE_ReadME
lmm_stripe_count: 1
lmm_stripe_size: 1048576
lmm_pattern: 1
lmm_layout_gen: 0
lmm_stripe_offset: 37
    obdidx objid objid group
    37 2943541 0x2cea35 0

# debugfs -c -R "stat /O/0/d$((2943541 %32))/2943541" /dev/mapper/nbp7-ost37
debugfs 1.42.13.wc5 (15-Apr-2016)
/dev/mapper/nbp7-ost37: catastrophic mode - not reading inode or group bitmaps
Inode: 7808805 Type: regular Mode: 0666 Flags: 0x80000
Generation: 3578042382 Version: 0x0000000c:0013b4c7
User: 1968 Group: 0 Size: 354
File ACL: 0 Directory ACL: 0
Links: 1 Blockcount: 8
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x529619dd:00000000 -- Wed Nov 27 08:12:13 2013
atime: 0x5293db12:00000000 -- Mon Nov 25 15:19:46 2013
mtime: 0x4c0d292a:00000000 -- Mon Jun 7 10:15:22 2010
crtime: 0x529616bc:e828e8cc -- Wed Nov 27 07:58:52 2013
Size of extra inode fields: 28
Extended attributes stored in inode body:
  lma = "00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 35 ea 2c 00 00 00 00 00 " (24)
  lma: fid=[0x100000000:0x2cea35:0x0] compat=0 incompat=0
  fid = "7c 4e 01 00 02 00 00 00 3a 01 00 00 00 00 00 00 " (16)
  fid: parent=[0x200014e7c:0x13a:0x0] stripe=0
EXTENTS: (0):1999046356

As you can see, the OST object GID is 0 while the file's group is g1119. |
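
To confirm that an OST object really belongs to the file being checked, the raw "fid" extended attribute can be decoded by hand and compared with the output of lfs path2fid. The sketch below assumes the layout implied by the debugfs output above: 16 little-endian bytes holding an 8-byte sequence, a 4-byte object ID, and a 4-byte stripe index. The helper names le_to_num and decode_parent_fid are hypothetical.

```sh
# Convert little-endian hex bytes (one per argument) to a decimal number.
le_to_num() (
    val=0
    shift_bits=0
    for byte in "$@"; do
        val=$(( val | (0x$byte << shift_bits) ))
        shift_bits=$(( shift_bits + 8 ))
    done
    printf '%s\n' "$val"
)

# Decode the 16-byte "fid" xattr string printed by debugfs.
# Assumed layout: u64 seq, u32 oid, u32 stripe index, all little-endian.
decode_parent_fid() (
    set -- $1                      # intentional word splitting into bytes
    [ "$#" -eq 16 ] || { echo "expected 16 bytes, got $#" >&2; exit 1; }
    seq=$(le_to_num "$1" "$2" "$3" "$4" "$5" "$6" "$7" "$8")
    oid=$(le_to_num "$9" "${10}" "${11}" "${12}")
    stripe=$(le_to_num "${13}" "${14}" "${15}" "${16}")
    printf 'parent=[0x%x:0x%x:0x0] stripe=%d\n' "$seq" "$oid" "$stripe"
)

decode_parent_fid "7c 4e 01 00 02 00 00 00 3a 01 00 00 00 00 00 00"
```

For the bytes in this comment the result agrees with the parent FID that debugfs itself prints, [0x200014e7c:0x13a:0x0].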
| Comment by Andreas Dilger [ 09/Jun/17 ] |
|
Based on the age of this file (2013), it is entirely possible that this is from some older problem that has already been fixed in the code. One way to fix this is to run an LFSCK "layout" scan and let it repair the OST objects. If that functionality is not available in your version of Lustre, the alternative is to run a "find" across the whole filesystem and "chown" each file back to its current owner and group. That will reset the ownership on the OST objects, but it will also update the ctime of every file. |
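
The find-and-chown fallback described here could look roughly like the sketch below. It assumes GNU stat (the -c option) and a hypothetical function name, rechown_tree. It defaults to a dry run that only prints the commands, and it must be passed "apply" as the second argument to change anything. Because the owner is read and then written back in two steps, a file whose ownership changes in between could be reset to the stale value, so this is best run while the affected tree is quiet.

```sh
# Hypothetical helper: chown every regular file back to its current owner
# and group so that the ownership is pushed to the OST objects again.
# Usage: rechown_tree DIR [apply]      (default mode is a dry run)
rechown_tree() (
    root=$1
    mode=${2:-dry-run}
    find "$root" -xdev -type f -exec sh -c '
        mode=$1; shift
        for f in "$@"; do
            # GNU stat: numeric uid:gid as currently recorded for the file
            owner=$(stat -c "%u:%g" -- "$f") || continue
            if [ "$mode" = apply ]; then
                chown -- "$owner" "$f" || echo "chown failed: $f" >&2
            else
                printf "chown -- %s %s\n" "$owner" "$f"
            fi
        done
    ' sh "$mode" {} +
)

# Example (dry run on a hypothetical directory):
#   rechown_tree /nobackupp8/some_project
```

As noted in the comment, applying this updates the ctime of every file it touches, which may matter for backup or purge policies that key on ctime.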
| Comment by Mahmoud Hanafi [ 20/Feb/18 ] |
|
We can close this case.
|
| Comment by Peter Jones [ 20/Feb/18 ] |
|
ok - thanks Mahmoud |