Details
-
Bug
-
Resolution: Fixed
-
Minor
Description
There is a discrepancy between the group inode usage reported by quota and the actual number of files.
For example, for group smith the "lfs quota status" shows:
Disk quotas for grp smith (gid 6789):
Filesystem            kbytes     quota      limit  grace   files   quota   limit  grace
/scratch            54380276  75497472  150994944      -  224737  264000  328000      -
scr1-MDT0000_UUID      82304         -      93548      -   34053       -   34618      -
scr1-MDT0001_UUID          0         -     244364      -       0       -       0      -
scr1-MDT0002_UUID          0         -     241368      -       0       -   33369      -
scr1-MDT0003_UUID          0         -     242804      -       0       -       0      -
scr1-MDT0004_UUID          0         -     262144      -       0       -       0      -
scr1-MDT0005_UUID          0         -      49232      -       0       -    3092      -
scr1-MDT0006_UUID          0         -      65552      -       0       -    3077      -
scr1-MDT0007_UUID          0         -      65560      -       0       -    3079      -
scr1-MDT0008_UUID          0         -      65548      -       0       -    1028      -
scr1-MDT0009_UUID          0         -         60      -       0       -    1031      -
scr1-MDT000a_UUID          0         -         40      -       0       -    1028      -
scr1-MDT000b_UUID          0         -         20      -       0       -    1031      -
scr1-MDT000c_UUID          0         -         20      -       0       -       0      -
scr1-MDT000d_UUID     414392         -     424912      -  121483       -  122271      -
scr1-MDT000e_UUID      73636         -      87852      -   34085       -   34826      -
scr1-MDT000f_UUID      68160         -      82304      -   35116       -   36051      -
whereas a filesystem scan finds only 73757 files owned by that group, a difference of 150980.
Similarly, for another group, jones:
Disk quotas for grp jones (gid 5678):
Filesystem              kbytes       quota       limit  grace   files   quota    limit  grace
/scratch            1425241616  3670016000  7340032000      -  466507  684000  1368000      -
scr1-MDT0000_UUID            0           -           0      -       5       -     1029      -
scr1-MDT0001_UUID            0           -       65536      -       0       -     4096      -
scr1-MDT0002_UUID            0           -       52216      -       0       -    18733      -
scr1-MDT0003_UUID            0           -          88      -       0       -       22      -
scr1-MDT0004_UUID            0           -          28      -       0       -        7      -
scr1-MDT0005_UUID            0           -          28      -       0       -        7      -
scr1-MDT0006_UUID            0           -          20      -       0       -        5      -
scr1-MDT0007_UUID            0           -       65536      -       0       -     4096      -
scr1-MDT0008_UUID            0           -       65536      -       0       -        0      -
scr1-MDT0009_UUID            0           -       65536      -       0       -        0      -
scr1-MDT000a_UUID      1055284           -     1908948      -  296133       -   298943      -
scr1-MDT000b_UUID        88672           -     1068664      -   51046       -    53892      -
scr1-MDT000c_UUID        88688           -     1001728      -   48391       -    51229      -
scr1-MDT000d_UUID       163564           -     1202556      -   70924       -    73763      -
scr1-MDT000e_UUID            0           -           0      -       7       -     4102      -
scr1-MDT000f_UUID            0           -           0      -       1       -     1025      -
whereas actual consumption is 121687 files across the filesystem, hence a difference of 344820.
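The arithmetic for group jones can be cross-checked with a short script. This is only a sketch: the per-MDT "files" values and the two totals are copied by hand from the output and scan result quoted above, and the variable names are illustrative.

```python
# Per-MDT "files" usage for group jones, copied from the quota output above.
per_mdt_files = {
    "scr1-MDT0000": 5,
    "scr1-MDT0001": 0,
    "scr1-MDT0002": 0,
    "scr1-MDT0003": 0,
    "scr1-MDT0004": 0,
    "scr1-MDT0005": 0,
    "scr1-MDT0006": 0,
    "scr1-MDT0007": 0,
    "scr1-MDT0008": 0,
    "scr1-MDT0009": 0,
    "scr1-MDT000a": 296133,
    "scr1-MDT000b": 51046,
    "scr1-MDT000c": 48391,
    "scr1-MDT000d": 70924,
    "scr1-MDT000e": 7,
    "scr1-MDT000f": 1,
}

reported_total = 466507  # "files" column of the /scratch summary row
scanned_files = 121687   # files found by the filesystem scan

# The summary row should equal the sum of the per-MDT rows.
mdt_sum = sum(per_mdt_files.values())

# Inodes charged to the group that the scan cannot account for.
discrepancy = reported_total - scanned_files

print(f"per-MDT sum matches summary row: {mdt_sum == reported_total}")
print(f"unaccounted inodes: {discrepancy}")
```

The per-MDT rows add up to the summary row, so the excess comes from the per-MDT accounting itself and not from the aggregation.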
Group directories are striped across 4 MDTs.
These observations have been made over more than a week.
At the moment, the group quota has been bumped so that user jobs don't run into "disk quota exceeded" errors.
Also, the first MDT in the allocated set always has more entries than the rest. A quick test in which a group created 100 4-way striped directories and no files showed the same behaviour:
Filesystem      kbytes  quota  limit  grace  files  quota  limit  grace
/scratch          3520      0      0      -    880      0      0      -
scr1-MDT0004      2008      -  65536      -    502      -   4096      -
scr1-MDT0005       504      -  65536      -    126      -   4096      -
scr1-MDT0006       504      -  65536      -    126      -   4096      -
scr1-MDT0007       504      -  65536      -    126      -   4096      -
It looks like this is caused by the "agent" directory inode, which is created to track the remote striped directory inode on other MDTs.
The "agent" inode is created with its parent directory's GID:
static struct inode *osd_create_local_agent_inode(const struct lu_env *env,
                                                  struct osd_device *osd,
                                                  struct osd_object *pobj,
                                                  const struct lu_fid *fid,
                                                  __u32 type,
                                                  struct thandle *th)
{
        ...
        /* <----- the "owner" argument is passed as NULL */
        local = ldiskfs_create_inode(oh->ot_handle, pobj->oo_inode, type,
                                     NULL);
        if (IS_ERR(local)) {
                CERROR("%s: create local error %d\n", osd_name(osd),
                       (int)PTR_ERR(local));
                RETURN(local);
        }

        /*
         * restore i_gid in case S_ISGID is set, we will inherit S_ISGID and set
         * correct gid on remote file, not agent here
         */
        local->i_gid = current_fsgid();   /* <----- restored to GID 0 (root) */
        ldiskfs_set_inode_state(local, LDISKFS_STATE_LUSTRE_NOSCRUB);
        ...
}

struct inode *__ldiskfs_new_inode(handle_t *handle, struct inode *dir,
                                  umode_t mode, const struct qstr *qstr,
                                  __u32 goal, uid_t *owner, int handle_type,
                                  unsigned int line_no, int nblocks)
{
        ...
        /*
         * Initalize owners and quota early so that we don't have to account
         * for quota initialization worst case in standard inode creating
         * transaction
         */
        if (owner) {
                inode->i_mode = mode;
                i_uid_write(inode, owner[0]);
                i_gid_write(inode, owner[1]);
        } else if (test_opt(sb, GRPID)) {
                inode->i_mode = mode;
                inode->i_uid = current_fsuid();
                inode->i_gid = dir->i_gid;
        } else
                inode_init_owner(inode, dir, mode);
        dquot_initialize(inode);
        ...
}

void inode_init_owner(struct inode *inode, const struct inode *dir,
                      umode_t mode)
{
        inode->i_uid = current_fsuid();
        if (dir && dir->i_mode & S_ISGID) {
                /* <----- created with its parent directory's GID */
                inode->i_gid = dir->i_gid;
                if (S_ISDIR(mode))
                        mode |= S_ISGID;
        } else
                inode->i_gid = current_fsgid();
        inode->i_mode = mode;
}
- OSD creates the agent inode with the "owner" argument set to NULL.
- LDiskFS (ext4) creates the inode and sets its GID to its parent's GID, because "S_ISGID" is set on the parent directory.
- OSD then changes the GID of the newly created agent inode to GID 0 (root), a change that is not tracked by quota.
As a result, the quota usage in LDiskFS (ext4) becomes inconsistent.
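The three steps above can be illustrated with a toy model. This is not Lustre or ldiskfs code: `ToyQuotaFS`, `Inode` and `create_agent_inode` are hypothetical names, and the model assumes that usage is charged once at creation and is not transferred when the GID field is later overwritten directly, which is the behaviour this report infers.

```python
from collections import Counter
from dataclasses import dataclass

S_ISGID = 0o2000  # setgid bit
ROOT_GID = 0


@dataclass
class Inode:
    gid: int
    mode: int = 0


class ToyQuotaFS:
    """Toy filesystem: quota is charged to the inode's gid at creation only."""

    def __init__(self):
        self.inodes = []
        self.usage = Counter()  # gid -> number of inodes charged

    def create(self, parent, caller_gid):
        # Same decision as inode_init_owner(): inherit the parent's gid
        # when the parent directory has S_ISGID set.
        gid = parent.gid if parent.mode & S_ISGID else caller_gid
        inode = Inode(gid=gid)
        self.inodes.append(inode)
        self.usage[gid] += 1  # charged here, as after dquot_initialize()
        return inode

    def scan(self):
        # What a scan by ownership would count.
        return Counter(inode.gid for inode in self.inodes)


def create_agent_inode(fs, parent, caller_gid=ROOT_GID):
    local = fs.create(parent, caller_gid)
    # Assumption of this model: the gid is overwritten directly and the
    # quota charge is NOT moved to the new gid.
    local.gid = caller_gid
    return local


# A setgid directory owned by group 6789, with three agent inodes under it.
parent_dir = Inode(gid=6789, mode=S_ISGID)
fs = ToyQuotaFS()
for _ in range(3):
    create_agent_inode(fs, parent_dir)

charged = fs.usage[6789]   # inodes the quota thinks the group owns
owned = fs.scan()[6789]    # inodes the group actually owns
print(f"charged to gid 6789: {charged}, actually owned: {owned}")
```

In this model the group stays charged for inodes that a scan attributes to GID 0, which matches the pattern of reported usage exceeding the scanned file count.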
Issue Links
- is related to
-
LU-14032 The agent inode under directory with S_ISGID causes group quota discrepancy
- Resolved