[LU-91] Impossible to use quotas on RHEL6.0 Created: 22/Feb/11 Updated: 25/Mar/11 Resolved: 24/Mar/11 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.0.0 |
| Fix Version/s: | Lustre 2.1.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Diego Moreno (Inactive) | Assignee: | Johann Lombardi (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
RHEL6.0 GA with kernel 2.6.32-71 |
||
| Attachments: |
|
| Severity: | 3 |
| Bugzilla ID: | 23707 |
| Epic: | RHEL6, ext4, ldiskfs, quotacheck, quotas |
| Rank (Obsolete): | 5091 |
| Description |
|
We cannot use quotas with Lustre 2.0 on RHEL6.0. When we run "lfs quotacheck -ug /fs1/" from a client, where fs1 is a Lustre file system, the command hangs without replying (the logs are on bugzilla#23707). After investigation, we found that Lustre conflicts with new code introduced in the kernel's quota code (fs/quota/).
The Lustre stack looks like this: ll_quota_on (
Then, in the ldiskfs (ext4) stack, we have the following entries: v2_read_file_info (called from the line "dqopt->ops[type]->read_file_info(sb, type)" in vfs_load_quota_inode) rc=-1
The -1 value comes from the beginning of the v2_read_file_info function: static int v2_read_file_info(struct super_block *sb, int type) if (!v2_read_header(sb, type, &dqhead))
The first condition (!v2_read_header(sb, type, &dqhead)) is false and the second is true, so v2_read_file_info returns '
This new condition was introduced in 2.6.33-rc (commit 869835dfad3eb6f7d90c3255a24b084fea82f30d "quota: Improve checking of quota file header") and was then included in RHEL6.0 GA, so Lustre will hit this problem on any kernel that carries this commit.
I looked for where we initialize dqhead.dqh_version (this is the 'bad' value, since we use QFMT_VFS_V0 for quotas in Lustre, isn't it?) but did not find it. I also looked at how ext4 initializes this value but did not find that either. This is the first time I have looked at the quota code, so I am asking for help from somebody who knows it better than I do:
|
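The version check discussed in the description can be modelled in user space. This is a minimal sketch, assuming the condition added by commit 869835d compares the format id requested at quota-on time with the dqh_version field read from the quota file header. The function name check_quota_header_version is hypothetical, and the format id values are copied from the kernel's include/linux/quota.h.

```c
#include <stdint.h>

/* Format ids as defined in the kernel's include/linux/quota.h. */
#define QFMT_VFS_V0 2
#define QFMT_VFS_V1 4

/*
 * Simplified user-space model of the header version check (assumption:
 * this mirrors the condition added to v2_read_file_info).
 * fmt_id is the format requested at quota-on time; header_version is the
 * dqh_version value from the quota file, already in host byte order.
 * Returns 0 when the two agree and -1 otherwise.
 */
static int check_quota_header_version(int fmt_id, uint32_t header_version)
{
	if ((fmt_id == QFMT_VFS_V0 && header_version != 0) ||
	    (fmt_id == QFMT_VFS_V1 && header_version != 1))
		return -1;
	return 0;
}
```

Under this model, a quota file whose header carries version 1 is rejected when quota is turned on with QFMT_VFS_V0 and accepted with QFMT_VFS_V1, which matches the rc=-1 seen above.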
| Comments |
| Comment by Diego Moreno (Inactive) [ 23/Feb/11 ] |
|
I continued analysing the issue and found where the dqhead version value gets initialized. I didn't find it before because it is only initialized the first time we run "lfs quotacheck" on a client. It's initialized twice:
I changed all the initializations to zero and quotas now work properly. But the problem comes from the last macro, V2_INITQVERSIONS, which is in the kernel quota code (fs/quota/quotaio_v2.h). Do you think this could be a kernel bug? Using another macro to initialize the dqhead version to '0' could be a workaround for new kernels, but it doesn't seem like a proper solution for the old 2.6.18 kernel. What do you think? |
| Comment by Johann Lombardi (Inactive) [ 24/Feb/11 ] |
|
The 32-bit quota format is no longer supported on 2.x, so I think it should be fine to just replace QFMT_VFS_V0 with QFMT_VFS_V1 in lustre/lvfs/fsfilt_ext3.c. Diego, would you mind giving this a try? |
| Comment by Diego Moreno (Inactive) [ 25/Feb/11 ] |
|
That did the trick. When I first tried that solution I forgot to update the OSS packages... Now it works. Thanks, Johann. |
| Comment by Diego Moreno (Inactive) [ 25/Feb/11 ] |
|
Patch changing QFMT_VFS_V0 to QFMT_VFS_V1 for quotas initialization on recent kernels. |
| Comment by Johann Lombardi (Inactive) [ 25/Feb/11 ] |
|
Thanks for the quick feedback. If we don't want to break older kernels (like RHEL5), we need to add something like: Actually, we might just want to use something like QFMT_LUSTRE in the lustre code and define it as appropriate. |
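The QFMT_LUSTRE idea could look roughly like the sketch below. This is not the patch that landed: HAVE_QFMT_VFS_V1 is a hypothetical configure-time macro standing in for whatever kernel detection the real patch uses, and lustre_quota_format is a hypothetical helper added only so the selected value can be inspected. The format id values are copied from include/linux/quota.h.

```c
/* Format ids as defined in the kernel's include/linux/quota.h. */
#ifndef QFMT_VFS_V0
#define QFMT_VFS_V0 2
#endif
#ifndef QFMT_VFS_V1
#define QFMT_VFS_V1 4
#endif

/*
 * HYPOTHETICAL: HAVE_QFMT_VFS_V1 would be set by the build system when the
 * target kernel supports the V1 quota format (e.g. RHEL6). Older kernels
 * such as RHEL5 keep using QFMT_VFS_V0.
 */
#ifdef HAVE_QFMT_VFS_V1
#define QFMT_LUSTRE QFMT_VFS_V1
#else
#define QFMT_LUSTRE QFMT_VFS_V0
#endif

/* Hypothetical helper: returns the format id Lustre code would pass on. */
static int lustre_quota_format(void)
{
	return QFMT_LUSTRE;
}
```

With a single macro, the callers in the Lustre code do not need their own per-kernel conditionals; only the definition changes between builds.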
| Comment by Johann Lombardi (Inactive) [ 28/Feb/11 ] |
|
Updated patch pushed to gerrit: |
| Comment by Build Master (Inactive) [ 24/Mar/11 ] |
|
Integrated in Johann Lombardi : 93171cb31cacbd12d976a71ae056775c8e4583b9
|
| Comment by Build Master (Inactive) [ 24/Mar/11 ] |
|
Integrated in Oleg Drokin : b25eb219a157dda49a57e14f7bbc400a52a10a9d
|
| Comment by Peter Jones [ 24/Mar/11 ] |
|
Fix is now landed to master |
| Comment by Build Master (Inactive) [ 25/Mar/11 ] |
|
Integrated in Brian J. Murrell : cc6dc918f19cbabdcb7333d7dccd48fd6e3d72cf
|