[LU-11024] Broken inode accounting of MDT on ZFS Created: 02/May/18 Updated: 28/May/18 Resolved: 21/May/18 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.12.0, Lustre 2.10.4 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Li Xi (Inactive) | Assignee: | nasf (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | llnl |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Roughly 6,200 10MB files were created with a 'dd' loop:

[thcrowe@td-mngt01 thcrowe] for i in `seq 1 10000` ; do dd if=/dev/zero of=file-$i bs=1M count=10; done

All files were written to MDT index 0.

[thcrowe@td-mngt01 thcrowe]$ lfs getstripe

So at this point, POSIX says I have 6,208 files named file-*.

[thcrowe@td-mngt01 thcrowe]$ lfs quota -u 415432 /mnt/slate

Here is the same data directly from the MDT:

[root@slate-mds01 ~]# grep -A1 415432 /proc/fs/lustre/osd-zfs/slate-MDT0000/quota_slave/acct_user
The following commands were used to dump the ZFS contents of MDT index 0:

[root@slate-mds01 ~]# zpool list

Once zdb completed its output, a simple grep shows what uid 415432 has going on:

[root@slate-mds01 ~]# grep uid /tmp/slate_mdt0000-zdb-ddddd | grep -c 415432

6210 is not 6208, because there are 2 directory objects owned by 415432 in the zdb output.

Further tests were run to check whether /proc/fs/lustre/osd-zfs/slate-MDT0000/quota_slave/acct_user reported correct information. When 100 files were created, the accounted inode count increased from 120 to 205, and when 1000 files were created, it increased from 120 to 929. Space accounting on the MDT works well.
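To make the size of the discrepancy explicit, here is a minimal sketch that computes how many created files are missing from the accounted count, using only the before/after numbers quoted above. The helper name inode_shortfall is hypothetical and is not part of Lustre or ZFS.

```shell
#!/bin/sh
# Hypothetical helper (not a Lustre tool): given the accounted count before
# and after a test, and the number of files actually created, print how many
# files the accounting failed to register.
# usage: inode_shortfall BEFORE AFTER CREATED
inode_shortfall() {
    before=$1
    after=$2
    created=$3
    accounted=$((after - before))
    echo $((created - accounted))
}

inode_shortfall 120 205 100     # prints 15  (only 85 of 100 accounted)
inode_shortfall 120 929 1000    # prints 191 (only 809 of 1000 accounted)
```

In both runs the accounted increase falls short of the number of files created, which is consistent with the count being an estimate rather than a real per-object count.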
|
| Comments |
| Comment by Andreas Dilger [ 02/May/18 ] |
|
What version of ZFS and Lustre is installed on the MDS? With ZFS 0.6, Lustre used its own Lustre-specific inode accounting implementation, but that was replaced by an in-ZFS dnode accounting mechanism in ZFS 0.7. |
| Comment by Peter Jones [ 03/May/18 ] |
|
lixi, which Lustre version is being used here? |
| Comment by Li Xi (Inactive) [ 04/May/18 ] |
|
The ZFS version is 0.7.5, and the userobj_accounting feature is active on ZFS. The Lustre version is 2.10.3. |
| Comment by Andreas Dilger [ 16/May/18 ] |
|
This looks like it is a bug in the autoconf checking for ZFS dnode accounting introduced by the landing of patch https://review.whamcloud.com/30540:

diff --git a/config/lustre-build-zfs.m4 b/config/lustre-build-zfs.m4
index 9e39c80..297e790 100644
--- a/config/lustre-build-zfs.m4
+++ b/config/lustre-build-zfs.m4
@@ -523,10 +523,10 @@ your distribution.
dnl # ZFS 0.7.0 feature: SPA_FEATURE_USEROBJ_ACCOUNTING
dnl #
LB_CHECK_COMPILE([if zfs has native dnode accounting supported],
- dmu_objset_userobjspace_upgrade, [
+ dmu_objset_id_quota_upgrade, [
#include <sys/dmu_objset.h>
],[
- dmu_objset_userobjspace_upgrade(NULL);
+ dmu_objset_id_quota_upgrade(NULL);
],[
AC_DEFINE(HAVE_DMU_USEROBJ_ACCOUNTING, 1,
[Have native dnode accounting in ZFS])
In particular, the dmu_objset_id_quota_upgrade() function only exists in ZFS master (for 0.8), while dmu_objset_userobjspace_upgrade() is what exists in ZFS 0.7. We need to check for both. Lustre doesn't call this function directly (that would cause the on-disk format to be changed and prevent downgrade to an older ZFS release), so we don't need any compatibility functions in Lustre; only the detection needs to be fixed. |
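The "check for both" logic can be sketched outside autoconf as a plain shell function that accepts either symbol name. This is an illustration only: the real check in config/lustre-build-zfs.m4 is a compile test, the function name has_dnode_accounting is hypothetical, and the header line used in the demo is a stand-in rather than a copy of the ZFS source.

```shell
#!/bin/sh
# Hypothetical illustration of "accept either name"; not the actual patch.
# Succeeds (exit status 0) if the given header mentions the ZFS 0.7 name
# (dmu_objset_userobjspace_upgrade) or the ZFS master name
# (dmu_objset_id_quota_upgrade).
has_dnode_accounting() {
    header=$1
    grep -q -E 'dmu_objset_(userobjspace|id_quota)_upgrade' "$header"
}

# Demo against a temporary stand-in header containing the 0.7-era name.
tmp=$(mktemp)
printf 'void dmu_objset_userobjspace_upgrade(objset_t *os);\n' > "$tmp"
if has_dnode_accounting "$tmp"; then
    echo "native dnode accounting detected"
fi
rm -f "$tmp"
```

A check that looks for only one of the two names misses the other ZFS release line, which is the failure mode described in this comment.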
| Comment by nasf (Inactive) [ 16/May/18 ] |
|
Here is the patch: |
| Comment by Gerrit Updater [ 16/May/18 ] |
|
Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: https://review.whamcloud.com/32422 |
| Comment by Andreas Dilger [ 16/May/18 ] |
|
Note that this bug caused the ZFS dnode/inode quota to be reported incorrectly by Lustre - it wasn't using the DMU interface for reporting the dnode quota, but had fallen back to estimating dnode usage based on the user's space usage as if using ZFS 0.6.x which didn't have that interface. ZFS was still accounting the dnode usage correctly internally. Once the autoconf patch is applied and the Lustre server is rebuilt/restarted, then the ZFS dnode/inode quota will be reported correctly to Lustre. |
| Comment by Gerrit Updater [ 18/May/18 ] |
|
John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/32422/ |
| Comment by Gerrit Updater [ 21/May/18 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/32418/ |
| Comment by Peter Jones [ 21/May/18 ] |
|
Landed for 2.12 |