[LU-11024] Broken inode accounting of MDT on ZFS Created: 02/May/18  Updated: 28/May/18  Resolved: 21/May/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.12.0, Lustre 2.10.4

Type: Bug Priority: Critical
Reporter: Li Xi (Inactive) Assignee: nasf (Inactive)
Resolution: Fixed Votes: 0
Labels: llnl

Issue Links:
Related
is related to LU-7991 Add project quota for ZFS Resolved
is related to LU-5638 sanity-quota test_33 for ZFS-based ba... Closed
is related to LU-9592 sanity-quota test cases 33 remove fro... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

Roughly 6,200 files of 10 MB each were created with a 'dd' loop, which was interrupted before it reached 10,000:

[thcrowe@td-mngt01 thcrowe]$ for i in `seq 1 10000` ; do dd if=/dev/zero of=file-$i bs=1M count=10; done
^C
[thcrowe@td-mngt01 thcrowe]$

All files were written to MDT index 0.

[thcrowe@td-mngt01 thcrowe]$ lfs getstripe -m file* | sort | uniq
0
[thcrowe@td-mngt01 thcrowe]$ ls -1 file* | wc -l
6208

So at this point, POSIX says I have 6,208 files named file-*.
Lustre, however, reports a different number.

[thcrowe@td-mngt01 thcrowe]$ lfs quota -u 415432 /mnt/slate
Disk quotas for usr 415432 (uid 415432):
Filesystem    kbytes  quota  limit  grace  files  quota  limit  grace
/mnt/slate  61619203      0      0      -   1473      0      0      -
[thcrowe@td-mngt01 thcrowe]$
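
The gap between the two counts can be pulled out mechanically. The following is a minimal sketch, assuming the single-line `lfs quota` layout shown above; `quota_files` is a hypothetical helper name, and the sample text is copied from this ticket rather than read from a live client.

```shell
#!/bin/sh
# Hypothetical helper: print the "files" column from `lfs quota -u` output
# read on stdin. Assumes the mount point and its numbers share one line,
# as in the output shown in this ticket.
quota_files() {
    awk -v mnt="$1" '$1 == mnt { print $6 }'
}

# Sample copied from the ticket. On a live client this would instead be:
#   lfs quota -u 415432 /mnt/slate | quota_files /mnt/slate
sample='Disk quotas for usr 415432 (uid 415432):
Filesystem kbytes quota limit grace files quota limit grace
/mnt/slate 61619203 0 0 - 1473 0 0 -'

reported=$(printf '%s\n' "$sample" | quota_files /mnt/slate)
posix=6208   # count from `ls | wc -l` in the description

echo "quota reports $reported files, POSIX sees $posix, difference $((posix - reported))"
```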

Here is the same data read directly from the MDT:

[root@slate-mds01 ~]# grep -A1 415432 /proc/fs/lustre/osd-zfs/slate-MDT0000/quota_slave/acct_user
- id: 415432
  usage: { inodes: 1473, kbytes: 53825 }
[root@slate-mds01 ~]#
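
For scripted checks, the inode figure can be extracted from that text. This is a sketch only: `acct_inodes` is a hypothetical helper, and it assumes each `id:` line is directly followed by its `usage:` line, as in the output above. The leading list marker is ignored.

```shell
#!/bin/sh
# Hypothetical helper: print the inode count for one id from acct_user-style
# text on stdin. Assumes the "usage:" line directly follows its "id:" line.
acct_inodes() {
    awk -v id="$1" '
        NF >= 2 && $(NF - 1) == "id:" { hit = ($NF == id); next }
        hit && /usage:/ {
            if (match($0, /inodes: *[0-9]+/)) {
                s = substr($0, RSTART, RLENGTH)
                sub(/inodes: */, "", s)
                print s
            }
            hit = 0
        }'
}

# Sample copied from the ticket. On the MDS the input would be the
# acct_user file under /proc/fs/lustre/osd-zfs/slate-MDT0000/quota_slave/.
sample='- id: 415432
  usage: { inodes: 1473, kbytes: 53825 }'

printf '%s\n' "$sample" | acct_inodes 415432
```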

The following commands were used to dump the ZFS contents of MDT index 0:

[root@slate-mds01 ~]# zpool list
NAME            SIZE  ALLOC   FREE  EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
mgs             186G  6.35M   186G         -    0%   0%  1.00x  ONLINE  -
slate_mdt0000  8.67T  16.1G  8.66T         -    0%   0%  1.00x  ONLINE  -
[root@slate-mds01 ~]# zpool set cachefile=/tmp/slate_mdt0000.cache slate_mdt0000
[root@slate-mds01 ~]# cp /tmp/slate_mdt0000.cache /tmp/slate_mdt0000.cache1
[root@slate-mds01 ~]# zpool set cachefile="" slate_mdt0000
[root@slate-mds01 ~]# zdb -ddddd -U /tmp/slate_mdt0000.cache1 slate_mdt0000 > /tmp/slate_mdt0000-zdb-ddddd

Once zdb had finished writing its output, a simple grep shows how many objects uid 415432 owns.

[root@slate-mds01 ~]# grep uid /tmp/slate_mdt0000-zdb-ddddd | grep -c 415432
6210

The count is 6210 rather than 6208 because the zdb output also contains 2 directory objects owned by 415432.

Further tests were run to check whether /proc/fs/lustre/osd-zfs/slate-MDT0000/quota_slave/acct_user reported correct information. After creating 100 files, the accounted number increased from 120 to 205 (an increase of 85, not 100), and after creating 1000 files, it increased from 120 to 929 (an increase of 809, not 1000).
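
The shortfall in those two test runs is plain arithmetic. A small sketch, using only the numbers quoted in the paragraph above (`check_delta` is a hypothetical helper name):

```shell
#!/bin/sh
# Compare how many files were created with how much the accounted
# number actually grew. Arguments: created, value before, value after.
check_delta() {
    created=$1
    before=$2
    after=$3
    counted=$((after - before))
    echo "created=$created counted=$counted missing=$((created - counted))"
}

check_delta 100 120 205    # prints: created=100 counted=85 missing=15
check_delta 1000 120 929   # prints: created=1000 counted=809 missing=191
```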

Space accounting on MDT works well.

 Comments   
Comment by Andreas Dilger [ 02/May/18 ]

What versions of ZFS and Lustre are installed on the MDS?  With ZFS 0.6, Lustre used its own Lustre-specific inode accounting implementation, but that was replaced with an in-ZFS dnode accounting mechanism in ZFS 0.7.

Comment by Peter Jones [ 03/May/18 ]

lixi, which Lustre version is being used here?

Comment by Li Xi (Inactive) [ 04/May/18 ]

The ZFS version is 0.7.5, and the userobj_accounting feature is active in ZFS. The Lustre version is 2.10.3.

Comment by Andreas Dilger [ 16/May/18 ]

This looks like it is a bug in the autoconf checking for ZFS dnode accounting introduced by the landing of patch https://review.whamcloud.com/30540 "LU-7991 quota: project quota against ZFS backend" to b2_10 (patch https://review.whamcloud.com/27093 on master).

diff --git a/config/lustre-build-zfs.m4 b/config/lustre-build-zfs.m4
index 9e39c80..297e790 100644
--- a/config/lustre-build-zfs.m4
+++ b/config/lustre-build-zfs.m4
@@ -523,10 +523,10 @@ your distribution.
                dnl # ZFS 0.7.0 feature: SPA_FEATURE_USEROBJ_ACCOUNTING
                dnl #
                LB_CHECK_COMPILE([if zfs has native dnode accounting supported],
-               dmu_objset_userobjspace_upgrade, [
+               dmu_objset_id_quota_upgrade, [
                        #include <sys/dmu_objset.h>
                ],[
-                       dmu_objset_userobjspace_upgrade(NULL);
+                       dmu_objset_id_quota_upgrade(NULL);
                ],[
                        AC_DEFINE(HAVE_DMU_USEROBJ_ACCOUNTING, 1,
                                [Have native dnode accounting in ZFS])

In particular, the dmu_objset_id_quota_upgrade() function only exists in ZFS master (for 0.8), while dmu_objset_userobjspace_upgrade() is what exists in ZFS 0.7. We need to check for both. Lustre doesn't call this function directly (that would cause the on-disk format to be changed and prevent downgrade to an older ZFS release), so we don't need any compatibility functions in Lustre, just the detection needs to be fixed.
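
The idea of checking for both names can be illustrated outside autoconf. This is only a stand-in sketch, not the landed patch: the real check compiles a test program through LB_CHECK_COMPILE, whereas this greps a header for either symbol. The helper name `has_dnode_accounting` and the fake header files are assumptions made for the demo.

```shell
#!/bin/sh
# Illustration only: succeed if a header mentions either upgrade function.
# $1 = path to a header file (hypothetical; the real check compiles against
# <sys/dmu_objset.h> instead of grepping it).
has_dnode_accounting() {
    grep -Eq 'dmu_objset_(userobjspace|id_quota)_upgrade' "$1"
}

# Self-contained demo with fake headers standing in for three ZFS releases.
tmp=$(mktemp -d)
echo 'void dmu_objset_userobjspace_upgrade(objset_t *os);' > "$tmp/zfs07.h"
echo 'void dmu_objset_id_quota_upgrade(objset_t *os);' > "$tmp/zfs08.h"
echo 'int unrelated(void);' > "$tmp/zfs06.h"

for h in zfs07 zfs08 zfs06; do
    if has_dnode_accounting "$tmp/$h.h"; then
        echo "$h: native dnode accounting detected"
    else
        echo "$h: not detected"
    fi
done
rm -rf "$tmp"
```

Accepting either symbol means a build against ZFS 0.7 and a build against ZFS master both take the native accounting path, which is the behaviour the comment above asks for.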

Comment by nasf (Inactive) [ 16/May/18 ]

Here is the patch:
https://review.whamcloud.com/#/c/32418/

Comment by Gerrit Updater [ 16/May/18 ]

Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: https://review.whamcloud.com/32422
Subject: LU-11024 osd-zfs: properly detect ZFS dnode accounting
Project: fs/lustre-release
Branch: b2_10
Current Patch Set: 1
Commit: 3a4677a73c885035e48d4aabeefa2592596a64fd

Comment by Andreas Dilger [ 16/May/18 ]

Note that this bug caused the ZFS dnode/inode quota to be reported incorrectly by Lustre - it wasn't using the DMU interface for reporting the dnode quota, but had fallen back to estimating dnode usage based on the user's space usage as if using ZFS 0.6.x which didn't have that interface. ZFS was still accounting the dnode usage correctly internally. Once the autoconf patch is applied and the Lustre server is rebuilt/restarted, then the ZFS dnode/inode quota will be reported correctly to Lustre.
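
One way to confirm which path a rebuilt server will take is to look for the HAVE_DMU_USEROBJ_ACCOUNTING define that the configure check emits. The sketch below assumes the define ends up in an autoconf-generated header such as config.h in the build tree; that location and the helper name `macro_defined` are assumptions, and the demo uses temporary stand-in files rather than a real build.

```shell
#!/bin/sh
# Succeed if an autoconf-style header defines the given macro to 1.
# $1 = path to the header (assumed to be config.h in the build tree)
# $2 = macro name
macro_defined() {
    grep -Eq "^#define[[:space:]]+$2[[:space:]]+1" "$1"
}

# Stand-in headers: one as configure would write it when the check
# succeeds, one as it would look when the check fails.
tmp=$(mktemp -d)
printf '#define HAVE_DMU_USEROBJ_ACCOUNTING 1\n' > "$tmp/fixed.h"
printf '/* #undef HAVE_DMU_USEROBJ_ACCOUNTING */\n' > "$tmp/broken.h"

for f in fixed broken; do
    if macro_defined "$tmp/$f.h" HAVE_DMU_USEROBJ_ACCOUNTING; then
        echo "$f: native ZFS dnode accounting will be used"
    else
        echo "$f: dnode usage will be estimated from space usage"
    fi
done
rm -rf "$tmp"
```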

Comment by Gerrit Updater [ 18/May/18 ]

John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/32422/
Subject: LU-11024 osd-zfs: properly detect ZFS dnode accounting
Project: fs/lustre-release
Branch: b2_10
Current Patch Set:
Commit: 71943a5d23498fc76d621e3855d580ff90f0757a

Comment by Gerrit Updater [ 21/May/18 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/32418/
Subject: LU-11024 osd-zfs: properly detect ZFS dnode accounting
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 4813eee9b06f6ceb4f39cce42d904ec1516a824a

Comment by Peter Jones [ 21/May/18 ]

Landed for 2.12
