Details
Bug
Resolution: Unresolved
Minor
Lustre 2.12.6 + patches
Description
We observed invalid quota accounting for some groups on several OSTs:
"lctl get_param osd-ldiskfs.<OST>.quota_slave.acct_group" reported 2 to 7 times more inodes than "find -gid <gid>" found on the ldiskfs OST target.
After dropping the dentry caches on all OSSes, the accounting values returned to normal:
echo 2 > /proc/sys/vm/drop_caches
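The comparison between the accounting values and the "find -gid <gid>" count can be scripted. The following is a minimal Python sketch, not the procedure actually used here: it assumes the acct_group output has the usual "- id: N / usage: { inodes: N, kbytes: N }" layout, and the helper names (parse_acct, inode_ratio) and the sample values are hypothetical, chosen only for illustration.

```python
import re

# Assumed layout of one entry in the acct_group output:
#   - id:      1000
#     usage:   { inodes:  700, kbytes:  5600 }
ENTRY_RE = re.compile(
    r"-\s*id:\s*(\d+)\s+usage:\s*\{\s*inodes:\s*(\d+),\s*kbytes:\s*(\d+)\s*\}"
)


def parse_acct(text):
    """Return {gid: (inodes, kbytes)} parsed from acct_group-style text."""
    return {int(g): (int(i), int(k)) for g, i, k in ENTRY_RE.findall(text)}


def inode_ratio(acct, gid, found_inodes):
    """Ratio of accounted inodes to the inode count found on disk for one gid."""
    inodes, _kbytes = acct[gid]
    return inodes / found_inodes


# Made-up sample text (not taken from the affected filesystem).
SAMPLE = """\
osd-ldiskfs.testfs-OST0000.quota_slave.acct_group=
grp_accounting:
- id:      0
  usage:   { inodes:                  300, kbytes:                 1200 }
- id:      1000
  usage:   { inodes:                  700, kbytes:                 5600 }
"""

acct = parse_acct(SAMPLE)
print(acct[1000])
# found_inodes would come from counting the output of "find -gid <gid>".
print(inode_ratio(acct, 1000, found_inodes=200))
```

A ratio well above 1 for a gid would correspond to the over-accounting described above.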
For example, for the gid 34132 on OST001e:
- Before dropping cache: accounting space used = 38705302288 kbytes ~ 36T
- After dropping cache: accounting space used = 13905337816 kbytes ~ 12T
- diff = 24855082348 kbytes ~ 23T (-64%)
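As a quick check of the orders of magnitude, the two quoted totals can be converted and compared. This sketch assumes that "kbytes" means units of 1024 bytes and that "T" means TiB; the function names are hypothetical. The difference recomputed from the two totals need not match the quoted diff exactly, for example if the values were sampled at slightly different times.

```python
KIB_PER_TIB = 1024 ** 3  # assumption: 1 kbyte = 1024 bytes, T = TiB


def kib_to_tib(kib):
    """Convert a size in KiB to TiB."""
    return kib / KIB_PER_TIB


def relative_drop(before, after):
    """Percentage by which the value decreased from 'before' to 'after'."""
    return 100.0 * (before - after) / before


before_kib = 38705302288  # gid 34132 on OST001e, before dropping caches
after_kib = 13905337816   # same gid and OST, after dropping caches

print(f"before: {kib_to_tib(before_kib):.2f} TiB")
print(f"after:  {kib_to_tib(after_kib):.2f} TiB")
print(f"diff:   {before_kib - after_kib} kbytes")
print(f"drop:   {relative_drop(before_kib, after_kib):.1f} %")
```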
This filesystem is used for small-file operations (the targets are on SSDs).
We think the issue occurred after repeatedly running the following steps:
- untar 30 000 000 files (total ~ 100T)
- stat these files
- delete these files
The tool used to tar/untar is mpifileutils (https://github.com/hpc/mpifileutils), run with 360 processes (distributed across 10 Lustre clients).