Details
Improvement
Resolution: Unresolved
Minor
Description
Some applications issue stat() calls in a directory whose child files all follow a regular naming pattern:
- mdtest benchmark: mdtest.$i
- du on a directory with more than 10000 entries: file.$i
- ML/AI workloads whose ingested data typically follows a file naming rule within the directory.
These applications call stat() on the files in alphabetical order of the file names.
This kind of metadata access pattern cannot be optimized by the current statahead mechanism.
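For illustration, the access pattern described above might look like the following minimal sketch. The helper name stat_by_pattern and the "prefix.index" layout are assumptions made for this example only; they are not taken from mdtest or from Lustre. Note that the loop never calls readdir(), so it stats files in name order rather than in directory order.

```c
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: stat dir/prefix.0 .. dir/prefix.(count-1) in index
 * order, without ever listing the directory. Returns the number of
 * successful stat() calls.
 */
static int stat_by_pattern(const char *dir, const char *prefix, int count)
{
    char path[4096];
    struct stat st;
    int ok = 0;

    for (int i = 0; i < count; i++) {
        int n = snprintf(path, sizeof(path), "%s/%s.%d", dir, prefix, i);

        if (n < 0 || (size_t)n >= sizeof(path))
            continue; /* path did not fit in the buffer, skip it */
        if (stat(path, &st) == 0)
            ok++;
    }
    return ok;
}
```

A call such as stat_by_pattern("/mnt/lustre/testdir", "mdtest", 10000) would reproduce the mdtest-style ordering; the mount point shown here is only a placeholder.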
The current statahead mechanism works as follows:
1. Open the directory via opendir() call. It will authorize the statahead.
2. readdir() to get the name and inode number for the dentries.
3. Call stat() on the dentries one by one.
4. Close the directory which will deauthorize the statahead.
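From the application side, the four steps above correspond roughly to the following sketch. It shows only the userspace call sequence that the current statahead expects, not the client-side kernel logic; the function name stat_in_readdir_order is made up for this example.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: stat every entry of dir in the order readdir()
 * returns them. Returns the number of successful stat() calls, or -1 if
 * the directory cannot be opened.
 */
static int stat_in_readdir_order(const char *dir)
{
    char path[4096];
    struct dirent *de;
    struct stat st;
    int ok = 0;
    DIR *d = opendir(dir);              /* step 1: open the directory */

    if (d == NULL)
        return -1;

    while ((de = readdir(d)) != NULL) { /* step 2: read name and inode */
        int n;

        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        n = snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
        if (n < 0 || (size_t)n >= sizeof(path))
            continue;                   /* path too long, skip it */
        if (stat(path, &st) == 0)       /* step 3: stat in readdir order */
            ok++;
    }

    closedir(d);                        /* step 4: close the directory */
    return ok;
}
```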
In the current statahead mechanism, the stat() calls are expected to arrive in the order in which readdir() populates the dentries. For the ldiskfs backend, that order is determined by the hash of the file name, not by any sorted order of the names.
However, we can improve statahead to support a statahead pattern based on regularized file names, which would optimize metadata performance for the applications above.
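As a rough idea of what detecting such a pattern could involve, the sketch below checks whether two consecutively stat()ed names share a prefix and have trailing decimal indices that differ by one, and if so predicts the next name. This is only an assumption about one possible heuristic, not the Lustre implementation; the names split_name and predict_next are hypothetical, and the sketch ignores zero-padded indices such as file.0009.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Split a name into a prefix length and a trailing decimal index.
 * Returns 0 on success, -1 if the name does not end in digits.
 */
static int split_name(const char *name, size_t *prefix_len,
                      unsigned long *index)
{
    size_t len = strlen(name);
    size_t p = len;

    while (p > 0 && isdigit((unsigned char)name[p - 1]))
        p--;
    if (p == len)
        return -1;
    *prefix_len = p;
    *index = strtoul(name + p, NULL, 10);
    return 0;
}

/*
 * If prev and cur look like "prefix<N>" and "prefix<N+1>", write the
 * predicted next name "prefix<N+2>" into out and return 1. Otherwise
 * return 0 and leave out untouched.
 */
static int predict_next(const char *prev, const char *cur,
                        char *out, size_t out_sz)
{
    size_t pl_prev, pl_cur;
    unsigned long i_prev, i_cur;
    int n;

    if (split_name(prev, &pl_prev, &i_prev) != 0 ||
        split_name(cur, &pl_cur, &i_cur) != 0)
        return 0;
    if (pl_prev != pl_cur || strncmp(prev, cur, pl_cur) != 0)
        return 0;               /* different prefixes */
    if (i_cur != i_prev + 1)
        return 0;               /* indices are not consecutive */

    n = snprintf(out, out_sz, "%.*s%lu", (int)pl_cur, cur, i_cur + 1);
    return n >= 0 && (size_t)n < out_sz;
}
```

With this heuristic, seeing stat("mdtest.1") followed by stat("mdtest.2") would predict "mdtest.3" as the next candidate to prefetch.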