Details
-
Improvement
-
Resolution: Unresolved
-
Minor
Description
During discussion with LBNL about their Hadoop Spark project, they said that there was considerable overhead when running on Lustre, because repeated open+close of the same files causes extra RPC traffic to the MDS.
Lustre has the ability to cache opens on the client with a DLM openlock, but by default this isn't done for regular opens because it has extra overhead compared to uncached opens. It is enabled only for NFS opens, because knfsd repeatedly opens the same file.
It would be worthwhile to first implement a tunable to enable opencache on a per-client basis (LU-5426), and then measure the performance impact of this tunable both for normal usage and for Spark.
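As a starting point for the measurement, a minimal sketch of a micro-benchmark that mimics the repeated open+close pattern is shown below. It is only an illustration: the function name `time_open_close` and the iteration count are assumptions, and it does not touch any Lustre tunable itself. The idea would be to run it against a file on a Lustre client mount, once with opencache disabled and once with it enabled, and compare the results.

```python
import os
import tempfile
import time


def time_open_close(path, iterations=1000):
    """Return the mean wall-clock seconds per open+close pair on `path`."""
    start = time.perf_counter()
    for _ in range(iterations):
        # Each uncached open is expected to cost an RPC to the MDS on Lustre.
        fd = os.open(path, os.O_RDONLY)
        os.close(fd)
    elapsed = time.perf_counter() - start
    return elapsed / iterations


if __name__ == "__main__":
    # Placeholder target: a local temporary file. For a real comparison,
    # replace it with a file that lives on the Lustre mount under test.
    with tempfile.NamedTemporaryFile() as tmp:
        mean = time_open_close(tmp.name, iterations=1000)
        print(f"mean open+close latency: {mean * 1e6:.1f} us")
```

This only captures client-side latency for a single file. It says nothing about the MDS load or about the extra lock overhead for files that are opened just once, which would need a separate test.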