Details
-
Improvement
-
Resolution: Unresolved
-
Medium
Description
In Lustre, filesystem “full” behavior is effectively driven by per-OST capacity rather than by aggregate free space. When an OST approaches 100% utilization, writes targeting that OST can fail with ENOSPC, which often forces administrators to start deleting data early (commonly at around 98% OST usage) to avoid unpredictable job failures. On very large filesystems (e.g., tens of PB), this operational “fear margin” translates into substantial unusable capacity and cost inefficiency.
Currently, when ENOSPC occurs, operators often cannot quickly determine which file was involved, which makes incident response and policy decisions difficult. This lack of attribution and visibility is a major factor preventing confident near-full utilization.
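
To illustrate the per-OST nature of the problem described above, here is a minimal Python sketch that scans `lfs df`-style text and lists the OSTs at or above a usage threshold. It is not part of any proposed fix: the function name `osts_near_full`, the 98% default, and the sample filesystem `testfs` with its numbers are all hypothetical, and the column layout is assumed to match typical `lfs df` output, which may differ between Lustre versions. Usage is computed as used / (used + available), so it can differ slightly from the rounded `Use%` column.

```python
import re

# One OST row of `lfs df`-style output. The column layout is an assumption:
# UUID, 1K-blocks, Used, Available, Use%, mount point.
OST_ROW = re.compile(
    r"^(?P<uuid>\S+-OST[0-9a-fA-F]+_UUID)\s+"
    r"(?P<total>\d+)\s+(?P<used>\d+)\s+(?P<avail>\d+)\s+"
    r"(?P<pct>\d+)%"
)


def osts_near_full(df_text, threshold_pct=98.0):
    """Return (uuid, used_pct, avail_kib) for OSTs at or above threshold_pct.

    Hypothetical helper: results are sorted fullest first.
    """
    hot = []
    for line in df_text.splitlines():
        m = OST_ROW.match(line.strip())
        if not m:
            continue  # header, MDT rows, blank lines, summary line
        used = int(m["used"])
        avail = int(m["avail"])
        if used + avail == 0:
            continue  # avoid division by zero on an empty or inactive target
        used_pct = 100.0 * used / (used + avail)
        if used_pct >= threshold_pct:
            hot.append((m["uuid"], round(used_pct, 2), avail))
    return sorted(hot, key=lambda row: row[1], reverse=True)


# Made-up sample: the filesystem as a whole is about 75% used,
# yet one OST is already close to returning ENOSPC.
SAMPLE = """\
UUID                   1K-blocks        Used   Available Use% Mounted on
testfs-MDT0000_UUID      1000000      100000      900000  10% /mnt/testfs[MDT:0]
testfs-OST0000_UUID     10000000     9900000      100000  99% /mnt/testfs[OST:0]
testfs-OST0001_UUID     10000000     5000000     5000000  50% /mnt/testfs[OST:1]

filesystem_summary:     20000000    14900000     5100000  75% /mnt/testfs
"""

if __name__ == "__main__":
    for uuid, used_pct, avail_kib in osts_near_full(SAMPLE):
        print(f"{uuid}: {used_pct}% used, {avail_kib} KiB available")
```

In practice the text would come from running `lfs df` against the mounted filesystem. Note that a report like this identifies only the full OST, not the file whose write failed, which is the attribution gap this ticket describes.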