  1. Lustre
  2. LU-19779

ENOSPC observability to enable safe near-full OST utilization

Details

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Medium

    Description

      In Lustre, filesystem “full” behavior is effectively driven by per-OST capacity rather than by aggregate free space. When an OST approaches 100% utilization, writes targeting that OST can fail with ENOSPC even if other OSTs still have room. To avoid unpredictable job failures, administrators often start deleting data early, commonly at around 98% OST usage. On very large filesystems (e.g., tens of PB), this operational “fear margin” translates into substantial unusable capacity and cost inefficiency: a 2% margin is roughly 200 TB for every 10 PB of capacity.
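
      The per-OST effect described above can be illustrated with a minimal sketch. It is not part of this ticket and does not call any Lustre API: the OST names, sizes, and the 98% threshold are hypothetical values chosen for illustration. It shows that aggregate utilization can look comfortable while individual OSTs are already at the point where writes may hit ENOSPC, and it estimates the capacity held back by a fixed threshold.

```python
from dataclasses import dataclass


@dataclass
class Ost:
    name: str        # hypothetical OST label, not read from a real system
    total_tb: float  # OST capacity in TB
    used_tb: float   # space currently used in TB

    @property
    def utilization(self) -> float:
        return self.used_tb / self.total_tb


def osts_over(osts, threshold):
    """Names of OSTs whose utilization is at or above the threshold."""
    return [o.name for o in osts if o.utilization >= threshold]


def aggregate_utilization(osts):
    """Filesystem-wide utilization, which hides per-OST imbalance."""
    return sum(o.used_tb for o in osts) / sum(o.total_tb for o in osts)


def held_back_tb(osts, threshold):
    """Capacity left unused if every OST is kept below the threshold."""
    return sum(o.total_tb * (1.0 - threshold) for o in osts)


# Hypothetical sample data: three equally sized OSTs, unevenly filled.
SAMPLE = [
    Ost("OST0000", 100.0, 99.0),
    Ost("OST0001", 100.0, 60.0),
    Ost("OST0002", 100.0, 98.5),
]

if __name__ == "__main__":
    print(f"aggregate utilization: {aggregate_utilization(SAMPLE):.1%}")
    print("OSTs at or above 98%:", osts_over(SAMPLE, 0.98))
    print(f"capacity held back at 98%: {held_back_tb(SAMPLE, 0.98):.1f} TB")
```

      In this made-up case the filesystem as a whole is under 90% full, yet two of the three OSTs are already past the 98% mark, which is the situation that makes a filesystem-wide number a poor predictor of ENOSPC.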

      Currently, when ENOSPC occurs, operators often cannot quickly determine which file was involved, which makes incident response and policy decisions difficult. This lack of attribution is a major factor preventing confident near-full utilization.

          People

            skoyama Sohei Koyama
            Votes: 0
            Watchers: 5
