  1. Lustre
  2. LU-19900

lov: DIO with O_APPEND sends all data to stripe 0

Details

    • Bug
    • Resolution: Unresolved
    • Medium

    Description

      When a file is opened with O_APPEND and written via direct IO (either explicit O_DIRECT or via hybrid IO promotion), all data is incorrectly routed to stripe 0 instead of being distributed across stripes. This causes inflated i_size and subsequent position corruption or hangs on multi-stripe files.

      Root cause: Two places in the LOV layer assume that DIO pages are pre-split to a single stripe by lov_io_rw_iter_init(). This is true for non-APPEND DIO, but lov_io_rw_iter_init() explicitly skips stripe splitting for O_APPEND (since the final file position is not known until the OST processes the write). As a result:

      1. lov_page_init_composite() caches the first DIO page's stripe index and applies it to all subsequent pages - wrong for APPEND where pages span multiple stripes.

      2. lov_io_submit() uses O(1) cl_page_list_splice to send all DIO pages to the first page's stripe - wrong for APPEND where pages belong to different stripes.
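
      The effect of reusing the first page's stripe index can be illustrated with a small standalone model. This is only a sketch under assumptions: it uses plain RAID0 round-robin striping, and stripe_index() and misrouted_pages() are hypothetical helpers written for illustration, not Lustre functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a Lustre function): RAID0 stripe index
 * of a file offset under simple round-robin striping. */
static unsigned int stripe_index(uint64_t file_off, uint64_t stripe_size,
                                 unsigned int stripe_count)
{
    return (unsigned int)((file_off / stripe_size) % stripe_count);
}

/* Count the pages whose real stripe differs from the stripe index
 * cached from the first page of the IO. */
static size_t misrouted_pages(uint64_t start_off, size_t npages,
                              uint64_t page_size, uint64_t stripe_size,
                              unsigned int stripe_count)
{
    unsigned int cached = stripe_index(start_off, stripe_size, stripe_count);
    size_t wrong = 0;

    for (size_t i = 0; i < npages; i++) {
        uint64_t off = start_off + i * page_size;

        if (stripe_index(off, stripe_size, stripe_count) != cached)
            wrong++;
    }
    return wrong;
}
```

      In this model, a 4MB write starting at offset 0 with 4KB pages, a 1MB stripe size and 2 stripes has half of its pages on stripe 1, so a cached index of 0 is wrong for those pages. A write that stays inside one stripe has no misrouted pages, which is why the cache is safe when the IO has been pre-split.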

      The fix adds cl_io_is_append() checks to both code paths so that APPEND DIO pages are handled like buffered IO (per-page stripe grouping), while preserving the O(1) splice optimization for non-APPEND DIO.
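
      The shape of that decision can be sketched in a toy model. This is not the actual patch: struct page, struct submit_result, page_stripe() and submit_pages() are hypothetical names, the is_append flag stands in for a cl_io_is_append() check, and the model only counts how many pages each stripe would receive (it assumes stripe_count does not exceed MAX_STRIPES).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_STRIPES 8 /* illustrative limit for this model only */

struct page {
    uint64_t file_off; /* file offset covered by this page */
};

/* Number of pages handed to each stripe by one submit pass. */
struct submit_result {
    size_t per_stripe[MAX_STRIPES];
};

static unsigned int page_stripe(const struct page *pg, uint64_t stripe_size,
                                unsigned int stripe_count)
{
    return (unsigned int)((pg->file_off / stripe_size) % stripe_count);
}

static struct submit_result submit_pages(const struct page *pages, size_t n,
                                         bool is_dio, bool is_append,
                                         uint64_t stripe_size,
                                         unsigned int stripe_count)
{
    struct submit_result res = {0};

    if (n == 0)
        return res;

    if (is_dio && !is_append) {
        /* Fast path: the IO was pre-split to one stripe, so the whole
         * list goes to the first page's stripe in one step. */
        res.per_stripe[page_stripe(&pages[0], stripe_size, stripe_count)] = n;
        return res;
    }

    /* Buffered IO and APPEND DIO: group page by page. */
    for (size_t i = 0; i < n; i++)
        res.per_stripe[page_stripe(&pages[i], stripe_size, stripe_count)]++;

    return res;
}
```

      Calling submit_pages() with is_dio set and is_append clear on pages that span two stripes reproduces the misrouting in this model: every page is counted against the first page's stripe.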

      Reproducer: Write 4MB with O_DIRECT|O_APPEND to a file with 2 stripes and a 1MB stripe size. Expected file size: 4MB. Actual: 7MB (all data lands on stripe 0, so LOV computes the wrong file size from the per-object sizes).
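
      The 7MB figure follows from ordinary RAID0 size arithmetic, assuming the file has 2 stripes with a 1MB stripe size. The sketch below is a standalone model of that arithmetic, not Lustre's own code; object_to_file_size() and file_size_from_objects() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* File size implied by one object's size: map the object's last byte
 * back to its file offset under round-robin striping, then add one. */
static uint64_t object_to_file_size(uint64_t obj_size, unsigned int stripe_idx,
                                    uint64_t stripe_size,
                                    unsigned int stripe_count)
{
    if (obj_size == 0)
        return 0;

    uint64_t last = obj_size - 1;
    uint64_t chunk = last / stripe_size;  /* stripe-sized chunk in the object */
    uint64_t within = last % stripe_size; /* offset inside that chunk */

    return (chunk * stripe_count + stripe_idx) * stripe_size + within + 1;
}

/* The file size is the largest size implied by any object. */
static uint64_t file_size_from_objects(const uint64_t *obj_sizes,
                                       unsigned int stripe_count,
                                       uint64_t stripe_size)
{
    uint64_t size = 0;

    for (unsigned int i = 0; i < stripe_count; i++) {
        uint64_t s = object_to_file_size(obj_sizes[i], i, stripe_size,
                                         stripe_count);
        if (s > size)
            size = s;
    }
    return size;
}
```

      With the data spread correctly, each object holds 2MB and the computed file size is 4MB. With all 4MB on the stripe 0 object, its last byte sits in chunk 3 of that object, which maps to file offset 7MB - 1, giving the 7MB size.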

            People

              wc-triage WC Triage
              paf0186 Patrick Farrell