  1. Lustre
  2. LU-2028

Potential data corruption in 'o2iblnd' (the IB LND driver) when using pre-mapped DMA buffers


Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • None
    • None
    • 3
    • 4155

    Description

      Code inspection of the o2iblnd DMA handling code (for pre-mapped DMA buffers) found incorrect use of the DMA API that could cause data corruption that is very hard to debug.

      The DMA API Howto document (http://www.kernel.org/doc/Documentation/DMA-API-HOWTO.txt) clearly states:

      If you need to use the same streaming DMA region multiple times and touch the data in between the DMA transfers, the buffer needs to be synced properly in order for the cpu and device to see the most uptodate and correct copy of the DMA buffer.
      So, firstly, just map it with dma_map_{single,sg}, and after each DMA transfer call either:
      dma_sync_single_for_cpu(dev, dma_handle, size, direction);
      or:
      dma_sync_sg_for_cpu(dev, sglist, nents, direction);
      as appropriate.

      'o2iblnd' does not make these calls between DMA transfers. Without 'dma_sync_single_for_cpu' (and the matching 'dma_sync_single_for_device' that the same document requires before handing the buffer back to the hardware), the new data might still be only in the CPU cache, so when the HCA DMAs the buffer to send it out, it might read and send the obsolete data, resulting in data corruption.

      It appears that so far we have been lucky that this issue has not affected us, but it may simply be difficult to hit on x86/x86_64 systems.

      The fix is trivial, and the benefit is prevention of very-hard-to-debug data corruption issues on HW architectures which would expose the incorrect use of the DMA API.


          People

            ashehata Amir Shehata (Inactive)
            mlizon Martin Lizon (Inactive)
