Lustre / LU-13973

4K random write performance impact on large sparse files


Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.14.0
    • Component/s: None
    • Labels: master
    • Severity: 3

    Description

      Here is the tested workload:

      4K random write, FPP (file per process)

      [randwrite]
      ; asynchronous I/O via libaio, 4 requests in flight per process
      ioengine=libaio
      rw=randwrite
      blocksize=4k
      iodepth=4
      ; O_DIRECT: bypass the client page cache
      direct=1
      ; per-file size, taken from the SIZE environment variable
      size=${SIZE}
      runtime=60
      ; 16 fio processes per client
      numjobs=16
      group_reporting
      directory=/ai400x/out
      ; do not serialize file creation across jobs
      create_serialize=0
      ; one file per process (FPP): f.<jobnum>.<filenum>
      filename_format=f.$jobnum.$filenum
      

      In this test, two clients each run 16 fio processes, and each fio process does 4K random writes to its own file; a sketch of the two-client invocation is below. However, when the file size is large (128GB in this case), performance drops dramatically, as the two test results show.
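
      For reference, here is a minimal sketch of driving such a two-client run with fio's client/server mode. The hostnames and pid-file path are illustrative assumptions, not taken from the original setup.

      # on each of the two client nodes, start a fio server (hypothetical pid-file path)
      fio --server --daemonize=/var/run/fio.pid

      # on the controlling node, list the client hostnames (hypothetical) in "hostfile"
      printf 'client1\nclient2\n' > hostfile

      # drive the same job file on both clients at once, as in the commands below
      SIZE=1g fio --client=hostfile randomwrite.fio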

      1GB file

      # SIZE=1g /work/ihara/fio.git/fio --client=hostfile randomwrite.fio
      
      write: IOPS=16.8k, BW=65.5MiB/s (68.7MB/s)(3930MiB/60004msec); 0 zone resets
       

      128GB file

      # SIZE=128g /work/ihara/fio.git/fio --client=hostfile randomwrite.fio
      
      write: IOPS=2894, BW=11.3MiB/s (11.9MB/s)(679MiB/60039msec)
       

      Comparing CPU profiles collected on the OSS for the two cases: in the 128GB case there is heavy spinlock contention in ldiskfs_mb_new_block() and ldiskfs_mb_normalized_request(), which accounts for 89% (14085/15823 samples) of the total ost_io_xx() time, versus 20% (1895/9296 samples) in the 1GB case. Please see the attached flamegraph.
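
      For context, a CPU profile like this can be collected with perf and rendered as a flamegraph. This is a minimal sketch assuming Brendan Gregg's FlameGraph scripts (stackcollapse-perf.pl, flamegraph.pl) are available on the OSS; it may differ from how the attached profile was actually gathered.

      # sample all CPUs with call graphs for 30 seconds while the fio run is active
      perf record -F 99 -a -g -- sleep 30

      # fold the stacks and render an interactive SVG flamegraph
      perf script | ./stackcollapse-perf.pl | ./flamegraph.pl > oss-flamegraph.svg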

      Attachments

            People

              Assignee: Qian Yingjin (qian_wc)
              Reporter: Shuichi Ihara (sihara)
