
LU-13973: 4K random write performance impacts on large sparse files

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.14.0
    • Labels: None
    • Environment: master
    • Severity: 3

    Description

      Here is the tested workload.

      4K random write, FPP (file per process)

      [randwrite]
      ioengine=libaio
      rw=randwrite
      blocksize=4k
      iodepth=4
      direct=1
      size=${SIZE}
      runtime=60
      numjobs=16
      group_reporting
      directory=/ai400x/out
      create_serialize=0
      filename_format=f.$jobnum.$filenum
      

      In this test case, two clients each run 16 fio processes, and each fio process issues 4K random writes to a different file. However, when the file size is large (128GB in this case), performance drops dramatically. Here are the two test results.
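
      For reference, the hostfile passed to --client is a local file listing the client hosts, one per line; fio connects to each listed host. Judging from the command line in the original version of this description, it presumably contained the two client nodes:

      ec01
      ec02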

      1GB file

      # SIZE=1g /work/ihara/fio.git/fio --client=hostfile randomwrite.fio
      
      write: IOPS=16.8k, BW=65.5MiB/s (68.7MB/s)(3930MiB/60004msec); 0 zone resets
       

      128GB file

      # SIZE=128g /work/ihara/fio.git/fio --client=hostfile randomwrite.fio
      
      write: IOPS=2894, BW=11.3MiB/s (11.9MB/s)(679MiB/60039msec)
       

      Comparing the two cases with CPU profiles collected on the OSS: in the 128GB case there was heavy spinlock contention in ldiskfs_mb_new_blocks() and ldiskfs_mb_normalize_request(), which accounted for 89% (14085/15823 samples) of the total ost_io_xx() time, versus 20% (1895/9296 samples) in the 1GB case. Presumably almost every 4K write into the large sparse file triggers a fresh block allocation, so the ost_io threads pile up on the mballoc locks. Please see the attached flamegraph.
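
      The profiling method is not stated in the ticket; a minimal sketch of how such an on-CPU flamegraph can be collected on the OSS, assuming perf and Brendan Gregg's FlameGraph scripts (stackcollapse-perf.pl, flamegraph.pl) are available, would be:

      # sample all CPUs on the OSS for 60s while the fio workload runs
      perf record -F 99 -a -g -- sleep 60
      # fold the sampled stacks and render them as an SVG flamegraph
      perf script | ./stackcollapse-perf.pl | ./flamegraph.pl > oss-flamegraph.svg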

Attachments

Issue Links

    related to: EX-1772, LU-13765

Activity

[LU-13973] 4K random write performance impacts on large sparse files
            pjones Peter Jones made changes -
            Assignee Original: WC Triage [ wc-triage ] New: Qian Yingjin [ qian_wc ]
            pjones Peter Jones made changes -
            Link New: This issue is related to EX-1772 [ EX-1772 ]
            sihara Shuichi Ihara made changes -
            Description Original → New: changed the fio invocation from "--client=ec01 --client=ec02" to "--client=hostfile"; the description text is otherwise identical to the current description above.
            sihara Shuichi Ihara made changes -
            Link New: This issue is related to LU-13765 [ LU-13765 ]
            sihara Shuichi Ihara created issue -

People

    Assignee: Qian Yingjin (qian_wc)
    Reporter: Shuichi Ihara (sihara)
    Votes: 0
    Watchers: 5