Details
- Type: Bug
- Resolution: Unresolved
- Priority: Major
- Fix Version/s: None
- Affects Version/s: Lustre 2.14.0
- Labels: None
- Branch: master
- Severity: 3
Description
Here is the tested workload: 4k random write, FPP (file per process).

[randwrite]
ioengine=libaio
rw=randwrite
blocksize=4k
iodepth=4
direct=1
size=${SIZE}
runtime=60
numjobs=16
group_reporting
directory=/ai400x/out
create_serialize=0
filename_format=f.$jobnum.$filenum
In this test case, each of the 2 clients runs 16 fio processes, and each fio process does 4k random writes to its own file (the client/server setup is sketched below).
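For reference, a minimal sketch of the fio client/server layout assumed above; the hostfile contents and server invocation are not shown in this ticket, so the hostnames below are placeholders.

# on each of the two client nodes, start the fio backend that --client connects to
fio --server

# on the controller node, the hostfile passed to --client lists one client per line
cat > hostfile <<EOF
client01
client02
EOF

With numjobs=16 in the job file, each listed client runs 16 fio processes, matching the description above.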
However, when the file size is large (128GB in this case), there is a huge performance impact. Here are the two test results.
1GB file
# SIZE=1g /work/ihara/fio.git/fio --client=hostfile randomwrite.fio
write: IOPS=16.8k, BW=65.5MiB/s (68.7MB/s)(3930MiB/60004msec); 0 zone resets

128GB file
# SIZE=128g /work/ihara/fio.git/fio --client=hostfile randomwrite.fio
write: IOPS=2894, BW=11.3MiB/s (11.9MB/s)(679MiB/60039msec)
From the CPU profiles I collected on the OSS for both cases, the 128GB file case shows heavy spinlock contention in ldiskfs_mb_new_block() and ldiskfs_mb_normalized_request(), accounting for 89% (14085/15823 samples) of the total ost_io_xx() time, versus 20% (1895/9296 samples) in the 1GB file case. Please see the attached flame graph.
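For reference, a flame graph like the attached one can be captured on the OSS with perf and the FlameGraph scripts; this is only a sketch, assuming stackcollapse-perf.pl and flamegraph.pl are available in the current directory, and the exact collection settings may differ from those used for the attachment.

# sample all CPUs with call stacks for 60s while the fio run is in progress
perf record -F 99 -a -g -- sleep 60

# fold the stacks and render the flame graph as an SVG
perf script | ./stackcollapse-perf.pl | ./flamegraph.pl > oss-flamegraph.svg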
Attachments
Issue Links
- is related to: LU-13765 "ldiskfs_mb_mark_diskspace_used:3472: aborting transaction: error 28 in __ldiskfs_handle_dirty_metadata" (Resolved)