
[LU-3277] LU-2139 may cause the performance regression

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.6.0
    • Affects Version/s: Lustre 2.4.0
    • Environment: RHEL6.3 and current master
    • 2
    • 8114

    Description

      There is a performance regression on the current master (c864582b5d4541c7830d628457e55cd859aee005) if we have multiple IOR threads per client. As far as I can test, LU-2576 might be the cause of this performance regression. Here are quick test results on each commit.

      Client: commit ac37e7b4d101761bbff401ed12fcf671d6b68f9c

      # mpirun -np 8 /lustre/IOR -w -b 8g -t 1m -e -C -F -vv -o /lustre/ior.out/file
      IOR-2.10.3: MPI Coordinated Test of Parallel I/O
      
      Run began: Sun May  5 12:24:09 2013
      Command line used: /lustre/IOR -w -b 8g -t 1m -e -C -F -vv -o /lustre/ior.out/file
      Machine: Linux s08 2.6.32-279.19.1.el6_lustre.x86_64 #1 SMP Sat Feb 9 21:55:32 PST 2013 x86_64
      Using synchronized MPI timer
      Start time skew across all tasks: 0.00 sec
      Path: /lustre/ior.out
      FS: 683.5 TiB   Used FS: 0.0%   Inodes: 5.0 Mi   Used Inodes: 0.0%
      Participating tasks: 8
      Using reorderTasks '-C' (expecting block, not cyclic, task assignment)
      task 0 on s08
      task 1 on s08
      task 2 on s08
      task 3 on s08
      task 4 on s08
      task 5 on s08
      task 6 on s08
      task 7 on s08
      
      Summary:
      	api                = POSIX
      	test filename      = /lustre/ior.out/file
      	access             = file-per-process
      	pattern            = segmented (1 segment)
      	ordering in a file = sequential offsets
      	ordering inter file=constant task offsets = 1
      	clients            = 8 (8 per node)
      	repetitions        = 1
      	xfersize           = 1 MiB
      	blocksize          = 8 GiB
      	aggregate filesize = 64 GiB
      
      Using Time Stamp 1367781849 (0x5186b1d9) for Data Signature
      Commencing write performance test.
      Sun May  5 12:24:09 2013
      
      access    bw(MiB/s)  block(KiB) xfer(KiB)  open(s)    wr/rd(s)   close(s) total(s)  iter
      ------    ---------  ---------- ---------  --------   --------   --------  --------   ----
      write     3228.38    8388608    1024.00    0.001871   20.30      1.34       20.30      0    XXCEL
      Operation  Max (MiB)  Min (MiB)  Mean (MiB)   Std Dev  Max (OPs)  Min (OPs)  Mean (OPs)   Std Dev  Mean (s)  Op grep #Tasks tPN reps  fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize
      
      ---------  ---------  ---------  ----------   -------  ---------  ---------  ----------   -------  --------
      write        3228.38    3228.38     3228.38      0.00    3228.38    3228.38     3228.38      0.00  20.29996   8 8 1 1 1 1 0 0 1 8589934592 1048576 68719476736 -1 POSIX EXCEL
      
      Max Write: 3228.38 MiB/sec (3385.20 MB/sec)
      
      Run finished: Sun May  5 12:24:30 2013
      

      Client: commit 5661651b2cc6414686e7da581589c2ea0e1f1969

      # mpirun -np 8 /lustre/IOR -w -b 8g -t 1m -e -C -F -vv -o /lustre/ior.out/file
      IOR-2.10.3: MPI Coordinated Test of Parallel I/O
      
      Run began: Sun May  5 12:16:35 2013
      Command line used: /lustre/IOR -w -b 8g -t 1m -e -C -F -vv -o /lustre/ior.out/file
      Machine: Linux s08 2.6.32-279.19.1.el6_lustre.x86_64 #1 SMP Sat Feb 9 21:55:32 PST 2013 x86_64
      Using synchronized MPI timer
      Start time skew across all tasks: 0.00 sec
      Path: /lustre/ior.out
      FS: 683.5 TiB   Used FS: 0.0%   Inodes: 5.0 Mi   Used Inodes: 0.0%
      Participating tasks: 8
      Using reorderTasks '-C' (expecting block, not cyclic, task assignment)
      task 0 on s08
      task 1 on s08
      task 2 on s08
      task 3 on s08
      task 4 on s08
      task 5 on s08
      task 6 on s08
      task 7 on s08
      
      Summary:
      	api                = POSIX
      	test filename      = /lustre/ior.out/file
      	access             = file-per-process
      	pattern            = segmented (1 segment)
      	ordering in a file = sequential offsets
      	ordering inter file=constant task offsets = 1
      	clients            = 8 (8 per node)
      	repetitions        = 1
      	xfersize           = 1 MiB
      	blocksize          = 8 GiB
      	aggregate filesize = 64 GiB
      
      Using Time Stamp 1367781395 (0x5186b013) for Data Signature
      Commencing write performance test.
      Sun May  5 12:16:35 2013
      
      access    bw(MiB/s)  block(KiB) xfer(KiB)  open(s)    wr/rd(s)   close(s) total(s)  iter
      ------    ---------  ---------- ---------  --------   --------   --------  --------   ----
      write     550.28     8388608    1024.00    0.001730   119.10     2.76       119.10     0    XXCEL
      Operation  Max (MiB)  Min (MiB)  Mean (MiB)   Std Dev  Max (OPs)  Min (OPs)  Mean (OPs)   Std Dev  Mean (s)  Op grep #Tasks tPN reps  fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize
      
      ---------  ---------  ---------  ----------   -------  ---------  ---------  ----------   -------  --------
      write         550.28     550.28      550.28      0.00     550.28     550.28      550.28      0.00 119.09623   8 8 1 1 1 1 0 0 1 8589934592 1048576 68719476736 -1 POSIX EXCEL
      
      Max Write: 550.28 MiB/sec (577.01 MB/sec)
      
      Run finished: Sun May  5 12:18:34 2013
      

      In both tests, the servers were running the current master (c864582b5d4541c7830d628457e55cd859aee005).

      Attachments

        1. collectl.log (13 kB)
        2. stat.log (16 kB)


          Activity


            prakash Prakash Surya (Inactive) added a comment -

            Since I've been assigned to this, I'm marking it resolved. The "bad patch" was reverted, and there have been no reports of this since, which leads me to believe it is no longer an issue. Feel free to reopen if there is a compelling case to do so.

            General discussion of the LU-2139 issue and its pending patch stack is better suited to the LU-2139 ticket.

            prakash Prakash Surya (Inactive) added a comment -

            In case it proves useful, here's an example stack for a thread waiting in the state I described in my previous comment:

                fio           S 00000fffae72633c     0 59338  59283 0x00000000
                Call Trace:
                [c0000003e0deed20] [c0000003e0deede0] 0xc0000003e0deede0 (unreliable)
                [c0000003e0deeef0] [c000000000008e10] .__switch_to+0xc4/0x100
                [c0000003e0deef80] [c00000000042b0e0] .schedule+0x858/0x9c0
                [c0000003e0def230] [c00000000042b7c8] .schedule_timeout+0x1f8/0x240
                [c0000003e0def310] [c00000000042a444] .io_schedule_timeout+0x54/0x98
                [c0000003e0def3a0] [c00000000009ddfc] .balance_dirty_pages+0x294/0x390
                [c0000003e0def520] [c000000000095a2c] .generic_file_buffered_write+0x268/0x354
                [c0000003e0def660] [c000000000096074] .__generic_file_aio_write+0x374/0x3d8
                [c0000003e0def760] [c000000000096150] .generic_file_aio_write+0x78/0xe8
                [c0000003e0def810] [8000000006a7062c] .vvp_io_write_start+0xfc/0x3e0 [lustre]
                [c0000003e0def8e0] [800000000249a81c] .cl_io_start+0xcc/0x220 [obdclass]
                [c0000003e0def980] [80000000024a2634] .cl_io_loop+0x194/0x2c0 [obdclass]
                [c0000003e0defa30] [80000000069ea208] .ll_file_io_generic+0x498/0x670 [lustre]
                [c0000003e0defb30] [80000000069ea864] .ll_file_aio_write+0x1d4/0x3a0 [lustre]
                [c0000003e0defc00] [80000000069eab80] .ll_file_write+0x150/0x320 [lustre]
                [c0000003e0defce0] [c0000000000d1e9c] .vfs_write+0xd0/0x1c4
                [c0000003e0defd80] [c0000000000d208c] .SyS_write+0x54/0x98
                [c0000003e0defe30] [c000000000000580] syscall_exit+0x0/0x2c
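
            For what it's worth, here is a rough sketch of how a stack like this can be captured on a stalled writer (hypothetical commands; they assume the blocked process is named fio and that /proc/<pid>/stack is available on the client kernel):

                # dump the kernel stack of each fio process (needs CONFIG_STACKTRACE)
                for pid in $(pgrep fio); do
                    echo "=== pid $pid ==="
                    cat /proc/$pid/stack
                done

                # or dump every task's stack to the kernel log (requires sysrq to be enabled)
                echo t > /proc/sysrq-trigger
                dmesg | tail -n 500

            Either output should show whether the writers are parked in balance_dirty_pages.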
            

            jlevi Jodi Levi (Inactive) added a comment -

            Lowering priority as the LU-2139 patch was reverted.

            prakash Prakash Surya (Inactive) added a comment (edited) -

            > Prakash, attached are llite.*.unstable_stats and grep NFS_Unstable /proc/meminfo output during IOR.

            Thanks, I'll give these a look.

            > it's doing like this: writing; idle; writing; idle; ...

            OK, that sounds exactly like what I saw during my testing when I was hitting the kernel's dirty page limits. Allow me to explain.

            So, with that patch in place, we're now properly informing the kernel of our unstable pages by incrementing and decrementing the NFS_Unstable zone page counter. You can see this by watching the NFS_Unstable field in /proc/meminfo (before the patch it will always be zero; after the patch it will fluctuate with Lustre IO). That's all well and good, but how does it relate to the idle time seen during IO? What I think is happening is that the newly accounted-for unstable pages are factored in when the kernel calls balance_dirty_pages; balance_dirty_pages then determines the system is out of dirty page headroom and sleeps, waiting for writeback to flush the dirty pages to disk.
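
            For reference, a minimal way to watch that accounting live during a run (a sketch; these are standard /proc/meminfo fields on RHEL6):

                # sample dirty/writeback/unstable page accounting once per second
                watch -n 1 'grep -E "^(Dirty|Writeback|NFS_Unstable):" /proc/meminfo'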

            You can verify this theory by dumping the stacks of all processes while the IO is stalled and checking whether any of the write threads are stuck sleeping in balance_dirty_pages. What do these files show on your system: /proc/sys/vm/dirty_background_bytes, /proc/sys/vm/dirty_background_ratio, /proc/sys/vm/dirty_bytes, /proc/sys/vm/dirty_ratio? I think the dirty limit on the system is calculated as dirty_ratio * available_memory when dirty_bytes is 0. So in your case the limit is about 12.8 GB (assuming a dirty_ratio of 20, which is the default on my system, and dirty_bytes of 0). Seeing as the value of NFS_Unstable in your logs hovers around 11556864 kB, it's plausible that dirty + unstable is hitting the limit.
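
            As a rough illustration of that calculation (a sketch only; the kernel actually uses its "dirtyable" memory estimate rather than MemTotal, so treat the numbers as approximate):

                # approximate global dirty limit when dirty_bytes == 0
                ratio=$(cat /proc/sys/vm/dirty_ratio)                     # e.g. 20
                total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)   # e.g. ~64 GB client
                echo "approx dirty limit: $(( total_kb * ratio / 100 / 1024 )) MB"

                # what is currently counted against it
                awk '/^(Dirty|NFS_Unstable):/ {sum += $2} END {print sum/1024 " MB dirty+unstable"}' /proc/meminfo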

            If my above hypothesis is correct, the behavior you were seeing was expected and working as designed. The same problem would occur if you could push NFS at the same rates. If you had the full LU-2139 patch stack applied to the client and servers (http://review.whamcloud.com/4245, http://review.whamcloud.com/4374, http://review.whamcloud.com/4375, http://review.whamcloud.com/5935), I'd expect this effect to go away.

            If you can, try setting dirty_ratio and max_dirty_mb to a large fraction of memory and rerun the test.
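
            A hedged example of that tuning (illustrative values for a 64 GB client, not recommendations; max_dirty_mb here is the global /proc/sys/lustre knob referenced elsewhere in this ticket):

                # raise the VM dirty thresholds (illustrative values)
                sysctl -w vm.dirty_ratio=60
                sysctl -w vm.dirty_background_ratio=30

                # raise Lustre's global dirty page cap, e.g. to ~48 GB on a 64 GB client
                echo $((48 * 1024)) > /proc/sys/lustre/max_dirty_mb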


            ihara Shuichi Ihara (Inactive) added a comment -

            Prakash, attached are the llite.*.unstable_stats and grep NFS_Unstable /proc/meminfo output collected during IOR. I also collected a collectl log and found that the client was not writing the whole time; it's doing like this: writing; idle; writing; idle; ...

            Our client has 64 GB of memory, and I tested a larger max_dirty_mb (more than 3/4 of the memory size, up from the default of 1/2), but it didn't help either.


            ihara Shuichi Ihara (Inactive) added a comment -

            Peter, yes, the performance is back with the latest commit (2.3.65). However, I'm hitting another issue with IOR; I will investigate, and if it's a different problem I will open a new ticket.


            prakash Prakash Surya (Inactive) added a comment -

            How much memory is there on the client? And what is the commit frequency on the servers? I would expect the performance to be worse with the patch Niu points to if the client has sufficient bandwidth (combined with async server commits) to fill its available dirty page space with unstable pages. So this performance regression might be "working as intended", but that all depends on how many unstable pages the client is consuming during the test.

            Can you sample lctl get_param 'llite.*.unstable_stats' and grep NFS_Unstable /proc/meminfo a few times while the test is running, to give me an idea of how many unstable pages are being consumed? If this value is anywhere near the limit set in /proc/sys/lustre/max_dirty_mb then maybe we need to rethink the default value of max_dirty_mb and set it to something larger.
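
            A small sketch of that sampling loop, run on the client while IOR is active (assumes lctl is in PATH):

                # take 10 samples, 5 seconds apart, during the run
                for i in $(seq 1 10); do
                    date
                    lctl get_param 'llite.*.unstable_stats'
                    grep NFS_Unstable /proc/meminfo
                    cat /proc/sys/lustre/max_dirty_mb
                    sleep 5
                done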

            pjones Peter Jones added a comment -

            The patch has now been reverted. Ihara, can you confirm whether the problem has now disappeared?


            jlevi Jodi Levi (Inactive) added a comment -

            Oleg will revert the patch in LU-2139 that caused this regression.


            niu Niu Yawei (Inactive) added a comment -

            The commit 5661651b2cc6414686e7da581589c2ea0e1f1969 is from LU-2139, which added unstable pages accounting to Lustre, and the following code change could cause many more sync writes.

            @@ -1463,7 +1465,8 @@ static int osc_enter_cache_try(struct client_obd *cli,
                            return 0;
            
                    if (cli->cl_dirty + CFS_PAGE_SIZE <= cli->cl_dirty_max &&
            -           cfs_atomic_read(&obd_dirty_pages) + 1 <= obd_max_dirty_pages) {
            +           cfs_atomic_read(&obd_unstable_pages) + 1 +
            +           cfs_atomic_read(&obd_dirty_pages) <= obd_max_dirty_pages) {
                            osc_consume_write_grant(cli, &oap->oap_brw_page);
                            if (transient) {
                                    cli->cl_dirty_transit += CFS_PAGE_SIZE;
            @@ -1576,9 +1579,9 @@ void osc_wake_cache_waiters(struct client_obd *cli)
            
                            ocw->ocw_rc = -EDQUOT;
                            /* we can't dirty more */
            -               if ((cli->cl_dirty + CFS_PAGE_SIZE > cli->cl_dirty_max) ||
            -                   (cfs_atomic_read(&obd_dirty_pages) + 1 >
            -                    obd_max_dirty_pages)) {
            +               if (cli->cl_dirty + CFS_PAGE_SIZE > cli->cl_dirty_max ||
            +                   cfs_atomic_read(&obd_unstable_pages) + 1 +
            +                   cfs_atomic_read(&obd_dirty_pages) > obd_max_dirty_pages) {
                                    CDEBUG(D_CACHE, "no dirty room: dirty: %ld "
                                           "osc max %ld, sys max %d\n", cli->cl_dirty,
                                           cli->cl_dirty_max, obd_max_dirty_pages);
            

            I think the title of this ticket should be changed to "LU-2139 may cause the performance regression".


            People

              Assignee: prakash Prakash Surya (Inactive)
              Reporter: ihara Shuichi Ihara (Inactive)
              Votes: 0
              Watchers: 9
