[LU-11618] implement ladvise rpc_size for optimized performance Created: 05/Nov/18  Updated: 07/Nov/18

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: New Feature Priority: Minor
Reporter: Li Xi Assignee: Li Xi
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Related
is related to LU-6179 Lock ahead - Request extent locks fro... Resolved
Rank (Obsolete): 9223372036854775807

 Description   

Currently, the maximum RPC size for bulk I/O is controlled by the
per-OSC parameter max_pages_per_rpc. Whenever possible, the OSC
issues bulk I/O as large as max_pages_per_rpc for better
performance, so changing max_pages_per_rpc usually has a large
effect on I/O performance. However, because I/O patterns differ,
not all applications get their best performance with the same value
of max_pages_per_rpc.

We want to add a new type of ladvise that enables applications to
set different RPC sizes for different files. max_pages_per_rpc
remains the upper limit of the RPC size in all cases. A new
parameter, default_pages_per_rpc, has been added; its value is the
default RPC size. If an rpc_size ladvise is given for a file, the
RPC size of that file changes according to the ladvise, but the
maximum RPC size is still limited by max_pages_per_rpc.
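
The selection rule described above can be sketched as follows. This is only an illustration, not the code from the patch: the function name effective_pages_per_rpc and its arguments are hypothetical, and it assumes that an oversized hint is clamped to max_pages_per_rpc rather than rejected.

```python
from typing import Optional


def effective_pages_per_rpc(
    max_pages: int,
    default_pages: int,
    hint_pages: Optional[int] = None,
) -> int:
    """Return the RPC size (in pages) a client would use for one file.

    Hypothetical helper: a per-file ladvise hint overrides
    default_pages_per_rpc, and max_pages_per_rpc caps the result
    in every case.
    """
    if max_pages < 1:
        raise ValueError("max_pages must be at least 1")
    # Use the hint when one was given, otherwise the per-OSC default.
    wanted = default_pages if hint_pages is None else hint_pages
    # Assumption: out-of-range values are clamped, not rejected.
    return max(1, min(wanted, max_pages))
```

Under these assumptions, a file with no hint uses the default, a hint inside the limit is used as given, and a hint above the limit falls back to max_pages_per_rpc.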

The RPC size of a file configured by ladvise is neither a global
attribute nor a persistent one. Each client may have a different
RPC size for the same file, and the RPC size of the file reverts to
default_pages_per_rpc when the hint kept in memory is lost due to
memory shrinkage.
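
A rough model of this per-client, non-persistent behaviour is sketched below. The class ClientRpcHints and its methods are hypothetical names used only for illustration; the real hint lives in client kernel memory, and shrink() is just a stand-in for memory reclaim dropping it.

```python
from typing import Dict


class ClientRpcHints:
    """Per-client, in-memory RPC size hints keyed by a file identifier.

    Hypothetical model: nothing is persisted or shared, so each client
    instance holds its own view of a file's RPC size.
    """

    def __init__(self, max_pages: int, default_pages: int) -> None:
        self.max_pages = max_pages
        self.default_pages = default_pages
        self._hints: Dict[str, int] = {}

    def advise(self, file_id: str, pages: int) -> None:
        # Store the hint, capped by the per-OSC upper limit.
        self._hints[file_id] = min(pages, self.max_pages)

    def shrink(self) -> None:
        # Stand-in for memory reclaim: every cached hint is dropped.
        self._hints.clear()

    def pages_per_rpc(self, file_id: str) -> int:
        # Files without a cached hint use the default, capped by the maximum.
        fallback = min(self.default_pages, self.max_pages)
        return self._hints.get(file_id, fallback)
```

In this model, advising a file on one ClientRpcHints instance leaves a second instance at the default, and calling shrink() returns the first instance to the default as well.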



 Comments   
Comment by Gerrit Updater [ 05/Nov/18 ]

Li Xi (lixi@ddn.com) uploaded a new patch: https://review.whamcloud.com/33573
Subject: LU-11618 osc: implement ladvise rpc_size
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: ce387d26f369dec24496eaed4f9213b76bb3f47f

Comment by Andreas Dilger [ 05/Nov/18 ]

In the past, the RPC size was also affected by the stripe size, so that applications could specify this on a per-file basis. Also, storing the stripe_size in the file is persistent, and does not require the application to be modified.

Comment by Li Xi [ 07/Nov/18 ]

Thanks Andreas. Has this stripe-size-based RPC size mechanism been changed? I can't find any related code now.

We are thinking of passing the ladvise hint through MPI ROMIO, as in LU-6179. That would require some extra arguments to the MPI run command, but would not require modifying the application. We are also tired of tuning the global RPC size parameter to get good performance.

Comment by Li Xi [ 07/Nov/18 ]

The following shows how to use it:

MGS
$ lctl conf_param lipe1-OST*.obdfilter.brw_size=16
$ lctl conf_param lipe1-OST*.osc.max_pages_per_rpc=16M

Client
$ mount -t lustre 10.0.1.148@tcp:10.0.1.149@tcp:/lipe1 /mnt/lustre_lipe1
$ cat /proc/fs/lustre/osc/lipe1-OST0000-osc-*/default_pages_per_rpc
4096
$ cat /proc/fs/lustre/osc/lipe1-OST0000-osc-*/max_pages_per_rpc
4096
$ lfs setstripe -c 1 -i 0 /mnt/lustre_lipe1/file
$ dd if=/dev/zero of=/mnt/lustre_lipe1/file bs=1048576 count=10
$ cat /proc/fs/lustre/osc/lipe1-OST0000-osc-*/rpc_stats
...
                     read                 write
pages per rpc   rpcs    %  cum % |   rpcs    %  cum %
1:                 0    0      0 |      0    0      0
2:                 0    0      0 |      0    0      0
4:                 0    0      0 |      0    0      0
8:                 0    0      0 |      0    0      0
16:                0    0      0 |      0    0      0
32:                0    0      0 |      0    0      0
64:                0    0      0 |      0    0      0
128:               0    0      0 |      0    0      0
256:               0    0      0 |      0    0      0
512:               0    0      0 |      0    0      0
1024:              0    0      0 |      0    0      0
2048:              0    0      0 |      0    0      0
4096:              0    0      0 |      1  100    100
...
$ echo > /proc/fs/lustre/osc/lipe1-OST0000-osc-*/rpc_stats
$ echo 256 > /proc/fs/lustre/osc/lipe1-OST0000-osc-*/default_pages_per_rpc
$ dd if=/dev/zero of=/mnt/lustre_lipe1/file bs=1048576 count=10
$ cat /proc/fs/lustre/osc/lipe1-OST0000-osc-*/rpc_stats
...
                     read                 write
pages per rpc   rpcs    %  cum % |   rpcs    %  cum %
1:                 0    0      0 |      0    0      0
2:                 0    0      0 |      0    0      0
4:                 0    0      0 |      0    0      0
8:                 0    0      0 |      0    0      0
16:                0    0      0 |      0    0      0
32:                0    0      0 |      0    0      0
64:                0    0      0 |      0    0      0
128:               0    0      0 |      0    0      0
256:               0    0      0 |     10  100    100
...
$ echo > /proc/fs/lustre/osc/lipe1-OST0000-osc-*/rpc_stats
$ lfs ladvise -a rpcsize -r 16M /mnt/lustre_lipe1/file
$ dd if=/dev/zero of=/mnt/lustre_lipe1/file bs=1048576 count=16
$ cat /proc/fs/lustre/osc/lipe1-OST0000-osc-*/rpc_stats
...
                     read                 write
pages per rpc   rpcs    %  cum % |   rpcs    %  cum %
1:                 0    0      0 |      0    0      0
2:                 0    0      0 |      0    0      0
4:                 0    0      0 |      0    0      0
8:                 0    0      0 |      0    0      0
16:                0    0      0 |      0    0      0
32:                0    0      0 |      0    0      0
64:                0    0      0 |      0    0      0
128:               0    0      0 |      0    0      0
256:               0    0      0 |      0    0      0
512:               0    0      0 |      0    0      0
1024:              0    0      0 |      0    0      0
2048:              0    0      0 |      0    0      0
4096:              0    0      0 |      1  100    100
...
$ echo > /proc/fs/lustre/osc/lipe1-OST0000-osc-*/rpc_stats
$ lfs ladvise -a rpcsize -r 4M /mnt/lustre_lipe1/file
$ dd if=/dev/zero of=/mnt/lustre_lipe1/file bs=1048576 count=16
$ cat /proc/fs/lustre/osc/lipe1-OST0000-osc-*/rpc_stats
...
                     read                 write
pages per rpc   rpcs    %  cum % |   rpcs    %  cum %
1:                 0    0      0 |      0    0      0
2:                 0    0      0 |      0    0      0
4:                 0    0      0 |      0    0      0
8:                 0    0      0 |      0    0      0
16:                0    0      0 |      0    0      0
32:                0    0      0 |      0    0      0
64:                0    0      0 |      0    0      0
128:               0    0      0 |      0    0      0
256:               0    0      0 |      0    0      0
512:               0    0      0 |      0    0      0
1024:              0    0      0 |      4  100    100
...
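
The RPC counts in the rpc_stats output above follow from simple arithmetic, sketched here. The function expected_rpcs is a hypothetical helper, not part of Lustre. It assumes a 4 KiB page size (consistent with max_pages_per_rpc=16M reading back as 4096), that the data is flushed in RPCs as large as allowed, and that rpc_stats files each RPC under the next power of two at or above its page count (consistent with the first 10 MiB write showing up in the 4096 row).

```python
PAGE_SIZE = 4096  # bytes; assumed, matches 16M reading back as 4096 pages


def expected_rpcs(write_bytes: int, pages_per_rpc: int):
    """Return (rpc_count, histogram_bucket) for one sequential write.

    Hypothetical helper. The bucket is the one for the full-size RPCs,
    assuming the histogram rounds page counts up to a power of two.
    """
    if write_bytes <= 0 or pages_per_rpc <= 0:
        raise ValueError("write_bytes and pages_per_rpc must be positive")
    # Integer ceiling division avoids floating-point rounding.
    total_pages = (write_bytes + PAGE_SIZE - 1) // PAGE_SIZE
    rpc_count = (total_pages + pages_per_rpc - 1) // pages_per_rpc
    # A write smaller than the RPC size still goes out as a single RPC.
    largest_rpc = min(total_pages, pages_per_rpc)
    # Smallest power of two that is >= largest_rpc.
    bucket = 1 << (largest_rpc - 1).bit_length()
    return rpc_count, bucket
```

With these assumptions, the four dd runs map to (1, 4096), (10, 256), (1, 4096) and (4, 1024), which matches the write columns shown in the session.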
