[LU-2032] small random read i/o performance regression    Created: 26/Sep/12    Updated: 21/Sep/18

| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.1.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Robin Humble | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None |
| Environment: | 2.1.3 server with 2.1.3 and 1.8.8 clients. centos6, x86_64, QDR IB |
| Attachments: | |
| Issue Links: | |
| Severity: | 3 |
| Rank (Obsolete): | 4173 |
| Description |
I've been doing some 2.1.3 pre-rollout testing, and there seems to be a client problem with small random reads. Performance is considerably worse on 2.1.3 clients than on 1.8.8 clients: about a 35x slowdown for 4k random read i/o. The tests use the same files on a rhel6 x86_64 2.1.3 server (the stock 2.1.3 rpm is used), a QDR IB fabric, and either a single disk or an md 8+2 lun for an OST, with all client & server VFS caches dropped between trials. Checksums on or off, and client rpc's in flight of 8 or 32, make little difference. I've also tried umount'ing the fs from the 1.8.8 client between tests to make sure there's no hidden caching, but that didn't change anything.

random read ->

Although these numbers are for a single process, the same trend applies when the IOR run is scaled up to 8 processes/node and to multiple nodes.
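For anyone reproducing a run of this kind, a 4k random-read pass is typically driven with IOR plus a cache drop between trials. The commands below are a sketch only; the file path, block size, and process count are placeholder values, not the ones used in these tests:

  # drop the Linux page cache on clients and servers between trials
  sync; echo 3 > /proc/sys/vm/drop_caches

  # single-process 4k random read against a pre-written file (path/size are examples)
  mpirun -np 1 ior -r -z -t 4k -b 1g -o /mnt/lustre/ior_testfile

  # client tunables mentioned above (example values)
  lctl set_param osc.*.checksums=0
  lctl set_param osc.*.max_rpcs_in_flight=32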
| Comments |
| Comment by Robin Humble [ 26/Sep/12 ] |

It seems current master has the same problem.
| Comment by Robin Humble [ 27/Sep/12 ] |

The rpc's, especially pages and bytes per rpc, look pretty different between 1.8.8 and 2.1.3.

osc/*/rpc_stats ->

1.8.8
read write
pages per rpc rpcs % cum % | rpcs % cum %
1: 68 12 12 | 0 0 0
2: 34 6 19 | 0 0 0
4: 57 10 30 | 0 0 0
8: 52 9 39 | 0 0 0
16: 63 11 51 | 0 0 0
32: 50 9 61 | 0 0 0
64: 66 12 73 | 0 0 0
128: 67 12 86 | 0 0 0
256: 72 13 100 | 0 0 0
read write
rpcs in flight rpcs % cum % | rpcs % cum %
0: 483 91 91 | 0 0 0
1: 39 7 98 | 0 0 0
2: 5 0 99 | 0 0 0
3: 2 0 100 | 0 0 0
read write
offset rpcs % cum % | rpcs % cum %
0: 101 19 19 | 0 0 0
1: 26 4 24 | 0 0 0
2: 49 9 33 | 0 0 0
4: 48 9 42 | 0 0 0
8: 61 11 53 | 0 0 0
16: 64 12 65 | 0 0 0
32: 56 10 76 | 0 0 0
64: 55 10 86 | 0 0 0
128: 69 13 100 | 0 0 0
2.1.3
read write
pages per rpc rpcs % cum % | rpcs % cum %
1: 25568 99 99 | 0 0 0
2: 1 0 99 | 0 0 0
4: 0 0 99 | 0 0 0
8: 5 0 100 | 0 0 0
read write
rpcs in flight rpcs % cum % | rpcs % cum %
0: 25574 100 100 | 0 0 0
read write
offset rpcs % cum % | rpcs % cum %
0: 100 0 0 | 0 0 0
1: 100 0 0 | 0 0 0
2: 200 0 1 | 0 0 0
4: 400 1 3 | 0 0 0
8: 800 3 6 | 0 0 0
16: 1600 6 12 | 0 0 0
32: 3200 12 25 | 0 0 0
64: 6383 24 49 | 0 0 0
128: 12791 50 100 | 0 0 0
osc/*/import ->

1.8.8 ...
read_data_averages:
bytes_per_rpc: 200200
usec_per_rpc: 7673
MB_per_sec: 26.09
2.1.3
read_data_averages:
bytes_per_rpc: 4100
usec_per_rpc: 6108
MB_per_sec: 0.67
Also, the read ahead stats hit/miss ratios are pretty much reversed between 1.8.8 and 2.1.3, but that might just be accounting?

1.8.8
snapshot_time             1348738815.173086 secs.usecs
hits                      25120 samples [pages]
misses                    480 samples [pages]
readpage not consecutive  25583 samples [pages]
failed grab_cache_page    25583 samples [pages]
read but discarded        256 samples [pages]

2.1.3
snapshot_time             1348738815.198485 secs.usecs
hits                      26 samples [pages]
misses                    25574 samples [pages]
readpage not consecutive  25589 samples [pages]
miss inside window        3 samples [pages]
failed grab_cache_page    25599 samples [pages]
zero size window          1 samples [pages]
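As a reference for reproducing this comparison, the counters shown here come from the client proc tree and are normally cleared and re-read around a test run with lctl. The parameter names below are the standard ones; the clear-to-zero step is an assumption about how the run was bracketed:

  # reset the per-OSC RPC stats and the llite readahead stats before a trial
  lctl set_param osc.*.rpc_stats=0
  lctl set_param llite.*.read_ahead_stats=0

  # ...run the random read test, then collect the counters
  lctl get_param osc.*.rpc_stats
  lctl get_param osc.*.import
  lctl get_param llite.*.read_ahead_stats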
| Comment by Jinshan Xiong (Inactive) [ 27/Sep/12 ] |

Can you please run your test case again with D_READA enabled on the client side? This seems to be a readahead problem for sure.
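Enabling that debug mask on a client and capturing the log is typically done along these lines; the dump file path is only an example:

  # turn on the readahead (and, if wanted, VFS) debug masks on the client
  lctl set_param debug=+reada
  lctl set_param debug=+vfstrace

  # run the test, then dump the kernel debug buffer to a file
  lctl dk /tmp/lustre-debug.log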
| Comment by Robin Humble [ 28/Sep/12 ] |

Sure. This is +reada +vfstrace for a 1.8.8 client and a 2.1.3 client.