Lustre / LU-9574

Large file read performance degradation from multiple OSTs

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.11.0, Lustre 2.10.2
    • Affects Version/s: Lustre 2.9.0
    • Labels: None
    • Environment: RHEL 7 servers, RHEL 6 and 7 clients
    • Severity: 3

    Description

      We recently noticed that large file read performance on our 2.9 Lustre file system is dramatically worse than it used to be. The attached plot is the result of a test script that uses dd to write a large file (50 GB) to disk, read that file back, and then copy it to a second file, testing write, read, and read/write speeds for large files at various stripe sizes and counts. The two sets of data on this plot were taken on the same server and client hardware. The file system was originally built and formatted with 2.8.0, but we eventually upgraded the servers and clients to 2.9.0. The behavior we are used to seeing is increasing performance as the stripe count increases, with a peak around 4 to 6 OSTs and a fall-off after that as more OSTs are used. This is what we saw under 2.8 (red lines in the plots).

      With 2.9 we still get very good write performance (almost line rate on our 10 GbE clients), but for reads we see extremely good performance with a single OST and significantly degraded performance with multiple OSTs (black lines in the plots). Using a git bisect to compile and test different clients, we were able to isolate it to this commit (a minimal sketch of this kind of dd test is included after the commit reference below):

      commit d8467ab8a2ca15fbbd5be3429c9cf9ceb0fa78b8
      LU-7990 clio: revise readahead to support 16MB IO
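      As an illustration only, here is a minimal sketch of this kind of dd-based stripe sweep, assuming a Lustre client mount under /nobackup/.test and root privileges for the cache drop; the actual test script (test_ss_sn.sh) may differ in file names, sizes, block sizes, and stripe settings.

      #!/bin/bash
      # Sketch of a per-stripe-count dd test: write, drop cache, read, then copy.
      # Mount point, sizes, and stripe counts here are assumptions, not the real test_ss_sn.sh.
      MNT=/nobackup/.test
      SIZE_MB=$((50 * 1024))                     # ~50 GB test file

      for count in 1 2 4 6 8 16; do
          f=$MNT/stripe_${count}.dat
          rm -f "$f" "$f.copy"
          lfs setstripe -c "$count" "$f"         # pre-create the file with this stripe count
          dd if=/dev/zero of="$f" bs=1M count=$SIZE_MB           # write test
          echo 3 > /proc/sys/vm/drop_caches      # drop the client page cache before reading (requires root)
          dd if="$f" of=/dev/null bs=25M                         # read test
          dd if="$f" of="$f.copy" bs=25M                         # read/write (copy) test
          rm -f "$f" "$f.copy"
      done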

      There is slightly more info here:

      http://lists.lustre.org/pipermail/lustre-discuss-lustre.org/2017-May/014509.html

      Please let me know if you need any other data or info.   

      Attachments

        Activity

          [LU-9574] Large file read performance degradation from multiple OSTs

          gerrit Gerrit Updater added a comment -

          John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/29016/
          Subject: LU-9574 llite: pipeline readahead better with large I/O
          Project: fs/lustre-release
          Branch: b2_10
          Current Patch Set:
          Commit: 73a7af47a01979d5dd5c7e8dcaf3f62d47f11b0b

          pjones Peter Jones added a comment -

          Landed for 2.11


          gerrit Gerrit Updater added a comment -

          Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/27388/
          Subject: LU-9574 llite: pipeline readahead better with large I/O
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: 011742134e3152f3e389ec30c08ccfc28d7a91a7


          gerrit Gerrit Updater added a comment -

          Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/29016
          Subject: LU-9574 llite: pipeline readahead better with large I/O
          Project: fs/lustre-release
          Branch: b2_10
          Current Patch Set: 1
          Commit: 83946d6f389a5c047cc958973c4b2cdf291a6429

          dvicker Darby Vicker added a comment -

          I just applied that patch on top of 2.10.0 (a rough sketch of the fetch/cherry-pick steps is included after this comment) and tested. Our full suite of dd tests (test_ss_sn.sh) looked good: reads and writes were fast across the board. The stock 2.10.0 client was slow for reads (as mentioned in the Aug 4 post), so I think this patch fixes our problem. I've attached the debug log from this:

           

          [root@dvicker .test]# lctl set_param debug="reada vfstrace"
          debug=reada vfstrace
          [root@dvicker .test]# dd if=/nobackup/.test/dvicker-1m-04nodes.keep of=/dev/null bs=25M
          2000+0 records in
          2000+0 records out
          52428800000 bytes (52 GB) copied, 54.5779 s, 961 MB/s
          [root@dvicker .test]# lctl dk > debug.4node.2.10.0_patch27388_set2.log
          
          
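          For anyone wanting to repeat this, the usual way to pull a Gerrit change such as 27388 onto a local lustre-release tree is sketched below; the base tag name, patch set number, and build commands are assumptions, so check the download box on the review page for the exact refspec.

          # Sketch of applying Gerrit change 27388 (patch set 2 assumed) on top of 2.10.0.
          git clone git://git.whamcloud.com/fs/lustre-release.git && cd lustre-release
          git checkout -b test-lu9574 v2_10_0        # 2.10.0 tag name assumed; use your actual base
          git fetch https://review.whamcloud.com/fs/lustre-release refs/changes/88/27388/2
          git cherry-pick FETCH_HEAD
          # then rebuild and reinstall the client as usual, e.g.:
          sh autogen.sh && ./configure && make rpms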

          jay Jinshan Xiong (Inactive) added a comment -

          I saw something odd in the log messages:

          00020000:00200000:43.0:1496701931.619586:0:3314:0:(lov_io.c:448:lov_io_rw_iter_init()) stripe: 1908 chunk: [2000683008, 2001731584) 2018508800
          00020000:00200000:43.0:1496701931.619592:0:3314:0:(lov_io.c:413:lov_io_iter_init()) shrink: 0 [500170752, 501219328)
          00000080:00200000:43.0:1496701931.619595:0:3314:0:(vvp_io.c:230:vvp_io_one_lock_index()) lock: 0 [488448, 488703]
          00000080:00200000:43.0:1496701931.619610:0:3314:0:(vvp_io.c:725:vvp_io_read_start()) read: -> [2000683008, 2001731584)
          00020000:00200000:43.0:1496701931.620038:0:3314:0:(lov_io.c:448:lov_io_rw_iter_init()) stripe: 1909 chunk: [2001731584, 2002780160) 2018508800
          00020000:00200000:43.0:1496701931.620042:0:3314:0:(lov_io.c:413:lov_io_iter_init()) shrink: 1 [500170752, 501219328)
          00000080:00200000:43.0:1496701931.620044:0:3314:0:(vvp_io.c:230:vvp_io_one_lock_index()) lock: 0 [488704, 488959]
          00000080:00200000:43.0:1496701931.620052:0:3314:0:(vvp_io.c:725:vvp_io_read_start()) read: -> [2001731584, 2002780160)
          00020000:00200000:43.0:1496701931.620456:0:3314:0:(lov_io.c:448:lov_io_rw_iter_init()) stripe: 1910 chunk: [2002780160, 2003828736) 2018508800
          00020000:00200000:43.0:1496701931.620459:0:3314:0:(lov_io.c:413:lov_io_iter_init()) shrink: 2 [500170752, 501219328)
          00000080:00200000:43.0:1496701931.620461:0:3314:0:(vvp_io.c:230:vvp_io_one_lock_index()) lock: 0 [488960, 489215]
          00000080:00200000:43.0:1496701931.620470:0:3314:0:(vvp_io.c:725:vvp_io_read_start()) read: -> [2002780160, 2003828736)
          00000080:00400000:43.0:1496701931.620478:0:3314:0:(rw.c:733:ras_update()) [0x2000120e5:0x44:0x0] pages at 488960 miss.
          00000080:00400000:43.0:1496701931.620480:0:3314:0:(rw.c:794:ras_update()) lrp 488447 cr 0 cp 0 ws 488192 wl 7424 nra 488447 rpc 256 r 22 ri 1024 csr 128 sf 359936 sp 512 sl 1024
          

          When page 488960 was being read, the reads of the two preceding extents, [488448, 488703] and [488704, 488959], were apparently skipped, probably because cached pages were found in memory; this confused the readahead algorithm into detecting the access pattern as a stride read.

          This pattern occurred throughout the log, and I think the skipped read extents always belong to stripes 0 and 1. Was the application doing a reread, with the cached pages for stripes 2 and 3 somehow having been cleaned up?

          In any case, I still found a problem in stride read, so please try patch set 2 of https://review.whamcloud.com/27388 with the debug level set to "vfstrace reada", and post the resulting debug log here.
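          For reference, collecting the requested log would look roughly like the following; the debug buffer size, test file, and output path are placeholders rather than required values.

          lctl set_param debug_mb=512                 # enlarge the debug buffer so the trace is not lost (size is an assumption)
          lctl set_param debug="vfstrace reada"       # enable only the VFS trace and readahead flags
          lctl clear                                  # drop any old debug entries
          dd if=/nobackup/.test/dvicker-1m-04nodes.keep of=/dev/null bs=25M
          lctl dk > /tmp/debug.reada.vfstrace.log     # dump the kernel debug buffer for upload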


          jay Jinshan Xiong (Inactive) added a comment -

          I'm looking at the debug log - sorry for the delayed response.
          dvicker Darby Vicker added a comment -

          Just checking in on this.  We've upgraded to 2.10 and we still had to revert d8467ab to keep from running into this issue.  

          dvicker Darby Vicker added a comment -

          I see a new patch was uploaded to LU-9214.  Based on the debug logs you've seen, do you think this still might be related?  I can give the new patch a try if you think it might help.  


          People

            Assignee: jay Jinshan Xiong (Inactive)
            Reporter: dvicker Darby Vicker
            Votes: 0
            Watchers: 11
