[LU-12518] improve Lustre unaligned IO read performances Created: 08/Jul/19  Updated: 13/Sep/23  Resolved: 14/Feb/20

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.14.0

Type: Improvement Priority: Minor
Reporter: Wang Shilong (Inactive) Assignee: Wang Shilong (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-12043 improve Lustre single thread read per... Resolved
Rank (Obsolete): 9223372036854775807

 Description   

Currently, Lustre works well for aligned IO, but performance is poor for unaligned strided reads, and we need to put some effort into improving this.

One of the main problems with the current stride read detection is that it is based on
page index, so when reads are not page aligned, stride read detection does not work well.

To support page-unaligned stride reads, we could change stride tracking from page index to byte offset, so that
stride read pattern detection works well and we avoid many small-page RPCs and readahead window resets.
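
A minimal sketch of why page-index tracking breaks down, assuming a 4096-byte page size and a 47008-byte strided read of one file shared by 240 processes (the ior_hard pattern reported in the comments). The offset formula and the helper names are assumptions made for illustration; this is not the llite readahead code, it only compares the gaps a detector would see in each unit:

```python
PAGE_SIZE = 4096  # assumed client page size


def read_offsets(rank, nranks, xfer, nsegments):
    """Byte offsets read by one rank, assuming an interleaved shared-file layout."""
    return [(seg * nranks + rank) * xfer for seg in range(nsegments)]


def gaps(positions):
    """Distance between consecutive read positions."""
    return [b - a for a, b in zip(positions, positions[1:])]


offsets = read_offsets(rank=0, nranks=240, xfer=47008, nsegments=9)

byte_gaps = gaps(offsets)                                   # stride measured in bytes
page_gaps = gaps([off // PAGE_SIZE for off in offsets])     # stride measured in pages

print("byte gaps:", sorted(set(byte_gaps)))
print("page gaps:", sorted(set(page_gaps)))
```

The byte gap is constant (240 * 47008 bytes), but it is not a multiple of the page size, so the page-index gap alternates between two values and a detector that expects a fixed page stride keeps seeing a mismatch.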

At the same time, we should preserve as much performance as possible for existing cases and make
sure there are no obvious regressions for aligned-stride and sequential reads.



 Comments   
Comment by Wang Shilong (Inactive) [ 08/Jul/19 ]

Proposed patch pushed here (will be refreshed with the correct LU title):
https://review.whamcloud.com/#/c/35437/

Comment by Wang Shilong (Inactive) [ 08/Jul/19 ]

Here are quick test results of the patch on the ior_hard_read workload (SSF, 47 KB, strided read), from Ihara:

10 clients, 240 processes (24 per client)

# ior -r -R -s 132000 -i 1 -C -Q 1 -g -G 27 -k -e -t 47008 -b 47008 -o /cache1/io500.out/ior_hard/IOR_file -O stoneWallingStatusFile=/cache1/io500.out/ior_hard/stonewall
master
Max Read:  7373.80 MiB/sec (7731.98 MB/sec)
patch
Max Read:  19784.09 MiB/sec (20745.12 MB/sec)

The patch significantly improves performance, but there is still some slightly strange behavior that we need to investigate.
From the stats I observed during the IOR run:
1. I see 28GB/sec at the beginning of IOR, but once memory reclaim starts (later, when the amount read exceeds llite.*.cached_mb), performance drops and fluctuates.
2. The tail of the IOR run is a bit long: some MPI ranks finished earlier, while other ranks did not finish at the same time.
These are the two reasons why IOR reports 20GB/sec even though we are getting 28GB/sec in the middle of the run.
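
The effect of a long tail on the reported number can be sketched as follows, assuming the reported bandwidth is roughly the total bytes read divided by the time at which the slowest rank finishes. The function name and all numbers below are hypothetical and are not measurements from this run:

```python
def aggregate_bandwidth(bytes_per_rank, finish_times):
    """Total bytes moved divided by the finish time of the slowest rank."""
    return bytes_per_rank * len(finish_times) / max(finish_times)


# Hypothetical values: four ranks, each reading 100 units of data.
even = aggregate_bandwidth(100.0, [10.0, 10.0, 10.0, 10.0])
tailed = aggregate_bandwidth(100.0, [10.0, 10.0, 10.0, 14.0])

print(f"all ranks finish together: {even:.2f}")
print(f"one rank finishes late:    {tailed:.2f}")
```

Under this assumption, a single late rank lowers the aggregate figure even if every rank ran at full speed for most of the run.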

Comment by Shuichi Ihara [ 09/Jul/19 ]

Here are more test results.

Configuration                                      FPP Read (MB/sec)  SSF Read (MB/sec)
master                                             44,636             7,731
master+patch35437 (overstriping, -C 240 -S 16M)    44,318             20,745
FPP
# ior -r -R -s 132000 -F -i 1 -C -Q 1 -g -G 27 -k -e -t 47008 -b 47008 -o /cache1/io500.out2/ior_hard/IOR_file -O stoneWallingStatusFile=/cache1/io500.out2/ior_hard/stonewall
SSF
# ior -r -R -s 132000 -i 1 -C -Q 1 -g -G 27 -k -e -t 47008 -b 47008 -o /cache1/io500.out/ior_hard/IOR_file -O stoneWallingStatusFile=/cache1/io500.out/ior_hard/stonewall 
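
For reference, the relative change implied by the table can be computed directly. This is only arithmetic on the reported MB/sec figures; the dictionary layout and variable names are illustrative:

```python
# Read bandwidth in MB/sec, copied from the results table in this comment.
results = {
    "master": {"fpp": 44636, "ssf": 7731},
    "patched": {"fpp": 44318, "ssf": 20745},
}


def ratio(kind):
    """Patched bandwidth relative to master for one access mode."""
    return results["patched"][kind] / results["master"][kind]


ssf_speedup = ratio("ssf")
fpp_change_pct = (ratio("fpp") - 1.0) * 100.0

print(f"SSF speedup: {ssf_speedup:.2f}x")
print(f"FPP change:  {fpp_change_pct:+.2f}%")
```

That works out to roughly a 2.7x improvement for SSF reads, with FPP reads changing by less than 1%.
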
Comment by Gerrit Updater [ 10/Jul/19 ]

Patrick Farrell (pfarrell@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/35457
Subject: LU-12518 llite: Accept EBUSY for unaligned stride
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: c4ab79e65acadc16ef2e2c9f5eb73ed94f16c976

Comment by Gerrit Updater [ 19/Aug/19 ]

Wang Shilong (wshilong@ddn.com) uploaded a new patch: https://review.whamcloud.com/35829
Subject: LU-12518 readahead: convert stride page index to byte
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: ac5cded17af7882d30f3dcda9394f618f5a5d6ad

Comment by Gerrit Updater [ 23/Aug/19 ]

Wang Shilong (wshilong@ddn.com) uploaded a new patch: https://review.whamcloud.com/35893
Subject: LU-12518 llite: fix stride window increase
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 3cabf63b9aba215e702ddbca30fccc0d3a87c800

Comment by Gerrit Updater [ 16/Sep/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35829/
Subject: LU-12518 readahead: convert stride page index to byte
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 0923e405511605b66e73a99dda12ed961ca9e30b

Comment by Andreas Dilger [ 25/Oct/19 ]

Shilong, are all of the patches in this ticket still needed?

Comment by Wang Shilong (Inactive) [ 26/Oct/19 ]

Yes, the patches are needed to boost performance; I am not sure about the lustre-wc repo, though.

Comment by Gerrit Updater [ 06/Dec/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35437/
Subject: LU-12518 llite: support page unaligned stride readahead
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 91d2645515087df3f912b285419cfff73d9fca9e

Comment by Gerrit Updater [ 15/Jan/20 ]

Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/37248
Subject: LU-12518 llite: proper names/types for offset/pages
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 46fae0d64ea26ce743eea6265e169c1a8eb7c187

Comment by Gerrit Updater [ 28/Jan/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37248/
Subject: LU-12518 llite: proper names/types for offset/pages
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 83d8dd1d7c30c41e837b07b97198ad77bd903eea

Comment by Gerrit Updater [ 28/Jan/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35457/
Subject: LU-12518 llite: Accept EBUSY for page unaligned read
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: b9c155065d2ca4a6037a0ca4bfc788d6961fdc8e

Comment by Gerrit Updater [ 14/Feb/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35893/
Subject: LU-12518 llite: fix stride window increase
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 0cc48c4adbe540f8e529f80e4262b6ff47649e7c

Comment by Peter Jones [ 14/Feb/20 ]

It looks like all this work has now landed for 2.14

Comment by Gerrit Updater [ 07/Apr/20 ]

Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38154
Subject: LU-12518 llite: rename count and nob variables to bytes
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 921aa269e749d09b9d2bc5827f5f9ffe02b586b8

Comment by Gerrit Updater [ 13/Sep/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/38154/
Subject: LU-12518 llite: rename count and nob variables to bytes
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 271f838c5cd1c539f4a7de5008dfc7ffebb156c0

Generated at Sat Feb 10 02:53:18 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.