[LU-12748] parallel readahead needs to be optimized at high number of process Created: 11/Sep/19 Updated: 17/Feb/21 Resolved: 17/Feb/21 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.14.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Shuichi Ihara | Assignee: | Wang Shilong (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
master |
||
| Severity: | 3 | ||||||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||||||
| Description |
|
Parallel readahead is enabled by default in master, and it contributes a lot to sequential read performance.

Client:
2 x Platinum 8160 CPU @ 2.10GHz, 192GB memory, 2 x IB-EDR (multi-rail)
CentOS 7.6 (3.10.0-957.27.2.el7.x86_64), OFED-4.5

for i in 6 12 24 48; do
    size=$((768/i))
    /work/tools/mpi/gcc/openmpi/2.1.1/bin/mpirun --allow-run-as-root -np $i /work/tools/bin/ior -w -r -t 1m -b ${size}g -e -F -vv -o /scratch0/file | tee ior-1n${i}p-${VER}.log
done
Summary of Read Performance (MB/sec)
pRA=off - disabling parallel readahead (llite.*.read_ahead_async_file_threshold_mb=0) |
| Comments |
| Comment by Wang Shilong (Inactive) [ 11/Sep/19 ] |
|
The problem could be that we try to submit too many async readahead workers; even limiting the number of active workers for the workqueue did not help. I think that to fix the problem we could introduce an idea similar to what we did for limiting RA memory: limit the number of in-flight async readahead requests to the number of active CPU cores or similar, which would give us a balance. |
| Comment by Andreas Dilger [ 11/Sep/19 ] |
|
It looks like the crossover is at about NCPU/2 where the performance of parallel readahead and in-process readahead is the same. If we stop using async readahead at that point it should give the best of both worlds. |
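
The two suggestions above (cap the number of in-flight async readahead requests, and fall back to in-process readahead once the cap is reached) can be sketched roughly as follows. This is a minimal illustration in Python, not the Lustre kernel code and not the patch that was later merged. The names `AsyncReadaheadLimiter`, `try_start`, `finish`, and `choose_readahead_path` are hypothetical, and the default cap of half the CPU count is only an assumption taken from the NCPU/2 crossover mentioned in this comment.

```python
import os
import threading


class AsyncReadaheadLimiter:
    """Caps the number of in-flight async readahead requests (hypothetical helper)."""

    def __init__(self, max_inflight=None):
        # Assumption: default the cap to half the CPU count, following the
        # NCPU/2 crossover suggested in the ticket discussion.
        if max_inflight is None:
            max_inflight = max(1, (os.cpu_count() or 2) // 2)
        self.max_inflight = max_inflight
        self._inflight = 0
        self._lock = threading.Lock()

    def try_start(self):
        """Reserve a slot; return False if the cap has been reached."""
        with self._lock:
            if self._inflight >= self.max_inflight:
                return False
            self._inflight += 1
            return True

    def finish(self):
        """Release a slot when an async readahead request completes."""
        with self._lock:
            if self._inflight > 0:
                self._inflight -= 1

    @property
    def inflight(self):
        with self._lock:
            return self._inflight


def choose_readahead_path(limiter):
    """Return "async" if a worker slot was reserved, otherwise "sync"."""
    return "async" if limiter.try_start() else "sync"
```

A caller that gets "async" would hand the request to a worker and call `finish()` on completion; a caller that gets "sync" would do the readahead in its own context. Whether the cap should be the number of cores or half of it is left as a parameter here, since the ticket discusses both.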
| Comment by Gerrit Updater [ 15/Mar/20 ] |
|
Wang Shilong (wshilong@ddn.com) uploaded a new patch: https://review.whamcloud.com/37927 |
| Comment by James A Simmons [ 15/Mar/20 ] |
|
I was just discussing with Wang about this issue with the |
| Comment by Gerrit Updater [ 24/Mar/20 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37927/ |
| Comment by Andreas Dilger [ 05/May/20 ] |
|
Li Xi, note that the ability to change labels on the ticket is one of the reasons that we only mark tickets "Resolved" instead of "Closed". Otherwise, it is necessary to re-open and close the ticket to change it again. |
| Comment by Li Xi [ 06/May/20 ] |
|
That totally makes sense. Thanks for the explanation, Andreas! |