[LU-15725] Client side Mdtest File Read Regression introduced with fix for LU-11623 Created: 06/Apr/22 Updated: 10/May/22
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.13.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Petros Koutoupis | Assignee: | Lai Siyao |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None |
| Issue Links: | None |
| Severity: | 3 |
| Description |
|
While testing 2.15 and comparing it to our 2.12 branch, I observed a noticeable file read regression on the client side. After doing a git bisect, I narrowed it down to the patch https://review.whamcloud.com/38763 "LU-11623 llite: hash just created files if lock allows". After reverting the patch, my read performance was immediately restored, but at the expense of the huge file stat boost.

[Attached charts: "File stats", "File Reads"]

mdtest script:

#!/bin/bash
NODES=21
PPN=16
PROCS=$(( $NODES * $PPN ))
MDT_COUNT=1
PAUSED=120

# Unique directory
# srun -N $NODES --ntasks-per-node $PPN ~bloewe/benchmarks/ior-3.3.0-CentOS-8.2/install/bin/mdtest -v -i 5 -p $PAUSED -C -E -T -r -n $(( $MDT_COUNT * 1048576 / $PROCS )) -u -d /mnt/kjlmo13/pkoutoupis/mdt0/test.`date +"%Y%m%d.%H%M%S"` 2>&1 |& tee f_mdt0_0k_ost_uniq.out
srun -N $NODES --ntasks-per-node $PPN ~bloewe/benchmarks/ior-3.3.0-CentOS-8.2/install/bin/mdtest -v -i 5 -p $PAUSED -C -w 32768 -E -e 32768 -T -r -n $(( $MDT_COUNT * 1048576 / $PROCS )) -u -d /mnt/kjlmo13/pkoutoupis/mdt0/test.`date +"%Y%m%d.%H%M%S"` 2>&1 |& tee f_mdt0_32k_ost_uniq.out
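For reproducibility, a minimal sketch of the bisect workflow described above, assuming hypothetical build.sh and run_read_mean.sh helpers (not part of the tree) that rebuild/install the client and print only the mdtest "File read" mean; the 240000 ops/s threshold is illustrative:

#!/bin/bash
# bisect_test.sh -- hypothetical driver for "git bisect run".
./build.sh || exit 125                 # exit 125 tells git bisect to skip unbuildable commits
READ_MEAN=$(./run_read_mean.sh)        # placeholder: parses the "File read" mean from mdtest output
# Treat a mean below the illustrative threshold as the regressed ("bad") behavior:
[ "${READ_MEAN%.*}" -ge 240000 ]       # exit 0 = good, non-zero = bad

Usage would be "git bisect start <bad-commit> <good-commit>" followed by "git bisect run ./bisect_test.sh".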
|
| Comments |
| Comment by Andreas Dilger [ 06/Apr/22 ] |
|
Petros, it would be useful if you edited your original description to indicate "git describe" versions for the "Original" and "Before Revert" tests. Is "Original" the commit before the LU-11623 patch, and "After Revert" on master with that patch reverted? Or is "Original" the 2.12.x test results? |
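For reference, a quick way to capture unambiguous version labels is to run the following in each lustre-release checkout used to build a client under test (output shown is illustrative):

git describe --tags          # e.g. a 2.15 RC tag plus commit offset; exact output depends on the tree
git rev-parse --short HEAD   # the commit actually built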
| Comment by Andreas Dilger [ 06/Apr/22 ] |
|
The first thing to check is whether there is something that is not being done correctly in this case. Unfortunately, the original patch did not show the "File read" results, or the regression might have been more visible there. In some cases, performance issues like this are caused by incorrectly conflicting/cancelling the lock on the client, and it might be possible to "have your lock and read it too" by avoiding the extra cancellation(s), or by efficiently handling the cancellation (if needed) with ELC (Early Lock Cancellation). In situations where there is no single "good answer" for whether the extra lock should be taken or not, it may be that a weighted history of what is done to the file is needed (e.g. similar to patch https://review.whamcloud.com/46696). One way to check for extra cancellations is sketched below. |
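As a rough check for extra lock cancellations, the client's LDLM counters can be snapshotted immediately before and after the read phase. The parameter names below exist on current clients, but the exact namespace names depend on the mount; this is a sketch, not a definitive procedure:

# Locks currently cached by the client, per namespace:
lctl get_param ldlm.namespaces.*.lock_count
# Grant/cancel rates from the LDLM pool; a high cancel_rate during the
# read phase would point at locks being given up rather than reused:
lctl get_param ldlm.namespaces.*.pool.grant_rate
lctl get_param ldlm.namespaces.*.pool.cancel_rate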
| Comment by Lai Siyao [ 07/Apr/22 ] |
|
Petros, what are the results of "Directory stat" before and after the revert? |
| Comment by Oleg Drokin [ 07/Apr/22 ] |
|
There was a follow-on patch https://review.whamcloud.com/#/c/33585/ that was not landed for a variety of reasons; I wonder if it could be tried too. |
| Comment by Petros Koutoupis [ 07/Apr/22 ] |
|
Andreas, I modified the description. I hope that clarifies things.
Lai, Directory stats are unchanged in all cases. |
| Comment by Lai Siyao [ 25/Apr/22 ] |
|
The last patch of LU-11623, https://review.whamcloud.com/#/c/33585/, has been updated. Will you cherry-pick it and try again? |
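For reference, one common way to pull a Gerrit change into a local tree; the project path and the patchset suffix N below are assumptions (check the change page on review.whamcloud.com for the latest patchset number):

git fetch https://review.whamcloud.com/fs/lustre-release refs/changes/85/33585/N
git cherry-pick FETCH_HEAD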
| Comment by Petros Koutoupis [ 25/Apr/22 ] |
|
@Lai Siyao, I cherry-picked the patch on top of 2.15.0 RC3 and reran the same tests. Unfortunately, the file read performance looks worse.

2.15.0 RC3 without the patch:

[ ... ]
Operation                 Max          Min         Mean      Std Dev
---------                 ---          ---         ----      -------
File stat  :       710652.674   680830.320   695315.322    10282.708
File read  :       267242.290   211957.110   243331.807    20164.563
[ ... ]

2.15.0 RC3 with the patch:

[ ... ]
Operation                 Max          Min         Mean      Std Dev
---------                 ---          ---         ----      -------
File stat  :       704615.924   665430.996   690638.517    13355.073
File read  :       255746.075   194060.211   226496.114    21414.336
[ ... ] |
| Comment by Lai Siyao [ 29/Apr/22 ] |
|
Petros, https://review.whamcloud.com/#/c/33585/ has been updated; local testing looks promising. |
| Comment by Petros Koutoupis [ 09/May/22 ] |
|
With the updated patch cherry-picked on top of 2.15:

File stat  :       703505.087   689172.890   696705.795     4933.824
File read  :       270560.870   217336.834   248256.171    17416.326

There does not seem to be much difference from 2.15.0 RC3 without the patch. Please refer to my mdtest script above for the testing parameters. Thank you for working on this. |
| Comment by Lai Siyao [ 10/May/22 ] |
|
I did more testing, and it looks like when the total number of files is too large, the client can't cache all the locks, so the cached locks don't help. I'll see how to improve this. |
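If lock-cache capacity is the limiting factor, one experiment is to compare the cached lock count against the per-client file count and temporarily pin a larger lock LRU. The value below is illustrative, not a recommended production setting:

# 0 means the LRU is sized dynamically by the LDLM pool:
lctl get_param ldlm.namespaces.*.lru_size
# Compare against the ~50k files created per client in this test:
lctl get_param ldlm.namespaces.*.lock_count
# Pin a fixed, larger LRU for the experiment (disables dynamic sizing):
lctl set_param ldlm.namespaces.*.lru_size=100000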
| Comment by Andreas Dilger [ 10/May/22 ] |
|
Petros, Lai, collecting a flame graph during the test on the client and server, with and without the open cache, would definitely help isolate where the time is being spent. Initially I thought it might relate to the delay in cancelling the open lock when a second client node is reading the file, which would hurt read performance (either because of the extra lock cancel, or possibly delayed flushing due to the write cache). However, there is a 120s sleep between phases, and I didn't see the "mdtest -N stride" option being used to force file access from a different node, so reads should be local to the node that wrote the file. There are only about 50k files and 1.6GB of data being created on each client, so this shouldn't exceed the client cache size, and reads should be "free" in this case. |
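A sketch of one common way to capture such a flame graph, using perf plus Brendan Gregg's FlameGraph scripts (external tools, assumed cloned into ./FlameGraph), run on the client and on the server during the read phase:

perf record -F 99 -a -g -- sleep 60      # sample all CPUs at 99 Hz for ~60s of the read phase
perf script > out.perf                   # dump the recorded stacks
./FlameGraph/stackcollapse-perf.pl out.perf > out.folded
./FlameGraph/flamegraph.pl out.folded > read_phase.svg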