[LU-9972] Performance regressions on unique directory removal Created: 11/Sep/17 Updated: 01/Mar/18 Resolved: 06/Feb/18 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.11.0, Lustre 2.10.4 |
| Type: | Bug | Priority: | Major |
| Reporter: | Shuichi Ihara (Inactive) | Assignee: | Alex Zhuravlev |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
2.10 (and 2.11) |
||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||
| Description |
|
There is a performance regression on dir removal.

Server and client: RHEL7.3

mpirun --allow-run-as-root /work/tools/bin/mdtest -n 5000 -v -d /scratch0/mdtest.out -D -i 3 -p 10 -w 0 -u

SUMMARY: (of 3 iterations)
Operation                   Max         Min        Mean     Std Dev
---------                   ---         ---        ----     -------
Directory creation:   89757.381   65618.928   74607.900   10774.356
Directory stat    :  320946.433  319888.242  320294.264     465.749
Directory removal :   19028.569   17837.487   18351.200     499.838
Tree creation     :     434.446     158.826     318.943     116.860
Tree removal      :      27.018      25.210      26.281       0.775
 |
| Comments |
| Comment by Andreas Dilger [ 12/Sep/17 ] |
|
Compared to which version/kernel? |
| Comment by Shuichi Ihara (Inactive) [ 12/Sep/17 ] |
|
For example lustre-2.7(IEEL3.0)/CentOS7.3

SUMMARY: (of 3 iterations)
Operation                   Max         Min        Mean     Std Dev
---------                   ---         ---        ----     -------
Directory creation:   46577.991   42249.894   44871.081    1881.494
Directory stat    :  373243.136  367643.706  370043.774    2354.791
Directory removal :   78530.701   66152.245   72781.092    5091.584
File creation     :  107283.764   96953.405  103118.187    4447.973
File stat         :  385082.155  375112.919  379387.910    4191.828
File read         :  185463.654  177089.199  182367.310    3750.818
File removal      :  127467.768  113218.809  122566.251    6612.256
Tree creation     :     349.409      91.996     262.234     120.388
Tree removal      :      20.765      18.039      19.132       1.176

I'm going to test lustre-2.9 to compare. |
| Comment by Shuichi Ihara (Inactive) [ 16/Sep/17 ] |
|
Sorry for the delayed response. I needed to change the hardware configuration, but here are new results on b2_10 (2.10.1_RC1).

mpirun -np 128 mdtest -n 5000 -v -d /scratch0/mdtest.out -i 3 -p 30 -D (for shared directory)

Here are the directory operations on a shared directory.

SUMMARY: (of 3 iterations)
Operation                   Max         Min        Mean     Std Dev
---------                   ---         ---        ----     -------
Directory creation:   91979.485   69249.863   79842.797    9343.315
Directory stat    :  197008.811  180039.999  189342.439    7023.422
Directory removal :  140527.764  128798.718  133567.639    5032.803
Tree creation     :    5462.720    1034.229    3084.207    1822.788
Tree removal      :      92.639      74.702      86.019       8.041

And here are the unique directory results.

SUMMARY: (of 3 iterations)
Operation                   Max         Min        Mean     Std Dev
---------                   ---         ---        ----     -------
Directory creation:   84094.691   75575.177   80444.764    3583.407
Directory stat    :  463370.743  431285.266  448299.685   13170.724
Directory removal :   18722.965   18461.182   18558.573     116.903
Tree creation     :     593.577     310.356     472.213     119.117
Tree removal      :      37.275      33.999      35.691       1.340
V-1: Entering timestamp... |
| Comment by Andreas Dilger [ 18/Sep/17 ] |
|
Cliff, do we have similar mdtest results from the performance test cluster, in particular 2.10.0/1, 2.10.52/53, and 2.9.x? That would give us a ballpark of where this performance regression has been introduced, and allow git bisect to narrow it down to a particular patch. |
| Comment by Shuichi Ihara (Inactive) [ 19/Sep/17 ] |
|
I think the problem has existed since at least b2_9.

mpirun -np 128 mdtest -n 5000 -v -d /scratch0/mdtest.out -i 3 -p 30 -D -u (unique directory)

SUMMARY: (of 3 iterations)
Operation                   Max         Min        Mean     Std Dev
---------                   ---         ---        ----     -------
Directory creation:   91409.935   72314.781   84242.568    8491.267
Directory stat    :  184806.326  183688.542  184367.927     487.111
Directory removal :   20718.518   20303.157   20555.893     181.147
Tree creation     :     552.285     400.441     473.117      62.160
Tree removal      :      40.413      29.341      35.321       4.563
V-1: Entering timestamp...

mpirun -np 128 mdtest -n 5000 -v -d /scratch0/mdtest.out -i 3 -p 30 -D (shared directory)

SUMMARY: (of 3 iterations)
Operation                   Max         Min        Mean     Std Dev
---------                   ---         ---        ----     -------
Directory creation:   70310.000   45717.790   58926.161   10122.282
Directory stat    :  178080.331  175913.598  176783.485     934.667
Directory removal :   86194.900   72838.446   79018.261    5498.119
Tree creation     :    5527.274    2804.821    3744.496    1261.231
Tree removal      :      80.959      24.059      61.936      26.784
V-1: Entering timestamp... |
| Comment by Cliff White (Inactive) [ 19/Sep/17 ] |
|
Our hardware config has changed a bit since 2.9, we have seen noticeable improvements since changing the tuned-adm profile. Of course all our old results are on Sharepoint: If you look at our last EE 3.0 runs from June 2017, you will see Dir rm is 4x better. (b_ieel3_0 build 214) So I would look at some deltas there: http://tinyurl.com/yanedznq |
| Comment by Andreas Dilger [ 19/Sep/17 ] |
|
Results from master builds (18 threads 16 client mdtestfpp):

Build          Version     Dir create  Dir stat  Dir rm
master-3596    2.9.58_22        21514    188773   11822
master-3598    2.9.58_57        21570    209599   11653
master-3601    2.9.59           21063    223101   11879
master-2607    2.8.59-35        19328    211813   11797
master-3637    2.10.52_83       25987    234954   15033

Results from EE builds:

Build          Version     Dir create  Dir stat  Dir rm
b_ieel3_0-105  2.7.18           21239    167421   73112
b_ieel3_0-89   2.7.16.1         19758    169050   77119
b_ieel3_0-204  2.7.19.12        28330    288267   59444
b_ieel3_0-214  2.7.20.2         28136    331563   60515
 |
| Comment by Peter Jones [ 20/Sep/17 ] |
|
Saurabh Please can you narrow down where the change occurred? Thanks Peter |
| Comment by Andreas Dilger [ 20/Sep/17 ] |
|
Discussed this with Saurabh and Cliff. Cliff thinks the problem may date back to DNE2 landings, since EE 2.7 predates the DNE2 changes, and they appeared as early as 2.8.0. Saurabh will try a git bisect starting with v2_7_50 (== 2.7.0) to see if that has good performance on our test cluster (good ~= 70k rmdir/sec) and go from there. We would like to keep the kernel version the same, at RHEL 7.4, to avoid potential interference with the results from changing the kernel or other configuration options. |
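The bisect described above needs a repeatable good/bad test for each build. A minimal sketch of such a test, assuming the mdtest SUMMARY layout pasted earlier in this ticket: it pulls the Mean column of the "Directory removal" row and maps it to the exit codes that `git bisect run` understands (0 good, 1 bad, 125 skip). The function names and the 40000 cutoff are hypothetical; the ticket only says good is roughly 70k rmdir/sec and regressed builds measure well below 20k.

```python
import re

# Assumed cutoff, not a number from the ticket: good builds are ~70k rmdir/sec
# and regressed builds are ~11-20k, so any value in between separates them.
GOOD_THRESHOLD = 40000.0

# Matches the mdtest summary row; the columns are Max, Min, Mean, Std Dev.
SUMMARY_ROW = re.compile(
    r"Directory removal\s*:\s*([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)"
)


def dir_removal_mean(mdtest_output):
    """Return the Mean of the 'Directory removal' row, or None if absent."""
    match = SUMMARY_ROW.search(mdtest_output)
    if match is None:
        return None
    return float(match.group(3))


def bisect_exit_code(mdtest_output):
    """Map one mdtest run to an exit code for `git bisect run`."""
    mean = dir_removal_mean(mdtest_output)
    if mean is None:
        return 125  # git bisect treats 125 as "cannot test, skip this commit"
    return 0 if mean >= GOOD_THRESHOLD else 1
```

A wrapper script that builds and mounts the commit under test, runs mdtest, and exits with `bisect_exit_code(output)` could then be handed to `git bisect run`; that wrapper is site-specific and is not shown here.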
| Comment by Gerrit Updater [ 21/Sep/17 ] |
|
Saurabh Tandan (saurabh.tandan@intel.com) uploaded a new patch: https://review.whamcloud.com/29126 |
| Comment by Shuichi Ihara (Inactive) [ 10/Oct/17 ] |
|
Any progress on finding the regression point? |
| Comment by Saurabh Tandan (Inactive) [ 10/Oct/17 ] |
|
Still working on it Shuichi, will soon post some findings. |
| Comment by Peter Jones [ 20/Oct/17 ] |
|
Alex I daresay that Saurabh may elaborate but I understand that he has found that your patch Do you have any ideas on how to avoid this? Peter |
| Comment by Saurabh Tandan (Inactive) [ 20/Oct/17 ] |
|
There was a drop of approximately 90% in "Dir removal" performance in the "mdtestfpp" results from tag 2.7.65. The data for all the runs follows:

Tag                         Dir removal
2.7.56                            18298
2.7.57                           121954
2.7.61                            64849
2.7.64                           111655  good
2.7.64-g63a3e412 (LU-7419)        74384  good
2.7.64-gc965fc8a (LU-7450)        72374  good
2.7.64-g6765d785 (LU-7408)        92029  good
2.7.64-g9ae3a289 (LU-7053)        11517  bad
2.7.64-g0d3a07a8 (LU-7430)        15114  bad
2.7.64-g959f8f78 (LU-7573)        11530  bad
2.7.65                            11375  bad
2.7.66                            11403  bad
2.10.53                           12473
2.10.54                            9649
 |
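As a quick check of the figures in this comment, the labelled runs can be scanned in commit order to find where good turns to bad and to compute the size of the drop. This is only a sketch over the numbers posted here; `RUNS`, `first_bad` and `percent_drop` are illustrative names.

```python
# (tag, Dir removal ops/sec, label) for the labelled runs, in commit order,
# copied from the figures in this comment.
RUNS = [
    ("2.7.64", 111655, "good"),
    ("2.7.64-g63a3e412", 74384, "good"),
    ("2.7.64-gc965fc8a", 72374, "good"),
    ("2.7.64-g6765d785", 92029, "good"),
    ("2.7.64-g9ae3a289", 11517, "bad"),
    ("2.7.64-g0d3a07a8", 15114, "bad"),
    ("2.7.64-g959f8f78", 11530, "bad"),
    ("2.7.65", 11375, "bad"),
]


def first_bad(runs):
    """Return the tag of the first run labelled 'bad', or None."""
    for tag, _rate, label in runs:
        if label == "bad":
            return tag
    return None


def percent_drop(before, after):
    """Relative drop from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


boundary = first_bad(RUNS)                    # first regressed commit
drop_tag_to_tag = percent_drop(111655, 11375)  # 2.7.64 -> 2.7.65
drop_at_boundary = percent_drop(92029, 11517)  # last good -> first bad
```

The tag-to-tag drop works out to just under 90%, and the drop across the single good/bad boundary is about 87%.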
| Comment by John Hammond [ 20/Oct/17 ] |
|
There must have been more runs than just these if you were able to isolate https://review.whamcloud.com/#/c/17092/. |
| Comment by Andreas Dilger [ 20/Oct/17 ] |
|
I've updated the results to show the commit-order test results for the bisect (not the bisect order), to show there is a clear break between |
| Comment by Alex Zhuravlev [ 23/Oct/17 ] |
|
I tried to revert going to proceed with mdtest, but that will take some time. |
| Comment by Gerrit Updater [ 23/Oct/17 ] |
|
Alex Zhuravlev (alexey.zhuravlev@intel.com) uploaded a new patch: https://review.whamcloud.com/29709 |
| Comment by Alex Zhuravlev [ 23/Oct/17 ] |
|
please, try with the patch above. |
| Comment by Saurabh Tandan (Inactive) [ 26/Oct/17 ] |
|
Performance results with patch above:

MDTEST RESULTS
000: SUMMARY: (of 3 iterations)
000: Operation                   Max         Min        Mean     Std Dev
000: ---------                   ---         ---        ----     -------
000: Directory creation:   24200.233   15693.923   18542.903    4000.371
000: Directory stat    :  235384.333  230225.191  233266.050    2204.926
000: Directory removal :   82809.228   30178.490   51633.084   22559.256
000: File creation     :   50508.075   33979.054   41331.172    6870.202
000: File stat         :  232294.752  226222.165  228931.104    2521.978
000: File read         :  124620.945  110484.961  119398.361    6333.672
000: File removal      :  128159.287   77027.802  106534.012   21605.386
000: Tree creation     :     124.922      65.746     103.702      26.901
000: Tree removal      :       9.709       7.841       8.650       0.783

Results for dir removal with this patch have surely improved but still not to the mark where they were before |
| Comment by Andreas Dilger [ 26/Oct/17 ] |
Build          Version              Dir create  Dir stat  Dir rm
b_ieel3_0-89   2.7.16.1                  19758    169050   77119
b_ieel3_0-105  2.7.18                    21239    167421   73112
b_ieel3_0-214  2.7.20.2                  28136    331563   60515
master-3601    2.9.59                    21063    223101   11879
master-2607    2.8.59-35                 19328    211813   11797
master-3637    2.10.52_83                25987    234954   15033
master         2.10.54                   18676    210767   10282
review-51618   2.10.54-20-g66bb2d1       18542    233266   51633

So it looks like the rmdir performance is significantly improved, but the mkdir performance is down a bit from 2.10.53, but about on par with 2.10.54. It looks like the file create/stat/unlink performance is a bit lower vs. 2.10.54, but definitely still a lot better than 2.7.64.

Alex, would it make sense to only add directory objects to the cache, instead of adding all objects? That may give us the best of both worlds. |
| Comment by Alex Zhuravlev [ 27/Oct/17 ] |
|
Andreas, at the moment I don't quite understand how the cache can decrease performance, as it's TLS, lockless and tiny, and it replaces a lookup in the LU cache, which in contrast is much larger and needs locking. I keep investigating. |
| Comment by Alex Zhuravlev [ 27/Oct/17 ] |
|
please, benchmark master with https://review.whamcloud.com/#/c/29821/ |
| Comment by John Hammond [ 27/Oct/17 ] |
|
Before |
| Comment by Alex Zhuravlev [ 27/Oct/17 ] |
|
IDC is lockless and per-thread while FLDB is a shared structure. |
| Comment by John Hammond [ 27/Oct/17 ] |
|
Yes, but I'm not talking about the cost of accessing the IDC cache. I mean the extra cost of OI lookup to initialize the IDC entry. |
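The two positions in this exchange can be pictured with a toy model. This is not Lustre code: `SharedTable` and `PerThreadCache` are hypothetical stand-ins for a shared, lock-protected structure and a per-thread cache in front of it. The sketch only shows the general trade-off being discussed: a cache hit touches thread-local state and takes no lock, while the first access to a key in each thread still pays for one lookup in the shared structure.

```python
import threading


class SharedTable:
    """Stand-in for a shared structure: every lookup takes a lock."""

    def __init__(self, entries):
        self._entries = dict(entries)
        self._lock = threading.Lock()
        self.locked_lookups = 0  # counts how often the lock was taken

    def lookup(self, key):
        with self._lock:
            self.locked_lookups += 1
            return self._entries.get(key)


class PerThreadCache:
    """Stand-in for a per-thread cache: hits use only thread-local state."""

    def __init__(self, backing):
        self._backing = backing
        self._local = threading.local()

    def lookup(self, key):
        cache = getattr(self._local, "entries", None)
        if cache is None:
            cache = self._local.entries = {}  # first use in this thread
        if key not in cache:
            # Miss: this is the initialization cost, one shared lookup.
            cache[key] = self._backing.lookup(key)
        return cache[key]
```

Whether the real initialization cost matters depends on how often entries are already populated by earlier calls, which is the question raised here.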
| Comment by Alex Zhuravlev [ 27/Oct/17 ] |
|
usually it's initialized from other preceding methods (like osd_declare_ref_{add|del} in https://review.whamcloud.com/29709) |
| Comment by Alex Zhuravlev [ 27/Oct/17 ] |
|
to verify that I added printk() to osd_idc_find_or_init() and got zero calls to osd_remote_fid() and osd_oi_lookup() during rmdir. |
| Comment by Saurabh Tandan (Inactive) [ 27/Oct/17 ] |
|
Results of patch https://review.whamcloud.com/#/c/29821/ which reverts just

MIB RESULTS

MDTEST RESULTS
000: SUMMARY: (of 3 iterations)
000: Operation                   Max         Min        Mean     Std Dev
000: ---------                   ---         ---        ----     -------
000: Directory creation:   21177.729   16443.602   18498.262    1982.554
000: Directory stat    :  232409.626  229897.859  230874.272    1098.955
000: Directory removal :  111897.307   40906.454   75055.964   29044.331
000: File creation     :   44061.464   38438.052   41863.997    2454.596
000: File stat         :  225941.782  193254.640  210777.086   13448.211
000: File read         :  146598.308   86669.775  126562.125   28208.247
000: File removal      :  155512.154  106886.927  136397.357   21168.452
000: Tree creation     :     120.178      56.129      96.113      28.468
000: Tree removal      :       8.906       8.122       8.620       0.354
000:

The mean Dir removal rate for this run is 75055.964, approximately 45% higher than with the patch https://review.whamcloud.com/29709 under exactly the same conditions and number of iterations. |
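The 45% figure can be reproduced from the two Directory removal means posted in this ticket, and the same rows also show how wide the run-to-run spread is. A small sketch, using only those posted values; the helper names are illustrative.

```python
def percent_higher(new, old):
    """How much higher `new` is than `old`, in percent."""
    return 100.0 * (new - old) / old


def relative_spread(std_dev, mean):
    """Standard deviation as a fraction of the mean."""
    return std_dev / mean


# Directory removal rows (Mean, Std Dev) from the two mdtest runs in this ticket.
MEAN_29821, STD_29821 = 75055.964, 29044.331  # review 29821
MEAN_29709, STD_29709 = 51633.084, 22559.256  # review 29709, 26/Oct/17 run

gain = percent_higher(MEAN_29821, MEAN_29709)
spread_29821 = relative_spread(STD_29821, MEAN_29821)
spread_29709 = relative_spread(STD_29709, MEAN_29709)
```

The gain comes out at about 45%, while the standard deviation is roughly 39% and 44% of the mean for the two runs, so with only three iterations the comparison carries a wide margin.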
| Comment by Alex Zhuravlev [ 29/Oct/17 ] |
|
thanks for the data. I've updated https://review.whamcloud.com/29709 please attach it here if printed. I'm working on a followup patch to optimize calls to osd_remote_fid(), this isn't directly related to |
| Comment by Alex Zhuravlev [ 14/Nov/17 ] |
|
please, try with the updated patch. |
| Comment by Joseph Gmitter (Inactive) [ 04/Jan/18 ] |
|
Hi Ihara, Have you been able to confirm that Alex's patch resolves the issue? Thanks. |
| Comment by Gerrit Updater [ 06/Feb/18 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/29709/ |
| Comment by Peter Jones [ 06/Feb/18 ] |
|
Landed for 2.11 |
| Comment by Gerrit Updater [ 07/Feb/18 ] |
|
Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/31211 |
| Comment by Gerrit Updater [ 01/Mar/18 ] |
|
John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/31211/ |