[LU-10005] File creation to slave MDT is much slower than primary MDT on DNE1 configuration Created: 19/Sep/17 Updated: 05/Sep/18 Resolved: 17/Dec/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.11.0, Lustre 2.10.4 |
| Type: | Bug | Priority: | Major |
| Reporter: | Shuichi Ihara (Inactive) | Assignee: | Lai Siyao |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
b2_10 |
||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||
| Description |
|
There is an MDS and two MDTs. The hardware setup of both MDTs is symmetric.

[root@c01 ~]# lfs mkdir -i 0 /scratch0/dir0
[root@c01 ~]# lfs mkdir -i 1 /scratch0/dir1

When mdtest is run against each MDT separately, file creation on the slave MDT (MDT0001) is much slower than on the primary MDT (MDT0000).

1. MDT0000 on MDT0 : 154K ops/sec

I also tested the MDT1 device as MDT0000: the MDT1 device was reformatted as MDT0000 and the MDT0 device was reformatted as MDT0001 (that is, the MDT0 and MDT1 devices were swapped).

3. MDT0000 on original MDT1 device : 151K ops/sec

These benchmark results show that the MDT device and backend storage are not the problem, and it doesn't matter which device is used. In every case, file creation on MDT0001 is slower than on MDT0000.

Here are the full mdtest results.

Format MDT0 device with MDT0000

SUMMARY: (of 3 iterations)
   Operation                  Max          Min         Mean      Std Dev
   ---------                  ---          ---         ----      -------
   File creation   :   155589.552   152790.159   154009.399     1170.995
   File stat       :   454932.894   444775.351   449009.216     4315.516
   File read       :   233628.858   230038.744   232081.775     1507.029
   File removal    :   188460.588   184435.235   186712.008     1685.251
   Tree creation   :      551.714      444.141      493.856       44.292
   Tree removal    :       19.593       18.601       18.984        0.436
V-1: Entering timestamp...

Format MDT1 device with MDT0001

SUMMARY: (of 3 iterations)
   Operation                  Max          Min         Mean      Std Dev
   ---------                  ---          ---         ----      -------
   File creation   :    97428.133    92734.086    94657.494     2007.797
   File stat       :   463844.746   439627.133   450037.946    10174.240
   File read       :   234910.249   232565.024   233533.717      999.923
   File removal    :   186289.259   181171.423   184208.010     2195.839
   Tree creation   :      476.266       32.866      325.249      206.784
   Tree removal    :       19.429       14.144       17.055        2.191
V-1: Entering timestamp...

Reformat MDT1 device as MDT0000

SUMMARY: (of 3 iterations)
   Operation                  Max          Min         Mean      Std Dev
   ---------                  ---          ---         ----      -------
   File creation   :   155432.973   145656.215   151697.151     4311.335
   File stat       :   436363.906   420914.320   428509.377     6309.935
   File read       :   231848.424   229879.273   230823.486      805.926
   File removal    :   189856.501   186441.697   187710.599     1525.794
   Tree creation   :      564.044      432.872      504.217       54.166
   Tree removal    :       18.839       17.053       17.802        0.757
V-1: Entering timestamp...

Reformat MDT0 device as MDT0001

SUMMARY: (of 3 iterations)
   Operation                  Max          Min         Mean      Std Dev
   ---------                  ---          ---         ----      -------
   File creation   :   110312.440   103512.106   106042.905     3036.285
   File stat       :   443284.493   425246.521   435923.695     7728.341
   File read       :   226239.692   225898.388   226067.629      139.351
   File removal    :   186702.519   181944.612   184773.293     2043.883
   Tree creation   :      533.233       28.863      342.123      223.290
   Tree removal    :       17.901       17.260       17.650        0.280
V-1: Entering timestamp... |
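To make the gap in the description easier to quantify, the sketch below parses rows of an mdtest SUMMARY table and compares mean file creation rates. The helper name `parse_summary` and the regular expression are illustrative assumptions, not part of mdtest; the two input rows are copied from the first pair of results above.

```python
import re

# One row of an mdtest SUMMARY table, e.g.
# "File creation : 155589.552 152790.159 154009.399 1170.995"
ROW = re.compile(
    r"^\s*([A-Za-z ]+?)\s*:\s*([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s*$"
)


def parse_summary(text):
    """Map each operation name to its max/min/mean/std rates (ops/sec).

    Hypothetical helper for illustration; lines that are not data rows
    (headers, separators, "V-1: ..." messages) are skipped.
    """
    rows = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            op = m.group(1)
            mx, mn, mean, std = (float(v) for v in m.groups()[1:])
            rows[op] = {"max": mx, "min": mn, "mean": mean, "std": std}
    return rows


# File creation rows copied from the results in the description.
mdt0000 = parse_summary("File creation : 155589.552 152790.159 154009.399 1170.995")
mdt0001 = parse_summary("File creation :  97428.133  92734.086  94657.494 2007.797")

ratio = mdt0001["File creation"]["mean"] / mdt0000["File creation"]["mean"]
print(f"MDT0001 creates files at {ratio:.1%} of the MDT0000 rate")
```

With these inputs the ratio comes out a little above 0.61, which matches the roughly 154K versus 95K ops/sec reported above. Stat, read, and removal rates can be compared the same way by passing the full SUMMARY block.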
| Comments |
| Comment by Andreas Dilger [ 19/Sep/17 ] |
|
Hi Ihara,

# cd /scratch/dir1
# mpirun -np 128 /work/tools/bin/mdtest -n 10000 -v -d . -i 3 -p 60 -F -u

The scratch directory should always be cached on the client, but I'm wondering if there is some problem with the locking on dir1 that is preventing it from being cached? |
| Comment by Di Wang [ 19/Sep/17 ] |
|
Another possible cause is that the default LOV striping cache does not work correctly, which might cause each file open/create (on a non-root MDT) to fetch the default striping from the root MDT (an extra RPC). See lod_ah_init()->lod_get_default_lov_striping(). I am not sure the OSP cache works in this case; I will check. |
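The suspected cost can be pictured with a toy model. This is not Lustre code: `RootMDT`, `RemoteMDT`, `count_rpcs`, and the striping values are all hypothetical names and numbers chosen for illustration. It only shows why a per-create lookup on the root MDT scales with the number of files, while a working cache needs a single lookup.

```python
class RootMDT:
    """Stand-in for the root MDT; counts simulated remote lookups."""

    def __init__(self):
        self.rpc_count = 0

    def get_default_striping(self):
        # Each call models one extra RPC from a non-root MDT.
        self.rpc_count += 1
        # Made-up striping values, for illustration only.
        return {"stripe_count": 1, "stripe_size": 1 << 20}


class RemoteMDT:
    """Stand-in for a non-root MDT that needs the default striping."""

    def __init__(self, root, use_cache):
        self.root = root
        self.use_cache = use_cache
        self._cached = None

    def create_file(self, name):
        if self.use_cache:
            # Fetch once, then reuse the cached copy.
            if self._cached is None:
                self._cached = self.root.get_default_striping()
            striping = self._cached
        else:
            # Broken or missing cache: one remote lookup per create.
            striping = self.root.get_default_striping()
        return name, striping["stripe_count"]


def count_rpcs(n_files, use_cache):
    """Return how many simulated root-MDT lookups n_files creates cost."""
    root = RootMDT()
    mdt = RemoteMDT(root, use_cache)
    for i in range(n_files):
        mdt.create_file(f"file.{i}")
    return root.rpc_count


print(count_rpcs(10000, use_cache=False))
print(count_rpcs(10000, use_cache=True))
```

A real cache would also need invalidation when the default striping changes; the model leaves that out.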
| Comment by Gerrit Updater [ 19/Sep/17 ] |
|
wangdi (di.wang@intel.com) uploaded a new patch: https://review.whamcloud.com/29078 |
| Comment by Di Wang [ 19/Sep/17 ] |
|
Ihara, please try this patch, thanks. |
| Comment by Shuichi Ihara (Inactive) [ 20/Sep/17 ] |
|
Thanks WangDi

MDT0000

SUMMARY: (of 3 iterations)
   Operation                  Max          Min         Mean      Std Dev
   ---------                  ---          ---         ----      -------
   File creation   :   145472.642   137093.283   140274.791     3706.089
   File stat       :   443154.312   431557.793   436764.649     4807.570
   File read       :   233326.068   229897.041   231796.549     1424.131
   File removal    :   186842.911   186376.008   186627.793      192.368
   Tree creation   :      572.336      436.418      499.243       55.961
   Tree removal    :       19.798       18.276       19.165        0.647
V-1: Entering timestamp...

MDT0001

SUMMARY: (of 3 iterations)
   Operation                  Max          Min         Mean      Std Dev
   ---------                  ---          ---         ----      -------
   File creation   :   153350.909   135825.706   143892.288     7222.027
   File stat       :   462687.460   449961.013   457977.445     5697.362
   File read       :   230307.880   224196.385   226475.436     2726.092
   File removal    :   192887.031   187816.726   189799.023     2212.682
   Tree creation   :      514.550      399.017      451.076       47.852
   Tree removal    :       18.876       17.538       18.222        0.546
V-1: Entering timestamp...

btw, I didn't see such performance differences with IEEL3.0. Is this something we did in lustre-2.7 that was missed or changed after lustre-2.7, so that this issue showed up? |
| Comment by Shuichi Ihara (Inactive) [ 20/Sep/17 ] |
|
BTW, after applying patch https://review.whamcloud.com/29078, overall average file creation performance to the primary MDT (MDT0000) drops. I ran mdtest 10 times with a 60 sec interval.

without patch

[root@c01 ~]# grep 'V-1: File creation' mdtest-default-dir0-loop.log
V-1: File creation : 8.723 sec, 146739.418 ops/sec
V-1: File creation : 8.855 sec, 144543.110 ops/sec
V-1: File creation : 8.829 sec, 144978.787 ops/sec
V-1: File creation : 8.803 sec, 145404.742 ops/sec
V-1: File creation : 8.637 sec, 148192.295 ops/sec
V-1: File creation : 9.084 sec, 140911.792 ops/sec
V-1: File creation : 8.837 sec, 144853.049 ops/sec
V-1: File creation : 9.288 sec, 137808.205 ops/sec
V-1: File creation : 9.046 sec, 141502.448 ops/sec
V-1: File creation : 9.392 sec, 136287.278 ops/sec

with patch

[root@c01 ~]# grep 'V-1: File creation' mdtest-LU10005-dir0-loop.log
V-1: File creation : 8.874 sec, 144246.332 ops/sec
V-1: File creation : 8.552 sec, 149675.160 ops/sec
V-1: File creation : 9.211 sec, 138957.104 ops/sec
V-1: File creation : 9.058 sec, 141315.265 ops/sec
V-1: File creation : 9.297 sec, 137673.095 ops/sec
V-1: File creation : 9.263 sec, 138185.327 ops/sec
V-1: File creation : 9.469 sec, 135184.898 ops/sec
V-1: File creation : 9.266 sec, 138134.736 ops/sec
V-1: File creation : 9.373 sec, 136563.934 ops/sec
V-1: File creation : 9.486 sec, 134930.710 ops/sec |
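The two runs above can be summarized with a short script. This is a sketch only: the helper `creation_rates` and its regular expression are assumptions about the `V-1: File creation` line format shown in the grep output, and the rates are copied from that output.

```python
import re
from statistics import mean

# Matches lines such as "V-1: File creation : 8.723 sec, 146739.418 ops/sec"
RATE = re.compile(r"File creation\s*:\s*[\d.]+ sec,\s*([\d.]+) ops/sec")


def creation_rates(text):
    """Extract the ops/sec value from every 'File creation' line."""
    return [float(r) for r in RATE.findall(text)]


# Rates copied from mdtest-default-dir0-loop.log (without patch).
without_patch = creation_rates("""
V-1: File creation : 8.723 sec, 146739.418 ops/sec
V-1: File creation : 8.855 sec, 144543.110 ops/sec
V-1: File creation : 8.829 sec, 144978.787 ops/sec
V-1: File creation : 8.803 sec, 145404.742 ops/sec
V-1: File creation : 8.637 sec, 148192.295 ops/sec
V-1: File creation : 9.084 sec, 140911.792 ops/sec
V-1: File creation : 8.837 sec, 144853.049 ops/sec
V-1: File creation : 9.288 sec, 137808.205 ops/sec
V-1: File creation : 9.046 sec, 141502.448 ops/sec
V-1: File creation : 9.392 sec, 136287.278 ops/sec
""")

# Rates copied from mdtest-LU10005-dir0-loop.log (with patch).
with_patch = creation_rates("""
V-1: File creation : 8.874 sec, 144246.332 ops/sec
V-1: File creation : 8.552 sec, 149675.160 ops/sec
V-1: File creation : 9.211 sec, 138957.104 ops/sec
V-1: File creation : 9.058 sec, 141315.265 ops/sec
V-1: File creation : 9.297 sec, 137673.095 ops/sec
V-1: File creation : 9.263 sec, 138185.327 ops/sec
V-1: File creation : 9.469 sec, 135184.898 ops/sec
V-1: File creation : 9.266 sec, 138134.736 ops/sec
V-1: File creation : 9.373 sec, 136563.934 ops/sec
V-1: File creation : 9.486 sec, 134930.710 ops/sec
""")

mean_without = mean(without_patch)
mean_with = mean(with_patch)
change = (mean_with - mean_without) / mean_without

print(f"without patch: {mean_without:.1f} ops/sec")
print(f"with patch:    {mean_with:.1f} ops/sec")
print(f"relative change: {change:.1%}")
```

The means differ by a few percent, while the spread between the fastest and slowest run within each set is larger than that difference, so more runs may be needed before attributing the drop to the patch.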
| Comment by Di Wang [ 20/Sep/17 ] |
"btw, I didn't see such performance differences with IEEL3.0. Is this something we did in lustre-2.7 that was missed or changed after lustre-2.7, so that this issue showed up?"

I think this is brought in by

"BTW, after applying patch https://review.whamcloud.com/29078, overall average file creation performance to the primary MDT (MDT0000) drops."

Hmm, I did not touch anything in the path of local file open/creation. Is it repeatable? This drop seems unlikely to be related to the patch, IMHO. I will check again. Thanks. |
| Comment by Gerrit Updater [ 17/Dec/17 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/29078/ |
| Comment by Peter Jones [ 17/Dec/17 ] |
|
Landed for 2.11 |
| Comment by Gerrit Updater [ 18/Dec/17 ] |
|
Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/30585 |
| Comment by Gerrit Updater [ 22/Dec/17 ] |
|
Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/30643 |
| Comment by Gerrit Updater [ 12/Apr/18 ] |
|
John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/30585/ |