[LU-9840] LU-3529 causes 25% metadata performance regressions even without DNE Created: 07/Aug/17 Updated: 15/Nov/17 Resolved: 21/Sep/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.11.0, Lustre 2.10.2 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Shuichi Ihara (Inactive) | Assignee: | Lai Siyao |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
master |
||
| Attachments: |
|
||||||||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||
| Description |
|
Finally, we found the commit that is the root cause of the 25% metadata performance regression.
5f3e926ac9ff8ad134ad920d0e8545e16395ef3b is the first bad commit
commit 5f3e926ac9ff8ad134ad920d0e8545e16395ef3b
Author: wang di <di.wang@intel.com>
Date: Wed Jul 31 00:00:40 2013 -0700
LU-3529 lod: create striped directory
1. Add "lfs setdirstripe -i -c" to create striped
directory.
2. client send create request to the master MDT, which
will allocate FIDs and create slaves. for all of slaves.
3. Client needs to revalidate slaves during intent getattr
and open request.
4. lmv_stripe_md will include attributes(size, nlink etc)
from all of stripe, which will be protected by UPDATE lock.
client needs to merge these attributes when update inode.
5. send create request to the MDT where the file is located,
which can help creating master stripe of striped directory.
Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I7ac560e39dcb415e310dc5e6ade531d76227ffae
Reviewed-on: http://review.whamcloud.com/7196
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Here are the results for the two builds:
[e19b51372ad94818a7a79b1fbae5b55c665ba59f] LU-4196 build: Reenable OFED-3.5 support on SLES11
SUMMARY: (of 3 iterations)
Operation               Max         Min        Mean     Std Dev
---------               ---         ---        ----     -------
File creation :   88017.734   82493.520   84509.613    2489.825
File stat     :  151997.740  142649.444  148656.020    4256.270
File read     :  162847.716  154605.697  158734.759    3364.809
File removal  :   84993.971   78063.127   80600.677    3118.973
Tree creation :    3692.169    2931.030    3262.108     318.519
Tree removal  :      51.245      47.758      49.678       1.446
V-1: Entering timestamp...
[5f3e926ac9ff8ad134ad920d0e8545e16395ef3b] LU-3529 lod: create striped directory
SUMMARY: (of 3 iterations)
Operation               Max         Min        Mean     Std Dev
---------               ---         ---        ----     -------
File creation :   66695.074   66242.158   66524.596     201.138
File stat     :  152026.405  143681.866  148948.529    3741.740
File read     :  165470.364  163291.085  164307.211     895.740
File removal  :   86953.641   82117.776   84285.373    2005.726
Tree creation :    4165.148    2841.669    3603.150     558.417
Tree removal  :      59.119      52.690      55.581       2.664
V-1: Entering timestamp...
Even without DNE, we see a 25% performance regression. |
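The size of the drop can be checked directly from the Mean column. The following is a minimal sketch, using only the File creation means quoted in the description; the helper name `pct_change` is hypothetical and not part of mdtest or Lustre.

```python
def pct_change(baseline: float, candidate: float) -> float:
    """Relative change of candidate vs. baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0

# Mean file-creation rates copied from the two SUMMARY blocks in the description.
before = 84509.613  # build at e19b5137 (LU-4196), before LU-3529
after = 66524.596   # build at 5f3e926a (LU-3529), the first bad commit

creation_delta = pct_change(before, after)
print(f"File creation mean: {creation_delta:.1f}%")
```

On the means this works out to roughly a 21% drop; applying the same helper to the Max column (88017.734 vs. 66695.074) gives roughly 24%.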
| Comments |
| Comment by Peter Jones [ 07/Aug/17 ] |
|
Lai Can you please advise? Thanks Peter |
| Comment by Di Wang [ 07/Aug/17 ] |
|
Ihara, Thanks for finding this, but I could not find anything in the patch that adds overhead to the open/create path in the non-DNE case (except adding a few bytes to the LOD object, which should not cause such a big performance drop). Could you please try client (without patch) vs. server (with patch), and client (with patch) vs. server (without patch), to see which side causes the trouble? Does mdtest support zero-stripe creation? If not, could you please try mdsrate (lustre/tests/mpi/mdsrate.c) to see if mknod performance also drops? Thanks |
| Comment by Shuichi Ihara (Inactive) [ 07/Aug/17 ] |
|
I don't think this is a client-side problem. My client version has been Lustre-2.5 for all testing, which means the patch is not included on the client side. I have also tested a Lustre-2.7 client (which should include the patch) against servers with and without the patch, but the behavior is the same and there is still a ~20% regression.
[e19b51372ad94818a7a79b1fbae5b55c665ba59f] LU-4196 build: Reenable OFED-3.5 support on SLES11
SUMMARY: (of 3 iterations)
Operation               Max         Min        Mean     Std Dev
---------               ---         ---        ----     -------
File creation :   82368.780   82025.716   82246.435     156.378
File stat     :  155345.312  149657.195  152438.609    2323.853
File read     :  153267.971  146148.288  150669.345    3208.767
File removal  :   84368.998   81469.899   82617.512    1258.223
Tree creation :    4236.671    1364.445    3221.788    1315.225
Tree removal  :      58.109      57.026      57.622       0.449
V-1: Entering timestamp...
[5f3e926ac9ff8ad134ad920d0e8545e16395ef3b] LU-3529 lod: create striped directory
SUMMARY: (of 3 iterations)
Operation               Max         Min        Mean     Std Dev
---------               ---         ---        ----     -------
File creation :   71733.913   69104.187   70276.415    1092.367
File stat     :  154928.974  143628.505  150433.332    4893.833
File read     :  150986.271  149548.465  150302.120     589.036
File removal  :   90852.671   83976.575   86282.682    3231.517
Tree creation :    3938.314    1353.438    2962.975    1146.604
Tree removal  :      57.287      54.732      56.202       1.078
V-1: Entering timestamp...
Let me try zero-striping. |
| Comment by Shuichi Ihara (Inactive) [ 07/Aug/17 ] |
|
BTW, the regression I reported is for file (maybe directory too) creation in a single shared directory. With a unique working directory per task (mdtest -u), file creation does not regress:
mpirun -np 128 -ppn 4 -hostfile ./hostfile.32 mdtest -n 5000 -v -d /scratch/mdtest.out -p 30 -i 3 -F -u
[e19b51372ad94818a7a79b1fbae5b55c665ba59f] LU-4196 build: Reenable OFED-3.5 support on SLES11
SUMMARY: (of 3 iterations)
Operation               Max         Min        Mean     Std Dev
---------               ---         ---        ----     -------
File creation :  102493.883   96797.852   99298.922    2376.595
File stat     :  334530.859  322827.499  329729.583    5003.479
File read     :  153050.474  150524.026  152070.676    1106.564
File removal  :  106763.376  102263.541  104794.116    1879.438
Tree creation :     456.002     280.988     389.842      77.565
Tree removal  :      33.068      32.685      32.915       0.166
V-1: Entering timestamp...
[5f3e926ac9ff8ad134ad920d0e8545e16395ef3b] LU-3529 lod: create striped directory
SUMMARY: (of 3 iterations)
Operation               Max         Min        Mean     Std Dev
---------               ---         ---        ----     -------
File creation :  103803.524   99922.978  101450.587    1688.300
File stat     :  340341.346  331701.230  335322.537    3663.119
File read     :  150589.967  148096.413  149200.655    1037.755
File removal  :  102280.943   93578.046   97385.654    3635.234
Tree creation :     312.007      43.829     221.429     125.590
Tree removal  :      32.941      32.516      32.795       0.198
V-1: Entering timestamp...
So it seems something is wrong with shared-directory operations. |
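To compare runs like these without retyping numbers, the SUMMARY blocks can be parsed into a dictionary. This is a minimal sketch that assumes the row layout of the mdtest output quoted in this ticket (`<operation> : <max> <min> <mean> <std dev>`); `ROW` and `parse_summary` are hypothetical names, not part of mdtest.

```python
import re

# One SUMMARY row: "<operation> : <max> <min> <mean> <std dev>"
ROW = re.compile(
    r"^[ \t]*([A-Za-z][A-Za-z ]*?)[ \t]*:[ \t]*"
    r"([\d.]+)[ \t]+([\d.]+)[ \t]+([\d.]+)[ \t]+([\d.]+)[ \t]*$",
    re.MULTILINE,
)

def parse_summary(text: str) -> dict:
    """Map each operation name to its max/min/mean/std values."""
    out = {}
    for name, mx, mn, mean, std in ROW.findall(text):
        out[name] = {
            "max": float(mx),
            "min": float(mn),
            "mean": float(mean),
            "std": float(std),
        }
    return out

# Two rows copied from the LU-3529 run in the comment above.
sample = """\
SUMMARY: (of 3 iterations)
Operation Max Min Mean Std Dev
--------- --- --- ---- -------
File creation : 103803.524 99922.978 101450.587 1688.300
Tree creation : 312.007 43.829 221.429 125.590
"""
rows = parse_summary(sample)
print(rows["Tree creation"]["mean"])
```

The header and separator lines are skipped because they contain no colon followed by four numeric fields.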
| Comment by Di Wang [ 08/Aug/17 ] |
|
Hello, Ihara Thanks for this information. Unfortunately I do not have a 2.5 environment myself.
--- a/lustre/lod/lod_lov.c
+++ b/lustre/lod/lod_lov.c
@@ -860,17 +860,6 @@ int lod_load_striping(const struct lu_env *env, struct lod_object *lo)
info->lti_buf.lb_buf = info->lti_ea_store;
info->lti_buf.lb_len = info->lti_ea_store_size;
rc = lod_parse_striping(env, lo, &info->lti_buf);
- } else if (lu_object_attr(lod2lu_obj(lo)) & S_IFDIR) {
- rc = lod_get_lmv_ea(env, lo);
- if (rc <= 0)
- GOTO(out, rc);
- /*
- * there is LOV EA (striping information) in this object
- * let's parse it and create in-core objects for the stripes
- */
- info->lti_buf.lb_buf = info->lti_ea_store;
- info->lti_buf.lb_len = info->lti_ea_store_size;
- rc = lod_parse_dir_striping(env, lo, &info->lti_buf);
}
out:
dt_write_unlock(env, next);
Could you also please collect a debug log (-1 mask) on the MDS and post it here? Btw: which kernel do you use for the test? rhel-6.3? |
| Comment by Di Wang [ 08/Aug/17 ] |
|
Ihara |
| Comment by Shuichi Ihara (Inactive) [ 09/Aug/17 ] |
|
WangDi, Sure, I understand what you pointed out. I will test them and get back to you soon. |
| Comment by Shuichi Ihara (Inactive) [ 14/Aug/17 ] |
The patch improved file creation by ~15%, but a ~5% performance regression remains. The patch helped file creation, but it introduced a regression on unlink.
[e19b51372ad94818a7a79b1fbae5b55c665ba59f] LU-4196 build: Reenable OFED-3.5 support on SLES11
SUMMARY: (of 3 iterations)
Operation               Max         Min        Mean     Std Dev
---------               ---         ---        ----     -------
File creation :   85568.624   84412.561   84814.381     533.711
File stat     :  156095.695  145840.993  152336.768    4612.121
File read     :  161833.820  157226.792  160061.457    2025.264
File removal  :   89463.742   83852.552   85970.898    2488.413
Tree creation :    4185.932    2597.092    3482.057     661.160
Tree removal  :      58.671      47.865      52.494       4.545
V-1: Entering timestamp...
[5f3e926ac9ff8ad134ad920d0e8545e16395ef3b] LU-3529 lod: create striped directory
SUMMARY: (of 3 iterations)
Operation               Max         Min        Mean     Std Dev
---------               ---         ---        ----     -------
File creation :   72231.370   71757.985   71941.350     207.453
File stat     :  156218.994  146766.402  152100.221    3953.490
File read     :  151082.612  148981.339  149687.568     986.470
File removal  :   86520.243   85361.115   85847.648     491.161
Tree creation :    3731.587    1801.677    3001.790     855.194
Tree removal  :      59.288      55.627      57.821       1.581
V-1: Entering timestamp...
[5f3e926ac9ff8ad134ad920d0e8545e16395ef3b] LU-3529 lod: create striped directory + patch
SUMMARY: (of 3 iterations)
Operation               Max         Min        Mean     Std Dev
---------               ---         ---        ----     -------
File creation :   82139.647   80821.907   81312.553     588.222
File stat     :  157364.652  151950.461  153937.210    2433.799
File read     :  153314.454  151299.630  152332.901     823.361
File removal  :   79767.310   77557.914   78420.017     965.027
Tree creation :    3905.311    3134.756    3433.015     337.792
Tree removal  :      53.836      50.449      51.712       1.511
V-1: Entering timestamp... |
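The three builds in this comment can be lined up against the baseline. The sketch below uses only the Mean values for File creation and File removal from the three SUMMARY blocks above; the dictionary keys and the helper `vs_baseline` are hypothetical labels chosen for illustration.

```python
BASELINE = "baseline (e19b5137)"

# Mean rates copied from the three SUMMARY blocks in this comment.
means = {
    BASELINE:             {"File creation": 84814.381, "File removal": 85970.898},
    "LU-3529 (5f3e926a)": {"File creation": 71941.350, "File removal": 85847.648},
    "LU-3529 + patch":    {"File creation": 81312.553, "File removal": 78420.017},
}

def vs_baseline(op: str, build: str) -> float:
    """Percent change of one build's mean rate relative to the baseline build."""
    base = means[BASELINE][op]
    return (means[build][op] - base) / base * 100.0

for build in means:
    if build == BASELINE:
        continue
    for op in ("File creation", "File removal"):
        print(f"{build:20s} {op:14s} {vs_baseline(op, build):+6.1f}%")
```

On these means, the patched build recovers most of the file creation loss, while file removal falls below both other builds, which is consistent with the unlink regression noted in the comment.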
| Comment by Di Wang [ 14/Aug/17 ] |
|
Ihara: Could you please try this patch? Thanks |
| Comment by Di Wang [ 22/Aug/17 ] |
|
Any news? |
| Comment by Shuichi Ihara (Inactive) [ 12/Sep/17 ] |
|
WangDi, |
| Comment by Di Wang [ 12/Sep/17 ] |
|
Sure, Ihara |
| Comment by Gerrit Updater [ 13/Sep/17 ] |
|
wangdi (di.wang@intel.com) uploaded a new patch: https://review.whamcloud.com/28962 |
| Comment by Shuichi Ihara (Inactive) [ 13/Sep/17 ] |
|
Thanks WangDi. I did a quick test of your patches against the latest master. Test case: mdtest to a shared directory, 32 clients, 128 processes and 10,000 files per process, which means 1.28 million files in total.
master: 1fc4ed3ac40ab0e11b1c59d7d147a100636cbda0 without any patches.
SUMMARY: (of 3 iterations)
Operation               Max         Min        Mean     Std Dev
---------               ---         ---        ----     -------
File creation :   87877.948   77161.656   81981.331    4440.689
File stat     :  146500.548  139206.464  143471.400    3103.363
File read     :  157850.695  155719.694  157052.803     948.731
File removal  :  109180.271  103474.828  105885.648    2411.618
Tree creation :    4812.597    1625.810    3218.841    1301.001
Tree removal  :      41.205      38.867      39.730       1.048
V-1: Entering timestamp...
SUMMARY: (of 3 iterations)
Operation               Max         Min        Mean     Std Dev
---------               ---         ---        ----     -------
File creation :   94691.817   87748.601   90106.652    3242.642
File stat     :  145824.020  139999.453  142233.530    2564.019
File read     :  154804.912  152446.911  153784.582     988.457
File removal  :  110741.797  105265.586  107834.815    2248.374
Tree creation :    4708.453    2226.195    3100.325    1138.556
Tree removal  :      40.618      21.520      33.672       8.622
V-1: Entering timestamp...
Somehow, the master branch without patches is a bit better than the lustre-2.7 results we saw before. The gap between 2.5 and current master is small. |
| Comment by Di Wang [ 13/Sep/17 ] |
|
Ihara, thanks for testing. Hmm, I did not expect master to be better here; maybe there are some other optimizations. Anyway, it is good news. Could you please also try https://jira.hpdd.intel.com/secure/attachment/28000/28000_LU-9840.patch directly on top of [5f3e926ac9ff8ad134ad920d0e8545e16395ef3b] |
| Comment by Gerrit Updater [ 21/Sep/17 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/28962/ |
| Comment by Peter Jones [ 21/Sep/17 ] |
|
This patch has landed to master, but is there more work still to come? |
| Comment by Peter Jones [ 21/Sep/17 ] |
|
As per Ihara - this issue is resolved by the patch |
| Comment by Gerrit Updater [ 21/Sep/17 ] |
|
Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/29143 |
| Comment by Gerrit Updater [ 15/Nov/17 ] |
|
John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/29143/ |