[LU-276] ost-pools: test-18 Degradation with missing pool is 26.07 % (> 15 %) Created: 04/May/11 Updated: 07/Jan/16 Resolved: 07/Jan/16 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.1.0, Lustre 2.2.0, Lustre 2.1.1 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Jinshan Xiong (Inactive) | Assignee: | Hongchao Zhang |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Bugzilla ID: | 23408 |
| Rank (Obsolete): | 5036 |
| Description |
|
We've seen this bug again, although it was supposed to be fixed in bug 23408. We need to figure out why it regressed. There is also an alignment problem in the code of ost-pools:test-18 that needs to be fixed. |
| Comments |
| Comment by Johann Lombardi (Inactive) [ 04/May/11 ] |
|
FWIW, Wangdi landed a patch on master to fix a problem in qos_remedy_create(), which could allocate objects outside the pool. |
| Comment by Peter Jones [ 04/May/11 ] |
|
HongChao, this is causing failures in some of the automated test runs, so could you please look into it as a priority? Thanks, Peter |
| Comment by Hongchao Zhang [ 05/May/11 ] |
|
I have taken this and started working on it. |
| Comment by Hongchao Zhang [ 10/May/11 ] |
|
There are several cases that lead to performance degradation when using a pool, but local tests show no big difference between these 3 cases (without pool, wide pool, missing pool). The current file count, 9877, is a little small; how about increasing it to a bigger value to lessen the effect of those |
| Comment by Hongchao Zhang [ 12/May/11 ] |
|
The tests on Toro show the same result; I will create a patch to increase the count of |
| Comment by Build Master (Inactive) [ 17/May/11 ] |
|
Integrated in Oleg Drokin : 016c5a0f6e7307a2a3e05eafa8a36ac16b209643
|
| Comment by Peter Jones [ 17/May/11 ] |
|
Fix landed for 2.1. Please reopen if the issue recurs or more work is still required. |
| Comment by Jian Yu [ 28/Jul/11 ] |
|
Lustre Clients: Lustre Servers:
ost-pools test 18 failed with a similar issue:
Avg time taken for 9877 creates without pool: 16.32
Avg time taken for 9877 creates with pool: 19.20
Avg time taken for 9877 creates with missing pool: 19.40
No pool to wide pool: 17.64 %.
ost-pools test_18: @@@@@@ IGNORE (bz23408): Degradation with wide pool is 17.64 % (> 15 %)
Maloo report: https://maloo.whamcloud.com/test_sets/02ffc662-b91b-11e0-8bdf-52540025f9af |
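The percentage in the report above can be reproduced from the logged averages. The following is a minimal sketch, assuming the degradation is the relative increase over the no-pool average and that the result is truncated (not rounded) to two decimals, which is what the logged figures suggest. The function name `degradation_pct` is illustrative and not taken from the ost-pools test script.

```python
from decimal import Decimal, ROUND_DOWN


def degradation_pct(baseline: str, measured: str) -> Decimal:
    """Relative slowdown of `measured` versus `baseline`, in percent.

    Assumption: the value is truncated to two decimals, as the logged
    numbers appear to be.
    """
    base = Decimal(baseline)
    pct = (Decimal(measured) - base) * 100 / base
    return pct.quantize(Decimal("0.01"), rounding=ROUND_DOWN)


# Averages from the report above: no pool vs. wide pool.
wide = degradation_pct("16.32", "19.20")
print(f"No pool to wide pool: {wide} %, over 15 % limit: {wide > 15}")
```

With the averages logged here, the sketch yields the same 17.64 % that the test reported against its 15 % limit.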
| Comment by Hongchao Zhang [ 29/Jul/11 ] |
|
The file count is still 9877 in the new occurrence because the patch was only landed on master, but the test was run on b1_8. |
| Comment by Jian Yu [ 29/Jul/11 ] |
|
With "numfiles=30000", the test 18 still failed:
== test 18: File create in a directory which references a deleted pool == 00:50:15
Create performance, iteration 1, 30000 files x 3
total: 30000 creates in 50.11 seconds: 598.69 creates/second
iter 1: 30000 creates without pool: 50.11
fat-amd-1-ib: Pool lustre.pool1 created
fat-amd-1-ib: OST lustre-OST0000_UUID added to pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0001_UUID added to pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0002_UUID added to pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0003_UUID added to pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0004_UUID added to pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0005_UUID added to pool lustre.pool1
total: 30000 creates in 68.71 seconds: 436.59 creates/second
iter 1: 30000 creates with pool: 68.71
fat-amd-1-ib: OST lustre-OST0000_UUID removed from pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0001_UUID removed from pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0002_UUID removed from pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0003_UUID removed from pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0004_UUID removed from pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0005_UUID removed from pool lustre.pool1
fat-amd-1-ib: Pool lustre.pool1 destroyed
total: 30000 creates in 64.46 seconds: 465.39 creates/second
iter 1: 30000 creates with missing pool: 64.46
Create performance, iteration 2, 30000 files x 3
total: 30000 creates in 49.84 seconds: 601.93 creates/second
iter 2: 30000 creates without pool: 49.84
fat-amd-1-ib: Pool lustre.pool1 created
fat-amd-1-ib: OST lustre-OST0000_UUID added to pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0001_UUID added to pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0002_UUID added to pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0003_UUID added to pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0004_UUID added to pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0005_UUID added to pool lustre.pool1
total: 30000 creates in 61.67 seconds: 486.42 creates/second
iter 2: 30000 creates with pool: 61.67
fat-amd-1-ib: OST lustre-OST0000_UUID removed from pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0001_UUID removed from pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0002_UUID removed from pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0003_UUID removed from pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0004_UUID removed from pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0005_UUID removed from pool lustre.pool1
fat-amd-1-ib: Pool lustre.pool1 destroyed
total: 30000 creates in 69.00 seconds: 434.80 creates/second
iter 2: 30000 creates with missing pool: 69.00
Create performance, iteration 3, 30000 files x 3
total: 30000 creates in 50.25 seconds: 597.06 creates/second
iter 3: 30000 creates without pool: 50.25
fat-amd-1-ib: Pool lustre.pool1 created
fat-amd-1-ib: OST lustre-OST0000_UUID added to pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0001_UUID added to pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0002_UUID added to pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0003_UUID added to pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0004_UUID added to pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0005_UUID added to pool lustre.pool1
total: 30000 creates in 69.04 seconds: 434.50 creates/second
iter 3: 30000 creates with pool: 69.04
fat-amd-1-ib: OST lustre-OST0000_UUID removed from pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0001_UUID removed from pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0002_UUID removed from pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0003_UUID removed from pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0004_UUID removed from pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0005_UUID removed from pool lustre.pool1
fat-amd-1-ib: Pool lustre.pool1 destroyed
total: 30000 creates in 63.49 seconds: 472.49 creates/second
iter 3: 30000 creates with missing pool: 63.49
Avg time taken for 30000 creates without pool: 50.06
Avg time taken for 30000 creates with pool: 66.47
Avg time taken for 30000 creates with missing pool: 65.65
No pool to wide pool: 32.78 %.
ost-pools test_18: @@@@@@ IGNORE (bz23408): Degradation with wide pool is 32.78 % (> 15 %)
Dumping lctl log to /home/yujian/test_logs/2011-07-29/004914/ost-pools.test_18.*.1311927141.log
No pool to missing pool: 31.14 %.
ost-pools test_18: @@@@@@ IGNORE (bz23408): Degradation with missing pool is 31.14 % (> 30 %)
Dumping lctl log to /home/yujian/test_logs/2011-07-29/004914/ost-pools.test_18.*.1311927144.log
Resetting fail_loc on all nodes...done.
Maloo report: https://maloo.whamcloud.com/test_sets/3f75d49c-b9bb-11e0-8bdf-52540025f9af |
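The "Avg time taken" figures in this log can be cross-checked against the per-iteration lines. The sketch below is an illustration only: the regular expression and the `average_times` helper are hypothetical, written against the log format shown in this comment, and the truncation to two decimals is an assumption inferred from the logged averages.

```python
import re
from collections import defaultdict
from decimal import Decimal, ROUND_DOWN

# Matches lines such as "iter 1: 30000 creates with missing pool: 64.46".
ITER_LINE = re.compile(
    r"iter (\d+): (\d+) creates "
    r"(without pool|with pool|with missing pool): ([\d.]+)"
)


def average_times(log: str) -> dict:
    """Average the per-iteration create times for each of the three cases."""
    times = defaultdict(list)
    for _iteration, _count, case, seconds in ITER_LINE.findall(log):
        times[case].append(Decimal(seconds))
    return {
        # Assumption: averages are truncated, not rounded, to two decimals.
        case: (sum(vals) / len(vals)).quantize(Decimal("0.01"), rounding=ROUND_DOWN)
        for case, vals in times.items()
    }


# Per-iteration lines copied from the log in this comment.
LOG = """\
iter 1: 30000 creates without pool: 50.11
iter 1: 30000 creates with pool: 68.71
iter 1: 30000 creates with missing pool: 64.46
iter 2: 30000 creates without pool: 49.84
iter 2: 30000 creates with pool: 61.67
iter 2: 30000 creates with missing pool: 69.00
iter 3: 30000 creates without pool: 50.25
iter 3: 30000 creates with pool: 69.04
iter 3: 30000 creates with missing pool: 63.49
"""

for case, avg in average_times(LOG).items():
    print(f"Avg time taken for 30000 creates {case}: {avg}")
```

Under those assumptions the three averages come out as 50.06, 66.47 and 65.65 seconds, matching the summary lines in the log.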
| Comment by Hongchao Zhang [ 29/Jul/11 ] |
|
it's related to 1.8< |
| Comment by Peter Jones [ 01/Aug/11 ] |
|
So, is the action here to land this fix on the 1.8.x branch? |
| Comment by Jian Yu [ 01/Aug/11 ] |
Unfortunately, the fix does not resolve the issue under the 1.8<->2.1 interop configuration. I think Hongchao is still investigating. |
| Comment by Jian Yu [ 04/Aug/11 ] |
|
Hi Hongchao,
Here is the configuration:
Lustre Clients:
Branch: b1_8
Distro/Arch: RHEL6/x86_64 (kernel version: 2.6.32_131.2.1.el6)
Build: http://newbuild.whamcloud.com/job/lustre-b1_8/119/arch=x86_64,build_type=client,distro=el6,ib_stack=inkernel/
Lustre Servers:
Branch: master
Distro/Arch: RHEL6/x86_64 (kernel version: 2.6.32-131.6.1.el6_lustre)
Build: http://newbuild.whamcloud.com/job/lustre-master/240/arch=x86_64,build_type=server,distro=el6,ib_stack=inkernel/
Network: IB (inkernel OFED)
ENABLE_QUOTA=yes |
| Comment by Jian Yu [ 28/Aug/11 ] |
|
Lustre Clients: Lustre Servers: The same issue occurred: https://maloo.whamcloud.com/test_sets/9207414a-cf7e-11e0-8d02-52540025f9af |
| Comment by Sarah Liu [ 30/Jan/12 ] |
|
Hit the same issue when running an interop test between 1.8.7-wc1 and 2.1.55. |
| Comment by Peter Jones [ 08/Feb/12 ] |
|
I believe Andreas is fixing this under |
| Comment by Hongchao Zhang [ 09/Feb/12 ] |
|
This issue could be related to the config being reprocessed after the config changes caused by the pool operations, |
| Comment by Hongchao Zhang [ 12/Feb/12 ] |
|
the patch is tracked at http://review.whamcloud.com/#change,2136 |
| Comment by Jian Yu [ 16/Feb/12 ] |
|
Lustre Clients: Lustre Servers: Tag: v2_1_1_0_RC2
The same issue occurred: https://maloo.whamcloud.com/test_sets/32ec951e-587e-11e1-a226-5254004bbbd3 |
| Comment by John Fuchs-Chesney (Inactive) [ 07/Jan/16 ] |
|
Marking this as resolved/incomplete, given the length of time since it was last updated. If anyone disagrees, or would like this ticket re-opened, please let us know. |