[LU-3782] Division by zero in ost-pools 18 Created: 20/Aug/13 Updated: 17/May/16 Resolved: 14/Mar/16 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.9.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Alexander Lezhoev | Assignee: | WC Triage |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch | ||
| Environment: |
4-node virtual cluster, 2 OSTs of 700 MB each. |
||
| Severity: | 3 |
| Rank (Obsolete): | 9784 |
| Description |
|
ost-pools: create_perf() does not check whether files were actually created on each iteration:
create_perf() {
	...
	stat=$(createmany -o $cdir/${tfile} -$numsec | tail -1)
	files=$(echo $stat | cut -f 2 -d ' ')
	echo $stat 1>&2
	...
}
$numsec is fixed at 15 seconds. Test output:
== ost-pools test 18: File create in a directory which references a deleted pool == 15:47:58 (1371152878)
Create performance, iteration 1, 15 seconds x 3
total: 40940 creates in 14.28 seconds: 2867.34 creates/second
iter 1: 40940 creates without pool
mft51: Pool lustre.testpool created
mft51: OST lustre-OST0000_UUID added to pool lustre.testpool
mft51: OST lustre-OST0001_UUID added to pool lustre.testpool
total: 38563 creates in 14.42 seconds: 2674.14 creates/second
iter 1: 38563 creates with pool
mft51: OST lustre-OST0000_UUID removed from pool lustre.testpool
mft51: OST lustre-OST0001_UUID removed from pool lustre.testpool
mft51: Pool lustre.testpool destroyed
total: 43721 creates in 13.57 seconds: 3221.95 creates/second
iter 1: 43721 creates with missing pool
Create performance, iteration 2, 15 seconds x 3
total: 0 creates in 0.00 seconds: 0.00 creates/second
iter 2: 0 creates without pool
mft51: Pool lustre.testpool created
mft51: OST lustre-OST0000_UUID added to pool lustre.testpool
mft51: OST lustre-OST0001_UUID added to pool lustre.testpool
total: 0 creates in 0.00 seconds: 0.00 creates/second
iter 2: 0 creates with pool
mft51: OST lustre-OST0000_UUID removed from pool lustre.testpool
mft51: OST lustre-OST0001_UUID removed from pool lustre.testpool
mft51: Pool lustre.testpool destroyed
total: 0 creates in 0.00 seconds: 0.00 creates/second
iter 2: 0 creates with missing pool
Create performance, iteration 3, 15 seconds x 3
total: 0 creates in 0.00 seconds: 0.00 creates/second
iter 3: 0 creates without pool
mft51: Pool lustre.testpool created
mft51: OST lustre-OST0000_UUID added to pool lustre.testpool
mft51: OST lustre-OST0001_UUID added to pool lustre.testpool
total: 0 creates in 0.00 seconds: 0.00 creates/second
iter 3: 0 creates with pool
mft51: OST lustre-OST0000_UUID removed from pool lustre.testpool
mft51: OST lustre-OST0001_UUID removed from pool lustre.testpool
mft51: Pool lustre.testpool destroyed
total: 0 creates in 0.00 seconds: 0.00 creates/second
iter 3: 0 creates with missing pool
Avg files created in 15 seconds without pool: 0
Avg files created in 15 seconds with pool: 0
Avg files created in 15 seconds missing pool: 0
/usr/lib64/lustre/tests/ost-pools.sh: line 1000: (0 - 0) * 100 / 0: division by 0 (error token is "0")
test_18 returned 1
|
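The failing expression in the log has the shape `(a - b) * 100 / c`, and shell integer arithmetic aborts when the divisor is 0. The following is a minimal sketch of guarding such a calculation. The function name `pct_diff` and its argument order are assumptions made for illustration; they are not taken from ost-pools.sh.

```shell
#!/bin/bash
# Hypothetical helper (not from ost-pools.sh): percentage difference of
# a value relative to a baseline, using integer arithmetic.
# Returns non-zero instead of letting the shell abort on a zero baseline.
pct_diff() {
	local base=$1 val=$2

	if [ "$base" -eq 0 ]; then
		echo "baseline is 0 creates, cannot compute a percentage" 1>&2
		return 1
	fi
	# Same shape as the expression in the error message: (a - b) * 100 / c
	echo $(( (val - base) * 100 / base ))
}

# Values from iteration 1 of the log: without pool vs. with pool.
pct_diff 40940 38563
# The zero-baseline case from the log is reported instead of crashing.
pct_diff 0 0 || echo "skipped: no files were created"
```

Whether a zero baseline should be skipped or treated as a test failure is a separate decision; the sketch only shows how to avoid the arithmetic error.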
| Comments |
| Comment by Keith Mannthey (Inactive) [ 20/Aug/13 ] |
|
Are you intending to submit a patch or just reporting the issue? |
| Comment by Alexander Lezhoev [ 21/Aug/13 ] |
|
Keith, because the current design (use fixed time instead of number of files) appears after |
| Comment by Andreas Dilger [ 21/Aug/13 ] |
|
Probably the best solution is to change createmany to allow handling both a time limit and maximum file count, and exit when either condition is hit. This would best be done by parsing named options instead of making more confusing positional parameter combinations. |
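As a rough illustration of the "exit when either condition is hit" idea with named options, here is a shell stand-in. It is not createmany (which is a separate program), and the function name `create_until` and the options `-d`, `-t` and `-n` are hypothetical names chosen for this sketch.

```shell
#!/bin/bash
# Hypothetical stand-in for the proposed behaviour, NOT the real createmany.
# Made-up named options: -d DIR, -t SECONDS (time limit), -n MAX (file cap).
create_until() {
	local OPTIND=1 opt dir=. secs=15 max=1000 count=0 end

	while getopts "d:t:n:" opt; do
		case $opt in
		d) dir=$OPTARG ;;
		t) secs=$OPTARG ;;
		n) max=$OPTARG ;;
		*) return 2 ;;
		esac
	done

	end=$(( $(date +%s) + secs ))
	# Stop as soon as either the deadline passes or the file cap is reached.
	while [ "$(date +%s)" -lt "$end" ] && [ "$count" -lt "$max" ]; do
		touch "$dir/f$count" || break
		count=$((count + 1))
	done
	echo "total: $count creates"
}

# Example call (commented out so that sourcing this file creates nothing):
# create_until -d /tmp/somedir -t 2 -n 100
```

Named options keep the two limits independent, so a caller can give either one or both without relying on argument position.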
| Comment by Kirtan Shetty (Inactive) [ 07/Sep/15 ] |
|
Andreas Dilger, I have put the option of a maximum file count in the test, but can you please elaborate on how this will help us with this issue? |
| Comment by Andreas Dilger [ 01/Oct/15 ] |
|
You are correct - I don't think my suggestion will help in this case, because the test would still exit after 15s even if a (maximum) number of files was specified. I was thinking of the more normal case where we want to run a test workload for a maximum amount of time, but not create too many files if the MDS is very fast. I guess the next question is what was wrong with the system that it couldn't create any files in 90s? Is the filesystem out of inodes? Is the MDS or OSS down or in recovery? I'd probably consider it a test error if createmany wasn't able to create any files at all for this test. |
| Comment by Kirtan Shetty (Inactive) [ 05/Oct/15 ] |
|
OK, got it. So for now, just a check for when zero files are created should be enough, right? |
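A check of this kind could look roughly like the sketch below. It reuses the `cut` parsing from the create_perf() snippet in the description; the function name `check_creates` and its return codes are assumptions for illustration, and this is not the content of the patch that was eventually merged.

```shell
#!/bin/bash
# Hypothetical helper: extract the create count from a createmany summary
# line such as
#   "total: 40940 creates in 14.28 seconds: 2867.34 creates/second"
# and fail when nothing was created.
check_creates() {
	local stat=$1 files

	# Same parsing as in create_perf(): second space-separated field.
	files=$(echo "$stat" | cut -f 2 -d ' ')
	case $files in
	''|*[!0-9]*)
		echo "cannot parse create count from: $stat" 1>&2
		return 2 ;;
	esac
	if [ "$files" -eq 0 ]; then
		echo "createmany created no files" 1>&2
		return 1
	fi
	echo "$files"
}

check_creates "total: 40940 creates in 14.28 seconds: 2867.34 creates/second"
check_creates "total: 0 creates in 0.00 seconds: 0.00 creates/second" ||
	echo "iteration would be treated as a test error"
```

How the failure is reported to the test framework is left out here; the point is only that a zero count is caught before any average or percentage is computed from it.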
| Comment by Gerrit Updater [ 26/Oct/15 ] |
|
kirtan.shetty (kirtan.shetty@seagate.com) uploaded a new patch: http://review.whamcloud.com/16939 |
| Comment by Gerrit Updater [ 14/Mar/16 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/16939/ |