[LU-248] port bug24194 to master (Still seeing inconsistencies in OST allocation) Created: 28/Apr/11  Updated: 01/Jun/11  Resolved: 01/Jun/11

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 1.8.6
Fix Version/s: Lustre 2.1.0

Type: Bug Priority: Minor
Reporter: Hongchao Zhang Assignee: Hongchao Zhang
Resolution: Fixed Votes: 0
Labels: None

Severity: 3
Bugzilla ID: 24194
Rank (Obsolete): 5015

 Description   

Port the patch from 1.8.6 to master (2.1.0).



 Comments   
Comment by Peter Jones [ 10/May/11 ]

Hongchao, it would be a good idea to copy the main details of the issue from the bz ticket in case bz is unavailable when you want to work on this.

Comment by Hongchao Zhang [ 10/May/11 ]

After qos_threshold_rr was set to 100%, OSTs are still not allocated in a consistent round-robin order.

Each IOR test creates 48 to 240 files (multiples of 12) in a directory called multdir with:

lfs setstripe multdir -s 1m -c 1

After creating the files, we look at the output of 'lfs getstripe multdir' and add up the number of files assigned to each OST.
Most of the time the number of objects per OST is uniform, but about 1 in 10 times it is not. For example:

++++ Results are listed by OST
++ OST OSS Objects assigned
0 0 16
1 3 16
2 0 16
3 3 15 <---
4 0 16
5 3 16
6 0 16
7 3 16
8 0 16
9 3 17 <---
10 0 16
11 3 16
++ Total number of OST's assigned is 12
++ Total number of objects assigned is 192
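
The tally above can be checked mechanically. The following is a minimal sketch, assuming the per-OST object counts have already been extracted from the 'lfs getstripe' output by hand or by another script; the function name find_uneven_osts is hypothetical and not part of any Lustre tool. It only flags OSTs whose count differs from the even share, so it is meaningful when the file count is a multiple of the OST count, as in these tests.

```python
def find_uneven_osts(objects_per_ost):
    """Return {ost_index: count} for OSTs that differ from the even share.

    objects_per_ost maps OST index -> number of objects assigned to it.
    Hypothetical helper for illustration only.
    """
    total = sum(objects_per_ost.values())
    expected = total / len(objects_per_ost)  # even share per OST
    return {ost: n for ost, n in objects_per_ost.items() if n != expected}


# Counts copied from the result listing above (OST index -> objects).
observed = {0: 16, 1: 16, 2: 16, 3: 15, 4: 16, 5: 16,
            6: 16, 7: 16, 8: 16, 9: 17, 10: 16, 11: 16}

uneven = find_uneven_osts(observed)
print(uneven)
```

For the listing above, only OST 3 and OST 9 are reported, matching the two marked rows.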

The cause of the above problem is that other create operations run during the IOR test. OST objects are allocated
in the MDS-LOV, so the allocation is affected by creations from all clients. If any other creation happens before this test, the re-seed mechanism will cause
inconsistent OST allocation, e.g.

If the OST count is 12, lqr_start_count will be (1000/12 + 4)*12 = 1044 (with integer division, 1000/12 = 83).
Suppose lqr_start_idx=2 and one file is created before the test, so lqr_start_idx increases to 3.

Now run the test, creating 1044 files (87 objects per OST). When the 1044th file is created,
the re-seed is triggered and lqr_start_idx is reset to another random value (say, 5).
The final OST object allocation will be:
OST0 87
OST1 87
OST2 86 <-- the re-seed point
OST3 87
OST4 87
OST5 88 <-- the new lqr_start_idx
OST6 87
OST7 87
OST8 87
OST9 87
OST10 87
OST11 87
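
The example above can be reproduced with a small simulation. This is only a sketch of the behaviour as described in this comment, not the code in lustre/lov/lov_qos.c: it assumes a counter initialised to (1000/count + 4)*count that decreases by one per allocation, and a start index that is re-seeded once the counter reaches zero. The function names and the reseed_idx parameter (which stands in for the random re-seed value) are hypothetical.

```python
def reseed_window(ost_count):
    """Allocations between re-seeds, per the formula quoted above."""
    return (1000 // ost_count + 4) * ost_count


def simulate(ost_count, start_idx, files_before, files_in_test, reseed_idx):
    """Count objects per OST for the test files only.

    Assumed model: plain round-robin from start_idx; once the window is
    used up, the next allocation restarts from reseed_idx.
    """
    remaining = reseed_window(ost_count)
    idx = start_idx
    counts = [0] * ost_count
    for n in range(files_before + files_in_test):
        if remaining == 0:
            # Window exhausted: re-seed the start index (fixed here, random in reality).
            idx = reseed_idx
            remaining = reseed_window(ost_count)
        if n >= files_before:
            counts[idx] += 1  # only files created by the test are tallied
        idx = (idx + 1) % ost_count
        remaining -= 1
    return counts


counts = simulate(ost_count=12, start_idx=2, files_before=1,
                  files_in_test=1044, reseed_idx=5)
for ost, n in enumerate(counts):
    print(f"OST{ost} {n}")
```

With one file created before the test, the window runs out one allocation early from the test's point of view, so OST2 ends up one object short and OST5 (the new start index) gets one extra, as in the listing above. With files_before=0 the same run yields 87 objects on every OST.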

Note that the OST allocation is still consistent from the system-wide view, i.e. when creations from all clients are counted.

The patch increases the LOV reseed window to mitigate the issue.

Comment by Build Master (Inactive) [ 31/May/11 ]

Integrated in lustre-master » x86_64,client,el5,inkernel #143
LU-248 increase LOV reseed to mitigate OST allocation inconsitence

Oleg Drokin : 4f50bbfa20b8cfdf13c2647cfae057285df9e9e8
Files :

  • lustre/lov/lov_qos.c
Comment by Build Master (Inactive) [ 31/May/11 ]

Integrated in lustre-master » x86_64,client,el6,inkernel #143
LU-248 increase LOV reseed to mitigate OST allocation inconsitence

Oleg Drokin : 4f50bbfa20b8cfdf13c2647cfae057285df9e9e8
Files :

  • lustre/lov/lov_qos.c
Comment by Build Master (Inactive) [ 31/May/11 ]

Integrated in lustre-master » x86_64,client,sles11,inkernel #143
LU-248 increase LOV reseed to mitigate OST allocation inconsitence

Oleg Drokin : 4f50bbfa20b8cfdf13c2647cfae057285df9e9e8
Files :

  • lustre/lov/lov_qos.c
Comment by Build Master (Inactive) [ 31/May/11 ]

Integrated in lustre-master » i686,client,el5,inkernel #143
LU-248 increase LOV reseed to mitigate OST allocation inconsitence

Oleg Drokin : 4f50bbfa20b8cfdf13c2647cfae057285df9e9e8
Files :

  • lustre/lov/lov_qos.c
Comment by Build Master (Inactive) [ 31/May/11 ]

Integrated in lustre-master » i686,client,el5,ofa #143
LU-248 increase LOV reseed to mitigate OST allocation inconsitence

Oleg Drokin : 4f50bbfa20b8cfdf13c2647cfae057285df9e9e8
Files :

  • lustre/lov/lov_qos.c
Comment by Build Master (Inactive) [ 31/May/11 ]

Integrated in lustre-master » x86_64,client,ubuntu1004,inkernel #143
LU-248 increase LOV reseed to mitigate OST allocation inconsitence

Oleg Drokin : 4f50bbfa20b8cfdf13c2647cfae057285df9e9e8
Files :

  • lustre/lov/lov_qos.c
Comment by Build Master (Inactive) [ 31/May/11 ]

Integrated in lustre-master » i686,client,el6,inkernel #143
LU-248 increase LOV reseed to mitigate OST allocation inconsitence

Oleg Drokin : 4f50bbfa20b8cfdf13c2647cfae057285df9e9e8
Files :

  • lustre/lov/lov_qos.c
Comment by Build Master (Inactive) [ 31/May/11 ]

Integrated in lustre-master » x86_64,server,el5,inkernel #143
LU-248 increase LOV reseed to mitigate OST allocation inconsitence

Oleg Drokin : 4f50bbfa20b8cfdf13c2647cfae057285df9e9e8
Files :

  • lustre/lov/lov_qos.c
Comment by Build Master (Inactive) [ 31/May/11 ]

Integrated in lustre-master » x86_64,server,el6,inkernel #143
LU-248 increase LOV reseed to mitigate OST allocation inconsitence

Oleg Drokin : 4f50bbfa20b8cfdf13c2647cfae057285df9e9e8
Files :

  • lustre/lov/lov_qos.c
Comment by Build Master (Inactive) [ 31/May/11 ]

Integrated in lustre-master » x86_64,client,ubuntu1004,ofa #143
LU-248 increase LOV reseed to mitigate OST allocation inconsitence

Oleg Drokin : 4f50bbfa20b8cfdf13c2647cfae057285df9e9e8
Files :

  • lustre/lov/lov_qos.c
Comment by Build Master (Inactive) [ 31/May/11 ]

Integrated in lustre-master » i686,server,el5,inkernel #143
LU-248 increase LOV reseed to mitigate OST allocation inconsitence

Oleg Drokin : 4f50bbfa20b8cfdf13c2647cfae057285df9e9e8
Files :

  • lustre/lov/lov_qos.c
Comment by Build Master (Inactive) [ 31/May/11 ]

Integrated in lustre-master » i686,server,el5,ofa #143
LU-248 increase LOV reseed to mitigate OST allocation inconsitence

Oleg Drokin : 4f50bbfa20b8cfdf13c2647cfae057285df9e9e8
Files :

  • lustre/lov/lov_qos.c
Comment by Build Master (Inactive) [ 31/May/11 ]

Integrated in lustre-master » i686,server,el6,inkernel #143
LU-248 increase LOV reseed to mitigate OST allocation inconsitence

Oleg Drokin : 4f50bbfa20b8cfdf13c2647cfae057285df9e9e8
Files :

  • lustre/lov/lov_qos.c
Comment by Build Master (Inactive) [ 31/May/11 ]

Integrated in lustre-master » x86_64,client,el5,ofa #143
LU-248 increase LOV reseed to mitigate OST allocation inconsitence

Oleg Drokin : 4f50bbfa20b8cfdf13c2647cfae057285df9e9e8
Files :

  • lustre/lov/lov_qos.c
Comment by Build Master (Inactive) [ 31/May/11 ]

Integrated in lustre-master » x86_64,server,el5,ofa #143
LU-248 increase LOV reseed to mitigate OST allocation inconsitence

Oleg Drokin : 4f50bbfa20b8cfdf13c2647cfae057285df9e9e8
Files :

  • lustre/lov/lov_qos.c
Comment by Hongchao Zhang [ 01/Jun/11 ]

The patch has been pushed to master. Closing the issue; please reopen it if needed.
