[LU-7756] oss_num_threads max value is sometimes too low to feed disk controllers Created: 08/Feb/16 Updated: 09/Feb/17 Resolved: 14/Mar/16 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.5.3 |
| Fix Version/s: | Lustre 2.9.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Gregoire Pichon | Assignee: | Bob Glossman (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch, performance | ||
| Issue Links: |
|
||||
| Severity: | 3 | ||||
| Rank (Obsolete): | 9223372036854775807 | ||||
| Description |
|
When submitting read I/Os from Lustre clients to an OSS that serves 6 OSTs, we see in iostat that the number of I/O requests in progress on each LUN does not go beyond 86. As a result, throughput per LUN is limited to ~600 MB/s. To get the most out of each LUN we need at least 100 requests in flight at once. The cause is that the maximum value of oss_num_threads is capped by OSS_NTHRS_MAX (512 / 6 ≈ 85). Raising the limit to 1024 by patching the code allowed enough concurrent I/Os on each LUN to reach up to 900 MB/s per OST. |
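The arithmetic behind the description can be checked with a short sketch. It assumes, as a simplification not stated in the ticket, that service threads are spread evenly across OSTs and that each in-flight disk request occupies one thread. The function name `threads_per_ost` and the constants are illustrative only and are not Lustre identifiers.

```python
def threads_per_ost(max_threads: int, num_osts: int) -> float:
    """Average number of OSS service threads available per OST.

    Assumption: threads are shared evenly across all OSTs on the OSS.
    """
    if num_osts <= 0:
        raise ValueError("num_osts must be positive")
    return max_threads / num_osts


# Values taken from the description above.
REQUIRED_IN_FLIGHT = 100  # requests needed at once per LUN
NUM_OSTS = 6

for limit in (512, 1024):  # current cap vs. patched cap
    per_ost = threads_per_ost(limit, NUM_OSTS)
    verdict = "enough" if per_ost >= REQUIRED_IN_FLIGHT else "too few"
    print(f"limit={limit}: ~{per_ost:.0f} threads per OST ({verdict})")
```

With the 512 cap the per-OST share falls short of the 100 requests the LUN needs, while 1024 leaves comfortable headroom.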
| Comments |
| Comment by Gerrit Updater [ 08/Feb/16 ] |
|
Grégoire Pichon (gregoire.pichon@bull.net) uploaded a new patch: http://review.whamcloud.com/18350 |
| Comment by Andreas Dilger [ 08/Feb/16 ] |
|
Have you tried increasing the RPC size to 4MB so that fewer I/Os are needed to keep the backend busy? |
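A rough sketch of why a larger RPC size would reduce the number of concurrent requests needed. It assumes, hypothetically, that what the backend needs is a fixed amount of data in flight and that the 100 requests mentioned in the description were 1 MiB each; neither assumption is confirmed in this ticket, and `requests_needed` is an illustrative helper, not a Lustre function.

```python
MIB = 1024 * 1024


def requests_needed(bytes_in_flight: int, rpc_size: int) -> int:
    """Concurrent RPCs needed to keep bytes_in_flight outstanding (rounded up)."""
    if rpc_size <= 0:
        raise ValueError("rpc_size must be positive")
    return -(-bytes_in_flight // rpc_size)  # ceiling division


# Assumption: 100 requests of 1 MiB each are what saturates one LUN.
target_bytes = 100 * MIB

for rpc_mib in (1, 4):
    count = requests_needed(target_bytes, rpc_mib * MIB)
    print(f"{rpc_mib} MiB RPCs: {count} requests in flight per LUN")
```

Under these assumptions, 4 MiB RPCs would need a quarter as many requests, which would fit within the existing per-OST thread share. Whether the disk controllers behave this way depends on the hardware.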
| Comment by Joseph Gmitter (Inactive) [ 08/Feb/16 ] |
|
Hi Bob, |
| Comment by Bob Glossman (Inactive) [ 08/Feb/16 ] |
|
I see that the patch does what it says: it adds a module parameter in place of a hard-coded limit. However, I can't speak to whether this is a good change. I believe that in the past, having too many service threads actually reduced performance in some cases. |
| Comment by Andreas Dilger [ 08/Feb/16 ] |
|
I would tend to agree, and I wouldn't want to allow the number of threads to increase by default, but since this is a module parameter that explicitly needs to be set by the admin I think it is fairly safe. |
| Comment by Gerrit Updater [ 14/Mar/16 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/18350/ |
| Comment by Andreas Dilger [ 14/Mar/16 ] |
|
Gregoire, your patch has landed for 2.9.0. Does it need to be backported to an EE release, or are you applying this locally? |
| Comment by Gregoire Pichon [ 14/Mar/16 ] |
|
Yes, we would need to have the patch backported to b2_7_fe and IEEL 3.0 if possible. |
| Comment by Minh Diep [ 24/Oct/16 ] |
|
b2_7_fe port: http://review.whamcloud.com/#/c/22391/ |