[LU-1447] MDS Load average Created: 30/May/12 Updated: 29/May/17 Resolved: 29/May/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.2.0 |
| Fix Version/s: | None |
| Type: | Task | Priority: | Minor |
| Reporter: | Fabio Verzelloni | Assignee: | Oleg Drokin |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Environment: |
MDS HW: MDT on LSI 5480 (Pikes Peak); OSS HW: OST on LSI 7900; 1 MDS + 1 failover |
||
| Attachments: |
|
| Rank (Obsolete): | 10075 |
| Description |
|
Dear support, |
| Comments |
| Comment by Oleg Drokin [ 30/May/12 ] |
|
I see that this is not an issue with high CPU usage; rather, some of the threads are sleeping in D state. |
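For anyone following along, a minimal sketch of how D-state (uninterruptible sleep) threads can be inspected on a node; the hostname and the <pid> placeholder are illustrative, not taken from the attached output:
[root@weisshorn01 ~]# ps axo pid,stat,wchan:30,comm | awk '$2 ~ /^D/'    # list tasks in D state together with the kernel function they are waiting in
[root@weisshorn01 ~]# cat /proc/<pid>/stack                              # kernel stack of one blocked task (needs a kernel with stack tracing enabled)
[root@weisshorn01 ~]# echo w > /proc/sysrq-trigger; dmesg | tail -50     # ask the kernel to dump all blocked tasks to the log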
| Comment by Fabio Verzelloni [ 31/May/12 ] |
|
Yesterday evening we had a hang of the file system (ticket http://jira.whamcloud.com/browse/LU-1451) and now the load average is back to 'normal' (load average: 0.26, 0.24, 0.30), even during heavy I/O, but we are seeing drops in performance (http://jira.whamcloud.com/browse/LU-1455). |
| Comment by Fabio Verzelloni [ 31/May/12 ] |
|
This is the ps -ax | grep D output from the whole cluster after it had been running for a while. |
| Comment by Oleg Drokin [ 31/May/12 ] |
|
Based on these it seems that weisshorn07, 09, 13 ... are having serious overload issues (likely induced by the disk subsystems there). Any chance you can survey your disk subsystem to see what's going on? I suspect it is not really happy to have a lot of parallel IO ongoing. Also, since some other OSSes are less busy, it appears the IO is not distributed all that evenly. In many cases where the disk subsystem handles parallel IO poorly, limiting the number of OST IO threads should help the situation by reducing the overload. The MDS does not have any processes in D state, and I assume that at the time this snapshot was taken the MDS load average was pretty small? |
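For reference, a hedged sketch of how the OST IO thread count can be inspected and capped on an OSS (the parameter and module option names are the generic Lustre ones and the value 128 is only an example, not a recommendation from this ticket):
[root@weisshorn03 ~]# lctl get_param ost.OSS.ost_io.threads_max ost.OSS.ost_io.threads_started   # current cap and number of IO threads actually started
[root@weisshorn03 ~]# lctl set_param ost.OSS.ost_io.threads_max=128                              # lower the cap on the running OSS (already-started threads are not killed)
To make such a cap persistent across reboots, the ost module option oss_num_threads can be set in the modprobe configuration, e.g. 'options ost oss_num_threads=128'.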
| Comment by Fabio Verzelloni [ 01/Jun/12 ] |
|
The disk HW is:
Our 'max_rpcs_in_flight' on the MDS is:
and the threads_[max,min,started] are:
[root@weisshorn01 lustre]# cat ./mgs/MGS/mgs/threads_max
[root@weisshorn01 scratch-MDT0000]# cat ./mdt_mds/threads_max
on the OSS:
[root@weisshorn03 lustre]# cat ./ost/OSS/ost/threads_max
on the client/MDS side the max_rpcs_in_flight is:
[root@weisshorn01 lustre]# cat ./osc/scratch-OST0005-osc-MDT0000/max_rpcs_in_flight
So far we have not seen the high load average on the MDS again. Instead, when we run a benchmark with a 4096k block size, the load average on the OSS goes to 200-300, and with 'top' we see the "%wa" value increasing in some cases. Do you have any suggestions on the right tuning based on our hardware/configuration? Also on the client side? (Cray XE6, ~1500 nodes)
Fabio |
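As a reference for the kind of tuning being asked about, these are the knobs usually involved, read via lctl instead of the raw /proc paths above (a sketch; the parameter names are the generic Lustre ones, and exact names and defaults may differ on 2.2):
[root@weisshorn03 ~]# lctl get_param ost.OSS.ost_io.threads_max ost.OSS.ost_io.threads_started   # OSS bulk IO service threads
client# lctl get_param osc.*.max_rpcs_in_flight osc.*.max_dirty_mb                               # per-OSC RPC concurrency and dirty cache on a client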
| Comment by Oleg Drokin [ 01/Jun/12 ] |
|
Well, it's somewhat expected that as you increase the write activity, the load average on the OSTs goes up. Essentially, every write RPC (in 1M chunks) from every client consumes one OSS IO thread. So there are multiple possible solutions for you: There is mostly no tuning you can do on the client side that would relieve the situation without also hurting, e.g., single-client performance, so I suggest you concentrate on the servers here (the possible exception is some read-ahead settings when you expect a lot of small read traffic). |
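The read-ahead settings mentioned at the end live on the client; a minimal sketch of inspecting them (standard llite/osc parameter names; the 40 MB value is purely illustrative, not a recommendation):
client# lctl get_param llite.*.max_read_ahead_mb llite.*.max_read_ahead_per_file_mb   # current read-ahead limits
client# lctl set_param llite.*.max_read_ahead_mb=40                                   # example: adjust the global read-ahead window
client# lctl get_param osc.*.max_pages_per_rpc                                        # 256 x 4K pages = the 1M write RPC size referred to above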
| Comment by Andreas Dilger [ 29/May/17 ] |
|
Close old ticket. |