[LU-5415] High ldlm_poold load on client Created: 25/Jul/14 Updated: 29/Oct/15 Resolved: 14/Aug/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.6.0 |
| Fix Version/s: | Lustre 2.7.0, Lustre 2.5.3 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Li Xi (Inactive) | Assignee: | Zhenyu Xu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch | ||
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 15059 |
| Description |
|
When LRU resizing is enabled on the client, ldlm_poold sometimes shows extremely high CPU load, and at the same time schedule_timeout() complains about a negative timeout value. After a while the problem recovers without any manual intervention, but it happens very frequently when the file system is under high load.

top - 09:48:51 up 6 days, 11:17, 2 users, load average: 1.00, 1.01, 1.00
Tasks: 516 total, 2 running, 514 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.1%us, 6.4%sy, 0.0%ni, 93.4%id, 0.1%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 65903880k total, 24300068k used, 41603812k free, 346516k buffers
Swap: 65535992k total, 0k used, 65535992k free, 18665656k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
37976 root 20 0 0 0 0 R 99.4 0.0 2412:25 ldlm_bl_04

Jul 13 12:49:30 mu01 kernel: LustreError: 11-0: lustre-OST000a-osc-ffff88080fdad800: Communicating with 10.0.2.2@o2ib, operation obd_ping failed with -107.
Jul 13 12:49:30 mu01 kernel: Lustre: lustre-OST000a-osc-ffff88080fdad800: Connection to lustre-OST000a (at 10.0.2.2@o2ib) was lost; in progress operations using this service will wait for recovery to complete
Jul 13 12:49:30 mu01 kernel: LustreError: 167-0: lustre-OST000a-osc-ffff88080fdad800: This client was evicted by lustre-OST000a; in progress operations using this service will fail.
Jul 13 12:49:31 mu01 kernel: schedule_timeout: wrong timeout value fffffffff5c2c8c0
Jul 13 12:49:31 mu01 kernel: Pid: 4054, comm: ldlm_poold Tainted: G --------------- T 2.6.32-279.el6.x86_64 #1
Jul 13 12:49:31 mu01 kernel: Call Trace:
Jul 13 12:49:31 mu01 kernel: [<ffffffff814fe759>] ? schedule_timeout+0x2c9/0x2e0
Jul 13 12:49:31 mu01 kernel: [<ffffffffa086612b>] ? ldlm_pool_recalc+0x10b/0x130 [ptlrpc]
Jul 13 12:49:31 mu01 kernel: [<ffffffffa084cfb9>] ? ldlm_namespace_put+0x29/0x60 [ptlrpc]
Jul 13 12:49:31 mu01 kernel: [<ffffffffa08670b0>] ? ldlm_pools_thread_main+0x1d0/0x2f0 [ptlrpc]
Jul 13 12:49:31 mu01 kernel: [<ffffffff81060250>] ? default_wake_function+0x0/0x20
Jul 13 12:49:31 mu01 kernel: [<ffffffffa0866ee0>] ? ldlm_pools_thread_main+0x0/0x2f0 [ptlrpc]
Jul 13 12:49:31 mu01 kernel: [<ffffffff81091d66>] ? kthread+0x96/0xa0
Jul 13 12:49:31 mu01 kernel: [<ffffffff8100c14a>] ? child_rip+0xa/0x20
Jul 13 12:49:31 mu01 kernel: [<ffffffff81091cd0>] ? kthread+0x0/0xa0
Jul 13 12:49:31 mu01 kernel: [<ffffffff8100c140>] ? child_rip+0x0/0x20
Jul 13 12:49:33 mu01 kernel: Lustre: lustre-OST000a-osc-ffff88080fdad800: Connection restored to lustre-OST000a (at 10.0.2.2@o2ib) |
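The value fffffffff5c2c8c0 in the trace is a negative number seen as an unsigned 64-bit quantity, which points at an arithmetic underflow in the sleep-interval calculation. Below is a minimal userspace sketch of that failure mode, assuming the pool thread computes its sleep time as "interval minus elapsed"; the identifiers (RECALC_INTERVAL, next_sleep) are illustrative, not the actual Lustre source:

```c
#include <stdio.h>
#include <time.h>

#define RECALC_INTERVAL 1 /* hypothetical seconds between pool recalcs */

static long next_sleep(time_t recalc_start, time_t now)
{
	/* If walking the LRU lists took more than RECALC_INTERVAL
	 * seconds, this difference goes negative; reinterpreted as an
	 * unsigned jiffies count it becomes a huge value, which is
	 * what schedule_timeout()'s sanity check complains about. */
	return RECALC_INTERVAL - (long)(now - recalc_start);
}

int main(void)
{
	time_t start = time(NULL);
	time_t now = start + 3; /* pretend the recalc took 3 seconds */
	long timeout = next_sleep(start, now);

	printf("timeout = %ld\n", timeout); /* prints -2 */
	if (timeout < 0)
		printf("would trigger 'schedule_timeout: wrong timeout value'\n");
	return 0;
}
```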
| Comments |
| Comment by Li Xi (Inactive) [ 25/Jul/14 ] |
|
We've seen this problem for a long time on various systems. Usually, as a workaround, we disable LRU resizing of ldlm on the client. But maybe the following patch can help. |
| Comment by Peter Jones [ 25/Jul/14 ] |
|
Lai, could you please review this patch? Thanks, Peter |
| Comment by Andreas Dilger [ 25/Jul/14 ] |
|
Was this problem actually seen on Lustre 2.6/master or some other version? There were patches from Oleg that were landed for 2.5 that addressed some problems with LDLM pools, but I'm happy to see more improvements in this area. |
| Comment by Andreas Dilger [ 26/Jul/14 ] |
|
http://review.whamcloud.com/6234 |
| Comment by Peter Jones [ 26/Jul/14 ] |
|
So the patches Andreas mentions would be included on any 2.5.x based branches. |
| Comment by Li Xi (Inactive) [ 26/Jul/14 ] |
|
Yeah, those patches are included on the branch that has this problem. This problem is happening quite frequently. |
| Comment by Lai Siyao [ 28/Jul/14 ] |
|
Peter, I'll be on vacation from tomorrow, could you reassign to others? |
| Comment by Peter Jones [ 28/Jul/14 ] |
|
Bobijam, could you please look after this patch? Thanks, Peter |
| Comment by Oleg Drokin [ 11/Aug/14 ] |
|
I wonder what sort of lists you have on the client side that cause iterating those lists to take over a second (so that the time becomes negative)? Could the problem be somewhere else, with this proposed change just papering over the real issue? |
| Comment by Li Xi (Inactive) [ 11/Aug/14 ] |
|
Yeah, it is very possible that the patch is not fixing the root cause. It has nearly become common knowledge that LRU resizing of ldlm on the client should be disabled, otherwise the ldlm threads show high load. Is there any guarantee that LRU resizing will complete in a bounded period of time? If not, then the current code has a problem anyway. I also found some other issues that hit negative timeout values: https://jira.hpdd.intel.com/browse/LU-1733 |
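One defensive direction, regardless of where the time is actually lost, is to clamp the computed sleep interval before handing it to schedule_timeout(). A hedged sketch of that idea, with illustrative names (pool_sleep_interval, LDLM_POOL_MIN_SLEEP), not the landed patch:

```c
#include <time.h>

#define LDLM_POOL_MIN_SLEEP 1 /* seconds; hypothetical lower bound */

static long pool_sleep_interval(long interval, time_t recalc_start)
{
	long remaining = interval - (long)(time(NULL) - recalc_start);

	/* A zero or negative remainder would either busy-loop the
	 * thread (the ~99% CPU seen in top) or trip
	 * schedule_timeout()'s "wrong timeout value" sanity check,
	 * so never return less than the minimum. */
	return remaining > 0 ? remaining : LDLM_POOL_MIN_SLEEP;
}
```

This only bounds the sleep; it does not explain why a single recalc pass can exceed the interval in the first place, which is Oleg's question above.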
| Comment by Li Xi (Inactive) [ 12/Aug/14 ] |
|
Patch for b2_5: |
| Comment by Peter Jones [ 14/Aug/14 ] |
|
Landed for 2.7 |