[LU-1088] mgs threads go nuts Created: 09/Feb/12 Updated: 04/Jun/12 Resolved: 04/Jun/12 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.3.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Christopher Morrone | Assignee: | Lai Siyao |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: | lustre 2.1.0-21chaos (github.com/chaos/lustre) |
| Attachments: | |
| Severity: | 2 |
| Rank (Obsolete): | 4614 |
| Description |
|
While investigating [...]. The console is mostly unresponsive, but it did respond to a sysrq-l, so I can see that they are all in a backtrace similar to this:

Call Trace:
 [<ffffffffa06da060>] lock_res_and_lock+0x30/0x40 [ptlrpc]
 [<ffffffffa06deca3>] ldlm_lock_enqueue+0x453/0x7e0 [ptlrpc]
 [<ffffffffa06fd206>] ldlm_handle_enqueue0+0x406/0xd70 [ptlrpc]
 [<ffffffffa06fdbd6>] ldlm_handle_enqueue+0x66/0x70 [ptlrpc]
 [<ffffffffa06fdbe0>] ? ldlm_server_completion_ast+0x0/0x590 [ptlrpc]
 [<ffffffffa06fe170>] ? ldlm_server_blocking_ast+0x0/0x740 [ptlrpc]
 [<ffffffffa0b55245>] mgs_handle+0x545/0x1350 [mgs]
 [<ffffffffa04933f1>] ? libcfs_debug_vmsg1+0x41/0x50 [libcfs]
 [<ffffffffa04933f1>] ? libcfs_debug_vmsg1+0x41/0x50 [libcfs]
 [<ffffffffa0723181>] ptlrpc_main+0xcd1/0x1690 [ptlrpc]
 [<ffffffffa07224b0>] ? ptlrpc_main+0x0/0x1690 [ptlrpc]
 [<ffffffff8100c14a>] child_rip+0xa/0x20
 [<ffffffffa07224b0>] ? ptlrpc_main+0x0/0x1690 [ptlrpc]
 [<ffffffffa07224b0>] ? ptlrpc_main+0x0/0x1690 [ptlrpc]
 [<ffffffff8100c140>] ? child_rip+0x0/0x20 |
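[Editor's note] Every frame in that trace is queued behind one per-resource spinlock. As a minimal sketch (my simplification of the ldlm locking helpers, not the verbatim 2.1 source; the exact shape of the function and the lr_lock field are assumptions), lock_res_and_lock() amounts to taking the spinlock of the lock's resource, so thousands of service threads enqueueing on the single MGS config resource all serialize at exactly this point:

    /* Sketch only: simplified from the ldlm locking helpers.
     * Every ldlm_lock hangs off one ldlm_resource, and all state
     * changes on that resource funnel through its spinlock. */
    struct ldlm_resource *lock_res_and_lock(struct ldlm_lock *lock)
    {
            struct ldlm_resource *res = lock->l_resource;

            /* one spinlock per resource: with every client enqueueing
             * on the same MGS config resource, all service threads
             * line up here, matching the sysrq backtraces above */
            spin_lock(&res->lr_lock);
            return res;
    }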
| Comments |
| Comment by Christopher Morrone [ 09/Feb/12 ] |
|
We gave up on waiting for the node to recover on its own. We forced a crash dump and are rebooting now. |
| Comment by Christopher Morrone [ 09/Feb/12 ] |
|
MDS went through recovery and seems happy for the moment. |
| Comment by Alex Zhuravlev [ 10/Feb/12 ] |
|
Once you hit this problem again, please grab all the traces and attach them here. |
| Comment by Christopher Morrone [ 10/Feb/12 ] |
|
Alex, we have a crash dump. If you want backtraces from all tasks, we'll get you that. No need to wait for another instance. |
| Comment by Christopher Morrone [ 10/Feb/12 ] |
|
Attach "foreach bt" from momus mds. |
| Comment by Peter Jones [ 10/Feb/12 ] |
|
Added Alex as a watcher so that he is aware of Chris's answer. |
| Comment by Oleg Drokin [ 10/Feb/12 ] |
|
From the stack trace: [...] So I imagine you have a lot of clients (tens of thousands?), and once all of them somehow got disconnected, they all come rushing to reconnect and get their config lock again (all on the same resource). That ldlm_resource_dump is at D_INFO, which is probably a bad idea and should be D_DLMTRACE too. What is the Lustre debug level you are running at? |
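[Editor's note] To make the failure mode concrete, here is a minimal sketch of the pattern being described (simplified by me from the ldlm resource-dump path; the field and list names follow the upstream ldlm structures but this is not the verbatim 2.1 code): when the configured debug mask includes the dump's level, every single enqueue walks the resource's entire granted list while the resource spinlock is held, so each reconnecting client pays a cost proportional to the number of locks already granted:

    /* Sketch, under the assumptions above: a dump that runs on every
     * enqueue whenever its level (D_INFO here) is in the debug mask. */
    void ldlm_resource_dump(int level, struct ldlm_resource *res)
    {
            struct ldlm_lock *lock;

            /* gate on the debug mask: with D_INFO enabled this never
             * bails out early... */
            if (!(libcfs_debug & level))
                    return;

            /* ...and the walk happens with the resource spinlock held
             * by the caller, so every other enqueue on this resource
             * waits for the full dump to finish */
            list_for_each_entry(lock, &res->lr_granted, l_res_link)
                    ldlm_lock_dump(level, lock, 0);
    }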
| Comment by Christopher Morrone [ 10/Feb/12 ] |
|
Yes, I had recently added D_INFO to get a look at what the spinning MDT thread was doing for [...]

We have a few thousand clients. Most of them should not be disconnected. There are on the order of a couple hundred that might reboot and reconnect at any time (BGP nodes).

I think that we definitely need ldlm_resource_dump changed. I certainly accept that performance is reduced when I enable higher logging levels, but I don't expect a denial-of-service attack. |
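[Editor's note] A rough estimate of why this behaves like a denial of service rather than a mere slowdown (my arithmetic, assuming one shared config resource and a dump of all granted locks per enqueue): the k-th client to reconnect dumps the roughly k locks already granted, so N reconnections cost about 1 + 2 + ... + N = N(N+1)/2 lock printouts in total, all serialized under one spinlock. With a few thousand clients that is on the order of millions of debug lines, consistent with the MGS threads piling up in lock_res_and_lock above.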
| Comment by Lai Siyao [ 02/Mar/12 ] |
|
It looks okay to use RCU for the resource lock dump, and compared to ldlm_lock_debug(), ldlm_lock_dump() is inefficient, so I'll replace the latter with the former. |
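[Editor's note] For contrast with the sketch above, the direction described here would look roughly like this (my reading of the comment, not the landed patch from change 2250): gate the dump at D_DLMTRACE and print each lock through the cheaper ldlm_lock_debug()-based LDLM_DEBUG macro instead of ldlm_lock_dump():

    /* Sketch of the proposed shape, under the assumptions above. */
    void ldlm_resource_dump(int level, struct ldlm_resource *res)
    {
            struct ldlm_lock *lock;

            /* D_DLMTRACE is off in normal production debug masks, so
             * the common enqueue path skips the dump entirely */
            if (!(libcfs_debug & D_DLMTRACE))
                    return;

            list_for_each_entry(lock, &res->lr_granted, l_res_link)
                    LDLM_DEBUG(lock, "granted");
    }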
| Comment by Lai Siyao [ 02/Mar/12 ] |
|
The review is at http://review.whamcloud.com/#change,2250. |
| Comment by Peter Jones [ 04/Jun/12 ] |
|
Landed for 2.3 |