[LU-4624] ll_stop_statahead deadlock Created: 13/Feb/14 Updated: 11/Oct/14 Resolved: 27/Mar/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.1 |
| Fix Version/s: | Lustre 2.6.0, Lustre 2.5.4 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Christopher Morrone | Assignee: | Lai Siyao |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch |
| Issue Links: | |
| Severity: | 3 |
| Rank (Obsolete): | 12653 |
| Description |
|
We are seeing processes on Lustre clients hang in close():

PID: 67983 TASK: ffff8804448d0080 CPU: 10 COMMAND: "java"
#0 [ffff88041b969d08] schedule at ffffffff815109b2
#1 [ffff88041b969dd0] cfs_waitq_wait+0xe at ffffffffa043577e [libcfs]
#2 [ffff88041b969de0] ll_stop_statahead+0x1b8 at ffffffffa0aad1b8 [lustre]
#3 [ffff88041b969e60] ll_file_release+0x2d8 at ffffffffa0a6b698 [lustre]
#4 [ffff88041b969ea0] ll_dir_release+0xdb at ffffffffa0a52d5b [lustre]
#5 [ffff88041b969ec0] __fput+0x108 at ffffffff81183898
#6 [ffff88041b969f10] fput+0x25 at ffffffff811839e5
#7 [ffff88041b969f20] filp_close+0x5d at ffffffff8117eddd
#8 [ffff88041b969f50] sys_close+0xa5 at ffffffff8117eeb5
#9 [ffff88041b969f80] system_call_fastpath+0x16 at ffffffff8100b0b2
RIP: 00002aaaaacdb5ad RSP: 00002aaab505f3f8 RFLAGS: 00010206
RAX: 0000000000000003 RBX: ffffffff8100b0b2 RCX: 000000000000003a
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000131
RBP: 0000000000000131 R8: 00002aaaab586580 R9: 00002aaaab33a780
R10: 0000000000000131 R11: 0000000000000293 R12: 0000000000000002
R13: 00002aaabc013df0 R14: 00002aaabc107700 R15: 00002aaabc013df0
ORIG_RAX: 0000000000000003 CS: 0033 SS: 002b
PID: 67984 TASK: ffff880639bcb500 CPU: 10 COMMAND: "ll_sa_66672"
#0 [ffff88065abc9d40] schedule at ffffffff815109b2
#1 [ffff88065abc9e08] cfs_waitq_wait+0xe at ffffffffa043577e [libcfs]
#2 [ffff88065abc9e18] ll_statahead_thread+0x59e at ffffffffa0ab288e [lustre]
#3 [ffff88065abc9f48] child_rip+0xa at ffffffff8100c10a
PID: 67985 TASK: ffff880d2faba080 CPU: 11 COMMAND: "ll_agl_66672"
#0 [ffff8809d06fddc0] schedule at ffffffff815109b2
#1 [ffff8809d06fde88] cfs_waitq_wait+0xe at ffffffffa043577e [libcfs]
#2 [ffff8809d06fde98] ll_agl_thread+0x44a at ffffffffa0aadbba [lustre]
#3 [ffff8809d06fdf48] child_rip+0xa at ffffffff8100c10a
We are running 2.4.0-19chaos (see http://github.com/chaos/lustre) on these clients. |
| Comments |
| Comment by Peter Jones [ 13/Feb/14 ] |
|
Niu, could you please advise on this one? Thanks, Peter |
| Comment by Niu Yawei (Inactive) [ 14/Feb/14 ] |
|
I can't see why the agl & statahead threads can't be stopped now. Are there any error messages in the log? Is it possible to reproduce it and get a debug log with D_READA enabled? |
| Comment by Christopher Morrone [ 22/Feb/14 ] |
|
The synchronization between the statahead threads is pretty weak. It definitely needs work.

The specific problem that we are seeing right now is a race between startup and shutdown of the statahead threads. The statahead threads blindly overwrite the thread status flags to SVC_RUNNING. If the state was set to SVC_STOPPING before that, that notice is lost and we wind up deadlocked.

Here's a quick-and-dirty patch to address the problem: http://review.whamcloud.com/9358

I only have the new threads overwrite the thread state flags if the flag is still uninitialized; otherwise, it remains unchanged. This allows the SVC_STOPPING to remain set.

I call this "quick-and-dirty" because really it would be better to recognize that SVC_STOPPING is set and immediately abort startup. My patch still allows the full startup to continue: launch the agl thread and all the other initializations. Only in the main loop will it see that it needs to shut down and unwind all of that. Better would be to exit early without proceeding through all of that startup. But the cleanup code in both ll_statahead_thread and ll_agl_thread needs more reworking than I have time for at the moment, so I would suggest that really needs to be done before this issue is closed. Nevertheless, I suspect there is no harm in landing my patch in the interim.

I also include a patch to make the statahead debug messages a bit more useful. |
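To make the race concrete, here is a minimal sketch in plain C contrasting the unconditional flag write with the guarded write the patch introduces. The sa_state/sa_thread names are simplified stand-ins for Lustre's ptlrpc_thread state and SVC_* flags, not the actual kernel definitions.

/* Simplified model of the thread state; in Lustre this lives in
 * struct ptlrpc_thread and is updated under the owner's lock. */
enum sa_state { SA_INIT = 0, SA_RUNNING, SA_STOPPING, SA_STOPPED };

struct sa_thread {
        enum sa_state t_state;
};

/* Buggy startup: blindly overwrites the state, so an SA_STOPPING set by
 * the stopper before this point is lost and the stopper waits forever
 * for an SA_STOPPED that never arrives. */
static void sa_thread_start_buggy(struct sa_thread *t)
{
        t->t_state = SA_RUNNING;
}

/* Patched startup: only moves INIT -> RUNNING.  A pending SA_STOPPING
 * survives, the main loop notices it, and the thread unwinds cleanly. */
static void sa_thread_start_fixed(struct sa_thread *t)
{
        if (t->t_state == SA_INIT)
                t->t_state = SA_RUNNING;
}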
| Comment by Christopher Morrone [ 22/Feb/14 ] |
|
Here are some thoughts about things that are wrong and need to be fixed in the statahead code:
|
| Comment by Niu Yawei (Inactive) [ 24/Feb/14 ] |
I think the most common usage pattern for starting/stopping a ptlrpc_thread is this: the caller calls kthread_run() to start the thread, waits until the thread starts running, and stops the thread later. For statahead, if multiple callers can concurrently start/stop the statahead thread, they must be serialized outside of the statahead thread. |
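For reference, the pattern being described looks roughly like the sketch below. It is a simplified outline, not a verbatim excerpt: the thread_is_running()/thread_is_stopped()/thread_set_flags() helpers and the t_ctl_waitq field follow Lustre's ptlrpc_thread conventions, but the exact calls in the statahead code may differ.

/* Caller side: start the worker, then wait until it reports running. */
task = kthread_run(ll_statahead_thread, parent, "ll_sa_%u", current_pid());
if (IS_ERR(task))
        return PTR_ERR(task);
wait_event(thread->t_ctl_waitq,
           thread_is_running(thread) || thread_is_stopped(thread));

/* Later, a single (serialized) stopper requests shutdown and waits. */
thread_set_flags(thread, SVC_STOPPING);
wake_up(&thread->t_ctl_waitq);
wait_event(thread->t_ctl_waitq, thread_is_stopped(thread));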
| Comment by Niu Yawei (Inactive) [ 24/Feb/14 ] |
|
Lai & Fanyong are the statahead/agl experts; I'd like to ask their opinions. Lai/nasf, any comments on Chris' questions? |
| Comment by Christopher Morrone [ 24/Feb/14 ] |
|
(Whoops, sorry, I accidentally edited your comment instead of replying... I think I fixed that now.)
That is a strange position to take, and I would argue that it is wrong. The statahead code explicitly exports interfaces for starting and stopping the statahead threads, and that is clearly the correct place to be doing the synchronization. Forcing the calling code to know the internal synchronization requirements of the statahead code would just be bad software design. |
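To put this position in code terms, the serialization would live in the exported start path itself, not in the callers. A hedged sketch of the start-side guard, assuming the per-directory lli_sa_lock protects the thread state (illustrative; the real Lustre locking may differ):

/* In the exported start path (reached from do_statahead_enter()):
 * refuse to start a statahead thread if a stop was already requested,
 * so callers never need to know about the internal synchronization. */
spin_lock(&lli->lli_sa_lock);
if (thread_is_stopping(thread) || thread_is_stopped(thread)) {
        spin_unlock(&lli->lli_sa_lock);
        return -EALREADY;               /* startup aborted, nothing to unwind */
}
thread_set_flags(thread, SVC_STARTING); /* claim the slot under the lock */
spin_unlock(&lli->lli_sa_lock);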
| Comment by Niu Yawei (Inactive) [ 25/Feb/14 ] |
Yes, the functions for starting/stopping the statahead thread are a good place to do the synchronization. I meant that we should not do it inside the statahead thread itself; your patch looks like it is trying to resolve the race within the statahead thread. |
| Comment by Christopher Morrone [ 25/Feb/14 ] |
The statahead thread initializes the thread's t_flags without checking the current state of the flags. That is a bug. Any change to address that problem will necessarily require changes to the ll_statahead_thread() code. If you think there is some way to fix this without touching ll_statahead_thread(), please show the code. Otherwise, please remove the -1 review on the patch. |
| Comment by Christopher Morrone [ 25/Feb/14 ] |
What is "common" is pretty much irrelevant to the conversation. The code needs to be correct for all cases, not just the common case. |
| Comment by Niu Yawei (Inactive) [ 25/Feb/14 ] |
I see the race between starting the statahead thread in do_statahead_enter() and stopping it in ll_file_release(), and it looks hard to fix outside of the statahead thread, because close() should not be blocked. (The current ll_file_release() waits on the thread exit; it'll be changed in the fix of ...) Thanks for your patch, Chris. |
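Given that constraint, a non-blocking stop might look like the sketch below. It assumes the sai structure is reference counted and that sai_put() frees it on the final drop; the sa_info/sai_put()/sai_waitq names are illustrative stand-ins rather than the exact Lustre symbols. The closer only flags the thread and drops its reference, so close() never waits for the thread to exit.

/* Called from ll_file_release(): request shutdown without waiting. */
static void ll_stop_statahead_nowait(struct ll_inode_info *lli)
{
        struct sa_info *sai;

        spin_lock(&lli->lli_sa_lock);
        sai = lli->lli_sai;
        if (sai != NULL)
                thread_set_flags(&sai->sai_thread, SVC_STOPPING);
        spin_unlock(&lli->lli_sa_lock);

        if (sai != NULL) {
                wake_up(&sai->sai_waitq);  /* kick the thread out of its wait */
                sai_put(sai);              /* drop our ref; the thread cleans up */
        }
}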
| Comment by Lai Siyao [ 25/Feb/14 ] |
|
Chris, there is a patch, http://review.whamcloud.com/#/c/6392/, which addresses many statahead issues; I'll try to land that ASAP. So the below suggestions would be better addressed in a new patch.
|
| Comment by Christopher Morrone [ 25/Feb/14 ] |
|
Lai, thanks for the pointer. While I think there is some good cleanup work in your patch, I think a lot of my comments are not addressed by that. Also, I fear that there is too much going on in that patch to get it landed to b2_4/b2_5 very soon. I think in the short term (and for the b2_5 branch) my patch should be fully reviewed. If it is acceptable, it can land and fix this deadlock while we are waiting for the larger reworking of the code to be ready. |
| Comment by Christopher Morrone [ 26/Feb/14 ] |
|
I pushed an update to my patch. There was a follow-on issue to the one I had already addressed: a failure to refcount the sai buffer correctly. But the more I look at the statahead code, the more potential problems I see. Hopefully Lai's patch will be more comprehensive. |
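Since the refcounting came up, here is a minimal sketch of the kind of get/put scheme meant here; the field and helper names are assumptions for illustration, not the exact symbols used in the patch.

/* Whoever drops the last reference frees the sai buffer, so the opener
 * and the statahead thread can exit in either order. */
struct sa_info {
        atomic_t        sai_refcount;
        /* ... statahead state ... */
};

static inline struct sa_info *sai_get(struct sa_info *sai)
{
        atomic_inc(&sai->sai_refcount);
        return sai;
}

static inline void sai_put(struct sa_info *sai)
{
        if (atomic_dec_and_test(&sai->sai_refcount))
                kfree(sai);     /* last user frees the buffer */
}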
| Comment by Christopher Morrone [ 26/Feb/14 ] |
|
I don't think that Lai's patch addresses the first issue identified here. |
| Comment by Peter Jones [ 27/Mar/14 ] |
|
All patches tracked under this ticket have landed for 2.6. Further work in this area is taking place under |