[LU-6654] Race condition in osd_thread_info/osd_iobuf freeing causes GPF/null pointers when running obdfilter-survey Created: 27/May/15 Updated: 12/Oct/17 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.7.0, Lustre 2.8.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Patrick Farrell (Inactive) | Assignee: | Hongchao Zhang |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | patch | ||
| Environment: |
CentOS 6.5, current master or 2.7.0. Probably earlier versions as well. |
||
| Attachments: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Running obdfilter-survey causes GPF/null pointers Current master on CentOS 6.5 crashes when obdfilter is run on RAM disks with GPF or null pointers, always in the same spot. If I've diagnosed this correctly, the RAM disk only matters because it makes the race condition more likely, and is not directly related. The stack trace of the crash is always like this: Note the freed memory in RAX/RDX. These are sometimes other invalid pointers, rather than freed memory. Here's the race that's causing this. In dio_complete_routine, which is called in one the 'loop' threads, we have this sequence: The corresponding wait (for one of the 'lctl' threads) is this: Once this wait has completed, things roll up, and, eventually, the struct 'osd_thread_info' containing this 'iobuf' is freed. The bug (which is easily repeatable) is this: Then the first thread goes and 'wake_up' is called, except that the wait queue has been freed. I've given a fair bit of thought to this, and I don't see a terribly clean way to fix this. The fundamental problem, I think, is that the osd_thread_info is supposed to be thread specific information & it's being used to hold a wait queue that's used by another thread. I've created a patch which allocates the iobuf separately from the osd_thread_info, and uses a spinlock in the thread info to prevent the race condition. We've confirmed this fixes the problem in our testing, but I'm not sure I like the approach. I am very open to suggestions for different fixes. I will also attach logs and a dmesg showing the crash (with some extra debug info the logs). I can provide a dump if requested. In the attached logs, thread 1677 logs "wakeup", which is printed before the wakeup call, then for whatever reason it is delayed. When it starts up again, it is in the actual wakeup call and causes the kernel panic. (As 5188 has freed the osd_thread_info struct containing the iobuf & wait queue.) |
| Comments |
| Comment by Gerrit Updater [ 27/May/15 ] |
|
Patrick Farrell (paf@cray.com) uploaded a new patch: http://review.whamcloud.com/14963 |
| Comment by Colin Faber [X] (Inactive) [ 23/Nov/15 ] |
|
Has this patch been abandoned? I see no activity on it for a long time. |
| Comment by Hongchao Zhang [ 07/Jan/16 ] |
|
The patch http://review.whamcloud.com/14963 has been updated. |
| Comment by Jay Lan (Inactive) [ 11/Jan/16 ] |
|
Would there be a newer update given Patrick Farrell's comment on patchset #2 on Jan 7th? |
| Comment by Patrick Farrell (Inactive) [ 12/Jan/16 ] |
|
Jay - Hongchao wrote a new patch (and took the time to do it right, unlike me |
| Comment by Jay Lan (Inactive) [ 12/Jan/16 ] |
|
Thanks! |
| Comment by Jay Lan (Inactive) [ 13/Jan/16 ] |
|
Could you port your patch to b2_5_fe please? The difference are not trivial. Thank! |
| Comment by Jay Lan (Inactive) [ 13/Jan/16 ] |
|
Oops, my bad! I tried to apply to wrong repo. No need back port to b2_5_fe. What I need the patch for is b2_7_fe. I noticed that the osd_key_fini() in b2_7_fe missed the codes related to idc. Not sure if your patch depends on earlier patches that added chunk of codes related toe idc. Please advise or provide me a patch for b2_7_fe. Thanks! |
| Comment by Alex Zhuravlev [ 21/Jan/16 ] |
|
I'd think that using RCU (i.e. late deallocation) could make the patch a bit simpler. |
| Comment by Gerrit Updater [ 02/Feb/16 ] |
|
Hongchao Zhang (hongchao.zhang@intel.com) uploaded a new patch: http://review.whamcloud.com/18257 |
| Comment by Andreas Dilger [ 03/Jun/16 ] |
|
Hongchao, how does your patch relate to http://review.whamcloud.com/14963 ? |
| Comment by Patrick Farrell (Inactive) [ 23/Nov/16 ] |
|
I'm just posting to 'bump' this and repeat the question from Andreas: "Hongchao, how does your patch relate to http://review.whamcloud.com/14963 ? " It would be nice to get one of these patches landed. They seem to achieve the same thing, though RCU is probably more efficient. |
| Comment by Cory Spitz [ 05/Oct/17 ] |
|
We let this one sit for a year. Does Patrick need to refresh his original patch or what? |
| Comment by Andreas Dilger [ 11/Oct/17 ] |
|
Patrick's patch is outdated and no longer applies cleanly. It would need to be updated to land. Hongchao, can you and Patrick please come to a conclusion about your patches, and get at least one of them landed. |
| Comment by Patrick Farrell (Inactive) [ 11/Oct/17 ] |
|
Sure. I think we should probably just land https://review.whamcloud.com/#/c/14963/, which is the first patch from Hongchao. (Mine definitely shouldn't land.) RCU is more efficient, sure, but it's also more complex. I'd leave RCU out unless we're sure we need it. |
| Comment by Cory Spitz [ 12/Oct/17 ] |
|
Patrick, you have me confused. You contributed PS1 of #14963 and Hongchao Zhang uploaded PS1 of #18257. You say that yours shouldn't land and that RCU should stay out of it, but #18257 is the one with RCU. I think you want to land #14963, which was your patch originally. Hongchao, will you take that over and refresh it again? If we all agree, #18257 should be abandoned. |
| Comment by Hongchao Zhang [ 12/Oct/17 ] |
|
the patch https://review.whamcloud.com/#/c/14963/ has been updated. |