[LU-13534] Landing LU-12678 highly likely introduces a random memory corruption bug Created: 07/May/20 Updated: 17/Feb/21 Resolved: 10/Jul/20 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical |
| Reporter: | Alexey Lyashkov | Assignee: | Neil Brown |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
With the landing of LU-12678, ptlrpc holds an object pointer without holding a reference to the object (lnet_me doesn't have a reference).

void
lnet_md_unlink(struct lnet_libmd *md)
{
	if ((md->md_flags & LNET_MD_FLAG_ZOMBIE) == 0) {
		/* first unlink attempt... */
		struct lnet_me *me = md->md_me;

		md->md_flags |= LNET_MD_FLAG_ZOMBIE;

		/* Disassociate from ME (if any), and unlink it if it was created
		 * with LNET_UNLINK */
		if (me != NULL) {
			/* detach MD from portal */
			lnet_ptl_detach_md(me, md);
			if (me->me_unlink == LNET_UNLINK)
				lnet_me_unlink(me);
		}

		/* ensure all future handle lookups fail */
		lnet_res_lh_invalidate(&md->md_lh);
	}

	if (md->md_refcount != 0) {
		CDEBUG(D_NET, "Queueing unlink of md %p\n", md);
		return;
	}

So the lnet_me is not protected by the MD reference: the first unlink attempt can unlink the ME even while md_refcount is still non-zero. |
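The lifetime problem described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `toy_me`, `toy_md`, `toy_me_get`, `toy_me_put`, `toy_md_unlink`, and `toy_me_frees` are hypothetical names invented for illustration, not LNet APIs, and the added reference count is not claimed to be how the issue was actually fixed. It only shows why an external holder of an ME pointer needs something that defers the free when the MD unlink path drops the ME.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct lnet_me, with an ASSUMED refcount. */
struct toy_me {
	int  refcount;     /* assumption: not a field named in the report */
	bool auto_unlink;  /* models me_unlink == LNET_UNLINK */
};

/* Hypothetical stand-in for struct lnet_libmd. */
struct toy_md {
	struct toy_me *me;
	bool           zombie; /* models LNET_MD_FLAG_ZOMBIE */
};

/* Counts how many MEs were really freed, so callers can observe it. */
int toy_me_frees;

struct toy_me *toy_me_alloc(bool auto_unlink)
{
	struct toy_me *me = calloc(1, sizeof(*me));

	if (me == NULL)
		return NULL;
	me->refcount = 1; /* reference owned by the (modelled) portal table */
	me->auto_unlink = auto_unlink;
	return me;
}

/* Take an extra reference for a caller that keeps the pointer. */
void toy_me_get(struct toy_me *me)
{
	me->refcount++;
}

/* Drop a reference; the ME is freed only when the last one goes away. */
void toy_me_put(struct toy_me *me)
{
	assert(me->refcount > 0);
	if (--me->refcount == 0) {
		toy_me_frees++;
		free(me);
	}
}

/* Models only the "first unlink attempt" branch quoted in the report. */
void toy_md_unlink(struct toy_md *md)
{
	if (!md->zombie) {
		struct toy_me *me = md->me;

		md->zombie = true;
		if (me != NULL) {
			md->me = NULL;          /* detach MD from ME */
			if (me->auto_unlink)
				toy_me_put(me); /* drop the table's reference */
		}
	}
}
```

In this model, a caller that took its own reference with `toy_me_get()` can still safely dereference the ME after `toy_md_unlink()` runs; without that extra reference the same call would free the ME and leave the caller with a dangling pointer, which is the kind of random memory corruption the report warns about.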
| Comments |
| Comment by James A Simmons [ 05/Jun/20 ] |
| Comment by James A Simmons [ 10/Jul/20 ] |
|
Patch https://review.whamcloud.com/#/c/38646 landed, which should have resolved this issue. |