[LU-11229] server_bulk_callback()) ASSERTION( desc->bd_md_count > 0 ) failed Created: 09/Aug/18 Updated: 09/Aug/18 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical |
| Reporter: | Alexey Lyashkov | Assignee: | Alexey Lyashkov |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
While testing an
int ptlrpc_start_bulk_transfer(struct ptlrpc_bulk_desc *desc)
..
    /* Network is about to get at the memory */
    if (ptlrpc_is_bulk_put_source(desc->bd_type))
        rc = LNetPut(self_nid, desc->bd_mds[posted_md], LNET_ACK_REQ, peer_id, desc->bd_portal, mbits, 0, 0);
So two lnet events per MD, but..
void server_bulk_callback(struct lnet_event *ev)
{
    ...
    if (ev->unlinked) {
        desc->bd_md_count--;
        /* This is the last callback no matter what... */
        if (desc->bd_md_count == 0)
            wake_up(&desc->bd_waitq);
    }
OOPS.. we have decremented bd_md_count twice: once for LNET_EVENT_SEND and once for LNET_EVENT_ACK.
00000100:00000010:0.0:1533747855.090799:0:24663:0:(client.c:130:ptlrpc_new_bulk()) kmalloced 'desc': 416 at ffff88006080c800.
00000100:00000200:0.0:1533747855.091779:0:21701:0:(events.c:449:server_bulk_callback()) event type 5, status 0, desc ffff88006080c800
00000100:00000200:1.0:1533747855.091788:0:21700:0:(events.c:449:server_bulk_callback()) event type 4, status 0, desc ffff88006080c800
00000100:00040000:0.0:1533747855.091796:0:21701:0:(events.c:453:server_bulk_callback()) ASSERTION( desc->bd_md_count > 0 ) failed:
So it looks like we should not trust ev->unlinked (the buffer is unlinked after the send), but instead wait for the ACK if one is still needed. |
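The double decrement described above can be reproduced in a small user-space model. This is only a sketch, not Lustre code: every `sim_*` name is hypothetical, and it assumes (as the report implies, though the log does not show the flag) that both the SEND and the ACK event for the single MD arrive with `unlinked` set. The second callback shows one possible reading of the suggested direction, which is to ignore `unlinked` and decrement only on the last event expected for the MD.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two LNet events named in the report. */
enum sim_event_type { SIM_SEND, SIM_ACK };

struct sim_event {
    enum sim_event_type type;
    bool unlinked;              /* models ev->unlinked */
};

struct sim_desc {
    int md_count;               /* models desc->bd_md_count */
    bool ack_requested;         /* models a put posted with LNET_ACK_REQ */
    int underflows;             /* decrements attempted at md_count == 0 */
};

/* Accounting as described in the report: every event flagged `unlinked`
 * decrements the counter. */
static void sim_callback_reported(struct sim_desc *d, const struct sim_event *ev)
{
    if (!ev->unlinked)
        return;
    if (d->md_count <= 0) {     /* the point where the assertion would fire */
        d->underflows++;
        return;
    }
    d->md_count--;
}

/* Sketch of the suggested direction: ignore `unlinked` and decrement only on
 * the last event expected for the MD (ACK if one was requested, else SEND). */
static void sim_callback_wait_ack(struct sim_desc *d, const struct sim_event *ev)
{
    enum sim_event_type last = d->ack_requested ? SIM_ACK : SIM_SEND;

    if (ev->type != last)
        return;
    if (d->md_count <= 0) {
        d->underflows++;
        return;
    }
    d->md_count--;
}

/* Replay the assumed trace: one MD, SEND then ACK, both flagged unlinked. */
static struct sim_desc sim_replay(void (*cb)(struct sim_desc *,
                                             const struct sim_event *))
{
    struct sim_desc d = { .md_count = 1, .ack_requested = true, .underflows = 0 };
    const struct sim_event trace[] = {
        { SIM_SEND, true },
        { SIM_ACK,  true },
    };

    for (unsigned i = 0; i < sizeof(trace) / sizeof(trace[0]); i++)
        cb(&d, &trace[i]);
    return d;
}
```

Calling `sim_replay(sim_callback_reported)` records one underflow, which corresponds to the failed assertion, while `sim_replay(sim_callback_wait_ack)` brings the counter to zero exactly once. The model ignores locking and the fact that the two events were handled on different CPUs in the log.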
| Comments |
| Comment by Alexey Lyashkov [ 09/Aug/18 ] |
|
I am not sure why an ACK is needed in this case. I think it is just additional overhead if enabled correctly, and the server is able to handle a partial transfer for now. |