[LU-15189] GDS support improvements and fixes. Created: 03/Nov/21 Updated: 09/Sep/22 Resolved: 30/May/22 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.16.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Alexey Lyashkov | Assignee: | Alexey Lyashkov |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch | ||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
The Whamcloud GDS code has several oddities. 2. GDS has a bug in its page state testing function. It causes the force_rdma flag to be set on the ptlrpc message buffer. The reason is this check:
bool nvfs_is_gpu_page(struct page *page)
{
    nvfs_mgroup_ptr_t nvfs_mgroup;
    nvfs_mgroup = __nvfs_mgroup_from_page(page, false);
    if (nvfs_mgroup == NULL) {
        return false;
    } else if (unlikely(IS_ERR(nvfs_mgroup))) {
        // This is a GPU page but we did not take reference as we are in shutdown path
        // But, we will return true to the caller so that caller doesn't think it is a
        // CPU page and fall back to CPU path
        return true; <<< true if no magic.
This is easy to detect: pass the force_rdma flag into the memory mapping function and check whether a force_rdma buffer can actually be mapped by the GDS code. Let's fix it. |
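The detection idea above can be illustrated with a small userspace sketch. It is only a model under stated assumptions, not the Lustre or nvidia-fs code and not the landed patch: `err_ptr()`/`is_err()` stand in for the kernel's `ERR_PTR()`/`IS_ERR()`, and `struct page`, `struct mgroup`, `mgroup_from_page()`, `is_gpu_page_lenient()`, `is_gpu_page_strict()` and `force_rdma_mismatch()` are hypothetical names invented for illustration. It shows how a boolean check that treats an error pointer as "GPU page" can mark a buffer force_rdma even though that buffer could not be mapped as a GPU buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's ERR_PTR()/IS_ERR() helpers. */
#define MAX_ERRNO 4095

static inline void *err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

static inline bool is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)(-MAX_ERRNO);
}

/* Hypothetical page model, for illustration only. */
enum page_kind { PAGE_CPU, PAGE_GPU, PAGE_UNVERIFIED };

struct mgroup { int refs; };

struct page {
    enum page_kind kind;
    struct mgroup *group;
};

/*
 * Hypothetical lookup with three outcomes: NULL for a CPU page, a valid
 * pointer for a GPU page, and an error pointer when the page could not
 * be verified as belonging to a GPU mapping.
 */
static struct mgroup *mgroup_from_page(const struct page *page)
{
    assert(page != NULL);
    switch (page->kind) {
    case PAGE_GPU:
        return page->group;
    case PAGE_UNVERIFIED:
        return err_ptr(-5); /* an -EIO style error pointer */
    default:
        return NULL;
    }
}

/* Pattern quoted in the ticket: the error case collapses to "true". */
static bool is_gpu_page_lenient(const struct page *page)
{
    struct mgroup *group = mgroup_from_page(page);

    if (group == NULL)
        return false;
    return true; /* also true when group is an error pointer */
}

/* Assumed stricter variant: only a valid group counts as a GPU page. */
static bool is_gpu_page_strict(const struct page *page)
{
    struct mgroup *group = mgroup_from_page(page);

    return group != NULL && !is_err(group);
}

/*
 * Detection as described in the ticket: carry the force_rdma decision
 * into the mapping step and report buffers that were flagged force_rdma
 * but cannot be mapped as GPU buffers.
 */
static bool force_rdma_mismatch(const struct page *page)
{
    bool force_rdma = is_gpu_page_lenient(page);
    bool mappable = is_gpu_page_strict(page);

    return force_rdma && !mappable;
}
```

In this model, only a page in the hypothetical `PAGE_UNVERIFIED` state triggers `force_rdma_mismatch()`; ordinary CPU pages and valid GPU pages do not.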
| Comments |
| Comment by Alexey Lyashkov [ 04/Nov/21 ] |
|
The WC GPU code lacks protection against GDS module unload; the existing code is racy - |
| Comment by Gerrit Updater [ 08/Nov/21 ] |
|
"Alexey Lyashkov <alexey.lyashkov@hpe.com>" uploaded a new patch: https://review.whamcloud.com/45480 |
| Comment by Gerrit Updater [ 08/Nov/21 ] |
|
"Alexey Lyashkov <alexey.lyashkov@hpe.com>" uploaded a new patch: https://review.whamcloud.com/45481 |
| Comment by Gerrit Updater [ 08/Nov/21 ] |
|
"Alexey Lyashkov <alexey.lyashkov@hpe.com>" uploaded a new patch: https://review.whamcloud.com/45482 |
| Comment by Gerrit Updater [ 06/Jan/22 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45481/ |
| Comment by Gerrit Updater [ 03/Mar/22 ] |
|
"Alexey Lyashkov <alexey.lyashkov@hpe.com>" uploaded a new patch: https://review.whamcloud.com/46692 |
| Comment by Gerrit Updater [ 30/May/22 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45480/ |
| Comment by Gerrit Updater [ 30/May/22 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45482/ |
| Comment by Peter Jones [ 30/May/22 ] |
|
Landed for 2.16 |