[LU-16064] RPC from evicted client can corrupt data Created: 02/Aug/22 Updated: 01/Dec/23 |
|
| Status: | In Progress |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.16.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Alex Zhuravlev | Assignee: | Alex Zhuravlev |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
When a client gets evicted, the OST cancels its locks but doesn't wait for its RPCs to complete. Another client can then get a conflicting lock and modify the data, while an in-progress RPC from the evicted client can still modify the data as well. We end up in a situation where the healthy client holding the LDLM lock has data/state in its cache that doesn't match the actual data stored on the OST. |
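The ordering described above can be reproduced with a small toy model. This is only an illustrative sketch: `ToyOST`, `run`, and the `drain_on_evict` flag are hypothetical names invented here, not Lustre code or APIs, and draining the evicted client's RPCs before releasing its lock is shown as one conceivable ordering, not as what any patch actually does.

```python
# Toy model only: none of these names are Lustre APIs.
class ToyOST:
    """One object, one exclusive lock, and a queue of in-progress writes."""

    def __init__(self, drain_on_evict):
        self.data = "old"
        self.lock_holder = None
        self.in_flight = []  # (client, payload): accepted but not yet applied
        self.drain_on_evict = drain_on_evict

    def grant_lock(self, client):
        assert self.lock_holder is None, "conflicting lock still held"
        self.lock_holder = client

    def start_write(self, client, payload):
        # The RPC passed the lock check and is now being processed.
        assert self.lock_holder == client
        self.in_flight.append((client, payload))

    def finish_writes(self, client=None):
        # Apply queued writes: all of them, or only those from one client.
        remaining = []
        for owner, payload in self.in_flight:
            if client is None or owner == client:
                self.data = payload
            else:
                remaining.append((owner, payload))
        self.in_flight = remaining

    def evict(self, client):
        if self.drain_on_evict:
            # Assumed alternative ordering: let the evicted client's RPCs
            # finish before its lock is dropped.
            self.finish_writes(client)
        if self.lock_holder == client:
            self.lock_holder = None


def run(drain_on_evict):
    ost = ToyOST(drain_on_evict)
    ost.grant_lock("A")
    ost.start_write("A", "from-A")  # A's RPC is in progress
    ost.evict("A")                  # A's lock is cancelled
    ost.grant_lock("B")             # B gets the conflicting lock
    ost.start_write("B", "from-B")
    ost.finish_writes("B")          # B's write lands
    cache_b = ost.data              # B caches this under its lock
    ost.finish_writes()             # any leftover RPC from A lands now
    return cache_b, ost.data


print(run(drain_on_evict=False))  # B's cache and the stored data differ
print(run(drain_on_evict=True))   # B's cache and the stored data agree
```

In the first run the evicted client's write is applied after the healthy client has cached its own data, which is the mismatch reported in this ticket. The second run avoids it in the toy model only; a real server that blocks on in-progress RPCs has to deal with cases this sketch ignores, such as an RPC whose own processing triggers the eviction.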
| Comments |
| Comment by Peter Jones [ 25/Aug/22 ] |
|
"Alex Zhuravlev <bzzz@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/48102 |
| Comment by Alex Zhuravlev [ 28/Nov/22 ] |
|
The approach taken in the patch has a problem: the MDS can get stuck if the RPC being processed needs to evict its own client. Not sure how to handle this yet, still thinking. |
| Comment by Alexey Lyashkov [ 01/Dec/23 ] |
|
Alex, if I understand correctly, this problem should be solved in a different way, and the fix would be much simpler. What do you think about it? |