[LU-11131] resent reint rpc failure due to reused reply data slot Created: 09/Jul/18 Updated: 15/Oct/19 Resolved: 18/Jul/18 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.12.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Vladimir Saveliev | Assignee: | Vladimir Saveliev |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||
| Severity: | 3 | ||||
| Rank (Obsolete): | 9223372036854775807 | ||||
| Description |
|
The following scenario leads to the failure of a resent reint rpc:
1. The mdt server has a number of rpcs being handled, among them rpc 1 from client A
2. Shutdown of the server starts
3. rpc 1 is processed and its reply data is added, but client A gets ENODEV
4. Shutdown reaches class_disconnect_exports() and links export A
5. The obd_zombid thread wakes up and destroys export A, which includes
6. Export B is still processing rpc 2 and looks for a free bit in
7. After failover, reply data gets restored with
8. Client A reconnects and resends its rpc 1. The server does not find |
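The slot reuse named in the issue title can be illustrated with a toy model. This is only a sketch under stated assumptions, not Lustre code: the class `ReplyDataTable`, its methods, and `run_scenario` are hypothetical names. Because the description above is truncated, the sketch assumes that destroying an export frees its slot bits in memory, that another export can then take the same slot, and that this overwrites the persisted reply record.

```python
class ReplyDataTable:
    """Hypothetical toy model of a reply data slot table (not Lustre code)."""

    def __init__(self, nslots):
        self.bitmap = [False] * nslots   # in-memory "slot in use" bits
        self.on_disk = [None] * nslots   # persisted (client, xid) records

    def add(self, client, xid):
        # Take the first free bit; raises ValueError if the table is full.
        slot = self.bitmap.index(False)
        self.bitmap[slot] = True
        self.on_disk[slot] = (client, xid)
        return slot

    def release_export(self, client):
        # Assumption: destroying an export frees its bits but leaves the
        # persisted records in place until the slots are reused.
        for slot, rec in enumerate(self.on_disk):
            if rec is not None and rec[0] == client:
                self.bitmap[slot] = False

    def restore(self):
        # After failover, only what is persisted can be recovered.
        return {rec for rec in self.on_disk if rec is not None}


def run_scenario():
    table = ReplyDataTable(nslots=4)
    slot_a = table.add("A", 1)       # rpc 1 from client A is processed
    table.release_export("A")        # export A is destroyed during shutdown
    slot_b = table.add("B", 2)       # export B reuses the freed slot
    restored = table.restore()       # failover: reply data is restored
    return slot_a, slot_b, restored


if __name__ == "__main__":
    slot_a, slot_b, restored = run_scenario()
    print("same slot reused:", slot_a == slot_b)
    print("reply for client A rpc 1 found:", ("A", 1) in restored)
```

In this model the resent rpc 1 from client A cannot be matched after the restore, because its record was overwritten by the record for rpc 2.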
| Comments |
| Comment by Gerrit Updater [ 09/Jul/18 ] |
|
Vladimir Saveliev (c17830@cray.com) uploaded a new patch: https://review.whamcloud.com/32798 |
| Comment by Gerrit Updater [ 18/Jul/18 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/32798/ |
| Comment by Peter Jones [ 18/Jul/18 ] |
|
Landed for 2.12 |