[LU-8420] unexpected? client eviction after bulk transfer timeout Created: 20/Jul/16 Updated: 07/Feb/17 Resolved: 07/Feb/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.10.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Vladimir Saveliev | Assignee: | WC Triage |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch | ||
| Issue Links: |
|
||||
| Severity: | 3 | ||||
| Rank (Obsolete): | 9223372036854775807 | ||||
| Description |
|
The following scenario, leading to the client's eviction, has been observed in acceptance testing:
1) client 1 owns a PW lock on file A and sends a write RPC to the OST
int tgt_brw_write(struct tgt_session_info *tsi)
...
rc = target_bulk_io(exp, desc, &lwi);
no_reply = rc != 0;
...
6) the blocking AST callback timer expires and the server evicts client 1
The AT settings managed to make the client's RPC timeout bigger than the blocking AST callback timeout. |
| Comments |
| Comment by Gerrit Updater [ 20/Jul/16 ] |
|
Vladimir Saveliev (vladimir_saveliev@xyratex.com) uploaded a new patch: http://review.whamcloud.com/21448 |
| Comment by Vladimir Saveliev [ 02/Dec/16 ] |
|
Two important points were not mentioned in this scenario:
4.2) The at_history period has passed since the worst RPC took place, so the service estimate drops.
5.2) The prolong tries to extend the lock callback timer using the decreased service estimate, so the prolong has no effect.
|
| Comment by Gerrit Updater [ 07/Feb/17 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/21448/ |
| Comment by Peter Jones [ 07/Feb/17 ] |
|
Landed for 2.10 |