[LU-9075] frequent mdt_hsm_update_request_state()/mdt_coordinator_cb() couple of error msgs when CDT has to deal with a huge backlog of actions Created: 03/Feb/17 Updated: 04/Aug/17 Resolved: 23/Apr/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.10.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Bruno Faccini (Inactive) | Assignee: | Bruno Faccini (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||
| Severity: | 3 | ||||
| Rank (Obsolete): | 9223372036854775807 | ||||
| Description |
|
The error messages "(mdt_coordinator.c:1473:mdt_hsm_update_request_state()) ... Cannot find running request for cookie ..." and "(mdt_coordinator.c:339:mdt_coordinator_cb()) ... cannot cleanup timed out request ..." frequently appear as a pair. They seem to be a consequence of the CDT being busy parsing a huge number of action LLOG records. The likely cause is that the messages concern active requests that have completed and have therefore already been removed from memory in mdt_hsm_update_request_state() (via mdt_cdt_remove_request(), in the context of an MDT thread handling the CT's MDS_HSM_PROGRESS requests), while the corresponding action LLOG record update is still stuck in mdt_agent_record_update(), waiting for the CDT to release cdt_llog_lock. A possible fix is to call mdt_agent_record_update() before mdt_cdt_remove_request() in mdt_hsm_update_request_state(). |
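The ordering issue described above can be illustrated with a small standalone model. This is only a sketch and not Lustre code: `struct model`, `scan_sees_orphan()` and `complete_request()` are hypothetical names, and the assumption is that the coordinator complains when it finds an action record still marked as started with no matching in-memory request.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the real CDT state. */
enum rec_status { REC_STARTED, REC_DONE };

struct model {
	bool in_memory;        /* request still in the in-memory request list */
	enum rec_status llog;  /* status stored in the action LLOG record */
};

/*
 * What a coordinator scan would observe: a record still marked STARTED
 * with no matching in-memory request is the inconsistent state assumed
 * to trigger the error messages.
 */
static bool scan_sees_orphan(const struct model *m)
{
	return m->llog == REC_STARTED && !m->in_memory;
}

/*
 * Complete a request in one of the two possible orders and report
 * whether a scan running between the two steps would see an orphan.
 */
static bool complete_request(struct model *m, bool update_record_first)
{
	bool orphan;

	if (update_record_first) {
		/* proposed order: record update, then removal from memory */
		m->llog = REC_DONE;
		orphan = scan_sees_orphan(m);
		m->in_memory = false;
	} else {
		/* reported order: removal first, record update delayed */
		m->in_memory = false;
		orphan = scan_sees_orphan(m);
		m->llog = REC_DONE;
	}
	return orphan;
}
```

Calling `complete_request()` with `update_record_first` set to `false` on a request that is in memory with a started record reports an orphan in the window between the two steps. With `true`, the record is already marked done before the request leaves memory, so the window is closed. The real code additionally has to wait for cdt_llog_lock, which this model does not represent.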
| Comments |
| Comment by Gerrit Updater [ 03/Feb/17 ] |
|
Faccini Bruno (bruno.faccini@intel.com) uploaded a new patch: https://review.whamcloud.com/25243 |
| Comment by Gerrit Updater [ 23/Apr/17 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/25243/ |
| Comment by Peter Jones [ 23/Apr/17 ] |
|
Landed for 2.10 |