[LU-1373] ptlrpcd shouldn't do disk I/O Created: 04/May/12  Updated: 13/Jun/12  Resolved: 13/Jun/12

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.3.0
Fix Version/s: Lustre 2.3.0

Type: Bug Priority: Blocker
Reporter: Johann Lombardi (Inactive) Assignee: Johann Lombardi (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 4594

 Description   

The patch set that added imperative recovery in 2.2 changed how servers send ASTs.
AST requests used to be sent by the service thread itself; they are now sent by the ptlrpcd threads.
The drawback is that a ptlrpcd thread can now do disk I/O to update the LVB when a callback fails to be sent:
ldlm_cb_interpret
-> ldlm_handle_ast_error
-> ldlm_res_lvbo_update

Although we now have multiple ptlrpcd threads, it is still a bad idea to block a ptlrpcd thread for an indefinite amount of time.

I think we can restore the original logic (i.e. a single request set managed by the service thread) while still addressing the needs of imperative recovery, which wants to notify all client nodes ASAP rather than wait for every AST in the set to complete before sending the next wave of ASTs.
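
To make the idea concrete, here is a minimal, self-contained C sketch of a request set that the calling thread drives itself and refills as soon as any request completes, instead of waiting for the whole wave. It is only an illustration under stated assumptions: it is not the attached patch and does not use the ptlrpc API. The names ast_req, window_stats, interpret and send_asts_windowed are hypothetical, and network latency is simulated as a number of poll rounds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one AST request (not a Lustre structure). */
struct ast_req {
    int client;      /* hypothetical client id */
    int rounds_left; /* simulated latency, in poll rounds */
    int done;        /* set once the interpret callback has run */
};

/* Counters collected while the set is driven. */
struct window_stats {
    int sent;
    int completed;
    int interpret_calls;
    int peak_in_flight;
    int rounds;
};

/*
 * Stand-in for the interpret callback. It runs in the thread that drives
 * the set (the service thread in the ticket's proposal), so blocking here
 * would not stall a shared daemon thread.
 */
static void interpret(struct ast_req *req, struct window_stats *st)
{
    req->done = 1;
    st->interpret_calls++;
}

/*
 * Drive all requests with at most `window` in flight. A freed slot is
 * reused on the next round, without waiting for the rest of the wave.
 */
static struct window_stats send_asts_windowed(struct ast_req *reqs,
                                              int nreqs, int window)
{
    struct window_stats st = {0};
    int next = 0;      /* index of the next request to send */
    int in_flight = 0;

    if (window < 1)
        window = 1;    /* guard against a window that never sends */

    while (st.completed < nreqs) {
        /* Refill: send new requests while slots are free. */
        while (in_flight < window && next < nreqs) {
            next++;
            in_flight++;
            st.sent++;
        }
        if (in_flight > st.peak_in_flight)
            st.peak_in_flight = in_flight;
        st.rounds++;

        /* One poll round over every request sent so far. */
        for (int i = 0; i < next; i++) {
            if (reqs[i].done)
                continue;
            if (reqs[i].rounds_left > 0)
                reqs[i].rounds_left--;
            if (reqs[i].rounds_left == 0) {
                interpret(&reqs[i], &st);
                st.completed++;
                in_flight--;
            }
        }
    }
    assert(in_flight == 0);
    return st;
}
```

With four requests whose simulated latencies are 5, 1, 1 and 1 rounds and a window of 2, this sketch finishes in 5 rounds, whereas waiting for each full wave of two before sending the next would take 6 (5 for the first wave, 1 for the second).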

I'm going to attach a patch, which is also useful for quota: there we need to process glimpse ASTs like all other ASTs, and we need to do I/O in the interpret function.



 Comments   
Comment by Johann Lombardi (Inactive) [ 04/May/12 ]

Patch extracted from orion_quota branch:
http://review.whamcloud.com/2650

Comment by Peter Jones [ 13/Jun/12 ]

Landed for 2.3
