[LU-10273] condition of l_wait_event is abused Created: 03/Feb/15  Updated: 23/Nov/17

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Liang Zhen (Inactive) Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 17342

 Description   

The condition of l_wait_event() really should be a boolean expression or a very lightweight, non-sleeping function; otherwise we may run into very unexpected situations. However, this is not always the case: some condition functions are very expensive. A typical example is ptlrpcd_check() in ptlrpcd, which may iterate over a long list and send RPCs. Even worse, it can call cond_resched(). This is very dangerous because l_wait_event() sets the state of the current thread to TASK_INTERRUPTIBLE before evaluating the condition; if ptlrpcd_check() then calls schedule() and no new event arrives for this ptlrpcd, the thread may hang forever.
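
The hazard described above can be illustrated with a small userspace toy model. This is only a sketch, not Lustre or kernel code: `struct task`, `model_schedule()`, `model_expensive_check()` and the two `*_hangs()` helpers are hypothetical names invented for the illustration. The model assumes, as the description does, that a task which yields the CPU while not in TASK_RUNNING and with no pending wakeup never runs again. The real behaviour of cond_resched() and schedule() depends on the kernel version and is not modelled here.

```c
#include <stdbool.h>

/* Toy task states, named after the kernel ones for readability. */
enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE };

struct task {
	enum task_state state;
	bool work_ready;	/* cheap flag a lightweight condition can test */
	bool hung;		/* set if the task yields while not runnable */
};

/*
 * Hypothetical stand-in for schedule(). Assumption of this model:
 * yielding while not TASK_RUNNING, with no wakeup pending, means the
 * task is never scheduled again.
 */
static void model_schedule(struct task *t)
{
	if (t->state != TASK_RUNNING)
		t->hung = true;
}

/* Hypothetical expensive condition that yields the CPU part-way through. */
static bool model_expensive_check(struct task *t)
{
	model_schedule(t);
	return false;		/* no work found this time */
}

/* Pattern from the description: heavy condition runs after the state change. */
bool heavy_condition_hangs(void)
{
	struct task t = { TASK_RUNNING, false, false };

	t.state = TASK_INTERRUPTIBLE;		/* wait macro marks the task as sleeping */
	(void)model_expensive_check(&t);	/* condition evaluated in that state */
	return t.hung;
}

/* Alternative: do the heavy work while runnable, then wait on a cheap flag. */
bool light_condition_hangs(void)
{
	struct task t = { TASK_RUNNING, false, false };

	t.work_ready = model_expensive_check(&t);	/* still TASK_RUNNING here */
	t.state = TASK_INTERRUPTIBLE;
	if (!t.work_ready) {
		/* real code would sleep here until a waker sets work_ready */
	}
	t.state = TASK_RUNNING;
	return t.hung;
}
```

In this model heavy_condition_hangs() reports a hang and light_condition_hangs() does not, because only a plain flag is tested after the state change. Whether a given caller such as ptlrpcd can be restructured this way would have to be checked case by case.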

It could be difficult to check all cases and fix them, but we need a ticket as a memo...

