[LU-9886] Crash in upstream kiblnd_handle_early_rxs() Created: 17/Aug/17  Updated: 14/May/18  Resolved: 14/May/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Upstream
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Doug Oucharek (Inactive) Assignee: Doug Oucharek (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-9679 Prepare lustre for adoption into the ... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

In upstream staging commit 5a2ca43fa54f561c252c2, the list handling code in kiblnd_handle_early_rxs() was changed to use list_for_each_safe(). That only protects against the current thread deleting the entry it is currently looking at. It does not protect against another thread deleting the next item in the list (which the tmp variable points to). The way this routine holds and then releases a lock during the loop leaves a window for other threads to do just that.

We have triggered a crash from this, so the race, although rare, does happen in practice.

Please revert this commit on this routine. The change never made it to the community repo.
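
For illustration only, here is a minimal user-space sketch of the race described above. It is not the o2iblnd code: the list helpers are simplified stand-ins for the kernel's list API, and struct node, other_thread_steals_first(), drain_with_cached_next(), drain_rechecking_head() and demo_stale_hits() are hypothetical names. The "other thread" is simulated by a plain function call placed where the lock would be dropped, and removed entries are never freed, so the stale pointer can be counted instead of crashing.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's circular doubly linked list. */
struct node {
    struct node *next, *prev;
    int on_list;                /* 1 while linked; used to detect stale pointers */
};

static void list_init(struct node *head)
{
    head->next = head->prev = head;
    head->on_list = 1;
}

static void list_add_tail(struct node *n, struct node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
    n->on_list = 1;
}

/* Unlink n but keep its own pointers, so the demo stays well defined. */
static void list_del(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->on_list = 0;
}

static int list_empty(const struct node *head)
{
    return head->next == head;
}

/* Simulates another thread that runs while the lock is dropped and
 * removes whatever entry is now first on the list. */
static void other_thread_steals_first(struct node *head)
{
    if (!list_empty(head))
        list_del(head->next);
}

/* Same shape as list_for_each_safe(): the next entry is cached in tmp
 * before the body runs. Returns how often the loop landed on an entry
 * that had already been removed by the "other thread". */
static int drain_with_cached_next(struct node *head)
{
    struct node *pos, *tmp;
    int stale = 0;

    for (pos = head->next, tmp = pos->next; pos != head;
         pos = tmp, tmp = pos->next) {
        if (!pos->on_list) {    /* in the kernel this could be freed memory */
            stale++;
            continue;
        }
        list_del(pos);
        /* lock would be released here ... */
        other_thread_steals_first(head);
        /* ... and re-taken here */
    }
    return stale;
}

/* Re-reads the list head on every pass, so no pointer outlives the
 * unlocked window. */
static int drain_rechecking_head(struct node *head)
{
    int stale = 0;

    while (!list_empty(head)) {
        struct node *pos = head->next;

        if (!pos->on_list)      /* cannot happen: head->next is always linked */
            stale++;
        list_del(pos);
        other_thread_steals_first(head);
    }
    return stale;
}

/* Builds a four-entry list and drains it with one of the two loops. */
static int demo_stale_hits(int use_cached_next)
{
    struct node head, n[4];
    int i, stale;

    list_init(&head);
    for (i = 0; i < 4; i++)
        list_add_tail(&n[i], &head);
    stale = use_cached_next ? drain_with_cached_next(&head)
                            : drain_rechecking_head(&head);
    assert(list_empty(&head));
    return stale;
}
```

Under these assumptions, demo_stale_hits(1) returns 2 (the cached tmp pointer twice refers to an entry that is no longer on the list), while demo_stale_hits(0) returns 0, because the loop that re-reads the head after every unlocked window never holds a pointer across it.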



 Comments   
Comment by Peter Jones [ 17/Aug/17 ]

Thanks Doug. Oleg will take care of this.

Comment by James A Simmons [ 17/Aug/17 ]

Sigh. Those stupid list_for_each_safe() changes. Thankfully Patrick Farrell put an end to those patches.

Comment by Patrick Farrell (Inactive) [ 18/Aug/17 ]

.... I did? Huh.

Comment by James A Simmons [ 14/May/18 ]

Patch landed in staging-next as commit a8da8e528cb0a7f5f7ad9880f13c3359cfb31181. It will land in Linus' tree during the 4.18-rc1 window.
