[LU-13986] livelock is possible in distribute_txn_commit_thread() Created: 25/Sep/20  Updated: 12/Oct/20  Resolved: 12/Oct/20

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.14.0

Type: Bug
Priority: Minor
Reporter: Neil Brown
Assignee: Neil Brown
Resolution: Fixed
Votes: 0
Labels: None

Issue Links:
Related
is related to LU-12780 Avoid using ptlrpc_thread where is in... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

A recent patch to update_trans.c changed how distribute_txn_commit_thread() waits for more work to do.

It previously had an explicit "wait_event()" which listed all the conditions to wait for. It would then recheck each condition and possibly perform an appropriate action.

It was changed to check each condition only once per loop iteration. If a condition was true, the action would be performed and a flag set. If no conditions were true (as indicated by the flag), it would wait; otherwise it would loop and recheck all conditions.

One of the "if (condition) { do work }" stanzas in the loop tested a condition that should not wake up the loop: "batchid" was not tested at all in the original wait_event(). The flag mentioned above was, however, set whenever that condition tested true.
Because nothing in the loop waits on that condition, the flag can be set on every pass, so the loop can spin indefinitely without ever sleeping.

The "__set_current_state(TASK_RUNNING);" should be removed so that the value
of batchid cannot stop the loop from sleeping (calling 'schedule()').
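
The failure mode described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the code in update_trans.c: the struct, its field names and passes_until_sleep() are hypothetical, the "flag" stands in for the task state being set back to TASK_RUNNING, and the model assumes that the batchid action leaves its condition true.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for what the thread checks; not a Lustre type. */
struct loop_model {
	bool work_pending;   /* a condition that wakers do signal */
	bool batchid_ready;  /* the "batchid" condition: no wakeup is tied to it */
};

/*
 * Run the simplified loop and return the pass on which the thread would
 * sleep (reach schedule() with the flag clear), or -1 if it is still
 * spinning after max_passes.
 */
static int passes_until_sleep(struct loop_model *m, bool batchid_sets_flag,
			      int max_passes)
{
	assert(max_passes > 0);

	for (int pass = 1; pass <= max_passes; pass++) {
		/* Models the task state being reset to TASK_RUNNING. */
		bool did_work = false;

		if (m->work_pending) {
			m->work_pending = false;  /* the action consumes the work */
			did_work = true;
		}

		if (m->batchid_ready) {
			/* Assumption: the action leaves the condition true. */
			if (batchid_sets_flag)
				did_work = true;
		}

		if (!did_work)
			return pass;  /* the real thread would sleep here */
	}
	return -1;  /* never slept: the livelock case */
}
```

In this model, with both conditions initially true, passes_until_sleep(&m, true, 1000) returns -1 (the loop never sleeps), while passes_until_sleep(&m, false, 1000) returns 2: one pass to consume the pending work and one pass that finds nothing to do.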



 Comments   
Comment by Gerrit Updater [ 25/Sep/20 ]

Neil Brown (neilb@suse.de) uploaded a new patch: https://review.whamcloud.com/40043
Subject: LU-13986 target: fix possible liveloop in distribute_txn thd
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: a6caa6ac9d871e549b2b1bfdaa4118b53e161dd4

Comment by Gerrit Updater [ 12/Oct/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/40043/
Subject: LU-13986 target: fix possible liveloop in distribute_txn thd
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: d05cab2fd7b9b38cc8414dcb03dbcc7b9ed31696
