[LU-15205] cfs_hash rehashing can race with hash iteration Created: 10/Nov/21  Updated: 15/Nov/21  Resolved: 15/Nov/21

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Upstream

Type: Bug Priority: Minor
Reporter: Alex Zhuravlev Assignee: WC Triage
Resolution: Duplicate Votes: 0
Labels: None

Issue Links:
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

When a thread leaves a cfs_hash iteration, it checks in cfs_hash_for_each_exit() whether it is time to rehash:

	if (bits > 0) {
		cfs_hash_rehash(hs, atomic_read(&hs->hs_count) <
				    CFS_HASH_LOOP_HOG);
	}

cfs_hash_rehash() can then start rehashing immediately, in the calling thread rather than via the worker thread:

	if (!do_rehash) {
		/* launch and return */
		queue_work(cfs_rehash_wq, &hs->hs_rehash_work);
		cfs_hash_unlock(hs, 1);
		return;
	}

	/* rehash right now */
	cfs_hash_unlock(hs, 1);

	cfs_hash_rehash_worker(&hs->hs_rehash_work);

If another thread starts iterating the same hash at this point, it can find the hash being rehashed, and it has no way to wait for the rehash to complete. The check below works only when rehashing runs in a dedicated thread:

	if (cfs_hash_is_rehashing(hs))
		cfs_hash_rehash_cancel(hs);

So the second thread simply proceeds and hits the following assertion while the hash is still being rehashed:

	LASSERT(!cfs_hash_is_rehashing(hs));
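
The interleaving described above can be replayed in a small single-threaded model. This is only a sketch: every toy_* name is hypothetical and not a libcfs symbol, and it assumes, as the description states, that the cancel takes effect only when the rehash is owned by the worker thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the hash state; not the real struct cfs_hash. */
struct toy_hash {
	bool rehashing;   /* a rehash is pending or in progress */
	bool work_queued; /* the rehash is owned by the worker, not a caller */
};

/* Models cfs_hash_rehash(): defer to the worker, or rehash in the caller. */
static void toy_rehash_start(struct toy_hash *hs, bool do_rehash)
{
	assert(!hs->rehashing);
	hs->rehashing = true;
	hs->work_queued = !do_rehash;
}

/* Models the cancel: assumed to help only for worker-owned rehashing. */
static void toy_rehash_cancel(struct toy_hash *hs)
{
	if (hs->work_queued) {
		hs->work_queued = false;
		hs->rehashing = false;
	}
	/* An inline rehash in another thread is left running. */
}

/*
 * Models a second thread entering iteration while the rehash started by
 * the first thread has not finished. Returns true when the
 * "not rehashing" assertion would hold for the iterator.
 */
static bool toy_iter_enter(struct toy_hash *hs)
{
	if (hs->rehashing)
		toy_rehash_cancel(hs);
	return !hs->rehashing;
}

/* Replays the scenario: thread A starts a rehash, thread B iterates. */
static bool toy_race_scenario(bool inline_rehash)
{
	struct toy_hash hs = { false, false };

	toy_rehash_start(&hs, inline_rehash); /* thread A, on iteration exit */
	return toy_iter_enter(&hs);           /* thread B, on iteration enter */
}
```

In this model toy_race_scenario(false) returns true, because the deferred rehash is cancelled before iteration starts, while toy_race_scenario(true) returns false, which corresponds to the failing LASSERT.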


 Comments   
Comment by Gerrit Updater [ 10/Nov/21 ]

"Alex Zhuravlev <bzzz@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/45516
Subject: LU-15205 libcfs: disable inline rehashing
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 4cfe4a0c5a218399351d8df9bf845e74cd45c5a2

Comment by Alex Zhuravlev [ 15/Nov/21 ]

Will be solved in LU-15207.
