[LU-15207] ASSERTION( !cfs_hash_is_rehashing(hs) Created: 11/Nov/21  Updated: 30/Nov/23  Resolved: 11/Jun/22

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Upstream
Fix Version/s: Lustre 2.16.0

Type: Bug Priority: Minor
Reporter: Alex Zhuravlev Assignee: Alex Zhuravlev
Resolution: Fixed Votes: 0
Labels: LTS15

Issue Links:
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   
cfs_hash_rehash(struct cfs_hash *hs, int do_rehash)
{
...
	hs->hs_rehash_bits = rc;
	if (!do_rehash) {
		/* launch and return */
		queue_work(cfs_rehash_wq, &hs->hs_rehash_work);
		cfs_hash_unlock(hs, 1);
		return;
	}
...
}

cfs_hash_rehash_cancel(struct cfs_hash *hs)
{
	LASSERT(cfs_hash_with_rehash(hs));
	cancel_work_sync(&hs->hs_rehash_work);
}

cfs_hash_for_each_enter(struct cfs_hash *hs)
{
...
	if (cfs_hash_is_rehashing(hs))
		cfs_hash_rehash_cancel(hs);
}

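If we enter iteration (cfs_hash_for_each_enter()) and find a rehash scheduled or in progress (hs_rehash_bits != 0), we cancel that work with cancel_work_sync().
But if the work has not started yet, cancel_work_sync() just removes it from the queue and the worker never runs, so who would reset hs_rehash_bits back to 0? Nobody does, so cfs_hash_is_rehashing(hs) stays true and the assertion in the title can trip.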
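
The stale flag can be reproduced outside the kernel with a minimal userspace model. Everything in it (toy_hash, toy_work and the toy_* helpers) is hypothetical and only imitates the behaviour described above: cancelling a work item that is still pending dequeues it without running it, so the cleanup the worker would have done never happens.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a work item: "pending" means queued, not started. */
struct toy_work {
	bool pending;
};

/* Hypothetical stand-in for struct cfs_hash; only the field of interest. */
struct toy_hash {
	int rehash_bits;	/* non-zero: rehash scheduled or running */
	struct toy_work rehash_work;
};

/* Imitates cfs_hash_is_rehashing(): true while rehash_bits is non-zero. */
static bool toy_is_rehashing(const struct toy_hash *hs)
{
	return hs->rehash_bits != 0;
}

/* In this model the worker is the only place that clears rehash_bits. */
static void toy_rehash_worker(struct toy_hash *hs)
{
	hs->rehash_work.pending = false;
	hs->rehash_bits = 0;
}

/* Schedule a rehash: set the flag first, then queue the work. */
static void toy_rehash_schedule(struct toy_hash *hs, int bits)
{
	hs->rehash_bits = bits;
	hs->rehash_work.pending = true;
}

/*
 * Cancel work that has not started: the item is dequeued and the worker
 * never runs. Returns true if the work was still pending.
 */
static bool toy_cancel_work(struct toy_work *work)
{
	bool was_pending = work->pending;

	work->pending = false;
	return was_pending;
}

/* Schedule, cancel before the worker runs, report whether the flag is stale. */
static bool toy_demo_stale_flag(void)
{
	struct toy_hash hs = { 0 };

	toy_rehash_schedule(&hs, 4);
	toy_cancel_work(&hs.rehash_work);
	return toy_is_rehashing(&hs);	/* nobody reset rehash_bits */
}
```

If the worker does get to run (toy_rehash_worker()), the flag is cleared as expected; only the cancel-before-start path leaves it set.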

A trivial test demonstrates this:

static struct workqueue_struct *test_rehash_wq;
struct work_struct test_rehash_work;
static volatile int test_var = 0;

/* the worker is the only place that resets test_var */
static void test_worker(struct work_struct *work)
{
       test_var = 0;
}

void work_test(void)
{
       test_rehash_wq = alloc_workqueue("test", WQ_SYSFS, 4);
       INIT_WORK(&test_rehash_work, test_worker);
       test_var = 1;
       queue_work(test_rehash_wq, &test_rehash_work);
       /* may cancel the work before test_worker() ever runs */
       cancel_work_sync(&test_rehash_work);
       /* fails if the work was cancelled while still pending */
       LASSERT(test_var == 0);
       destroy_workqueue(test_rehash_wq);
}
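
One way to make a test like this pass is for the canceller to take over the worker's cleanup whenever the cancel reports that the work was still pending. The sketch below is a userspace model with hypothetical names (fake_work, fake_queue_work, fake_cancel_work, fake_test_fixed), not the merged patch. The boolean return imitates the kernel's cancel_work_sync(), which returns true if the work was pending.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical work item: "pending" means queued but not yet started. */
struct fake_work {
	bool pending;
	void (*fn)(void);
};

static int fake_var;

/* Same role as test_worker() in the test: the worker resets the state. */
static void fake_worker(void)
{
	fake_var = 0;
}

/* Queue the work; in this model it stays pending until cancelled. */
static void fake_queue_work(struct fake_work *w)
{
	w->pending = true;
}

/* Dequeue without running; returns true if the work was still pending. */
static bool fake_cancel_work(struct fake_work *w)
{
	bool was_pending = w->pending;

	w->pending = false;
	return was_pending;
}

/* Returns the final fake_var; 0 means the state was cleaned up. */
static int fake_test_fixed(void)
{
	struct fake_work w = { .pending = false, .fn = fake_worker };

	fake_var = 1;
	fake_queue_work(&w);
	if (fake_cancel_work(&w))
		fake_var = 0;	/* worker never ran, so the canceller resets */
	return fake_var;
}
```

In real code the reset would presumably have to happen under the same lock that protects hs_rehash_bits; the model ignores locking entirely.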


 Comments   
Comment by Alex Zhuravlev [ 11/Nov/21 ]

Seems to have been introduced with LU-9859 ("libcfs: use a workqueue for rehash work").

Comment by Gerrit Updater [ 11/Nov/21 ]

"Alex Zhuravlev <bzzz@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/45533
Subject: LU-15207 libcfs: reset hs_rehash_bits
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 2dc2b3978b840f41d3544336a9acebedba467740

Comment by Gerrit Updater [ 11/Jun/22 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45533/
Subject: LU-15207 libcfs: reset hs_rehash_bits
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 9257f24dfdf9f0a68512fce52d79064f78d9dc88

Comment by Peter Jones [ 11/Jun/22 ]

Landed for 2.16

Comment by Gerrit Updater [ 29/Nov/23 ]

"Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/53283
Subject: LU-15207 libcfs: reset hs_rehash_bits
Project: fs/lustre-release
Branch: b2_15
Current Patch Set: 1
Commit: f57c484fa7a15ab97fe44b6bd88d598fbcfec622

Generated at Sat Feb 10 03:16:21 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.