[LU-7507] the data version doesn't always change after a layout swap Created: 01/Dec/15  Updated: 18/Apr/17  Resolved: 18/Apr/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Frank Zago (Inactive) Assignee: Henri Doreau (Inactive)
Resolution: Duplicate Votes: 0
Labels: None

Issue Links:
Blocker
is blocking LU-6081 hsm: add file migrate support Open
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

After a successful layout swap, the data version of a file is sometimes not changed.

llapi_get_data_version(fd1, &dv1, LL_DV_RD_FLUSH);  -> get data version
llapi_fswap_layouts(fd1, fd2, ....) -> success
llapi_get_data_version(fd1, &new_dv1, LL_DV_RD_FLUSH);  -> get same data version

Sometimes the data version is different and sometimes it is not, so some caching might be involved.

I added some calls to sync() and fsync() on both file descriptors just after the call to layout swap, but that didn't fix the problem.

I pushed a reproducer to http://review.whamcloud.com/#/c/13441/. To enable it, change "#if 0" to "#if 1" in swap_layout_test.c:test42(). The calls to tests 30 and 31 should be disabled too, since they are unrelated and take too long to execute:

	if (0) PERFORM(test30);
	if (0) PERFORM(test31);


 Comments   
Comment by Frank Zago (Inactive) [ 01/Dec/15 ]

A failed test:

Starting test test42 at 1448994477
DV = 100000c33 and 100000c33
swap_lock_test: swap_lock_test.c:886: test42: assertion 'dv1 != new_dv1' failed: got identical dataversion for fd1: 100000c33

A successful test:

Starting test test42 at 1448994475
DV = 100000b26 and 100000b24
new DV = 100000b24 and 100000b26
DV= 100000b24 and 100000b26
Finishing test test42 at 1448994475

Comment by Gerrit Updater [ 20/May/16 ]

Henri Doreau (henri.doreau@cea.fr) uploaded a new patch: http://review.whamcloud.com/20346
Subject: LU-7507 tests: fix dataversion regression test
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 7d2512158cfb0cb1e5798b28f66a01a804d8a1e5

Comment by Henri Doreau (Inactive) [ 20/May/16 ]

Isn't it just that we get two identical dataversion values for the two objects, and are therefore unable to distinguish between changed/not-changed? See my patch above (which sometimes gets stuck on LU-7073, but that is another issue).

Comment by Peter Jones [ 18/Apr/17 ]

AFAICT this was tracked under LU-8157
