Details
Type: Bug
Resolution: Unresolved
Priority: Minor
Description
With the patch https://review.whamcloud.com/47487 ("LU-15546 mdt: mdt_reint_open lookup before locking"), the OBD_FAIL_MDS_REINT_OPEN2 race times out in sanityn test_41i:
LustreError: 3945:0:(libcfs_fail.h:178:cfs_race()) cfs_fail_race id 16a awake: rc=0
Now the first thread takes a PW parent lock (because it checks for the child's existence before locking), so the second thread waits for the lock (PR locks are compatible with each other, but PW locks are not).
We have to force the first thread to take a PR parent lock to keep testing the full lock cycle:
- take PR parent lock
- look up child (does not exist)
- take PW parent lock
- re-lookup
- create child
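The blocking described above can be illustrated with a small toy model. This is only a sketch, not Lustre code: the table `COMPATIBLE`, the helper `can_grant`, and the list `FULL_LOCK_CYCLE` are hypothetical names introduced for illustration, and the table covers only the PR and PW modes discussed in this ticket.

```python
# Toy model only: not Lustre code. All names here are hypothetical.

# Compatibility between a requested mode and a mode already held on the parent.
# PR (protected read) locks can coexist; PW (protected write) excludes both.
COMPATIBLE = {
    ("PR", "PR"): True,
    ("PR", "PW"): False,
    ("PW", "PR"): False,
    ("PW", "PW"): False,
}


def can_grant(requested, held_modes):
    """Return True if `requested` is compatible with every mode already held."""
    return all(COMPATIBLE[(requested, held)] for held in held_modes)


# The full lock cycle the test is meant to exercise, as listed in the ticket.
FULL_LOCK_CYCLE = [
    "take PR parent lock",
    "look up child (does not exist)",
    "take PW parent lock",
    "re-lookup",
    "create child",
]

# Before the lookup-before-locking change: the first thread holds PR,
# so a second thread asking for PR is granted and both reach the race point.
second_runs_with_pr_holder = can_grant("PR", ["PR"])

# After the change: the first thread holds PW, so the second thread
# cannot be granted either mode and has to wait.
second_runs_with_pw_holder = can_grant("PR", ["PW"]) or can_grant("PW", ["PW"])

print(second_runs_with_pr_holder, second_runs_with_pw_holder)
```

In this model the second thread can only proceed alongside the first when the first holds a PR parent lock, which is why the test needs to force the first thread down the PR path.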
"Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/47506
Subject: LU-15907 mdt: fix the OBD_FAIL_MDS_REINT_OPEN2 race
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: d514137a5bf5bc53c2c35fb2c81f840813c45212