[LU-14981] ldlm_handle_cp_callback() RETURN() among GOTO(out, ...)s. Created: 03/Sep/21 Updated: 18/Oct/21 Resolved: 18/Oct/21 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | John Hammond | Assignee: | WC Triage |
| Resolution: | Not a Bug | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
7fff052c930da4822c3b2a13d130da7473a20a58 (https://review.whamcloud.com/19898
if (!ldlm_res_eq(&dlm_req->lock_desc.l_resource.lr_name,
&lock->l_resource->lr_name)) {
ldlm_resource_unlink_lock(lock);
unlock_res_and_lock(lock);
rc = ldlm_lock_change_resource(ns, lock,
&dlm_req->lock_desc.l_resource.lr_name);
if (rc < 0) {
LDLM_ERROR(lock, "Failed to allocate resource");
GOTO(out, rc);
}
LDLM_DEBUG(lock, "completion AST, new resource");
lock_res_and_lock(lock);
}
if (ldlm_is_failed(lock)) {
unlock_res_and_lock(lock);
LDLM_LOCK_RELEASE(lock);
RETURN(-EINVAL);
}
if (ldlm_is_destroyed(lock) ||
ldlm_is_granted(lock)) {
/* b=11300: the lock has already been granted */
unlock_res_and_lock(lock);
LDLM_DEBUG(lock, "Double grant race happened");
GOTO(out, rc = 0);
}
...
out:
if (rc < 0) {
lock_res_and_lock(lock);
ldlm_set_failed(lock);
unlock_res_and_lock(lock);
wake_up(&lock->l_waitq);
}
LDLM_LOCK_RELEASE(lock);
return 0;
}
Is this correct? |
| Comments |
| Comment by Oleg Drokin [ 03/Sep/21 ] |
|
This appears to be correct, if you check the out path, we set the failed flag on the lock first and then wake up all others. The RETURN call is from the failed check so it came from a racing thread that just set this failed flag and woke us up. the only other place that sets the failed flag is in cleanup_resource() when we are cleaning up all resources anyway and it wakes the queue via the call to either completion callback or cancel. |
| Comment by Oleg Drokin [ 03/Sep/21 ] |
|
Just to clarify, if there's a desire to insert a wakeup call in that exit path for whatever "just in case" or future-proofing reasons - I'd have no objections to this. It's just after the code inspection I do not see a bug here. |