[LU-17415] ldlm_cli_inodebits_convert() should not grant locks being cancelled Created: 11/Jan/24 Updated: 13/Jan/24 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Alex Zhuravlev | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
ldlm_cli_inodebits_convert() releases the resource's spinlock to call the blocking AST callback:
unlock_res_and_lock(lock);
lock->l_blocking_ast(lock, &ld, lock->l_ast_data, LDLM_CB_CANCELING);
/* now notify server about convert */
rc = ldlm_cli_convert_req(lock, &flags, new_bits);
lock_res_and_lock(lock);
While the lock is dropped, a concurrent thread can step in and start cancelling the lock, which the original thread later observes (LDLM_FL_CANCELING is set in the flags below):
[ 2955.447199] LustreError: 15208:0:(ldlm_lock.c:1094:ldlm_grant_lock_with_skiplist()) ### not granted ns: lustre-MDT0000-mdc-ffff91cf1df76000 lock: 0000000077ba152a/0x9a9735f35bb0415d lrc: 1/0,0 mode: --/PR res: [0x200000402:0x176d:0x0].0x0 bits 0x58/0x2 rrc: 5 type: IBT gid 0 flags: 0x814829402000020 nid: local remote: 0x9a9735f35bb04179 expref: -99 pid: 251837 timeout: 0 lvb_type: 3
#define LDLM_FL_CANCELING 0x0000008000000000ULL // bit 39
The original thread then hits an assertion:
[ 2955.448028] LustreError: 15208:0:(ldlm_lock.c:1095:ldlm_grant_lock_with_skiplist()) ASSERTION( ldlm_is_granted(lock) ) failed:
It makes sense to check for cancelling before ldlm_cli_inodebits_convert() grants the lock. |
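The flag analysis above can be double-checked with a small user-space sketch. It uses only the LDLM_FL_CANCELING value and the flags word quoted in the log line of this ticket. The helper names (flags_say_canceling, may_grant_after_convert, check_logged_flags) are hypothetical and are not Lustre functions, and the "do not grant if cancelling" rule is only a model of the check suggested in the description, not the content of the uploaded patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Value quoted in this ticket: bit 39 of the lock flags. */
#define LDLM_FL_CANCELING 0x0000008000000000ULL

/* Flags word printed in the "not granted" log line of this ticket. */
#define LOGGED_FLAGS 0x814829402000020ULL

/* Hypothetical helper: true if the cancelling bit is set in flags. */
static bool flags_say_canceling(uint64_t flags)
{
	return (flags & LDLM_FL_CANCELING) != 0;
}

/*
 * Hypothetical model of the suggested check: after the converting
 * thread re-takes the lock, it should only proceed to grant when no
 * other thread has started cancelling in the meantime.
 */
static bool may_grant_after_convert(uint64_t flags_after_relock)
{
	return !flags_say_canceling(flags_after_relock);
}

/* Self-check against the value from the log. */
static void check_logged_flags(void)
{
	/* The logged lock was already being cancelled. */
	assert(flags_say_canceling(LOGGED_FLAGS));
	/* So the modelled check would refuse to grant it. */
	assert(!may_grant_after_convert(LOGGED_FLAGS));
	/* With the bit cleared, the grant would be allowed. */
	assert(may_grant_after_convert(LOGGED_FLAGS & ~LDLM_FL_CANCELING));
}
```

Calling check_logged_flags() from a test program passes all three assertions, which is consistent with the reading that the lock in the log had LDLM_FL_CANCELING set when the grant was attempted.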
| Comments |
| Comment by Gerrit Updater [ 11/Jan/24 ] |
|
"Alex Zhuravlev <bzzz@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/53645 |
| Comment by Alex Zhuravlev [ 13/Jan/24 ] |
|
I'm unable to reproduce the problem with the patch above. |