Details
Bug
Resolution: Fixed
Minor
Description
ldlm_cli_inodebits_convert() releases the resource's spinlock in order to call the blocking AST callback:
unlock_res_and_lock(lock);
lock->l_blocking_ast(lock, &ld, lock->l_ast_data, LDLM_CB_CANCELING);
/* now notify server about convert */
rc = ldlm_cli_convert_req(lock, &flags, new_bits);
lock_res_and_lock(lock);
A concurrent thread can step in while the spinlock is released and start cancelling the lock, which the original thread only sees later:
[ 2955.447199] LustreError: 15208:0:(ldlm_lock.c:1094:ldlm_grant_lock_with_skiplist()) ### not granted ns: lustre-MDT0000-mdc-ffff91cf1df76000 lock: 0000000077ba152a/0x9a9735f35bb0415d lrc: 1/0,0 mode: --/PR res: [0x200000402:0x176d:0x0].0x0 bits 0x58/0x2 rrc: 5 type: IBT gid 0 flags: 0x814829402000020 nid: local remote: 0x9a9735f35bb04179 expref: -99 pid: 251837 timeout: 0 lvb_type: 3
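The flags value in the log line above (0x814829402000020) has bit 39 set, which is LDLM_FL_CANCELING:
#define LDLM_FL_CANCELING 0x0000008000000000ULL // bit 39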
The original thread then hits an assertion:
[ 2955.448028] LustreError: 15208:0:(ldlm_lock.c:1095:ldlm_grant_lock_with_skiplist()) ASSERTION( ldlm_is_granted(lock) ) failed:
It makes sense to check for cancellation after the spinlock is re-taken, before ldlm_cli_inodebits_convert() grants the lock.
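The idea can be sketched in userspace C as follows. This is only an illustration of the "re-check after re-locking" pattern, not the actual patch: struct demo_lock, DEMO_FL_CANCELING, demo_concurrent_cancel() and demo_convert() are hypothetical stand-ins, a pthread mutex plays the role of the resource spinlock, and the -1 return value is an arbitrary choice. The concurrent canceller is simulated by a direct call inside the unlocked window so that the behaviour is deterministic.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; these are not Lustre definitions. */
#define DEMO_FL_CANCELING (1ULL << 39)

struct demo_lock {
	pthread_mutex_t res_lock;	/* plays the role of the resource spinlock */
	uint64_t	flags;
	bool		granted;
};

/* What a concurrent cancelling thread would do while the lock is dropped. */
static void demo_concurrent_cancel(struct demo_lock *lk)
{
	pthread_mutex_lock(&lk->res_lock);
	lk->flags |= DEMO_FL_CANCELING;
	pthread_mutex_unlock(&lk->res_lock);
}

/*
 * Convert path: drop the lock around the callback/RPC step, re-take it,
 * and re-check the cancelling flag before granting.
 * Returns 0 if the lock was granted, -1 if a cancel raced with the convert.
 */
static int demo_convert(struct demo_lock *lk, bool simulate_race)
{
	pthread_mutex_lock(&lk->res_lock);
	lk->granted = false;		/* not granted while the convert is in flight */
	pthread_mutex_unlock(&lk->res_lock);

	/* Unlocked window: the blocking callback and convert request run here. */
	if (simulate_race)
		demo_concurrent_cancel(lk);

	pthread_mutex_lock(&lk->res_lock);
	if (lk->flags & DEMO_FL_CANCELING) {
		/* Someone started cancelling meanwhile: do not grant. */
		pthread_mutex_unlock(&lk->res_lock);
		return -1;
	}
	lk->granted = true;
	pthread_mutex_unlock(&lk->res_lock);
	return 0;
}
```

With a lock initialised via PTHREAD_MUTEX_INITIALIZER, demo_convert(&lk, false) grants the lock and returns 0, while demo_convert(&lk, true) returns -1 and leaves the lock ungranted instead of proceeding to the grant step.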