Details
- Bug
- Resolution: Fixed
- Medium
- None
- None
- 3
- 9223372036854775807
Description
I was trying to run tests locally on a 5.14 kernel (a few versions, actually) and immediately noticed a strange problem in replay-dual/0a. Basically, recovery never aborts:
[ 48.360153] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 3 clients reconnect
[ 48.360443] Lustre: 6042:0:(ldlm_lib.c:2069:extend_recovery_timer()) lustre-MDT0000: extended recovery timer reached hard limit: 60, extend: 0
[ 51.713599] Lustre: 6046:0:(ldlm_lib.c:2069:extend_recovery_timer()) lustre-MDT0000: extended recovery timer reached hard limit: 60, extend: 0
[ 51.713910] Lustre: lustre-MDT0000-lwp-MDT0001: Connection restored to 0@lo (at 0@lo)
[ 51.714085] Lustre: 6046:0:(ldlm_lib.c:2069:extend_recovery_timer()) Skipped 1 previous similar message
[ 51.715280] Lustre: *** cfs_fail_loc=514, val=0***
[ 51.715630] LustreError: 6046:0:(tgt_handler.c:526:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff97ea5281d340 x1848758873718400/t0(0) o41->lustre-MDT0001-mdtlov_UUID@0@lo:157/0 lens 224/0 e 0 to 0 dl 1763113907 ref 1 fl Interpret:/202/ffffffff rc 0/-1 job:'osp-pre-0-1.0' uid:0 gid:0 projid:4294967295
[ 107.991561] lustre-MDT0000: recovery timed out; 1 clients are still in recovery after 59 seconds (3 clients connected)
[ 107.992196] Lustre: lustre-MDT0000: recovery is timed out, evict stale exports
[ 107.992684] Lustre: 7865:0:(ldlm_lib.c:2069:extend_recovery_timer()) lustre-MDT0000: extended recovery timer reached hard limit: 60, extend: 1
[ 107.993004] Lustre: 7865:0:(ldlm_lib.c:2069:extend_recovery_timer()) Skipped 107 previous similar messages
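For anyone comparing runs across kernel versions, the length of the recovery window can be read off the timestamps in the log above. The following is only a rough sketch: the helper name `recovery_window` is hypothetical (it is not part of Lustre or its test suite), and it assumes dmesg-style lines with one message per line and the two marker strings exactly as they appear in this log.

```python
import re

# Matches a dmesg-style timestamp such as "[ 48.360153]" at the start of a line.
TS = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")

def recovery_window(lines):
    """Return (start, end) timestamps in seconds; either is None if its marker is missing.

    Hypothetical helper: the marker strings are copied from the log in this ticket.
    """
    start = end = None
    for line in lines:
        m = TS.match(line.strip())
        if not m:
            continue
        ts, msg = float(m.group(1)), m.group(2)
        if start is None and "Will be in recovery" in msg:
            start = ts   # first announcement that recovery has begun
        elif "recovery timed out" in msg:
            end = ts     # last timeout report seen
    return start, end

# Two lines taken verbatim from the log in this ticket.
sample = [
    "[ 48.360153] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 3 clients reconnect",
    "[ 107.991561] lustre-MDT0000: recovery timed out; 1 clients are still in recovery after 59 seconds (3 clients connected)",
]
start, end = recovery_window(sample)
print(f"recovery window: {end - start:.3f} s")
```

On the two sample lines this gives a window of about 59.6 seconds, which matches the "after 59 seconds" in the timeout message and the 60-second hard limit reported by extend_recovery_timer().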