Details
-
Bug
-
Resolution: Fixed
-
Major
-
Lustre 2.1.1
-
3
-
4031
Description
A production MDS in recovery seemed to restart the recovery timer more than we would expect. Here is a paraphrased transcript of the console log:
Apr 25 12:21:28 roc-mds1 kernel: Lustre: ls5-MDT0000: Starting recovery timer for 5:00 Apr 25 12:21:28 roc-mds1 kernel: Lustre: ls5-MDT0000: Starting recovery timer for 5:00 Apr 25 12:36:03 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports Apr 25 12:36:03 roc-mds1 kernel: Lustre: ls5-MDT0000: Denying connection for new client ..., waiting for 193 clients in recovery for 0:24 (repeats once) Apr 25 12:36:29 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports Apr 25 12:36:29 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports Apr 25 12:36:30 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports Apr 25 12:36:32 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports Apr 25 12:36:33 roc-mds1 kernel: Lustre: ls5-MDT0000: Starting recovery timer for 15:00 Apr 25 12:36:33 roc-mds1 kernel: Lustre: ls4-MDT0000: Denying connection for new client ..., waiting for 100 clients in recovery for 15:00 (repeats about ten times over ten minutes) Apr 25 12:51:33 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports Apr 25 12:51:34 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports Apr 25 12:51:35 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports Apr 25 12:51:35 roc-mds1 kernel: Lustre: ls5-MDT0000: Starting recovery timer for 15:00 Apr 25 12:55:20 roc-mds1 kernel: Lustre: ls5-MDT0000: Denying connection for new client ..., waiting for 100 clients in recovery for 11:15 Apr 25 13:05:20 roc-mds1 kernel: Lustre: ls5-MDT0000: Denying connection for new client ..., waiting for 100 clients in recovery for 1:15 Apr 25 13:06:35 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports Apr 25 13:06:35 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports Apr 25 13:06:36 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports Apr 25 13:06:37 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports Apr 25 13:06:39 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports Apr 25 13:06:39 roc-mds1 kernel: Lustre: ls5-MDT0000: Starting recovery timer for 15:00 Apr 25 13:15:20 roc-mds1 kernel: Lustre: ls5-MDT0000: Denying connection for new client ..., waiting for 100 clients in recovery for 6:19 Apr 25 13:21:39 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports Apr 25 13:21:40 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports Apr 25 13:21:41 roc-mds1 kernel: Lustre: ls5-MDT0000: Starting recovery timer for 15:00
This raises several questions.
- Is it unexpected to have multiple "Starting recovery timer" messages, or is this just normal extension of recovery timeout due to new client connections?
- Should it be possible to start the recovery timer twice in the same second, as at 12:21:28?
- How many times should "recovery is timed out, evict stale exports" appear? (Looking at target_recovery_overseer(), it seems it should be at most twice.)
- In check_and_start_recovery_timer(), is the cfs_timer_is_armed() check sufficient to avoid race conditions, or does it need to also check the obd_device recovery flags?
Let's also think about how to improve the code comments, manual, and log messages to make the recovery process more transparent.