    Description

      A production MDS in recovery seemed to restart the recovery timer more often than we would expect. Here is a paraphrased transcript of the console log:

      Apr 25 12:21:28 roc-mds1 kernel: Lustre: ls5-MDT0000: Starting recovery timer for 5:00
      Apr 25 12:21:28 roc-mds1 kernel: Lustre: ls5-MDT0000: Starting recovery timer for 5:00
      Apr 25 12:36:03 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports
      Apr 25 12:36:03 roc-mds1 kernel: Lustre: ls5-MDT0000: Denying connection for new client ..., waiting for 193 clients in recovery for 0:24
      (repeats once)                                                                 
      Apr 25 12:36:29 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports
      Apr 25 12:36:29 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports
      Apr 25 12:36:30 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports
      Apr 25 12:36:32 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports
      Apr 25 12:36:33 roc-mds1 kernel: Lustre: ls5-MDT0000: Starting recovery timer for 15:00
      Apr 25 12:36:33 roc-mds1 kernel: Lustre: ls4-MDT0000: Denying connection for new client ..., waiting for 100 clients in recovery for 15:00
      (repeats about ten times over ten minutes)                                     
      Apr 25 12:51:33 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports
      Apr 25 12:51:34 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports
      Apr 25 12:51:35 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports
      Apr 25 12:51:35 roc-mds1 kernel: Lustre: ls5-MDT0000: Starting recovery timer for 15:00
      Apr 25 12:55:20 roc-mds1 kernel: Lustre: ls5-MDT0000: Denying connection for new client ..., waiting for 100 clients in recovery for 11:15
      Apr 25 13:05:20 roc-mds1 kernel: Lustre: ls5-MDT0000: Denying connection for new client ..., waiting for 100 clients in recovery for 1:15
      Apr 25 13:06:35 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports
      Apr 25 13:06:35 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports
      Apr 25 13:06:36 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports
      Apr 25 13:06:37 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports
      Apr 25 13:06:39 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports
      Apr 25 13:06:39 roc-mds1 kernel: Lustre: ls5-MDT0000: Starting recovery timer for 15:00
      Apr 25 13:15:20 roc-mds1 kernel: Lustre: ls5-MDT0000: Denying connection for new client ..., waiting for 100 clients in recovery for 6:19
      Apr 25 13:21:39 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports
      Apr 25 13:21:40 roc-mds1 kernel: Lustre: ls5-MDT0000: recovery is timed out, evict stale exports
      Apr 25 13:21:41 roc-mds1 kernel: Lustre: ls5-MDT0000: Starting recovery timer for 15:00
      

      This raises several questions.

      • Is it unexpected to have multiple "Starting recovery timer" messages, or is this just normal extension of recovery timeout due to new client connections?
      • Should it be possible to start the recovery timer twice in the same second, as at 12:21:28?
      • How many times should "recovery is timed out, evict stale exports" appear? (Looking at target_recovery_overseer(), it seems it should be at most twice.)
      • In check_and_start_recovery_timer(), is the cfs_timer_is_armed() check sufficient to avoid race conditions, or does it also need to check the obd_device recovery flags? (See the sketch of the suspected race below.)
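
      To make the race question concrete, here is a minimal userspace sketch. It is not the Lustre code: the names timer_armed, obd_lock, racy_start() and locked_start() are hypothetical stand-ins. It only models the suspected window in check_and_start_recovery_timer(): if two service threads test "is the timer armed?" before either one arms it, both print the start message, which would match the duplicate 12:21:28 lines.

      /*
       * Illustrative sketch only -- NOT the Lustre code.  Models a
       * check-then-arm race on a shared flag; build with -pthread.
       */
      #include <pthread.h>
      #include <sched.h>
      #include <stdbool.h>
      #include <stdio.h>

      static bool timer_armed;        /* stands in for the timer state */
      static pthread_mutex_t obd_lock = PTHREAD_MUTEX_INITIALIZER;

      /* Racy variant: the check and the arming are not one atomic step. */
      static void *racy_start(void *arg)
      {
              if (!timer_armed) {             /* both threads may see "false"... */
                      sched_yield();          /* widen the window for the demo   */
                      timer_armed = true;     /* ...and both "start" the timer   */
                      printf("thread %ld: Starting recovery timer\n", (long)arg);
              }
              return NULL;
      }

      /* Safer variant: check and state change share one critical section. */
      static void *locked_start(void *arg)
      {
              pthread_mutex_lock(&obd_lock);
              if (!timer_armed) {
                      timer_armed = true;
                      printf("thread %ld: Starting recovery timer (locked)\n",
                             (long)arg);
              }
              pthread_mutex_unlock(&obd_lock);
              return NULL;
      }

      int main(void)
      {
              pthread_t t[2];
              long i;

              /* The race may or may not trigger on any given run. */
              for (i = 0; i < 2; i++)
                      pthread_create(&t[i], NULL, racy_start, (void *)i);
              for (i = 0; i < 2; i++)
                      pthread_join(t[i], NULL);

              timer_armed = false;            /* reset, then the locked version */
              for (i = 0; i < 2; i++)
                      pthread_create(&t[i], NULL, locked_start, (void *)i);
              for (i = 0; i < 2; i++)
                      pthread_join(t[i], NULL);
              return 0;
      }

      This is only an analogy for the check-then-arm pattern; presumably the in-kernel equivalent would be to perform the cfs_timer_is_armed() test and the arming (and/or the check of the obd_device recovery flags) under a single lock, or to re-check after taking it, rather than relying on the timer state alone.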

      Let's also think about how to improve the code comments, manual, and log messages to make the recovery process more transparent.

        Activity

          [LU-1352] spurious recovery timer resets

          Andreas Dilger added a comment - Closing old 2.1 bug. Problem was fixed in 2.2 and fix was backported to 2.1.

          Bruno Faccini added a comment - We also hit the same issue at the TGCC site. What is not clear to me is whether the http://review.whamcloud.com/#change,2719 patch is for b2_1 only and just a backport of the new recovery timer algorithms as implemented in b2_3/master.

          Christopher Morrone added a comment - If you have a second patch that is needed, please submit it to gerrit now as well. We can give it some testing now on our end.

          Hongchao Zhang added a comment - status update: the first patch is tracked at http://review.whamcloud.com/#change,2719, and the second patch builds on the first one; I will attach it once the first is merged.

          Ned Bass added a comment - Hi Hongchao, that approach sounds reasonable to me. Thanks.

          Hongchao Zhang added a comment - this patch is part of the patches landed on master, and it is mainly based on the implementation of IR (Imperative Recovery). The patches http://review.whamcloud.com/1620 and http://review.whamcloud.com/1722 fix the problem in http://review.whamcloud.com/1292, while http://review.whamcloud.com/2674 is a separate patch. How about creating two patches for the backport?

          Christopher Morrone added a comment - I think that when porting patches to another branch the patches need to remain separate. The commit messages should be mostly the same as well (especially the bug reference number and first line summary). It will be very difficult to make sense of what is in a branch using "git log --oneline" if we squash multiple commits into one on another branch.

          Hongchao Zhang added a comment - the patch is tracked at http://review.whamcloud.com/#change,2719

          Hongchao Zhang added a comment - status update: the patch is under test; there are still some problems in it, and I hope to attach it before the weekend.

          Hongchao Zhang added a comment - status update:

          the related tickets are:
          ORNL-28: http://review.whamcloud.com/1292, http://review.whamcloud.com/1293, http://review.whamcloud.com/1620
          LU-889: http://review.whamcloud.com/1722

          the pending patch will try to backport these patches to b2_1

          People

            Hongchao Zhang
            Ned Bass