
Inconsistent state information in health_check

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.9.0
    • Affects Version/s: Lustre 2.8.0
    • Environment: lola
      build:2.7.63-4-gf84e06e, a7eface85ea2d2aa6198681264b082a0244855d4
    • 3

    Description

      The error occurred during soak testing of build '20151122' (see https://wiki.hpdd.intel.com/pages/viewpage.action?title=Soak+Testing+on+Lola&spaceKey=Releases#SoakTestingonLola-20151122).

      If a Lustre client hits an LBUG (for example LU-7422), the state information reported in /proc/fs/lustre/health_check is inconsistent and confusing:

      [root@lola-29 ~]# cat /proc/fs/lustre/health_check
      LBUG
      healthy
      

      It would be desirable to remove the string 'healthy' in this case, or to replace it with 'insane' or an equivalent string, mainly so that monitoring tools receive a meaningful message.
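
A monitoring check can guard against the mixed output shown in the report above. The following is a minimal sketch and not part of Lustre: the function name `is_healthy` and the rule that any line other than 'healthy' marks the node as unhealthy are assumptions made for illustration.

```python
from pathlib import Path

# Path taken from the report above.
HEALTH_CHECK = Path("/proc/fs/lustre/health_check")


def is_healthy(text: str) -> bool:
    """Return True only if every non-empty line is exactly 'healthy'.

    Assumption: any other line (for example 'LBUG' or 'NOT HEALTHY')
    signals a problem, even when 'healthy' is also present.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return bool(lines) and all(line == "healthy" for line in lines)


if __name__ == "__main__":
    if HEALTH_CHECK.exists():
        state = "healthy" if is_healthy(HEALTH_CHECK.read_text()) else "NOT healthy"
        print(f"{HEALTH_CHECK}: {state}")
    else:
        print(f"{HEALTH_CHECK} not found (is Lustre loaded?)")
```

With this rule, the two-line output 'LBUG' followed by 'healthy' is classified as unhealthy, which is the reading the report argues for.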

        Activity

          [LU-7486] Inconsistent state information in health_check

          jgmitter Joseph Gmitter (Inactive) added a comment - patch has landed to master for 2.9.0

          gerrit Gerrit Updater added a comment:
          Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/17981/
          Subject: LU-7486 obdclass: health_check to report unhealthy upon LBUG
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: 909e4dc00f224834ff7ac4b6b8f0f6bf76e3c58d

          gerrit Gerrit Updater added a comment:
          Faccini Bruno (bruno.faccini@intel.com) uploaded a new patch: http://review.whamcloud.com/17981
          Subject: LU-7486 obdclass: health_check to report unhealthy upon LBUG
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: 45a0dabd513a4c7ab16042a4519aea42263ccc30

          bfaccini Bruno Faccini (Inactive) added a comment:
          Hello Frank, you are right. I found this too while working with Gabriele to determine the capabilities of the health_check proc file.
          I will post a patch so that it reports "NOT HEALTHY", as it already does for any other failure detected during the check.
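
The behaviour proposed in the comment above can be modelled as follows. This is an illustrative sketch in Python, not the code of the patch: `render_health_check` and its parameters are hypothetical names, and the exact text printed by the patched file is an assumption based on the "NOT HEALTHY" string mentioned in the comment.

```python
def render_health_check(lbug_hit: bool, other_failures: bool = False) -> str:
    """Model the proposed health_check output (hypothetical helper).

    Assumption: an LBUG is treated like any other detected failure, so
    the summary line becomes 'NOT HEALTHY' instead of 'healthy'.
    """
    lines = []
    healthy = not other_failures
    if lbug_hit:
        lines.append("LBUG")
        healthy = False  # an LBUG now counts against the overall state
    lines.append("healthy" if healthy else "NOT HEALTHY")
    return "\n".join(lines) + "\n"


# Show the modelled output for a client that has hit an LBUG.
print(render_health_check(lbug_hit=True), end="")
```

The point of the model is that the summary line is derived from a single flag, so 'LBUG' and 'healthy' can no longer appear together.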

          People

            Assignee: bfaccini Bruno Faccini (Inactive)
            Reporter: heckes Frank Heckes (Inactive)