All Nodes report NOT HEALTHY, system is healthy

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Affects Version/s: Lustre 2.9.0
    • Fix Version/s: Lustre 2.9.0
    • 3
    • 9223372036854775807

    Description

      Current build installed: https://build.hpdd.intel.com/job/lustre-reviews/38245/
      This issue has persisted for the last two builds.
      After mounting the filesystem, all nodes report NOT HEALTHY in /proc/fs/lustre/health_check.

      # pdsh -g server 'lctl get_param health_check' | dshbak -c
        ----------------
        lola-[2-11]
        ----------------
        health_check=healthy
        NOT HEALTHY

      The filesystem otherwise operates normally: jobs run and results are created.
      We were using health_check as part of our monitoring; this has been discontinued.
      We are uncertain of the cause, since every operation we can test works and no errors are reported.
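
A monitoring check that tolerates the mixed output shown above could be sketched as follows. This is only an illustration: the function name `classify_health` and its four result labels are hypothetical, and the only strings it matches are the two that appear in the report (`health_check=healthy` and `NOT HEALTHY`).

```shell
#!/bin/sh
# Hypothetical helper: classify the text printed by
# `lctl get_param health_check`, read from stdin.
# Prints one of: healthy, unhealthy, contradictory, unknown.
classify_health() {
    input=$(cat)
    has_ok=no
    has_bad=no
    # Both patterns are taken literally from the output in this report.
    if printf '%s\n' "$input" | grep -q 'health_check=healthy'; then
        has_ok=yes
    fi
    if printf '%s\n' "$input" | grep -q 'NOT HEALTHY'; then
        has_bad=yes
    fi
    case "$has_ok/$has_bad" in
        yes/no)  echo healthy ;;
        no/yes)  echo unhealthy ;;
        yes/yes) echo contradictory ;;  # the state described in this issue
        *)       echo unknown ;;
    esac
}
```

On a node it would be fed directly, for example `lctl get_param health_check | classify_health`. Reporting the mixed case separately, rather than folding it into "unhealthy", lets a monitor distinguish this symptom from a genuine failure.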

          Activity

            [LU-8017] All Nodes report NOT HEALTHY, system is healthy
            cliffw Cliff White (Inactive) made changes -
            Link New: This issue is related to LDEV-547 [ LDEV-547 ]
            adilger Andreas Dilger made changes -
            Resolution New: Fixed [ 1 ]
            Status Original: Reopened [ 4 ] New: Resolved [ 5 ]
            bogl Bob Glossman (Inactive) made changes -
            Resolution Original: Fixed [ 1 ]
            Status Original: Resolved [ 5 ] New: Reopened [ 4 ]
            simmonsja James A Simmons made changes -
            Resolution New: Fixed [ 1 ]
            Status Original: Open [ 1 ] New: Resolved [ 5 ]
            adilger Andreas Dilger made changes -
            Priority Original: Major [ 3 ] New: Critical [ 2 ]
            adilger Andreas Dilger made changes -
            Fix Version/s New: Lustre 2.9.0 [ 11891 ]
            simmonsja James A Simmons made changes -
            Link New: This issue is related to LU-8066 [ LU-8066 ]
            adilger Andreas Dilger made changes -
            Assignee Original: WC Triage [ wc-triage ] New: James A Simmons [ simmonsja ]
            jamesanunez James Nunez (Inactive) made changes -
            Affects Version/s New: Lustre 2.9.0 [ 11891 ]
            cliffw Cliff White (Inactive) made changes -
            Priority Original: Minor [ 4 ] New: Major [ 3 ]

            People

              Assignee: James A Simmons (simmonsja)
              Reporter: Cliff White (cliffw, Inactive)
              Votes: 0
              Watchers: 6
