
connect client health_check file to client import state

Details

    • Improvement
    • Resolution: Unresolved
    • Minor

    Description

      The /proc/fs/lustre/health_check file on the client currently reports nothing besides health=healthy, because it is only wired up to report request handling state on the server nodes and whether the ldiskfs filesystem has been mounted read-only.

      It would be desirable to have health_check print out the OSC and MDC import states on the client (e.g. if imp->imp_state != LUSTRE_IMP_FULL as long as the import is not marked imp_deactive).
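
The condition described above can be sketched in C roughly as follows. This is a minimal, userspace illustration and not the Lustre kernel code: the struct, the enum values, and the `sketch_*` function names are hypothetical stand-ins, and only `imp_state`, `imp_deactive`, and `LUSTRE_IMP_FULL` are taken from the description.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the import state enum; values are illustrative only. */
enum sketch_imp_state {
	LUSTRE_IMP_DISCON = 1,
	LUSTRE_IMP_FULL   = 2,
};

/* Hypothetical, reduced view of an OSC/MDC import. */
struct sketch_import {
	const char *imp_target;          /* assumed field: target name */
	enum sketch_imp_state imp_state; /* current connection state */
	bool imp_deactive;               /* administratively deactivated */
};

/* An import counts against client health only if it is active and not FULL. */
static bool sketch_import_unhealthy(const struct sketch_import *imp)
{
	return !imp->imp_deactive && imp->imp_state != LUSTRE_IMP_FULL;
}

/* Return true if every active import in the array is fully connected. */
static bool sketch_client_healthy(const struct sketch_import *imps, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (sketch_import_unhealthy(&imps[i]))
			return false;
	return true;
}
```

Under these assumptions, a deactivated import in a non-FULL state does not affect the result, while an active import in the same state makes the client report as unhealthy.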

          Activity

            [LU-8609] connect client health_check file to client import state

            simmonsja James A Simmons added a comment - Linking this to the work in LU-10756, since a patch exists that sends uevents about the client import state. Admins can set up udev rules to log or handle specific states, which is a more dynamic approach. We can supply more in-depth information related to this ticket.

            adilger Andreas Dilger added a comment - Hmm, looking at this more closely, it seems that the health_check file never prints anything but "health=healthy" even on the server under normal circumstances, and I guess this shouldn't change. If there is a problem it will print "LBUG" or "NOT HEALTHY". What is more important is that the client report "health=NOT HEALTHY" if the import states are bad, not that individual connection states are printed.
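
A minimal sketch of the single-line summary suggested in this comment, assuming the check has already counted the active imports that are not FULL. The struct and function names are hypothetical, and the output strings simply follow the wording quoted in the comment rather than any confirmed kernel format.

```c
#include <stdio.h>

/* Hypothetical result of scanning the client's imports. */
struct sketch_health {
	unsigned int bad_imports; /* active imports whose state is not FULL */
};

/*
 * Write the one-line summary into buf and return snprintf()'s result.
 * No per-connection state is printed, only the overall verdict.
 */
static int sketch_health_show(const struct sketch_health *h,
			      char *buf, size_t len)
{
	const char *status = h->bad_imports > 0 ? "NOT HEALTHY" : "healthy";

	return snprintf(buf, len, "health=%s\n", status);
}
```

Keeping the output to one fixed line means existing HA monitoring scripts that match on the string would not need to parse anything new.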

            adilger Andreas Dilger added a comment (edited) - James, why not just move the whole file to debugfs? There are already "lfs osts" and "lfs mdts" commands to print this state for users; the "health_check" file is mostly for HA monitoring scripts.

            simmonsja James A Simmons added a comment - This overlaps with something Andreas suggested: that we create two files for the health checker, one to report the health and a second in debugfs with more in-depth details about why it failed.
            emoly.liu Emoly Liu added a comment -

            adilger,
            What kind of OSC and MDC import states on the client should be printed by health_check? Is the following format OK?

            health_check=healthy
            target state:
              lustre-OST0000_UUID: FULL
              lustre-OST0001_UUID: FULL
              ...
              MGS: FULL
              ...   
            

            People

              emoly.liu Emoly Liu
              adilger Andreas Dilger