[LU-8609] connect client health_check file to client import state Created: 13/Sep/16  Updated: 27/Aug/18

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor
Reporter: Andreas Dilger Assignee: Emoly Liu
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Related
is related to LU-8066 Move lustre procfs handling to sysfs ... Open
is related to LU-10756 Send Uevents for interesting Lustre c... Open

 Description   

The /proc/fs/lustre/health_check file on the client currently never reports anything other than health=healthy. It is only wired up to report request handling state on the server nodes, and whether the ldiskfs filesystem has been mounted read-only.

It would be desirable to have health_check print out the OSC and MDC import states on the client (e.g. when imp->imp_state != LUSTRE_IMP_FULL, provided the import is not marked imp_deactive).



 Comments   
Comment by Emoly Liu [ 24/Jan/18 ]

adilger,
What kind of OSC and MDC import states on the client should be printed by health_check? Is the following format OK?

health_check=healthy
target state:
  lustre-OST0000_UUID: FULL
  lustre-OST0001_UUID: FULL
  ...
  MGS: FULL
  ...   
Comment by James A Simmons [ 24/Jan/18 ]

This overlaps with something Andreas suggested: creating two files for the health check, one to report the health and a second in debugfs with more in-depth detail on why it failed.

Comment by Andreas Dilger [ 26/Jan/18 ]

James, why not just move the whole file to debugfs? There are already "lfs osts" and "lfs mdts" commands to print this state for users; the "health_check" file is mostly for HA monitoring scripts.

Comment by Andreas Dilger [ 26/Jan/18 ]

Hmm, looking at this more closely, it seems that the health_check file never prints anything but "health=healthy" even on the server under normal circumstances, and I guess this shouldn't change. If there is a problem it will print "LBUG" or "NOT HEALTHY". What is more important is that the client report "health=NOT HEALTHY" if the import states are bad, not that individual connection states are printed.
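
A minimal sketch of the check described above, for illustration only. The sketch_* types and functions are hypothetical stand-ins, not the Lustre kernel structures; only the imp_state and imp_deactive fields named in this ticket are modelled, and the output strings are the ones quoted in this comment. An import counts against health only if it is not deactivated and is not in the FULL state.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types (assumption: the real
 * import state enum has more states than the two modelled here). */
enum sketch_imp_state {
	SKETCH_IMP_DISCON,	/* stands for any state other than FULL */
	SKETCH_IMP_FULL,	/* stands for LUSTRE_IMP_FULL */
};

struct sketch_import {
	enum sketch_imp_state imp_state;
	bool imp_deactive;	/* import has been marked deactivated */
};

/* An import is unhealthy if it is active but not fully connected. */
bool sketch_import_unhealthy(const struct sketch_import *imp)
{
	return !imp->imp_deactive && imp->imp_state != SKETCH_IMP_FULL;
}

/* Report a single overall status; individual states are not printed. */
const char *sketch_client_health(const struct sketch_import *imps,
				 size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (sketch_import_unhealthy(&imps[i]))
			return "health=NOT HEALTHY";
	}
	return "health=healthy";
}
```

With this shape, a deactivated import in a non-FULL state does not change the result, which matches the imp_deactive exception in the description.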

Comment by James A Simmons [ 27/Aug/18 ]

Linking the work from LU-10756, since a patch exists that sends uevents about the client import state. Admins can set up udev rules to log or handle specific states, which is a more dynamic approach. We can supply more in-depth information related to this ticket.

Generated at Sat Feb 10 02:19:02 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.