[LU-8169] More extensive and accessible logging of lfsck Created: 19/May/16  Updated: 21/May/16  Resolved: 20/May/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0
Fix Version/s: None

Type: Improvement Priority: Minor
Reporter: Nathan Dauchy (Inactive) Assignee: nasf (Inactive)
Resolution: Duplicate Votes: 0
Labels: None

Issue Links:
Duplicate
duplicates LU-5202 LFSCK 5: LFSCK needs to log all chang... In Progress

 Description   

We would like to have more extensive logging of lfsck actions, dumping to a file rather than being limited to a small buffer.

From LU-8071:

By default, the repairing behaviour will be recorded in Lustre debug log
via label "D_LFSCK". But because Lustre kernel log is in RAM only, and if
you did not dump them periodically, then it will be overwritten.

Rather than using the debugging interface, this should be exposed through the "lctl" tool: either as an option to "lfsck_start", or as a new "lctl lfsck_log" control option. Writing to a specified file on the MDS would be sufficient, but the ability to write to syslog could potentially be useful as well.

Also, I would propose that the default behavior be that logging is *enabled*. It is easy enough to go back and delete log files after the fact if they are unneeded... but impossible to track down some of what lfsck has done once the kernel buffer has been exhausted.



 Comments   
Comment by Andreas Dilger [ 20/May/16 ]

I fully agree. Please see LU-5202 for possible solutions to this. One option is to use lctl set_param printk=+lfsck to log the LFSCK messages to the console log, but they may need to be cleaned up to some extent so that D_LFSCK is only used for important messages (i.e. start/stop, and repair) and not just spewing status messages.

Another possibility is for "lctl lfsck_start" to also start debug_daemon with a filter for only D_LFSCK messages, writing to e.g. /var/log/lfsck-YYYYMMDD-HHMM.log by default. It would also need to stop debug_daemon when LFSCK completes, but I'm not sure how easily that can be done (perhaps an in-band D_LFSCK "STOP debug_daemon" message?).
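
As a rough illustration of the debug_daemon idea, the manual equivalent can be sketched as a sequence of existing lctl commands. This is only a sketch under stated assumptions: the helper names, the MDT device name "lustre-MDT0000", and the 1024 MB size limit are hypothetical, and the script only builds the command lines rather than running them. Note that "debug=+lfsck" adds D_LFSCK to the current debug mask; it does not restrict the log to LFSCK messages only.

```python
from datetime import datetime


def default_lfsck_log_path(now, log_dir="/var/log"):
    """Build a path of the form /var/log/lfsck-YYYYMMDD-HHMM.log."""
    return f"{log_dir}/lfsck-{now:%Y%m%d-%H%M}.log"


def lfsck_logging_commands(mdt_device, now, size_mb=1024):
    """Return the lctl command lines, in order, for one logged LFSCK run.

    mdt_device and size_mb are illustrative values, not defaults taken
    from Lustre. Nothing is executed here; the caller decides how and
    when to run each command.
    """
    log_path = default_lfsck_log_path(now)
    return [
        # Add D_LFSCK to the debug mask (other enabled flags stay enabled).
        ["lctl", "set_param", "debug=+lfsck"],
        # Start continuously dumping the kernel debug buffer to a file.
        ["lctl", "debug_daemon", "start", log_path, str(size_mb)],
        # Start LFSCK on the given MDT.
        ["lctl", "lfsck_start", "-M", mdt_device],
        # To be run only after LFSCK has completed.
        ["lctl", "debug_daemon", "stop"],
    ]


if __name__ == "__main__":
    example_time = datetime(2016, 5, 20, 14, 30)
    for cmd in lfsck_logging_commands("lustre-MDT0000", example_time):
        print(" ".join(cmd))
```

Detecting LFSCK completion automatically is the open question raised above and is not addressed by this sketch. The file written by debug_daemon may also need converting to text (for example with "lctl debug_file") before it is readable.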

Comment by Peter Jones [ 20/May/16 ]

Fan Yong

Do you agree that this ticket seems to be a duplicate of LU-5202?

Peter

Comment by nasf (Inactive) [ 20/May/16 ]

Yes, I think so.

Comment by Peter Jones [ 20/May/16 ]

Thanks Fan Yong
