[LU-15937] lctl llog commands do not work for DNE recovery logs Created: 12/Jun/22  Updated: 12/Jun/22

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Andreas Dilger Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Related
is related to LU-15761 cannot finish MDS recovery Resolved
is related to LU-15936 DOSTID macro is not printing llog IDs... Resolved
is related to LU-15938 MDT recovery did not finish due to co... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

Description

I was unable to get "lctl --device NN llog_print <FID>" and especially "lctl --device NN llog_cancel <FID>" to work on the DNE recovery logs located in update_log_dir, regardless of which device NN was used. This was further complicated by the fact that the DOSTID macro prints the llog FIDs incorrectly (LU-15936).
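
For anyone trying to reproduce the "regardless of which device NN was used" part, here is a minimal sketch that only prints the llog_print command line for every device index, without running anything. It assumes the first column of "lctl dl" output is the device index; the helper name llog_print_cmds and the FID used in the usage line are hypothetical placeholders, not values from this ticket.

```shell
# Hypothetical helper: read "lctl dl"-style lines on stdin and emit one
# "lctl --device NN llog_print <FID>" command per device. Dry run only:
# the commands are printed, never executed.
llog_print_cmds() {
    fid="$1"    # llog FID to look up (placeholder supplied by the caller)
    # Assumption: the first whitespace-separated field is the device index.
    awk -v fid="$fid" '$1 ~ /^[0-9]+$/ {
        printf "lctl --device %s llog_print %s\n", $1, fid
    }'
}
```

Usage would be along the lines of: lctl dl | llog_print_cmds '[0x1:0x2:0x0]', then reviewing the printed commands before running any of them by hand.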

It would be very useful to be able to use llog_print and llog_cancel on the DNE recovery llogs under update_log_dir, so that problematic records can be cancelled as seen in other tickets (e.g. LU-15761). Since the DNE recovery logs for MDT000x are located on some other MDT000y that may already be mounted, this otherwise requires unmounting MDT000y and truncating the llog file to force a "more easily handled" error, which is overly broad because it also erases other llog records.

