[LU-5799] Unhelpful "deleting orphan objects" console message Created: 23/Oct/14  Updated: 13/Oct/21  Resolved: 13/Oct/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Christopher Morrone
Assignee: Bob Glossman (Inactive)
Resolution: Low Priority
Votes: 0
Labels: llnl

Severity: 3
Rank (Obsolete): 16264

 Description   

Running 2.5.3, we see the following message on the console pretty much every time we start an OST:

Lustre: lcy-OST000f: deleting orphan objects from 0x0:99109478 to 0x0:99113686

This cleanup is entirely normal. I don't think there is a good reason to be spamming the console with completely normal, expected behavior.

Granted, we recently identified a problem by noticing that this message was absent. But it would be much better to print an error message when Lustre skips orphan cleanup because the ranges are unexpectedly bad, and to stay silent during normal cleanup.
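
To illustrate the requested behavior, here is a rough userspace sketch, not the actual Lustre code. The function name, the log-level enum, and the MAX_ORPHAN_RANGE limit are all hypothetical; the real check and its threshold live in the OST code and may look quite different. The point is only that the normal path goes to the debug log and the skipped-cleanup path produces a console error.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical destinations: debug log only vs. console error. */
enum log_level { LOG_DEBUG_ONLY, LOG_CONSOLE_ERROR };

/* Assumed sanity limit on the size of one cleanup range (illustrative). */
#define MAX_ORPHAN_RANGE 100000ULL

/*
 * Format the message for one orphan cleanup attempt covering object IDs
 * first..last on the named OST, and report where it should be logged.
 * A reversed or oversized range is treated as "unexpectedly bad".
 */
static enum log_level
orphan_cleanup_report(const char *ost, uint64_t first, uint64_t last,
                      char *buf, size_t len)
{
    assert(ost != NULL && buf != NULL && len > 0);

    if (last < first || last - first > MAX_ORPHAN_RANGE) {
        /* Error path: cleanup is skipped, so say so loudly. */
        snprintf(buf, len,
                 "%s: skipping orphan cleanup, bad range 0x0:%" PRIu64
                 " to 0x0:%" PRIu64, ost, first, last);
        return LOG_CONSOLE_ERROR;
    }

    /* Normal path: keep the detail, but only for the debug log. */
    snprintf(buf, len,
             "%s: deleting orphan objects from 0x0:%" PRIu64
             " to 0x0:%" PRIu64, ost, first, last);
    return LOG_DEBUG_ONLY;
}
```

With the range from the message above (99109478 to 99113686) this sketch would stay off the console, while a reversed range would be reported as an error.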



 Comments   
Comment by Peter Jones [ 23/Oct/14 ]

Bob is looking into this one

Comment by Bob Glossman (Inactive) [ 23/Oct/14 ]

in master:
http://review.whamcloud.com/12407

in b2_5:
http://review.whamcloud.com/12408

Comment by Christopher Morrone [ 23/Oct/14 ]

The recent problem that I alluded to was LU-5648, as Ned clarified. And as Ned reviewed in gerrit, we will need to add proper debugging messages, not just remove the "deleting orphan objects" message.

Comment by Andreas Dilger [ 24/Oct/14 ]

Bob, it is better to submit patches to master first. They can then be cherry-picked to b2_5 as needed, with the Lustre-change: and Lustre-commit: tags added at that time. That improves tracking of patches between branches and avoids testing/inspecting the patches twice.

Comment by Andreas Dilger [ 24/Oct/14 ]

I'd actually prefer that these messages be kept, since they provide useful information for debugging (e.g. LU-5785) in case of recovery problems.

Comment by Christopher Morrone [ 15/Dec/14 ]

I think that we were one of the prime users of those messages for a recent bug. But the reason we needed them is that there were no error messages printed in the error path. Instead we needed to intuit an error from the lack of the regular spam message.

It seems clear to me that the message is not understandable or usable to a sysadmin. Additionally, it prints out every time, which only serves to teach sysadmins not to look at Lustre messages at all.

And if we can agree on the nature of the message, then it does not belong on the console. Instead we should add messages in the error paths.
