[LU-9730] cleanup OST objects that have been leaked during interrupted/failed runs of obdfilter-survey Created: 02/Jul/17  Updated: 15/Feb/23

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Bruno Faccini (Inactive) Assignee: Bruno Faccini (Inactive)
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Duplicate
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

When obdfilter-survey is interrupted or fails, it can leave OST objects allocated but unattached, still consuming space.

A first, simple fix is to add an exit trap to the script so that the objects created during the current run are always deleted.
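A minimal sketch of the exit-trap idea follows. It is not the content of any actual patch: record_batch, destroy_objects and install_cleanup_traps are hypothetical names, and destroy_objects only prints what it would destroy instead of talking to the echo client.

```shell
#!/bin/sh
# Sketch only: remember every batch of objects the run creates, and destroy
# the remembered batches from a trap that fires on exit and on signals.
# All function names below are hypothetical.

created_batches=""    # one "device first_objid count" entry per line

record_batch() {
    created_batches="${created_batches}$1 $2 $3
"
}

destroy_objects() {
    # Placeholder: a real script would request the destruction here.
    echo "destroy dev=$1 first_objid=$2 count=$3"
}

cleanup() {
    printf '%s' "$created_batches" | while read -r dev first count; do
        destroy_objects "$dev" "$first" "$count"
    done
    created_batches=""    # a second invocation becomes a no-op
}

install_cleanup_traps() {
    trap cleanup EXIT
    # Turn signals into a normal exit so that the EXIT trap runs.
    trap 'exit 130' INT
    trap 'exit 143' TERM
}

# Demonstration in a subshell: the run fails after creating one batch.
demo_status=0
demo_output=$(
    install_cleanup_traps
    record_batch 5 2 16
    exit 1
) || demo_status=$?
printf '%s\n' "$demo_output"
```

Recording each batch right after it is created keeps the window in which objects can leak small, and clearing the list inside cleanup avoids destroying the same objects twice if the trap fires more than once.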

In addition, a post-failure cleanup method or tool is required to allow later, asynchronous deletion of this kind of orphan object.



 Comments   
Comment by Gerrit Updater [ 19/Jul/17 ]

Faccini Bruno (bruno.faccini@intel.com) uploaded a new patch: https://review.whamcloud.com/28113
Subject: LU-9730 tests: obdfilter-survey cleanup upon exit/signal
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 0b44735657983451407f9bbd90891b6c054bde6d

Comment by Bruno Faccini (Inactive) [ 30/Jan/18 ]

Regarding the asynchronous post-failure cleanup process: the orphan objects end up as unattached inodes with i_nlink == 1, so for the ldiskfs back-end an e2fsck run, at least with the -n option, would be very helpful to identify all the inodes concerned. If the Lustre FS and OST are currently in use/mounted, the list will also include some unrelated inodes.
Their LMA xattr content must then be read to verify that they belong to the FID_SEQ_ECHO sequence, so that their object-id can be used to request their destruction.
Alternatively, some of the OI code could perhaps be used or modified to permit inode to object-id mapping. Another option would be to implement a method that parses the OI to retrieve every registered FID_SEQ_ECHO object-id from it, and thus be able to destroy all of them.
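The LMA check described above could be sketched as follows. This is only an illustration under assumptions that should be verified against the Lustre headers: that the LMA xattr holds lma_compat and lma_incompat (4 bytes each) followed by the FID as f_seq (8 bytes), f_oid (4 bytes) and f_ver (4 bytes), all little-endian, and that FID_SEQ_ECHO is 0x2. The function names are hypothetical, and how the hex value is obtained for an unattached inode is outside the sketch.

```shell
#!/bin/sh
# Sketch only. The xattr layout and the FID_SEQ_ECHO value are assumptions.
FID_SEQ_ECHO=2

# Reverse the byte order of an even-length hex string.
swap_bytes() {
    sb_in=$1
    sb_out=""
    while [ -n "$sb_in" ]; do
        sb_rest=${sb_in#??}
        sb_out="${sb_in%"$sb_rest"}$sb_out"
        sb_in=$sb_rest
    done
    printf '%s' "$sb_out"
}

# Print the object id if the hex-encoded LMA belongs to the echo sequence,
# otherwise print nothing and return 1.
# $1: LMA value as hex digits, with or without a leading 0x.
echo_objid_from_lma() {
    lma=${1#0x}
    case $lma in
        *[!0-9a-fA-F]*) return 1 ;;      # reject non-hex input
    esac
    [ "${#lma}" -ge 48 ] || return 1     # need 24 bytes: header plus FID
    seq_hex=$(printf '%s' "$lma" | cut -c17-32)
    oid_hex=$(printf '%s' "$lma" | cut -c33-40)
    seq=$((0x$(swap_bytes "$seq_hex")))
    [ "$seq" -eq "$FID_SEQ_ECHO" ] || return 1
    printf '%d\n' "$((0x$(swap_bytes "$oid_hex")))"
}

# Made-up sample value: sequence 0x2, object id 0x12.
sample_lma=0x000000000000000002000000000000001200000000000000
echo_objid_from_lma "$sample_lma"    # prints 18
```

A cleanup tool built on this would run the check for each unattached inode reported by the filesystem checker and collect the printed object ids for destruction, skipping every inode for which the function returns non-zero.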

For the ZFS back-end, it looks like ZAP features will need to be used.

Comment by Patrick Farrell (Inactive) [ 12/Mar/19 ]

nangelinas, I know you were working on this recently.  Did you end up finishing something?
