[LU-3335] LFSCK II: MDT-OST OST local consistency checking Created: 14/May/13  Updated: 19/Jul/21  Due: 31/Jul/13  Resolved: 23/Sep/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.0
Fix Version/s: Lustre 2.5.0, Lustre 2.6.0

Type: New Feature Priority: Critical
Reporter: Andreas Dilger Assignee: nasf (Inactive)
Resolution: Fixed Votes: 0
Labels: LFSCK

Issue Links:
Blocker
is blocking LU-1267 LFSCK II: MDT-OST consistency check/r... Resolved
Duplicate
duplicates LU-3995 CLONE - LFSCK II: MDT-OST OST local c... Resolved
Related
is related to LU-14864 osd_fid_lookup() ASSERTION( !updated ... Open
is related to LU-4829 LBUG: ASSERTION( !fid_is_idif(fid) ) Resolved
is related to LU-3588 use lctl --device option to specify d... Resolved
Story Points: 89
Epic: layout
Rank (Obsolete): 8251

 Description   

This ticket is about running the existing OI Scrub checks on OST OSD devices. This includes:

  • running the object iterator (should already work)
  • rebuilding the OI (O/$seq/d*/$oid) for OST objects
  • rebuilding the O/$seq/LAST_ID file if incorrect (LU-14)
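As a rough illustration of the last two items, the sketch below maps an object to its O/$seq/d*/$oid location and recomputes a LAST_ID value from the objects found by a scan. It is not the OSD implementation. The subdirectory count of 32, the decimal name formatting, the rule that LAST_ID is only wrong when it is below the highest object ID found, and the helper names `object_path` and `rebuild_last_id` are all assumptions made for this example.

```python
# Assumption: objects are hashed into d0..d31 by object ID, and IDs are
# formatted in decimal. Both are illustrative, not taken from the ticket.
SUBDIR_COUNT = 32


def object_path(seq, oid):
    """Return the O/$seq/d*/$oid style path for an OST object."""
    return f"O/{seq}/d{oid % SUBDIR_COUNT}/{oid}"


def rebuild_last_id(object_ids, stored_last_id):
    """Return the LAST_ID value to keep for one sequence.

    Assumed rule: the stored value is treated as incorrect only when it is
    smaller than the highest object ID seen during the scan.
    """
    highest = max(object_ids, default=0)
    return max(stored_last_id, highest)
```

With these definitions, an object with sequence 0 and ID 35 lands in O/0/d3/35, and a stored LAST_ID of 10 is raised to 40 if an object with ID 40 is found.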


 Comments   
Comment by Alex Zhuravlev [ 14/May/13 ]

In the past I have asked a few times that OI Scrub be independent of the specific OI format, whether that is the oi.* files or the /O hierarchy. The OI scrubber logic should just use osd_oi_lookup(), osd_oi_delete() and osd_oi_insert(), and leave it to the osd_oi_*() functions to decide which mapping to use for a specific object.
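A minimal sketch of that separation, in Python rather than the OSD's C code: the scrub loop only calls lookup, delete and insert on a mapping object and never looks at how the mapping is stored. `DictOI` and `scrub` are hypothetical names for this example, and the dict-backed store stands in for whichever real format (oi.* files or the /O hierarchy) sits behind the osd_oi_*() functions.

```python
class DictOI:
    """Toy OI backend: maps a FID to an inode number using a dict."""

    def __init__(self):
        self._map = {}

    def lookup(self, fid):
        return self._map.get(fid)

    def insert(self, fid, ino):
        self._map[fid] = ino

    def delete(self, fid):
        self._map.pop(fid, None)


def scrub(oi, objects):
    """Repair the mapping for each (fid, ino) pair from the object iterator.

    Returns the number of mappings that were missing or stale.
    """
    repaired = 0
    for fid, ino in objects:
        found = oi.lookup(fid)
        if found == ino:
            continue  # mapping is already correct
        if found is not None:
            oi.delete(fid)  # drop the stale entry before re-inserting
        oi.insert(fid, ino)
        repaired += 1
    return repaired
```

Any backend that offers the same three operations could be passed to `scrub` unchanged, which is the point of the request above.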

Comment by Andreas Dilger [ 14/May/13 ]

Sure, the users of OI Scrub will just iterate over the inodes and get the object FIDs from LMA or filter_fid (in case of upgraded filesystems). However, if OST objects are located in lost+found due to filesystem corruption, then they need to be moved back to /O/$seq/d*/$oid (i.e. restored to the "OI" for the OST). This is currently handled by a separate ll_recover_lost_found_objs tool, but it would be better to avoid the need to mount and run this separately on the OST.
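The flow described here could be sketched roughly as follows. This is a simplified model, not the behaviour of ll_recover_lost_found_objs or of the OSD: extended attributes are represented as a plain dict holding a (seq, oid) tuple, the real on-disk LMA and filter_fid layouts are not modelled, and the 32-way subdirectory hash and the names `object_fid` and `plan_moves` are assumptions for illustration.

```python
SUBDIR_COUNT = 32  # assumption: objects are hashed into d0..d31


def object_fid(xattrs):
    """Prefer the FID from LMA, fall back to filter_fid, else None."""
    return xattrs.get("lma") or xattrs.get("filter_fid")


def plan_moves(inodes):
    """Return (source, destination) pairs for objects found in lost+found.

    `inodes` is an iterable of (current_path, xattrs) pairs as an object
    iterator might produce them.
    """
    moves = []
    for path, xattrs in inodes:
        fid = object_fid(xattrs)
        if fid is None or not path.startswith("lost+found/"):
            continue  # no usable FID, or object is not misplaced
        seq, oid = fid
        moves.append((path, f"O/{seq}/d{oid % SUBDIR_COUNT}/{oid}"))
    return moves
```

Objects without either attribute are skipped here, since nothing in the inode says where they belong.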

Comment by Alex Zhuravlev [ 15/May/13 ]

Hmm, I'm not aware of any users of OI Scrub; isn't it all internal to OSD?
At which point do you want to update LAST_ID? I guess this can't be done at just any time, since OFD might overwrite it immediately?

Comment by Andreas Dilger [ 11/Jun/13 ]

By "users of OI Scrub" I mean "users of the object iterator".

I also don't think it is harmful to update LAST_ID, if this is done with proper locking of the LAST_ID object. Since this is now just a normal FID, we should be able to use the object lock to serialize access.
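The serialization idea can be sketched with an ordinary mutex standing in for the object lock. The `LastId` class and its `allocate` and `raise_to` methods are hypothetical; they only show that an allocator and a repair pass can share one lock so that neither overwrites the other's update.

```python
import threading


class LastId:
    """Toy LAST_ID holder whose updates are serialized by a single lock."""

    def __init__(self, value):
        self._lock = threading.Lock()  # stands in for the object lock
        self._value = value

    def allocate(self):
        """Normal path: hand out the next object ID."""
        with self._lock:
            self._value += 1
            return self._value

    def raise_to(self, floor):
        """Repair path: raise the stored value, never lower it."""
        with self._lock:
            if floor > self._value:
                self._value = floor
            return self._value
```

Because the repair path only ever raises the value while holding the lock, a concurrent allocation cannot hand out an ID that the repair then invalidates.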

Comment by nasf (Inactive) [ 17/Jun/13 ]

The list of patches:

http://review.whamcloud.com/#/c/6697/
http://review.whamcloud.com/#/c/6669/
http://review.whamcloud.com/#/c/6698/
http://review.whamcloud.com/#/c/6857/
http://review.whamcloud.com/#/c/7143/
http://review.whamcloud.com/#/c/7144/
http://review.whamcloud.com/#/c/7145/

Comment by nasf (Inactive) [ 04/Jul/13 ]

OI-table-based scanning to repair inodes without LMA (or dummy OI mapping items) will be time-consuming work. Like finding orphan OST objects for LFSCK phase II, we can consider doing that in a later release. Instead, I made another, relatively simple patch to resolve part of the issue: only the used/accessed objects will be repaired by the RPC service threads.

http://review.whamcloud.com/#/c/6899/
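The "repair only what is accessed" idea could look roughly like the sketch below. It does not reflect the contents of the patch above, which are not described in this ticket. The OI is modelled as a dict, and `lookup_with_repair` and its `locate_by_path` callback are hypothetical names for a fallback that finds the object some other way when the OI has no entry.

```python
def lookup_with_repair(oi, fid, locate_by_path):
    """Look up a FID and repair the OI mapping on a miss.

    `oi` is a dict of FID -> inode number. `locate_by_path` is a callback
    that returns the inode number for a FID, or None if it cannot be found.
    """
    ino = oi.get(fid)
    if ino is not None:
        return ino  # fast path: mapping already present
    ino = locate_by_path(fid)
    if ino is not None:
        oi[fid] = ino  # repair only the object that was actually accessed
    return ino
```

Objects that are never accessed stay unrepaired under this scheme, which is the trade-off against a full scan.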

Andreas, what do you think?

Comment by Jodi Levi (Inactive) [ 23/Sep/13 ]

Follow-on work for 2.6 is being tracked under LU-3995. Any additional patches for this work should be linked to LU-3995. Closing this ticket as the 2.5 patches have landed.
