[LU-3569] Use real OST index as ostid_to_fid() parameter instead of always "0" Created: 09/Jul/13  Updated: 25/Jun/20  Resolved: 10/Dec/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.0
Fix Version/s: Lustre 2.6.0, Lustre 2.8.0

Type: Bug Priority: Critical
Reporter: nasf (Inactive) Assignee: nasf (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-7585 Implement OI Scrub for ZFS Resolved
is related to LU-8898 Interop conf-sanity test_52 FAIL: ll... Resolved
is related to LU-4414 ostid_id() returns incorrect OID for ... Resolved
is related to LU-4232 MDT and OST should validate incoming ... Open
Sub-Tasks:
Key
Summary
Type
Status
Assignee
LU-4414 ostid_id() returns incorrect OID for ... Technical task Resolved Di Wang  
Severity: 3
Rank (Obsolete): 8988

 Description   

Currently, the callers for ostid_to_fid() always use "0" as the "ost_idx" parameter. There are potential issues for that, and may cause FID inconsistency as to crash the system.



 Comments   
Comment by Di Wang [ 16/Jul/13 ]

Hmm, currently, we only use loi_index to locate OST for object with IDIF, so I do not think using "0" can cause potential problem here, unless I miss sth?

Comment by Andreas Dilger [ 17/Jul/13 ]

The problem is that LFSCK will begin storing OST object FIDs in the LMA. If it is storing IDIF FIDs in the LMA with OST index 0, this will be confusing in the future. I would prefer to use a consistent FID when accessing the OST objects so that the code does not have to have as many special cases in it.

Comment by Di Wang [ 17/Jul/13 ]

I see. Ok, then I will work on it. Actually, I tried this before but blocked by some quota issue. Actually I more concerned about the compatiblity issue between old MDT and new OST, if the old MDT use some old IDIF to track some sth. I will check.

Comment by Di Wang [ 19/Jul/13 ]

Wang Di (di.wang@intel.com) uploaded a new patch: http://review.whamcloud.com/7053
Subject: LU-3569 ofd: packing ost_idx in IDIF
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 0e45c385508fc4377f07595d281911310fce3f29

Comment by Andreas Dilger [ 02/Oct/13 ]

I see that we are already storing the "index == 0" FID into the OST objects with the current master:

# debugfs /dev/vg_sookie/lvost2
debugfs 1.42.7.wc2 (12-Apr-2013)
debugfs:  stats
Filesystem volume name:   testfs-OST0001
:
:
debugfs:  stat O/0/d0/1856
Inode: 172   Type: regular    Mode:  0666   Flags: 0x80000
:
:
Extended attributes stored in inode body: 
  lma = "08 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 40 07 00 00 00 00 00 00 " (24)
  lma: fid=[0x100000000:0x740:0x0] compat=8 incompat=0
  fid = "00 04 00 00 02 00 00 00 f8 75 00 00 00 00 00 00 " (16)
  fid: parent=[0x200000400:0x75f8:0x0] stripe=0

So this is always storing FID_SEQ_IDIF = 0x100000000 into the on-disk OST object LMA, instead of (FID_SEQ_NORMAL + ost_idx << 16) = 0x100010000 in this case. If this has been around since 2.4.x days then I guess LFSCK will have to fix it up either way and there is no requirement to fix it immediately, but if it is new in 2.5 then I'd prefer that this be fixed before 2.5.0.

Comment by Di Wang [ 03/Oct/13 ]

Oh, unfortunately, we always store FID_SEQ_IDIF(without ost index) for all of OST objects since 2.4.0. So LFSCK have to fix it. Nasf?

Comment by nasf (Inactive) [ 08/Oct/13 ]

I am thinking whether we will have the requirement to change OST-index in the future. If we have, then storing OST-index inside the IDIF will be trouble. Because for normal FID, OST-index is meaningless, the client can get the OST-index by checking FLD with the given FID. So as long as we can adjust the FLD correctly, then changing OST-index without breaking the system is easy.

Currently, because all the IDIF use 0-index, the OI scrub does not care about the OST-index. But if the up layer change the IDIF logic, means both LOD and OFD use the real OST-index, then OI scrub needs to the OST-index to rebuild the LMA and the /O tree. And before such conversion is completed, the system is un-accessable, it is not like OI scrub for MDT file-level backup/restore. Under the later one case, the system is still accessible during the OI scrub rebuilding the OI files. I am not sure whether OI scrub or the OSD should know the 'index' information or not.

Comment by Andreas Dilger [ 08/Oct/13 ]

Even if we don't store the OST index into the on-disk LMA FID, we need to fix a couple of bugs in the ll_lost_found_objs.

Comment by Andreas Dilger [ 08/Oct/13 ]

Looks like I am mistaken. I thought that it was a bug for ll_recover_lost_found_objs checking fid_seq_is_idif() and creating the object path LPX64, since this would move objects to "O/0x0/d*" instead of "O/0/d*". However, I didn't notice until now that the old code is actually using LPX64i instead of LPX64, and LPX64i prints zero as "0" instead of "0x0".

Even so, I cleaned up the code a bit to consolidate where the seq_name is being generated into one place, and IDIF objects now use LPU64 for the seq_name to make the code more clear.

http://review.whamcloud.com/7880

Comment by Lai Siyao [ 31/Jul/15 ]

Andreas, is there anything needed for this ticket?

Comment by Andreas Dilger [ 19/Aug/15 ]

Lai, I'd be grateful if you could revive my http://review.whamcloud.com/7880 patch to fix ll_recover_lost_found_objs, or possibly better is to make a different patch to delete ll_recover_lost_found_objs.c entirely because LFSCK should handle that now (sanity-scrub test_14 tests this).

Comment by Peter Jones [ 24/Aug/15 ]

As per discussion on the triage call, let's just delete the tool for 2.8

Comment by Gerrit Updater [ 18/Sep/15 ]

Lai Siyao (lai.siyao@intel.com) uploaded a new patch: http://review.whamcloud.com/16477
Subject: LU-3569 utils: remove ll_recover_lost_found_obj
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: ef14f2f062ee31302d55a0a29ecee94233f47aa8

Comment by Peter Jones [ 04/Dec/15 ]

Fan Yong

We have been discussing this issue on the triage call and it looks like an fsck issue. Could you please look at the latest feedback in the patch and see what needs to be done?

Thanks

Peter

Comment by nasf (Inactive) [ 08/Dec/15 ]

Peter, I have refreshed the patch http://review.whamcloud.com/#/c/16477/

Comment by Gerrit Updater [ 10/Dec/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/16477/
Subject: LU-3569 utils: remove ll_recover_lost_found_obj
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 8f810496b9b13b79018b6889939a6de6e19ddddd

Comment by Joseph Gmitter (Inactive) [ 10/Dec/15 ]

Last patch has landed for 2.8.

Generated at Sat Feb 10 01:35:02 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.