[LU-3871] e2fsck reports inode reference count error Created: 02/Sep/13 Updated: 07/Oct/13 Resolved: 30/Sep/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.5.0 |
| Fix Version/s: | Lustre 2.5.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Niu Yawei (Inactive) | Assignee: | Di Wang |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 10043 | ||||||||
| Description |
|
It can be easily reproduced by:

  sh llmount.sh
  ....
  sh llmountcleanup.sh
  ....
  e2fsck -fn $ost_dev

The e2fsck shows:

  e2fsck 1.42.7.wc1 (12-Apr-2013)
  Pass 1: Checking inodes, blocks, and sizes
  Pass 2: Checking directory structure
  Pass 3: Checking directory connectivity
  Pass 4: Checking reference counts
  Inode 90 ref count is 1, should be 2. Fix? no
  Inode 94 ref count is 1, should be 2. Fix? no
  Inode 105 ref count is 1, should be 2. Fix? no
  Inode 107 ref count is 1, should be 2. Fix? no
  Pass 5: Checking group summary information

  lustre-OST0000: ********** WARNING: Filesystem still has errors **********

  lustre-OST0000: 293/50016 files (0.7% non-contiguous), 9608/50000 blocks

I tracked the problem down, and it looks like it was introduced by commit: e0702769f267dd009a6287bbc9da2760079a101d ( |
| Comments |
| Comment by Niu Yawei (Inactive) [ 02/Sep/13 ] |
|
Di, could you take a look? Thanks. |
| Comment by Niu Yawei (Inactive) [ 02/Sep/13 ] |
|
The change in fid_is_on_ost() is highly suspicious (e0702769f267dd009a6287bbc9da2760079a101d):

--- a/lustre/osd-ldiskfs/osd_oi.c
+++ b/lustre/osd-ldiskfs/osd_oi.c
@@ -470,8 +470,6 @@ static int osd_oi_iam_lookup(struct osd_thread_info *oti,
 int fid_is_on_ost(struct osd_thread_info *info, struct osd_device *osd,
                   const struct lu_fid *fid, enum oi_check_flags flags)
 {
-        struct lu_seq_range *range = &info->oti_seq_range;
-        int rc;
         ENTRY;
 
         if (flags & OI_KNOWN_ON_OST)
@@ -487,17 +485,7 @@ int fid_is_on_ost(struct osd_thread_info *info, struct osd_device *osd,
         if (!(flags & OI_CHECK_FLD))
                 RETURN(0);
 
-        rc = osd_fld_lookup(info->oti_env, osd, fid, range);
-        if (rc != 0) {
-                CERROR("%s: Can not lookup fld for "DFID"\n",
-                       osd_name(osd), PFID(fid));
-                RETURN(rc);
-        }
-
-        CDEBUG(D_INFO, "fid "DFID" range "DRANGE"\n", PFID(fid),
-               PRANGE(range));
-
-        if (fld_range_is_ost(range))
+        if (osd->od_is_ost)
                 RETURN(1);
 
         RETURN(0);

This definitely changed the behaviour when OI_CHECK_FLD is present: after the change, fid_is_on_ost() always returns 1 on an OST device.

I also found the following code confusing:

#define LU_SEQ_RANGE_MDT        0x0
#define LU_SEQ_RANGE_OST        0x1
#define LU_SEQ_RANGE_ANY        0x3
#define LU_SEQ_RANGE_MASK       0x3

What does LU_SEQ_RANGE_ANY mean? Is it a combination of RANGE_MDT and RANGE_OST, or is it another type of range? |
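The behaviour change described in the comment above can be sketched as a small stand-alone model. This is not the Lustre source: the `MODEL_` flag values, the simplified signatures, and the assumption that the `OI_KNOWN_ON_OST` branch returns 1 are all hypothetical, and the FLD lookup error path (`rc != 0`) is left out. Only the `LU_SEQ_RANGE_*` values are taken from the comment itself.

```c
#include <stdbool.h>

/* Range type flags, values as quoted in the comment above. */
#define LU_SEQ_RANGE_MDT  0x0
#define LU_SEQ_RANGE_OST  0x1
#define LU_SEQ_RANGE_ANY  0x3
#define LU_SEQ_RANGE_MASK 0x3

/* Hypothetical flag values; only the flag names appear in the diff. */
enum model_oi_check_flags {
    MODEL_OI_CHECK_FLD    = 0x1,
    MODEL_OI_KNOWN_ON_OST = 0x2
};

/*
 * Model of the logic before the commit: with OI_CHECK_FLD set, the
 * answer depends on what the FLD lookup reports for this FID.
 * fld_says_ost stands in for fld_range_is_ost(range).
 */
int fid_is_on_ost_before(bool fld_says_ost, unsigned int flags)
{
    if (flags & MODEL_OI_KNOWN_ON_OST)
        return 1; /* assumed: this branch is not visible in the diff */
    if (!(flags & MODEL_OI_CHECK_FLD))
        return 0;
    return fld_says_ost ? 1 : 0;
}

/*
 * Model of the logic after the commit: with OI_CHECK_FLD set, the
 * answer depends only on the device type, not on the FID.
 */
int fid_is_on_ost_after(bool od_is_ost, unsigned int flags)
{
    if (flags & MODEL_OI_KNOWN_ON_OST)
        return 1; /* assumed, as above */
    if (!(flags & MODEL_OI_CHECK_FLD))
        return 0;
    return od_is_ost ? 1 : 0;
}

/* Plain arithmetic check: is ANY the bitwise OR of MDT and OST? */
bool range_any_is_mdt_or_ost(void)
{
    return LU_SEQ_RANGE_ANY == (LU_SEQ_RANGE_MDT | LU_SEQ_RANGE_OST);
}
```

In this model, a FID that the FLD would not report as an OST range gives 0 from `fid_is_on_ost_before(false, MODEL_OI_CHECK_FLD)` but 1 from `fid_is_on_ost_after(true, MODEL_OI_CHECK_FLD)` on an OST device. The last helper returns false, since `0x0 | 0x1` is `0x1` rather than `0x3`, so with the quoted values LU_SEQ_RANGE_ANY cannot be a simple OR of the other two flags.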
| Comment by Di Wang [ 02/Sep/13 ] |
|
I pushed a patch: http://review.whamcloud.com/#/c/7527/ . I think the problem is in osd_oi_insert, and it seems a lot of special FIDs are inserted into the IAM. Is this on purpose, or is it a bug? Is it related to OI scrub? Fan Yong, could you please have a look? |
| Comment by nasf (Inactive) [ 03/Sep/13 ] |
|
Inserting local objects into the OI (oi.16.xxx) is expected. They can be easily distinguished from normal FIDs: they are either IGIFs or local FIDs. |
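As a rough illustration of how such FIDs could be told apart by sequence number, here is a minimal sketch. The `MODEL_` constants mirror the values of FID_SEQ_IGIF, FID_SEQ_IGIF_MAX and FID_SEQ_LOCAL_FILE as recalled from lustre_idl.h and should be checked against the tree in use. The function name and enum are hypothetical, and treating only one sequence as "local" is a simplification.

```c
#include <stdint.h>

/* Assumed sequence values; verify against lustre_idl.h before relying on them. */
#define MODEL_FID_SEQ_IGIF       12ULL
#define MODEL_FID_SEQ_IGIF_MAX   0x0ffffffffULL
#define MODEL_FID_SEQ_LOCAL_FILE 0x200000001ULL

enum model_fid_kind {
    MODEL_FID_IGIF,  /* sequence falls in the IGIF range */
    MODEL_FID_LOCAL, /* sequence reserved for local files */
    MODEL_FID_OTHER  /* anything else, including normal FIDs */
};

/* Classify a FID by its sequence number only (hypothetical helper). */
enum model_fid_kind model_classify_seq(uint64_t seq)
{
    if (seq >= MODEL_FID_SEQ_IGIF && seq <= MODEL_FID_SEQ_IGIF_MAX)
        return MODEL_FID_IGIF;
    if (seq == MODEL_FID_SEQ_LOCAL_FILE)
        return MODEL_FID_LOCAL;
    return MODEL_FID_OTHER;
}
```

A check of this shape is cheap because it needs no lookup: the sequence alone says whether an OI entry belongs to a local object.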
| Comment by Andreas Dilger [ 04/Sep/13 ] |
|
Niu, which inodes are the ones reported by e2fsck (90, 94, 105, 107)? Are these OST objects, directories, other? |
| Comment by Niu Yawei (Inactive) [ 05/Sep/13 ] |
|
I believe they are special objects on the OST, such as named llogs, global quota files, etc. Di has posted a patch for this (http://review.whamcloud.com/#/c/7527/), though it needs to be updated. |
| Comment by Peter Jones [ 30/Sep/13 ] |
|
Landed for 2.5.0 |