[LU-3132] lfsck: failed to recreate /mnt/lustre/d0.lfsck/testfile.7 missing obj 0:12 Created: 09/Apr/13  Updated: 14/May/13  Resolved: 14/May/13

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 1.8.9
Fix Version/s: Lustre 1.8.9

Type: Bug Priority: Minor
Reporter: Emoly Liu Assignee: Emoly Liu
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-3041 Interop 1.8.9<->2.4 failure on test s... Resolved
Severity: 3
Rank (Obsolete): 7605

 Description   

The b1_8 lfsck test log shows:

lfsck 1.42.6.wc2 (10-Dec-2012)
lfsck: ost_idx 0: pass1: check for duplicate objects
lfsck: ost_idx 0: pass1 OK (20 files total)
lfsck: ost_idx 0: pass2: check for missing inode objects
[0]: failed to recreate /mnt/lustre/d0.lfsck/testfile.7 missing obj 0:12
lfsck: ost_idx 0: pass2 ERROR: 1 dangling inodes found (21 files total)

The maloo report is at https://maloo.whamcloud.com/test_sessions/61ecab6c-a051-11e2-898f-52540035b04c



 Comments   
Comment by Emoly Liu [ 25/Apr/13 ]

I added some debugging info to ll_lov_recreate().

@@ -2254,6 +2254,8 @@ static int ll_lov_recreate(struct inode *inode, obd_id id, obd_gr gr,
 out:
         up(&lli->lli_size_sem);
         OBDO_FREE(oa);
+        printk("rc=%d: [%u] recreate missing obj "LPU64":"LPU64"\n",
+                rc, (__u32)ost_idx, id, gr);
         return rc;
 }

The lfsck output showed that there were 5 dangling inodes,

lfsck: ost_idx 0: pass2: check for missing inode objects
[0]: failed to recreate /mnt/lustre/d0.lfsck/testfile.1 missing obj 0:34
[0]: failed to recreate /mnt/lustre/d0.lfsck/testfile.3 missing obj 0:35
[0]: failed to recreate /mnt/lustre/d0.lfsck/testfile.5 missing obj 0:36
[0]: failed to recreate /mnt/lustre/d0.lfsck/testfile.7 missing obj 0:37
[0]: failed to recreate /mnt/lustre/d0.lfsck/testfile.9 missing obj 0:38
lfsck: ost_idx 0: pass2 ERROR: 5 dangling inodes found (76 files total)

but the debugging info showed that the ost_idx and id in the struct ll_recreate_obj passed from lfsck_recreate_obj() via ioctl are wrong, and every call fails with rc=-22 (-EINVAL):

rc=-22: [0] recreate missing obj 47571424239216:0
rc=-22: [0] recreate missing obj 47571424239216:0
rc=-22: [0] recreate missing obj 47571424239216:0
rc=-22: [0] recreate missing obj 47571424239216:0
rc=-22: [0] recreate missing obj 47571424239216:0

I will investigate it.

Comment by Emoly Liu [ 26/Apr/13 ]

After looking into lfsck.c, I found something wrong in lfsck_recreate_obj().

/* If an MDS file is missing an object recreate object using an ioctl call */
static int lfsck_recreate_obj(int cmd, void *creat, struct ost_id *oi,
                              __u32 ost_idx, char *path)
{
......
        rc = ioctl(fd, cmd, &creat); 

Here, "creat" is already a pointer to a struct ll_recreate_obj variable, so "&creat" passes the address of the pointer itself rather than the address of the struct. We should pass "creat" to ioctl directly.
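A minimal user-space sketch of this pointer mix-up, for illustration only. It does not use the real Lustre ioctl: struct recreate_obj, mock_ioctl(), recreate_fixed() and recreate_buggy() are hypothetical names, and the struct layout is an assumption. The mock copies a struct from whatever address it is given, the way an ioctl handler would copy its argument from user space. When it is handed the address of the pointer variable, the bytes it reads as "id" are the pointer value. That would be consistent with the id 47571424239216 (0x2B4415D7DE70) printed by the debugging patch, which looks like a user-space address, though the ticket does not confirm this.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct ll_recreate_obj (layout assumed). */
struct recreate_obj {
    uint64_t id;
    uint32_t ost_idx;
};

/* Mock of the kernel side of the ioctl: copy a struct from the address
 * the caller supplied and report the id found there. */
static uint64_t mock_ioctl(const void *arg)
{
    struct recreate_obj copy;

    memcpy(&copy, arg, sizeof(copy));
    return copy.id;
}

/* Correct pattern: creat already points at the struct, pass it as is. */
static uint64_t recreate_fixed(void *creat)
{
    return mock_ioctl(creat);
}

/* Buggy pattern, equivalent to ioctl(fd, cmd, &creat): the callee is
 * given the address of the pointer variable, not of the struct. */
static uint64_t recreate_buggy(void *creat)
{
    /* The pointer is stored in a zero-padded slot so that reading
     * sizeof(struct recreate_obj) bytes from its address stays in
     * bounds; &slot is the same address as &slot.ptr. */
    union {
        void *ptr;
        struct recreate_obj pad;
    } slot;

    memset(&slot, 0, sizeof(slot));
    slot.ptr = creat;
    return mock_ioctl(&slot);
}
```

With a struct whose id is 12, recreate_fixed() returns 12, while recreate_buggy() returns a value derived from the pointer (on a typical 64-bit little-endian machine, the address of the struct itself).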

I will push a patch later.

Comment by Emoly Liu [ 27/Apr/13 ]

http://review.whamcloud.com/#change,6181

Comment by Emoly Liu [ 14/May/13 ]

Patch landed.
