Details
Bug
Resolution: Fixed
Minor
None
None
3
Description
== sanity-lfsck test 18d: Find out orphan OST-object and repair it (4) =============================== 16:03:13 (1477411393)
#####
The target MDT-object layout EA slot is occpuied by some new created OST-object when repair dangling reference case. Such conflict OST-object has never been modified. Then when found the orphan OST-object, LFSCK will replace it with the orphan OST-object.
#####
[0x280000400:0x4:0x0]
/mnt/lustre/d18d.sanity-lfsck/a1/f1
lmm_stripe_count:   1
lmm_stripe_size:    1048576
lmm_pattern:        1
lmm_layout_gen:     0
lmm_stripe_offset:  0
    obdidx    objid    objid    group
         0        2      0x2        0
[0x280000400:0x5:0x0]
/mnt/lustre/d18d.sanity-lfsck/a1/f2
lmm_stripe_count:   1
lmm_stripe_size:    1048576
lmm_pattern:        1
lmm_layout_gen:     0
lmm_stripe_offset:  0
    obdidx    objid    objid    group
         0        3      0x3        0
Inject failure to make /mnt/lustre/d18d.sanity-lfsck/a1/f1 and /mnt/lustre/d18d.sanity-lfsck/a1/f2 to reference the same OST-object (which is f1's OST-obejct). Then drop /mnt/lustre/d18d.sanity-lfsck/a1/f1 and its OST-object, so f2 becomes dangling reference case, but f2's old OST-object is there.
fail_loc=0x1618
fail_loc=0
stopall to cleanup object cache
setupall
pdsh@fre0127: fre0125: ssh exited with exit code 1
pdsh@fre0127: fre0125: ssh exited with exit code 1
pdsh@fre0127: fre0125: ssh exited with exit code 1
pdsh@fre0127: fre0125: ssh exited with exit code 1
pdsh@fre0127: fre0125: ssh exited with exit code 1
pdsh@fre0127: fre0125: ssh exited with exit code 1
pdsh@fre0127: fre0126: ssh exited with exit code 1
pdsh@fre0127: fre0126: ssh exited with exit code 1
pdsh@fre0127: fre0126: ssh exited with exit code 1
pdsh@fre0127: fre0126: ssh exited with exit code 1
The file size should be incorrect since dangling referenced
ls: cannot access /mnt/lustre/d18d.sanity-lfsck/a1/f2: No such file or directory
fail_val=5
fail_loc=0x1602
Trigger layout LFSCK on all devices to find out orphan OST-object
Started LFSCK on the device lustre-MDT0000: scrub layout
Waiting 120 secs for update
Waiting 110 secs for update
Waiting 100 secs for update
Waiting 90 secs for update
Waiting 80 secs for update
Waiting 70 secs for update
Waiting 60 secs for update
Waiting 50 secs for update
Waiting 40 secs for update
Waiting 30 secs for update
Waiting 20 secs for update
Waiting 10 secs for update
Update not seen after 120s: wanted 'scanning-phase2' got 'completed'
sanity-lfsck test_18d: @@@@@@ FAIL: (3.0) MDS1 is not the expected 'scanning-phase2'
...
Resetting fail_loc on all nodes...done.
FAIL 18d (214s)
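The failure above ("wanted 'scanning-phase2' got 'completed'") looks like a polling check that can miss a short-lived intermediate state. The following is a minimal Python sketch of that pattern, not the Lustre test framework's code: `wait_for_status`, `FakeLfsck`, the 10-second poll interval, and the phase timings are all hypothetical and chosen only for illustration. It also shows one possible way to tolerate the race, by accepting the final state as well; whether the merged patch does exactly this is not stated in this ticket.

```python
from typing import Callable, Iterable, Tuple


def wait_for_status(get_status: Callable[[], str],
                    wanted: Iterable[str],
                    timeout_s: int,
                    interval_s: int,
                    sleep: Callable[[int], None]) -> Tuple[bool, str]:
    """Poll get_status() every interval_s seconds until it returns one of
    `wanted`, or until timeout_s seconds of waiting have elapsed.
    Returns (matched, last_status_seen)."""
    wanted = set(wanted)
    waited = 0
    last = get_status()
    while last not in wanted and waited < timeout_s:
        sleep(interval_s)
        waited += interval_s
        last = get_status()
    return last in wanted, last


class FakeLfsck:
    """Hypothetical stand-in for an LFSCK status source on a simulated clock."""

    def __init__(self, phase2_start: int, phase2_end: int) -> None:
        self.now = 0
        self.phase2_start = phase2_start
        self.phase2_end = phase2_end

    def sleep(self, seconds: int) -> None:
        self.now += seconds  # advance simulated time; no real sleeping

    def status(self) -> str:
        if self.now < self.phase2_start:
            return "scanning-phase1"
        if self.now < self.phase2_end:
            return "scanning-phase2"
        return "completed"


def run(wanted: Iterable[str]) -> Tuple[bool, str]:
    # Assumed timings: phase 2 spans t=3s..7s, which falls entirely
    # between the polls at t=0s and t=10s.
    lfsck = FakeLfsck(phase2_start=3, phase2_end=7)
    return wait_for_status(lfsck.status, wanted, timeout_s=120,
                           interval_s=10, sleep=lfsck.sleep)


# Strict check: only the intermediate state counts, so the poller times out.
strict = run(["scanning-phase2"])
# Tolerant check: the final state is also accepted.
tolerant = run(["scanning-phase2", "completed"])
print(strict)
print(tolerant)
```

In this simulation the strict check never observes `scanning-phase2` and gives up after the full timeout with the last status `completed`, while the tolerant check succeeds on the second poll.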
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/23650/
Subject: LU-8810 tests: skip non-crucial LFSCK intermediateness check
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 790b56d82cf82dbaf30c1d1788e647d1e4a8dee0