[LU-5248] Test failure on sanity-lfsck.sh, subtest test_4: ls: reading directory /mnt/lustre/d4.sanity-lfsck: Input/output error
Created: 24/Jun/14  Updated: 04/Dec/14  Resolved: 04/Dec/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.1, Lustre 2.4.3
Fix Version/s: Lustre 2.5.4

Type: Bug
Priority: Major
Reporter: Emoly Liu
Assignee: Emoly Liu
Resolution: Fixed
Votes: 0
Labels: None

Issue Links:
Blocker
is blocking LU-5241 2.4.3<->2.5.2 interop: sanity-lfsck t... Resolved
Duplicate
duplicates LU-5112 Interop 2.5 server with higher versio... Resolved
Severity: 3
Rank (Obsolete): 14639

 Description   

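I found this issue while backporting http://review.whamcloud.com/#/c/9704 to b2_5. sanity-lfsck.sh test_5 always complained that the test_4 environment was insane.

The issue is easy to reproduce with the following debug patch, which replaces the single checked ls in test_4 with two consecutive listings and prints the return code of each: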

diff --git a/lustre/tests/sanity-lfsck.sh b/lustre/tests/sanity-lfsck.sh
index 4d203d6..f8765c0 100644
--- a/lustre/tests/sanity-lfsck.sh
+++ b/lustre/tests/sanity-lfsck.sh
@@ -386,7 +386,11 @@ test_4()
 
        #define OBD_FAIL_FID_LOOKUP     0x1505
        do_facet $SINGLEMDS $LCTL set_param fail_loc=0x1505
-       ls $DIR/$tdir/ > /dev/null || error "(11) no FID-in-dirent."
+       #ls $DIR/$tdir > /dev/null || error "(11) no FID-in-dirent."
+       ls -al $DIR/$tdir
+       echo "rc=${PIPESTATUS[0]}"
+       ls -al $DIR/$tdir
+       echo "rc=${PIPESTATUS[0]}"
 
        do_facet $SINGLEMDS $LCTL set_param fail_loc=0
 }

The output looks like this (the first ls succeeds, the second fails):

fail_loc=0x1505
total 0
rc=0
ls: reading directory /mnt/lustre/d4.sanity-lfsck: Input/output error
total 0
rc=2
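
The same "list twice, report each status" check can be tried outside the test framework. This is a minimal sketch, assuming bash; list_twice is a hypothetical helper that is not part of sanity-lfsck.sh, and it does not set fail_loc, so on a healthy directory both attempts should return 0.

```shell
#!/bin/bash
# Hypothetical helper (not part of sanity-lfsck.sh): list a directory
# twice and print the exit status of each attempt on its own line.
list_twice() {
    local dir=$1 attempt rc
    for attempt in 1 2; do
        ls -al "$dir" > /dev/null 2>&1
        rc=$?                      # exit status of this ls only
        echo "attempt=$attempt rc=$rc"
    done
}

# Example call (path taken from the report above):
#   list_twice /mnt/lustre/d4.sanity-lfsck
```

Note that the rc=2 printed by the debug patch is the exit status of ls, not an errno value; GNU ls exits with 2 on serious trouble such as a failed directory read.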

dmesg log:

Lustre: *** cfs_fail_loc=1505, val=0***
Lustre: 14962:0:(mdd_object.c:1955:mdd_dir_page_build()) build page failed: -2!
LustreError: 15167:0:(dir.c:422:ll_get_dir_page()) read cache page: [0x200000400:0x1:0x0] at 0: rc -2
LustreError: 15167:0:(dir.c:584:ll_dir_read()) error reading dir [0x200000400:0x1:0x0] at 0: rc -2
LustreError: 15170:0:(dir.c:398:ll_get_dir_page()) dir page locate: [0x200000400:0x1:0x0] at 0: rc -5
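
The rc values in the log above are negated Linux errno numbers. As a reading aid, here is a small sketch that decodes the two values seen here; errno_name is a hypothetical helper and deliberately covers only these two codes.

```shell
#!/bin/bash
# Hypothetical helper: map the negated errno values that appear in the
# dmesg log above to their symbolic names. Only -2 and -5 are handled.
errno_name() {
    local rc=${1#-}                # drop the leading minus sign
    case "$rc" in
        2) echo "ENOENT (No such file or directory)" ;;
        5) echo "EIO (Input/output error)" ;;
        *) echo "unknown ($1)" ;;
    esac
}

# Example calls:
#   errno_name -2
#   errno_name -5
```

Read this way, the first listing hits -2 (ENOENT) while building the directory page, and the second listing gets -5 (EIO), which matches the "Input/output error" that ls prints.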


 Comments   
Comment by Oleg Drokin [ 24/Jun/14 ]

So, does the complaint about the test_4 environment being insane affect only b2_5, or master as well?

Comment by Emoly Liu [ 25/Jun/14 ]

Oleg, in my testing it only affects b2_5; master has no such issue.

Comment by nasf (Inactive) [ 08/Jul/14 ]

Here is the patch:

http://review.whamcloud.com/11006

Comment by nasf (Inactive) [ 28/Jul/14 ]

Here is the patch:

http://review.whamcloud.com/11006

Do we have any plan to land this patch to b2_5?

Comment by Gerrit Updater [ 04/Dec/14 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/11006/
Subject: LU-5248 osd: NOT inject OBD_FAIL_FID_LOOKUP on dotdot
Project: fs/lustre-release
Branch: b2_5
Current Patch Set:
Commit: c202c54d6d84f54560474f3a6f316af4fdd9e475

Comment by Peter Jones [ 04/Dec/14 ]

Landed for 2.5.4
