[LU-368] lfsck: exit with 2 unfixed errors Created: 27/May/11  Updated: 02/Sep/15  Resolved: 02/Sep/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 1.8.6
Fix Version/s: Lustre 1.8.6

Type: Bug Priority: Major
Reporter: Jian Yu Assignee: Jian Yu
Resolution: Won't Fix Votes: 0
Labels: None
Environment:

Lustre Branch: b1_8
Lustre Build: http://newbuild.whamcloud.com/job/lustre-b1_8/61/
Distro/Arch: RHEL5/x86_64 (OFED 1.5.3, ext4)

One Single Node: client-8-ib (1 combo MGS/MDT and 6 OSTs)

[root@client-8-ib ~]# rpm -q e2fsprogs
e2fsprogs-1.41.90.wc2-0redhat


Attachments: File db.tgz     File lfsck-1306485152.tar.bz2    
Severity: 3
Rank (Obsolete): 10345

 Description   

lfsck test failed as follows:

lfsck: pass4 finished
lfsck: exit with 2 unfixed errors
 lfsck : @@@@@@ FAIL: lfsck -c -l --mdsdb /home/yujian/test_logs/mdsdb --ostdb  /home/yujian/test_logs/ostdb-0 /home/yujian/test_logs/ostdb-1 /home/yujian/test_logs/ostdb-2 /home/yujian/test_logs/ostdb-3 /home/yujian/test_logs/ostdb-4 /home/yujian/test_logs/ostdb-5 /mnt/lustre returned 2, should be <= 1 
Dumping lctl log to /home/yujian/test_logs/2011-05-27/013056/lfsck..*.1306485152.log
tar: Removing leading `/' from member names
/home/yujian/test_logs/2011-05-27/013056/lfsck-1306485152.tar.bz2
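
The FAIL line above reflects a threshold check on the lfsck return code: the test accepts values up to 1 and fails on anything higher. A minimal sketch of such a check follows. The names `check_return_code` and `max_ok` are hypothetical and are not taken from the Lustre test framework.

```python
def check_return_code(rc: int, max_ok: int = 1) -> str:
    """Return a PASS/FAIL line for a command's return code.

    Hypothetical helper: it mirrors the "returned 2, should be <= 1"
    message seen in the log, not the actual test-framework code.
    """
    if rc <= max_ok:
        return f"PASS: returned {rc}"
    return f"FAIL: returned {rc}, should be <= {max_ok}"


# The run in this ticket returned 2, which exceeds the accepted maximum of 1.
print(check_return_code(2))
```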

Dmesg showed:

LustreError: 18028:0:(filter.c:1557:filter_destroy_internal()) destroying objid 10 ino 78289 nlink 0 count 2
LustreError: 18028:0:(filter.c:1563:filter_destroy_internal()) error unlinking objid 10: rc -2
LustreError: 18042:0:(filter.c:1557:filter_destroy_internal()) destroying objid 29 ino 78308 nlink 0 count 2
LustreError: 18042:0:(filter.c:1563:filter_destroy_internal()) error unlinking objid 29: rc -2
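
The `rc -2` in these messages follows the usual Linux kernel convention of returning a negated errno value. On Linux, errno 2 is ENOENT. A small sketch for decoding such values, using a hypothetical helper name `decode_rc`:

```python
import errno
import os


def decode_rc(rc: int) -> str:
    """Map a kernel-style (negative) return code to its errno name and text."""
    code = -rc if rc < 0 else rc
    # errno.errorcode maps numeric errno values to their symbolic names.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


# Decode the value reported by filter_destroy_internal() in the dmesg output.
print(decode_rc(-2))
```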

Maloo report: https://maloo.whamcloud.com/test_sets/e129e0b4-883b-11e0-b4df-52540025f9af

The logs and db files are attached.



 Comments   
Comment by Peter Jones [ 27/May/11 ]

Andreas is trying to reproduce this

Comment by Jian Yu [ 15/Jun/11 ]

Lustre Branch: v1_8_6_RC2
Lustre Build: http://newbuild.whamcloud.com/job/lustre-b1_8/80/
e2fsprogs Build: http://newbuild.whamcloud.com/job/e2fsprogs-master/40/
Distro/Arch: RHEL6/x86_64(patchless client, in-kernel OFED, kernel version: 2.6.32-131.2.1.el6)
RHEL5/x86_64(server, OFED 1.5.3.1, kernel version: 2.6.18-238.12.1.el5_lustre)

The same failure occurred while running the lfsck test:

lfsck: /mnt/lustre/lost+found/duplicates/5-0:207-[0x7a82:0x56816da7:0x0]:testfile.20 unlinked
lfsck: /mnt/lustre/lost+found/duplicates/5-0:207-[0x7a8c:0x665fef84:0x0]:testfile.20.bad unlinked
removed directory: `/mnt/lustre/lost+found/duplicates'
lfsck: pass4 finished
lfsck: exit with 2 unfixed errors
 lfsck : @@@@@@ FAIL: lfsck -c -l --mdsdb /home/yujian/test_logs/mdsdb --ostdb  /home/yujian/test_logs/ostdb-0 /home/yujian/test_logs/ostdb-1 /home/yujian/test_logs/ostdb-2 /home/yujian/test_logs/ostdb-3 /home/yujian/test_logs/ostdb-4 /home/yujian/test_logs/ostdb-5 /mnt/lustre returned 2, should be <= 1 

Maloo report: https://maloo.whamcloud.com/test_sets/dbb262d4-9739-11e0-9a27-52540025f9af

Comment by Peter Jones [ 09/Aug/11 ]

Yu Jian is going to work on this

Comment by Jian Yu [ 12/Aug/11 ]

I could reproduce the issue on a single RHEL5/x86_64 test node against the latest Lustre b1_8 build (http://newbuild.whamcloud.com/job/lustre-b1_8/120/) with e2fsprogs-1.41.90.wc3-0redhat. The same test passed on the master branch. I'll investigate further.

Comment by Andreas Dilger [ 02/Sep/15 ]

Old lfsck is no longer supported.
