The lfsck.sh script problem can be fixed easily by changing the "debugfs remove" to a "local mount remove", that is, by mounting the OST device locally and removing the objects with rm.
The new object-removal function looks like this:
# Remove objects associated with files.
remove_objects() {
	local ostdev=$1
	shift
	local group=$1
	shift
	local objids="$@"
	local facet=ost$((OSTIDX + 1))
	local mntpt=$(facet_mntpt $facet)
	local opts=$OST_MOUNT_OPTS
	local i
	local rc

	echo "removing objects from $ostdev on $facet: $objids"
	if ! do_facet $facet test -b $ostdev; then
		opts=$(csa_add "$opts" -o loop)
	fi
	mount -t $(facet_fstype $facet) $opts $ostdev $mntpt ||
		return $?
	rc=0
	for i in $objids; do
		rm $mntpt/O/$group/d$((i % 32))/$i || { rc=$?; break; }
	done
	umount -f $mntpt || return $?
	return $rc
}
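For reference, the path that the rm in the loop above targets can be sketched on its own. This is only an illustration of the O/<group>/d<objid % 32>/<objid> layout used by remove_objects(); the helper name ost_object_path and the mount point /mnt/ost1 are hypothetical and are not part of lfsck.sh or test-framework.sh.

```shell
# Hypothetical helper (not part of lfsck.sh): print the on-disk path of an
# OST object, following the O/<group>/d<objid % 32>/<objid> layout that
# remove_objects() relies on.
ost_object_path() {
	local mntpt=$1	# where the OST device is mounted (assumed path)
	local group=$2	# object group
	local objid=$3	# object id

	echo "$mntpt/O/$group/d$((objid % 32))/$objid"
}

ost_object_path /mnt/ost1 0 43	# prints /mnt/ost1/O/0/d11/43
```

Object 43 in group 0 lands in subdirectory d11 because 43 % 32 = 11.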
With the new test script, e2fsck no longer complains about the quota usage inconsistency.
However, it seems that lfsck.sh cannot pass in the "if is_empty_fs $MOUNT" case at all. During the lfsck fix phase it hits an LBUG in mdd_create_data() (because we no longer support the MDS_OPEN_HAS_OBJS flag), and after fixing the LBUG, lfsck still fails:
lfsck -c -l --mdsdb /tmp/mdsdb --ostdb /tmp/ostdb-0 /tmp/ostdb-1 /mnt/lustre
lfsck 1.42.5.wc2 (15-Sep-2012)
lfsck: ost_idx 0: pass1: check for duplicate objects
lfsck: ost_idx 0: pass1 OK (71 files total)
lfsck: ost_idx 0: pass2: check for missing inode objects
Failed to find fid [0x200000400:0x57:0x0]: DB_NOTFOUND: No matching key/data pair found
Failed to find fid [0x200000400:0x59:0x0]: DB_NOTFOUND: No matching key/data pair found
Failed to find fid [0x200000400:0x5b:0x0]: DB_NOTFOUND: No matching key/data pair found
Failed to find fid [0x200000400:0x5d:0x0]: DB_NOTFOUND: No matching key/data pair found
Failed to find fid [0x200000400:0x5f:0x0]: DB_NOTFOUND: No matching key/data pair found
lfsck: ost_idx 0: pass2 OK (76 objects)
lfsck: ost_idx 0: pass3: check for orphan objects
lfsck: [0]: pass3 saved orphan object 0:43, 1048576 bytes
lfsck: [0]: pass3 saved orphan object 0:44, 1048576 bytes
lfsck: [0]: pass3 saved orphan object 0:45, 1048576 bytes
lfsck: [0]: pass3 saved orphan object 0:46, 1048576 bytes
lfsck: [0]: pass3 saved orphan object 0:47, 1048576 bytes
lfsck: ost_idx 0: pass3 FIXED: 5MB of orphan data (5 of 91 files total)
lfsck: ost_idx 1: pass1: check for duplicate objects
lfsck: ost_idx 1: pass1 OK (71 files total)
lfsck: ost_idx 1: pass2: check for missing inode objects
lfsck: ost_idx 1: pass2 OK (76 objects)
lfsck: ost_idx 1: pass3: check for orphan objects
lfsck: [1]: pass3 saved orphan object 0:43, 1048576 bytes
lfsck: [1]: pass3 saved orphan object 0:44, 1048576 bytes
lfsck: [1]: pass3 saved orphan object 0:45, 1048576 bytes
lfsck: [1]: pass3 saved orphan object 0:46, 1048576 bytes
lfsck: [1]: pass3 saved orphan object 0:47, 1048576 bytes
lfsck: ost_idx 1: pass3 FIXED: 5MB of orphan data (5 of 96 files total)
lfsck: pass4: check for 20 duplicate object references
Failed to find fid [0x200000400:0x61:0x0]: DB_NOTFOUND: No matching key/data pair found
Failed to find fid [0x6a:0xedbddba1:0x0]: DB_NOTFOUND: No matching key/data pair found
Failed to find fid [0x200000400:0x63:0x0]: DB_NOTFOUND: No matching key/data pair found
Failed to find fid [0xc1:0xedbddba3:0x0]: DB_NOTFOUND: No matching key/data pair found
Failed to find fid [0x200000400:0x67:0x0]: DB_NOTFOUND: No matching key/data pair found
Failed to find fid [0xc5:0xedbddba7:0x0]: DB_NOTFOUND: No matching key/data pair found
Failed to find fid [0x200000400:0x65:0x0]: DB_NOTFOUND: No matching key/data pair found
Failed to find fid [0xc3:0xedbddba5:0x0]: DB_NOTFOUND: No matching key/data pair found
Failed to find fid [0x200000400:0x69:0x0]: DB_NOTFOUND: No matching key/data pair found
Failed to find fid [0xc7:0xedbddba9:0x0]: DB_NOTFOUND: No matching key/data pair found
Failed to find fid [0x200000400:0x62:0x0]: DB_NOTFOUND: No matching key/data pair found
Failed to find fid [0x6c:0xedbddba2:0x0]: DB_NOTFOUND: No matching key/data pair found
Failed to find fid [0x200000400:0x64:0x0]: DB_NOTFOUND: No matching key/data pair found
Failed to find fid [0xc2:0xedbddba4:0x0]: DB_NOTFOUND: No matching key/data pair found
Failed to find fid [0x200000400:0x68:0x0]: DB_NOTFOUND: No matching key/data pair found
Failed to find fid [0xc6:0xedbddba8:0x0]: DB_NOTFOUND: No matching key/data pair found
Failed to find fid [0x200000400:0x66:0x0]: DB_NOTFOUND: No matching key/data pair found
Failed to find fid [0xc4:0xedbddba6:0x0]: DB_NOTFOUND: No matching key/data pair found
Failed to find fid [0x200000400:0x6a:0x0]: DB_NOTFOUND: No matching key/data pair found
Failed to find fid [0xc8:0xedbddbaa:0x0]: DB_NOTFOUND: No matching key/data pair found
removed directory: `/mnt/lustre/lost+found/duplicates'
lfsck: pass4 finished
lfsck: fixed 10 errors
lfsck finished with rc=1
removed `/tmp/mdsdb'
removed `/tmp/mdsdb.mdshdr'
removed `/tmp/ostdb-0'
removed `/tmp/ostdb-1'
clean after the first check
== lfsck test complete, duration 44 sec == 03:35:27 (1359016527)
rm: cannot remove `/mnt/lustre/d0.lfsck/testfile.18.bad': Input/output error
rm: cannot remove `/mnt/lustre/d0.lfsck/testfile.17.bad': Input/output error
rm: cannot remove `/mnt/lustre/d0.lfsck/testfile.19.bad': Input/output error
rm: cannot remove `/mnt/lustre/d0.lfsck/testfile.15': Input/output error
rm: cannot remove `/mnt/lustre/d0.lfsck/testfile.14.bad': Input/output error
rm: cannot remove `/mnt/lustre/d0.lfsck/testfile.20.bad': Input/output error
rm: cannot remove `/mnt/lustre/d0.lfsck/testfile.12': Input/output error
lfsck : @@@@@@ FAIL: remove sub-test dirs failed
Trace dump:
= /home/niu/lustre/lustre-master/lustre/tests/test-framework.sh:3957:error_noexit()
= /home/niu/lustre/lustre-master/lustre/tests/test-framework.sh:3980:error()
= /home/niu/lustre/lustre-master/lustre/tests/test-framework.sh:3565:check_and_cleanup_lustre()
= lfsck.sh:291:main()
Dumping lctl log to /tmp/test_logs/1359016483/lfsck..*.1359016527.log
Dumping logs only on local client.
I'm not sure whether lfsck is still supported or used by users. If not, I think we can leave these errors as they are; otherwise, we should open separate tickets and fix lfsck itself. Andreas, any comments? Thanks.
patch landed for 2.4