Details
Bug
Resolution: Fixed
Minor
Lustre 2.7.0
None
3
Description
My understanding is that lfsck in lustre-2.7 should be able to handle lost file information on the MDT, as long as the objects are still on the OSTs. However, a simple test to simulate this is not recovering the files. Shouldn't it at least be able to put them into lost+found? Or am I misunderstanding the capabilities of lfsck? Or is the following test case invalid in some way?
On the client, just create some test files...
# cd /mnt/lustre/client/lfscktest
# echo foo > foo
# mkdir bar
# echo baz > bar/baz
# lfs getstripe foo bar/baz
foo
lmm_stripe_count: 1
lmm_stripe_size: 1048576
lmm_pattern: 1
lmm_layout_gen: 0
lmm_stripe_offset: 9
    obdidx    objid     objid      group
    9         460962    0x708a2    0
bar/baz
lmm_stripe_count: 1
lmm_stripe_size: 1048576
lmm_pattern: 1
lmm_layout_gen: 0
lmm_stripe_offset: 12
    obdidx    objid     objid      group
    12        460866    0x70842    0
# sync
On the MDS, simulate the MDT losing the information, such as could happen through restoring from a slightly outdated MDT backup...
# umount /mnt/lustre/nbptest-mdt
# mount -t ldiskfs /dev/mapper/nbptest--vg-mdttest /mnt/lustre/nbptest-mdt
# cd /mnt/lustre/nbptest-mdt/ROOT
# ls -ld lfscktest lfscktest/*
drwxr-xr-x+ 3 root root 4096 May 30 08:15 lfscktest
drwxr-xr-x+ 2 root root 4096 May 30 08:15 lfscktest/bar
-rw-r--r-- 1 root root 0 May 30 08:14 lfscktest/foo
# rm -rf lfscktest/*
# cd
# umount /mnt/lustre/nbptest-mdt
# mount -t lustre /dev/mapper/nbptest--vg-mdttest /mnt/lustre/nbptest-mdt
Now check the filesystem...
# lctl clear
# lctl debug_daemon start /var/log/lfsck.debug
# lctl lfsck_start -A -M nbptest-MDT0000 -c on -C on -o
Started LFSCK on the device nbptest-MDT0000: scrub layout namespace
# lctl get_param -n osd-ldiskfs.*.oi_scrub | grep status
status: init
status: completed
# lctl debug_daemon stop
# lctl debug_file /var/log/lfsck.debug | egrep -v " (NRS|RPC) " > /var/log/lfsck.log
And look back on the client...
# cd /mnt/lustre/client/
# ls -la lfscktest/
total 8
drwxr-xr-x+ 2 root root 4096 May 30 08:22 .
drwxr-xr-x+ 9 root root 4096 May 30 08:14 ..
# ls -la .lustre/lost+found/MDT0000
total 8
drwx------+ 3 root root 4096 May 27 10:44 .
dr-x------+ 3 root root 4096 May 27 09:01 ..
Notice that there is no sign of the files being restored anywhere, neither in the original directory nor in lost+found. Nor do I find any mention of the object IDs (460962/0x708a2 and 460866/0x70842) in the lfsck.log file.
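For anyone repeating the log check, here is a rough sketch of how the search for the object IDs could be scripted. The helper names objid_pattern and search_objids are made up for illustration. It assumes an ID would show up in the debug log either in decimal or in 0x-prefixed hex, as printed by lfs getstripe; that is a guess about the log format, not something confirmed here.

```shell
#!/bin/sh
# Hypothetical helper: build a grep -E pattern that matches an object ID
# either in decimal or in 0x-prefixed hex, without matching when the ID
# is only part of a longer number.
objid_pattern() {
    printf '(^|[^0-9a-fx])(%d|0x%x)([^0-9a-f]|$)' "$1" "$1"
}

# Hypothetical helper: report, for each ID given, whether the log file
# mentions it. Assumes IDs appear in decimal or 0x-hex form in the log.
search_objids() {
    log=$1
    shift
    for id in "$@"; do
        if grep -qiE "$(objid_pattern "$id")" "$log"; then
            echo "$id: found"
        else
            echo "$id: not found"
        fi
    done
}
```

With the two objects from the test above, this would be called as search_objids /var/log/lfsck.log 460962 460866.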
Note that running lfsck_start with the "-t layout" option did not change the behaviour either.
Can be closed. Add nasa label.