[LU-6460] LLIF_FILE_RESTORING is not cleared at end of restore Created: 13/Apr/15 Updated: 31/Jul/16 Resolved: 31/Jul/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | Lustre 2.8.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | John Hammond | Assignee: | Bruno Faccini (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | hsm, medium |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
The LLIF_FILE_RESTORING flag is not cleared until an IO is performed on the inode. This may cause stale file attributes to be cached if stat() is called on the file during restore. To reproduce:
export MOUNT_2=y
llmount.sh
lctl conf_param lustre-MDT0000.mdt.hsm_control=enabled
mkdir -p /mnt/lustre-hsm
mount $HOSTNAME@tcp:/lustre /mnt/lustre-hsm -t lustre -o user_xattr,flock
mkdir -p /tmp/arc1
lhsmtool_posix -vvvv --hsm_root=/tmp/arc1 --daemon /mnt/lustre-hsm 2> /tmp/hsm.log
echo XXX > /mnt/lustre/f0
lfs hsm_archive /mnt/lustre/f0
sleep 1
lfs hsm_release /mnt/lustre/f0
killall lhsmtool_posix
cat /mnt/lustre/f0 &
sleep 1
stat /mnt/lustre2/f0
lhsmtool_posix -vvvv --hsm_root=/tmp/arc1 --daemon /mnt/lustre-hsm 2> /tmp/hsm.log
wait
dd if=/dev/zero of=/mnt/lustre/f0 count=1
stat /mnt/lustre/f0
stat /mnt/lustre2/f0
sleep 60
stat /mnt/lustre/f0
stat /mnt/lustre2/f0
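The symptom the reproducer exposes is that the two client mounts report different sizes for the same file. A minimal sketch of an automated check follows. The function name `sizes_match` is hypothetical and not part of Lustre or its test suite, and the sketch assumes GNU `stat` with the `-c` format option.

```shell
#!/bin/bash
# Hypothetical helper (not part of Lustre): compare the size that two
# paths report for what should be the same file.
# Assumes GNU stat; %s prints the file size in bytes.
# Returns 0 if the sizes match, 1 if they differ, 2 if stat fails.
sizes_match() {
    local a b
    a=$(stat -c '%s' "$1") || return 2
    b=$(stat -c '%s' "$2") || return 2
    if [ "$a" = "$b" ]; then
        return 0
    fi
    echo "size mismatch: $1 reports $a, $2 reports $b" >&2
    return 1
}

# Example with the paths from the reproducer (assumes both mounts exist):
#   sizes_match /mnt/lustre/f0 /mnt/lustre2/f0 || echo "stale attributes cached"
```

Run after the `dd` step, a check like this would be expected to fail on an affected client, since one mount reports 512 bytes and the other still reports 4.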
Output ...
t:~# lhsmtool_posix -vvvv --hsm_root=/tmp/arc1 --daemon /mnt/lustre-hsm 2> /tmp/hsm.log
t:~# echo XXX > /mnt/lustre/f0
t:~# lfs hsm_archive /mnt/lustre/f0
t:~# sleep 1
t:~# lfs hsm_release /mnt/lustre/f0
t:~# killall lhsmtool_posix
t:~# cat /mnt/lustre/f0 &
[1] 10620
t:~# sleep 1
t:~# stat /mnt/lustre2/f0
  File: `/mnt/lustre2/f0'
  Size: 4 Blocks: 1 IO Block: 4194304 regular file
Device: 2c54f966h/743766374d Inode: 144115205255725063 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2015-04-13 16:58:17.000000000 -0500
Modify: 2015-04-13 16:58:17.000000000 -0500
Change: 2015-04-13 16:58:17.000000000 -0500
t:~# lhsmtool_posix -vvvv --hsm_root=/tmp/arc1 --daemon /mnt/lustre-hsm 2> /tmp/hsm.log
t:~# wait
XXX
[1]+ Done cat /mnt/lustre/f0
t:~# dd if=/dev/zero of=/mnt/lustre/f0 count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.000714119 s, 717 kB/s
t:~# stat /mnt/lustre/f0
  File: `/mnt/lustre/f0'
  Size: 512 Blocks: 8 IO Block: 4194304 regular file
Device: 2c54f966h/743766374d Inode: 144115205255725063 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2015-04-13 16:58:17.000000000 -0500
Modify: 2015-04-13 16:58:28.000000000 -0500
Change: 2015-04-13 16:58:28.000000000 -0500
t:~# stat /mnt/lustre2/f0
  File: `/mnt/lustre2/f0'
  Size: 4 Blocks: 1 IO Block: 4194304 regular file
Device: 2c54f966h/743766374d Inode: 144115205255725063 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2015-04-13 16:58:17.000000000 -0500
Modify: 2015-04-13 16:58:28.000000000 -0500
Change: 2015-04-13 16:58:28.000000000 -0500
t:~# sleep 60
t:~# stat /mnt/lustre/f0
  File: `/mnt/lustre/f0'
  Size: 512 Blocks: 8 IO Block: 4194304 regular file
Device: 2c54f966h/743766374d Inode: 144115205255725063 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2015-04-13 16:58:17.000000000 -0500
Modify: 2015-04-13 16:58:28.000000000 -0500
Change: 2015-04-13 16:58:28.000000000 -0500
t:~# stat /mnt/lustre2/f0
  File: `/mnt/lustre2/f0'
  Size: 4 Blocks: 1 IO Block: 4194304 regular file
Device: 2c54f966h/743766374d Inode: 144115205255725063 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2015-04-13 16:58:17.000000000 -0500
Modify: 2015-04-13 16:58:28.000000000 -0500
Change: 2015-04-13 16:58:28.000000000 -0500 |
| Comments |
| Comment by Peter Jones [ 21/Apr/15 ] |
|
Bruno, could you please look into this one? Peter |
| Comment by Andreas Dilger [ 21/Apr/15 ] |
|
John, what is the impact of this bug? What does userspace do with LLIF_FILE_RESTORING? |
| Comment by John Hammond [ 21/Apr/15 ] |
|
> John, what is the impact of this bug? What does userspace do with LLIF_FILE_RESTORING?
The impact is that the wrong size and attributes may be reported by stat. A reproducer is shown in the description. Userspace cannot see LLIF_FILE_RESTORING. llite uses this flag to determine if the attributes from the MDT are sufficient for stat(). |
| Comment by Gerrit Updater [ 27/Apr/15 ] |
|
Faccini Bruno (bruno.faccini@intel.com) uploaded a new patch: http://review.whamcloud.com/14609 |
| Comment by Gerrit Updater [ 31/Jul/15 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/14609/ |
| Comment by Peter Jones [ 31/Jul/15 ] |
|
Landed for 2.8 |