Details
Type: Improvement
Resolution: Fixed
Priority: Minor
Affects Version/s: Lustre 2.7.0, Upstream
Severity: 3
Description
The FIEMAP ioctl is used by many tools like cp(1) to optimize the handling of sparse files, but they treat any error as failure and usually fall back to reading the whole file (so, all zeroes if there is no stripe).
lov_object_fiemap returns -ENODATA if it cannot get the object's lov_stripe_md. We should just fill the structure with no extents and return success instead.
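To illustrate why the return code matters to callers, here is a minimal userspace sketch of how a copy tool might use FIEMAP. It is not the actual cp(1) or Lustre code; the names `copy_plan`, `fiemap_count_extents` and `choose_copy_plan` are made up for this example. It assumes the caller only needs the extent count, so it passes `fm_extent_count = 0`, in which case the kernel only fills in `fm_mapped_extents`.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/fs.h>      /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>  /* struct fiemap, FIEMAP_FLAG_SYNC */

/* Hypothetical names for this sketch; not taken from cp or Lustre. */
enum copy_plan {
    PLAN_READ_ALL,   /* FIEMAP failed: read every byte, zeroes included */
    PLAN_ALL_HOLE,   /* FIEMAP ok, no extents: only set the target size */
    PLAN_BY_EXTENT   /* FIEMAP ok, some extents: copy only mapped ranges */
};

/*
 * Ask the kernel how many extents are mapped, without fetching them.
 * Returns 0 on success and stores the count in *mapped,
 * or returns -errno if the ioctl fails.
 */
int fiemap_count_extents(int fd, unsigned int *mapped)
{
    struct fiemap fm;

    assert(mapped != NULL);

    memset(&fm, 0, sizeof(fm));
    fm.fm_start = 0;
    fm.fm_length = FIEMAP_MAX_OFFSET;  /* whole file */
    fm.fm_flags = FIEMAP_FLAG_SYNC;    /* flush dirty data first */
    fm.fm_extent_count = 0;            /* count only, no extent array */

    if (ioctl(fd, FS_IOC_FIEMAP, &fm) < 0)
        return -errno;

    *mapped = fm.fm_mapped_extents;
    return 0;
}

/* Decide how to copy from the FIEMAP result (rc is 0 or -errno). */
enum copy_plan choose_copy_plan(int rc, unsigned int mapped)
{
    if (rc < 0)
        return PLAN_READ_ALL;
    return mapped == 0 ? PLAN_ALL_HOLE : PLAN_BY_EXTENT;
}
```

With the current behaviour, a file without stripe yields `rc == -ENODATA` and therefore the read-everything path. If the ioctl succeeded with zero mapped extents instead, a caller written along these lines would take the hole-only path and skip the reads.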
We have many files with no stripe here because our copytool destripes released files, so that they get assigned new OSTs on restore (instead of being restored where they were originally created, which might leave the OSTs badly balanced). This mostly only impacts admins very carefully copying released data around, so it is somewhat minor for us, but still a bug IMO.
FWIW we also have actual sparse files with data (e.g. VM images) on Lustre with non-trivial striping, and there fiemap returns ENOTSUPP when called without FIEMAP_FLAG_DEVICE_ORDER, so cp just reads everything. I think we could support this better as well, but that's another issue.
It's actually not so easy to create a file without stripe for testing, so I attached a simple program that does.
$ gcc -o create-nostripe create-nostripe.c
$ ./create-nostripe /mnt/lustre/testfile
$ strace -e ioctl,read cp -a /mnt/lustre/testfile /tmp |& head -n 20
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\300j\0\0\0\0\0\0"..., 832) = 832
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\200\37\0\0\0\0\0\0"..., 832) = 832
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\320\23\0\0\0\0\0\0"..., 832) = 832
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0@\34\2\0\0\0\0\0"..., 832) = 832
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\360\25\0\0\0\0\0\0"..., 832) = 832
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0`\16\0\0\0\0\0\0"..., 832) = 832
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240l\0\0\0\0\0\0"..., 832) = 832
read(3, "nodev\tsysfs\nnodev\trootfs\nnodev\tr"..., 1024) = 360
read(3, "", 1024) = 0
ioctl(3, FS_IOC_FIEMAP, 0x7ffd374e88e0) = -1 ENODATA (No data available)
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
....
I will submit a patch for this right away.
Attachments
Activity
Reporter | Original: Dominique Martinet [ martinetd ] | New: CEA [ cealustre ] |
Fix Version/s | New: Lustre 2.11.0 [ 13091 ] | |
Resolution | New: Fixed [ 1 ] | |
Status | Original: Open [ 1 ] | New: Resolved [ 5 ] |
Comment |
[ Hi Dominique, I cannot reproduce this on master:
{noformat}
m:lustre# mcreate f0
m:lustre# truncate --size=10G f0
m:lustre# stat f0
  File: ‘f0’
  Size: 10737418240 Blocks: 0 IO Block: 4194304 regular file
Device: 2c54f966h/743766374d Inode: 144115205272502277 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2017-12-21 09:06:29.000000000 -0600
Modify: 2017-12-21 09:06:37.000000000 -0600
Change: 2017-12-21 09:06:37.000000000 -0600
 Birth: -
m:lustre# strace -e open,close,ioctl,read,write,ftruncate cp -a f0 /tmp/f0
...
open("f0", O_RDONLY|O_NOFOLLOW) = 3
open("/tmp/f0", O_WRONLY|O_TRUNC) = 4
ioctl(3, FS_IOC_FIEMAP, 0x7ffd0ca79c20) = 0
ftruncate(4, 10737418240) = 0
...
close(4) = 0
close(3) = 0
...
+++ exited with 0 +++
{noformat}
(In the strace output above I have elided the syscalls for etc and lib files.) ] |
Issue Type | Original: Bug [ 1 ] | New: Improvement [ 4 ] |
Assignee | Original: WC Triage [ wc-triage ] | New: Dominique Martinet [ martinetd ] |
Labels | Original: cea | New: cea patch |