Details
Improvement
Resolution: Fixed
Minor
Lustre 2.7.0, Upstream
Description
The FIEMAP ioctl is used by many tools, such as cp(1), to handle sparse files efficiently, but these tools treat any error as a failure and usually fall back to reading the whole file (which yields only zeroes if the file has no stripe).
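The fallback described above can be sketched roughly as follows. This is not cp's actual code: `probe_extents` and `must_read_whole_file` are hypothetical names chosen for illustration, and only `FS_IOC_FIEMAP`, `struct fiemap` and `FIEMAP_MAX_OFFSET` come from the Linux headers. With `fm_extent_count` set to 0, the kernel only reports how many extents exist.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/fs.h>      /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>  /* struct fiemap, FIEMAP_MAX_OFFSET */

/* Hypothetical helper: ask the filesystem how many extents back fd.
 * Returns 0 and stores the count in *count, or -errno on failure. */
static int probe_extents(int fd, unsigned int *count)
{
    struct fiemap fm;

    assert(count != NULL);
    memset(&fm, 0, sizeof(fm));
    fm.fm_start = 0;
    fm.fm_length = FIEMAP_MAX_OFFSET;  /* cover the whole file */
    fm.fm_extent_count = 0;            /* count only, no extent records */

    if (ioctl(fd, FS_IOC_FIEMAP, &fm) < 0)
        return -errno;                 /* e.g. -ENODATA in this report */

    *count = fm.fm_mapped_extents;
    return 0;
}

/* Hypothetical policy: any FIEMAP error forces a full read of the file. */
static int must_read_whole_file(int probe_rc)
{
    return probe_rc != 0;
}
```

Under this policy, a filesystem that returns an error for a file with no data costs the caller a full read of zeroes, whereas a successful reply with zero extents would let it skip the reads.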
lov_object_fiemap returns -ENODATA if it cannot get the object's lov_stripe_md. Instead, it should fill the structure with no extents and return success.
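A minimal userland model of the behaviour proposed here, not the actual Lustre change: `model_fiemap` and its `has_stripe_md` parameter are hypothetical stand-ins for lov_object_fiemap and the lov_stripe_md lookup. When there is no stripe, the reply carries zero extents and the call succeeds instead of returning -ENODATA.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <linux/fiemap.h>  /* struct fiemap */

/* Hypothetical model of the proposed behaviour.
 * has_stripe_md: non-zero if the layout lookup succeeded (assumption). */
static int model_fiemap(int has_stripe_md, struct fiemap *fm)
{
    assert(fm != NULL);

    if (!has_stripe_md) {
        /* No stripe means no data on any OST: report zero extents
         * and succeed, where the current code returns -ENODATA. */
        fm->fm_mapped_extents = 0;
        return 0;
    }

    /* Mapping striped files is out of scope for this sketch. */
    return -ENOSYS;
}
```

A caller that sees success with `fm_mapped_extents == 0` can treat the whole file as a hole rather than falling back to reading it.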
We have many files with no stripe here because our copytool destripes released files, so that they are assigned new OSTs on restore (instead of being restored to the OSTs where they were originally created, which might leave the OSTs poorly balanced). This mostly affects admins who copy released data around very carefully, so it is fairly minor for us, but it is still a bug imo.
FWIW we also have actual sparse files with data (e.g. VM images) on Lustre with non-trivial striping. For those, fiemap returns ENOTSUPP when called without FIEMAP_FLAG_DEVICE_ORDER, so cp just reads everything. I think we could support this better as well, but that's another issue.
It's actually not so easy to create a file without a stripe for testing, so I attached a simple program that does so.
$ gcc -o create-nostripe create-nostripe.c
$ ./create-nostripe /mnt/lustre/testfile
$ strace -e ioctl,read cp -a /mnt/lustre/testfile /tmp |& head -n 20
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\300j\0\0\0\0\0\0"..., 832) = 832
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\200\37\0\0\0\0\0\0"..., 832) = 832
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\320\23\0\0\0\0\0\0"..., 832) = 832
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0@\34\2\0\0\0\0\0"..., 832) = 832
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\360\25\0\0\0\0\0\0"..., 832) = 832
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0`\16\0\0\0\0\0\0"..., 832) = 832
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240l\0\0\0\0\0\0"..., 832) = 832
read(3, "nodev\tsysfs\nnodev\trootfs\nnodev\tr"..., 1024) = 360
read(3, "", 1024) = 0
ioctl(3, FS_IOC_FIEMAP, 0x7ffd374e88e0) = -1 ENODATA (No data available)
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65536) = 65536
....
I will submit a patch for this right away.