[LU-16808] lfs find --printf fails on FIFOs and special files Created: 09/May/23 Updated: 20/Jun/23 Resolved: 20/Jun/23 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.16.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Thomas Bertschinger | Assignee: | Zhenyu Xu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||||||||||
| Severity: | 3 | ||||||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||||||
| Description |
$ mkfifo myfifo $ lfs find myfifo myfifo $ lfs find myfifo --printf "%p\n" ** command hangs ** ^C $ sudo mknod mychardev c 1 5 $ lfs find mychardev mychardev $ lfs find mychardev --printf "%p\n" lfs: failed for 'mychardev': Inappropriate ioctl for device |
| Comments |
| Comment by Patrick Farrell [ 09/May/23 ] |
|
So it shouldn't hang on the fifo obviously, but can you explain the behavior you'd expect on these special files? I mean, what does lfs find do in general if called on a regular file rather than a directory? |
| Comment by Thomas Bertschinger [ 09/May/23 ] |
|
I suppose I'd expect the printf to print out the desired information without error (as long as there are no Lustre-specific format directives that don't apply to special files). I think this would mainly be relevant if you pass `lfs find` a directory and there are special files under that directory tree. I guess it's probably not common to put special files on Lustre so I'll admit this likely wouldn't be seen on real systems. I would expect this to work without error: $ cd /mnt/lustre $ touch a $ mknod mychardev c 1 5 $ lfs find /mnt/lustre --printf "%p\n" /mnt/lustre /mnt/lustre/a lfs: failed for '/mnt/lustre': Inappropriate ioctl for device But this is probably fine failing (unless maybe the %Li should just be empty or have a placeholder value in the line for the special file?): $ $ lfs find /mnt/lustre --printf "%Li %p\n" 0 /mnt/lustre 1 /mnt/lustre/a lfs: failed for '/mnt/lustre': Inappropriate ioctl for device Also, I think that error message should include the filename that failed because if you're running find on a big tree it won't necessarily be obvious what failed. |
| Comment by Patrick Farrell [ 09/May/23 ] |
|
"I guess it's probably not common to put special files on Lustre so I'll admit this likely wouldn't be seen on real systems." FWIW, after reading your response, I think in general we should probably not return inappropriate ioctl (and obviously not for the requests which have a clear correct answer) - we should return whatever makes the most sense. imo, it's confusing to get an error in the middle of an fs scan type operation. Glancing in to the code, it looks like our ll_special_inode_operations (our inode operations for all file structs which are not directories, regular files, or links) does not provide the .unlocked_ioctl function, so Lustre has no support for Lustre specific ioctls on special files. (I'm just going to use 'special files' to mean anything that's not a directory, regular file, or symlink.) I think you have opened a bit of a can of worms here, unfortunately, since now you may have to consider which Lustre specific ioctls (and there are a lot of them) are meaningful on special files and give them implementations. (Possibly it might make sense to share the ll_file_ioctl function as used for regular files, but you'd still have to walk through all the ioctls and add various checks.) Oh and we don't provide unlocked_ioctl for symlinks either! That's interesting; I wonder how that works in practice and how others/VFS handles this. Like I said, a little bit of a can of worms. It's probably possible to figure out what ioctls 'find' is using and just make sure those have handling for special files (and possibly symlinks too, as appropriate?), and ignore the rest of the Lustre specific ioctls. That wouldn't be a regression from current behavior, where no Lustre specific ioctls work on special files at all (afaics). |
| Comment by Patrick Farrell [ 09/May/23 ] |
|
Oh, and absolutely about the error message including the file name, that's obviously right. That could be a quick separate patch if you wanted to do it. |
| Comment by Patrick Farrell [ 09/May/23 ] |
|
Also, I didn't look at why find hangs on fifos, that's probably the most problematic issue here. |
| Comment by Andreas Dilger [ 09/May/23 ] |
|
The FIFO hang is probably because the open() immediately gets blocked on waiting for some data to be available on the pipe? The strace shows: write(1, "/mnt/testfs/newdir\n", 19/mnt/testfs/newdir
) = 19
close(3) = 0
fstat(4, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
fcntl(4, F_GETFL) = 0x28800 (flags O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_NOFOLLOW)
fcntl(4, F_SETFD, FD_CLOEXEC) = 0
getdents(4, /* 3 entries */, 32768) = 72
ioctl(4, _IOC(_IOC_READ|_IOC_WRITE, 0x69, 0x16, 0x140), 0x16f3040) = 0
ioctl(4, _IOC(_IOC_READ, 0x66, 0xaf, 0x4), 0x7ffd455d7480) = 0
open("/mnt/testfs/newdir/char", O_RDONLY) = 3
ioctl(3, FS_IOC_FSGETXATTR, 0x7ffd455d70b0) = -1 ENOTTY (Inappropriate ioctl for device)
close(3) = 0
ioctl(4, _IOC(_IOC_READ|_IOC_WRITE, 0x69, 0x16, 0x140), 0x1f84040) = 0
ioctl(4, _IOC(_IOC_READ, 0x66, 0xaf, 0x4), 0x7ffc5a629f30) = 0
open("/mnt/testfs/newdir/fifo", O_RDONLY
I suspect that the "semantic traverse" can already use the file type in the directory entry to skip trying to process such files. In fact, looking at the strace output for regular file processing it seems like it is doing far too much for most files already: 1683657120.708615 getdents(4, /* 102 entries */, 32768) = 6448 <0.000784>
1683657120.709513 ioctl(4, _IOC(_IOC_READ|_IOC_WRITE, 0x69, 0x16, 0x140), 0xea2040) = 0 <0.002730>
1683657120.712309 open("/mnt/testfs/dir/filename_is_long-15", O_RDONLY) = 3 <0.003517>
1683657120.715904 ioctl(3, _IOC(_IOC_READ, 0x66, 0xaf, 0x4), 0x7ffc0a9b3ba0) = 0 <0.000140>
1683657120.716106 ioctl(3, FS_IOC_FSGETXATTR, 0x7ffc0a9b37d0) = 0 <0.000044>
1683657120.716243 newfstatat(4, "filename_is_long-15", {st_mode=S_IFREG|0644, st_size=0, ...}, AT_SYMLINK_NOFOLLOW) = 0 <0.000683>
1683657120.717022 write(1, "/mnt/testfs/dir/filename_is_long-15", 36) = 36 <0.000296> /mnt/testfs/dir/filename_is_long-15
1683657120.717386 close(3) = 0 <0.002073>
The ioctl(4, _IOC(_IOC_READ|_IOC_WRITE, 0x69, 0x16, 0x140) = IOC_MDC_GETFILEINFO_V2) is expected for every file, since it gets the file layout and MDT attributes from the parent directory but we definitely should not be doing open() and ioctl(FS_IOC_FSGETXATTR) and newfstatat() on every file when we are only printing the pathname. That is a lot of overhead (more than double the original ioctl(IOC_MDC_GETFILEINFO_V2) call. It looks like this was added as part of the -printf functionality, since running without -printf "%p" shows none of that overhead: 1683657295.265889 getdents(4, /* 102 entries */, 32768) = 6448 <0.000393> 1683657295.266362 write(1, "/mnt/testfs/dir/filename_is_long-15"..., 36) = 36 <0.000009> /mnt/testfs/dir/filename_is_long-15 1683657295.266446 write(1, "/mnt/testfs/dir/filename_is_long-44"..., 36) = 36 <0.000009> /mnt/testfs/dir/filename_is_long-44 I guess that is already described in LU-15837 under "selective fetching of metadata attributes. In particular, fetching the projid for a file requires an extra syscall that is unnecessary if the projid is not being printed" but it isn't just one syscall but multiple. |
| Comment by Gerrit Updater [ 10/May/23 ] |
|
"Zhenyu Xu <bobijam@hotmail.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50905 |
| Comment by Gerrit Updater [ 10/May/23 ] |
|
"Zhenyu Xu <bobijam@hotmail.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50916 |
| Comment by Thomas Bertschinger [ 16/May/23 ] |
|
I noticed another (potential) issue with symlinks and the project ID. For most attributes it appears lfs find uses data from the symlink itself, but for project ID, lfs find seems to use the value from the target: $ touch file20 && lfs project -p 20 file20 $ ln -s file20 link21 && lfs project -p 21 link21 $ lfs project . 20 - ./file20 21 - ./link21 # including %s in the format string to demonstrate that lfs find uses values from the link itself (and not the target) for attributes other than project id $ lfs find . --printf "%p %s %LP\n" . 4096 0 ./file20 0 20 ./link21 6 20 Would we expect that lfs find would use the value from the link in this case? Interestingly, I ran a quick test on ext4 and found that (at least in my setup) projid on symlinks did not work:
ext4-mount $ touch file20 && chattr -p 20 file20
ext4-mount $ ln -s file20 link21
ext4-mount $ chattr -p 21 link21
chattr: Operation not supported while reading flags on link21
ext4-mount $ lsattr -p .
0 --------------e------- ./lost+found
lsattr: Operation not supported While reading flags on ./link21
20 --------------e------- ./file20
I am not sure what the "expected" behavior of projid and symlinks is so I don't know if the lfs find behavior above is problematic. @adilger any thoughts on this? If the current behavior is wrong should I submit a ticket for this? |
| Comment by Andreas Dilger [ 16/May/23 ] |
|
Getting the projid on the symlink target is intentional behavior ( |
| Comment by Gerrit Updater [ 20/Jun/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50905/ |
| Comment by Gerrit Updater [ 20/Jun/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50916/ |
| Comment by Peter Jones [ 20/Jun/23 ] |
|
Landed for 2.16 |