[LU-12332] Add a liblustreapi call for IOC_MDC_GETFILEINFO Created: 23/May/19  Updated: 28/Jul/20  Resolved: 28/Jul/20

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.13.0, Lustre 2.14.0, Lustre 2.12.4

Type: Improvement Priority: Minor
Reporter: Aurelien Degremont (Inactive) Assignee: WC Triage
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-11367 integrate LSOM with lfs find Resolved
is related to LU-10934 integrate statx() API with Lustre Resolved
is related to LU-11695 disabling the xattr cache on client f... Resolved

Description

IOC_MDC_GETFILEINFO is a very convenient ioctl for retrieving metadata (stat + lov EA v1) from the MDT. This avoids going to the OSTs if you don't need the information stored there (like file size).

This is wrapped by:

 int get_lmd_info_fd(char *path, int parent_fd, int dir_fd,
                     void *lmdbuf, int lmdlen, enum get_lmd_info_type type)

but this function is not exported, and it would be nice to also add an llapi prefix.



Comments
Comment by Andreas Dilger [ 24/May/19 ]

Aurelien, thanks for filing this ticket, it is something that I've been meaning to do as well.

The patch https://review.whamcloud.com/33545 adds functionality to IOC_MDC_GETFILEINFO so that it can optionally return the Lazy Size-on-MDT data from the initial MDS RPC, so that tools such as "lfs find" and other scanners can use this as the approximate file size, instead of having to do an extra stat() on the file each time to get the size/blocks.

As such, I'd like to plumb in the llapi_* interface for this command to include the "__u64 lmd_flags" as an argument, and separate out the pointer to "struct stat" from "struct lov_user_md" so that it is isolated from the underlying ioctl interface a bit. That allows the internals to call either IOC_MDC_GETFILEINFO or IOC_MDC_GETFILEINFO_OLD depending on which command the kernel supports without having to expose this to userspace. As such, I don't think that just renaming get_lmd_info_fd() is the right way to go.
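
The "call either command depending on what the kernel supports" idea could be sketched roughly as below. This is only an illustration: the command numbers, the ioctl_fn callback, getfileinfo_compat() and the old_kernel_ioctl() stub are all hypothetical names, not Lustre's, and it assumes an unsupported command is reported as ENOTTY.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical command numbers, NOT the real Lustre ioctl values. */
enum { CMD_GETFILEINFO = 1, CMD_GETFILEINFO_OLD = 2 };

/* Signature of the low-level call; a real caller would wrap ioctl(2). */
typedef int (*ioctl_fn)(int fd, int cmd, void *buf);

/*
 * Try the newer command first; if it is rejected with ENOTTY (assumed
 * here to mean "unknown ioctl"), retry with the old command.
 * *used_cmd reports which command succeeded.
 */
int getfileinfo_compat(ioctl_fn do_ioctl, int fd, void *buf, int *used_cmd)
{
    int rc = do_ioctl(fd, CMD_GETFILEINFO, buf);

    if (rc == 0) {
        *used_cmd = CMD_GETFILEINFO;
        return 0;
    }
    if (errno != ENOTTY)
        return -1;

    rc = do_ioctl(fd, CMD_GETFILEINFO_OLD, buf);
    if (rc == 0)
        *used_cmd = CMD_GETFILEINFO_OLD;
    return rc;
}

/* Stub standing in for an older kernel that only knows the old command. */
int old_kernel_ioctl(int fd, int cmd, void *buf)
{
    (void)fd;
    (void)buf;
    if (cmd == CMD_GETFILEINFO_OLD)
        return 0;
    errno = ENOTTY;
    return -1;
}
```

With a wrapper of this shape, the caller never sees which command was used unless it asks, which is the isolation from the ioctl interface described above.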

Comment by Aurelien Degremont (Inactive) [ 24/May/19 ]

Actually, I was also thinking about adding a call that is not exactly get_lmd_info_fd(), and taking this opportunity to improve it a little bit. Your proposal makes sense.

Enabling this call to return more information in one RPC is interesting. Tools like Robinhood can benefit from a call that means "give me everything in 1 RPC", like stats, layout, ...

So, whatever this call can return with 1 RPC, the more, the better.

Comment by Nathan Rutman [ 29/Aug/19 ]

This ticket seems to be addressed by LU-11367 patch
https://review.whamcloud.com/#/c/35167/

int llapi_get_lum_file_fd(int dir_fd, const char *fname, __u64 *valid,
                          lstatx_t *statx, struct lov_user_md *lum,
                          size_t lumsize);

Comment by Nathan Rutman [ 29/Aug/19 ]

I don't want to hold up the landing of LU-11367, but it's not quite ideal yet.

llapi_get_lum_file_fd requires an open parent directory FD to send the ioctl to. It looks like this has to be the direct parent given the strrchr call in get_lmd_info_fd, rather than a true "path relative to the parent fd", which might allow us to issue the ioctl on the FS root in all cases, and avoid opening the parent dirs.
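
To illustrate the direct-parent constraint, here is a minimal sketch of splitting a path at its last '/' with strrchr. It is not the actual get_lmd_info_fd code; split_parent() is a hypothetical helper written for this comment.

```c
#include <stdio.h>
#include <string.h>

/*
 * Split "path" at its last '/' into a parent directory and a file name.
 * Hypothetical helper for illustration; returns 0 on success, -1 if an
 * output buffer is too small.
 */
int split_parent(const char *path, char *parent, size_t plen,
                 char *name, size_t nlen)
{
    const char *slash = strrchr(path, '/');
    const char *base = slash ? slash + 1 : path;
    size_t dirlen;

    if (strlen(base) >= nlen)
        return -1;
    strcpy(name, base);

    if (slash == NULL) {
        /* No directory component: the parent is the current directory. */
        return snprintf(parent, plen, ".") < (int)plen ? 0 : -1;
    }
    /* Keep a lone leading '/' so that "/file" yields parent "/". */
    dirlen = (slash == path) ? 1 : (size_t)(slash - path);
    if (dirlen >= plen)
        return -1;
    memcpy(parent, path, dirlen);
    parent[dirlen] = '\0';
    return 0;
}
```

Because only the last component is passed as the name, the fd the ioctl is sent to has to be the directory that directly contains the file, so each distinct parent directory needs its own open.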

Since both stat() and getfattr() can be issued against the path with no opens, it seems like the info should be obtainable with even fewer RPCs. For example, llapi_get_lum_file could, instead of opening the parent dir and calling get_lum_file_fd, just call stat and getfattr lustre.lov, and avoid the open. Of course, stat and getfattr are two calls - but in reality we know that the stat() has already retrieved the layout info (because it needs to talk to the OSTs for size). So this is my long-winded way of asking: why can't we get all the info we want for the cost of a single stat()?
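
The stat-plus-getfattr alternative could look roughly like the sketch below, using the standard stat(2) and getxattr(2) calls against the path. stat_and_layout() is a hypothetical name, the layout is returned as raw bytes without interpretation, and on a non-Lustre filesystem the xattr lookup is expected to fail.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/xattr.h>

/*
 * Fetch inode attributes and the raw layout xattr by path, without
 * opening the file or its parent directory.
 * Returns the layout size in bytes, or -1 with errno set.
 */
ssize_t stat_and_layout(const char *path, struct stat *st,
                        void *lov, size_t lovlen)
{
    if (stat(path, st) < 0)
        return -1;
    /* "lustre.lov" is the xattr name mentioned above. */
    return getxattr(path, "lustre.lov", lov, lovlen);
}
```

This is still two system calls, which is the point of the question that follows.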

Comment by Aurelien Degremont (Inactive) [ 30/Aug/19 ]

> why can't we get all the info we want for the cost of a single stat()?

 Because stat() will also send RPCs to OSTs and this is what we want to avoid, no?

Comment by Nathan Rutman [ 30/Aug/19 ]

> Because stat() will also send RPCs to OSTs and this is what we want to avoid, no?
Perhaps better phrased as "the cost of a single RPC". My point is that a stat() call gets all the metadata that the MDS holds, both the MD in the inode and the layout, for the cost of a single ldlm_ibits_enqueue call to the MDS. At that point, all the info (except size) is available on the client, without an "open".

If we filled in the stat size with Lazy SOM data, and interrupted the stat on the client before it tried to get the size from the OSTs, we would have everything that llapi_get_lum_file_fd has, with a single RPC.

Having said all that:
>RPCs to OSTs and this is what we want to avoid
Interestingly: no. That has always been the assumption, that "stats are slow because they have to go to OSTs". But that doesn't actually seem to be the case. We can get very high stat rates, even going to OSTs. The problem really seems to be RPCs to the MDS, and reducing the MDS RPC count is actually the win. In particular, directory open on MDS is slow, maybe because of locks, so it's likely (although I haven't tested) that the ioctl is slower than the stat, even including OST RPCs.

Comment by Andreas Dilger [ 28/Jul/20 ]

The new llapi was added as part of LU-11367 (in 2.13.0 and 2.12.4).

The ability to selectively get file attributes (without doing an OST RPC) is added with the statx() API in LU-10934 in 2.14.
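
As a rough sketch of what selective attribute retrieval looks like from userspace, the glibc statx(2) wrapper can be given a request mask that leaves out size and blocks. stat_mdt_only() is a hypothetical helper name; whether the client actually skips the OST RPC for such a mask depends on the LU-10934 support being present.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* for statx() and AT_STATX_DONT_SYNC in glibc */
#endif
#include <fcntl.h>
#include <sys/stat.h>

/*
 * Ask only for attributes kept in the MDT inode (type/mode, owner,
 * link count), leaving STATX_SIZE and STATX_BLOCKS out of the mask.
 * Returns 0 on success, -1 with errno set.
 */
int stat_mdt_only(const char *path, struct statx *stx)
{
    unsigned int mask = STATX_TYPE | STATX_MODE | STATX_UID |
                        STATX_GID | STATX_NLINK;

    return statx(AT_FDCWD, path, AT_STATX_DONT_SYNC, mask, stx);
}
```

Callers should check stx_mask in the result before trusting any field, since the kernel reports there which attributes it actually filled in.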
