[LU-3840] llapi_layout API design discussion - Whamcloud Community JIRA

Details

Type: Bug
Resolution: Duplicate
Priority: Major
Fix Version/s: None
Affects Version/s: None
Labels: None
Severity: 3
Rank (Obsolete): 9942

Description

Extensions to liblustreapi have been posted for review here.

http://review.whamcloud.com/5302

The goal of the API extensions is to provide a user-friendly interface for interacting with the layout of files in Lustre filesystems, and to hide the wire-protocol details behind an opaque data type.

This issue will provide a forum for potential users and developers to discuss and critique the design of the proposed API extensions.

Attachments

llapi_layout_alloc.txt (2 kB, 04/Aug/14 6:22 PM)
llapi_layout_file_create.txt (2 kB, 04/Aug/14 6:22 PM)
llapi_layout_get_by_fd.txt (4 kB, 04/Aug/14 6:24 PM)
llapi_layout_ost_index_get.txt (2 kB, 04/Aug/14 6:22 PM)
llapi_layout_pattern_get.txt (1 kB, 04/Aug/14 6:22 PM)
llapi_layout_pool_name_get.txt (2 kB, 04/Aug/14 6:22 PM)
llapi_layout_stripe_count_get.txt (1 kB, 04/Aug/14 6:22 PM)
llapi_layout_stripe_size_get.txt (1 kB, 04/Aug/14 6:22 PM)
llapi_layout.txt (4 kB, 04/Aug/14 6:22 PM)

Issue Links

is related to: LU-4665 utils: lfs setstripe to specify OSTs (Resolved)

is related to: LU-2182 Add llapi_file_get_layout() function in liblustreapi (Closed)

is related to: LU-3480 Layout Enhancement (Closed)

Activity

Andreas Dilger added a comment - 28/Aug/14 7:16 AM

If there is an insistence on setting errno to return errors, it would still be possible to also return the negative errno instead of "-1" all the time. I'm not suggesting to return PTR_ERR() instead of NULL in case of memory allocation failures, as I agree that this is not very common for userspace programs.
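The hybrid convention suggested here might look roughly like the following. This is only a sketch: `demo_layout` and `demo_stripe_size_set()` are hypothetical stand-ins invented for illustration, not part of the proposed llapi_layout API.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an opaque layout object; not the real type. */
struct demo_layout {
	unsigned long long stripe_size;
};

/*
 * Report failure both ways: errno is set for callers that rely on
 * perror(), and the same code is returned negated for callers that
 * prefer to test the return value directly.
 */
static int demo_stripe_size_set(struct demo_layout *layout,
				unsigned long long size)
{
	if (layout == NULL || size == 0) {
		errno = EINVAL;
		return -EINVAL;
	}

	layout->stripe_size = size;
	return 0;
}
```

A caller that only checks `rc < 0` keeps working under either convention, which is what would make the two styles compatible.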

Christopher Morrone (Inactive) added a comment - 28/Aug/14 1:01 AM - edited

IMHO, "errno" is the domain of the kernel and libc and should not be used by application libraries.

I think that the lustre library should be considered a system library, not an "application library". From a normal application's perspective, the lustre library is as system as they come: you are using a library that interacts on your behalf directly with the kernel to influence a service offered by the kernel. In that respect I would argue that errno is entirely reasonable to use.

I would argue that this kind of error handling is exactly what user-space C developers have come to expect from system level libraries. After all, if it isn't OK for us to use errno, is it really OK for us to reuse all of the standard error codes (EIO, EINVAL, etc.)? Shouldn't we have to invent our own error names and values if those things are only the purview of the kernel and libc?

Granted, using a temporary variable to implement the use of errno is mildly annoying. But is that really enough justification to violate the principle of least surprise for the user-space developers who will be consuming our library functions?

I think that the "Return negated errno values" approach is probably the least desirable of those proposed. This is a kernel-ism; the result of a clever hack that recognized that those memory values would never be valid so hey why not throw the error value in there. In user space, the programmers are going to think we have lost our minds if we force them to check for a negative version of an error code that is always positive everywhere else. At the very least, we would need to create macros or functions that the users would need to use to check the return code and another to translate the error code into the correct value. We would be shifting the annoying error code shuffling from the library writer to all of the library consumers.

errno is defined by the C language standard. Most, if not all, of the values of errno that we use (EACCES, EAGAIN, EIO, EISDIR, etc.) are defined by POSIX.1-2001 or C99, not inventions of the Linux kernel.

If we use the same values as errno, users are going to want to use standard functions like perror() that assume the use of errno. Granted, strerror() also exists, but it is more difficult to use than perror().

It is already difficult to get users to check error codes, so I think it is important to keep things simple when reasonable to do so.
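To make the consumer-side difference concrete, here is a small sketch of the same failure reported under both conventions. The `demo_open_*` functions are hypothetical and exist only for this comparison.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical library call in the errno style: -1 on failure, errno set. */
static int demo_open_errno_style(const char *path)
{
	if (path == NULL) {
		errno = EINVAL;
		return -1;
	}
	return 0;
}

/* The same call in the negated style: returns 0 or -EINVAL directly. */
static int demo_open_negated_style(const char *path)
{
	return path == NULL ? -EINVAL : 0;
}

/* Caller of the errno style: perror() works unchanged. */
static int report_errno_style(const char *path)
{
	if (demo_open_errno_style(path) < 0) {
		int err = errno;	/* save before any further library call */

		perror("demo_open");
		return err;
	}
	return 0;
}

/* Caller of the negated style: must flip the sign before strerror(). */
static int report_negated_style(const char *path)
{
	int rc = demo_open_negated_style(path);

	if (rc < 0) {
		fprintf(stderr, "demo_open: %s\n", strerror(-rc));
		return -rc;
	}
	return 0;
}
```

Both callers end up with the same positive error code; the difference is whether the sign flip and the choice of strerror() over perror() fall on the library or on every consumer.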

Ned Bass (Inactive) added a comment - 27/Aug/14 10:49 PM

I'm on board with this. I just had a hallway discussion about the various approaches for returning errors, and negative errno return values were generally agreed to be the least evil of the following possibilities:

1. Use errno. This is the current approach. It's evil for the reasons given above by Andreas.
2. Use a library-specific version of errno, e.g. llapi_errno. It wouldn't get stomped on, but we'd have to handle thread safety. Ick.
3. Implement our own class of error codes. This might be cleaner and more flexible, but with a higher implementation and maintenance cost, plus UNIX programmers will already be familiar with errno values.
4. Return negated errno values. This is the proposed approach. The only real downside is there's little precedent outside the kernel and llapi. But it's thread safe and cleaner than the current approach.
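For reference, the thread-safety burden of the library-specific errno option could be handled with C11 thread-local storage, roughly as sketched below. The names `demo_llapi_errno`, `demo_llapi_errno_get()` and `demo_count_check()` are hypothetical, and the validity rule is an arbitrary placeholder.

```c
#include <errno.h>

/*
 * Hypothetical library-private error variable. _Thread_local (C11)
 * gives each thread its own copy, so one thread's failure cannot
 * overwrite the value another thread is about to read.
 */
static _Thread_local int demo_llapi_errno;

/* Accessor so that callers never touch the variable directly. */
static int demo_llapi_errno_get(void)
{
	return demo_llapi_errno;
}

/*
 * Example call using the private variable. Treating negative counts
 * as invalid is a placeholder rule for this sketch only. As with
 * errno, the variable is left untouched on success.
 */
static int demo_count_check(int count)
{
	if (count < 0) {
		demo_llapi_errno = EINVAL;
		return -1;
	}
	return 0;
}
```

This avoids clobbering by libc calls, but callers still lose perror() and have to learn a second error channel.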

Andreas Dilger added a comment - 27/Aug/14 10:11 PM

One thing that snuck past my review in these patches was that the new llapi_layout_*() functions are all returning "-1" to the caller and returning the error codes via "errno" instead of returning the negative error numbers directly to the callers. IMHO, "errno" is the domain of the kernel and libc and should not be used by application libraries. This is a global variable that could be touched by many parts of the process, and there is the danger that errno gets clobbered by other parts of the code, leading to ugliness like:

        if (lum == NULL) {
                tmp = errno;
                close(fd);
                errno = tmp;
                return -1;
        }

and every piece of code that is returning an error having to do it twice:

        if (path == NULL ||
            (layout != NULL && layout->llot_magic != LLAPI_LAYOUT_MAGIC)) {
                errno = EINVAL;
                return -1;
        }

My preference would be to fix the new llapi_layout_*() functions to return the negative error number directly and avoid errno entirely.
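A sketch of what the negative-return style could look like on a cleanup path similar to the first snippet. `demo_open_with_buffer()` is a hypothetical helper written for this comment, not code from the patch.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: opens a file and allocates a buffer, returning
 * 0 on success or a negative errno value. Each error is turned into a
 * return value at the point of failure, so the close() on the cleanup
 * path cannot clobber it and no save/restore of errno is needed.
 */
static int demo_open_with_buffer(const char *path, size_t size,
				 int *fd_out, void **buf_out)
{
	int fd;
	void *buf;

	if (path == NULL || fd_out == NULL || buf_out == NULL || size == 0)
		return -EINVAL;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	buf = malloc(size);
	if (buf == NULL) {
		close(fd);
		return -ENOMEM;
	}

	*fd_out = fd;
	*buf_out = buf;
	return 0;
}
```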

Ned Bass (Inactive) added a comment - 04/Aug/14 6:20 PM

Updated attached man pages to reflect recent API changes. In particular:

- Removed llapi_layout_expected()
- Renamed llapi_layout_by_{fd,fid,path}() to llapi_layout_get_by_{fd,fid,path}()
- Added flags parameter to llapi_layout_get_by_{fd,fid,path}()
- Implemented flag LAYOUT_GET_EXPECTED, which is accepted by llapi_layout_get_by_path() to implement functionality formerly provided by llapi_layout_expected()

Please review the documentation of the new flags parameter in llapi_layout_get_by_fd.txt.

Ned Bass (Inactive) added a comment - 17/Jul/14 3:02 PM

I don't see any benefit to abandoning 5302

We can keep working there. I just find it becomes hard to navigate when the comment and revision history gets too long.

the xattr interface ... might "succeed" on any filesystem with xattr support, but not actually create the striped file as expected.

I have not found that to be the case in practice (ext4, zfs, tmpfs, and nfs). I believe (but haven't verified in the code) that a filesystem must explicitly register support for an xattr namespace beyond the standard ones, otherwise the kernel will return EOPNOTSUPP.

Andreas Dilger added a comment - 17/Jul/14 8:23 AM

I don't see any benefit to abandoning 5302 and creating a new change over just pushing a new patch which drops the changes you don't want. Making a new change just means more places to look for information about this change.

As for checks, I agree with John. Simple checks against hard (constant) limits are fine, but I'd prefer to do any complex checks (e.g. opening /proc files and iterating) in a separate function.

The kernel has to do all of these checks itself anyway. The main drawback is that the xattr interface that is needed for BG/L is clumsy because it might "succeed" on any filesystem with xattr support, but not actually create the striped file as expected. The ioctl() interface was less troublesome in this regard since it is unlikely that any filesystem would handle the Lustre ioctl command. It is worthwhile to verify if setxattr("lustre.lov") will work on other filesystems or if they will refuse the "lustre.lov" xattr because it is not in one of the normal namespaces ("user", "system", "trusted", or "security") that are handled by the kernel.
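One way to run that check is a small probe like the one below. It is a sketch that assumes Linux, where setxattr() is declared in <sys/xattr.h>; `demo_xattr_probe()` is a hypothetical helper, and the value written is arbitrary, so it should only be pointed at a scratch file on a non-Lustre filesystem.

```c
#include <errno.h>
#include <sys/xattr.h>

/*
 * Try to set an extended attribute and report what the kernel said:
 * 0 if the name was accepted, or a negative errno value otherwise.
 * The attribute value is a throwaway string, not a valid layout.
 */
static int demo_xattr_probe(const char *path, const char *name)
{
	static const char value[] = "probe";

	if (setxattr(path, name, value, sizeof(value), 0) < 0)
		return -errno;
	return 0;
}
```

Calling `demo_xattr_probe(scratch_file, "lustre.lov")` on each filesystem of interest shows whether the name is refused outright or silently stored.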

Ned Bass (Inactive) added a comment - 16/Jul/14 11:29 PM

That is one reason I want to start over from patch set 23. It was pretty thoroughly reviewed at that stage so I'd like to see it land with only minor changes and do follow-up work in separate patches. I don't see much benefit to breaking up what's already been reviewed though. The main obstacle to landing it is the expensive validity checks that Andreas objects to. But those checks can simply be removed for now, and replaced with improved interfaces as discussed above in subsequent patches. How would that sit with you, Andreas?

James A Simmons added a comment - 16/Jul/14 10:42 PM - edited

Okay. I will take all the changes I did in 5302 and place them in the LU-4665 patch. Andreas, we will need to submit a different patch on top of Ned's new patch with your idea of a llapi_layout_verify. That is assuming people will accept your idea. Ned's new base patch will be fine since it will not be the back end of anything. Further testing of LU-4665, which will use the layout API with lfs getstripe and setstripe, on my part will expose any problems with Ned's design. Plus the new base patch will not handle the case of DNE directories, so that work will have to be developed as well. Much work to be done.

I promise to break the work I do in LU-4665 into small incremental patches so people can properly review them.

Perhaps, Ned, you could consider breaking your patch up into smaller pieces. It is a lot of work.

Ned Bass (Inactive) added a comment - 16/Jul/14 10:04 PM

Yes. John's comments cut to the heart of the matter and are in line with what Andy has been asking for. The ultimate test of a layout's validity is whether Lustre accepts and creates it. But the API should provide interfaces that allow the user to determine valid layout values.

At this point I think http://review.whamcloud.com/#/c/5302/ should be abandoned and replaced with a new change ID based on patch set 23. That was the last stable point in the revision series (aside from errno stomping bugs caught by Frank), and there has been too much churn since then for me to review the changes with any confidence. Future revisions should be based strictly on design changes agreed to here, and avoid unnecessary refactoring of code that has already been reviewed, tested, and debugged.

For now, I respectfully request to be the only person to push changes to the review. Others are of course welcome to contribute dependent patches as separate reviews. I'm grateful that James took the initiative to move this forward, but I think it's important for consistency's sake to just have one chef in the kitchen. Please communicate any new requirements for LU-4665 integration here so that work can move forward.

To summarize, I propose that we do the following, in order.

1. Abandon change 5302
2. Submit a new gerrit review based on 5302 patch set 23
3. Identify the questions we need the API to answer as suggested by John
4. Post draft man pages for new interfaces for review here, revise as needed
5. Refresh the patch with implementation of new interfaces, along with complete test cases and documentation

Are others on board with this plan?

John Hammond added a comment - 16/Jul/14 8:17 PM

Let me rephrase. What are the use cases of validity checking in user space? I'm in favor of good interfaces that answer specific questions like: "How many stripes can I have?" or "What are the minimum and maximum stripe sizes?" These are easy to answer.

On the other hand, offering an API to answer "Is this striping valid?" makes me a bit uncomfortable. It's a bit like being asked "Where do babies come from?" by someone else's kid. There are too many details to ensure that a striping that passes this function will always be accepted and created by Lustre.

Setting implementation aside, how do I use such a function? Do I create various hypothetical stripings and pass them to verify? This seems like a parameter search to answer the questions from the first paragraph.

People

Assignee: Ned Bass (Inactive)

Reporter: Ned Bass (Inactive)

Votes: 0

Watchers: 11

Dates

Created: 27/Aug/13 5:07 PM

Updated: 03/Feb/15 7:14 PM

Resolved: 03/Feb/15 7:14 PM