Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-3840

llapi_layout API design discussion

Details

    • Bug
    • Resolution: Duplicate
    • Major
    • None
    • None
    • None
    • 3
    • 9942

    Description

      Extensions to liblustreapi have been posted for review here.

      http://review.whamcloud.com/5302

      The goal of the API extensions is to provided a user-friendly interface for interacting with the layout of files in Lustre filesystems, and to hide the wire-protocol details behind an opaque data type.

      This issue will provide a forum for potentials users and developers to discuss and critique the design of the proposed API extensions.

      Attachments

        Issue Links

          Activity

            [LU-3840] llapi_layout API design discussion

            Duplicate of LU-2182

            jlevi Jodi Levi (Inactive) added a comment - Duplicate of LU-2182

            I would much rather stick with returning -1 and errors in errno than having positive error numbers returned from the functions. Many programs I've seen check "if (rc < 0)" for errors instead of "if (rc == -1)" , so I don't think it makes sense to break these.

            adilger Andreas Dilger added a comment - I would much rather stick with returning -1 and errors in errno than having positive error numbers returned from the functions. Many programs I've seen check " if (rc < 0) " for errors instead of " if (rc == -1) " , so I don't think it makes sense to break these.
            simmonsja James A Simmons added a comment - - edited

            So I got feedback from our users and middle ware developers. For our users they only care about lfs setstripe and getstripe working. They will most likely never program the striping api themselves. For the ones that do program the most common case is they test for success which means test for zero and then bomb out the app. They normally could care less about returned error values. What is most important to them is that a zero is returned for all functions on success. For our middle ware guys they also want zero to returned for all cases. Now for the error reporting they varied a bit on opinion. What I did get is they don't like negative errno values. They also were not fans of having to read errno itself after a function call. So they asked why not just return a positive errno instead since 0 is success and something else is failure. Also I found out in the discussion return 0 on success and a positive errno is defined in the POSIX.1c standard.

            simmonsja James A Simmons added a comment - - edited So I got feedback from our users and middle ware developers. For our users they only care about lfs setstripe and getstripe working. They will most likely never program the striping api themselves. For the ones that do program the most common case is they test for success which means test for zero and then bomb out the app. They normally could care less about returned error values. What is most important to them is that a zero is returned for all functions on success. For our middle ware guys they also want zero to returned for all cases. Now for the error reporting they varied a bit on opinion. What I did get is they don't like negative errno values. They also were not fans of having to read errno itself after a function call. So they asked why not just return a positive errno instead since 0 is success and something else is failure. Also I found out in the discussion return 0 on success and a positive errno is defined in the POSIX.1c standard.

            Because it would keep the same API that the rest of liblustreapi has today, and there is no drawback to doing so.

            The old API is seriously deficient in its design. Yes, we need to keep it around for some time, but I don't see a great deal of value in maintaining compatibility with poor design. This is our chance to make a clean break and make something that applications can really use.

            Virtually every application I've seen checks if (rc < 0) instead of if (rc == -1), so as it is clearly documented in the man pages that the functions return a negative value on error instead of "-1" there shouldn't be any problem.

            Granted, that is not uncommon. But the man page usually says:

            On error, -1 is returned, and errno is set appropriately.

            So it is perhaps not unreasonable to guess there are are also folks out there that are used to explicitly checking for -1. Lustre returning only -1 could be considered the least likely to trip anyone up because it will work for checks of either rc == -1 or rc < 0.

            morrone Christopher Morrone (Inactive) added a comment - Because it would keep the same API that the rest of liblustreapi has today, and there is no drawback to doing so. The old API is seriously deficient in its design. Yes, we need to keep it around for some time, but I don't see a great deal of value in maintaining compatibility with poor design. This is our chance to make a clean break and make something that applications can really use. Virtually every application I've seen checks if (rc < 0) instead of if (rc == -1), so as it is clearly documented in the man pages that the functions return a negative value on error instead of "-1" there shouldn't be any problem. Granted, that is not uncommon. But the man page usually says: On error, -1 is returned, and errno is set appropriately. So it is perhaps not unreasonable to guess there are are also folks out there that are used to explicitly checking for -1. Lustre returning only -1 could be considered the least likely to trip anyone up because it will work for checks of either rc == -1 or rc < 0 .
            afn Andy Nelson added a comment -

            I fully understand the constraints of backwards compatibility and "can't get rid of that yet" cruft. That was very much the point of my previous comment: I've got to carry around backwards compatibility stuff for the different lustre api's until such time as the new one is available on the oldest machine I have to compile code on. I'm expecting that I can remove the "old API" functionality from my own code in something like 5 years or so, if the "new API" stuff lands in a distributed version right now. And that isn't even the case yet, so the date keeps getting pulled further and further out.

            An example of that pain is the lustre_idl.h file, which I have to hack by hand and carry around with my code since it doesn't compile in user space and I need stuff out of there. The example codes in the lustre manual include it for example too. All the more complicated since that file changes in odd ways across different versions and I can't just use a 1.8x lustre_idl.h file on a lustre 2.x installation and expect it to work...

            So this is why my emphasis on the "start now" and get the clock ticking. As far as new api format (errno and such discussions), we've gone over that, but here is a reiteration: My position as an applications developer is to have one way to access the error information, and for that way to be the errno setting stuff.

            afn Andy Nelson added a comment - I fully understand the constraints of backwards compatibility and "can't get rid of that yet" cruft. That was very much the point of my previous comment: I've got to carry around backwards compatibility stuff for the different lustre api's until such time as the new one is available on the oldest machine I have to compile code on. I'm expecting that I can remove the "old API" functionality from my own code in something like 5 years or so, if the "new API" stuff lands in a distributed version right now. And that isn't even the case yet, so the date keeps getting pulled further and further out. An example of that pain is the lustre_idl.h file, which I have to hack by hand and carry around with my code since it doesn't compile in user space and I need stuff out of there. The example codes in the lustre manual include it for example too. All the more complicated since that file changes in odd ways across different versions and I can't just use a 1.8x lustre_idl.h file on a lustre 2.x installation and expect it to work... So this is why my emphasis on the "start now" and get the clock ticking. As far as new api format (errno and such discussions), we've gone over that, but here is a reiteration: My position as an applications developer is to have one way to access the error information, and for that way to be the errno setting stuff.

            I will hold off on my patches until this is resolved. I sent out a email to some people in our applications department to see what they say. I will report on that feedback.

            simmonsja James A Simmons added a comment - I will hold off on my patches until this is resolved. I sent out a email to some people in our applications department to see what they say. I will report on that feedback.

            Andy, while I'm all for updating applications to the new API, yours is not the only application in the world that uses it, so we can't remove the old API very quickly. Also, this new API has only just been landed into a development branch and would need to be backported into the maintenance releases before it even has a chance to be used by regular users.

            I think the best that can be done is to include this into all of the maintenance releases, update the old APIs to use this new code (James has started on that) and then it can be deprecated in a few years after it is available in those releases. It is also possible to mark those functions as deprecated in the headers so that application developers learn this before the API is removed. It would also make sense to update the user manual once this API is available in the maintenance releases.

            adilger Andreas Dilger added a comment - Andy, while I'm all for updating applications to the new API, yours is not the only application in the world that uses it, so we can't remove the old API very quickly. Also, this new API has only just been landed into a development branch and would need to be backported into the maintenance releases before it even has a chance to be used by regular users. I think the best that can be done is to include this into all of the maintenance releases, update the old APIs to use this new code (James has started on that) and then it can be deprecated in a few years after it is available in those releases. It is also possible to mark those functions as deprecated in the headers so that application developers learn this before the API is removed. It would also make sense to update the user manual once this API is available in the maintenance releases.
            afn Andy Nelson added a comment -

            I am all for abandoning the old api as soon as possible. It is a maintenance nightmare for my applicaiton code to handle. There is ugly and otherwise unneeded ifdef goo for different machines all over the place. Please dump that api. What Ned says is exactly right: there is a chance to do it right/better this time around.

            As far as "negative value returned" vs "specific negative value returned (i.e. rc=-ETHE_ERROR_CODE), again, this leads to confusion and complexity that is just not needed. What happens when someone makes a commit sometime and gets them out of sync, such that the return code and errno are not the same any more, for example? Yes, thats a bug that
            someone introduced, but that is the whole point. You can remove that bug years ahead of time by making it impossible to do that by design.

            afn Andy Nelson added a comment - I am all for abandoning the old api as soon as possible. It is a maintenance nightmare for my applicaiton code to handle. There is ugly and otherwise unneeded ifdef goo for different machines all over the place. Please dump that api. What Ned says is exactly right: there is a chance to do it right/better this time around. As far as "negative value returned" vs "specific negative value returned (i.e. rc=-ETHE_ERROR_CODE), again, this leads to confusion and complexity that is just not needed. What happens when someone makes a commit sometime and gets them out of sync, such that the return code and errno are not the same any more, for example? Yes, thats a bug that someone introduced, but that is the whole point. You can remove that bug years ahead of time by making it impossible to do that by design.

            so as it is clearly documented in the man pages that the functions return a negative value on error instead of "-1" there shouldn't be any problem.

            If the man page merely specifies a negative value on error, than a properly written application shouldn't rely on the specific value returned. Therefore in order for this behavior to be useful, it most become a formal part of the API and documented as such. Simplicity is arguably the most important aspect of a well-designed API, and having two alternative means of returning error codes would add complexity for little gain. I'm not persuaded we should do it just because the legacy llapi functions do it. We should see this as an opportunity to get things right this time, not to carry over all the old baggage.

            nedbass Ned Bass (Inactive) added a comment - so as it is clearly documented in the man pages that the functions return a negative value on error instead of "-1" there shouldn't be any problem. If the man page merely specifies a negative value on error, than a properly written application shouldn't rely on the specific value returned. Therefore in order for this behavior to be useful, it most become a formal part of the API and documented as such. Simplicity is arguably the most important aspect of a well-designed API, and having two alternative means of returning error codes would add complexity for little gain. I'm not persuaded we should do it just because the legacy llapi functions do it. We should see this as an opportunity to get things right this time, not to carry over all the old baggage.

            Because it would keep the same API that the rest of liblustreapi has today, and there is no drawback to doing so. Virtually every application I've seen checks if (rc < 0) instead of if (rc == -1), so as it is clearly documented in the man pages that the functions return a negative value on error instead of "-1" there shouldn't be any problem. That allows applications to choose which behaviour they want to use for programming, and I don't see it introducing any significant complexity into the library - at worst it would mean return -errno instead of return -1 in some places.

            adilger Andreas Dilger added a comment - Because it would keep the same API that the rest of liblustreapi has today, and there is no drawback to doing so. Virtually every application I've seen checks if (rc < 0) instead of if (rc == -1) , so as it is clearly documented in the man pages that the functions return a negative value on error instead of "-1" there shouldn't be any problem. That allows applications to choose which behaviour they want to use for programming, and I don't see it introducing any significant complexity into the library - at worst it would mean return -errno instead of return -1 in some places.

            People

              nedbass Ned Bass (Inactive)
              nedbass Ned Bass (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              11 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: