Details

    • Technical task
    • Resolution: Fixed
    • Critical
    • Lustre 2.9.0
    • None
    • 9223372036854775807

    Description

      Right now lustre will fail to compile if one attempts to build from a source tree pulled from git, but not actually in a git working directory. The error will contain the string:

      "I have no idea how to create a META file"

      I can only guess at the reasons this was put in place originally, and none of the reasons that I am coming up with are particularly convincing.

      There are, on the other hand, pretty strong counter examples for why this restriction should not exist. For instance, an automated build-and-test farm might reasonably want to use a command like the following to retrieve a tree for a specific commit:

       git archive --format=tar.gz --remote=$URL $GIT_TREEISH

      As it stands, that tar ball will not allow lustre to be built.

      This is a long standing problem, and we really need to get it fixed.

      I have already hacked the code to stop the problem for LLNL's local 2.5-llnl branch, but I think a little more extensive overhaul is needed to make things clean. I think it is pretty reasonable for me to get this done in time for Lustre 2.9.

      Attachments

        Issue Links

          Activity

            [LU-7642] Allow lustre source build without git working directory

            Change 18107 landed, so this is now fixed.

            morrone Christopher Morrone (Inactive) added a comment - Change 18107 landed, so this is now fixed.

            FYI, I have a series of patches under LU-7699 that overhaul lustre versioning. This problem is eliminated in the first patch of that series (http://review.whamcloud.com/18107).

            morrone Christopher Morrone (Inactive) added a comment - FYI, I have a series of patches under LU-7699 that overhaul lustre versioning. This problem is eliminated in the first patch of that series ( http://review.whamcloud.com/18107 ).

            Look at this:

            $ sh autogen.sh
            configure.ac:12: installing 'config/config.guess'
            configure.ac:12: installing 'config/config.sub'
            configure.ac:14: installing 'config/install-sh'
            configure.ac:14: installing 'config/missing'
            libcfs/libcfs/autoMakefile.am: installing 'config/depcomp'
            $ ./configure --version
            Lustre configure LUSTRE_VERSION
            generated by GNU Autoconf 2.69
            [cut]
            $ ./configure --help
            `configure' configures Lustre LUSTRE_VERSION to adapt to many kinds of systems.
            [cut]
            

            This is because we do, in configure.ac, the following:

            AC_INIT([Lustre], [LUSTRE_VERSION], [https://jira.hpdd.intel.com/], [lustre])

            We are setting the version to the string literal "LUSTRE_VERSION". That is clearly not correct.

            morrone Christopher Morrone (Inactive) added a comment - Look at this: $ sh autogen.sh configure.ac:12: installing 'config/config.guess' configure.ac:12: installing 'config/config.sub' configure.ac:14: installing 'config/install-sh' configure.ac:14: installing 'config/missing' libcfs/libcfs/autoMakefile.am: installing 'config/depcomp' $ ./configure --version Lustre configure LUSTRE_VERSION generated by GNU Autoconf 2.69 [cut] $ ./configure --help `configure' configures Lustre LUSTRE_VERSION to adapt to many kinds of systems. [cut] This is because we do, in configure.ac, the following: AC_INIT([Lustre], [LUSTRE_VERSION], [https://jira.hpdd.intel.com/], [lustre]) We are setting the version to the string literal "LUSTRE_VERSION". That is clearly not correct.

            It is a bit confusing at this point how the Lustre version is set under what conditions. I'll keep some notes here as I go along, and maybe turn it into wiki doc later. The document under lustre/doc/VERSIONING is a bit out of date.

            lustre/autoconf/lustre-version.ac contains m4 macros named LUSTRE_

            {MAJOR,MINOR,PATCH,FIX}

            . They are combine in a single LUSTRE_VERSION m4 macro that contains only the first three numbers if LUSTRE_FIX is zero, and all four numbers otherwise.

            LUSTRE_VERSION is assigned to AC_LUSTRE_VERSION_STRING which is in turn made available more broadly with AC_SUBST. However AC_LUSTRE_VERSION_STRING is only currently used in one place, in lustre/include/lustre_ver.h.in and used to set a CPP define named LUSTRE_VERSION_STRING.

            Partly m4 macros are used so that the LUSTRE_VERSION can be used in AC_INIT in configure.ac.

            But it is not immediately clear to me why we use the extra steps of using AC_SUBST and later manually setting CPP macros with slightly different names when they could have just used AC_DEFINE directly.

            Next, lets consider places that try to query the local git repo for information.

            In config/lustre-build.m4, the LB_CONFIGURE macro calls LB_BUILDID.

            LB_BUILDID first checks if we are in a local git repo by running "git branch" and ignoring the output. If that command returns success, the macro tries to use "git describe", only looking at tags that match the pattern of the old cvs-style tags with underscores in their name (why are we still doing that??), and pulling out hash portion to use as the buildid.

            Alternatively, if "git branch" is unsuccessful LB_BUILDID tries to pull the buildid from a file named META. If that doesn't exist either, LB_BUILDID prints a scary warning but does not actually abort.

            There are a number of strange things here. For instance, the code will warn if the starting part of the tag doesn't match $VERSION (where does that one get set?), but doesn't actually abort using the hash. That is kind of strange. Also this whole thing only happens once when configure is run the first time, and then cached. So if the developer makes 15 commits after running configure, presumably the hash that lustre gets built with will be completely wrong. That seems odd.

            Also strange is that this code to parse git describe output is more or less duplicated, but not exactly, in lustre/scripts/version_tag-git.pl. For instance, version_tag-git.pl fails to employ the "--match" option with git describe the way that the configure script does.

            lustre/scripts/version_tag-git.pl is one of the support scripts in the overly-complicated-for-what-it-does script named lustre/scripts/version_tag.pl. version_tag.pl is called from the special automake dist-hook target (or more specifically the dist-hook target calls the module-dist-hook target which runs version_tag.pl). So this script is called at "make dist" time to record the custom buildid that is partially assempled from the output of git --describe, and put that information into a file named "META" so that it is available in the canonical distribution tarball. When the tarball is used in the future, access to a local git repo will no longer be available.

            This all seems much more convoluted than it needs to be. Open questions:

            • Why don't we just do the parsing of "git --describe"'s output in one place instead of 2 or 3?
            • Why don't we use the git --describe output in the version option of AC_INIT (we could use m4_esyscmd_s to call the same script that would be used elsewhere).
            morrone Christopher Morrone (Inactive) added a comment - It is a bit confusing at this point how the Lustre version is set under what conditions. I'll keep some notes here as I go along, and maybe turn it into wiki doc later. The document under lustre/doc/VERSIONING is a bit out of date. lustre/autoconf/lustre-version.ac contains m4 macros named LUSTRE_ {MAJOR,MINOR,PATCH,FIX} . They are combine in a single LUSTRE_VERSION m4 macro that contains only the first three numbers if LUSTRE_FIX is zero, and all four numbers otherwise. LUSTRE_VERSION is assigned to AC_LUSTRE_VERSION_STRING which is in turn made available more broadly with AC_SUBST. However AC_LUSTRE_VERSION_STRING is only currently used in one place, in lustre/include/lustre_ver.h.in and used to set a CPP define named LUSTRE_VERSION_STRING. Partly m4 macros are used so that the LUSTRE_VERSION can be used in AC_INIT in configure.ac. But it is not immediately clear to me why we use the extra steps of using AC_SUBST and later manually setting CPP macros with slightly different names when they could have just used AC_DEFINE directly. Next, lets consider places that try to query the local git repo for information. In config/lustre-build.m4, the LB_CONFIGURE macro calls LB_BUILDID. LB_BUILDID first checks if we are in a local git repo by running "git branch" and ignoring the output. If that command returns success, the macro tries to use "git describe", only looking at tags that match the pattern of the old cvs-style tags with underscores in their name (why are we still doing that??), and pulling out hash portion to use as the buildid. Alternatively, if "git branch" is unsuccessful LB_BUILDID tries to pull the buildid from a file named META. If that doesn't exist either, LB_BUILDID prints a scary warning but does not actually abort. There are a number of strange things here. For instance, the code will warn if the starting part of the tag doesn't match $VERSION (where does that one get set?), but doesn't actually abort using the hash. That is kind of strange. Also this whole thing only happens once when configure is run the first time, and then cached. So if the developer makes 15 commits after running configure, presumably the hash that lustre gets built with will be completely wrong. That seems odd. Also strange is that this code to parse git describe output is more or less duplicated, but not exactly, in lustre/scripts/version_tag-git.pl. For instance, version_tag-git.pl fails to employ the "--match" option with git describe the way that the configure script does. lustre/scripts/version_tag-git.pl is one of the support scripts in the overly-complicated-for-what-it-does script named lustre/scripts/version_tag.pl. version_tag.pl is called from the special automake dist-hook target (or more specifically the dist-hook target calls the module-dist-hook target which runs version_tag.pl). So this script is called at "make dist" time to record the custom buildid that is partially assempled from the output of git --describe, and put that information into a file named "META" so that it is available in the canonical distribution tarball. When the tarball is used in the future, access to a local git repo will no longer be available. This all seems much more convoluted than it needs to be. Open questions: Why don't we just do the parsing of "git --describe"'s output in one place instead of 2 or 3? Why don't we use the git --describe output in the version option of AC_INIT (we could use m4_esyscmd_s to call the same script that would be used elsewhere).

            Christopher J. Morrone (morrone2@llnl.gov) uploaded a new patch: http://review.whamcloud.com/18071
            Subject: LU-7642 build: Allow Lustre to build in absence of git
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 6d88d4018191a5ef3c5543f05e60bbfa3523794b

            gerrit Gerrit Updater added a comment - Christopher J. Morrone (morrone2@llnl.gov) uploaded a new patch: http://review.whamcloud.com/18071 Subject: LU-7642 build: Allow Lustre to build in absence of git Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: 6d88d4018191a5ef3c5543f05e60bbfa3523794b

            Absolutely! That is why I assigned this ticket to myself; I plan to do this work. This and other tickets will help me track what I need to get done, and also help begin socializing my vision for the build system in the 2.9 time frame.

            And FYI, I opened LU-7645 to start planning what needs to change about our use of the RPM Release field.

            morrone Christopher Morrone (Inactive) added a comment - Absolutely! That is why I assigned this ticket to myself; I plan to do this work. This and other tickets will help me track what I need to get done, and also help begin socializing my vision for the build system in the 2.9 time frame. And FYI, I opened LU-7645 to start planning what needs to change about our use of the RPM Release field.
            rread Robert Read added a comment -

            morrone Yes, that makes sense. Care to submit your patch for master? Code talks louder than words.

            rread Robert Read added a comment - morrone Yes, that makes sense. Care to submit your patch for master? Code talks louder than words.

            Prior to all of that, as you will recall we used to just have the version/build information in a file that somebody had to update, but it often got overlooked when doing version updates/releases, etc., so we automated getting that from git.

            I don't think that is quite accurate. We still need to make commits to update the version information. It would be quite bad to fail to do so even now. It is mostly the release field that is effected by all of this.

            And frankly, this use of the release field is a complete violation of what the release field is supposed to be for in rpm packaging. Anyone using a build system like koji (what fedora and related distros use), the release field must be reserved for packaging purposes. Lustre trying to reuse it for its own means causes a lot of complications. That just isn't what the Release field is for.

            We are using the Release field for Versioning, of Lustre, whereas the Release field should only be used for packing purposes. Granted at the moment the concepts or building and packaging are pretty confused and incestuously interwined in Lustre, but I am working hard to fix that.

            And further: There is minimal (no?) cost to you to allowing other people in the world to build without a working git repo. It is about as close to a free lunch as one can get. Other versioning issues will probably require tradeoffs, I agree, but I'm not going to go into those in this ticket.

            morrone Christopher Morrone (Inactive) added a comment - - edited Prior to all of that, as you will recall we used to just have the version/build information in a file that somebody had to update, but it often got overlooked when doing version updates/releases, etc., so we automated getting that from git. I don't think that is quite accurate. We still need to make commits to update the version information. It would be quite bad to fail to do so even now. It is mostly the release field that is effected by all of this. And frankly, this use of the release field is a complete violation of what the release field is supposed to be for in rpm packaging. Anyone using a build system like koji (what fedora and related distros use), the release field must be reserved for packaging purposes. Lustre trying to reuse it for its own means causes a lot of complications. That just isn't what the Release field is for. We are using the Release field for Versioning, of Lustre, whereas the Release field should only be used for packing purposes. Granted at the moment the concepts or building and packaging are pretty confused and incestuously interwined in Lustre, but I am working hard to fix that. And further: There is minimal (no?) cost to you to allowing other people in the world to build without a working git repo. It is about as close to a free lunch as one can get. Other versioning issues will probably require tradeoffs, I agree, but I'm not going to go into those in this ticket.

            rread Sure, we could throw any kinds of "placeholder" (i.e. like "custom") values into the missing metadata but the point of getting and using that metadata was to convey accurate information about exactly where the code came from.

            Prior to all of that, as you will recall we used to just have the version/build information in a file that somebody had to update, but it often got overlooked when doing version updates/releases, etc., so we automated getting that from git. It also never conveyed "between release" information as well as a tag+fwd commits, or even a git hash could.

            So yes, it does complicate and restrict, but there's no free lunch.

            brian Brian Murrell (Inactive) added a comment - rread Sure, we could throw any kinds of "placeholder" (i.e. like "custom") values into the missing metadata but the point of getting and using that metadata was to convey accurate information about exactly where the code came from. Prior to all of that, as you will recall we used to just have the version/build information in a file that somebody had to update, but it often got overlooked when doing version updates/releases, etc., so we automated getting that from git. It also never conveyed "between release" information as well as a tag+fwd commits, or even a git hash could. So yes, it does complicate and restrict, but there's no free lunch.

            Robert: I don't think so, I think something like that should be an option. If you don't have the git working directory, you don't know what tag it came from. The code should just have the version included. It avoids soooooo many problems.

            Now, I'm not at all saying that we can't also have Lustre add version info if it is in a git working directory. That part can certainly remain.

            morrone Christopher Morrone (Inactive) added a comment - Robert: I don't think so, I think something like that should be an option. If you don't have the git working directory, you don't know what tag it came from. The code should just have the version included. It avoids soooooo many problems. Now, I'm not at all saying that we can't also have Lustre add version info if it is in a git working directory. That part can certainly remain.

            People

              morrone Christopher Morrone (Inactive)
              morrone Christopher Morrone (Inactive)
              Votes:
              1 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: