lustre build system overhaul (LU-1199)

[LU-3462] Eliminate ldiskfs recursive & independent rpm packaging Created: 12/Jun/13  Updated: 09/Dec/13  Resolved: 12/Sep/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.0
Fix Version/s: Lustre 2.5.0

Type: Technical task Priority: Major
Reporter: Christopher Morrone Assignee: Minh Diep
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Duplicate
is duplicated by LU-3955 Eliminate ldiskfs recursive & indepen... Closed
Rank (Obsolete): 8659

 Description   

ldiskfs is a prerequisite of lustre. But instead of building the prerequisite before lustre, we have chosen to stop and perform the ldiskfs build in the middle of the lustre build. That introduces a great deal of unnecessary complexity to the lustre build system. The complexity is so great that we have allowed the normal* way of building RPMs to remain broken for years. Our .src.rpm files remain unbuildable under many normal situations.

To make further progress on LU-1199, this major problem must be addressed. As I see it, there are two main solutions we could choose:

  1. Move ldiskfs out of the lustre source tree. Build it first.
  2. Keep ldiskfs in-tree, but eliminate its independent build system and spec file. The ldiskfs binary rpm becomes a sub-package of lustre.spec.

I have been moving us towards option 1. I have gotten enough change landed to allow ldiskfs to be built out of tree. It will not take much change to allow Lustre to use the externally pre-built ldiskfs packages. We already do this at LLNL, so I'm pretty clear on what needs to be done there.

I think option 1 would also position us nicely if James Simmons is successful in getting all of the ldiskfs patches into the upstream kernel at some point. By then we will already be comfortable building against an ldiskfs that is out of tree.

It is also nicely symmetric with how we build against an external zfs today (and perhaps btrfs in the future). Even if we wind up needing to make an "lbtrfs" in the future, I don't think we would want to repeat the choice to try to merge that into the lustre tree.

I would like to see a quick decision from Intel on which path we should take for Lustre 2.5, so that we can get the change introduced early in the development cycle. That will give us some time to adjust all of our personal processes, and get the kinks worked out well in advance of release.

* I am defining "normal" to be the way that a distro like Fedora would do it: first build the ldiskfs rpms, including devel packages. Then build lustre against the already-built ldiskfs packages.



 Comments   
Comment by Andreas Dilger [ 13/Jun/13 ]

So, maybe I'm a bit confused. Is it not possible to have the ldiskfs code as part of the Lustre source tree, but build it first? I know you don't like recursive package building, but could it be sequential package building (first ldiskfs, then Lustre)?

While it is true that there is some effort to push some ldiskfs patches upstream, I don't think we'll ever get to a "patchless" ldiskfs like we have with the kernel, unless this happens at the same time as the Lustre server code is accepted into the kernel as well. There is just no benefit upstream from some of the ldiskfs patches.

Moving the ldiskfs code out into a separate Git repo would increase the development overhead noticeably, since it means that a "commit" would not be atomic across both repos, and we'd have to start adding new complexity to manage ldiskfs interoperability with different versions of Lustre. I don't think this is a good use of limited developer time if the only benefit is for simplifying the packaging.

I'd be fine with having the Lustre build first build ldiskfs packages, install them, then build Lustre packages, and they could have separate .src.rpm files, but I'd like them to be kept in the same Git repo, since they often need to be updated in lockstep. Possibly we could use something like:

LDISKFS_HASH=$(git log ldiskfs | head -1 | awk '{ print $2 }')
LDISKFS_VERSION=$(git describe $LDISKFS_HASH)

to extract the last commit from the ldiskfs tree to decide whether it needs to be rebuilt or can be gotten from the build cache. We might Require: $LDISKFS_HASH so that the packages are kept in sync when installed?
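The snippet above can be made a bit more robust: `git log -1 --format=%H -- <path>` returns the last commit touching a subtree directly, with no `head | awk` parsing. A runnable sketch, demonstrated against a throwaway repo (the repo layout, file name, and commit here are illustrative, not the actual lustre tree):

```shell
# Extract the last commit that touched a subdirectory. A throwaway repo
# is created so the commands are self-contained and runnable.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email demo@example.com
git config user.name demo
mkdir ldiskfs
echo "example.patch" > ldiskfs/series   # illustrative content only
git add ldiskfs
git commit -q -m "add ldiskfs subtree"

# Last commit touching ldiskfs/ -- no output parsing needed:
LDISKFS_HASH=$(git log -1 --format=%H -- ldiskfs)
echo "$LDISKFS_HASH"
```

The same one-liner would work from the real tree's top level to key a build cache, as proposed above.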

Comment by Peter Jones [ 13/Jun/13 ]

Minh can you please review this suggestion to understand what would be required? thanks Peter

Comment by Christopher Morrone [ 13/Jun/13 ]

So, maybe I'm a bit confused. Is it not possible to have the ldiskfs code as part of the Lustre source tree, but build it first? I know you don't like recursive package building, but could it be sequential package building (first ldiskfs, then Lustre)?

No, I don't think that third option is really viable.

The goal, first and foremost, is to make normal rpm build operations work correctly. Creating a new script that sits on top of both lustre and ldiskfs is the lbuild model of doing things. It doesn't support the standard Fedora (and other rpm-based distro) way of building things.

I define "normal" as this:

  • Build ldiskfs
    • Generate ldiskfs .src.rpm
    • Feed .src.rpm to mock to build binary ldiskfs rpms
  • Build lustre against ldiskfs
    • Generate lustre .src.rpm
    • Feed .src.rpm to mock to build binary lustre rpms

For that last step to work, lustre must have correct rpm requirements on ldiskfs development packages, and ldiskfs needs proper versioning.
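What those requirements might look like in spec-file terms (a hypothetical sketch; the package names and version numbers are illustrative, not the actual lustre.spec contents):

```spec
# In the standalone ldiskfs spec (hypothetical):
Name:    lustre-ldiskfs
Version: 4.1.0

%package devel
Summary: Development files for building against ldiskfs

# In lustre.spec, so mock can resolve the build dependency
# from the already-built ldiskfs packages:
BuildRequires: lustre-ldiskfs-devel >= 4.1.0

# And on the binary side, so installed packages stay in sync:
Requires: lustre-ldiskfs >= 4.1.0
```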

Now, I suppose you could have a git repo where the top level directory just looks like this:

   ldiskfs/
   lustre/
   uberbuilditall.sh

It could work, but it doesn't save you much. And at the same time, it encourages the same lazy approach to packaging and versioning that we've been employing. (Two years without an ldiskfs version number bump, how did you guys miss that???)

You can have one commit that spans two projects, but I don't see much value in that. You'll need to start properly versioning ldiskfs in the lustre repo. Every time you tag lustre, you have to tag ldiskfs as well. It is going to get dirty.

If you really think that it is important to leave ldiskfs in-tree, then why would we want to perpetuate the current system of ldiskfs having its own autotools scripts and separate spec file? This is causing too many problems.

I'd be fine with having the Lustre build first build ldiskfs packages

I am not ok with that. At all.

If you are dead set on ldiskfs remaining in tree, let's just make it a proper part of lustre, which is the option 2 that I proposed above.

There would be absolutely no need for another lbuild-like wrapper script, no need for separate versioning and dependencies...I think the script that you are proposing is just a continuation of our current problem.

Comment by Christopher Morrone [ 13/Jun/13 ]

Andreas, let me ask it this way:

What value do we derive from the ldiskfs subdirectory of the git repo having its own spec file and separate autotools implementation?

Comment by Andreas Dilger [ 14/Jun/13 ]

OK, to clarify - I am not advocating for ldiskfs having its own spec file and separate autotools if that is not of benefit to your reorganization. I would be fine with #2 if that is an option - I was under the impression that #2 was not an option that would get you to your end goal. I think if the end game is pushing more of the ldiskfs patches upstream that #1 and #2 would essentially converge when the upstream kernel's ext4 code can be used for osd-ldiskfs without modification, or osd-zfs/osd-btrfs eventually take over.

Comment by Christopher Morrone [ 14/Jun/13 ]

Yes, #2 is an option. That is why I listed it as an option.

It would be disappointing to have wasted so much time on option #1 this past year, and have to reverse most of that work. But I just want to put this thing to bed at this point. If you guys won't accept option #1, but can commit to option #2, I am fine with that.

Option #2 certainly has its own tricky bits, and I haven't thought it through as much, but I don't imagine there is anything excessively difficult.

So far, the main issue I see will be dealing with the version number change on the lustre-ldiskfs package. When it becomes a lustre.spec sub package, the versioning will suddenly go backwards to a smaller version (because the lustre version is lower than the ldiskfs version number). But that shouldn't be too bad. We'll probably just need to use an rpm Epoch.

Comment by Christopher Morrone [ 14/Jun/13 ]

Ah, it occurs to me that if I simply put the ldiskfs.ko in the existing lustre-osd-ldiskfs package, then I can have lustre-osd-ldiskfs Obsoletes the lustre-ldiskfs packages. Then there would be no need to deal with rpm Epochs.
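In RPM terms, the rename-without-Epoch trick looks roughly like this (a sketch; the version cutoff and summary text are assumptions):

```spec
# In lustre.spec, for the sub-package that now ships ldiskfs.ko:
%package osd-ldiskfs
Summary: ldiskfs-based OSD module for Lustre
# Cleanly replace the old standalone package on upgrade, so its higher
# version number never has to be beaten with an Epoch:
Obsoletes: lustre-ldiskfs < 4.2.0
Provides:  lustre-ldiskfs = %{version}
```

Because `Obsoletes` retires the old package by name, the version comparison between the two packages never happens, which is what makes the Epoch unnecessary.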

Comment by Andreas Dilger [ 15/Jun/13 ]

Sure, that seems totally reasonable, and keeps the idea that the OSD and ldiskfs code are tightly coupled.

Comment by Christopher Morrone [ 21/Jun/13 ]

James, you might want to be aware of this ticket.

Comment by Christopher Morrone [ 21/Jun/13 ]

OK, it looks like Intel wants to go with Option #2: Keep ldiskfs in-tree, but eliminate its independent build system.

I've started work on this. It will wind up eliminating things that we recently introduced like ldiskfs_config.h. When ldiskfs is merged into lustre's build system, there is no real reason to cleanly export ldiskfs's configuration any longer. The only user of ldiskfs now has direct access to that knowledge.

Comment by Christopher Morrone [ 21/Jun/13 ]

Why do we generate the ldiskfs source and apply the patches under make?

I think there are a number of advantages, and some good clean up to the make path that I can do if I move the patch application into the configure step.

Is there any good reason to keep it at make time?

Comment by Andreas Dilger [ 22/Jun/13 ]

I would think the reason the patches are applied by make is so that they get re-applied automatically if they are changed during development or after a git update. If they are applied during configure, then configure will only be re-run if Makefile.am or similar is changed.

Comment by Christopher Morrone [ 24/Jun/13 ]

That is a good guess! But from what I'm reading, I don't think that it really works that way.

It looks like the only dependency is on the series file. So patch modifications won't cause a "rm -rf" of the patched directory unless there is also a series file change.
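A hedged reconstruction of the make rule being described (target and variable names are guesses, not the actual Makefile): the patched tree depends only on the series file, so editing an individual patch listed in it does not trigger a rebuild:

```make
# Hypothetical sketch of the dependency structure described above.
# Only the series file is a prerequisite, so touching one of the
# patches it lists does not invalidate the already-patched tree.
sources: $(LDISKFS_SERIES)
	rm -rf linux-stage
	# ...unpack the ext4 sources and apply each patch in the series...
	touch $@
```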

It is not clear to me that there is much value in continuing that. It will lead to a false sense of security.

And doing it right would be very difficult, because we are combining the kernel's build system with another pass of a non-kernel autotools system. It looks to me like it would be very difficult to get this right (which is why it never has been). So I'm not sure there is even value in preserving the current system in the hopes that someone someday will do all of the work to make it correct.

I would vote for simplicity.

Comment by Christopher Morrone [ 02/Jul/13 ]

It didn't sound like there was much endorsement of the plan to move patch application into configure, so I will avoid that for the first iteration of the patch.

This is not at all finished, but I pushed a patch just now to get a peek at how the Intel build system is going to react to the change. And to demonstrate that work is progressing.

http://review.whamcloud.com/6850

Comment by Christopher Morrone [ 03/Jul/13 ]

Patch set 2 of http://review.whamcloud.com/6850 built in Jenkins, and appears to be running through testing pretty well. I assume any problems probably would have been evident pretty quickly since this is just a packaging change. And it has been running fine for hours now.

Minh, I think it would be worthwhile to get your initial impressions of the approach, keeping in mind that there is still cleanup to do.

Comment by Christopher Morrone [ 10/Jul/13 ]

Patch set 3 of http://review.whamcloud.com/6850 is now available.

I cleaned up the things that Minh noted in the last patch. The m4 files still have some obvious cleanup work needed before they are ready though.

The big change in this version is the elimination of the ldiskfs/ldiskfs subdirectory. I realized that it was pretty much superfluous once the ldiskfs build system was eliminated, so I moved the make files from that directory up one level and removed it.

Looks like it needs a rebase after today's landings too.

But I think things are looking pretty good otherwise.

Comment by Christopher Morrone [ 13/Jul/13 ]

Patch set 6 of http://review.whamcloud.com/6850 is now available.

I cleaned up the m4 files (among other things). The main task with the m4 cleanup was to merge the two ways that lustre and ldiskfs were determining the kernel version.

With that done, I think it is now ready for wider review.

Comment by Christopher Morrone [ 20/Jul/13 ]

I pulled a thread and unravelled the sweater. But I think I put it back together now. Hopefully patch set 12 will compile and be near final.

Patch set 12 depends on a new smaller patch that repairs some recent (and possibly some long standing) breakage with the Lustre "make dist" target.

Comment by Christopher Morrone [ 24/Jul/13 ]

It took a couple more revisions, but things are now building. There are now two patches for this change:

http://review.whamcloud.com/7054
http://review.whamcloud.com/6850

6850 failed maloo on one test, but I believe that it was unrelated to the patch.

I am now done with work on these patches until more work arises from reviews and/or rebases.

Comment by Christopher Morrone [ 07/Aug/13 ]

Status update:

Patch 7054 landed.

The remaining patch, 6850, is in pretty good shape. Barring problems found by reviewers, I consider it ready to land. LLNL is already using it on our 2.4.0 branch.

6850 had a +1 review from Brian Murrell at Patch Set 14. I rebased it to address conflicts from newly landed patches, but master seems to be destabilized at the moment, and Maloo testing failed for reasons unrelated to the patch.

Comment by Minh Diep [ 13/Aug/13 ]

Hi Chris,

make srpm results in this error

[mpiuser@client-1 lu3462.org]$ make srpm
make -C lustre-iokit srpm
make[1]: Entering directory `/mnt/build/build/lu3462.org/lustre-iokit'
make[1]: *** No rule to make target `srpm'. Stop.
make[1]: Leaving directory `/mnt/build/build/lu3462.org/lustre-iokit'
make: *** [srpm] Error 2

After I comment out the make iokit in srpm, it continues but fails here.

...
Type 'make' to build Lustre.
+ make -j 4 -s
make[2]: Entering directory `/localhome/mpiuser/rpmbuild/BUILD/lustre-2.4.53'
make[3]: Entering directory `/localhome/mpiuser/rpmbuild/BUILD/lustre-2.4.53'
Making all in ldiskfs
Making all in .
make[4]: Entering directory `/localhome/mpiuser/rpmbuild/BUILD/lustre-2.4.53'
make[5]: Entering directory `/localhome/mpiuser/rpmbuild/BUILD/lustre-2.4.53/lustre'
make[5]: Leaving directory `/localhome/mpiuser/rpmbuild/BUILD/lustre-2.4.53/lustre'
make[5]: Entering directory `/usr/src/kernels/2.6.32-358.11.1.el6.x86_64'
make[7]: *** No rule to make target `/localhome/mpiuser/rpmbuild/BUILD/lustre-2.4.53/ldiskfs/dynlocks.o', needed by `/localhome/mpiuser/rpmbuild/BUILD/lustre-2.4.53/ldiskfs/ldiskfs.o'. Stop.
make[7]: *** Waiting for unfinished jobs....
make[6]: *** [/localhome/mpiuser/rpmbuild/BUILD/lustre-2.4.53/ldiskfs] Error 2
make[6]: *** Waiting for unfinished jobs....

Comment by Christopher Morrone [ 13/Aug/13 ]

Ah, thanks! I clearly missed a makefile change for lustre-iokit. I remember making that change, so I must have misplaced it. Good catch.

The second problem I can't reproduce. Some more details about your configuration might help. But it may not matter in the end.

It looks like there is an existing bug not necessarily related to my patch: the "srpm-real" make target does "rpmbuild -ta", which is incorrect. It should be doing "rpmbuild -ts". In other words, if you run "make srpm", there should really be no attempt to build ldiskfs, and you should not have seen that error. Then again, that doesn't entirely explain why the build did fail. Hopefully ldiskfs just was not configured correctly under that path (which it really should not need to be). But it is possible that the problem will just move to the rpm case.

Let's start with those two changes, and you can tell me if the problem moves to another make target. I just pushed revision 17 of the patch with those fixes.
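For reference, the fix being described amounts to a one-flag change in the make target (a hypothetical sketch of the rule; the tarball variable is a guess): `rpmbuild -ta` unpacks the tarball and builds both source and binary rpms, while `-ts` only packages the .src.rpm.

```make
# Hypothetical sketch of the corrected target: 'make srpm' should only
# package sources (-ts), never attempt a full binary build (-ta).
srpm-real:
	rpmbuild -ts $(distdir).tar.gz
```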

By the way, don't hesitate to review the patch -1 when you find problems!

Comment by Peter Jones [ 12/Sep/13 ]

I am delighted to note that this patch has landed for 2.5

Comment by James A Simmons [ 17/Sep/13 ]

Chris, since this patch has landed I'm seeing the following errors.

lustre-2.4.92-broke/ldiskfs/trace/events/ldiskfs.h: In function ‘ftrace_profile_enable_ldiskfs_free_inode’:
lustre-2.4.92-broke/ldiskfs/trace/events/ldiskfs.h:18: error: implicit declaration of function ‘register_trace_ldiskfs_free_inode’
lustre-2.4.92-broke/ldiskfs/trace/events/ldiskfs.h: In function ‘ftrace_profile_disable_ldiskfs_free_inode’:
lustre-2.4.92-broke/ldiskfs/trace/events/ldiskfs.h:18: error: implicit declaration of function ‘unregister_trace_ldiskfs_free_inode’
lustre-2.4.92-broke/ldiskfs/trace/events/ldiskfs.h: In function ‘ftrace_profile_enable_ldiskfs_request_inode’

I have looked over the code in detail to see what is wrong but can't figure it out. Do you have any ideas?

Comment by Minh Diep [ 24/Sep/13 ]

One minor fix added: http://review.whamcloud.com/#/c/7639/

Comment by Christopher Morrone [ 24/Sep/13 ]

Chris, since this patch has landed I'm seeing the following errors.

James, could you start a new ticket for that? I'll forget about this in no time if it is on a closed ticket.

Include which kernel you are using. Maybe configuration options too. I am not seeing that error, and I don't have any ideas off the top of my head.

Comment by Minh Diep [ 26/Sep/13 ]

http://review.whamcloud.com/#/c/7639/ landed in 2.5.0

Comment by Minh Diep [ 02/Dec/13 ]

James,

When you saw the ftrace compile issue above, were you using any external OFED?

Comment by James A Simmons [ 02/Dec/13 ]

Yes, I was. Have you seen this when using an external OFED stack?

Comment by James A Simmons [ 09/Dec/13 ]

Minh, I looked at LU-4314, which appears to be the same error I got. I will try the patch there to see if it addresses my issue.

Comment by James A Simmons [ 09/Dec/13 ]

Just verified it is the LU-4314 issue with parallel builds. If I change the make flag options it builds. Will move discussion over there.

Comment by Minh Diep [ 09/Dec/13 ]

James, the patch to fix the ldiskfs issue is in LU-4266

Generated at Sat Feb 10 01:34:07 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.