[LU-2473] ldiskfs RHEL6.4 support Created: 11/Dec/12  Updated: 09/Apr/13  Resolved: 09/Apr/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: Lustre 2.4.0

Type: Improvement Priority: Blocker
Reporter: Christopher Morrone Assignee: Yang Sheng
Resolution: Fixed Votes: 0
Labels: HB

Issue Links:
Related
is related to LU-2967 list_del corruption - client crashes Resolved
is related to LU-20 patchless server kernel Resolved
is related to LU-992 deprecate RHEL5 kernel support for Lu... Resolved
is related to LU-2846 Kernel update [RHEL6.4 2.6.32-358.2.1... Resolved
Rank (Obsolete): 5819

 Description   

RHEL 6.4 beta is now out, and we are beginning to assemble the next TOSS release at LLNL. I will attach patches to this ticket that add preliminary RHEL 6.4 support to ldiskfs only. In otherwords, I am updating ldiskfs/kernel_patches, not lustre/kernel_patches.



 Comments   
Comment by Christopher Morrone [ 11/Dec/12 ]

I pushed the following two commits:

http://review.whamcloud.com/4803
http://review.whamcloud.com/4804

The idea is to add rhel 6.4 support without destroying the rhel 6.3 patch series. A novel idea, but one that I think is worth pursuing.

We don't need to official "support" RHEL 6.3 once RHEL 6.4 is out of beta, but it would be nice to not intentionally destroy the patch stack for RHEL 6.3.

This new way of doing things allows folks like LLNL that start testing new RHEL releases early to use lustre, and on the other hand also give those slow to upgrade to 6.4 a way to continue to test newer lustre version on rhel 6.3.

NOTE: I only did a fairly naive contextual fixup of the RHEL6.4 patches in http://review.whamcloud.com/4804. They apply and build, but an ldiskfs expert needs to look them over and make sure that they are still sane.

Comment by Peter Jones [ 12/Dec/12 ]

Yangsheng

Could you please comment on this?

Thanks

Peter

Comment by Yang Sheng [ 12/Dec/12 ]

This really is a new idea. I don't think it is a best way to handle ldiskfs patches like that.

Comment by James A Simmons [ 12/Dec/12 ]

I used this approach to support SLES11 SP1. The SP1 support is basically RHEL6 plus one patch to handle the difference. Any fixes RHEL6 got SP1 also got. This makes maintaining much easier. So IMNSHO Chris approach is the way to go. I will be approaching LU-1812 the same. I have to say tho I really wish we had a ldiskfs core that would apply distro specific patches instead of the reverse of the way we do it now.

Comment by Christopher Morrone [ 12/Dec/12 ]

I am obviously in agreement with James.

Yang Sheng, if you have advice for a better way to do it, please elaborate.

I agree that this is a new idea for how to handle the patches. The way that the patches have been managed in ldiskfs in the past has been rather horrible for anyone outside of Sun/Oracle/Whamcloud/Intel.

Comment by Christopher Morrone [ 12/Dec/12 ]

I have to say tho I really wish we had a ldiskfs core that would apply distro specific patches instead of the reverse of the way we do it now.

James, if I understand you correctly, you are suggesting that the ldiskfs package should contain the full source code from ext4, converted into ldiskfs. Not just be patches that are applied on the fly to ext4.

At LLNL, I attempted to do that at one point. I kept the full source in a git repo. But even for a single distro, it was quite difficult to maintain. Partly that was because upstream wasn't on board with the approach. But also it was because ext4 is pretty tightly integrated with the kernel, and even seemingly minor kernel updates made large dangerous changes to ext4. Dangerous in the sense that if they were not applied to ldiskfs as well, the code would run but be subtly broken. I've seen instances of that in recent history.

I'm not entirely opposed to that idea, but it did prove harder than I originally thought it would be.

Comment by James A Simmons [ 12/Dec/12 ]

Yes that was what I was suggesting as a alternative. I see you have bleed trying that were I only pondering if it would be worth doing. I see the cost is to high to do that approach.

Comment by James A Simmons [ 20/Dec/12 ]

Didn't want to clutter the comments section of the patch inspection. Since the patch needs to be refreshed I like to suggest that the unknown directory be named shared. Also I think Andreas patch cleanup code from LU-20 will need to be updated to the new changes in the patch tree structure.

Comment by Christopher Morrone [ 20/Dec/12 ]

Yes, I can do that. I'm waiting for an LU-1199 commit to land (any month now...) before the rebase.

Oh actually, the first http://review.whamcloud.com/4803 can probably be rebased now instead of waiting. Its probably only http://review.whamcloud.com/4804 that has a dependency on both LU-20 and LU-1199.

I did originally name the directory "shared" instead of "unknown". Mainly I wanted to avoid people throwing everything in "shared" in the future. I'd like to see new patches appear in the directory for the system where they were first developed.

But it is not a strong opinion. If you like that better, I may change it back when I rebase.

Comment by Christopher Morrone [ 20/Dec/12 ]

Also, I wanted to avoid the implication that the patches in other directories are NOT shared, when in fact they probably will be shared by newer revisions of the OS.

Comment by James A Simmons [ 20/Dec/12 ]

You made valid points. Unknown is fine with me. You will need to rebased 4803 in the future again since it will collide with the patch from LU-992. Sigh, so many colliding patches.

Comment by Christopher Morrone [ 07/Jan/13 ]

I have rebased patches, but the second one has conflicts with both LU-992 and LU-1199. So I either push now, and refresh the LU-992 patch (which does not belong to me), or I need to wait until LU-992 lands. I'll wait for now.

Comment by Christopher Morrone [ 07/Jan/13 ]

I rebased 4803 onto the LU-992 patch. The 4804 patch will need to wait until LU-992 really lands, because it also depends on the recently landed LU-1199 patches.

Comment by Christopher Morrone [ 08/Jan/13 ]

LU-992 landed, so I rebased both patches (4803, 4804) onto master.

Comment by Christopher Morrone [ 09/Jan/13 ]

Just so everyone is clear, the 4803 patch can be reviewed and landed without considering whether the the new patches to support RHEL6.4 are correct in 4804.

The 4803 patch just lays the groundwork to support new kernels with ldiskfs without intentionally breaking previous kernel support.

Comment by Christopher Morrone [ 10/Jan/13 ]

In fact, I more like maintain a base ldiskfs patch group. And provide a patch that contain special change to distro. That is, this patch first change base patch group, and then apply to ext4 source. If so, we can reduce the patch number. But it obviously bring some complicated things special to review.

That is only a slight change from what I have. And in fact, your "base" set of patches will quickly diminish each time a new OS is supported. Your solution means that when a "base" patch no longer applies to a new kernel, you need to not only make ONE copy (which is my proposal does), you will need to make a copy into EVERY existing kernel subdirectory as well as the new kernel subdirectory. Because now it is no longer a universal "base".

How is that any better than my system?

Comment by Yang Sheng [ 10/Jan/13 ]

I think i haven't express clear enough. The 'base patch group' is not make change unless ldiskfs needed. We provide other patch contain the changes that resolved conflict when the 'base patch' apply to kernel-tree. This patch just contain distro special changes. There process just running on build system. That is, every distrio just need one patch and 'base patch'.

I have ever produce a patch for that. As you know, since some things occured interrupt it. Please refer to:
https://bugzilla.lustre.org/attachment.cgi?id=32635
https://bugzilla.lustre.org/show_bug.cgi?id=24037

Comment by Christopher Morrone [ 10/Jan/13 ]

I am still not understanding what you are proposing. Do you want to do this:

1) apply common "base" patches
2) apply distro-specific patches

or this:

1) apply distro-specific patches
2) apply common "base" patches

Either way, I don't see that working particularly well. In the first case (apply "base" before "distro-specific"), it won't work because many of the patches will not apply to everything. So you wind up having MORE patch duplication than in my proposal. Because the way I do things, some patches that are shared interleave with distro-specific patches. With this option, as soon as you hit a patch that doesn't apply cleanly to ALL versions of ext4, you'll need to split it out into many distro-specific patches. In my plan, you only need to split out one copy of the patch for that one new kernel you are adding.

In the second case (apply "distro-specific" before "base"), that just seems crazy. You would essentially be trying to turn all different versions of ext4 into a single homogenous source code base, in order to allow a "base" set of patches to apply cleanly. You would be better off just putting the source code in the the package directly.

Comment by Yang Sheng [ 10/Jan/13 ]

It just 2 steps:

1. apply distro-special patch to 'base patches' --> 'tmp patches'
2. apply 'tmp patches' to kernel-tree;

The 'tmp patches' just exist in build process.

Comment by Christopher Morrone [ 11/Jan/13 ]

I really, really don't like that idea. Trying to work with patches of patches would be quite a bit more difficult than what I am proposing. Also, I think that would be a great deal more difficult for gerrit reviewers to deal with.

We already have a situation where gerrit reviewers need to review patches to patches. Your suggestion would mean that reviewers need to look at patches of patches of patches. Reviewing other people's work is hard enough. I think that would add a level of difficulty that we really do not want.

Comment by Yang Sheng [ 11/Jan/13 ]

Indeed, so I won't push it landed. Since it really difficult.

Comment by Christopher Morrone [ 11/Jan/13 ]

Brian Behlendorf made a suggestion that sounded good to me. Andreas, Peter, perhaps this would address your concern about people accidentally using unsupported patch series:

What about having an ldiskfs configure option named something like "--enable-unsupported" that would need to be set before any of the unsupported patch series are applied.

In other words, by default, only the couple of officially "supported" series would automatically apply, and it would look to most people like what we have today. The additional unsupported series files are only recognized and applied by the build system when --enable-unsupported is used.

Comment by Christopher Morrone [ 11/Feb/13 ]

It would be really nice to get change 4803 reviewed and landed ASAP. A number of other patches other depend on, or will conflict with, that patch.

The LU-1812 is now directly basing their work on that patch, and I'll refresh the RHEL6.4 patch here when/if that lands.

Comment by Yang Sheng [ 28/Feb/13 ]

Patch landed. close bug.

Comment by James A Simmons [ 28/Feb/13 ]

Its not finished yet. Patch http://review.whamcloud.com/#change,4804 needs to be rebased and merge yet.

Comment by Christopher Morrone [ 28/Feb/13 ]

James is correct, the patch that adds the actual "ldiskfs RHEL6.4 support" has not landed (change 4804). Only the patch that reorganizes the ldiskfs patches into subdirectories has landed (change 4803).

I will try to get to refreshing change 4804 soon.

Comment by Christopher Morrone [ 16/Mar/13 ]

I updated the RHEL6.4 ldiskfs series in change 4804. The list of patches that needed to be changed for RHEL6.4 shrunk because I rebased this on change 5001, and because I reverted a small upstream style change in one of the early patches so later patches wouldn't need the adjustment for context.

I believe that I addressed the previous review comments as well.

Comment by Bob Glossman (Inactive) [ 20/Mar/13 ]

Although this change is only necessary for rhel/centos 6.4 I've applied and tried it in centos 6.3 and sles11sp2 as well. The new selection rules in ldiskfs-build.m4 appear to work correctly everywhere.

Comment by Christopher Morrone [ 21/Mar/13 ]

Awesome, thanks alot for checking that!

For everyone watching at home, change 4804 is dependent the following series of three patches, which are pretty darn close to being ready themselves:

http://review.whamcloud.com/5675
http://review.whamcloud.com/4991
http://review.whamcloud.com/5001

I believe that 4991 and 5001 are done. 5675 we have some mysterious failures, but the change seems pretty good to me. I'm guessing that either the failures are unrelated to the change, but perhaps we missed at spot in lustre where the new header needs to be included. We'll see how the latest rebase does in autotest.

Comment by Christopher Morrone [ 27/Mar/13 ]

James found the problem with change 5675, so I am hopeful that all four changes are in their final state for landing.

Comment by Christopher Morrone [ 02/Apr/13 ]

The prerequisites have all been merged. Now we just await the landing of 4804.

Comment by James A Simmons [ 03/Apr/13 ]

As a bonus LU-3071 get fixed automatically when you move to RHEL6.4

Comment by Peter Jones [ 03/Apr/13 ]

It is LU-2967 that we are most concerned about with RHEL6.4

Comment by Christopher Morrone [ 03/Apr/13 ]

As a bonus LU-3071 get fixed automatically when you move to RHEL6.4

I do not think that is the case. Our Lustre 2.1.4 systems are already using RHEL6.4.

Comment by Yang Sheng [ 09/Apr/13 ]

Patch landed, close this ticket.

Generated at Sat Feb 10 01:25:30 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.