[LU-14680] remove strict dependency between user tools and kernel RPMs Created: 11/May/21  Updated: 19/Nov/21

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor
Reporter: Andreas Dilger Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Related
is related to LU-13903 Make "configure --disable-modules" mo... Resolved
is related to LU-12511 Prepare lustre for adoption into the ... Open
is related to LU-9680 Improve the user land to kernel space... In Progress

 Description   

There is currently a strong linkage between the userspace tools and the kernel modules, in lustre.spec.in:

Requires: %{requires_kmod_name} = %{requires_kmod_version}

and this evaluates to a very specific kernel module package version, e.g.:

kmod-lustre-client = 2.14.0_7_gf7512cc

This dates back to commit 1.6.0-2717-gc7496eccfc, but the commit message offers little explanation beyond "Tighten up some dependencies.", and the referenced bug (b=13908) is no longer available:

commit c7496eccfc6a4bc804bd999343a2d289d5e9eeaf
Author:     brian <brian>
AuthorDate: Wed Apr 22 17:55:29 2009 +0000
Commit:     brian <brian>
CommitDate: Wed Apr 22 17:55:29 2009 +0000

    b=13908
    i=yibin.wang
    i=sheng.yang
    
    Allow the name, version, kernel release and release of the lustre packages
    to be defined on the command line.
    With such a feature, actually properly name the patchless client packages in
    our own build.
    Tighten up some dependencies.

Such a strong binding between the user tools and the kernel modules is not really needed and could be relaxed significantly. Sticking within the same major release (e.g. ">= 2.14.0" in this case) would almost certainly be enough, with the caveat that not having new tools may limit access to some functionality that is available in the kernel (e.g. missing an "lfs" or "lctl" command to enable something, or an llapi_* function that was added later).

In theory, newer tools could also handle significantly older kernel module versions to some extent, but would typically be lacking in features from a newer release. In that regard, it doesn't really make sense for the user tools to require a newer version, but rather the kernel modules should require at least a minimum version of the tools in order to enable the new functionality. That said, maintaining some level of coherency between the tools and kernel modules makes sense, maybe within a few major release versions (+/- 0.3) to allow old interfaces to be slowly deprecated as needed.
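
As a rough illustration of the two relaxed policies discussed above (same release series, or a window of a few releases), the following Python sketch compares a tools version against a kernel module version. It is not part of lustre.spec.in or any Lustre tool. The function names, the reading of "+/- 0.3" as three steps in the second version component, and the handling of suffixes such as "_7_gf7512cc" are all assumptions made for the example.

```python
import re


def parse_release(version):
    """Return (major, minor, patch) from a string such as '2.14.0_7_gf7512cc'.

    Anything after the third numeric component (build count, git hash)
    is ignored. This parsing rule is an assumption for illustration.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def same_release(tools, kmod):
    """Hypothetical policy 1: tools and modules share the same x.y series."""
    return parse_release(tools)[:2] == parse_release(kmod)[:2]


def within_window(tools, kmod, window=3):
    """Hypothetical policy 2: same first component, and the second
    component differs by at most `window` (one reading of '+/- 0.3')."""
    t = parse_release(tools)
    k = parse_release(kmod)
    return t[0] == k[0] and abs(t[1] - k[1]) <= window


if __name__ == "__main__":
    tools = "2.14.0_7_gf7512cc"
    for kmod in ("2.14.0", "2.12.5", "2.10.8"):
        print(kmod, same_release(tools, kmod), within_window(tools, kmod))
```

Under these assumptions, a 2.14.0_7_gf7512cc tools package would accept 2.14.x modules under the first policy, and modules from 2.11 through 2.17 under the second.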



 Comments   
Comment by Andreas Dilger [ 11/May/21 ]

James, in patch https://review.whamcloud.com/38649 "LU-12511 build: ignore kmod handling in spec file for utilities only build" you removed the linkage between kernel modules and user tools completely. In my (limited) testing, this has been fine, but you have been testing with a wider range of modules and tools due to the kernel client.

Any thoughts on how tightly these versions should be coupled?

Comment by James A Simmons [ 08/Jul/21 ]

Sorry, I didn't see this earlier. Yes, I have tested over a range of versions. The Lustre utilities work pretty well across a range of versions. For LNet the opposite is true: the ioctls differ not only between versions but even between tags. Never run an lnetctl that is not matched to the tip of master, or you will crash your nodes.

Comment by Andreas Dilger [ 08/Jul/21 ]

Why does lnetctl have compatibility problems? That shouldn't happen, but I've never looked into that.

Comment by James A Simmons [ 18/Nov/21 ]

Is any more work needed for this?

Comment by Andreas Dilger [ 19/Nov/21 ]

James, it seems the fix only applies when the tools are built separately from the rest of the code. If they are built together, there is still a hard linkage between the tools and the kernel modules, which I think is wrong. It should at least be relaxed to allow any version within the same major release.

Separately, it would be good to fix the lnetctl incompatibilities, so that it doesn't fail (or worse, crash the node) when not built for the same version.
