IU Shared Secret Key authentication and encryption
(LU-3289)
|
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.1 |
| Fix Version/s: | Lustre 2.5.0 |
| Type: | Technical task | Priority: | Blocker |
| Reporter: | Andrew Korty (Inactive) | Assignee: | Minh Diep |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | SSK, patch | ||
| Environment: |
All |
||
| Issue Links: |
|
||||||||||||
| Rank (Obsolete): | 8774 | ||||||||||||
| Comments |
| Comment by Andrew Korty (Inactive) [ 20/Jun/13 ] |
|
--enable-gss is not passed to configure on Gerritt test nodes, so GSSAPI code never gets built or tested. Also, commits to other parts of Lustre that break GSSAPI support but test fine without --enable-gss never get noticed. |
| Comment by Andrew Korty (Inactive) [ 21/Jun/13 ] |
| Comment by Keith Mannthey (Inactive) [ 21/Jun/13 ] |
|
If something like this lands everyone who builds will have to update their build servers for an optional part of the code. Not that it is a major issue but it is far reaching. I wonder if there is a --build-everything setting that would help catch all these sorts of issues for than just GSSAAPI. Is there a GSSAPI Lustre test? |
| Comment by Oleg Drokin [ 24/Jun/13 ] |
|
I think just enabling gss by default is not all that great of an ideea. Instead, Chris, can you please enable the option on our review builds, buto nly after we install necessary dependencies so that it actually builds. |
| Comment by Andrew Korty (Inactive) [ 09/Jul/13 ] |
|
I've changed this patch. Now, the default behavior is to enable GSS only if its prerequisite libraries are installed. Hence, --enable-gss enables GSSAPI or bust; --disable-gss disables it always; and the lack of either option conditionally enables GSSAPI, depending on whether the GSSAPI and Kerberos libraries are installed. |
| Comment by Jodi Levi (Inactive) [ 05/Sep/13 ] |
|
Patch landed to Master. Please let me know if any additional work is needed and I will reopen this ticket. |
| Comment by Patrick Farrell (Inactive) [ 09/Sep/13 ] |
|
This patch introduces a minor regression related to libkeyutils: As Andrew says, the default is to enable GSS only if its prerequisite libraries are installed, but this has a flaw as it relates to libkeyutils and LC_CONFIG_GSS_KEYRING. That message is coming from '# LC_CONFIG_GSS_KEYRING (default enabled, if gss is enabled)' in the lustre-core autoconf file. In AC_DEFUN([LC_CONFIG_GSS],
[AC_MSG_CHECKING([whether to enable gss/krb5 support])
AC_ARG_ENABLE([gss],
[AC_HELP_STRING([--enable-gss], [enable gss/krb5 support])],
[],[enable_gss='auto'])
AC_MSG_RESULT([$enable_gss])
if test x$enable_gss != xno; then
LC_CONFIG_GSS_KEYRING
sunrpc_required=$enable_gss
LC_CONFIG_SUNRPC
sunrpc_required=no
[......]
if test x$KRBDIR != x; then
AC_CHECK_LIB([gssapi], [gss_export_lucid_sec_context],
[GSSAPI_LIBS="$GSSAPI_LDFLAGS -lgssapi"],
[AC_CHECK_LIB([gssglue], [gss_export_lucid_sec_context],
[GSSAPI_LIBS="$GSSAPI_LDFLAGS -lgssglue"],
[if test x$enable_gss == xyes; then
AC_MSG_ERROR([libgssapi or libgssglue is not found, which is required by GSS.])
fi])],)
AC_SUBST(GSSAPI_LIBS)
fi
fi
I'm not sure what the best solution here is, there's a bit of an interdependence and I'm not really familiar with autoconf files. For the moment, we're working around this by removing |
| Comment by Andrew Korty (Inactive) [ 09/Sep/13 ] |
|
Thanks for the report, Patrick. I think it makes sense to disable GSS non-fatally if the GSS keyring library is not available. Thoughts? |
| Comment by Patrick Farrell (Inactive) [ 10/Sep/13 ] |
|
Andrew, I'd agree with that - I would've put a suggested patch together if I were a bit more familiar with autoconf. |
| Comment by Andreas Dilger [ 10/Sep/13 ] |
|
Reopening this bug until configure is changed to not require libkeyring. |
| Comment by Andreas Dilger [ 10/Sep/13 ] |
|
There was also agreement from the OpenSFS PAC that this should be a non-fatal error if GSS support was not explicitly requested. We need to get a patch ASAP, since this is likely breaking the build for most external users of Lustre. We didn't notice it internally since we installed these libraries explicitly on all of our build nodes to verify this patch was working. |
| Comment by Patrick Farrell (Inactive) [ 11/Sep/13 ] |
|
I spent a few minutes trying to generate a patch for this one, but I'm a bit puzzled here. Two things. Two, a problem: |
| Comment by Minh Diep [ 11/Sep/13 ] |
|
As for HAVE_GSS, I would move it to after checking the libraries of LC_CONFIG_GSS |
| Comment by Minh Diep [ 11/Sep/13 ] |
|
correct me if I am wrong, could this be solved with a simple enable_gss='no' instead of 'auto'? |
| Comment by Patrick Farrell (Inactive) [ 11/Sep/13 ] |
|
Minh: Agreed, it can be made conditional on successfully finding the libraries. I think that's the solution. What's preventing me from offering a patch is the first question about the necessity of GSS_KEYRING for GSS. |
| Comment by Patrick Farrell (Inactive) [ 11/Sep/13 ] |
|
About 'no' instead of 'auto': Yes, that would avoid this problem, but that's basically just reverting this patch completely. The point of the original patch was to turn on GSS by default so it would be tested. |
| Comment by Minh Diep [ 11/Sep/13 ] |
|
I am working on a patch for this now. Do we need kernel sunrpc support for using GSS? Currently it's not if enable_gss is set to auto |
| Comment by Minh Diep [ 11/Sep/13 ] |
|
patch for master http://review.whamcloud.com/#/c/7622/ |
| Comment by Patrick Farrell (Inactive) [ 13/Sep/13 ] |
|
I was hoping to offer a new version of Minh's patch, but I can't because I still don't know if GSS_KEYRING is strictly required for GSS. So is it possible/recommended to allow GSS to be enabled when GSS keyring is not? |
| Comment by Andreas Dilger [ 13/Sep/13 ] |
|
Andrew or Ken, could you please comment on this? |
| Comment by Ken Hornstein [ 13/Sep/13 ] |
|
So, I could be wrong ... but it looks like the following things are true: The GSS_KEYRING autoconf macro ends up doing two things: defining the HAVE_GSS_KEYRING preprocessor symbol, and setting the GSS_KEYRING Automake conditional. No code uses the HAVE_GSS_KEYRING preprocessor symbol. GSS_KEYRING controls whether or not gss_keyring.c is compiled. Given all that ... I would say, in answer to Patrick's question, it IS possible to allow GSS to be enabled when GSS keyring is not enabled. Recommended, maybe not, but I'd vote for not requiring the dependency on keyring to make things easier for everyone. |
| Comment by Minh Diep [ 13/Sep/13 ] |
|
Patrick, to avoid duplicate works, please go ahead and update the patch. Thank you very much. I will review and make sure it gets tested |
| Comment by Patrick Farrell (Inactive) [ 13/Sep/13 ] |
|
Ken, Thanks, that matches my understanding and fits with the autoconf file as written (Specifically the fact that it's possible to enable/disable gss_keyring separately from gss). Minh, Thanks - I'll have it up for review in a little bit. |
| Comment by Andrew Korty (Inactive) [ 13/Sep/13 ] |
|
I agree with Ken: GSS_KEYRING need not be a prerequisite to GSS. I also think Patrick has something about HAVE_GSS getting set spuriously although I'm sure I tested for that case. At any rate, it would make more sense for HAVE_GSS to reflect the situation accurately. |
| Comment by Patrick Farrell (Inactive) [ 14/Sep/13 ] |
|
Sorry about the delay on this, I ran in to some unexpected issues on my build system. It turns out the 'auto' from the original patch is causing the GSS dependencies to be checked, but it's not actually causing GSS to be built. I figured that out because my build system couldn't actually build GSS [A separate issue with my kernel source], and I couldn't understand why make rpms succeeded with gss not enabled or disabled (So, using 'auto'), but failed with gss enabled. This is because enable_gss is used like this in the 'configure' file that autoconf generates: The simple solution is to start with enable_gss to auto as we do now, then if the various tests pass, set it to 'yes' before leaving the function. I've tested doing this and it causes gss to be built by default. Once I've sorted out my other build issues and can completely build GSS without errors, I'll be able to test to my satisfaction and put up a patch for this and the earlier problem. Andrew: In any case, you're right that it shouldn't cause a problem - HAVE_GSS isn't actually used anywhere else, as far as I can tell. (Though I'm hardly an autoconf expert.)
|
| Comment by Patrick Farrell (Inactive) [ 15/Sep/13 ] |
|
OK, a bit of further information. I've been unable to build GSS with GSS_KEYRING enabled. I originally thought that was because of a problem with my kernel source, but I've now re-downloaded and rebuilt that twice now. The tests for building GSS_KEYRING are passing, but the code itself is not building; I'm getting errors indicating that some of the definitions it depends on aren't available. In particular, these appear to be those things which are defined if CONFIG_KEYS is set in the kernel. I've looked at the .config file being used to build the kernel, and I've confirmed CONFIG_KEYS is definitely set (It's the Lustre provided .config file.). (I'm following the WC build instructions from here: https://wiki.hpdd.intel.com/pages/viewpage.action?pageId=8126821) Since the 'auto' setting wasn't actually causing GSS to be built, I suspect that GSS_KEYRING actually hasn't been built lately and is linked incorrectly. I suspect if you tried to build it (--enable-gss is sufficient with current master) you'd find these same failures: I will put up a patch to fix the autoconf behavior, but I don't currently see how to fix the problem with building GSS_KEYRING. That means build systems without libkeyutils installed will gracefully choose not to build GSS_KEYRING, but those with libkeyutils installed will pass the autoconf tests and then fail to compile. |
| Comment by Patrick Farrell (Inactive) [ 16/Sep/13 ] |
|
Patch up at http://review.whamcloud.com/#/c/7622/ |
| Comment by Patrick Farrell (Inactive) [ 16/Sep/13 ] |
|
As expected, with that patch, the tests for GSS keyring are passing and it is failing to build: |
| Comment by Minh Diep [ 16/Sep/13 ] |
|
just a notice checking whether to enable gss/krb5 support... auto If we leave gss and gss keyring on auto and have keyutils..., but not Kerberos, should we issue a warning? |
| Comment by Minh Diep [ 17/Sep/13 ] |
|
I updated the patch and now it only failed on sles http://build.whamcloud.com/job/lustre-reviews/18238/ |
| Comment by Patrick Farrell (Inactive) [ 17/Sep/13 ] |
|
Nice catch, Minh, that does fix most of the build failures. The remaining ones are unusual issues unique to SLES11 and Ubuntu 10.04. I'll leave those to you. |
| Comment by Minh Diep [ 17/Sep/13 ] |
|
The reason it failed in sles11sp1 but not sles11sp2 is sles11sp1: sles11sp2: I believe the logic in the patch AC_CHECK_LIB([gssapi], [gss_export_lucid_sec_context], do we need both libgssapi and libgssglue to be yes or both to be no? |
| Comment by Patrick Farrell (Inactive) [ 17/Sep/13 ] |
|
Minh, Not in general. This is the output for el6 inkernel from the same build (http://build.whamcloud.com/job/lustre-reviews/18238/arch=x86_64,build_type=server,distro=el6,ib_stack=inkernel/consoleFull): So that same situation works for el6. Maybe that's not true for sles11sp1. |
| Comment by Ken Hornstein [ 17/Sep/13 ] |
|
Some versions of the GSS library don't provide gss_export_lucid_sec_context(), depending on the vintage. I'm actually in that situation, for a long, complicated, and stupid reason. I suspect that -lgssapi (typically provided by a Kerberos implementation) shipped with SLES11SP1 is one of those vintages. |
| Comment by Patrick Farrell (Inactive) [ 17/Sep/13 ] |
|
Here's the specific failure for SLES11SP1: A bit of looking at the source for SLES11SP2 and CentOS vs SLES11SP1 shows that function is defined in SLES11SP2 and CentOS, but it's not found in SLES11SP1. So I don't think there's an easy solution here if we actually want this to work on SLES11SP1, especially not if it's supposed to work on patchless clients. — Ken, It looks like you're right. Still, that function is found in lgssglue in SLES11SP1 and CentOS, so we're OK there. |
| Comment by Andreas Dilger [ 17/Sep/13 ] |
|
I don't think we are supporting SLES11 SP1 going forward on master, so this may be a non-issue. We need to change out builders to take this into account. |
| Comment by Minh Diep [ 18/Sep/13 ] |
|
Patrick, Ken Do you think 'checking for krb5int_derive_key in -lgssapi_krb5... no' must be yes to be able to build krb? |
| Comment by Patrick Farrell (Inactive) [ 18/Sep/13 ] |
|
Minh, I see if we have krb5int_derive_key, it will build correctly, because it's used instead of krb5_derive_key. But some platforms have krb5_derive_key, and in that case, they don't need krb5int_derive_key in the libraries. Is it possible that krb5int_derive_key is in one of the two possible GSS libraries and not the other? Maybe we need to require that library if you don't have krb5_derive_key in your kernel? (The SLES11SP1 case.) Andreas, It seems to me the problem with ignoring SLES11 because it's not supported going forward is that GSS_KEYRING won't build with at least b2_4 and possibly earlier versions as well. If it's supported there but can't actually be built... |
| Comment by Minh Diep [ 18/Sep/13 ] |
|
I'd say if we can't fine both krb5int_derive_key and krb5_derive_key, we will issue a warning and set HAVE_KRB5 to 0 |
| Comment by Thomas Stibor [ 19/Sep/13 ] |
|
The function name krb5_derive_key was renamed in MIT-Kerberos >= 1.8.X. Before it was (<= 1.7.X): thomas@lxdv65:~/tmp/krb/krb5-1.7.2>grep -r "krb5int_derive_key"
thomas@lxdv65:~/tmp/krb/krb5-1.7.2>grep -r "krb5_derive_key"
src/lib/crypto/vectors.c: r = krb5_derive_key (enc, in, out, usage);
src/lib/crypto/combine_keys.c: * DK is defined as the key derivation function (krb5_derive_key())
src/lib/crypto/combine_keys.c: if ((ret = krb5_derive_key(enc, &tkey, outkey, &input))) {
src/lib/crypto/aes/aes_s2k.c: err = krb5_derive_key (enc, key, key, &usage);
src/lib/crypto/libk5crypto.exports:krb5_derive_key
src/lib/crypto/dk/dk_encrypt.c: if ((ret = krb5_derive_key(enc, key, &ke, &d1)))
...
With version >= 1.8.X it changed to: thomas@lxdv65:~/tmp/krb/krb5-1.8.6>grep -r "krb5_derive_key" doc/CHANGES:Rename some lingering krb5_derive_key references. thomas@lxdv65:~/tmp/krb/krb5-1.8.6>grep -r "krb5int_derive_key" README:6629 krb5int_derive_key results in cache with uninitialized values doc/CHANGES: subject: krb5int_derive_key results in cache with uninitialized values doc/CHANGES: krb5int_derive_key creates a temporary keyblock to add to the derived cache. src/lib/crypto/crypto_tests/vectors.c: r = krb5int_derive_key (enc, in, out, usage); src/lib/crypto/krb/combine_keys.c: * DK is defined as the key derivation function (krb5int_derive_key()) src/lib/crypto/krb/combine_keys.c: ret = krb5int_derive_keyblock(enc, tkey, outkey, &input); ... The newer distributions usually ship MIT-Kerberos => 1.8.X, however I haven't looked into Heimdal. There it is probably different. |
| Comment by Thomas Stibor [ 19/Sep/13 ] |
|
Hello, I just discovered a problem with the commit 67b73336d5b3d631e6d9eb3809914b8b80825a24: diff --git a/lustre/utils/gss/svcgssd.c b/lustre/utils/gss/svcgssd.c
index cebd852..bd6a4de 100644
--- a/lustre/utils/gss/svcgssd.c
+++ b/lustre/utils/gss/svcgssd.c
@@ -148,10 +148,10 @@ mydaemon(int nochdir, int noclose)
static void
release_parent()
{
- int status;
+ int status, rc;
if (pipefds[1] > 0) {
- write(pipefds[1], &status, 1);
+ rc = write(pipefds[1], &status, 1);
close(pipefds[1]);
pipefds[1] = -1;
}
Since the code is compiled with gcc -Werror ... it results in: svcgssd.c: In function ‘release_parent’: svcgssd.c:151:14: error: variable ‘rc’ set but not used [-Werror=unused-but-set-variable] cc1: all warnings being treated as errors make[1]: *** [lsvcgssd-svcgssd.o] Error 1 |
| Comment by Patrick Farrell (Inactive) [ 19/Sep/13 ] |
|
Thomas, Definitely, that patch would give that error. If you look in Gerrit, that commit is patch set 4, and we're on to patch set 6. It actually appears that function is no longer being used, because we removed rc entirely and patch set 6 built successfully on all platforms. ('rc' was put there in patch set 4 to avoid a warning about ignoring a return value on write, but then hits the error you noted.) |
| Comment by Andreas Dilger [ 19/Sep/13 ] |
|
Making this a blocker for 2.5.0 until the configure/build problem is fixed for systems that do not have the gss libraries installed. |
| Comment by Peter Jones [ 24/Sep/13 ] |
|
Landed for 2.5.0 |