[LU-6490] builds on 3.12 fail in gss Created: 23/Apr/15  Updated: 01/Jul/16  Resolved: 05/Aug/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: Lustre 2.8.0

Type: Bug Priority: Major
Reporter: Bob Glossman (Inactive) Assignee: Bob Glossman (Inactive)
Resolution: Fixed Votes: 0
Labels: patch
Environment:

sles12


Issue Links:
Related
is related to LU-6356 Kerberos revival Resolved
is related to LU-6020 Bugfixes for GSS/Kerberos Resolved
is related to LU-6237 EL7 libgssglue or libgssapi are not f... Resolved
is related to LU-6215 Sync Lustre external tree with lustre... Resolved
is related to LU-5609 support for 3.17 kernel Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

The build fails in the gss code. This was not noticed until very recently because builds are commonly done in environments without Kerberos support or with --disable-gss. Lustre configured with --disable-gss builds and runs fine.

example errors:

  CC [M]  /home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.o
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c: In function ‘request_key_unlink’:
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:121:37: error: ‘const struct cred’ has no member named ‘tgcred’
 #define key_tgcred(tsk) ((tsk)->cred->tgcred)
                                     ^
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:649:18: note: in expansion of macro ‘key_tgcred’
   ring = key_get(key_tgcred(tsk)->process_keyring);
                  ^
In file included from include/linux/srcu.h:33:0,
                 from include/linux/notifier.h:15,
                 from include/linux/memory_hotplug.h:6,
                 from include/linux/mmzone.h:830,
                 from include/linux/gfp.h:4,
                 from include/linux/kmod.h:22,
                 from include/linux/module.h:13,
                 from /home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:43:
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:121:37: error: ‘const struct cred’ has no member named ‘tgcred’
 #define key_tgcred(tsk) ((tsk)->cred->tgcred)
                                     ^
include/linux/rcupdate.h:521:11: note: in definition of macro ‘__rcu_dereference_check’
   typeof(*p) *_________p1 = (typeof(*p)*__force )ACCESS_ONCE(p); \
           ^
include/linux/rcupdate.h:709:28: note: in expansion of macro ‘rcu_dereference_check’
 #define rcu_dereference(p) rcu_dereference_check(p, 0)
                            ^
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:654:18: note: in expansion of macro ‘rcu_dereference’
   ring = key_get(rcu_dereference(key_tgcred(tsk)
                  ^
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:654:34: note: in expansion of macro ‘key_tgcred’
   ring = key_get(rcu_dereference(key_tgcred(tsk)
                                  ^
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:121:37: error: ‘const struct cred’ has no member named ‘tgcred’
 #define key_tgcred(tsk) ((tsk)->cred->tgcred)
                                     ^
include/linux/rcupdate.h:521:38: note: in definition of macro ‘__rcu_dereference_check’
   typeof(*p) *_________p1 = (typeof(*p)*__force )ACCESS_ONCE(p); \
                                      ^
include/linux/rcupdate.h:709:28: note: in expansion of macro ‘rcu_dereference_check’
 #define rcu_dereference(p) rcu_dereference_check(p, 0)
                            ^
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:654:18: note: in expansion of macro ‘rcu_dereference’
   ring = key_get(rcu_dereference(key_tgcred(tsk)
                  ^
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:654:34: note: in expansion of macro ‘key_tgcred’
   ring = key_get(rcu_dereference(key_tgcred(tsk)
                                  ^
In file included from include/linux/init.h:4:0,
                 from /home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:42:
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:121:37: error: ‘const struct cred’ has no member named ‘tgcred’
 #define key_tgcred(tsk) ((tsk)->cred->tgcred)
                                     ^
include/linux/compiler.h:365:43: note: in definition of macro ‘ACCESS_ONCE’
 #define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))
                                           ^
include/linux/rcupdate.h:613:2: note: in expansion of macro ‘__rcu_dereference_check’
  __rcu_dereference_check((p), rcu_read_lock_held() || (c), __rcu)
  ^
include/linux/rcupdate.h:709:28: note: in expansion of macro ‘rcu_dereference_check’
 #define rcu_dereference(p) rcu_dereference_check(p, 0)
                            ^
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:654:18: note: in expansion of macro ‘rcu_dereference’
   ring = key_get(rcu_dereference(key_tgcred(tsk)
                  ^
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:654:34: note: in expansion of macro ‘key_tgcred’
   ring = key_get(rcu_dereference(key_tgcred(tsk)
                                  ^
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:121:37: error: ‘const struct cred’ has no member named ‘tgcred’
 #define key_tgcred(tsk) ((tsk)->cred->tgcred)
                                     ^
include/linux/compiler.h:365:50: note: in definition of macro ‘ACCESS_ONCE’
 #define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))
                                                  ^
include/linux/rcupdate.h:613:2: note: in expansion of macro ‘__rcu_dereference_check’
  __rcu_dereference_check((p), rcu_read_lock_held() || (c), __rcu)
  ^
include/linux/rcupdate.h:709:28: note: in expansion of macro ‘rcu_dereference_check’
 #define rcu_dereference(p) rcu_dereference_check(p, 0)
                            ^
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:654:18: note: in expansion of macro ‘rcu_dereference’
   ring = key_get(rcu_dereference(key_tgcred(tsk)
                  ^
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:654:34: note: in expansion of macro ‘key_tgcred’
   ring = key_get(rcu_dereference(key_tgcred(tsk)
                                  ^
In file included from include/linux/srcu.h:33:0,
                 from include/linux/notifier.h:15,
                 from include/linux/memory_hotplug.h:6,
                 from include/linux/mmzone.h:830,
                 from include/linux/gfp.h:4,
                 from include/linux/kmod.h:22,
                 from include/linux/module.h:13,
                 from /home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:43:
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:121:37: error: ‘const struct cred’ has no member named ‘tgcred’
 #define key_tgcred(tsk) ((tsk)->cred->tgcred)
                                     ^
include/linux/rcupdate.h:526:13: note: in definition of macro ‘__rcu_dereference_check’
   ((typeof(*p) __force __kernel *)(_________p1)); \
             ^
include/linux/rcupdate.h:709:28: note: in expansion of macro ‘rcu_dereference_check’
 #define rcu_dereference(p) rcu_dereference_check(p, 0)
                            ^
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:654:18: note: in expansion of macro ‘rcu_dereference’
   ring = key_get(rcu_dereference(key_tgcred(tsk)
                  ^
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:654:34: note: in expansion of macro ‘key_tgcred’
   ring = key_get(rcu_dereference(key_tgcred(tsk)
                                  ^
In file included from include/linux/init.h:4:0,
                 from /home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:42:
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c: In function ‘gss_kt_instantiate’:
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:121:37: error: ‘const struct cred’ has no member named ‘tgcred’
 #define key_tgcred(tsk) ((tsk)->cred->tgcred)
                                     ^
include/linux/compiler.h:153:42: note: in definition of macro ‘unlikely’
 # define unlikely(x) __builtin_expect(!!(x), 0)
                                          ^
/home/bogl/lustre-release/libcfs/include/libcfs/libcfs_private.h:99:23: note: in expansion of macro ‘LASSERTF’
 #define LASSERT(cond) LASSERTF(cond, "\n")
                       ^
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:1244:2: note: in expansion of macro ‘LASSERT’
  LASSERT(key_tgcred(current)->session_keyring);
  ^
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:1244:10: note: in expansion of macro ‘key_tgcred’
  LASSERT(key_tgcred(current)->session_keyring);
          ^
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:121:37: error: ‘const struct cred’ has no member named ‘tgcred’
 #define key_tgcred(tsk) ((tsk)->cred->tgcred)
                                     ^
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:1247:16: note: in expansion of macro ‘key_tgcred’
  rc = key_link(key_tgcred(current)->session_keyring, key);
                ^
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:121:37: error: ‘const struct cred’ has no member named ‘tgcred’
 #define key_tgcred(tsk) ((tsk)->cred->tgcred)
                                     ^
/home/bogl/lustre-release/libcfs/include/libcfs/libcfs_debug.h:231:55: note: in expansion of macro ‘key_tgcred’
                 libcfs_debug_msg(&msgdata, format, ## __VA_ARGS__);     \
                                                       ^
/home/bogl/lustre-release/libcfs/include/libcfs/libcfs_debug.h:241:9: note: in expansion of macro ‘__CDEBUG’
         __CDEBUG(&cdls, mask, format, ## __VA_ARGS__);\
         ^
/home/bogl/lustre-release/libcfs/include/libcfs/libcfs_debug.h:271:37: note: in expansion of macro ‘CDEBUG_LIMIT’
 #define CERROR(format, ...)         CDEBUG_LIMIT(D_ERROR, format, ## __VA_ARGS__)
                                     ^
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:1250:3: note: in expansion of macro ‘CERROR’
   CERROR("failed to link key %08x to keyring %08x: %d\n",
   ^
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c: At top level:
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:1409:9: error: initialization from incompatible pointer type [-Werror]
         .instantiate    = gss_kt_instantiate,
         ^
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:1409:9: error: (near initialization for ‘gss_key_type.instantiate’) [-Werror]
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:1410:9: error: initialization from incompatible pointer type [-Werror]
         .update         = gss_kt_update,
         ^
/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.c:1410:9: error: (near initialization for ‘gss_key_type.update’) [-Werror]
cc1: all warnings being treated as errors
scripts/Makefile.build:324: recipe for target '/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.o' failed
make[7]: *** [/home/bogl/lustre-release/lustre/ptlrpc/gss/gss_keyring.o] Error 1
scripts/Makefile.build:485: recipe for target '/home/bogl/lustre-release/lustre/ptlrpc/gss' failed
make[6]: *** [/home/bogl/lustre-release/lustre/ptlrpc/gss] Error 2
scripts/Makefile.build:485: recipe for target '/home/bogl/lustre-release/lustre/ptlrpc' failed
make[5]: *** [/home/bogl/lustre-release/lustre/ptlrpc] Error 2
scripts/Makefile.build:485: recipe for target '/home/bogl/lustre-release/lustre' failed
make[4]: *** [/home/bogl/lustre-release/lustre] Error 2
Makefile:1287: recipe for target '_module_/home/bogl/lustre-release' failed
make[3]: *** [_module_/home/bogl/lustre-release] Error 2
make[3]: Leaving directory '/home/bogl/linux-3.12.39-47'
autoMakefile:1018: recipe for target 'modules' failed
make[2]: *** [modules] Error 2
make[2]: Leaving directory '/home/bogl/lustre-release'
autoMakefile:565: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/home/bogl/lustre-release'
autoMakefile:466: recipe for target 'all' failed
make: *** [all] Error 2

We need autoconf and code changes to adapt to the different gss data structures and APIs seen in sles12.
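The `tgcred` errors above arise because this kernel's `struct cred` no longer has a `tgcred` member; in newer kernels the session and process keyrings are members of `struct cred` itself. A minimal userspace sketch of a compatibility macro follows. It assumes a configure-defined symbol named `HAVE_CRED_TGCRED` (an assumed name) and uses mock stand-ins for the kernel types; `key_cred()` and `task_session_keyring()` are hypothetical helpers for illustration, not the code from the landed patches.

```c
#include <stddef.h>

/* Mock stand-in for the kernel's struct key (real one is in <linux/key.h>). */
struct key { int serial; };

#ifdef HAVE_CRED_TGCRED		/* assumed configure result, name is hypothetical */
/* Older layout: keyrings hang off cred->tgcred. */
struct thread_group_cred {
	struct key *session_keyring;
	struct key *process_keyring;
};
struct cred { struct thread_group_cred *tgcred; };
#define key_cred(tsk) ((tsk)->cred->tgcred)
#else
/* Newer layout: keyrings are members of struct cred itself. */
struct cred {
	struct key *session_keyring;
	struct key *process_keyring;
};
#define key_cred(tsk) ((tsk)->cred)
#endif

/* Mock task structure holding only the credentials pointer. */
struct task_struct { const struct cred *cred; };

/* Callers go through key_cred() and never name tgcred directly. */
static struct key *task_session_keyring(const struct task_struct *tsk)
{
	return key_cred(tsk)->session_keyring;
}
```

With a wrapper of this kind, call sites such as the `LASSERT()` and `key_link()` lines in the log would compile against either layout, with the choice made once at configure time.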



 Comments   
Comment by James A Simmons [ 23/Apr/15 ]

This appears to be missing in newer upstream kernel versions as well. Does RHEL7 have this same issue?

Comment by Bob Glossman (Inactive) [ 23/Apr/15 ]

no, not an issue in el7

Comment by Bob Glossman (Inactive) [ 23/Apr/15 ]

it may be a non issue on el7 only because there's no libgssglue or libgssapi there. lustre build won't build gss due to missing libs on el7.

Comment by Andreas Dilger [ 24/Apr/15 ]

Sebastien, it looks like there may be some issues with GSS + RHEL 7 (see earlier comments in this bug). Is this addressed by your patches under LU-6356?

Comment by James A Simmons [ 27/Apr/15 ]

Looking at newer kernels, besides the removal of tgcred, struct key_type has also changed.
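The "incompatible pointer type" errors for `.instantiate` and `.update` in the description are consistent with this: newer kernels pass a `struct key_preparsed_payload *` to these callbacks instead of a raw `(data, datalen)` pair. A rough userspace sketch of one way to bridge both signatures follows. The configure symbol `HAVE_KEY_PAYLOAD_PREPARSE`, the `sketch_*` and `kt_instantiate_*` names, and the mock types are all assumptions made for illustration; this is not the code from the landed patches.

```c
#include <stddef.h>

/* Mock of the kernel's struct key, reduced to one field. */
struct key { size_t datalen; };

/* Mock of the newer kernels' preparsed payload (only the fields used here). */
struct key_preparsed_payload {
	const void *data;
	size_t datalen;
};

/* Shared body, independent of the callback signature the kernel expects. */
static int sketch_instantiate_common(struct key *key, const void *data,
				     size_t datalen)
{
	if (data == NULL && datalen != 0)
		return -1;	/* inconsistent payload */
	key->datalen = datalen;
	return 0;
}

/* Newer signature: payload arrives wrapped in a preparsed structure. */
static int kt_instantiate_prep(struct key *key,
			       struct key_preparsed_payload *prep)
{
	return sketch_instantiate_common(key, prep->data, prep->datalen);
}

/* Older signature: payload arrives as a raw pointer and length. */
static int kt_instantiate_raw(struct key *key, const void *data,
			      size_t datalen)
{
	return sketch_instantiate_common(key, data, datalen);
}

/* Pick the variant that matches the kernel being built against. */
#ifdef HAVE_KEY_PAYLOAD_PREPARSE	/* assumed configure result */
#define sketch_kt_instantiate kt_instantiate_prep
#else
#define sketch_kt_instantiate kt_instantiate_raw
#endif
```

Keeping the logic in one shared function means only the thin wrappers differ between kernel versions, so the `struct key_type` initializer can reference a single name.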

Comment by Sebastien Buisson (Inactive) [ 30/Apr/15 ]

Hi,

None of the patches under LU-6356 address the build issue on RHEL7 or newer kernels.
However we have opened a separate ticket to report the build problem of lack of libgssglue or libgssapi on RHEL7: LU-6237.

Cheers,
Sebastien.

Comment by Gerrit Updater [ 18/Jun/15 ]

Sebastien Buisson (sebastien.buisson@bull.net) uploaded a new patch: http://review.whamcloud.com/15342
Subject: LU-6490 gss: rhel7 adjustments for gssapi code
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 1d31032524e59dea8ed7500a65d8252a29d40526

Comment by Sebastien Buisson (Inactive) [ 18/Jun/15 ]

Hi,

I have made a patch to address the various changes around gssapi in rhel7.

Could you please have a look and review it?

Thanks,
Sebastien.

Comment by Peter Jones [ 18/Jun/15 ]

Bob

Could you please take care of this patch?

Thanks

Peter

Comment by Bob Glossman (Inactive) [ 19/Jun/15 ]

I see the following build fail on sles12 with this proposed mod:

make[3]: Entering directory '/home/bogl/lustre-release/lustre/utils'
Making all in gss
make[4]: Entering directory '/home/bogl/lustre-release/lustre/utils/gss'
gcc -DHAVE_CONFIG_H -I. -I../../..   -include
/home/bogl/lustre-release/config.h -I/home/bogl/lustre-release/libcfs/include
-I/home/bogl/lustre-release/lnet/include
-I/home/bogl/lustre-release/lustre/include  -fPIC -D_LARGEFILE64_SOURCE=1
-D_FILE_OFFSET_BITS=64 -DLUSTRE_UTILS=1 -g -O2 -Werror  -g -O2 -Werror -MT
lsvcgssd-context.o -MD -MP -MF .deps/lsvcgssd-context.Tpo -c -o
lsvcgssd-context.o `test -f 'context.c' || echo './'`context.c
context.c:35:27: fatal error: gssapi/gssapi.h: No such file or directory
 #include <gssapi/gssapi.h>
                           ^
compilation terminated.
Makefile:969: recipe for target 'lsvcgssd-context.o' failed
make[4]: *** [lsvcgssd-context.o] Error 1
make[4]: Leaving directory '/home/bogl/lustre-release/lustre/utils/gss'
Makefile:1277: recipe for target 'all-recursive' failed
make[3]: *** [all-recursive] Error 1
make[3]: Leaving directory '/home/bogl/lustre-release/lustre/utils'
autoMakefile:488: recipe for target 'all-recursive' failed
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory '/home/bogl/lustre-release/lustre'
autoMakefile:564: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/home/bogl/lustre-release'
autoMakefile:463: recipe for target 'all' failed
make: *** [all] Error 2

The #include file /usr/include/gssglue/gssapi/gssapi.h does exist. It's not clear to me why the build isn't finding it and using it.

Comment by Bob Glossman (Inactive) [ 19/Jun/15 ]

The previous build fail was due to not having krb5-devel installed on sles12. Once I install that the build is OK.

I think eliminating some of the autoconf checks for dependencies may be wrong or have gone too far. It would be better to detect lack of needed #includes and refuse to build gss as it used to.

Comment by Sebastien Buisson (Inactive) [ 19/Jun/15 ]

As krb5-devel installs libgssapi_krb5 which provides gss_krb5_export_lucid_sec_context() that is tested in a config check, how come gss build can be enabled without this rpm?

Comment by Gerrit Updater [ 19/Jun/15 ]

Bob Glossman (bob.glossman@intel.com) uploaded a new patch: http://review.whamcloud.com/15354
Subject: LU-6490 build: enable gss build on sles12
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 586b0e9486e8304b51bcd1cf4051a303f121630f

Comment by Bob Glossman (Inactive) [ 19/Jun/15 ]

I think the check for gss_krb5_export_lucid_sec_context() is one of the checks eliminated in your mod. The lack of that check is what now allows the gss build to be enabled without krb5-devel (I think). It would be good to have some replacement check that turns off gss when trying to build without krb5-devel present.

As long as we equip our standard, regular build environments with all the needed rpms I don't think this is a critical issue. However, it is a small regression and may unexpectedly bite some users who do their own builds.

Comment by Sebastien Buisson (Inactive) [ 22/Jun/15 ]

To me the check for gss_krb5_export_lucid_sec_context() is still there in lustre/autoconf/kerberos5.m4, which is part of AC_KERBEROS_V5 checks, called from LC_CONFIG_GSS in lustre/autoconf/lustre-core.m4.
But the check for gss_krb5_export_lucid_sec_context() looks into libgssapi_krb5 for the binary symbols, not for the headers brought by krb5-devel. However, in AC_KERBEROS_V5 check (lustre/autoconf/kerberos5.m4), there is a test on $dir/include/gssapi/gssapi_krb5.h existence.
So we should not end up building with gssapi support in an environment where libs are installed but not the corresponding headers.

Comment by James A Simmons [ 23/Jul/15 ]

Any updates on a possible solution?

Comment by Sebastien Buisson (Inactive) [ 24/Jul/15 ]

Hi,

As I explained in a comment in http://review.whamcloud.com/15342, in both rhel6 and rhel7, /usr/include/gssapi/gssapi.h is brought by the same rpm as /usr/include/gssapi/gssapi_krb5.h, which is krb5-devel rpm. Given that gssapi/gssapi_krb5.h is tested in AC_KERBEROS_V5 check (lustre/autoconf/kerberos5.m4), we cannot end up building with GSS support when gssapi/gssapi.h is missing. In my build environment, the absence of gssapi/gssapi_krb5.h disables GSS support, both in rhel6 and rhel7.

I agree autoconf/configure is not robust to the situation where gssapi/gssapi_krb5.h is there but gssapi/gssapi.h is not, but rpm packaging may be blamed in that case.

What do you think?

Comment by James A Simmons [ 24/Jul/15 ]

Sebastien which rpms did you install for RHEL6.X and RHEL7?

Comment by Oleg Drokin [ 24/Jul/15 ]

I do not have krb5-devel installed, but I do have krb5-libs

So the autoconf test seems to have a bug (excerpt from config.log):

configure:18332: checking for Kerberos v5
configure:18421: result: 
configure:18435: checking for gss_krb5_export_lucid_sec_context in -l
configure:18470: gcc -o conftest -g -O2 -I/home/green/git/lustre-release/libcfs/include -I/home/green/git/lustre-release/lnet/include -I/home/green/git/lustre-release/lustre/include   conftest.c -l   >&5
gcc: argument to '-l' is missing

Comment by James A Simmons [ 24/Jul/15 ]

Testing with Oleg's setup, I see the following in config.log:

configure:18580: checking for gss_krb5_export_lucid_sec_context in -l
configure:18615: gcc -o conftest -g -O2 -I/tmp/lustre-2.7.56/libcfs/include -I/tmp/lustre-2.7.56/lnet/include -I/tmp/lustre-2.7.56/lustre/include conftest.c -l -lkeyutils >&5
/usr/bin/ld: cannot find -l-lkeyutils
collect2: ld returned 1 exit status

Oleg, do you have the keyutils packages installed as well?

Comment by Sebastien Buisson (Inactive) [ 27/Jul/15 ]

Hi,

For both RHEL6 and RHEL7, I have the same rpms installed:
krb5-devel
krb5-libs
krb5-workstation
pam_krb5

The issue met by Oleg is due to a lack of checking of $KRBDIR in kerberos5.m4 and lustre-core.m4. I have updated my patch at http://review.whamcloud.com/15342 to fix this.

Sebastien.

Comment by James A Simmons [ 28/Jul/15 ]

We will need one more kernel patch to support Ubuntu15.

Comment by Gerrit Updater [ 30/Jul/15 ]

James Simmons (uja.ornl@yahoo.com) uploaded a new patch: http://review.whamcloud.com/15804
Subject: LU-6490 gss: handle key_type match replacement
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 8d5eef7ff61cf619b1881161c02c88e0138cfb81

Comment by Gerrit Updater [ 03/Aug/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/15342/
Subject: LU-6490 gss: 3.1x kernels adjustments for gssapi code
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 37e738fbde6220164da7b9c2097065eb323e2da7

Comment by Gerrit Updater [ 03/Aug/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/15354/
Subject: LU-6490 build: enable gss build on sles12
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: d3341dff726d8781f16e21f7d403c2ec11f8113d

Comment by James A Simmons [ 03/Aug/15 ]

One patch left to enable Ubuntu15 support.

Comment by Gerrit Updater [ 05/Aug/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/15804/
Subject: LU-6490 gss: handle struct key_type match replacement
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 50c79ee142be4bad9e63d603dda0a8d80ac40444

Comment by Peter Jones [ 05/Aug/15 ]

Landed for 2.8
