[LU-10531] GSS, Shared Key and Kerberos support broken in master and lustre 2.10 Created: 18/Jan/18 Updated: 09/Feb/18 Resolved: 06/Feb/18 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.11.0, Lustre 2.10.2 |
| Fix Version/s: | Lustre 2.11.0, Lustre 2.10.4 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Sebastien Buisson (Inactive) | Assignee: | James A Simmons |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | gss, kerberos, patch | ||
| Severity: | 3 | ||||||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||||||
| Description |
|
GSS, Shared Key and Kerberos support is currently broken in the master branch. It is impossible to set any flavor for sptlrpc, whether it is gssnull, ski or krb5{n,a,i,p}. For instance, when running 'lctl conf_param lustre.srpc.flavor.default=krb5n' or 'lctl set_param -P lustre.srpc.flavor.default=krb5n', the command returns no error, but the value is never applied. The commit introducing this regression is the following, and aims at making 'lctl set_param -P' functional: As mentioned in this patch's comment, "currently virtual attributes failover.nid, sptlrpc, and quota While I understand that 'lctl set_param -P' needs more work before it works for sptlrpc, the patch should not break the existing 'lctl conf_param' functionality for sptlrpc. |
| Comments |
| Comment by Peter Jones [ 18/Jan/18 ] |
|
James, it looks like we should revert this change. Peter |
| Comment by James A Simmons [ 18/Jan/18 ] |
|
Reverting will not help, since in my own testing sptlrpc was already broken before this patch landed. |
| Comment by Peter Jones [ 18/Jan/18 ] |
|
OK, thanks James. We'll hold off on the revert then. |
| Comment by Sebastien Buisson (Inactive) [ 19/Jan/18 ] |
|
Hi James, thanks for looking into this. What do you mean by "sptlrpc was broken before patch https://review.whamcloud.com/28590, and in 2.10"? While trying to narrow down the problem hit with 'lctl conf_param lustre.srpc.flavor.default=krb5n', I found it worked for every code version I was able to compile, except when patch https://review.whamcloud.com/28590 was in the pile. By working I mean issuing the command and then seeing that the given value had been taken into account under /proc/fs/lustre///srpc_info on the clients and servers. |
| Comment by Gerrit Updater [ 19/Jan/18 ] |
|
James Simmons (uja.ornl@yahoo.com) uploaded a new patch: https://review.whamcloud.com/30937 |
| Comment by James A Simmons [ 19/Jan/18 ] |
|
Back to the normal failures: [ 445.426749] LustreError: 14736:0:(gss_keyring.c:1423:gss_kt_update()) negotiation: rpc err 0, gss err d0000 |
| Comment by James A Simmons [ 19/Jan/18 ] |
|
Sebastien, I posted the errors I'm seeing in the previous comment. For some reason the client can mount, but no one can access the file system. The patch is ready for review. |
| Comment by Andreas Dilger [ 19/Jan/18 ] |
|
It seems that we are still not getting regular enough testing of the Kerberos and SSK functionality to avoid regressions, and playing catch-up with regressions added a long time ago is a lot more work than finding recent regressions or preventing them in the first place. Nathan, Chris, Sebastien, is it possible for you guys to start running automated regression tests with SSK/Kerberos configured on a regular basis (e.g. daily against master, and as often as time permits against new patches)? That would prevent these kinds of problems from being introduced in the first place. |
| Comment by Gerrit Updater [ 23/Jan/18 ] |
|
Sebastien Buisson (sbuisson@ddn.com) uploaded a new patch: https://review.whamcloud.com/30984 |
| Comment by Sebastien Buisson (Inactive) [ 23/Jan/18 ] |
|
Thanks to the patch https://review.whamcloud.com/30937 from James, I am again able to set the sptlrpc flavor with 'lctl conf_param' commands, although it still does not work with 'lctl set_param -P'. For DNE setups, however, the new patch I just pushed in https://review.whamcloud.com/30984 is also mandatory. It fixes another regression in Kerberos support, inadvertently introduced by patch https://review.whamcloud.com/27823. I agree we need more regular testing of GSS/SSK/Kerberos functionality. Manual testing of single patches cannot cover all cases all the time. At DDN we already have some resources for Lustre non-regression tests. I will see if it is possible to dedicate part of them to continuous Kerberos testing. I should get back to you on this matter in a couple of weeks. |
| Comment by Gerrit Updater [ 31/Jan/18 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/30937/ |
| Comment by Gerrit Updater [ 06/Feb/18 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/30984/ |
| Comment by Peter Jones [ 06/Feb/18 ] |
|
Landed for 2.11 |
| Comment by Gerrit Updater [ 07/Feb/18 ] |
|
Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/31208 |
| Comment by Gerrit Updater [ 07/Feb/18 ] |
|
Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/31209 |
| Comment by Gerrit Updater [ 09/Feb/18 ] |
|
John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/31208/ |
| Comment by Gerrit Updater [ 09/Feb/18 ] |
|
John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/31209/ |