[LU-8795] The user cannot access lustre even if they successfully authenticate by kinit Created: 03/Nov/16  Updated: 08/Nov/16  Resolved: 07/Nov/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.9.0
Fix Version/s: Lustre 2.9.0

Type: Bug Priority: Minor
Reporter: sebg-crd-pm (Inactive) Assignee: Jeremy Filizetti
Resolution: Fixed Votes: 0
Labels: None
Environment:

Centos7.2 3.10.0-327.el7.x86_64
Lustre 2.8.55_19_ga84250b


Issue Links:
Related
is related to LU-8813 Kerberos: sanity and sanity-krb5 test... Resolved
Epic/Theme: kerberos
Severity: 3

 Description   

I have found a problem with Kerberos in Lustre, and I do not know whether it is a configuration error or a bug. When Kerberos is enabled on all servers (MGS, MDS, and OSS) and clients mount Lustre with the krb5 option, root can access the Lustre file system. However, normal users cannot access Lustre even after they have authenticated through Kerberos (kinit). The following messages are logged when a normal user tries to access Lustre:
lgss_keyring: [21277]:TRACE:main(): start parsing parameters
lgss_keyring: [21277]:INFO:main(): key 698610133, desc 1002@2f, ugid 1002:1002, sring 44699816, coinfo 47:krb5:1002:1002::
lgss_keyring: [21277]:ERROR:parse_callout_info(): short of components
lgss_keyring: [21277]:ERROR:main(): can't extract callout info: 47:krb5:1002:1002::
kernel: LustreError: 21275:0:(gss_keyring.c:846:gss_sec_lookup_ctx_kr()) failed request key: -126
kernel: LustreError: 21275:0:(sec.c:452:sptlrpc_req_get_ctx()) req ffff881f4f703f00: fail to get context
kernel: LustreError: 21275:0:(file.c:3332:ll_inode_revalidate_fini()) hpcfs: revalidate FID [0x200000bd0:0x1:0x0] error: rc = -111
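
For reference, the two negative return codes in the kernel messages above are Linux errno values. A minimal sketch for decoding them follows. It assumes the generic Linux errno numbering, and the table is hard-coded for illustration rather than read from Lustre or from the running system:

```python
# Linux errno values for the two codes in the log above (hard-coded;
# assumes the generic Linux numbering, not queried from the host).
LINUX_ERRNO = {
    111: ("ECONNREFUSED", "Connection refused"),
    126: ("ENOKEY", "Required key not available"),
}

def decode_rc(rc):
    """Return 'NAME (description)' for a negative kernel return code."""
    name, text = LINUX_ERRNO.get(-rc, ("UNKNOWN", "not in table"))
    return f"{name} ({text})"

for rc in (-126, -111):
    print(rc, decode_rc(rc))
```

The -126 (ENOKEY) result is consistent with the lgss_keyring upcall exiting on the parse error before it could instantiate the requested key.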



 Comments   
Comment by Andreas Dilger [ 03/Nov/16 ]

The 2.8.55 build of Lustre is a development version of the Lustre master branch, but we are happy that you are testing it. There have been recent changes to the GSSAPI code, which is closely tied to Kerberos.

If you have been testing the master release frequently with Kerberos, do you know when this functionality was last working? That would help isolate the change(s) that are the source of the problem.

Comment by sebg-crd-pm (Inactive) [ 03/Nov/16 ]

2.8.55 is the first version of Lustre I have used, so I have not tested any earlier master builds. I will download the latest version and try again.

Comment by sebg-crd-pm (Inactive) [ 03/Nov/16 ]

I have tried Lustre 2.8.60. The problem still exists.

Comment by Peter Jones [ 03/Nov/16 ]

How about checking back to 2.8.50? This tag is functionally equivalent to the community 2.8 release and so this will give us an indication as to whether this has never worked or got broken during the 2.9 development cycle.

Comment by Oleg Drokin [ 03/Nov/16 ]

I think Peter meant 2.8.50 which is equivalent to 2.8.0, because 2.7.50 is 2.7.0.

Comment by Peter Jones [ 03/Nov/16 ]

Confirmed. Sorry for any confusion caused.

Comment by sebg-crd-pm (Inactive) [ 04/Nov/16 ]

I found that this works in Lustre 2.8.0: users can access the Lustre file system after they run kinit.
Here is the log.
Nov 4 13:44:04 Blustre1 lgss_keyring: [28653]:TRACE:main(): start parsing parameters
Nov 4 13:44:04 Blustre1 lgss_keyring: [28653]:INFO:main(): key 605755931, desc 1000@8, ugid 1000:1000, sring 380314743, coinfo 8:krb5:1000:1000::1:0x9000000000000:hpcfs-MDT0000-mdc-ffff881987e72800:0x9000000000000
Nov 4 13:44:04 Blustre1 lgss_keyring: [28653]:TRACE:parse_callout_info(): components: 8,krb5,1000,1000,,1,0x9000000000000,hpcfs-MDT0000-mdc-ffff881987e72800,0x9000000000000
Nov 4 13:44:04 Blustre1 lgss_keyring: [28653]:DEBUG:parse_callout_info(): parse call out info: secid 8, mech krb5, ugid 1000:1000, is_root 0, is_mdt 0, is_ost 0, svc 1, nid 0x9000000000000, tgt hpcfs-MDT0000-mdc-ffff881987e72800, self nid 0x9000000000000
Nov 4 13:44:04 Blustre1 lgss_keyring: [28653]:TRACE:main(): parsing parameters OK
Nov 4 13:44:04 Blustre1 lgss_keyring: [28653]:TRACE:lgss_mech_initialize(): initialize mech krb5
Nov 4 13:44:04 Blustre1 lgss_keyring: [28653]:TRACE:lgss_create_cred(): create a krb5 cred at 0x23e1340
Nov 4 13:44:04 Blustre1 lgss_keyring: [28653]:TRACE:lgss_prepare_cred(): preparing krb5 cred 0x23e1340
Nov 4 13:44:04 Blustre1 lgss_keyring: [28653]:DEBUG:lkrb5_prepare_user_cred(): using krb5 cache name: FILE:/tmp/krb5cc_1000
Nov 4 13:44:04 Blustre1 lgss_keyring: [28653]:DEBUG:lgss_krb5_set_ccache_name(): set cc: FILE:/tmp/krb5cc_1000
Nov 4 13:44:04 Blustre1 lgss_keyring: [28653]:TRACE:main(): instantiated kernel key 241b1a1b
Nov 4 13:44:04 Blustre1 lgss_keyring: [28653]:TRACE:main(): forked child 28654
Nov 4 13:44:04 Blustre1 lgss_keyring: [28654]:TRACE:lgssc_kr_negotiate(): child start on behalf of key 241b1a1b: cred 0x23e1340, uid 1000, svc 1, nid 9000000000000, uids: 1000:1000/1000:1000
Nov 4 13:44:04 Blustre1 lgss_keyring: [28654]:DEBUG:lolnd_nid2hostname(): LOLND: addr 0x0 => Blustre1
Nov 4 13:44:04 Blustre1 lgss_keyring: [28654]:DEBUG:lgss_get_service_str(): constructed service string: lustre_mds@Blustre1
Nov 4 13:44:04 Blustre1 lgss_keyring: [28654]:TRACE:lgss_using_cred(): using krb5 cred 0x23e1340
Nov 4 13:44:04 Blustre1 lgss_keyring: [28654]:TRACE:lgssc_negotiation(): start gss negotiation
Nov 4 13:44:04 Blustre1 lgss_keyring: [28654]:TRACE:do_nego_rpc(): start negotiation rpc
Nov 4 13:44:04 Blustre1 lgss_keyring: [28654]:TRACE:do_nego_rpc(): to open /proc/fs/lustre/sptlrpc/gss/init_channel
Nov 4 13:44:04 Blustre1 lgss_keyring: [28654]:TRACE:do_nego_rpc(): to down-write
Nov 4 13:44:04 Blustre1 lgss_keyring: [28654]:TRACE:do_nego_rpc(): do_nego_rpc: to parse reply
Nov 4 13:44:04 Blustre1 lgss_keyring: [28654]:DEBUG:do_nego_rpc(): do_nego_rpc: receive handle len 8, token len 156, res 0
Nov 4 13:44:04 Blustre1 lgss_keyring: [28654]:DEBUG:lgssc_negotiation(): successfully negotiated a context
Nov 4 13:44:04 Blustre1 lgss_keyring: [28654]:DEBUG:serialize_krb5_ctx(): lucid version!
Nov 4 13:44:04 Blustre1 lgss_keyring: [28654]:DEBUG:prepare_krb5_rfc4121_buffer(): protocol 1
Nov 4 13:44:04 Blustre1 lgss_keyring: [28654]:DEBUG:prepare_krb5_rfc4121_buffer(): serializing 3 keys with enctype 18 and size 32
Nov 4 13:44:04 Blustre1 lgss_keyring: [28654]:TRACE:update_kernel_key(): updating kernel key 241b1a1b
Nov 4 13:44:04 Blustre1 lgss_keyring: [28654]:DEBUG:update_kernel_key(): key 241b1a1b: updated
Nov 4 13:44:04 Blustre1 lgss_keyring: [28654]:INFO:lgssc_kr_negotiate(): key 241b1a1b for user 1000 is updated OK!
Nov 4 13:44:04 Blustre1 lgss_keyring: [28654]:TRACE:lgss_release_cred(): releasing krb5 cred 0x23e1340
Nov 4 13:44:04 Blustre1 kernel: Lustre: 28195:0:(sec_gss.c:2088:gss_svc_handle_init()) create svc ctx ffff881fce617a40: accept user 1000 from 0@lo
Nov 4 13:44:04 Blustre1 kernel: Lustre: 28654:0:(sec_gss.c:399:gss_cli_ctx_uptodate()) client refreshed ctx ffff881363c76780 idx 0xd2a00e187c2d0251 (1000->hpcfs-MDT0000_UUID), expiry 1478324471(+86227s)
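
The difference between the two logs is visible in the coinfo string: the working 2.8.0 run passes nine colon-separated components, while the failing run stops after the (empty) fifth one. The sketch below splits both strings the same way. The field names are taken from the DEBUG line in the 2.8.0 log, except "flags", which is an assumed name for the empty fifth component. The required count of nine is also an assumption based on that log, not on the lgss_keyring source:

```python
# Field names follow the DEBUG line of the working 2.8.0 log; "flags" and
# the required count of nine are assumptions, not taken from Lustre source.
FIELDS = ("secid", "mech", "uid", "gid", "flags",
          "svc", "nid", "tgt", "self_nid")

def parse_coinfo(coinfo):
    """Split a coinfo string on ':' and map it onto FIELDS.

    Raises ValueError when fewer than len(FIELDS) components are present.
    """
    parts = coinfo.split(":")
    if len(parts) < len(FIELDS):
        raise ValueError(
            f"short of components: got {len(parts)}, need {len(FIELDS)}")
    return dict(zip(FIELDS, parts))

# coinfo strings copied from the two logs in this ticket
good = ("8:krb5:1000:1000::1:0x9000000000000:"
        "hpcfs-MDT0000-mdc-ffff881987e72800:0x9000000000000")
bad = "47:krb5:1002:1002::"

print(parse_coinfo(good)["tgt"])
try:
    parse_coinfo(bad)
except ValueError as err:
    print(err)
```

With the failing string, the split yields only six components, so svc, nid, tgt, and self nid are all missing.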

Comment by Peter Jones [ 04/Nov/16 ]

Are you able to assist in further narrowing down when this regression was introduced between the 2.8.50 and 2.8.55 tags?
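
Narrowing a regression between a known-good and a known-bad tag is a binary search. A minimal sketch of the idea follows. The intermediate tag names and the pretend test results are hypothetical placeholders for illustration only and say nothing about where this regression was actually introduced:

```python
def first_bad(versions, is_bad):
    """Return the first version for which is_bad() is True.

    Assumes versions are ordered oldest to newest, the first one is good,
    the last one is bad, and the failure persists once introduced.
    """
    lo, hi = 0, len(versions) - 1   # versions[lo] is good, versions[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid
        else:
            lo = mid
    return versions[hi]

# Hypothetical tag list and test outcome, for illustration only.
tags = ["2.8.50", "2.8.51", "2.8.52", "2.8.53", "2.8.54", "2.8.55"]
pretend_bad = {"2.8.53", "2.8.54", "2.8.55"}
print(first_bad(tags, lambda t: t in pretend_bad))
```

In practice, is_bad would build and mount the given version and check whether a non-root user can access the file system after kinit.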

Comment by Peter Jones [ 04/Nov/16 ]

Jeremy

Do you have any suggestions here? Could this have been related to any of the SSK changes? http://review.whamcloud.com/#/c/16728 perhaps?

Peter

Comment by Gerrit Updater [ 05/Nov/16 ]

Jeremy Filizetti (jeremy.filizetti@gmail.com) uploaded a new patch: http://review.whamcloud.com/23600
Subject: LU-8795 gss: Prevent callout truncation with non-root users
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 35d09c71e7e8ce98561eb92e6f07d6d8dd166120

Comment by Jeremy Filizetti [ 05/Nov/16 ]

It looks like this is due to the SK changes for non-root users. sebg-crd-pm, can you test the patch below to see whether it fixes your issue:

http://review.whamcloud.com/23600

Comment by sebg-crd-pm (Inactive) [ 07/Nov/16 ]

I have tried the patch and it works now. Thanks, everyone.

Comment by Gerrit Updater [ 07/Nov/16 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/23600/
Subject: LU-8795 gss: Prevent callout truncation with non-root users
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: cc5601dfbe58ee8b0a024e2f9448a6a4f53c02a8

Comment by Peter Jones [ 07/Nov/16 ]

Landed for 2.9
