Details
-
Bug
-
Resolution: Fixed
-
Critical
-
Lustre 2.9.0
-
3
-
9223372036854775807
Description
I’m having problems running some test suites with Kerberos enabled. For example, running sanity.sh on 2.8.60 + the patch http://review.whamcloud.com/#/c/23600/, and the test fails with
# ./auster -v -k sanity --only 0a Started at Tue Nov 8 14:39:14 UTC 2016 eagle-48vm6.eagle.hpdd.intel.com: Checking config lustre mounted on /lustre/scratch Checking servers environments Checking clients eagle-48vm6.eagle.hpdd.intel.com environments Logging to local directory: /tmp/test_logs/2016-11-08/143914 Client: Lustre version: 2.8.60_1_g35d09c7 MDS: Lustre version: 2.8.60_1_g35d09c7 OSS: Lustre version: 2.8.60_1_g35d09c7 running: sanity ONLY=0a run_suite sanity /usr/lib64/lustre/tests/sanity.sh -----============= acceptance-small: sanity ============----- Tue Nov 8 14:39:21 UTC 2016 Running: bash /usr/lib64/lustre/tests/sanity.sh eagle-48vm6.eagle.hpdd.intel.com: Checking config lustre mounted on /lustre/scratch Checking servers environments Checking clients eagle-48vm6.eagle.hpdd.intel.com environments Using TIMEOUT=20 disable quota as required osd-ldiskfs.track_declares_assert=1 osd-ldiskfs.track_declares_assert=1 debug=-1 running as uid/gid/euid/egid 500/500/500/500, groups: [touch] [/lustre/scratch/d0_runas_test/f7025] touch: cannot touch `/lustre/scratch/d0_runas_test/f7025': Permission denied sanity : @@@@@@ FAIL: unable to write to /lustre/scratch/d0_runas_test as UID 500. Please set RUNAS_ID to some UID which exists on MDS and client or add user 500:500 on these nodes. Trace dump: = /usr/lib64/lustre/tests/test-framework.sh:4841:error() = /usr/lib64/lustre/tests/test-framework.sh:5670:check_runas_id() = /usr/lib64/lustre/tests/sanity.sh:126:main() Dumping lctl log to /tmp/test_logs/2016-11-08/143914/sanity..*.1478615970.log eagle-48vm1: Host key verification failed. eagle-48vm1: rsync: connection unexpectedly closed (0 bytes received so far) [sender] eagle-48vm1: rsync error: error in rsync protocol data stream (code 12) at io.c(600) [sender=3.0.6] pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 12 eagle-48vm2: Host key verification failed. eagle-48vm2: rsync: connection unexpectedly closed (0 bytes received so far) [sender] eagle-48vm2: rsync error: error in rsync protocol data stream (code 12) at io.c(600) [sender=3.0.6] pdsh@eagle-48vm6: eagle-48vm2: ssh exited with exit code 12 sanity returned 0 Finished at Tue Nov 8 14:39:31 UTC 2016 in 17s ./auster: completed with rc 0
The code that is failing in sanity.sh is
# $RUNAS_ID may get set incorrectly somewhere else [ $UID -eq 0 -a $RUNAS_ID -eq 0 ] && error "\$RUNAS_ID set to 0, but \$UID is al so 0!" check_runas_id $RUNAS_ID $RUNAS_GID $RUNAS
UID/GID 500 belongs to sanityusr and requested a Kerberos ticket before running sanity.sh:
# su sanityusr bash-4.1$ klist Ticket cache: FILE:/tmp/krb5cc_500 Default principal: sanityusr@CO.CFS Valid starting Expires Service principal 11/08/16 14:38:48 11/09/16 14:38:48 krbtgt/CO.CFS@CO.CFS
Note that CO.CFS is the realm being used.
Since Lustre is not failing, it’s not surprising that there is nothing of interest in dmesg. For example, form the MGS/MDS:
Lustre: DEBUG MARKER: -----============= acceptance-small: sanity ============----- Tue Nov 8 14:39:21 UTC 2016 Lustre: DEBUG MARKER: Using TIMEOUT=20 Lustre: DEBUG MARKER: sanity : @@@@@@ FAIL: unable to write to /lustre/scratch/d0_runas_test as UID 500.
Logs for this run are at https://testing.hpdd.intel.com/test_sets/b87e2568-a5f8-11e6-964e-5254006e85c2.
More logs will be attached to this ticket.
The last time I tested Kerberos and the above tests ran was tag 2.8.54. Since that time some flags have been added to lsvcgssd. I tried to call lsvcgssd two different ways; the way things worked in 2.8.54 as ‘/usr/sbin/lsvcgssd’ on all Lustre servers and, the new recommended way, for the MGS/MDS, ‘/usr/sbin/lsvcgssd -m -g -k –vvv’ and, for the OSS, ‘/usr/sbin/lsvcgssd -o -k –vvv’ (verbosity is optional). All tests were run with RHEL 6.8 (for some reason, Maloo reports it as el6.7)
Confirmation that others are or are not experiencing this problem with these test suites and Kerberos would be helpful.
Attachments
Issue Links
Activity
Reopen to fix compatibility with svcgssd not being passed -k option for existing Kerberos configurations.
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/23667/
Subject: LU-8813 utils: l_getidentity compatibility
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: d385685a92668241d8c802f33f2e5497d9a7ea5a
James, please be aware of test interop issues when adding the new flags to the lsvcgssd startup. While I can understand that this may be necessary for SSK testing, it isn't clear to me what the need is for KRB testing, and this highlights that existing user deployments with Kerberos may start to fail in a similar manner, so I'm glad we caught this before release.
The addition of the "-k" option should not be required for Kerberos configurations, to maintain compatibility with existing environments, IMHO. My thought was to have the lack of "-k" set the krb_enabled if neither sk_enabled or null_enabled is no yet set, but print a warning like:
svcgssd: no "-k", "-s", or "-z" option given. Assume "-k" for compatibility reasons.
or similar to avoid confusing error messages.
I understand the remaining issues and they are minor issues in the sanity-krb5 test suite. Some tests in sanity-krb5 stop and start the lsvcgssd and restart with only the '-v' option. Once I added the correct flags to these calls for the MDS and for OSS, the full test suite runs with only one failure. I will upload a patch to modify the calls lsvcgssd in sanity-krb5.
Thank you Fan Yong and Andrew (and Andreas) for your quick reaction and comments. Your time spent and willingness to help is appreciated.
Thanks for the document. I'll review to see if I'm deviating from what it prescribes.
James,
/etc/request-key.d/lgssc.conf will likely work too.
Lustre utils (gssd and the upcall) should log something to /var/log/messages (or another syslogd log depending on its configuration) in case it helps.
Andrew - No, lgss_keyring is not in /etc/request-key.conf, but it is in /etc/request-key.d/lgssc.conf:
- grep lgss_keyring /etc/request-key.conf
[root@eagle-48vm1 ~]# more /etc/request-key.d/lgssc.conf
create lgssc * * /usr/sbin/lgss_keyring %o %k %t %d %c %u %g %T %P %S
I can move this line from /etc/request-key.d/lgssc.conf to /etc/request-key.conf and see if that changes things.
Also, yes, documentation would be very helpful.
James, does
mds # grep lgss_keyring /etc/request-key.conf
oss # grep lgss_keyring /etc/request-key.conf
show anything?
I can share an installation guide written for some older Lustre Kerberos/GSS, but it wasn't finished and verified.
I’m running with Fan Yong’s patch http://review.whamcloud.com/#/c/23667/2 and I can run sanity tests 0 and 6 with no problems. When I run sanity-krb5, test_1 hangs immediately. Here’s what I see from each node.
On the client console from the beginning of the sanity-krb5 suite:
-----============= acceptance-small: sanity-krb5 ============----- Thu Nov 10 16:43:20 UTC 2016 Running: bash /usr/lib64/lustre/tests/sanity-krb5.sh prepare krb5 cred RUNAS=runas -u 500 -g 500 running as uid/gid/euid/egid 500/500/500/500, groups: [krb5_login.sh] ***** refresh Kerberos V5 TGT for uid 500 ***** spawn /bin/bash export PS1="user@host $ " kinit [sanityusr@eagle-48vm6 tests]$ export PS1="user@host $ " user@host $ kinit Password for sanityusr@CO.CFS: user@host $ CRED=/tmp/krb5cc_500 Starting client eagle-48vm6.eagle.hpdd.intel.com: -o user_xattr,flock eagle-48vm1@tcp:/scratch /lustre/scratch2 Started clients eagle-48vm6.eagle.hpdd.intel.com: eagle-48vm1@tcp:/scratch on /lustre/scratch2 type lustre (rw,user_xattr,flock) eagle-48vm6.eagle.hpdd.intel.com: Checking config lustre mounted on /lustre/scratch Checking servers environments Checking clients eagle-48vm6.eagle.hpdd.intel.com environments Using TIMEOUT=20 disable quota as required osd-ldiskfs.track_declares_assert=1 osd-ldiskfs.track_declares_assert=1 Starting gss daemon on mds: eagle-48vm1 Starting gss daemon on ost: eagle-48vm2 setting all flavor to null pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 2 already have total 8 null connections running as uid/gid/euid/egid 500/500/500/500, groups: [krb5_login.sh] running as uid/gid/euid/egid 500/500/500/500, groups: [touch] [/lustre/scratch/d0_runas_test/f3531] excepting tests: skipping tests SLOW=no: 100 101 == sanity-krb5 test 0: start multiple gss daemons == 16:43:38 (1478796218) bring up gss daemons... Starting gss daemon on mds: eagle-48vm1 Starting gss daemon on ost: eagle-48vm2 check with someone already running... 1 instance(s) of lsvcgssd are running check with someone run & finished... pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 1 1 instance(s) of lsvcgssd are running check refresh... pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 1 1 instance(s) of lsvcgssd are running Resetting fail_loc on all nodes...done. 16:44:24 (1478796264) waiting for eagle-48vm1 network 5 secs ... 16:44:24 (1478796264) network interface is UP 16:44:24 (1478796264) waiting for eagle-48vm2 network 5 secs ... 16:44:24 (1478796264) network interface is UP 16:44:25 (1478796265) waiting for eagle-48vm6.eagle.hpdd.intel.com network 5 secs ... 16:44:25 (1478796265) network interface is UP PASS 0 (47s) setting all flavor to krb5p pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 2 found 0 krb5p out of total 8 connections restoring to default flavor... checking all2all...pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 2 found 8/8 null connections now at default flavor settings Setting sptlrpc rule: scratch.srpc.flavor.default=krb5p checking all2all...pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 2 found 0/8 krb5p connections checking all2all...pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 2 found 0/8 krb5p connections checking all2all...pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 2 found 0/8 krb5p connections checking all2all...pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 2 found 0/8 krb5p connections checking all2all...pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 2 found 8/8 krb5p connections == sanity-krb5 test 1: access with or without krb5 credential == 16:44:54 (1478796294)
On the client, dmesg:
Lustre: DEBUG MARKER: == sanity-krb5 test 1: access with or without krb5 credential == 16:44:54 (1478796294) Lustre: 7212:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796291/real 1478796291] req@ffff88007a903980 x1550626731373984/t0(0) o801->scratch-MDT0000-mdc-ffff88007adbb000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796298 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 7221:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 Lustre: 7221:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88007afbbb00(0->scratch-MDT0000_UUID) get expired: 1478796331(+33s) Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007cece140: destroy ctx ffff88007a380180(0->scratch-MDT0000_UUID) Lustre: 7221:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message Lustre: 7514:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796298/real 1478796298] req@ffff88007a903980 x1550626731374176/t0(0) o801->scratch-MDT0000-mdc-ffff88007adbb000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796305 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 7517:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 7517:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message Lustre: 7517:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88007afbbec0(0->scratch-MDT0000_UUID) get expired: 1478796338(+33s) Lustre: 23442:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007afbb980: destroy ctx ffff88007afbbec0(0->scratch-MDT0000_UUID) Lustre: 23442:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message Lustre: 7514:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 2 previous similar messages Lustre: 7521:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796305/real 1478796305] req@ffff88007a8636c0 x1550626731374288/t0(0) o801->scratch-MDT0000-mdc-ffff880037ad4400@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796312 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 7521:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 7521:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message Lustre: 7521:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88007afbbb00(0->scratch-MDT0000_UUID) get expired: 1478796345(+33s) Lustre: 7521:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message Lustre: 23442:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007afbb980: destroy ctx ffff88007afbbb00(0->scratch-MDT0000_UUID) Lustre: 23442:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message Lustre: 7527:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796313/real 1478796313] req@ffff88007a8633c0 x1550626731374512/t0(0) o801->scratch-MDT0000-mdc-ffff880037ad4400@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796320 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 7527:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 1 previous similar message LustreError: 7530:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 7530:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message Lustre: 7530:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880047861f00(0->scratch-MDT0000_UUID) get expired: 1478796353(+33s) Lustre: 7530:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007cece140: destroy ctx ffff880047861f00(0->scratch-MDT0000_UUID) Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message Lustre: 7533:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796320/real 1478796320] req@ffff880078409cc0 x1550626731374624/t0(0) o801->scratch-MDT0000-mdc-ffff88007adbb000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796327 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 7536:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 7536:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message Lustre: 7533:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff8800782d3080(0->scratch-MDT0000_UUID) get expired: 1478796360(+33s) Lustre: 7533:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007cece140: destroy ctx ffff8800782d3080(0->scratch-MDT0000_UUID) Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message Lustre: 7550:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796334/real 1478796334] req@ffff88007a8630c0 x1550626731374960/t0(0) o801->scratch-MDT0000-mdc-ffff88007adbb000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796341 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 7550:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 3 previous similar messages LustreError: 7553:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 7553:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 2 previous similar messages Lustre: 7553:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff8800782d3bc0(0->scratch-MDT0000_UUID) get expired: 1478796374(+33s) Lustre: 7553:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 3 previous similar messages Lustre: 23442:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007afbb980: destroy ctx ffff8800782d3bc0(0->scratch-MDT0000_UUID) Lustre: 23442:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 3 previous similar messages Lustre: 7573:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796355/real 1478796355] req@ffff8800784090c0 x1550626731375408/t0(0) o801->scratch-MDT0000-mdc-ffff880037ad4400@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796362 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 7573:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 4 previous similar messages LustreError: 7577:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 7577:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 5 previous similar messages Lustre: 7577:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880047f86bc0(0->scratch-MDT0000_UUID) get expired: 1478796395(+33s) Lustre: 7577:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 5 previous similar messages Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007cece140: destroy ctx ffff880047f86bc0(0->scratch-MDT0000_UUID) Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 5 previous similar messages Lustre: 7609:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796390/real 1478796390] req@ffff8800784090c0 x1550626731376192/t0(0) o801->scratch-MDT0000-mdc-ffff88007adbb000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796397 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 7609:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 8 previous similar messages LustreError: 7612:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 7612:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 9 previous similar messages Lustre: 7612:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880047f86a40(0->scratch-MDT0000_UUID) get expired: 1478796430(+33s) Lustre: 7612:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 9 previous similar messages Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007afbb980: destroy ctx ffff880047f86a40(0->scratch-MDT0000_UUID) Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 9 previous similar messages Lustre: 7680:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796460/real 1478796460] req@ffff88007cb863c0 x1550626731377760/t0(0) o801->scratch-MDT0000-mdc-ffff88007adbb000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796467 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 7680:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 18 previous similar messages LustreError: 7683:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 7683:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 19 previous similar messages Lustre: 7683:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff8800783b16c0(0->scratch-MDT0000_UUID) get expired: 1478796500(+33s) Lustre: 7683:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 19 previous similar messages Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007afbb980: destroy ctx ffff8800783b16c0(0->scratch-MDT0000_UUID) Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 19 previous similar messages Lustre: 7807:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796593/real 1478796593] req@ffff8800782606c0 x1550626731380784/t0(0) o801->scratch-MDT0000-mdc-ffff88007adbb000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796600 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 7807:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 36 previous similar messages LustreError: 7810:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 7810:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 37 previous similar messages Lustre: 7810:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff8800782d3800(0->scratch-MDT0000_UUID) get expired: 1478796633(+33s) Lustre: 7810:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 37 previous similar messages Lustre: 23442:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007afbb980: destroy ctx ffff8800782d3800(0->scratch-MDT0000_UUID) Lustre: 23442:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 37 previous similar messages LustreError: 7511:0:(file.c:3401:ll_inode_revalidate_fini()) scratch: revalidate FID [0x200000007:0x1:0x0] error: rc = -4 [root@eagle-48vm6 ~]#
On the MGS/MDS, dmesg
Lustre: DEBUG MARKER: == sanity-krb5 test 1: access with or without krb5 credential == 16:44:54 (1478796294) Lustre: 11127:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796292/real 1478796292] req@ffff88007a2950c0 x1550567535768688/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796299 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 11127:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 Lustre: 11127:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880058a99200(0->scratch-MDT0000_UUID) get expired: 1478796332(+33s) Lustre: 4151:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880058a99200(0->scratch-MDT0000_UUID) Lustre: 11329:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796299/real 1478796299] req@ffff88007a2950c0 x1550567535768784/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796306 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 11329:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 Lustre: 11329:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880077914380(0->scratch-MDT0000_UUID) get expired: 1478796339(+33s) Lustre: 4151:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880077914380(0->scratch-MDT0000_UUID) Lustre: 3904:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop Lustre: 4322:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop Lustre: 4322:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) Skipped 2 previous similar messages Lustre: 11333:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796306/real 1478796306] req@ffff88007a2950c0 x1550567535768848/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796313 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 11333:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 Lustre: 11333:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880058a992c0(0->scratch-MDT0000_UUID) get expired: 1478796346(+33s) Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880058a992c0(0->scratch-MDT0000_UUID) Lustre: 11336:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796313/real 1478796313] req@ffff88007a2950c0 x1550567535768976/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796320 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 11336:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 Lustre: 11336:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880077bfb740(0->scratch-MDT0000_UUID) get expired: 1478796353(+33s) Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880077bfb740(0->scratch-MDT0000_UUID) Lustre: 11339:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796320/real 1478796320] req@ffff88007a2950c0 x1550567535769040/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796327 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 11339:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 Lustre: 11339:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880058a99500(0->scratch-MDT0000_UUID) get expired: 1478796360(+33s) Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880058a99500(0->scratch-MDT0000_UUID) Lustre: 3905:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop Lustre: 11346:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796334/real 1478796334] req@ffff88007a2950c0 x1550567535769232/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796341 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 11346:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 1 previous similar message LustreError: 11346:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 11346:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message Lustre: 11346:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880077bfb680(0->scratch-MDT0000_UUID) get expired: 1478796374(+33s) Lustre: 11346:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880077bfb680(0->scratch-MDT0000_UUID) Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message Lustre: 3904:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop Lustre: 3904:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) Skipped 5 previous similar messages Lustre: 11355:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796355/real 1478796355] req@ffff88007a2950c0 x1550567535769488/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796362 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 11355:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 2 previous similar messages LustreError: 11355:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 11355:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 2 previous similar messages Lustre: 11355:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880077bfb740(0->scratch-MDT0000_UUID) get expired: 1478796395(+33s) Lustre: 11355:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 2 previous similar messages Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880077bfb740(0->scratch-MDT0000_UUID) Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 2 previous similar messages Lustre: 3906:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop Lustre: 3906:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) Skipped 1 previous similar message Lustre: 3905:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop Lustre: 3905:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) Skipped 2 previous similar messages Lustre: 11376:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796390/real 1478796390] req@ffff88007a2950c0 x1550567535769936/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796397 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 11376:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 4 previous similar messages LustreError: 11376:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 11376:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 4 previous similar messages Lustre: 11376:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880077914080(0->scratch-MDT0000_UUID) get expired: 1478796430(+33s) Lustre: 11376:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 4 previous similar messages Lustre: 4151:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880077914080(0->scratch-MDT0000_UUID) Lustre: 4151:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 4 previous similar messages Lustre: 3906:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop Lustre: 3906:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) Skipped 3 previous similar messages Lustre: 3905:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop Lustre: 3905:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) Skipped 5 previous similar messages Lustre: 11413:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796460/real 1478796460] req@ffff880076d2c380 x1550567535770832/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796467 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 11413:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 9 previous similar messages LustreError: 11413:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 11413:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 9 previous similar messages Lustre: 11413:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880058a992c0(0->scratch-MDT0000_UUID) get expired: 1478796500(+33s) Lustre: 11413:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 9 previous similar messages Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880058a992c0(0->scratch-MDT0000_UUID) Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 9 previous similar messages Lustre: 3904:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop Lustre: 3904:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) Skipped 15 previous similar messages Lustre: 11483:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796593/real 1478796593] req@ffff880076d2c380 x1550567535772560/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796600 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 11483:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 18 previous similar messages LustreError: 11483:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 11483:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 18 previous similar messages Lustre: 11483:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880058a99c80(0->scratch-MDT0000_UUID) get expired: 1478796633(+33s) Lustre: 11483:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 18 previous similar messages Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880058a99c80(0->scratch-MDT0000_UUID) Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 18 previous similar messages Lustre: 3905:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop Lustre: 3905:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) Skipped 25 previous similar messages [root@eagle-48vm1 ~]#
On the OSS, dmesg:
Lustre: DEBUG MARKER: == sanity-krb5 test 1: access with or without krb5 credential == 16:44:54 (1478796294) Lustre: 9304:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796293/real 1478796293] req@ffff8800796a8c80 x1550567557600736/t0(0) o801->scratch-MDT0000-lwp-OST0000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796300 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 9307:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 Lustre: 9304:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88004d2e4c80(0->scratch-MDT0000_UUID) get expired: 1478796333(+33s) Lustre: 4295:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4d40: destroy ctx ffff88004d2e4c80(0->scratch-MDT0000_UUID) LustreError: 9307:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message Lustre: 9344:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796300/real 1478796300] req@ffff8800796a8c80 x1550567557600784/t0(0) o801->scratch-MDT0000-lwp-OST0000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796307 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 9347:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 Lustre: 9347:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88004d2e4980(0->scratch-MDT0000_UUID) get expired: 1478796340(+33s) Lustre: 9347:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4bc0: destroy ctx ffff88004d2e4980(0->scratch-MDT0000_UUID) Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message Lustre: 9344:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 2 previous similar messages Lustre: 9350:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796307/real 1478796307] req@ffff8800796a8680 x1550567557600832/t0(0) o801->scratch-MDT0000-lwp-OST0001@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796314 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 9350:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 9350:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message Lustre: 9350:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88004d2e4b00(0->scratch-MDT0000_UUID) get expired: 1478796347(+33s) Lustre: 9350:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4bc0: destroy ctx ffff88004d2e4b00(0->scratch-MDT0000_UUID) Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message Lustre: 9359:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796314/real 1478796314] req@ffff8800796a8080 x1550567557600928/t0(0) o801->scratch-MDT0000-lwp-OST0000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796321 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 9359:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 1 previous similar message LustreError: 9356:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 9356:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message Lustre: 9356:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880077ecb6c0(0->scratch-MDT0000_UUID) get expired: 1478796354(+33s) Lustre: 9356:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4bc0: destroy ctx ffff880077ecb6c0(0->scratch-MDT0000_UUID) Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message Lustre: 9363:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796321/real 1478796321] req@ffff88007968bcc0 x1550567557600976/t0(0) o801->scratch-MDT0000-lwp-OST0001@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796328 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 9366:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 9366:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message Lustre: 9363:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88004d2e4a40(0->scratch-MDT0000_UUID) get expired: 1478796361(+33s) Lustre: 9363:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message Lustre: 4295:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4bc0: destroy ctx ffff88004d2e4a40(0->scratch-MDT0000_UUID) Lustre: 4295:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message Lustre: 9376:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796335/real 1478796335] req@ffff88007968b6c0 x1550567557601120/t0(0) o801->scratch-MDT0000-lwp-OST0001@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796342 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 9376:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 3 previous similar messages LustreError: 9379:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 9379:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 2 previous similar messages Lustre: 9379:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88005fd12240(0->scratch-MDT0000_UUID) get expired: 1478796375(+33s) Lustre: 9379:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 3 previous similar messages Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4d40: destroy ctx ffff88005fd12240(0->scratch-MDT0000_UUID) Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 3 previous similar messages Lustre: 9395:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796356/real 1478796356] req@ffff88007968b6c0 x1550567557601312/t0(0) o801->scratch-MDT0000-lwp-OST0000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796363 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 9395:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 4 previous similar messages LustreError: 9398:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 9398:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 5 previous similar messages Lustre: 9398:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88005fd12240(0->scratch-MDT0000_UUID) get expired: 1478796396(+33s) Lustre: 9398:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 5 previous similar messages Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4bc0: destroy ctx ffff88005fd12240(0->scratch-MDT0000_UUID) Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 5 previous similar messages Lustre: 9429:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796391/real 1478796391] req@ffff88007968bcc0 x1550567557601648/t0(0) o801->scratch-MDT0000-lwp-OST0001@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796398 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 9429:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 8 previous similar messages LustreError: 9432:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 9432:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 9 previous similar messages Lustre: 9432:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88005fd12c00(0->scratch-MDT0000_UUID) get expired: 1478796431(+33s) Lustre: 9432:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 9 previous similar messages Lustre: 4295:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4d40: destroy ctx ffff88005fd12c00(0->scratch-MDT0000_UUID) Lustre: 4295:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 9 previous similar messages Lustre: 9496:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796461/real 1478796461] req@ffff88004d2be6c0 x1550567557602320/t0(0) o801->scratch-MDT0000-lwp-OST0000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796468 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 9496:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 18 previous similar messages LustreError: 9499:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 9499:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 19 previous similar messages Lustre: 9499:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88004d2e4980(0->scratch-MDT0000_UUID) get expired: 1478796501(+33s) Lustre: 9499:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 19 previous similar messages Lustre: 4295:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4bc0: destroy ctx ffff88004d2e4980(0->scratch-MDT0000_UUID) Lustre: 4295:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 19 previous similar messages Lustre: 9624:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796594/real 1478796594] req@ffff8800796c4c80 x1550567557603616/t0(0) o801->scratch-MDT0000-lwp-OST0001@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796601 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 9627:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 9627:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 37 previous similar messages Lustre: 9627:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88007d0efe40(0->scratch-MDT0000_UUID) get expired: 1478796634(+33s) Lustre: 9627:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 37 previous similar messages Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4d40: destroy ctx ffff88007d0efe40(0->scratch-MDT0000_UUID) Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 36 previous similar messages Lustre: 9624:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 37 previous similar messages [root@eagle-48vm2 ~]#
I’ve updated the Kerberos setup script to get the changes to the script from LU-6971, http://review.whamcloud.com/#/c/19789/13, so we are no longer creating the perm.conf file. The Kerberos setup script that I’m using is at http://review.whamcloud.com/#/c/18012/ .
We are also calling lsvcgssd with the new recommended flags; For the MGS/MDS, use '/usr/sbin/lsvcgssd -k -g -m' and, for the OSS, use '/usr/sbin/lsvcgssd -k -o'.
So, there is still an issue, but maybe this should move to a new ticket?
Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: http://review.whamcloud.com/23925
Subject:
LU-8813gss: allow svcgssd to start without "-k"Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 6f63a934d3f771c479f296b203f0f717cfae5313