Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-8813

Kerberos: sanity and sanity-krb5 test suites fail on non-root user trying to touch file

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.10.0
    • Lustre 2.9.0
    • 3
    • 9223372036854775807

    Description

      I’m having problems running some test suites with Kerberos enabled. For example, running sanity.sh on 2.8.60 + the patch http://review.whamcloud.com/#/c/23600/, and the test fails with

      # ./auster -v -k sanity --only 0a
      Started at Tue Nov  8 14:39:14 UTC 2016
      eagle-48vm6.eagle.hpdd.intel.com: Checking config lustre mounted on /lustre/scratch
      Checking servers environments
      Checking clients eagle-48vm6.eagle.hpdd.intel.com environments
      Logging to local directory: /tmp/test_logs/2016-11-08/143914
      Client: Lustre version: 2.8.60_1_g35d09c7
      MDS: Lustre version: 2.8.60_1_g35d09c7
      OSS: Lustre version: 2.8.60_1_g35d09c7
      running: sanity ONLY=0a 
      run_suite sanity /usr/lib64/lustre/tests/sanity.sh
      -----============= acceptance-small: sanity ============----- Tue Nov  8 14:39:21 UTC 2016
      Running: bash /usr/lib64/lustre/tests/sanity.sh
      eagle-48vm6.eagle.hpdd.intel.com: Checking config lustre mounted on /lustre/scratch
      Checking servers environments
      Checking clients eagle-48vm6.eagle.hpdd.intel.com environments
      Using TIMEOUT=20
      disable quota as required
      osd-ldiskfs.track_declares_assert=1
      osd-ldiskfs.track_declares_assert=1
      debug=-1
      running as uid/gid/euid/egid 500/500/500/500, groups:
       [touch] [/lustre/scratch/d0_runas_test/f7025]
      touch: cannot touch `/lustre/scratch/d0_runas_test/f7025': Permission denied
       sanity : @@@@@@ FAIL: unable to write to /lustre/scratch/d0_runas_test as UID 500.
              Please set RUNAS_ID to some UID which exists on MDS and client or
              add user 500:500 on these nodes. 
        Trace dump:
        = /usr/lib64/lustre/tests/test-framework.sh:4841:error()
        = /usr/lib64/lustre/tests/test-framework.sh:5670:check_runas_id()
        = /usr/lib64/lustre/tests/sanity.sh:126:main()
      Dumping lctl log to /tmp/test_logs/2016-11-08/143914/sanity..*.1478615970.log
      eagle-48vm1: Host key verification failed.
      eagle-48vm1: rsync: connection unexpectedly closed (0 bytes received so far) [sender]
      eagle-48vm1: rsync error: error in rsync protocol data stream (code 12) at io.c(600) [sender=3.0.6]
      pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 12
      eagle-48vm2: Host key verification failed.
      eagle-48vm2: rsync: connection unexpectedly closed (0 bytes received so far) [sender]
      eagle-48vm2: rsync error: error in rsync protocol data stream (code 12) at io.c(600) [sender=3.0.6]
      pdsh@eagle-48vm6: eagle-48vm2: ssh exited with exit code 12
      sanity returned 0
      Finished at Tue Nov  8 14:39:31 UTC 2016 in 17s
      ./auster: completed with rc 0
      

      The code that is failing in sanity.sh is

      # $RUNAS_ID may get set incorrectly somewhere else
      [ $UID -eq 0 -a $RUNAS_ID -eq 0 ] && error "\$RUNAS_ID set to 0, but \$UID is al
      so 0!"
      
      check_runas_id $RUNAS_ID $RUNAS_GID $RUNAS
      

      UID/GID 500 belongs to sanityusr and requested a Kerberos ticket before running sanity.sh:

      # su sanityusr
      bash-4.1$ klist
      Ticket cache: FILE:/tmp/krb5cc_500
      Default principal: sanityusr@CO.CFS
      
      Valid starting     Expires            Service principal
      11/08/16 14:38:48  11/09/16 14:38:48  krbtgt/CO.CFS@CO.CFS
      

      Note that CO.CFS is the realm being used.

      Since Lustre is not failing, it’s not surprising that there is nothing of interest in dmesg. For example, form the MGS/MDS:

      Lustre: DEBUG MARKER: -----============= acceptance-small: sanity ============----- Tue Nov 8 14:39:21 UTC 2016
      Lustre: DEBUG MARKER: Using TIMEOUT=20
      Lustre: DEBUG MARKER: sanity : @@@@@@ FAIL: unable to write to /lustre/scratch/d0_runas_test as UID 500.
      

      Logs for this run are at https://testing.hpdd.intel.com/test_sets/b87e2568-a5f8-11e6-964e-5254006e85c2.
      More logs will be attached to this ticket.

      The last time I tested Kerberos and the above tests ran was tag 2.8.54. Since that time some flags have been added to lsvcgssd. I tried to call lsvcgssd two different ways; the way things worked in 2.8.54 as ‘/usr/sbin/lsvcgssd’ on all Lustre servers and, the new recommended way, for the MGS/MDS, ‘/usr/sbin/lsvcgssd -m -g -k –vvv’ and, for the OSS, ‘/usr/sbin/lsvcgssd -o -k –vvv’ (verbosity is optional). All tests were run with RHEL 6.8 (for some reason, Maloo reports it as el6.7)

      Confirmation that others are or are not experiencing this problem with these test suites and Kerberos would be helpful.

      Attachments

        Issue Links

          Activity

            [LU-8813] Kerberos: sanity and sanity-krb5 test suites fail on non-root user trying to touch file

            Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: http://review.whamcloud.com/23925
            Subject: LU-8813 gss: allow svcgssd to start without "-k"
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 6f63a934d3f771c479f296b203f0f717cfae5313

            gerrit Gerrit Updater added a comment - Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: http://review.whamcloud.com/23925 Subject: LU-8813 gss: allow svcgssd to start without "-k" Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: 6f63a934d3f771c479f296b203f0f717cfae5313

            Reopen to fix compatibility with svcgssd not being passed -k option for existing Kerberos configurations.

            adilger Andreas Dilger added a comment - Reopen to fix compatibility with svcgssd not being passed -k option for existing Kerberos configurations.

            The patch has been landed to master.

            yong.fan nasf (Inactive) added a comment - The patch has been landed to master.

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/23667/
            Subject: LU-8813 utils: l_getidentity compatibility
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: d385685a92668241d8c802f33f2e5497d9a7ea5a

            gerrit Gerrit Updater added a comment - Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/23667/ Subject: LU-8813 utils: l_getidentity compatibility Project: fs/lustre-release Branch: master Current Patch Set: Commit: d385685a92668241d8c802f33f2e5497d9a7ea5a

            James, please be aware of test interop issues when adding the new flags to the lsvcgssd startup. While I can understand that this may be necessary for SSK testing, it isn't clear to me what the need is for KRB testing, and this highlights that existing user deployments with Kerberos may start to fail in a similar manner, so I'm glad we caught this before release.

            The addition of the "-k" option should not be required for Kerberos configurations, to maintain compatibility with existing environments, IMHO. My thought was to have the lack of "-k" set the krb_enabled if neither sk_enabled or null_enabled is no yet set, but print a warning like:

             svcgssd: no "-k", "-s", or "-z" option given.  Assume "-k" for compatibility reasons.  
            

            or similar to avoid confusing error messages.

            adilger Andreas Dilger added a comment - James, please be aware of test interop issues when adding the new flags to the lsvcgssd startup. While I can understand that this may be necessary for SSK testing, it isn't clear to me what the need is for KRB testing, and this highlights that existing user deployments with Kerberos may start to fail in a similar manner, so I'm glad we caught this before release. The addition of the "-k" option should not be required for Kerberos configurations, to maintain compatibility with existing environments, IMHO. My thought was to have the lack of "-k" set the krb_enabled if neither sk_enabled or null_enabled is no yet set, but print a warning like: svcgssd: no "-k", "-s", or "-z" option given. Assume "-k" for compatibility reasons. or similar to avoid confusing error messages.
            jamesanunez James Nunez (Inactive) added a comment - - edited

            I understand the remaining issues and they are minor issues in the sanity-krb5 test suite. Some tests in sanity-krb5 stop and start the lsvcgssd and restart with only the '-v' option. Once I added the correct flags to these calls for the MDS and for OSS, the full test suite runs with only one failure. I will upload a patch to modify the calls lsvcgssd in sanity-krb5.

            Thank you Fan Yong and Andrew (and Andreas) for your quick reaction and comments. Your time spent and willingness to help is appreciated.

            jamesanunez James Nunez (Inactive) added a comment - - edited I understand the remaining issues and they are minor issues in the sanity-krb5 test suite. Some tests in sanity-krb5 stop and start the lsvcgssd and restart with only the '-v' option. Once I added the correct flags to these calls for the MDS and for OSS, the full test suite runs with only one failure. I will upload a patch to modify the calls lsvcgssd in sanity-krb5. Thank you Fan Yong and Andrew (and Andreas) for your quick reaction and comments. Your time spent and willingness to help is appreciated.

            Thanks for the document. I'll review to see if I'm deviating from what it prescribes.

            jamesanunez James Nunez (Inactive) added a comment - Thanks for the document. I'll review to see if I'm deviating from what it prescribes.
            panda Andrew Perepechko added a comment - - edited

            James,

            /etc/request-key.d/lgssc.conf will likely work too.

            Lustre utils (gssd and the upcall) should log something to /var/log/messages (or another syslogd log depending on its configuration) in case it helps.

            panda Andrew Perepechko added a comment - - edited James, /etc/request-key.d/lgssc.conf will likely work too. Lustre utils (gssd and the upcall) should log something to /var/log/messages (or another syslogd log depending on its configuration) in case it helps.
            jamesanunez James Nunez (Inactive) added a comment - - edited

            Andrew - No, lgss_keyring is not in /etc/request-key.conf, but it is in /etc/request-key.d/lgssc.conf:

            1. grep lgss_keyring /etc/request-key.conf
              [root@eagle-48vm1 ~]# more /etc/request-key.d/lgssc.conf
              create lgssc * * /usr/sbin/lgss_keyring %o %k %t %d %c %u %g %T %P %S

            I can move this line from /etc/request-key.d/lgssc.conf to /etc/request-key.conf and see if that changes things.

            Also, yes, documentation would be very helpful.

            jamesanunez James Nunez (Inactive) added a comment - - edited Andrew - No, lgss_keyring is not in /etc/request-key.conf, but it is in /etc/request-key.d/lgssc.conf: grep lgss_keyring /etc/request-key.conf [root@eagle-48vm1 ~] # more /etc/request-key.d/lgssc.conf create lgssc * * /usr/sbin/lgss_keyring %o %k %t %d %c %u %g %T %P %S I can move this line from /etc/request-key.d/lgssc.conf to /etc/request-key.conf and see if that changes things. Also, yes, documentation would be very helpful.

            James, does

            mds # grep lgss_keyring /etc/request-key.conf
            oss # grep lgss_keyring /etc/request-key.conf

            show anything?

            I can share an installation guide written for some older Lustre Kerberos/GSS, but it wasn't finished and verified.

            panda Andrew Perepechko added a comment - James, does mds # grep lgss_keyring /etc/request-key.conf oss # grep lgss_keyring /etc/request-key.conf show anything? I can share an installation guide written for some older Lustre Kerberos/GSS, but it wasn't finished and verified.

            I’m running with Fan Yong’s patch http://review.whamcloud.com/#/c/23667/2 and I can run sanity tests 0 and 6 with no problems. When I run sanity-krb5, test_1 hangs immediately. Here’s what I see from each node.

            On the client console from the beginning of the sanity-krb5 suite:

            -----============= acceptance-small: sanity-krb5 ============----- Thu Nov 10 16:43:20 UTC 2016
            Running: bash /usr/lib64/lustre/tests/sanity-krb5.sh
            prepare krb5 cred
            RUNAS=runas -u 500 -g 500
            running as uid/gid/euid/egid 500/500/500/500, groups:
            [krb5_login.sh]
            ***** refresh Kerberos V5 TGT for uid 500 *****
            spawn /bin/bash
            export PS1="user@host $ " 
            kinit
            [sanityusr@eagle-48vm6 tests]$ export PS1="user@host $ " 
            user@host $ kinit
            Password for sanityusr@CO.CFS: 
            user@host $ CRED=/tmp/krb5cc_500
            Starting client eagle-48vm6.eagle.hpdd.intel.com:  -o user_xattr,flock eagle-48vm1@tcp:/scratch /lustre/scratch2
            Started clients eagle-48vm6.eagle.hpdd.intel.com: 
            eagle-48vm1@tcp:/scratch on /lustre/scratch2 type lustre (rw,user_xattr,flock)
            eagle-48vm6.eagle.hpdd.intel.com: Checking config lustre mounted on /lustre/scratch
            Checking servers environments
            Checking clients eagle-48vm6.eagle.hpdd.intel.com environments
            Using TIMEOUT=20
            disable quota as required
            osd-ldiskfs.track_declares_assert=1
            osd-ldiskfs.track_declares_assert=1
            Starting gss daemon on mds: eagle-48vm1
            Starting gss daemon on ost: eagle-48vm2
            setting all flavor to null
            pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 2
            already have total 8 null connections
            running as uid/gid/euid/egid 500/500/500/500, groups:
            [krb5_login.sh]
            running as uid/gid/euid/egid 500/500/500/500, groups:
            [touch] [/lustre/scratch/d0_runas_test/f3531]
            excepting tests: 
            skipping tests SLOW=no: 100 101
            
            
            == sanity-krb5 test 0: start multiple gss daemons == 16:43:38 (1478796218)
            bring up gss daemons...
            Starting gss daemon on mds: eagle-48vm1
            Starting gss daemon on ost: eagle-48vm2
            check with someone already running...
            1 instance(s) of lsvcgssd are running
            check with someone run & finished...
            pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 1
            1 instance(s) of lsvcgssd are running
            check refresh...
            pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 1
            1 instance(s) of lsvcgssd are running
            Resetting fail_loc on all nodes...done.
            16:44:24 (1478796264) waiting for eagle-48vm1 network 5 secs ...
            16:44:24 (1478796264) network interface is UP
            16:44:24 (1478796264) waiting for eagle-48vm2 network 5 secs ...
            16:44:24 (1478796264) network interface is UP
            16:44:25 (1478796265) waiting for eagle-48vm6.eagle.hpdd.intel.com network 5 secs ...
            16:44:25 (1478796265) network interface is UP
            PASS 0 (47s)
            setting all flavor to krb5p
            pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 2
            found 0 krb5p out of total 8 connections
            restoring to default flavor...
            checking all2all...pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 2
            found 8/8 null connections
            now at default flavor settings
            Setting sptlrpc rule: scratch.srpc.flavor.default=krb5p
            checking all2all...pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 2
            found 0/8 krb5p connections
            checking all2all...pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 2
            found 0/8 krb5p connections
            checking all2all...pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 2
            found 0/8 krb5p connections
            checking all2all...pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 2
            found 0/8 krb5p connections
            checking all2all...pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 2
            found 8/8 krb5p connections
            
            == sanity-krb5 test 1: access with or without krb5 credential == 16:44:54 (1478796294)
            

            On the client, dmesg:

            Lustre: DEBUG MARKER: == sanity-krb5 test 1: access with or without krb5 credential == 16:44:54 (1478796294)
            Lustre: 7212:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796291/real 1478796291]  req@ffff88007a903980 x1550626731373984/t0(0) o801->scratch-MDT0000-mdc-ffff88007adbb000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796298 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            LustreError: 7221:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            Lustre: 7221:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88007afbbb00(0->scratch-MDT0000_UUID) get expired: 1478796331(+33s)
            Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007cece140: destroy ctx ffff88007a380180(0->scratch-MDT0000_UUID)
            Lustre: 7221:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message
            Lustre: 7514:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796298/real 1478796298]  req@ffff88007a903980 x1550626731374176/t0(0) o801->scratch-MDT0000-mdc-ffff88007adbb000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796305 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            LustreError: 7517:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            LustreError: 7517:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message
            Lustre: 7517:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88007afbbec0(0->scratch-MDT0000_UUID) get expired: 1478796338(+33s)
            Lustre: 23442:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007afbb980: destroy ctx ffff88007afbbec0(0->scratch-MDT0000_UUID)
            Lustre: 23442:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message
            Lustre: 7514:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
            Lustre: 7521:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796305/real 1478796305]  req@ffff88007a8636c0 x1550626731374288/t0(0) o801->scratch-MDT0000-mdc-ffff880037ad4400@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796312 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            LustreError: 7521:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            LustreError: 7521:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message
            Lustre: 7521:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88007afbbb00(0->scratch-MDT0000_UUID) get expired: 1478796345(+33s)
            Lustre: 7521:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message
            Lustre: 23442:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007afbb980: destroy ctx ffff88007afbbb00(0->scratch-MDT0000_UUID)
            Lustre: 23442:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message
            Lustre: 7527:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796313/real 1478796313]  req@ffff88007a8633c0 x1550626731374512/t0(0) o801->scratch-MDT0000-mdc-ffff880037ad4400@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796320 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            Lustre: 7527:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 1 previous similar message
            LustreError: 7530:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            LustreError: 7530:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message
            Lustre: 7530:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880047861f00(0->scratch-MDT0000_UUID) get expired: 1478796353(+33s)
            Lustre: 7530:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message
            Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007cece140: destroy ctx ffff880047861f00(0->scratch-MDT0000_UUID)
            Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message
            Lustre: 7533:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796320/real 1478796320]  req@ffff880078409cc0 x1550626731374624/t0(0) o801->scratch-MDT0000-mdc-ffff88007adbb000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796327 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            LustreError: 7536:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            LustreError: 7536:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message
            Lustre: 7533:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff8800782d3080(0->scratch-MDT0000_UUID) get expired: 1478796360(+33s)
            Lustre: 7533:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message
            Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007cece140: destroy ctx ffff8800782d3080(0->scratch-MDT0000_UUID)
            Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message
            Lustre: 7550:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796334/real 1478796334]  req@ffff88007a8630c0 x1550626731374960/t0(0) o801->scratch-MDT0000-mdc-ffff88007adbb000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796341 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            Lustre: 7550:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 3 previous similar messages
            LustreError: 7553:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            LustreError: 7553:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 2 previous similar messages
            Lustre: 7553:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff8800782d3bc0(0->scratch-MDT0000_UUID) get expired: 1478796374(+33s)
            Lustre: 7553:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 3 previous similar messages
            Lustre: 23442:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007afbb980: destroy ctx ffff8800782d3bc0(0->scratch-MDT0000_UUID)
            Lustre: 23442:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 3 previous similar messages
            Lustre: 7573:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796355/real 1478796355]  req@ffff8800784090c0 x1550626731375408/t0(0) o801->scratch-MDT0000-mdc-ffff880037ad4400@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796362 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            Lustre: 7573:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 4 previous similar messages
            LustreError: 7577:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            LustreError: 7577:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 5 previous similar messages
            Lustre: 7577:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880047f86bc0(0->scratch-MDT0000_UUID) get expired: 1478796395(+33s)
            Lustre: 7577:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 5 previous similar messages
            Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007cece140: destroy ctx ffff880047f86bc0(0->scratch-MDT0000_UUID)
            Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 5 previous similar messages
            Lustre: 7609:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796390/real 1478796390]  req@ffff8800784090c0 x1550626731376192/t0(0) o801->scratch-MDT0000-mdc-ffff88007adbb000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796397 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            Lustre: 7609:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 8 previous similar messages
            LustreError: 7612:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            LustreError: 7612:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 9 previous similar messages
            Lustre: 7612:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880047f86a40(0->scratch-MDT0000_UUID) get expired: 1478796430(+33s)
            Lustre: 7612:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 9 previous similar messages
            Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007afbb980: destroy ctx ffff880047f86a40(0->scratch-MDT0000_UUID)
            Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 9 previous similar messages
            Lustre: 7680:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796460/real 1478796460]  req@ffff88007cb863c0 x1550626731377760/t0(0) o801->scratch-MDT0000-mdc-ffff88007adbb000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796467 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            Lustre: 7680:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 18 previous similar messages
            LustreError: 7683:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            LustreError: 7683:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 19 previous similar messages
            Lustre: 7683:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff8800783b16c0(0->scratch-MDT0000_UUID) get expired: 1478796500(+33s)
            Lustre: 7683:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 19 previous similar messages
            Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007afbb980: destroy ctx ffff8800783b16c0(0->scratch-MDT0000_UUID)
            Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 19 previous similar messages
            Lustre: 7807:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796593/real 1478796593]  req@ffff8800782606c0 x1550626731380784/t0(0) o801->scratch-MDT0000-mdc-ffff88007adbb000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796600 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            Lustre: 7807:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 36 previous similar messages
            LustreError: 7810:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            LustreError: 7810:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 37 previous similar messages
            Lustre: 7810:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff8800782d3800(0->scratch-MDT0000_UUID) get expired: 1478796633(+33s)
            Lustre: 7810:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 37 previous similar messages
            Lustre: 23442:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007afbb980: destroy ctx ffff8800782d3800(0->scratch-MDT0000_UUID)
            Lustre: 23442:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 37 previous similar messages
            LustreError: 7511:0:(file.c:3401:ll_inode_revalidate_fini()) scratch: revalidate FID [0x200000007:0x1:0x0] error: rc = -4
            [root@eagle-48vm6 ~]# 
            

            On the MGS/MDS, dmesg

            Lustre: DEBUG MARKER: == sanity-krb5 test 1: access with or without krb5 credential == 16:44:54 (1478796294)
            Lustre: 11127:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796292/real 1478796292]  req@ffff88007a2950c0 x1550567535768688/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796299 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            LustreError: 11127:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            Lustre: 11127:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880058a99200(0->scratch-MDT0000_UUID) get expired: 1478796332(+33s)
            Lustre: 4151:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880058a99200(0->scratch-MDT0000_UUID)
            Lustre: 11329:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796299/real 1478796299]  req@ffff88007a2950c0 x1550567535768784/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796306 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            LustreError: 11329:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            Lustre: 11329:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880077914380(0->scratch-MDT0000_UUID) get expired: 1478796339(+33s)
            Lustre: 4151:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880077914380(0->scratch-MDT0000_UUID)
            Lustre: 3904:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop
            Lustre: 4322:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop
            Lustre: 4322:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) Skipped 2 previous similar messages
            Lustre: 11333:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796306/real 1478796306]  req@ffff88007a2950c0 x1550567535768848/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796313 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            LustreError: 11333:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            Lustre: 11333:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880058a992c0(0->scratch-MDT0000_UUID) get expired: 1478796346(+33s)
            Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880058a992c0(0->scratch-MDT0000_UUID)
            Lustre: 11336:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796313/real 1478796313]  req@ffff88007a2950c0 x1550567535768976/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796320 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            LustreError: 11336:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            Lustre: 11336:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880077bfb740(0->scratch-MDT0000_UUID) get expired: 1478796353(+33s)
            Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880077bfb740(0->scratch-MDT0000_UUID)
            Lustre: 11339:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796320/real 1478796320]  req@ffff88007a2950c0 x1550567535769040/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796327 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            LustreError: 11339:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            Lustre: 11339:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880058a99500(0->scratch-MDT0000_UUID) get expired: 1478796360(+33s)
            Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880058a99500(0->scratch-MDT0000_UUID)
            Lustre: 3905:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop
            Lustre: 11346:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796334/real 1478796334]  req@ffff88007a2950c0 x1550567535769232/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796341 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            Lustre: 11346:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 1 previous similar message
            LustreError: 11346:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            LustreError: 11346:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message
            Lustre: 11346:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880077bfb680(0->scratch-MDT0000_UUID) get expired: 1478796374(+33s)
            Lustre: 11346:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message
            Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880077bfb680(0->scratch-MDT0000_UUID)
            Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message
            Lustre: 3904:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop
            Lustre: 3904:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) Skipped 5 previous similar messages
            Lustre: 11355:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796355/real 1478796355]  req@ffff88007a2950c0 x1550567535769488/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796362 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            Lustre: 11355:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
            LustreError: 11355:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            LustreError: 11355:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 2 previous similar messages
            Lustre: 11355:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880077bfb740(0->scratch-MDT0000_UUID) get expired: 1478796395(+33s)
            Lustre: 11355:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 2 previous similar messages
            Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880077bfb740(0->scratch-MDT0000_UUID)
            Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 2 previous similar messages
            Lustre: 3906:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop
            Lustre: 3906:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) Skipped 1 previous similar message
            Lustre: 3905:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop
            Lustre: 3905:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) Skipped 2 previous similar messages
            Lustre: 11376:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796390/real 1478796390]  req@ffff88007a2950c0 x1550567535769936/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796397 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            Lustre: 11376:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 4 previous similar messages
            LustreError: 11376:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            LustreError: 11376:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 4 previous similar messages
            Lustre: 11376:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880077914080(0->scratch-MDT0000_UUID) get expired: 1478796430(+33s)
            Lustre: 11376:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 4 previous similar messages
            Lustre: 4151:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880077914080(0->scratch-MDT0000_UUID)
            Lustre: 4151:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 4 previous similar messages
            Lustre: 3906:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop
            Lustre: 3906:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) Skipped 3 previous similar messages
            Lustre: 3905:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop
            Lustre: 3905:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) Skipped 5 previous similar messages
            Lustre: 11413:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796460/real 1478796460]  req@ffff880076d2c380 x1550567535770832/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796467 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            Lustre: 11413:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 9 previous similar messages
            LustreError: 11413:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            LustreError: 11413:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 9 previous similar messages
            Lustre: 11413:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880058a992c0(0->scratch-MDT0000_UUID) get expired: 1478796500(+33s)
            Lustre: 11413:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 9 previous similar messages
            Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880058a992c0(0->scratch-MDT0000_UUID)
            Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 9 previous similar messages
            Lustre: 3904:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop
            Lustre: 3904:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) Skipped 15 previous similar messages
            Lustre: 11483:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796593/real 1478796593]  req@ffff880076d2c380 x1550567535772560/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796600 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            Lustre: 11483:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 18 previous similar messages
            LustreError: 11483:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            LustreError: 11483:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 18 previous similar messages
            Lustre: 11483:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880058a99c80(0->scratch-MDT0000_UUID) get expired: 1478796633(+33s)
            Lustre: 11483:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 18 previous similar messages
            Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880058a99c80(0->scratch-MDT0000_UUID)
            Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 18 previous similar messages
            Lustre: 3905:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop
            Lustre: 3905:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) Skipped 25 previous similar messages
            [root@eagle-48vm1 ~]# 
            

            On the OSS, dmesg:

            Lustre: DEBUG MARKER: == sanity-krb5 test 1: access with or without krb5 credential == 16:44:54 (1478796294)
            Lustre: 9304:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796293/real 1478796293]  req@ffff8800796a8c80 x1550567557600736/t0(0) o801->scratch-MDT0000-lwp-OST0000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796300 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            LustreError: 9307:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            Lustre: 9304:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88004d2e4c80(0->scratch-MDT0000_UUID) get expired: 1478796333(+33s)
            Lustre: 4295:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4d40: destroy ctx ffff88004d2e4c80(0->scratch-MDT0000_UUID)
            LustreError: 9307:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message
            Lustre: 9344:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796300/real 1478796300]  req@ffff8800796a8c80 x1550567557600784/t0(0) o801->scratch-MDT0000-lwp-OST0000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796307 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            LustreError: 9347:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            Lustre: 9347:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88004d2e4980(0->scratch-MDT0000_UUID) get expired: 1478796340(+33s)
            Lustre: 9347:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message
            Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4bc0: destroy ctx ffff88004d2e4980(0->scratch-MDT0000_UUID)
            Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message
            Lustre: 9344:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
            Lustre: 9350:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796307/real 1478796307]  req@ffff8800796a8680 x1550567557600832/t0(0) o801->scratch-MDT0000-lwp-OST0001@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796314 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            LustreError: 9350:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            LustreError: 9350:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message
            Lustre: 9350:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88004d2e4b00(0->scratch-MDT0000_UUID) get expired: 1478796347(+33s)
            Lustre: 9350:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message
            Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4bc0: destroy ctx ffff88004d2e4b00(0->scratch-MDT0000_UUID)
            Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message
            Lustre: 9359:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796314/real 1478796314]  req@ffff8800796a8080 x1550567557600928/t0(0) o801->scratch-MDT0000-lwp-OST0000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796321 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            Lustre: 9359:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 1 previous similar message
            LustreError: 9356:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            LustreError: 9356:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message
            Lustre: 9356:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880077ecb6c0(0->scratch-MDT0000_UUID) get expired: 1478796354(+33s)
            Lustre: 9356:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message
            Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4bc0: destroy ctx ffff880077ecb6c0(0->scratch-MDT0000_UUID)
            Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message
            Lustre: 9363:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796321/real 1478796321]  req@ffff88007968bcc0 x1550567557600976/t0(0) o801->scratch-MDT0000-lwp-OST0001@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796328 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            LustreError: 9366:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            LustreError: 9366:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message
            Lustre: 9363:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88004d2e4a40(0->scratch-MDT0000_UUID) get expired: 1478796361(+33s)
            Lustre: 9363:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message
            Lustre: 4295:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4bc0: destroy ctx ffff88004d2e4a40(0->scratch-MDT0000_UUID)
            Lustre: 4295:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message
            Lustre: 9376:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796335/real 1478796335]  req@ffff88007968b6c0 x1550567557601120/t0(0) o801->scratch-MDT0000-lwp-OST0001@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796342 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            Lustre: 9376:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 3 previous similar messages
            LustreError: 9379:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            LustreError: 9379:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 2 previous similar messages
            Lustre: 9379:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88005fd12240(0->scratch-MDT0000_UUID) get expired: 1478796375(+33s)
            Lustre: 9379:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 3 previous similar messages
            Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4d40: destroy ctx ffff88005fd12240(0->scratch-MDT0000_UUID)
            Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 3 previous similar messages
            Lustre: 9395:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796356/real 1478796356]  req@ffff88007968b6c0 x1550567557601312/t0(0) o801->scratch-MDT0000-lwp-OST0000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796363 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            Lustre: 9395:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 4 previous similar messages
            LustreError: 9398:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            LustreError: 9398:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 5 previous similar messages
            Lustre: 9398:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88005fd12240(0->scratch-MDT0000_UUID) get expired: 1478796396(+33s)
            Lustre: 9398:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 5 previous similar messages
            Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4bc0: destroy ctx ffff88005fd12240(0->scratch-MDT0000_UUID)
            Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 5 previous similar messages
            Lustre: 9429:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796391/real 1478796391]  req@ffff88007968bcc0 x1550567557601648/t0(0) o801->scratch-MDT0000-lwp-OST0001@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796398 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            Lustre: 9429:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 8 previous similar messages
            LustreError: 9432:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            LustreError: 9432:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 9 previous similar messages
            Lustre: 9432:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88005fd12c00(0->scratch-MDT0000_UUID) get expired: 1478796431(+33s)
            Lustre: 9432:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 9 previous similar messages
            Lustre: 4295:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4d40: destroy ctx ffff88005fd12c00(0->scratch-MDT0000_UUID)
            Lustre: 4295:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 9 previous similar messages
            Lustre: 9496:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796461/real 1478796461]  req@ffff88004d2be6c0 x1550567557602320/t0(0) o801->scratch-MDT0000-lwp-OST0000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796468 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            Lustre: 9496:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 18 previous similar messages
            LustreError: 9499:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            LustreError: 9499:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 19 previous similar messages
            Lustre: 9499:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88004d2e4980(0->scratch-MDT0000_UUID) get expired: 1478796501(+33s)
            Lustre: 9499:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 19 previous similar messages
            Lustre: 4295:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4bc0: destroy ctx ffff88004d2e4980(0->scratch-MDT0000_UUID)
            Lustre: 4295:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 19 previous similar messages
            Lustre: 9624:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796594/real 1478796594]  req@ffff8800796c4c80 x1550567557603616/t0(0) o801->scratch-MDT0000-lwp-OST0001@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796601 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
            LustreError: 9627:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
            LustreError: 9627:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 37 previous similar messages
            Lustre: 9627:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88007d0efe40(0->scratch-MDT0000_UUID) get expired: 1478796634(+33s)
            Lustre: 9627:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 37 previous similar messages
            Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4d40: destroy ctx ffff88007d0efe40(0->scratch-MDT0000_UUID)
            Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 36 previous similar messages
            Lustre: 9624:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 37 previous similar messages
            [root@eagle-48vm2 ~]# 
            

            I’ve updated the Kerberos setup script to get the changes to the script from LU-6971, http://review.whamcloud.com/#/c/19789/13, so we are no longer creating the perm.conf file. The Kerberos setup script that I’m using is at http://review.whamcloud.com/#/c/18012/ .

            We are also calling lsvcgssd with the new recommended flags; For the MGS/MDS, use '/usr/sbin/lsvcgssd -k -g -m' and, for the OSS, use '/usr/sbin/lsvcgssd -k -o'.

            So, there is still an issue, but maybe this should move to a new ticket?

            jamesanunez James Nunez (Inactive) added a comment - I’m running with Fan Yong’s patch http://review.whamcloud.com/#/c/23667/2 and I can run sanity tests 0 and 6 with no problems. When I run sanity-krb5, test_1 hangs immediately. Here’s what I see from each node. On the client console from the beginning of the sanity-krb5 suite: -----============= acceptance-small: sanity-krb5 ============----- Thu Nov 10 16:43:20 UTC 2016 Running: bash /usr/lib64/lustre/tests/sanity-krb5.sh prepare krb5 cred RUNAS=runas -u 500 -g 500 running as uid/gid/euid/egid 500/500/500/500, groups: [krb5_login.sh] ***** refresh Kerberos V5 TGT for uid 500 ***** spawn /bin/bash export PS1="user@host $ " kinit [sanityusr@eagle-48vm6 tests]$ export PS1="user@host $ " user@host $ kinit Password for sanityusr@CO.CFS: user@host $ CRED=/tmp/krb5cc_500 Starting client eagle-48vm6.eagle.hpdd.intel.com: -o user_xattr,flock eagle-48vm1@tcp:/scratch /lustre/scratch2 Started clients eagle-48vm6.eagle.hpdd.intel.com: eagle-48vm1@tcp:/scratch on /lustre/scratch2 type lustre (rw,user_xattr,flock) eagle-48vm6.eagle.hpdd.intel.com: Checking config lustre mounted on /lustre/scratch Checking servers environments Checking clients eagle-48vm6.eagle.hpdd.intel.com environments Using TIMEOUT=20 disable quota as required osd-ldiskfs.track_declares_assert=1 osd-ldiskfs.track_declares_assert=1 Starting gss daemon on mds: eagle-48vm1 Starting gss daemon on ost: eagle-48vm2 setting all flavor to null pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 2 already have total 8 null connections running as uid/gid/euid/egid 500/500/500/500, groups: [krb5_login.sh] running as uid/gid/euid/egid 500/500/500/500, groups: [touch] [/lustre/scratch/d0_runas_test/f3531] excepting tests: skipping tests SLOW=no: 100 101 == sanity-krb5 test 0: start multiple gss daemons == 16:43:38 (1478796218) bring up gss daemons... Starting gss daemon on mds: eagle-48vm1 Starting gss daemon on ost: eagle-48vm2 check with someone already running... 1 instance(s) of lsvcgssd are running check with someone run & finished... pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 1 1 instance(s) of lsvcgssd are running check refresh... pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 1 1 instance(s) of lsvcgssd are running Resetting fail_loc on all nodes...done. 16:44:24 (1478796264) waiting for eagle-48vm1 network 5 secs ... 16:44:24 (1478796264) network interface is UP 16:44:24 (1478796264) waiting for eagle-48vm2 network 5 secs ... 16:44:24 (1478796264) network interface is UP 16:44:25 (1478796265) waiting for eagle-48vm6.eagle.hpdd.intel.com network 5 secs ... 16:44:25 (1478796265) network interface is UP PASS 0 (47s) setting all flavor to krb5p pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 2 found 0 krb5p out of total 8 connections restoring to default flavor... checking all2all...pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 2 found 8/8 null connections now at default flavor settings Setting sptlrpc rule: scratch.srpc.flavor.default=krb5p checking all2all...pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 2 found 0/8 krb5p connections checking all2all...pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 2 found 0/8 krb5p connections checking all2all...pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 2 found 0/8 krb5p connections checking all2all...pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 2 found 0/8 krb5p connections checking all2all...pdsh@eagle-48vm6: eagle-48vm1: ssh exited with exit code 2 found 8/8 krb5p connections == sanity-krb5 test 1: access with or without krb5 credential == 16:44:54 (1478796294) On the client, dmesg: Lustre: DEBUG MARKER: == sanity-krb5 test 1: access with or without krb5 credential == 16:44:54 (1478796294) Lustre: 7212:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796291/real 1478796291] req@ffff88007a903980 x1550626731373984/t0(0) o801->scratch-MDT0000-mdc-ffff88007adbb000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796298 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 7221:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 Lustre: 7221:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88007afbbb00(0->scratch-MDT0000_UUID) get expired: 1478796331(+33s) Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007cece140: destroy ctx ffff88007a380180(0->scratch-MDT0000_UUID) Lustre: 7221:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message Lustre: 7514:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796298/real 1478796298] req@ffff88007a903980 x1550626731374176/t0(0) o801->scratch-MDT0000-mdc-ffff88007adbb000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796305 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 7517:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 7517:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message Lustre: 7517:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88007afbbec0(0->scratch-MDT0000_UUID) get expired: 1478796338(+33s) Lustre: 23442:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007afbb980: destroy ctx ffff88007afbbec0(0->scratch-MDT0000_UUID) Lustre: 23442:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message Lustre: 7514:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 2 previous similar messages Lustre: 7521:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796305/real 1478796305] req@ffff88007a8636c0 x1550626731374288/t0(0) o801->scratch-MDT0000-mdc-ffff880037ad4400@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796312 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 7521:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 7521:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message Lustre: 7521:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88007afbbb00(0->scratch-MDT0000_UUID) get expired: 1478796345(+33s) Lustre: 7521:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message Lustre: 23442:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007afbb980: destroy ctx ffff88007afbbb00(0->scratch-MDT0000_UUID) Lustre: 23442:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message Lustre: 7527:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796313/real 1478796313] req@ffff88007a8633c0 x1550626731374512/t0(0) o801->scratch-MDT0000-mdc-ffff880037ad4400@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796320 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 7527:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 1 previous similar message LustreError: 7530:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 7530:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message Lustre: 7530:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880047861f00(0->scratch-MDT0000_UUID) get expired: 1478796353(+33s) Lustre: 7530:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007cece140: destroy ctx ffff880047861f00(0->scratch-MDT0000_UUID) Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message Lustre: 7533:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796320/real 1478796320] req@ffff880078409cc0 x1550626731374624/t0(0) o801->scratch-MDT0000-mdc-ffff88007adbb000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796327 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 7536:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 7536:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message Lustre: 7533:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff8800782d3080(0->scratch-MDT0000_UUID) get expired: 1478796360(+33s) Lustre: 7533:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007cece140: destroy ctx ffff8800782d3080(0->scratch-MDT0000_UUID) Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message Lustre: 7550:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796334/real 1478796334] req@ffff88007a8630c0 x1550626731374960/t0(0) o801->scratch-MDT0000-mdc-ffff88007adbb000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796341 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 7550:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 3 previous similar messages LustreError: 7553:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 7553:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 2 previous similar messages Lustre: 7553:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff8800782d3bc0(0->scratch-MDT0000_UUID) get expired: 1478796374(+33s) Lustre: 7553:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 3 previous similar messages Lustre: 23442:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007afbb980: destroy ctx ffff8800782d3bc0(0->scratch-MDT0000_UUID) Lustre: 23442:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 3 previous similar messages Lustre: 7573:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796355/real 1478796355] req@ffff8800784090c0 x1550626731375408/t0(0) o801->scratch-MDT0000-mdc-ffff880037ad4400@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796362 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 7573:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 4 previous similar messages LustreError: 7577:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 7577:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 5 previous similar messages Lustre: 7577:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880047f86bc0(0->scratch-MDT0000_UUID) get expired: 1478796395(+33s) Lustre: 7577:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 5 previous similar messages Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007cece140: destroy ctx ffff880047f86bc0(0->scratch-MDT0000_UUID) Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 5 previous similar messages Lustre: 7609:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796390/real 1478796390] req@ffff8800784090c0 x1550626731376192/t0(0) o801->scratch-MDT0000-mdc-ffff88007adbb000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796397 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 7609:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 8 previous similar messages LustreError: 7612:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 7612:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 9 previous similar messages Lustre: 7612:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880047f86a40(0->scratch-MDT0000_UUID) get expired: 1478796430(+33s) Lustre: 7612:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 9 previous similar messages Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007afbb980: destroy ctx ffff880047f86a40(0->scratch-MDT0000_UUID) Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 9 previous similar messages Lustre: 7680:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796460/real 1478796460] req@ffff88007cb863c0 x1550626731377760/t0(0) o801->scratch-MDT0000-mdc-ffff88007adbb000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796467 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 7680:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 18 previous similar messages LustreError: 7683:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 7683:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 19 previous similar messages Lustre: 7683:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff8800783b16c0(0->scratch-MDT0000_UUID) get expired: 1478796500(+33s) Lustre: 7683:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 19 previous similar messages Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007afbb980: destroy ctx ffff8800783b16c0(0->scratch-MDT0000_UUID) Lustre: 23443:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 19 previous similar messages Lustre: 7807:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796593/real 1478796593] req@ffff8800782606c0 x1550626731380784/t0(0) o801->scratch-MDT0000-mdc-ffff88007adbb000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796600 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 7807:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 36 previous similar messages LustreError: 7810:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 7810:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 37 previous similar messages Lustre: 7810:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff8800782d3800(0->scratch-MDT0000_UUID) get expired: 1478796633(+33s) Lustre: 7810:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 37 previous similar messages Lustre: 23442:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007afbb980: destroy ctx ffff8800782d3800(0->scratch-MDT0000_UUID) Lustre: 23442:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 37 previous similar messages LustreError: 7511:0:(file.c:3401:ll_inode_revalidate_fini()) scratch: revalidate FID [0x200000007:0x1:0x0] error: rc = -4 [root@eagle-48vm6 ~]# On the MGS/MDS, dmesg Lustre: DEBUG MARKER: == sanity-krb5 test 1: access with or without krb5 credential == 16:44:54 (1478796294) Lustre: 11127:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796292/real 1478796292] req@ffff88007a2950c0 x1550567535768688/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796299 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 11127:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 Lustre: 11127:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880058a99200(0->scratch-MDT0000_UUID) get expired: 1478796332(+33s) Lustre: 4151:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880058a99200(0->scratch-MDT0000_UUID) Lustre: 11329:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796299/real 1478796299] req@ffff88007a2950c0 x1550567535768784/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796306 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 11329:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 Lustre: 11329:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880077914380(0->scratch-MDT0000_UUID) get expired: 1478796339(+33s) Lustre: 4151:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880077914380(0->scratch-MDT0000_UUID) Lustre: 3904:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop Lustre: 4322:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop Lustre: 4322:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) Skipped 2 previous similar messages Lustre: 11333:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796306/real 1478796306] req@ffff88007a2950c0 x1550567535768848/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796313 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 11333:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 Lustre: 11333:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880058a992c0(0->scratch-MDT0000_UUID) get expired: 1478796346(+33s) Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880058a992c0(0->scratch-MDT0000_UUID) Lustre: 11336:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796313/real 1478796313] req@ffff88007a2950c0 x1550567535768976/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796320 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 11336:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 Lustre: 11336:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880077bfb740(0->scratch-MDT0000_UUID) get expired: 1478796353(+33s) Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880077bfb740(0->scratch-MDT0000_UUID) Lustre: 11339:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796320/real 1478796320] req@ffff88007a2950c0 x1550567535769040/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796327 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 11339:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 Lustre: 11339:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880058a99500(0->scratch-MDT0000_UUID) get expired: 1478796360(+33s) Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880058a99500(0->scratch-MDT0000_UUID) Lustre: 3905:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop Lustre: 11346:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796334/real 1478796334] req@ffff88007a2950c0 x1550567535769232/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796341 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 11346:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 1 previous similar message LustreError: 11346:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 11346:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message Lustre: 11346:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880077bfb680(0->scratch-MDT0000_UUID) get expired: 1478796374(+33s) Lustre: 11346:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880077bfb680(0->scratch-MDT0000_UUID) Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message Lustre: 3904:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop Lustre: 3904:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) Skipped 5 previous similar messages Lustre: 11355:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796355/real 1478796355] req@ffff88007a2950c0 x1550567535769488/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796362 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 11355:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 2 previous similar messages LustreError: 11355:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 11355:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 2 previous similar messages Lustre: 11355:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880077bfb740(0->scratch-MDT0000_UUID) get expired: 1478796395(+33s) Lustre: 11355:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 2 previous similar messages Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880077bfb740(0->scratch-MDT0000_UUID) Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 2 previous similar messages Lustre: 3906:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop Lustre: 3906:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) Skipped 1 previous similar message Lustre: 3905:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop Lustre: 3905:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) Skipped 2 previous similar messages Lustre: 11376:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796390/real 1478796390] req@ffff88007a2950c0 x1550567535769936/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796397 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 11376:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 4 previous similar messages LustreError: 11376:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 11376:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 4 previous similar messages Lustre: 11376:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880077914080(0->scratch-MDT0000_UUID) get expired: 1478796430(+33s) Lustre: 11376:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 4 previous similar messages Lustre: 4151:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880077914080(0->scratch-MDT0000_UUID) Lustre: 4151:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 4 previous similar messages Lustre: 3906:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop Lustre: 3906:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) Skipped 3 previous similar messages Lustre: 3905:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop Lustre: 3905:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) Skipped 5 previous similar messages Lustre: 11413:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796460/real 1478796460] req@ffff880076d2c380 x1550567535770832/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796467 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 11413:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 9 previous similar messages LustreError: 11413:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 11413:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 9 previous similar messages Lustre: 11413:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880058a992c0(0->scratch-MDT0000_UUID) get expired: 1478796500(+33s) Lustre: 11413:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 9 previous similar messages Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880058a992c0(0->scratch-MDT0000_UUID) Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 9 previous similar messages Lustre: 3904:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop Lustre: 3904:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) Skipped 15 previous similar messages Lustre: 11483:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796593/real 1478796593] req@ffff880076d2c380 x1550567535772560/t0(0) o801->scratch-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1478796600 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 11483:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 18 previous similar messages LustreError: 11483:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 11483:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 18 previous similar messages Lustre: 11483:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880058a99c80(0->scratch-MDT0000_UUID) get expired: 1478796633(+33s) Lustre: 11483:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 18 previous similar messages Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff880058ed3900: destroy ctx ffff880058a99c80(0->scratch-MDT0000_UUID) Lustre: 4152:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 18 previous similar messages Lustre: 3905:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) waited 20s timeout, drop Lustre: 3905:0:(gss_svc_upcall.c:994:gss_svc_upcall_handle_init()) Skipped 25 previous similar messages [root@eagle-48vm1 ~]# On the OSS, dmesg: Lustre: DEBUG MARKER: == sanity-krb5 test 1: access with or without krb5 credential == 16:44:54 (1478796294) Lustre: 9304:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796293/real 1478796293] req@ffff8800796a8c80 x1550567557600736/t0(0) o801->scratch-MDT0000-lwp-OST0000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796300 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 9307:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 Lustre: 9304:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88004d2e4c80(0->scratch-MDT0000_UUID) get expired: 1478796333(+33s) Lustre: 4295:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4d40: destroy ctx ffff88004d2e4c80(0->scratch-MDT0000_UUID) LustreError: 9307:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message Lustre: 9344:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796300/real 1478796300] req@ffff8800796a8c80 x1550567557600784/t0(0) o801->scratch-MDT0000-lwp-OST0000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796307 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 9347:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 Lustre: 9347:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88004d2e4980(0->scratch-MDT0000_UUID) get expired: 1478796340(+33s) Lustre: 9347:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4bc0: destroy ctx ffff88004d2e4980(0->scratch-MDT0000_UUID) Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message Lustre: 9344:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 2 previous similar messages Lustre: 9350:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796307/real 1478796307] req@ffff8800796a8680 x1550567557600832/t0(0) o801->scratch-MDT0000-lwp-OST0001@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796314 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 9350:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 9350:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message Lustre: 9350:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88004d2e4b00(0->scratch-MDT0000_UUID) get expired: 1478796347(+33s) Lustre: 9350:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4bc0: destroy ctx ffff88004d2e4b00(0->scratch-MDT0000_UUID) Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message Lustre: 9359:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796314/real 1478796314] req@ffff8800796a8080 x1550567557600928/t0(0) o801->scratch-MDT0000-lwp-OST0000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796321 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 9359:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 1 previous similar message LustreError: 9356:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 9356:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message Lustre: 9356:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880077ecb6c0(0->scratch-MDT0000_UUID) get expired: 1478796354(+33s) Lustre: 9356:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4bc0: destroy ctx ffff880077ecb6c0(0->scratch-MDT0000_UUID) Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message Lustre: 9363:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796321/real 1478796321] req@ffff88007968bcc0 x1550567557600976/t0(0) o801->scratch-MDT0000-lwp-OST0001@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796328 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 9366:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 9366:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 1 previous similar message Lustre: 9363:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88004d2e4a40(0->scratch-MDT0000_UUID) get expired: 1478796361(+33s) Lustre: 9363:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 1 previous similar message Lustre: 4295:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4bc0: destroy ctx ffff88004d2e4a40(0->scratch-MDT0000_UUID) Lustre: 4295:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 1 previous similar message Lustre: 9376:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796335/real 1478796335] req@ffff88007968b6c0 x1550567557601120/t0(0) o801->scratch-MDT0000-lwp-OST0001@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796342 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 9376:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 3 previous similar messages LustreError: 9379:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 9379:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 2 previous similar messages Lustre: 9379:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88005fd12240(0->scratch-MDT0000_UUID) get expired: 1478796375(+33s) Lustre: 9379:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 3 previous similar messages Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4d40: destroy ctx ffff88005fd12240(0->scratch-MDT0000_UUID) Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 3 previous similar messages Lustre: 9395:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796356/real 1478796356] req@ffff88007968b6c0 x1550567557601312/t0(0) o801->scratch-MDT0000-lwp-OST0000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796363 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 9395:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 4 previous similar messages LustreError: 9398:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 9398:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 5 previous similar messages Lustre: 9398:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88005fd12240(0->scratch-MDT0000_UUID) get expired: 1478796396(+33s) Lustre: 9398:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 5 previous similar messages Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4bc0: destroy ctx ffff88005fd12240(0->scratch-MDT0000_UUID) Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 5 previous similar messages Lustre: 9429:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796391/real 1478796391] req@ffff88007968bcc0 x1550567557601648/t0(0) o801->scratch-MDT0000-lwp-OST0001@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796398 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 9429:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 8 previous similar messages LustreError: 9432:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 9432:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 9 previous similar messages Lustre: 9432:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88005fd12c00(0->scratch-MDT0000_UUID) get expired: 1478796431(+33s) Lustre: 9432:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 9 previous similar messages Lustre: 4295:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4d40: destroy ctx ffff88005fd12c00(0->scratch-MDT0000_UUID) Lustre: 4295:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 9 previous similar messages Lustre: 9496:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796461/real 1478796461] req@ffff88004d2be6c0 x1550567557602320/t0(0) o801->scratch-MDT0000-lwp-OST0000@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796468 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Lustre: 9496:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 18 previous similar messages LustreError: 9499:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 9499:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 19 previous similar messages Lustre: 9499:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88004d2e4980(0->scratch-MDT0000_UUID) get expired: 1478796501(+33s) Lustre: 9499:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 19 previous similar messages Lustre: 4295:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4bc0: destroy ctx ffff88004d2e4980(0->scratch-MDT0000_UUID) Lustre: 4295:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 19 previous similar messages Lustre: 9624:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478796594/real 1478796594] req@ffff8800796c4c80 x1550567557603616/t0(0) o801->scratch-MDT0000-lwp-OST0001@10.100.4.154@tcp:12/10 lens 224/224 e 0 to 1 dl 1478796601 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 LustreError: 9627:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0 LustreError: 9627:0:(gss_keyring.c:1418:gss_kt_update()) Skipped 37 previous similar messages Lustre: 9627:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88007d0efe40(0->scratch-MDT0000_UUID) get expired: 1478796634(+33s) Lustre: 9627:0:(sec_gss.c:316:cli_ctx_expire()) Skipped 37 previous similar messages Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88004d2e4d40: destroy ctx ffff88007d0efe40(0->scratch-MDT0000_UUID) Lustre: 4296:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) Skipped 36 previous similar messages Lustre: 9624:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 37 previous similar messages [root@eagle-48vm2 ~]# I’ve updated the Kerberos setup script to get the changes to the script from LU-6971 , http://review.whamcloud.com/#/c/19789/13 , so we are no longer creating the perm.conf file. The Kerberos setup script that I’m using is at http://review.whamcloud.com/#/c/18012/ . We are also calling lsvcgssd with the new recommended flags; For the MGS/MDS, use '/usr/sbin/lsvcgssd -k -g -m' and, for the OSS, use '/usr/sbin/lsvcgssd -k -o'. So, there is still an issue, but maybe this should move to a new ticket?

            People

              yong.fan nasf (Inactive)
              jamesanunez James Nunez (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: