Details
-
Bug
-
Resolution: Fixed
-
Minor
-
Lustre 2.9.0
-
None
-
Centos 7.2: Kernel 3.10.0-327.36.3.el7_lustre.x86_64
Lustre version: master(commit 075f98e585a27b846ebd26f1d70a77eefb0f8c5f) + http://review.whamcloud.com/fs/lustre-release refs/changes/80/20780/26 + http://review.whamcloud.com/fs/lustre-release refs/changes/25/23925/3
Test cluster layout:
MDS1: 1 MGT, 1MDT
MDS2: 2 MDTs
OSS1: 4 OSTs
OSS2: 4 OSTsCentos 7.2: Kernel 3.10.0-327.36.3.el7_lustre.x86_64 Lustre version: master(commit 075f98e585a27b846ebd26f1d70a77eefb0f8c5f) + http://review.whamcloud.com/fs/lustre-release refs/changes/80/20780/26 + http://review.whamcloud.com/fs/lustre-release refs/changes/25/23925/3 Test cluster layout: MDS1: 1 MGT, 1MDT MDS2: 2 MDTs OSS1: 4 OSTs OSS2: 4 OSTs
-
3
-
9223372036854775807
Description
The `/usr/bin/lfs setdirstripe -i1 -c2 /mnt/lustre/d1.sanity` command issued by sanity.sh test_1 hangs when mdt2mdt connections are secured with the krb5p security flavor. This test has failed with SSK flavors skpi and ski; other flavors have not yet been tested.
== MDS2 logs ==
[ 341.920887] Lustre: DEBUG MARKER: == sanity test 1: mkdir; remkdir; rmdir ============================================================== 09:13:33 (1480515213) [ 341.954824] LustreError: 2582:0:(sec.c:2010:sptlrpc_svc_check_from()) @@@ faked source MDT req@ffff8800d368ed00 x1552432553273552/t0(0) o105-><?>@<?>:0/0 lens 336/0 e 0 to 0 dl 0 ref 1 fl New:/0/ffffffff rc 0/-1 [ 343.088324] Lustre: 2602:0:(sec_gss.c:2063:gss_svc_handle_init()) create svc ctx ffff8802051cc440: user from 172.18.0.18@tcp authenticated as mds [ 343.088328] Lustre: 2602:0:(sec_gss.c:2063:gss_svc_handle_init()) Skipped 9 previous similar messages [ 343.088346] Lustre: 2602:0:(sec_gss.c:371:gss_cli_ctx_uptodate()) server installed reverse ctx ffff8802051a6900 idx 0x35ebf0324b78e726, expiry 1480599087(+83873s) [ 343.088348] Lustre: 2602:0:(sec_gss.c:371:gss_cli_ctx_uptodate()) Skipped 9 previous similar messages [ 352.954000] LustreError: 2582:0:(sec.c:2010:sptlrpc_svc_check_from()) @@@ faked source MDT req@ffff880203a13300 x1552432553273552/t0(0) o105-><?>@<?>:0/0 lens 336/0 e 0 to 0 dl 0 ref 1 fl New:/2/ffffffff rc 0/-1 [ 363.953599] LustreError: 2582:0:(sec.c:2010:sptlrpc_svc_check_from()) @@@ faked source MDT req@ffff880204ecc600 x1552432553273552/t0(0) o105-><?>@<?>:0/0 lens 336/0 e 0 to 0 dl 0 ref 1 fl New:/2/ffffffff rc 0/-1 [ 374.953137] LustreError: 2582:0:(sec.c:2010:sptlrpc_svc_check_from()) @@@ faked source MDT req@ffff880205722850 x1552432553273552/t0(0) o105-><?>@<?>:0/0 lens 336/0 e 0 to 0 dl 0 ref 1 fl New:/2/ffffffff rc 0/-1 [ 382.074673] LNet: Service thread pid 3152 was inactive for 40.12s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [ 382.074679] Pid: 3152, comm: mdt00_003 [ 382.074681] Call Trace: [ 382.074692] [<ffffffff8163bd69>] schedule+0x29/0x70 [ 382.074697] [<ffffffff816399c5>] schedule_timeout+0x175/0x2d0 [ 382.074704] [<ffffffff8108bff0>] ? process_timeout+0x0/0x10 [ 382.074723] [<ffffffffa0642ef1>] ? cfs_block_sigsinv+0x71/0xa0 [libcfs] [ 382.074777] [<ffffffffa0970fa0>] ? ldlm_expired_completion_wait+0x0/0x250 [ptlrpc] [ 382.074820] [<ffffffffa0971791>] ldlm_completion_ast+0x5a1/0x910 [ptlrpc] [ 382.074860] [<ffffffffa095e730>] ? ldlm_resource_add_lock+0x70/0x1b0 [ptlrpc] [ 382.074867] [<ffffffff810b8940>] ? default_wake_function+0x0/0x20 [ 382.074910] [<ffffffffa0973448>] ldlm_cli_enqueue_fini+0x938/0xdb0 [ptlrpc] [ 382.074952] [<ffffffffa09763b2>] ldlm_cli_enqueue+0x6c2/0x810 [ptlrpc] [ 382.074994] [<ffffffffa09711f0>] ? ldlm_completion_ast+0x0/0x910 [ptlrpc] [ 382.075012] [<ffffffffa0cfda80>] ? mdt_remote_blocking_ast+0x0/0x590 [mdt] [ 382.075032] [<ffffffffa0eafc82>] osp_md_object_lock+0x172/0x2e0 [osp] [ 382.075054] [<ffffffffa0e56c51>] ? mdd_la_get+0x81/0x1a0 [mdd] [ 382.075077] [<ffffffffa0decb20>] lod_object_lock+0xf0/0x950 [lod] [ 382.075093] [<ffffffffa06524be>] ? cfs_hash_bd_add_locked+0x1e/0x80 [libcfs] [ 382.075105] [<ffffffffa0652869>] ? cfs_hash_spin_unlock+0x9/0x10 [libcfs] [ 382.075171] [<ffffffffa077fdb7>] ? lu_object_find_try+0x207/0x2b0 [obdclass] [ 382.075177] [<ffffffffa0e5410b>] mdd_object_lock+0x3b/0xd0 [mdd] [ 382.075186] [<ffffffffa0d0ca72>] mdt_remote_object_lock+0x1e2/0x710 [mdt] [ 382.075202] [<ffffffffa077ff0c>] ? lu_object_find_at+0xac/0xe0 [obdclass] [ 382.075228] [<ffffffffa09a3bc7>] ? lustre_msg_buf+0x17/0x60 [ptlrpc] [ 382.075237] [<ffffffffa0d0dfc6>] mdt_object_lock_internal+0x166/0x300 [mdt] [ 382.075244] [<ffffffffa0d0e174>] mdt_object_lock+0x14/0x20 [mdt] [ 382.075254] [<ffffffffa0d246a1>] mdt_md_create+0x421/0xba0 [mdt] [ 382.075268] [<ffffffffa075d838>] ? lprocfs_counter_add+0x128/0x1a0 [obdclass] [ 382.075285] [<ffffffffa0793ef1>] ? lprocfs_job_stats_log+0xd1/0x630 [obdclass] [ 382.075294] [<ffffffffa0d24f8b>] mdt_reint_create+0x16b/0x350 [mdt] [ 382.075302] [<ffffffffa0d26490>] mdt_reint_rec+0x80/0x210 [mdt] [ 382.075309] [<ffffffffa0d08db2>] mdt_reint_internal+0x5b2/0x9b0 [mdt] [ 382.075317] [<ffffffffa0d141e7>] mdt_reint+0x67/0x140 [mdt] [ 382.075351] [<ffffffffa0a05225>] tgt_request_handle+0x915/0x1320 [ptlrpc] [ 382.075378] [<ffffffffa09b11ab>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc] [ 382.075404] [<ffffffffa09aed68>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [ 382.075407] [<ffffffff810b8952>] ? default_wake_function+0x12/0x20 [ 382.075409] [<ffffffff810af0b8>] ? __wake_up_common+0x58/0x90 [ 382.075434] [<ffffffffa09b5260>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc] [ 382.075458] [<ffffffffa09b47c0>] ? ptlrpc_main+0x0/0x1de0 [ptlrpc] [ 382.075461] [<ffffffff810a5b8f>] kthread+0xcf/0xe0 [ 382.075463] [<ffffffff810a5ac0>] ? kthread+0x0/0xe0 [ 382.075466] [<ffffffff81646c98>] ret_from_fork+0x58/0x90 [ 382.075469] [<ffffffff810a5ac0>] ? kthread+0x0/0xe0 [ 382.075472] LustreError: dumping log to /tmp/lustre-log.1480515253.3152 [ 385.952974] LustreError: 2582:0:(sec.c:2010:sptlrpc_svc_check_from()) @@@ faked source MDT req@ffff880205375800 x1552432553273552/t0(0) o105-><?>@<?>:0/0 lens 336/0 e 0 to 0 dl 0 ref 1 fl New:/2/ffffffff rc 0/-1 [ 396.952564] LustreError: 2582:0:(sec.c:2010:sptlrpc_svc_check_from()) @@@ faked source MDT req@ffff880204d3db00 x1552432553273552/t0(0) o105-><?>@<?>:0/0 lens 336/0 e 0 to 0 dl 0 ref 1 fl New:/2/ffffffff rc 0/-1 [ 418.952006] LustreError: 2582:0:(sec.c:2010:sptlrpc_svc_check_from()) @@@ faked source MDT req@ffff8800d72c8050 x1552432553273552/t0(0) o105-><?>@<?>:0/0 lens 336/0 e 0 to 0 dl 0 ref 1 fl New:/2/ffffffff rc 0/-1 [ 418.952150] LustreError: 2582:0:(sec.c:2010:sptlrpc_svc_check_from()) Skipped 1 previous similar message [ 451.951069] LustreError: 2582:0:(sec.c:2010:sptlrpc_svc_check_from()) @@@ faked source MDT req@ffff8800d04f0f00 x1552432553273552/t0(0) o105-><?>@<?>:0/0 lens 336/0 e 0 to 0 dl 0 ref 1 fl New:/2/ffffffff rc 0/-1 [ 451.951177] LustreError: 2582:0:(sec.c:2010:sptlrpc_svc_check_from()) Skipped 2 previous similar messages [ 517.949644] LustreError: 2582:0:(sec.c:2010:sptlrpc_svc_check_from()) @@@ faked source MDT req@ffff8800d0514600 x1552432553273552/t0(0) o105-><?>@<?>:0/0 lens 336/0 e 0 to 0 dl 0 ref 1 fl New:/2/ffffffff rc 0/-1 [ 517.949761] LustreError: 2582:0:(sec.c:2010:sptlrpc_svc_check_from()) Skipped 5 previous similar messages [ 641.950918] Lustre: lustre-MDT0000-osp-MDT0001: Connection to lustre-MDT0000 (at 172.18.0.18@tcp) was lost; in progress operations using this service will wait for recovery to complete [ 641.950949] LustreError: 3152:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1480515213, 300s ago), entering recovery for lustre-MDT0000_UUID@172.18.0.18@tcp ns: lustre-MDT0000-osp-MDT0001 lock: ffff880211f20e00/0xd408ec1632ce7580 lrc: 4/0,1 mode: --/EX res: [0x200000007:0x1:0x0].0x0 bits 0x2 rrc: 1 type: IBT flags: 0x1000001000000 nid: local remote: 0xf8f327625a9c3943 expref: -99 pid: 3152 timeout: 0 lvb_type: 0 [ 641.951489] Lustre: lustre-MDT0000-osp-MDT0001: Connection restored to 172.18.0.18@tcp (at 172.18.0.18@tcp) [ 641.951493] Lustre: Skipped 3 previous similar messages [ 652.950923] LustreError: 2582:0:(sec.c:2010:sptlrpc_svc_check_from()) @@@ faked source MDT req@ffff8802051a8900 x1552432553273552/t0(0) o105-><?>@<?>:0/0 lens 336/0 e 0 to 0 dl 0 ref 1 fl New:/2/ffffffff rc 0/-1 [ 652.951025] LustreError: 2582:0:(sec.c:2010:sptlrpc_svc_check_from()) Skipped 12 previous similar messages [ 916.946544] LustreError: 2582:0:(sec.c:2010:sptlrpc_svc_check_from()) @@@ faked source MDT req@ffff8800d04d3c00 x1552432553273552/t0(0) o105-><?>@<?>:0/0 lens 336/0 e 0 to 0 dl 0 ref 1 fl New:/2/ffffffff rc 0/-1 [ 916.946651] LustreError: 2582:0:(sec.c:2010:sptlrpc_svc_check_from()) Skipped 23 previous similar messages [ 936.436828] Lustre: 2596:0:(service.c:1331:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff8800d368ea00 x1552432529149200/t0(0) o36->c91cef1e-2de0-ca74-2459-2c580f0f0023@172.18.0.22@tcp:-1/-1 lens 672/568 e 24 to 0 dl 1480515813 ref 2 fl Interpret:/0/0 rc 0/0 [ 942.942067] Lustre: lustre-MDT0001: Client c91cef1e-2de0-ca74-2459-2c580f0f0023 (at 172.18.0.22@tcp) reconnecting [ 942.942090] Lustre: lustre-MDT0001: Connection restored to c91cef1e-2de0-ca74-2459-2c580f0f0023 (at 172.18.0.22@tcp)
The test was executed with:
export SEC=krb5p; ./auster -f multinode -rsv -d /import/results/krb5p sanity --only 1
Attachments
Issue Links
- is related to
-
LU-3778 GSS doesn't know about proxy subsystems
- Resolved