[LU-7211] performance-sanity test_7: test failed to respond and timed out Created: 25/Sep/15  Updated: 12/May/16  Resolved: 12/May/16

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Maloo
Assignee: WC Triage
Resolution: Duplicate
Votes: 0
Labels: None
Environment:

Client: 2.7.59, RHEL 7.1
Server: 2.7.59, RHEL 7.1


Severity: 3
Rank (Obsolete): 9223372036854775807

Description

This issue was created by maloo for Saurabh Tandan <saurabh.tandan@intel.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/173296dc-61b4-11e5-a10f-5254006e85c2.

The sub-test test_7 failed with the following error:

test failed to respond and timed out
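
For reference, a single performance-sanity sub-test can usually be re-run from the Lustre test framework on an already-configured test cluster; the following is a minimal sketch (the tree layout and the auster invocation are assumptions, not taken from this report):

    # from a built Lustre tree on the test node
    cd lustre/tests
    ONLY=7 bash performance-sanity.sh        # run only sub-test test_7
    # or, equivalently, via the auster wrapper:
    ./auster -v performance-sanity --only 7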

mds dmesg: the following abnormal messages were observed just before performance-sanity test_7:

[ 7629.763251] Lustre: DEBUG MARKER: == mdsrate-lookup-10dirs test complete, duration 1313 sec ============================================ 01:15:44 (1442884544)
[ 7740.131394] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version
[ 7740.393305] Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.mdt=ug
[ 7740.607242] Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.ost=ug
[ 7741.958784] Lustre: DEBUG MARKER: grep -c /mnt/mds1' ' /proc/mounts
[ 7742.161256] Lustre: DEBUG MARKER: umount -d -f /mnt/mds1
[ 7742.264711] LustreError: 1633:0:(client.c:1138:ptlrpc_import_delay_req()) @@@ IMP_CLOSED   req@ffff8800653a0c00 x1512968916905480/t0(0) o101->lustre-MDT0000-lwp-MDT0000@0@lo:23/10 lens 456/496 e 0 to 0 dl 0 ref 2 fl Rpc:/0/ffffffff rc 0/-1
[ 7742.266529] LustreError: 1633:0:(client.c:1138:ptlrpc_import_delay_req()) Skipped 5 previous similar messages
[ 7742.267652] LustreError: 1633:0:(qsd_reint.c:55:qsd_reint_completion()) lustre-MDT0000: failed to enqueue global quota lock, glb fid:[0x200000006:0x10000:0x0], rc:-5
[ 7743.151301] LustreError: 3576:0:(client.c:1138:ptlrpc_import_delay_req()) @@@ IMP_CLOSED   req@ffff8800653a0c00 x1512968916905512/t0(0) o13->lustre-OST0000-osc-MDT0000@10.1.6.116@tcp:7/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1
[ 7743.153894] LustreError: 3576:0:(client.c:1138:ptlrpc_import_delay_req()) Skipped 1 previous similar message
[ 7743.596403] Lustre: lustre-MDT0000: Not available for connect from 10.1.6.116@tcp (stopping)
[ 7743.598894] Lustre: Skipped 6 previous similar messages
[ 7744.671206] LustreError: 3577:0:(client.c:1138:ptlrpc_import_delay_req()) @@@ IMP_CLOSED   req@ffff8800602e9500 x1512968916905516/t0(0) o13->lustre-OST0004-osc-MDT0000@10.1.6.116@tcp:7/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1
[ 7746.991272] LustreError: 3576:0:(client.c:1138:ptlrpc_import_delay_req()) @@@ IMP_CLOSED   req@ffff8800679af900 x1512968916905536/t0(0) o13->lustre-OST0002-osc-MDT0000@10.1.6.116@tcp:7/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1
[ 7746.993782] LustreError: 3576:0:(client.c:1138:ptlrpc_import_delay_req()) Skipped 4 previous similar messages
[ 7753.586738] LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 10.1.6.116@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 7753.591106] LustreError: Skipped 6 previous similar messages
[ 7754.310250] Lustre: 1632:0:(client.c:2039:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1442884662/real 1442884662]  req@ffff8800602e9500 x1512968916905552/t0(0) o251->MGC10.1.6.115@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1442884668 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
[ 7754.313658] Lustre: 1632:0:(client.c:2039:ptlrpc_expire_one_request()) Skipped 6 previous similar messages
[ 7756.073041] Lustre: server umount lustre-MDT0000 complete
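
Note that the IMP_CLOSED errors and the failed global quota lock enqueue (rc:-5, i.e. -EIO) are all timestamped after 'umount -d -f /mnt/mds1' begins, so they read as fallout of the MDT being stopped rather than as the cause of the timeout. On a live MDS, the quota-slave reintegration state and the configured targets can be inspected roughly as follows (a hedged sketch; parameter paths vary by Lustre version and backend):

    # quota slave / reintegration state on the MDT
    lctl get_param osd-*.*.quota_slave.info
    # list locally configured OBD devices and their states
    lctl dl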

ost dmesg: the following messages were observed frequently during performance-sanity test_7:

[ 7868.419657] LustreError: 137-5: lustre-OST0001_UUID: not available for connect from 10.1.6.115@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 7868.423882] LustreError: Skipped 6 previous similar messages
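
The 137-5 message suggests its own follow-up check: verify whether the named target is actually mounted on either node of the HA pair. A minimal sketch (generic commands; the exact mount points for this cluster are not given in the report):

    # on each server of the pair
    grep lustre /proc/mounts    # is any Lustre target mounted here?
    lctl dl                     # which targets are configured locally?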


Comments
Comment by Sarah Liu [ 20/Jan/16 ]

It seems like a duplicate of LU-3786 to me.

Comment by Saurabh Tandan (Inactive) [ 05/Feb/16 ]

Another instance on master for FULL - EL7.1 Server/EL7.1 Client - ZFS, build #3314:
https://testing.hpdd.intel.com/test_sets/e6636600-cb88-11e5-b49e-5254006e85c2

Comment by Saurabh Tandan (Inactive) [ 10/Feb/16 ]

Another instance found for Full tag 2.7.66 - EL6.7 Server/EL6.7 Client - ZFS, build #3314:
https://testing.hpdd.intel.com/test_sets/a53643d2-cb47-11e5-a59a-5254006e85c2

Another instance found for Full tag 2.7.66 - EL7.1 Server/EL7.1 Client - ZFS, build #3314:
https://testing.hpdd.intel.com/test_sets/e6636600-cb88-11e5-b49e-5254006e85c2

Comment by Sarah Liu [ 12/May/16 ]

Closing as a duplicate of LU-3786.
