[LU-10659] racer test_1: test_1 failed with 2
Created: 12/Feb/18  Updated: 03/Apr/18

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.11.0, Lustre 2.10.4
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Bob Glossman (Inactive)
Assignee: WC Triage
Resolution: Unresolved
Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

racer test_1 - test_1 failed with 2
^^^^^^^^^^^^^ DO NOT REMOVE LINE ABOVE ^^^^^^^^^^^^^

This issue was created by maloo for Bob Glossman <bob.glossman@intel.com>

Seen in https://testing.hpdd.intel.com/test_sets/b86555bc-101c-11e8-a7cd-52540065bddc
The run ended with FAIL status, not TIMEOUT.

This issue relates to the following test suite run:

test_1 failed with the following error:

test_1 failed with 2

Stuck service threads were seen on the MDT. From the dmesg log:

[ 6964.411076] Pid: 9977, comm: mdt00_016
[ 6964.412630] 
Call Trace:
[ 6964.415633]  [<ffffffff816ab6b9>] schedule+0x29/0x70
[ 6964.417759]  [<ffffffff816a9004>] schedule_timeout+0x174/0x2c0
[ 6964.419672]  [<ffffffff8109a6c0>] ? process_timeout+0x0/0x10
[ 6964.421734]  [<ffffffffc068a3c1>] ? cfs_block_sigsinv+0x71/0xa0 [libcfs]
[ 6964.423452]  [<ffffffffc0a98120>] ? ldlm_expired_completion_wait+0x0/0x370 [ptlrpc]
[ 6964.425155]  [<ffffffffc0a98a31>] ldlm_completion_ast+0x5a1/0x910 [ptlrpc]
[ 6964.426795]  [<ffffffff810c6440>] ? default_wake_function+0x0/0x20
[ 6964.428349]  [<ffffffffc0a9a340>] ldlm_cli_enqueue_local+0x230/0x960 [ptlrpc]
[ 6964.429938]  [<ffffffffc0a98490>] ? ldlm_completion_ast+0x0/0x910 [ptlrpc]
[ 6964.431681]  [<ffffffffc0fcad40>] ? mdt_blocking_ast+0x0/0x2a0 [mdt]
[ 6964.433272]  [<ffffffffc0fdda92>] mdt_object_local_lock.isra.67+0x202/0xad0 [mdt]
[ 6964.434970]  [<ffffffffc0fcad40>] ? mdt_blocking_ast+0x0/0x2a0 [mdt]
[ 6964.436620]  [<ffffffffc0a98490>] ? ldlm_completion_ast+0x0/0x910 [ptlrpc]
[ 6964.438309]  [<ffffffffc0acbc87>] ? lustre_msg_buf+0x17/0x60 [ptlrpc]
[ 6964.439844]  [<ffffffffc0fde3bb>] mdt_object_lock_internal+0x5b/0x350 [mdt]
[ 6964.441464]  [<ffffffffc0fdf1c6>] mdt_getattr_name_lock+0x8e6/0x1c70 [mdt]
[ 6964.443064]  [<ffffffffc0fe72a3>] ? ucred_set_jobid+0x53/0x70 [mdt]
[ 6964.444682]  [<ffffffffc0acbf3c>] ? lustre_msg_get_flags+0x2c/0xa0 [ptlrpc]
[ 6964.446290]  [<ffffffffc0fe0800>] mdt_intent_getattr+0x2b0/0x480 [mdt]
[ 6964.447936]  [<ffffffffc0fdca75>] mdt_intent_opc+0x215/0xa40 [mdt]
[ 6964.449578]  [<ffffffffc0fe4838>] mdt_intent_policy+0x138/0x320 [mdt]
[ 6964.451285]  [<ffffffffc0a7f2f7>] ldlm_lock_enqueue+0x357/0x9c0 [ptlrpc]
[ 6964.452910]  [<ffffffffc0aa7bd2>] ldlm_handle_enqueue0+0x4f2/0x1720 [ptlrpc]
[ 6964.454496]  [<ffffffffc0ad03a0>] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc]
[ 6964.456243]  [<ffffffffc0b334a2>] tgt_enqueue+0x62/0x210 [ptlrpc]
[ 6964.457846]  [<ffffffffc0b3815b>] tgt_request_handle+0x8fb/0x11f0 [ptlrpc]
[ 6964.459520]  [<ffffffffc0adafdb>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[ 6964.461340]  [<ffffffffc0ad8088>] ? ptlrpc_wait_event+0x98/0x330 [ptlrpc]
[ 6964.462994]  [<ffffffff810bc0f8>] ? __wake_up_common+0x58/0x90
[ 6964.464641]  [<ffffffffc0ade99f>] ptlrpc_main+0xc3f/0x1f90 [ptlrpc]
[ 6964.466284]  [<ffffffffc0addd60>] ? ptlrpc_main+0x0/0x1f90 [ptlrpc]
[ 6964.467978]  [<ffffffff810b252f>] kthread+0xcf/0xe0
[ 6964.469488]  [<ffffffff810b2460>] ? kthread+0x0/0xe0
[ 6964.470954]  [<ffffffff816b8798>] ret_from_fork+0x58/0x90
[ 6964.472371]  [<ffffffff810b2460>] ? kthread+0x0/0xe0

[ 6964.474903] LNet: Service thread pid 9967 was inactive for 62.33s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[ 6965.216034] LNet: Service thread pid 9987 was inactive for 62.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
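
For triage, the watchdog lines above can be summarized with a small script. This is only a minimal sketch: the names `WATCHDOG_RE` and `parse_watchdog` are hypothetical helpers, not part of Lustre or its test framework, and the pattern assumes the message wording is exactly as it appears in this dmesg excerpt.

```python
import re

# Assumption: the watchdog message format matches the two lines quoted in
# this ticket; other Lustre versions may word the message differently.
WATCHDOG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+LNet: Service thread pid (?P<pid>\d+) "
    r"was inactive for (?P<secs>\d+(?:\.\d+)?)s"
)


def parse_watchdog(lines):
    """Return (dmesg_timestamp, pid, inactive_seconds) for each watchdog line."""
    hits = []
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            hits.append(
                (
                    float(match.group("ts")),
                    int(match.group("pid")),
                    float(match.group("secs")),
                )
            )
    return hits


# The two watchdog lines copied from the dmesg excerpt in this ticket.
sample = [
    "[ 6964.474903] LNet: Service thread pid 9967 was inactive for 62.33s. "
    "Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.",
    "[ 6965.216034] LNet: Service thread pid 9987 was inactive for 62.00s. "
    "Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.",
]

for ts, pid, secs in parse_watchdog(sample):
    print(f"pid {pid} inactive for {secs:.2f}s at dmesg time {ts:.6f}")
```

Feeding a full dmesg capture through `parse_watchdog` should list every service thread the watchdog flagged, including those whose stack traces were skipped by the rate limit.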


 Comments   
Comment by Minh Diep [ 26/Feb/18 ]

+1 on b2_10

https://testing.hpdd.intel.com/test_sets/aaa1317a-1685-11e8-bd00-52540065bddc
