[LU-10240] sanity test_117: hung at ofd_create_hdl() Created: 14/Nov/17  Updated: 29/Dec/17

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.11.0
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Maloo Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

Description

This issue was created by maloo for Jinshan Xiong <jinshan.xiong@intel.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/acd3cd62-c92e-11e7-a066-52540065bddc.

The sub-test test_117 failed with the following error:

Timeout occurred after 111 mins, last suite running was sanity, restarting cluster to continue tests


Info required for matching: sanity 117

One OST service thread on the OFD side, ll_ost00_011 (PID 11377), hung while waiting on a mutex in ofd_create_hdl():

[ 3720.141075] INFO: task ll_ost00_011:11377 blocked for more than 120 seconds.
[ 3720.143587] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3720.146029] ll_ost00_011    D ffff88005e6071b0     0 11377      2 0x00000080
[ 3720.148413]  ffff880055e6fb80 0000000000000046 ffff88005cf89fa0 ffff880055e6ffd8
[ 3720.150904]  ffff880055e6ffd8 ffff880055e6ffd8 ffff88005cf89fa0 ffff88005e6071a8
[ 3720.153382]  ffff88005e6071ac ffff88005cf89fa0 00000000ffffffff ffff88005e6071b0
[ 3720.155871] Call Trace:
[ 3720.157939]  [<ffffffff816aa4a9>] schedule_preempt_disabled+0x29/0x70
[ 3720.160334]  [<ffffffff816a83d7>] __mutex_lock_slowpath+0xc7/0x1d0
[ 3720.162685]  [<ffffffff816a77ef>] mutex_lock+0x1f/0x2f
[ 3720.164931]  [<ffffffffc11d190b>] ofd_create_hdl+0xc7b/0x2080 [ofd]
[ 3720.167276]  [<ffffffffc0e8bc67>] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
[ 3720.169631]  [<ffffffffc0e8bfbf>] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc]
[ 3720.171968]  [<ffffffffc0e8c2e1>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[ 3720.174271]  [<ffffffffc0ef3af5>] tgt_request_handle+0x925/0x13b0 [ptlrpc]
[ 3720.176570]  [<ffffffffc0e97d7e>] ptlrpc_server_handle_request+0x24e/0xab0 [ptlrpc]
[ 3720.178938]  [<ffffffffc0e9b522>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[ 3720.181207]  [<ffffffffc0e9aa90>] ? ptlrpc_register_service+0xe80/0xe80 [ptlrpc]
[ 3720.183545]  [<ffffffff810b099f>] kthread+0xcf/0xe0
[ 3720.185699]  [<ffffffff810b08d0>] ? insert_kthread_work+0x40/0x40
[ 3720.188010]  [<ffffffff816b4fd8>] ret_from_fork+0x58/0x90
[ 3720.190217]  [<ffffffff810b08d0>] ? insert_kthread_work+0x40/0x40
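
When comparing this hang against other test runs, a trace like the one above can be summarized mechanically. The following is a minimal Python sketch, not part of Lustre or Maloo: the function name parse_hung_tasks and both regular expressions are assumptions written against the console log format shown in this ticket, and may need adjusting for other kernels. Frames prefixed with "?" are skipped because the kernel marks them as unreliable stack entries.

```python
import re

# Assumed format of the kernel hung-task header line, as seen in this ticket.
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)
# Assumed format of one call-trace frame: [<addr>] [? ]symbol+0xoff/0xsize [module]
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<unreliable>\? )?(?P<sym>[\w.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<mod>\w+)\])?"
)


def parse_hung_tasks(log_text):
    """Return one dict per hung-task report found in log_text.

    Each dict holds the task name, pid, timeout in seconds, and the list of
    reliable frames as (symbol, module) tuples; module is None for frames
    that belong to the core kernel.
    """
    reports = []
    current = None
    for line in log_text.splitlines():
        header = HUNG_RE.search(line)
        if header:
            current = {
                "comm": header.group("comm"),
                "pid": int(header.group("pid")),
                "secs": int(header.group("secs")),
                "frames": [],
            }
            reports.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            if frame.group("unreliable"):
                continue  # "?" frames are stack leftovers, not the call chain
            current["frames"].append((frame.group("sym"), frame.group("mod")))
    return reports
```

Fed the console log from this run, the first Lustre-module frame in the result would be the one to look at; here that is ofd_create_hdl in the ofd module, directly above mutex_lock.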

When I looked at the test case, I found that the fail_loc OBD_FAIL_OST_SETATTR_CREDITS does not appear in the code at all.
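
The check described above can be repeated against any source checkout. This is a minimal Python sketch under stated assumptions: find_symbol is a hypothetical helper, and treating only .c and .h files as "the code" is my assumption, not something the ticket specifies. Widen the extension list (for example to include .sh) to also see where the test scripts mention the name.

```python
import os


def find_symbol(root, symbol, exts=(".c", ".h")):
    """Return (path, line_number, line) for every line under root containing symbol.

    Only files whose names end with one of exts are scanned; the default of
    C sources and headers is an assumption about what counts as "the code".
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(exts):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the scan going past non-UTF-8 bytes.
            with open(path, errors="replace") as fh:
                for lineno, line in enumerate(fh, 1):
                    if symbol in line:
                        hits.append((path, lineno, line.rstrip()))
    return hits
```

An empty result for find_symbol(checkout_dir, "OBD_FAIL_OST_SETATTR_CREDITS"), where checkout_dir is the path to a local source tree, would be consistent with the observation above.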



Comments
Comment by nasf (Inactive) [ 18/Dec/17 ]

+1 on b2_10:
https://testing.hpdd.intel.com/test_sets/2a09eb88-e1e7-11e7-a066-52540065bddc

Comment by Mikhail Pershin [ 22/Dec/17 ]

Hit again on master, in sanity.sh test_52b:
https://testing.hpdd.intel.com/test_sets/b1e87420-e66f-11e7-a066-52540065bddc

Generated at Sat Feb 10 02:33:17 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.