[LU-6555] sanity-hsm test_13, test_15 timeout

Created: 01/May/15
Updated: 11/May/15
Resolved: 11/May/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: Lustre 2.8.0

Type: Bug
Priority: Blocker
Reporter: Jian Yu
Assignee: WC Triage
Resolution: Duplicate
Votes: 0
Labels: None
Environment:

Lustre build: https://build.hpdd.intel.com/job/lustre-master/3005
Distro/Arch: RHEL6.6/x86_64 (server), RHEL7.1/x86_64 (client)


Issue Links:
Duplicate
is duplicated by LU-6559 sanity-hsm test_15: rebind list of fi... Resolved
Severity: 3

Description

sanity-hsm test_13 hung while removing the HSM archive directory on the client:

CMD: shadow-22vm4 /usr/sbin/lctl set_param mdt.lustre-MDT0000.hsm_control=enabled
mdt.lustre-MDT0000.hsm_control=enabled
CMD: shadow-22vm4 /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm_control
CMD: shadow-22vm4 /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm_control
CMD: shadow-22vm5 rm -rf /home/autotest2/.autotest/shared_dir/2015-04-29/161546-69851862854760/arc1/*

On client shadow-22vm5, the rm process was blocked (state D) waiting on an NFSv3 rmdir RPC:

rm              D ffff88007fc13680     0 22144  22143 0x00000080
 ffff88007b35bba8 0000000000000086 ffff88007b35bfd8 0000000000013680
 ffff88007b35bfd8 0000000000013680 ffff880036834440 ffff880036834440
 ffff88007ff66c58 0000000000000082 ffffffffa03f9b10 ffff88007b35bc20
Call Trace:
 [<ffffffffa03f9b10>] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
 [<ffffffff81609e29>] schedule+0x29/0x70
 [<ffffffffa03f9b45>] rpc_wait_bit_killable+0x35/0x90 [sunrpc]
 [<ffffffff81607ef0>] __wait_on_bit+0x60/0x90
 [<ffffffffa03f9b10>] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
 [<ffffffff81607fa7>] out_of_line_wait_on_bit+0x87/0xb0
 [<ffffffff81098380>] ? autoremove_wake_function+0x40/0x40
 [<ffffffffa03f0610>] ? call_decode+0x870/0x870 [sunrpc]
 [<ffffffffa03f0610>] ? call_decode+0x870/0x870 [sunrpc]
 [<ffffffffa03faa44>] __rpc_execute+0x154/0x420 [sunrpc]
 [<ffffffff81098335>] ? wake_up_bit+0x25/0x30
 [<ffffffffa03fc11e>] rpc_execute+0x5e/0xa0 [sunrpc]
 [<ffffffffa03f2200>] rpc_run_task+0x70/0x90 [sunrpc]
 [<ffffffffa03f2270>] rpc_call_sync+0x50/0xc0 [sunrpc]
 [<ffffffffa053d473>] nfs3_rpc_wrapper.constprop.9+0x73/0xb0 [nfsv3]
 [<ffffffffa053dd10>] nfs3_proc_rmdir+0x90/0x110 [nfsv3]
 [<ffffffffa04840cf>] nfs_rmdir+0x5f/0x190 [nfs]
 [<ffffffff811d3318>] vfs_rmdir+0xa8/0x100
 [<ffffffff811d7585>] do_rmdir+0x1a5/0x200
 [<ffffffff811c8b0e>] ? ____fput+0xe/0x10
 [<ffffffff81093c5c>] ? task_work_run+0xac/0xe0
 [<ffffffff81013b6c>] ? do_notify_resume+0x9c/0xb0
 [<ffffffff811d8605>] SyS_unlinkat+0x25/0x40
 [<ffffffff81614a29>] system_call_fastpath+0x16/0x1b
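
A trace like the one above can be summarized mechanically to see which kernel modules the blocked task is actually waiting in. The following is a minimal sketch, assuming the frame format shown in this ticket; the helper names `reliable_frames` and `modules_seen` are hypothetical, and the label "vmlinux" for frames without a module tag is a convention chosen here. Frames marked with "?" are skipped because the kernel prints them as unverified stack entries.

```python
import re

# A few frames copied from the trace above.
TRACE = """\
 [<ffffffffa03f9b10>] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
 [<ffffffff81609e29>] schedule+0x29/0x70
 [<ffffffffa03f9b45>] rpc_wait_bit_killable+0x35/0x90 [sunrpc]
 [<ffffffffa03f2270>] rpc_call_sync+0x50/0xc0 [sunrpc]
 [<ffffffffa053d473>] nfs3_rpc_wrapper.constprop.9+0x73/0xb0 [nfsv3]
 [<ffffffffa053dd10>] nfs3_proc_rmdir+0x90/0x110 [nfsv3]
 [<ffffffffa04840cf>] nfs_rmdir+0x5f/0x190 [nfs]
 [<ffffffff811d3318>] vfs_rmdir+0xa8/0x100
"""

# [<address>] [? ]function+0xoffset/0xsize [module]
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<guess>\?\s+)?"
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def reliable_frames(trace):
    """Return (function, module) pairs, innermost first, skipping '?' frames."""
    frames = []
    for line in trace.splitlines():
        m = FRAME_RE.search(line)
        if m is None or m.group("guess"):
            continue
        # Frames without a [module] tag belong to the core kernel image.
        frames.append((m.group("func"), m.group("module") or "vmlinux"))
    return frames


def modules_seen(frames):
    """Return module names in order of first appearance."""
    seen = []
    for _, module in frames:
        if module not in seen:
            seen.append(module)
    return seen


if __name__ == "__main__":
    for func, module in reliable_frames(TRACE):
        print(f"{module:10} {func}")
```

Applied to the frames above, the reliable call chain runs from `vfs_rmdir` through `nfs_rmdir` and `nfs3_proc_rmdir` into the sunrpc wait, which matches the reading that the hang is in NFS rather than in Lustre code.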

Maloo report: https://testing.hpdd.intel.com/test_sets/ecc86158-efe2-11e4-96a8-5254006e85c2



Comments
Comment by Jian Yu [ 01/May/15 ]

More instances on the master branch:
https://testing.hpdd.intel.com/test_sets/915cc3aa-f004-11e4-96a8-5254006e85c2
https://testing.hpdd.intel.com/test_sets/47e62942-eff4-11e4-96a8-5254006e85c2
https://testing.hpdd.intel.com/test_sets/b2ae2b04-eedb-11e4-b1e4-5254006e85c2
https://testing.hpdd.intel.com/test_sets/41163dd2-ee1e-11e4-848f-5254006e85c2
https://testing.hpdd.intel.com/test_sets/66adf0fe-ed19-11e4-94b4-5254006e85c2
https://testing.hpdd.intel.com/test_sets/0d065378-ed1b-11e4-bca3-5254006e85c2

Comment by Andreas Dilger [ 11/May/15 ]

This was fixed via patch http://review.whamcloud.com/14659 "LU-6559 test: use local tmp for HSM archive".
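
The patch itself is not reproduced in this ticket. As an illustration of the idea in its title only, the following is a minimal sketch, assuming a Linux client: it looks up the filesystem type of a candidate archive directory in mount-table text (the `/proc/mounts` format) and falls back to a local temporary directory when the candidate sits on NFS. The function names, the `hsm-arc-` prefix, and the sample mount table are hypothetical, not taken from the Lustre test framework.

```python
import os
import tempfile

# Filesystem types treated as network-backed in this sketch.
NETWORK_FS = {"nfs", "nfs4"}

# Hypothetical mount table in /proc/mounts format, for illustration only.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw 0 0
server:/export /home/autotest2 nfs rw 0 0
"""


def fs_type(path, mounts_text):
    """Return the filesystem type of the longest mount point containing path."""
    path = os.path.abspath(path)
    best_mnt, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mnt, fstype = fields[1], fields[2]
        prefix = mnt.rstrip("/") + "/"
        if (path == mnt or path.startswith(prefix)) and len(mnt) > len(best_mnt):
            best_mnt, best_type = mnt, fstype
    return best_type


def choose_archive_dir(preferred, mounts_text):
    """Keep preferred unless it is on a network filesystem; then use local tmp."""
    if fs_type(preferred, mounts_text) in NETWORK_FS:
        return tempfile.mkdtemp(prefix="hsm-arc-")
    return preferred
```

On a real client the mount table would be read with `open("/proc/mounts").read()`. The sketch ignores escaped spaces in mount points and stacked mounts on the same path.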
