[LU-895] 1.8<->2.2 interop: test connectathon hang Created: 04/Dec/11  Updated: 29/May/17  Resolved: 29/May/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Sarah Liu Assignee: WC Triage
Resolution: Cannot Reproduce Votes: 0
Labels: None
Environment:

server: lustre-master build #353 RHEL6-x86_64
client: 1.8.6-wc1


Attachments: File stack_trace    
Severity: 3
Rank (Obsolete): 10341

 Description   

When running the lock test of parallel-scale test_connectathon, the system hangs. The MDS trace is in the attached stack_trace file. This issue is reproducible.

client:
-------------------------------------------------------------------------
root 6347 0.0 0.0 0 0 ? S 22:14 0:00 [ldlm_bl_03]
root 6535 0.0 0.0 107324 2144 ttyS0 S+ 22:16 0:00 bash /usr/lib64/lustre/tests/parallel-scale.sh
root 6536 0.0 0.0 100896 640 ttyS0 S+ 22:16 0:00 tee /tmp/test_logs/2011-12-03/210437/parallel-scale.test_con
root 10965 0.0 0.0 107324 2192 ttyS0 S+ 22:18 0:00 bash /usr/lib64/lustre/tests/parallel-scale.sh
root 10967 0.0 0.0 106088 1344 ttyS0 S+ 22:18 0:00 sh runtests -f
root 10974 0.0 0.0 6672 576 ttyS0 S+ 22:18 0:00 tlocklfs /mnt/lustre/d0.connectathon
root 10975 0.0 0.0 6508 316 ttyS0 S+ 22:18 0:00 tlocklfs /mnt/lustre/d0.connectathon

client trace:
--------------------------------------------------------------------------
tlocklfs S 0000000000000001 0 10974 10967 0x00000080
ffff8802b9987ca8 0000000000000086 0000000000000000 0000000000000082
0000000000000001 ffff8802b1ac5cb8 0000000000000000 0000000100518209
ffff88031f8dd0b8 ffff8802b9987fd8 000000000000f598 ffff88031f8dd0b8
Call Trace:
[<ffffffff8117bf7b>] pipe_wait+0x5b/0x80
[<ffffffff8108e100>] ? autoremove_wake_function+0x0/0x40
[<ffffffff814dbc1e>] ? mutex_lock+0x1e/0x50
[<ffffffff8117c9d6>] pipe_read+0x3e6/0x4e0
[<ffffffff811723ea>] do_sync_read+0xfa/0x140
[<ffffffff8108e100>] ? autoremove_wake_function+0x0/0x40
[<ffffffff811bc395>] ? fcntl_setlk+0x75/0x320
[<ffffffff81204ef6>] ? security_file_permission+0x16/0x20
[<ffffffff81172e15>] vfs_read+0xb5/0x1a0
[<ffffffff810d1ac2>] ? audit_syscall_entry+0x272/0x2a0
[<ffffffff81172f51>] sys_read+0x51/0x90
[<ffffffff8100b172>] system_call_fastpath+0x16/0x1b
tlocklfs S 0000000000000001 0 10975 10974 0x00000080
ffff880247309a98 0000000000000082 0000000000000000 0000020300002adf
0000000000000000 ffffffffa051119b ffff880300000075 000000010051777e
ffff88031a337ab8 ffff880247309fd8 000000000000f598 ffff88031a337ab8
Call Trace:
[<ffffffffa04c300d>] ldlm_flock_completion_ast+0x61d/0x9f0 [ptlrpc]
[<ffffffff8105dc20>] ? default_wake_function+0x0/0x20
[<ffffffffa04b1565>] ldlm_cli_enqueue_fini+0x6c5/0xba0 [ptlrpc]
[<ffffffff8105dc20>] ? default_wake_function+0x0/0x20
[<ffffffffa04b5074>] ldlm_cli_enqueue+0x344/0x7a0 [ptlrpc]
[<ffffffffa06c8edd>] ll_file_flock+0x47d/0x6b0 [lustre]
[<ffffffff81190f40>] ? mntput_no_expire+0x30/0x110
[<ffffffffa04c29f0>] ? ldlm_flock_completion_ast+0x0/0x9f0 [ptlrpc]
[<ffffffff8117f451>] ? path_put+0x31/0x40
[<ffffffff811bc243>] vfs_lock_file+0x23/0x40
[<ffffffff811bc497>] fcntl_setlk+0x177/0x320
[<ffffffff811845f7>] sys_fcntl+0x197/0x530
[<ffffffff8100b172>] system_call_fastpath+0x16/0x1b
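
For readers unfamiliar with the pattern in the traces above: the parent tlocklfs (PID 10974) is blocked in pipe_read, while its child (PID 10975) is blocked under fcntl_setlk in ll_file_flock, waiting in ldlm_flock_completion_ast. The following is a minimal sketch of that parent/child arrangement, not the tlocklfs source. The function name probe_lock_conflict and the use of a local temporary file are assumptions made for illustration, and the child uses a non-blocking lock request so the sketch cannot hang the way the ticket describes.

```python
import fcntl
import os
import tempfile


def probe_lock_conflict() -> str:
    """Parent takes a POSIX lock; a forked child requests the same lock
    and reports the outcome to the parent through a pipe.

    Hypothetical helper for illustration only, not part of any test suite.
    """
    with tempfile.NamedTemporaryFile() as tmp:
        fd = tmp.fileno()
        # Parent holds an exclusive fcntl-style lock on the whole file.
        fcntl.lockf(fd, fcntl.LOCK_EX)

        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: POSIX locks are not inherited across fork, so this
            # request conflicts with the lock held by the parent.
            os.close(read_fd)
            try:
                # LOCK_NB makes the request fail instead of blocking.
                fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                result = b"granted"
            except OSError:
                result = b"denied"
            os.write(write_fd, result)
            os._exit(0)

        # Parent: block on the pipe until the child reports back, which is
        # the same kind of wait seen in the pipe_read trace.
        os.close(write_fd)
        report = os.read(read_fd, 16)
        os.close(read_fd)
        os.waitpid(pid, 0)
        return report.decode()


if __name__ == "__main__":
    print(probe_lock_conflict())
```

If the child instead issued a blocking request and the lock were never granted or released, the child would stay in the lock call and the parent would stay in the pipe read, which matches the shape of the two client traces.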



 Comments   
Comment by Andreas Dilger [ 29/May/17 ]

Close old ticket.

Generated at Sat Feb 10 01:11:27 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.