[LU-2408] Interop 2.3<->2.4 Failure on test suite racer test_1: dir_create.sh and dd hang
Created: 29/Nov/12  Updated: 30/Nov/12  Resolved: 30/Nov/12

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Maloo
Assignee: WC Triage
Resolution: Duplicate
Votes: 0
Labels: None
Environment:

server: 2.3 RHEL6
client: lustre-master build# 1065 RHEL6


Severity: 3
Rank (Obsolete): 5714

 Description   

This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/a7d94d7a-3980-11e2-9fda-52540035b04c.

The sub-test test_1 failed with the following error:

test failed to respond and timed out

The client console shows:

02:19:29:INFO: task dir_create.sh:32643 blocked for more than 120 seconds.
02:19:30:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
02:19:30:dir_create.sh D 0000000000000000     0 32643  32631 0x00000080
02:19:30: ffff88007adb5cc8 0000000000000086 ffff880075710ef8 ffff880075710e00
02:19:30: ffff880000000030 ffff88007adb5d20 ffff88007adb5cd8 0000000000000000
02:19:30: ffff88007bc385f8 ffff88007adb5fd8 000000000000fb88 ffff88007bc385f8
02:19:30:Call Trace:
02:19:30: [<ffffffff814fefbe>] __mutex_lock_slowpath+0x13e/0x180
02:19:30: [<ffffffff814fee5b>] mutex_lock+0x2b/0x50
02:19:30: [<ffffffff81179998>] do_truncate+0x58/0xa0
02:19:30: [<ffffffff81188461>] ? path_put+0x31/0x40
02:19:30: [<ffffffff8118c399>] do_filp_open+0x829/0xd60
02:19:30: [<ffffffff81014ef9>] ? init_fpu+0x59/0xc0
02:19:30: [<ffffffff811982b2>] ? alloc_fd+0x92/0x160
02:19:30: [<ffffffff81178769>] do_sys_open+0x69/0x140
02:19:30: [<ffffffff81178880>] sys_open+0x20/0x30
02:19:30: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
02:19:30:INFO: task dir_create.sh:32661 blocked for more than 120 seconds.
02:19:30:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
02:19:30:dir_create.sh D 0000000000000000     0 32661  32631 0x00000080
02:19:30: ffff8800783c9928 0000000000000082 ffff8800783c98d8 ffff880074cd0510
02:19:30: 0000000000000286 0000000000000003 0000000000000001 0000000000000286
02:19:30: ffff8800783c7ab8 ffff8800783c9fd8 000000000000fb88 ffff8800783c7ab8
02:19:30:Call Trace:
02:19:30: [<ffffffffa0b1c7da>] ? cfs_waitq_signal+0x1a/0x20 [libcfs]
02:19:30: [<ffffffff814fe7d5>] schedule_timeout+0x215/0x2e0
02:19:30: [<ffffffffa0651fbc>] ? ptlrpc_request_bufs_pack+0x5c/0x80 [ptlrpc]
02:19:30: [<ffffffffa0667c30>] ? lustre_swab_ost_body+0x0/0x10 [ptlrpc]
02:19:30: [<ffffffff814fe453>] wait_for_common+0x123/0x180
02:19:30: [<ffffffff81060250>] ? default_wake_function+0x0/0x20
02:19:30: [<ffffffff814fe56d>] wait_for_completion+0x1d/0x20
02:19:30: [<ffffffffa07ef1d4>] osc_io_setattr_end+0xc4/0x1a0 [osc]
02:19:30: [<ffffffffa0871c50>] ? lov_io_end_wrapper+0x0/0x100 [lov]
02:19:30: [<ffffffffa055aa80>] cl_io_end+0x60/0x150 [obdclass]
02:19:30: [<ffffffffa055b350>] ? cl_io_start+0x0/0x140 [obdclass]
02:19:30: [<ffffffffa0871d41>] lov_io_end_wrapper+0xf1/0x100 [lov]
02:19:30: [<ffffffffa08717ce>] lov_io_call+0x8e/0x130 [lov]
02:19:30: [<ffffffffa087342c>] lov_io_end+0x4c/0x110 [lov]
02:19:31: [<ffffffffa055aa80>] cl_io_end+0x60/0x150 [obdclass]
02:19:31: [<ffffffffa055fce2>] cl_io_loop+0xc2/0x1b0 [obdclass]
02:19:31: [<ffffffffa0986148>] cl_setattr_ost+0x208/0x2d0 [lustre]
02:19:31: [<ffffffffa0955472>] ll_setattr_raw+0x752/0xfd0 [lustre]
02:19:31: [<ffffffffa0955d4b>] ll_setattr+0x5b/0xf0 [lustre]
02:19:31: [<ffffffff81197368>] notify_change+0x168/0x340
02:19:31: [<ffffffff811799a4>] do_truncate+0x64/0xa0
02:19:31: [<ffffffff8118c399>] do_filp_open+0x829/0xd60
02:19:31: [<ffffffff81014ef9>] ? init_fpu+0x59/0xc0
02:19:31: [<ffffffff811982b2>] ? alloc_fd+0x92/0x160
02:19:31: [<ffffffff81178769>] do_sys_open+0x69/0x140
02:19:31: [<ffffffff81178880>] sys_open+0x20/0x30
02:19:31: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
02:19:31:INFO: task dd:344 blocked for more than 120 seconds.
02:19:31:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
02:19:31:dd            D 0000000000000000     0   344  32660 0x00000080
02:19:31: ffff88007ba159c8 0000000000000082 ffff88007ba159b8 ffff880079276d10
02:19:31: ffffffffa0569330 ffff88007af3cac0 00000000000006c5 ffff8800792b2fa8
02:19:31: ffff8800755f1058 ffff88007ba15fd8 000000000000fb88 ffff8800755f1058
02:19:31:Call Trace:
02:19:31: [<ffffffff814fefbe>] __mutex_lock_slowpath+0x13e/0x180
02:19:31: [<ffffffff814fee5b>] mutex_lock+0x2b/0x50
02:19:31: [<ffffffffa0556837>] cl_lock_mutex_get+0x77/0xe0 [obdclass]
02:19:31: [<ffffffffa0558df8>] cl_lock_enclosure+0x188/0x220 [obdclass]
02:19:31: [<ffffffffa0558ed3>] cl_lock_closure_build+0x43/0x150 [obdclass]
02:19:31: [<ffffffffa086e4ab>] lov_sublock_lock+0x6b/0x2f0 [lov]
02:19:31: [<ffffffffa086ed78>] lov_lock_use+0x98/0x360 [lov]
02:19:31: [<ffffffffa05581b5>] cl_use_try+0x175/0x300 [obdclass]
02:19:31: [<ffffffffa055849d>] cl_enqueue_try+0x15d/0x310 [obdclass]
02:19:31: [<ffffffffa05598fd>] cl_enqueue_locked+0x6d/0x210 [obdclass]
02:19:31: [<ffffffffa055a5de>] cl_lock_request+0x7e/0x280 [obdclass]
02:19:31: [<ffffffffa055fa46>] cl_io_lock+0x3d6/0x5b0 [obdclass]
02:19:31: [<ffffffffa055fcc2>] cl_io_loop+0xa2/0x1b0 [obdclass]
02:19:31: [<ffffffffa093988b>] ll_file_io_generic+0x42b/0x550 [lustre]
02:19:31: [<ffffffffa093a78c>] ll_file_aio_write+0x13c/0x2c0 [lustre]
02:19:31: [<ffffffffa093aa79>] ll_file_write+0x169/0x2a0 [lustre]
02:19:31: [<ffffffff8117b198>] vfs_write+0xb8/0x1a0
02:19:31: [<ffffffff810d6b12>] ? audit_syscall_entry+0x272/0x2a0
02:19:31: [<ffffffff8117bbb1>] sys_write+0x51/0x90
02:19:31: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
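
For triage, the hung-task reports above can be reduced to one entry per blocked task with its reliable stack frames. The following is a minimal sketch and not part of the original report: the names `summarize_hung_tasks`, `TASK_RE`, `FRAME_RE` and `SAMPLE` are hypothetical, and the patterns assume the exact console format shown in this log. Frames marked with `?` are skipped because the kernel prints that marker for entries it could not verify as part of the call chain.

```python
import re

# Hung-task header, e.g.
# "02:19:29:INFO: task dir_create.sh:32643 blocked for more than 120 seconds."
TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than \d+ seconds"
)

# Stack frame, e.g. " [<ffffffffa0556837>] cl_lock_mutex_get+0x77/0xe0 [obdclass]".
# An optional "? " before the symbol marks an unreliable frame.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<unreliable>\?\s+)?(?P<sym>[\w.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(?P<mod>\w+)\])?"
)


def summarize_hung_tasks(log_text):
    """Return one dict per blocked task: comm, pid, and reliable frames."""
    tasks = []
    current = None
    for line in log_text.splitlines():
        header = TASK_RE.search(line)
        if header:
            current = {
                "comm": header.group("comm"),
                "pid": int(header.group("pid")),
                "frames": [],
            }
            tasks.append(current)
            continue
        frame = FRAME_RE.search(line)
        # Ignore frames seen before any header and frames flagged with "?".
        if frame and current is not None and not frame.group("unreliable"):
            current["frames"].append((frame.group("sym"), frame.group("mod")))
    return tasks


# A few lines copied from the console log in this report.
SAMPLE = """\
02:19:29:INFO: task dir_create.sh:32643 blocked for more than 120 seconds.
02:19:30:Call Trace:
02:19:30: [<ffffffff814fefbe>] __mutex_lock_slowpath+0x13e/0x180
02:19:30: [<ffffffff814fee5b>] mutex_lock+0x2b/0x50
02:19:30: [<ffffffff81179998>] do_truncate+0x58/0xa0
02:19:30: [<ffffffff81188461>] ? path_put+0x31/0x40
02:19:31:INFO: task dd:344 blocked for more than 120 seconds.
02:19:31:Call Trace:
02:19:31: [<ffffffff814fefbe>] __mutex_lock_slowpath+0x13e/0x180
02:19:31: [<ffffffff814fee5b>] mutex_lock+0x2b/0x50
02:19:31: [<ffffffffa0556837>] cl_lock_mutex_get+0x77/0xe0 [obdclass]
"""

if __name__ == "__main__":
    for task in summarize_hung_tasks(SAMPLE):
        top = " <- ".join(sym for sym, _ in task["frames"][:3])
        print(f'{task["comm"]} (pid {task["pid"]}): {top}')
```

Run against the full console log, this lists each blocked task next to its innermost frames, which makes it easier to compare this report with the traces in LU-2406.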


 Comments   
Comment by Peter Jones [ 30/Nov/12 ]

Duplicate of LU-2406.
