[LU-5575] Failure on test suite replay-ost-single test_5 Created: 03/Sep/14  Updated: 26/Apr/19  Resolved: 12/Sep/14

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Maloo Assignee: WC Triage
Resolution: Duplicate Votes: 0
Labels: zfs
Environment:

server and client: lustre-master build 2639 zfs


Issue Links:
Duplicate
is duplicated by LU-4950 sanity-benchmark test fsx hung: txg_s... Closed
Related
is related to LU-4950 sanity-benchmark test fsx hung: txg_s... Closed
is related to LU-5214 Failure on test suite replay-ost-sing... Closed
is related to LU-12234 sanity-benchmark test iozone hangs in... Open
Severity: 3
Rank (Obsolete): 15555

Description

This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/68416460-30f3-11e4-b503-5254006e85c2.

The sub-test test_5 failed with the following error:

test failed to respond and timed out

The OST dmesg shows processes stuck in D (uninterruptible sleep) state; not sure whether this is a duplicate of LU-5214:

ll_ost_io00_0 D 0000000000000001     0 11233      2 0x00000080
 ffff88006bfb7810 0000000000000046 000000012061e300 0000000000000001
 0000000000000300 0000000000000082 ffff88006bfb77a0 ffff88005c7ebd80
 ffff88007bf75af8 ffff88006bfb7fd8 000000000000fbc8 ffff88007bf75af8
Call Trace:
 [<ffffffff8109b1ee>] ? prepare_to_wait_exclusive+0x4e/0x80
 [<ffffffffa014347d>] cv_wait_common+0xed/0x100 [spl]
 [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa01434e5>] __cv_wait+0x15/0x20 [spl]
 [<ffffffffa0242f7b>] txg_wait_open+0x7b/0xa0 [zfs]
 [<ffffffffa020a95e>] dmu_tx_wait+0x29e/0x2b0 [zfs]
 [<ffffffff8152aabe>] ? mutex_lock+0x1e/0x50
 [<ffffffffa020aa01>] dmu_tx_assign+0x91/0x490 [zfs]
 [<ffffffffa0a9a54d>] osd_trans_start+0xed/0x430 [osd_zfs]
 [<ffffffffa0b8cf4c>] ofd_trans_start+0x7c/0x100 [ofd]
 [<ffffffffa0b947c3>] ofd_commitrw_write+0x543/0x1060 [ofd]
 [<ffffffffa0b958a3>] ofd_commitrw+0x5c3/0xae0 [ofd]
 [<ffffffffa1449341>] ? lprocfs_counter_add+0x151/0x1c0 [obdclass]
 [<ffffffffa0751efd>] obd_commitrw.clone.0+0x11d/0x390 [ptlrpc]
 [<ffffffffa07591bc>] tgt_brw_write+0xc7c/0x1530 [ptlrpc]
 [<ffffffffa06b37d0>] ? target_bulk_timeout+0x0/0xc0 [ptlrpc]
 [<ffffffffa075814e>] tgt_request_handle+0x71e/0xb10 [ptlrpc]
 [<ffffffffa0707174>] ptlrpc_main+0xe64/0x1990 [ptlrpc]
 [<ffffffffa0706310>] ? ptlrpc_main+0x0/0x1990 [ptlrpc]
 [<ffffffff8109abf6>] kthread+0x96/0xa0
 [<ffffffff8100c20a>] child_rip+0xa/0x20
 [<ffffffff8109ab60>] ? kthread+0x0/0xa0
 [<ffffffff8100c200>] ? child_rip+0x0/0x20

...

txg_sync      D 0000000000000000     0 20168      2 0x00000080
 ffff88001e775ba0 0000000000000046 00000000ffffffff 00005b2df8907ec2
 ffff88001e775b10 ffff880068007ad0 0000000000cbcf9a ffffffffabfe3aa2
 ffff88006a2adaf8 ffff88001e775fd8 000000000000fbc8 ffff88006a2adaf8
Call Trace:
 [<ffffffff810a6d31>] ? ktime_get_ts+0xb1/0xf0
 [<ffffffff81529c53>] io_schedule+0x73/0xc0
 [<ffffffffa014341c>] cv_wait_common+0x8c/0x100 [spl]
 [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa01434a8>] __cv_wait_io+0x18/0x20 [spl]
 [<ffffffffa02890ab>] zio_wait+0xfb/0x1b0 [zfs]
 [<ffffffffa021e9e3>] dsl_pool_sync+0xb3/0x3f0 [zfs]
 [<ffffffffa0236e4b>] spa_sync+0x40b/0xa60 [zfs]
 [<ffffffffa0243916>] txg_sync_thread+0x2e6/0x510 [zfs]
 [<ffffffff810591a9>] ? set_user_nice+0xc9/0x130
 [<ffffffffa0243630>] ? txg_sync_thread+0x0/0x510 [zfs]
 [<ffffffffa013ec2f>] thread_generic_wrapper+0x5f/0x70 [spl]
 [<ffffffffa013ebd0>] ? thread_generic_wrapper+0x0/0x70 [spl]
 [<ffffffff8109abf6>] kthread+0x96/0xa0
 [<ffffffff8100c20a>] child_rip+0xa/0x20
 [<ffffffff8109ab60>] ? kthread+0x0/0xa0
 [<ffffffff8100c200>] ? child_rip+0x0/0x20
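In the traces above, ll_ost_io00_0 is waiting in txg_wait_open while txg_sync is itself waiting in zio_wait. When comparing dumps like this across tickets, it can help to pull out the D-state tasks and their reliable frames automatically. The following is a minimal sketch, not part of Lustre or Maloo tooling: the names blocked_tasks, TASK_RE, FRAME_RE and SAMPLE are hypothetical, and the regular expressions assume the exact dump layout seen in this ticket (frames marked "?" are treated as unreliable and skipped).

```python
import re

# Task header as printed in this ticket:
#   <comm> <state> <16 hex digits> <n> <pid> <ppid> <flags>
TASK_RE = re.compile(
    r"^(?P<comm>\S+)\s+(?P<state>[A-Z])\s+[0-9a-f]{16}\s+\d+\s+"
    r"(?P<pid>\d+)\s+\d+\s+0x[0-9a-f]+$"
)

# Stack frame as printed in this ticket:
#   [<addr>] [? ]func+0xoff/0xsize [module]
FRAME_RE = re.compile(
    r"^\s*\[<[0-9a-f]+>\]\s+(?P<guess>\?\s+)?(?P<func>[\w.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(?P<mod>\w+)\])?\s*$"
)


def blocked_tasks(dmesg_text):
    """Return D-state tasks with their non-'?' frames as (func, module)."""
    tasks = []
    current = None
    for line in dmesg_text.splitlines():
        header = TASK_RE.match(line.strip())
        if header:
            current = {
                "comm": header["comm"],
                "pid": int(header["pid"]),
                "state": header["state"],
                "frames": [],
            }
            tasks.append(current)
            continue
        frame = FRAME_RE.match(line)
        # Skip frames flagged "?" (stale stack contents, not a sure caller).
        if frame and current is not None and not frame["guess"]:
            current["frames"].append((frame["func"], frame["mod"]))
    return [t for t in tasks if t["state"] == "D"]


# Abbreviated excerpt of the dump in this ticket, used only as sample input.
SAMPLE = """\
ll_ost_io00_0 D 0000000000000001     0 11233      2 0x00000080
Call Trace:
 [<ffffffff8109b1ee>] ? prepare_to_wait_exclusive+0x4e/0x80
 [<ffffffffa014347d>] cv_wait_common+0xed/0x100 [spl]
 [<ffffffffa0242f7b>] txg_wait_open+0x7b/0xa0 [zfs]
 [<ffffffffa020aa01>] dmu_tx_assign+0x91/0x490 [zfs]
txg_sync      D 0000000000000000     0 20168      2 0x00000080
Call Trace:
 [<ffffffff81529c53>] io_schedule+0x73/0xc0
 [<ffffffffa02890ab>] zio_wait+0xfb/0x1b0 [zfs]
 [<ffffffffa0236e4b>] spa_sync+0x40b/0xa60 [zfs]
"""

if __name__ == "__main__":
    for task in blocked_tasks(SAMPLE):
        chain = " <- ".join(func for func, _ in task["frames"])
        print(f'{task["comm"]} (pid {task["pid"]}): {chain}')
```

Run against a full console log, this gives one line per blocked task, which makes it easier to check whether two reports (for example this one and LU-4950) show the same pairing of a writer in txg_wait_open and txg_sync in zio_wait.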
