[LU-2486] Interop 2.3<->2.4 Failure on test suite sanity test_27n: task sync:30724 blocked for more than 120 seconds
Created: 12/Dec/12  Updated: 17/Apr/17  Resolved: 17/Apr/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: None

Type: Bug
Priority: Major
Reporter: Maloo
Assignee: WC Triage
Resolution: Cannot Reproduce
Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 5831

 Description   

This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/331bb160-4413-11e2-8b5c-52540035b04c.

The sub-test test_27n failed with the following error:

test failed to respond and timed out

From the client console:

06:45:05:INFO: task sync:30724 blocked for more than 120 seconds.
06:45:05:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
06:45:05:sync          D 0000000000000000     0 30724  30474 0x00000080
06:45:05: ffff88007a9a7c98 0000000000000082 ffff8800ffffffff 00000fc7e87e52d0
06:45:05: ffff88007a3b3578 ffff88007486f210 0000000000147024 ffffffffaf3a0fb9
06:45:05: ffff88007ba45ab8 ffff88007a9a7fd8 000000000000fb88 ffff88007ba45ab8
06:45:05:Call Trace:
06:45:05: [<ffffffff8109cec9>] ? ktime_get_ts+0xa9/0xe0
06:45:05: [<ffffffff81114420>] ? sync_page+0x0/0x50
06:45:05: [<ffffffff814fe833>] io_schedule+0x73/0xc0
06:45:05: [<ffffffff8111445d>] sync_page+0x3d/0x50
06:45:05: [<ffffffff814ff1ef>] __wait_on_bit+0x5f/0x90
06:45:05: [<ffffffff81114693>] wait_on_page_bit+0x73/0x80
06:45:05: [<ffffffff810921b0>] ? wake_bit_function+0x0/0x50
06:45:06: [<ffffffff8112ab95>] ? pagevec_lookup_tag+0x25/0x40
06:45:06: [<ffffffff81114b0b>] wait_on_page_writeback_range+0xfb/0x190
06:45:06: [<ffffffff814febbc>] ? wait_for_common+0x14c/0x180
06:45:06: [<ffffffff810602c0>] ? default_wake_function+0x0/0x20
06:45:06: [<ffffffff81114bcf>] filemap_fdatawait+0x2f/0x40
06:45:06: [<ffffffff811a4d74>] sync_inodes_sb+0x114/0x190
06:45:06: [<ffffffff811aa812>] __sync_filesystem+0x82/0x90
06:45:06: [<ffffffff811aa918>] sync_filesystems+0xf8/0x130
06:45:06: [<ffffffff811aa9b1>] sys_sync+0x21/0x40
06:45:06: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
06:46:09:Lustre: 2908:0:(client.c:1827:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1355150763/real 1355150763]  req@ffff88007c90f800 x1420977260327144/t0(0) o8->lustre-OST0000-osc-ffff880079bc8800@10.10.4.199@tcp:28/4 lens 400/544 e 0 to 1 dl 1355150788 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
06:46:09:Lustre: 2908:0:(client.c:1827:ptlrpc_expire_one_request()) Skipped 130 previous similar messages
06:47:10:INFO: task sync:30724 blocked for more than 120 seconds.
06:47:10:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
06:47:11:sync          D 0000000000000000     0 30724  30474 0x00000080
06:47:11: ffff88007a9a7c98 0000000000000082 ffff8800ffffffff 00000fc7e87e52d0
06:47:11: ffff88007a3b3578 ffff88007486f210 0000000000147024 ffffffffaf3a0fb9
06:47:11: ffff88007ba45ab8 ffff88007a9a7fd8 000000000000fb88 ffff88007ba45ab8
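
When reading the hung-task trace above, frames prefixed with "?" are ones the kernel's stack unwinder could not verify, so the unmarked frames give the more trustworthy picture of where sync is blocked. The following is a minimal sketch, not part of the original report, of how those frames could be pulled out of a console log. The function name reliable_frames and the regular expression are illustrative assumptions; the sample lines are copied from the trace above.

```python
import re

# Matches a kernel stack frame such as
#   "[<ffffffff814fe833>] io_schedule+0x73/0xc0"
# Group 1 captures the optional "? " marker, group 2 the symbol name.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([A-Za-z0-9_.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def reliable_frames(console_text):
    """Return symbol names of frames not marked with '?', innermost first.

    Hypothetical helper for illustration only.
    """
    frames = []
    for line in console_text.splitlines():
        match = FRAME_RE.search(line)
        if match and match.group(1) is None:
            frames.append(match.group(2))
    return frames


# A few lines copied from the console log in this ticket.
SAMPLE = """\
06:45:05: [<ffffffff8109cec9>] ? ktime_get_ts+0xa9/0xe0
06:45:05: [<ffffffff814fe833>] io_schedule+0x73/0xc0
06:45:05: [<ffffffff8111445d>] sync_page+0x3d/0x50
06:45:06: [<ffffffff81114bcf>] filemap_fdatawait+0x2f/0x40
06:45:06: [<ffffffff811aa9b1>] sys_sync+0x21/0x40
"""

if __name__ == "__main__":
    for name in reliable_frames(SAMPLE):
        print(name)
```

Applied to the full trace, the unmarked frames run from sys_sync down to io_schedule, showing the sync task waiting in filemap_fdatawait for page writeback to complete.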


 Comments   
Comment by Andreas Dilger [ 17/Apr/17 ]

Close old issue.

Generated at Sat Feb 10 01:25:37 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.