[LU-6461] Interop 2.5.3<->master recovery-small test_9: task ll_ost_io00_0 in D state Created: 13/Apr/15 Updated: 10/Oct/21 Resolved: 10/Oct/21 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None | ||
| Environment: |
server: lustre-master build #2983 |
||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
This issue was created by maloo for sarah <sarah@whamcloud.com>.
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/95224732-dfb8-11e4-b5b0-5254006e85c2.
The sub-test test_9 failed with the following error:
test failed to respond and timed out
OST console:
03:45:52:Lustre: DEBUG MARKER: == recovery-small test 9: pause bulk on OST (bug 1420) == 20:39:30 (1428637170)
03:45:52:Lustre: DEBUG MARKER: lctl set_param fail_loc=0x214
03:45:52:LustreError: 11613:0:(fail.c:132:__cfs_fail_timeout_set()) cfs_fail_timeout id 214 sleeping for 20000000ms
03:45:52:INFO: task ll_ost_io00_002:11613 blocked for more than 120 seconds.
03:45:52: Not tainted 2.6.32-504.12.2.el6_lustre.x86_64 #1
03:45:52:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
03:45:52:ll_ost_io00_0 D 0000000000000001 0 11613 2 0x00000080
03:45:52: ffff88005081bad0 0000000000000046 ffffffffa04bce48 0000000000000000
03:45:52: ffff88005081bb30 0000000061492ae0 ffffffffa04e5110 ffff88007047448e
03:45:52: 0000004e379f4150 ffffffffa04bce3e ffff880061493098 ffff88005081bfd8
03:45:52:Call Trace:
03:45:52: [<ffffffff8152b162>] schedule_timeout+0x192/0x2e0
03:45:52: [<ffffffff810874f0>] ? process_timeout+0x0/0x10
03:45:52: [<ffffffffa04a6141>] __cfs_fail_timeout_set+0xe1/0x160 [libcfs]
03:45:52: [<ffffffffa0f01b27>] cfs_fail_timeout_set.clone.2+0x27/0x40 [ptlrpc]
03:45:52: [<ffffffffa0f0854b>] tgt_brw_write+0x36b/0x1530 [ptlrpc]
03:45:52: [<ffffffffa04a9161>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
03:45:52: [<ffffffffa04a9161>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
03:45:52: [<ffffffffa04a5798>] ? libcfs_log_return+0x28/0x40 [libcfs]
03:45:52: [<ffffffffa0f046cd>] ? tgt_request_preprocess+0x20d/0x1370 [ptlrpc]
03:45:53: [<ffffffffa04a9161>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
03:45:53: [<ffffffffa0f07a9e>] tgt_request_handle+0x8be/0x1000 [ptlrpc]
03:45:53: [<ffffffffa0eb7a51>] ptlrpc_main+0xe41/0x1960 [ptlrpc]
03:45:53: [<ffffffffa0eb6c10>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
03:45:53: [<ffffffff8109e66e>] kthread+0x9e/0xc0
03:45:53: [<ffffffff8100c20a>] child_rip+0xa/0x20
03:45:53: [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
03:45:53: [<ffffffff8100c200>] ? child_rip+0x0/0x20
03:45:53:INFO: task ll_ost_io00_002:11613 blocked for more than 120 seconds.
03:45:53: Not tainted 2.6.32-504.12.2.el6_lustre.x86_64 #1
03:45:54:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. |
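For triage, the two numbers in the console log can be compared directly. The sketch below is illustrative only: it assumes the `ms` suffix in the `cfs_fail_timeout` message really means milliseconds, and the helper names `sleep_seconds` and `hung_threshold_seconds` are hypothetical, not part of Lustre or its test framework. Under that assumption the injected sleep is 20000 seconds, far longer than the 120-second hung-task threshold, which would be consistent with the I/O thread being reported as blocked in D state.

```python
import re

# Two lines copied verbatim from the OST console log in the description.
SLEEP_LINE = ("LustreError: 11613:0:(fail.c:132:__cfs_fail_timeout_set()) "
              "cfs_fail_timeout id 214 sleeping for 20000000ms")
HUNG_LINE = "INFO: task ll_ost_io00_002:11613 blocked for more than 120 seconds."


def sleep_seconds(line):
    """Return the logged fail-timeout sleep in seconds (assumes the unit is ms)."""
    match = re.search(r"sleeping for (\d+)ms", line)
    if match is None:
        raise ValueError("no cfs_fail_timeout sleep found in line")
    return int(match.group(1)) / 1000.0


def hung_threshold_seconds(line):
    """Return the hung-task threshold, in seconds, reported by the kernel."""
    match = re.search(r"blocked for more than (\d+) seconds", line)
    if match is None:
        raise ValueError("no hung-task report found in line")
    return int(match.group(1))


sleep_s = sleep_seconds(SLEEP_LINE)
threshold_s = hung_threshold_seconds(HUNG_LINE)
print("injected sleep (s):", sleep_s)
print("hung-task threshold (s):", threshold_s)
print("sleep exceeds threshold:", sleep_s > threshold_s)
```

The same two regular expressions could be run over a full console log to flag any fail-timeout sleep that outlasts the watchdog interval.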
| Comments |
| Comment by Andreas Dilger [ 21/Apr/15 ] |
|
It seems that pause_bulk() was changed in 2.5.52 via " |