[LU-15473] sanity test_230d: Timeout waiting for IOs on all nodes Created: 21/Jan/22 Updated: 22/Feb/23 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
This issue was created by maloo for eaujames <eaujames@ddn.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/6c537d23-38e5-4825-a422-bbca89bdc908

test_230d failed with the following error:

Timeout occurred after 265 mins, last suite running was sanity

This seems to be hardware related. All nodes (even the clients) seem to be waiting for I/O:

*client1:*
...
[12960.292893] INFO: task jbd2/vda1-8:268 blocked for more than 120 seconds.
[12960.294060] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[12960.295280] jbd2/vda1-8 D ffffa025761447e0 0 268 2 0x00000000
[12960.296455] Call Trace:
[12960.297713] [<ffffffffa2789179>] schedule+0x29/0x70
[12960.298496] [<ffffffffa2786e41>] schedule_timeout+0x221/0x2d0
[12960.303047] [<ffffffffa2788a2d>] io_schedule_timeout+0xad/0x130
[12960.303979] [<ffffffffa2788ac8>] io_schedule+0x18/0x20
[12960.304789] [<ffffffffa2787491>] bit_wait_io+0x11/0x50
[12960.305605] [<ffffffffa2786fb7>] __wait_on_bit+0x67/0x90
[12960.307245] [<ffffffffa2787121>] out_of_line_wait_on_bit+0x81/0xb0
[12960.309171] [<ffffffffa228723a>] __wait_on_buffer+0x2a/0x30
[12960.310124] [<ffffffffc03dc871>] jbd2_journal_commit_transaction+0x1771/0x19c0 [jbd2]
[12960.312219] [<ffffffffc03e1f89>] kjournald2+0xc9/0x260 [jbd2]
[12960.315005] [<ffffffffa20c5e61>] kthread+0xd1/0xe0
[12960.318681] INFO: task 0anacron:4661 blocked for more than 120 seconds.
[12960.319697] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[12960.320887] 0anacron D ffffa025dead68e0 0 4661 4657 0x00000080
[12960.322029] Call Trace:
[12960.323221] [<ffffffffa2789179>] schedule+0x29/0x70
[12960.323998] [<ffffffffa2786e41>] schedule_timeout+0x221/0x2d0
[12960.327541] [<ffffffffa2788a2d>] io_schedule_timeout+0xad/0x130
[12960.328470] [<ffffffffa2788ac8>] io_schedule+0x18/0x20
[12960.329276] [<ffffffffa2787491>] bit_wait_io+0x11/0x50
[12960.330091] [<ffffffffa2786fb7>] __wait_on_bit+0x67/0x90
[12960.331722] [<ffffffffa2787121>] out_of_line_wait_on_bit+0x81/0xb0
[12960.333597] [<ffffffffa228723a>] __wait_on_buffer+0x2a/0x30
[12960.334518] [<ffffffffc03fc217>] __ext4_get_inode_loc+0x197/0x3c0 [ext4]
[12960.335572] [<ffffffffc03feb36>] ext4_iget+0x96/0xbd0 [ext4]
[12960.336470] [<ffffffffc03ff6a5>] ext4_iget_normal+0x35/0x40 [ext4]
[12960.337446] [<ffffffffc0409c52>] ext4_lookup+0xc2/0x160 [ext4]
[12960.338368] [<ffffffffa22591d3>] lookup_real+0x23/0x60
[12960.339179] [<ffffffffa2259bf2>] __lookup_hash+0x42/0x60
[12960.340033] [<ffffffffa27800e5>] lookup_slow+0x42/0xa7
[12960.340842] [<ffffffffa225cdbf>] link_path_walk+0x80f/0x8b0
[12960.341719] [<ffffffffa225cfca>] path_lookupat+0x7a/0x8d0
[12960.346235] [<ffffffffa225d84b>] filename_lookup+0x2b/0xc0
[12960.347098] [<ffffffffa2261557>] user_path_at_empty+0x67/0xc0
[12960.349786] [<ffffffffa22615c1>] user_path_at+0x11/0x20
[12960.350612] [<ffffffffa224c902>] SyS_faccessat+0xb2/0x230
[12960.351469] [<ffffffffa2795f92>] system_call_fastpath+0x25/0x2a
...

*MDS:*
...
[12720.294647] INFO: task jbd2/vda1-8:268 blocked for more than 120 seconds.
[12720.297276] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[12720.298494] jbd2/vda1-8 D ffff9fd5f63bd860 0 268 2 0x00000000
[12720.299642] Call Trace:
[12720.300874] [<ffffffff92589179>] schedule+0x29/0x70
[12720.301645] [<ffffffff92586e41>] schedule_timeout+0x221/0x2d0
[12720.306109] [<ffffffff92588a2d>] io_schedule_timeout+0xad/0x130
[12720.307026] [<ffffffff92588ac8>] io_schedule+0x18/0x20
[12720.307831] [<ffffffff92587491>] bit_wait_io+0x11/0x50
[12720.308634] [<ffffffff92586fb7>] __wait_on_bit+0x67/0x90
[12720.310235] [<ffffffff92587121>] out_of_line_wait_on_bit+0x81/0xb0
[12720.312135] [<ffffffff9208724a>] __wait_on_buffer+0x2a/0x30
[12720.313143] [<ffffffffc049c871>] jbd2_journal_commit_transaction+0x1771/0x19c0 [jbd2]
[12720.315211] [<ffffffffc04a1f89>] kjournald2+0xc9/0x260 [jbd2]
[12720.317980] [<ffffffff91ec5e61>] kthread+0xd1/0xe0
...

*OST:*
...
[12840.193849] INFO: task jbd2/vda1-8:267 blocked for more than 120 seconds.
[12840.195003] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[12840.196213] jbd2/vda1-8 D ffff937175945860 0 267 2 0x00000000
[12840.197371] Call Trace:
[12840.198599] [<ffffffff88589179>] schedule+0x29/0x70
[12840.199389] [<ffffffff88586e41>] schedule_timeout+0x221/0x2d0
[12840.203947] [<ffffffff88588a2d>] io_schedule_timeout+0xad/0x130
[12840.204877] [<ffffffff88588ac8>] io_schedule+0x18/0x20
[12840.205687] [<ffffffff88587491>] bit_wait_io+0x11/0x50
[12840.206489] [<ffffffff88586fb7>] __wait_on_bit+0x67/0x90
[12840.208134] [<ffffffff88587121>] out_of_line_wait_on_bit+0x81/0xb0
[12840.210022] [<ffffffff8808724a>] __wait_on_buffer+0x2a/0x30
[12840.210938] [<ffffffffc033ff72>] jbd2_journal_commit_transaction+0xe72/0x19c0 [jbd2]
[12840.213018] [<ffffffffc0345f89>] kjournald2+0xc9/0x260 [jbd2]
[12840.215786] [<ffffffff87ec5e61>] kthread+0xd1/0xe0
...

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV |
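The traces above share one pattern: a hung-task banner followed by a stack that passes through io_schedule. A minimal sketch of how that check could be automated across console logs is given below. It is not part of Maloo or the Lustre test framework; the names `find_hung_tasks`, `HUNG_RE`, `FRAME_RE` and `SAMPLE` are hypothetical, and the sample text is a shortened excerpt of the client1 trace.

```python
import re

# Hung-task banner, e.g.
# "[12960.292893] INFO: task jbd2/vda1-8:268 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] INFO: task (?P<comm>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)
# One stack frame, e.g. "[<ffffffffa2788ac8>] io_schedule+0x18/0x20"
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\] (?P<sym>[A-Za-z0-9_.]+)\+0x")


def find_hung_tasks(log_text):
    """Return one dict per hung-task report found in log_text.

    Works on both line-per-message logs and logs flattened onto one line,
    because it scans the whole text rather than splitting on newlines.
    """
    banners = list(HUNG_RE.finditer(log_text))
    reports = []
    for i, banner in enumerate(banners):
        # Frames belong to this report until the next banner (or end of text).
        end = banners[i + 1].start() if i + 1 < len(banners) else len(log_text)
        frames = [m.group("sym")
                  for m in FRAME_RE.finditer(log_text, banner.end(), end)]
        reports.append({
            "timestamp": float(banner.group("ts")),
            "comm": banner.group("comm"),
            "pid": int(banner.group("pid")),
            "frames": frames,
            # Assumption: a stack passing through io_schedule means the task
            # is blocked on block-device I/O.
            "waits_on_io": "io_schedule" in frames,
        })
    return reports


# Shortened excerpt of the client1 trace from this ticket.
SAMPLE = """\
[12960.292893] INFO: task jbd2/vda1-8:268 blocked for more than 120 seconds.
[12960.296455] Call Trace:
[12960.297713] [<ffffffffa2789179>] schedule+0x29/0x70
[12960.303979] [<ffffffffa2788ac8>] io_schedule+0x18/0x20
[12960.310124] [<ffffffffc03dc871>] jbd2_journal_commit_transaction+0x1771/0x19c0 [jbd2]
[12960.318681] INFO: task 0anacron:4661 blocked for more than 120 seconds.
[12960.322029] Call Trace:
[12960.323221] [<ffffffffa2789179>] schedule+0x29/0x70
[12960.328470] [<ffffffffa2788ac8>] io_schedule+0x18/0x20
[12960.335572] [<ffffffffc03feb36>] ext4_iget+0x96/0xbd0 [ext4]
"""

if __name__ == "__main__":
    for report in find_hung_tasks(SAMPLE):
        print(report["comm"], report["pid"], report["waits_on_io"])
```

Running the same function over the client, MDS and OST console logs and comparing the results would show whether every node reports tasks blocked in io_schedule. That would be consistent with the hardware suspicion above, though it does not prove it.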