Details
-
Bug
-
Resolution: Unresolved
-
Minor
-
None
-
Lustre 2.10.0, Lustre 2.10.1
-
None
-
trevis-35, full, ZFS
EL7, master branch, v2.9.57, b3575
-
3
-
9223372036854775807
Description
https://testing.hpdd.intel.com/test_sessions/df55763f-2960-40d5-b78d-bd088d00e6e3
Several hung processes on the client side. Here are two of the longer traces:
From client dmesg:
[27722.373633] jbd2/vda1-8 D ffffffff8168a1e0 0 272 2 0x00000000
[27722.375266] ffff880036113ac0 0000000000000046 ffff8800360e3ec0 ffff880036113fd8
[27722.376932] ffff880036113fd8 ffff880036113fd8 ffff8800360e3ec0 ffff88007fd16c40
[27722.378628] 0000000000000000 7fffffffffffffff ffff88007ff5b9d0 ffffffff8168a1e0
[27722.380339] Call Trace:
[27722.381566] [<ffffffff8168a1e0>] ? bit_wait+0x50/0x50
[27722.383076] [<ffffffff8168c169>] schedule+0x29/0x70
[27722.384533] [<ffffffff81689bc9>] schedule_timeout+0x239/0x2c0
[27722.386051] [<ffffffff81060c1f>] ? kvm_clock_get_cycles+0x1f/0x30
[27722.387580] [<ffffffff8168a1e0>] ? bit_wait+0x50/0x50
[27722.389031] [<ffffffff8168b70e>] io_schedule_timeout+0xae/0x130
[27722.390525] [<ffffffff8168b7a8>] io_schedule+0x18/0x20
[27722.391966] [<ffffffff8168a1f1>] bit_wait_io+0x11/0x50
[27722.393394] [<ffffffff81689d15>] __wait_on_bit+0x65/0x90
[27722.394833] [<ffffffff8168a1e0>] ? bit_wait+0x50/0x50
[27722.396245] [<ffffffff81689dc1>] out_of_line_wait_on_bit+0x81/0xb0
[27722.397743] [<ffffffff810b1be0>] ? wake_bit_function+0x40/0x40
[27722.399215] [<ffffffff8123338a>] __wait_on_buffer+0x2a/0x30
[27722.400687] [<ffffffffa01a6742>] jbd2_journal_commit_transaction+0x1752/0x19a0 [jbd2]
[27722.402312] [<ffffffff81029569>] ? __switch_to+0xd9/0x4c0
[27722.403795] [<ffffffffa01aae99>] kjournald2+0xc9/0x260 [jbd2]
[27722.405260] [<ffffffff810b1b20>] ? wake_up_atomic_t+0x30/0x30
[27722.406757] [<ffffffffa01aadd0>] ? commit_timeout+0x10/0x10 [jbd2]
[27722.408218] [<ffffffff810b0a4f>] kthread+0xcf/0xe0
[27722.409639] [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140
[27722.411170] [<ffffffff816970d8>] ret_from_fork+0x58/0x90
[27722.412601] [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140
and
[27723.879985] in:imjournal D ffffffff8168a1e0 0 878 1 0x00000080
[27723.881530] ffff88007b9bf9b0 0000000000000082 ffff88007b9caf10 ffff88007b9bffd8
[27723.883152] ffff88007b9bffd8 ffff88007b9bffd8 ffff88007b9caf10 ffff88007fc16c40
[27723.884762] 0000000000000000 7fffffffffffffff ffff88007ff607e8 ffffffff8168a1e0
[27723.886380] Call Trace:
[27723.887549] [<ffffffff8168a1e0>] ? bit_wait+0x50/0x50
[27723.888943] [<ffffffff8168c169>] schedule+0x29/0x70
[27723.890306] [<ffffffff81689bc9>] schedule_timeout+0x239/0x2c0
[27723.891727] [<ffffffff810b1ae5>] ? wake_up_bit+0x25/0x30
[27723.893135] [<ffffffff81060c1f>] ? kvm_clock_get_cycles+0x1f/0x30
[27723.894586] [<ffffffff810eb08c>] ? ktime_get_ts64+0x4c/0xf0
[27723.896001] [<ffffffff8168a1e0>] ? bit_wait+0x50/0x50
[27723.897347] [<ffffffff8168b70e>] io_schedule_timeout+0xae/0x130
[27723.898774] [<ffffffff8168b7a8>] io_schedule+0x18/0x20
[27723.900127] [<ffffffff8168a1f1>] bit_wait_io+0x11/0x50
[27723.901463] [<ffffffff81689d15>] __wait_on_bit+0x65/0x90
[27723.902809] [<ffffffff81180191>] wait_on_page_bit+0x81/0xa0
[27723.904175] [<ffffffff810b1be0>] ? wake_bit_function+0x40/0x40
[27723.905558] [<ffffffff8119126b>] truncate_inode_pages_range+0x3bb/0x740
[27723.907030] [<ffffffffa01fab8c>] ? __ext4_journal_stop+0x3c/0xb0 [ext4]
[27723.908476] [<ffffffff812630ea>] ? __dquot_initialize+0x3a/0x1c0
[27723.909876] [<ffffffff8119166e>] truncate_inode_pages_final+0x5e/0x90
[27723.911317] [<ffffffffa01cb46c>] ext4_evict_inode+0x10c/0x4d0 [ext4]
[27723.912746] [<ffffffff8121a767>] evict+0xa7/0x170
[27723.914021] [<ffffffff8121b005>] iput+0xf5/0x180
[27723.915286] [<ffffffff81215ae8>] dentry_kill+0x168/0x1b0
[27723.916611] [<ffffffff81215b8c>] dput+0x5c/0xd0
[27723.917853] [<ffffffff8120ff82>] SYSC_renameat2+0x4e2/0x570
[27723.919175] [<ffffffff811b6cf7>] ? do_munmap+0x2c7/0x420
[27723.920469] [<ffffffff81210dce>] SyS_renameat2+0xe/0x10
[27723.921753] [<ffffffff81210e0e>] SyS_rename+0x1e/0x20
[27723.923014] [<ffffffff81697189>] system_call_fastpath+0x16/0x1b