LU-9515

sanity-benchmark test_iozone: test failed to respond and timed out

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.10.0, Lustre 2.10.1
    • Labels: None
    • Environment: trevis-35, full, ZFS, EL7, master branch, v2.9.57, b3575
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

    Description

      https://testing.hpdd.intel.com/test_sessions/df55763f-2960-40d5-b78d-bd088d00e6e3

      Several processes are hung on the client side. Here are two of the longer traces:

      From client dmesg:

      [27722.373633] jbd2/vda1-8     D ffffffff8168a1e0     0   272      2 0x00000000
      [27722.375266]  ffff880036113ac0 0000000000000046 ffff8800360e3ec0 ffff880036113fd8
      [27722.376932]  ffff880036113fd8 ffff880036113fd8 ffff8800360e3ec0 ffff88007fd16c40
      [27722.378628]  0000000000000000 7fffffffffffffff ffff88007ff5b9d0 ffffffff8168a1e0
      [27722.380339] Call Trace:
      [27722.381566]  [<ffffffff8168a1e0>] ? bit_wait+0x50/0x50
      [27722.383076]  [<ffffffff8168c169>] schedule+0x29/0x70
      [27722.384533]  [<ffffffff81689bc9>] schedule_timeout+0x239/0x2c0
      [27722.386051]  [<ffffffff81060c1f>] ? kvm_clock_get_cycles+0x1f/0x30
      [27722.387580]  [<ffffffff8168a1e0>] ? bit_wait+0x50/0x50
      [27722.389031]  [<ffffffff8168b70e>] io_schedule_timeout+0xae/0x130
      [27722.390525]  [<ffffffff8168b7a8>] io_schedule+0x18/0x20
      [27722.391966]  [<ffffffff8168a1f1>] bit_wait_io+0x11/0x50
      [27722.393394]  [<ffffffff81689d15>] __wait_on_bit+0x65/0x90
      [27722.394833]  [<ffffffff8168a1e0>] ? bit_wait+0x50/0x50
      [27722.396245]  [<ffffffff81689dc1>] out_of_line_wait_on_bit+0x81/0xb0
      [27722.397743]  [<ffffffff810b1be0>] ? wake_bit_function+0x40/0x40
      [27722.399215]  [<ffffffff8123338a>] __wait_on_buffer+0x2a/0x30
      [27722.400687]  [<ffffffffa01a6742>] jbd2_journal_commit_transaction+0x1752/0x19a0 [jbd2]
      [27722.402312]  [<ffffffff81029569>] ? __switch_to+0xd9/0x4c0
      [27722.403795]  [<ffffffffa01aae99>] kjournald2+0xc9/0x260 [jbd2]
      [27722.405260]  [<ffffffff810b1b20>] ? wake_up_atomic_t+0x30/0x30
      [27722.406757]  [<ffffffffa01aadd0>] ? commit_timeout+0x10/0x10 [jbd2]
      [27722.408218]  [<ffffffff810b0a4f>] kthread+0xcf/0xe0
      [27722.409639]  [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140
      [27722.411170]  [<ffffffff816970d8>] ret_from_fork+0x58/0x90
      [27722.412601]  [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140
      

      and

      [27723.879985] in:imjournal    D ffffffff8168a1e0     0   878      1 0x00000080
      [27723.881530]  ffff88007b9bf9b0 0000000000000082 ffff88007b9caf10 ffff88007b9bffd8
      [27723.883152]  ffff88007b9bffd8 ffff88007b9bffd8 ffff88007b9caf10 ffff88007fc16c40
      [27723.884762]  0000000000000000 7fffffffffffffff ffff88007ff607e8 ffffffff8168a1e0
      [27723.886380] Call Trace:
      [27723.887549]  [<ffffffff8168a1e0>] ? bit_wait+0x50/0x50
      [27723.888943]  [<ffffffff8168c169>] schedule+0x29/0x70
      [27723.890306]  [<ffffffff81689bc9>] schedule_timeout+0x239/0x2c0
      [27723.891727]  [<ffffffff810b1ae5>] ? wake_up_bit+0x25/0x30
      [27723.893135]  [<ffffffff81060c1f>] ? kvm_clock_get_cycles+0x1f/0x30
      [27723.894586]  [<ffffffff810eb08c>] ? ktime_get_ts64+0x4c/0xf0
      [27723.896001]  [<ffffffff8168a1e0>] ? bit_wait+0x50/0x50
      [27723.897347]  [<ffffffff8168b70e>] io_schedule_timeout+0xae/0x130
      [27723.898774]  [<ffffffff8168b7a8>] io_schedule+0x18/0x20
      [27723.900127]  [<ffffffff8168a1f1>] bit_wait_io+0x11/0x50
      [27723.901463]  [<ffffffff81689d15>] __wait_on_bit+0x65/0x90
      [27723.902809]  [<ffffffff81180191>] wait_on_page_bit+0x81/0xa0
      [27723.904175]  [<ffffffff810b1be0>] ? wake_bit_function+0x40/0x40
      [27723.905558]  [<ffffffff8119126b>] truncate_inode_pages_range+0x3bb/0x740
      [27723.907030]  [<ffffffffa01fab8c>] ? __ext4_journal_stop+0x3c/0xb0 [ext4]
      [27723.908476]  [<ffffffff812630ea>] ? __dquot_initialize+0x3a/0x1c0
      [27723.909876]  [<ffffffff8119166e>] truncate_inode_pages_final+0x5e/0x90
      [27723.911317]  [<ffffffffa01cb46c>] ext4_evict_inode+0x10c/0x4d0 [ext4]
      [27723.912746]  [<ffffffff8121a767>] evict+0xa7/0x170
      [27723.914021]  [<ffffffff8121b005>] iput+0xf5/0x180
      [27723.915286]  [<ffffffff81215ae8>] dentry_kill+0x168/0x1b0
      [27723.916611]  [<ffffffff81215b8c>] dput+0x5c/0xd0
      [27723.917853]  [<ffffffff8120ff82>] SYSC_renameat2+0x4e2/0x570
      [27723.919175]  [<ffffffff811b6cf7>] ? do_munmap+0x2c7/0x420
      [27723.920469]  [<ffffffff81210dce>] SyS_renameat2+0xe/0x10
      [27723.921753]  [<ffffffff81210e0e>] SyS_rename+0x1e/0x20
      [27723.923014]  [<ffffffff81697189>] system_call_fastpath+0x16/0x1b
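
      For triage, traces like the two above can be summarized with a short script. The following is a minimal sketch, assuming the dmesg format shown in this report (EL7 kernel, one task header line followed by "[<addr>] symbol+off/len" frames). The names blocked_tasks, TASK_RE, FRAME_RE and SAMPLE are hypothetical and not part of any Lustre or test-framework tool. Frames prefixed with "?" are ones the kernel marks as unreliable, so the sketch skips them.

```python
import re

# Task header, e.g.
# "[27722.373633] jbd2/vda1-8     D ffffffff8168a1e0     0   272      2 0x00000000"
TASK_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<comm>.+?)\s+(?P<state>[A-Z])\s+[0-9a-f]{16}"
    r"\s+\d+\s+(?P<pid>\d+)\s"
)
# Stack frame, e.g. "[27722.383076]  [<ffffffff8168c169>] schedule+0x29/0x70"
# An optional "? " before the symbol marks an unreliable frame.
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+\[<[0-9a-f]+>\]\s+(?P<unreliable>\? )?"
    r"(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def blocked_tasks(dmesg_text):
    """Return tasks in state D with their reliable frames, in log order."""
    tasks = []
    current = None
    for raw in dmesg_text.splitlines():
        line = raw.strip()
        m = TASK_RE.match(line)
        if m:
            current = {
                "comm": m["comm"],
                "pid": int(m["pid"]),
                "state": m["state"],
                "frames": [],
            }
            tasks.append(current)
            continue
        m = FRAME_RE.match(line)
        if m and current is not None and not m["unreliable"]:
            current["frames"].append(m["sym"])
    return [t for t in tasks if t["state"] == "D"]


# A few lines copied from the first trace in this report.
SAMPLE = """
[27722.373633] jbd2/vda1-8     D ffffffff8168a1e0     0   272      2 0x00000000
[27722.380339] Call Trace:
[27722.381566]  [<ffffffff8168a1e0>] ? bit_wait+0x50/0x50
[27722.383076]  [<ffffffff8168c169>] schedule+0x29/0x70
[27722.384533]  [<ffffffff81689bc9>] schedule_timeout+0x239/0x2c0
[27722.400687]  [<ffffffffa01a6742>] jbd2_journal_commit_transaction+0x1752/0x19a0 [jbd2]
"""

for task in blocked_tasks(SAMPLE):
    print(task["comm"], task["pid"], " <- ".join(task["frames"]))
```

      Run against the full client dmesg, this would list every blocked task with its call chain on one line, which makes it easier to see which tasks are waiting on the same path.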
      

            People

              wc-triage WC Triage
              jcasper James Casper