  1. Lustre
  2. LU-15473

sanity test_230d: Timeout waiting for IOs on all nodes

Details

    • Bug
    • Resolution: Unresolved
    • Minor

    Description

      This issue was created by maloo for eaujames <eaujames@ddn.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/6c537d23-38e5-4825-a422-bbca89bdc908

      test_230d failed with the following error:

      Timeout occurred after 265 mins, last suite running was sanity
      

      This seems to be hardware related. All nodes (even the clients) seem to be waiting for I/O:

      *client1:*

      ...
      [12960.292893] INFO: task jbd2/vda1-8:268 blocked for more than 120 seconds.
      [12960.294060] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [12960.295280] jbd2/vda1-8     D ffffa025761447e0     0   268      2 0x00000000
      [12960.296455] Call Trace:
      [12960.297713]  [<ffffffffa2789179>] schedule+0x29/0x70
      [12960.298496]  [<ffffffffa2786e41>] schedule_timeout+0x221/0x2d0
      [12960.303047]  [<ffffffffa2788a2d>] io_schedule_timeout+0xad/0x130
      [12960.303979]  [<ffffffffa2788ac8>] io_schedule+0x18/0x20
      [12960.304789]  [<ffffffffa2787491>] bit_wait_io+0x11/0x50
      [12960.305605]  [<ffffffffa2786fb7>] __wait_on_bit+0x67/0x90
      [12960.307245]  [<ffffffffa2787121>] out_of_line_wait_on_bit+0x81/0xb0
      [12960.309171]  [<ffffffffa228723a>] __wait_on_buffer+0x2a/0x30
      [12960.310124]  [<ffffffffc03dc871>] jbd2_journal_commit_transaction+0x1771/0x19c0 [jbd2]
      [12960.312219]  [<ffffffffc03e1f89>] kjournald2+0xc9/0x260 [jbd2]
      [12960.315005]  [<ffffffffa20c5e61>] kthread+0xd1/0xe0
      [12960.318681] INFO: task 0anacron:4661 blocked for more than 120 seconds.
      [12960.319697] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [12960.320887] 0anacron        D ffffa025dead68e0     0  4661   4657 0x00000080
      [12960.322029] Call Trace:
      [12960.323221]  [<ffffffffa2789179>] schedule+0x29/0x70
      [12960.323998]  [<ffffffffa2786e41>] schedule_timeout+0x221/0x2d0
      [12960.327541]  [<ffffffffa2788a2d>] io_schedule_timeout+0xad/0x130
      [12960.328470]  [<ffffffffa2788ac8>] io_schedule+0x18/0x20
      [12960.329276]  [<ffffffffa2787491>] bit_wait_io+0x11/0x50
      [12960.330091]  [<ffffffffa2786fb7>] __wait_on_bit+0x67/0x90
      [12960.331722]  [<ffffffffa2787121>] out_of_line_wait_on_bit+0x81/0xb0
      [12960.333597]  [<ffffffffa228723a>] __wait_on_buffer+0x2a/0x30
      [12960.334518]  [<ffffffffc03fc217>] __ext4_get_inode_loc+0x197/0x3c0 [ext4]
      [12960.335572]  [<ffffffffc03feb36>] ext4_iget+0x96/0xbd0 [ext4]
      [12960.336470]  [<ffffffffc03ff6a5>] ext4_iget_normal+0x35/0x40 [ext4]
      [12960.337446]  [<ffffffffc0409c52>] ext4_lookup+0xc2/0x160 [ext4]
      [12960.338368]  [<ffffffffa22591d3>] lookup_real+0x23/0x60
      [12960.339179]  [<ffffffffa2259bf2>] __lookup_hash+0x42/0x60
      [12960.340033]  [<ffffffffa27800e5>] lookup_slow+0x42/0xa7
      [12960.340842]  [<ffffffffa225cdbf>] link_path_walk+0x80f/0x8b0
      [12960.341719]  [<ffffffffa225cfca>] path_lookupat+0x7a/0x8d0
      [12960.346235]  [<ffffffffa225d84b>] filename_lookup+0x2b/0xc0
      [12960.347098]  [<ffffffffa2261557>] user_path_at_empty+0x67/0xc0
      [12960.349786]  [<ffffffffa22615c1>] user_path_at+0x11/0x20
      [12960.350612]  [<ffffffffa224c902>] SyS_faccessat+0xb2/0x230
      [12960.351469]  [<ffffffffa2795f92>] system_call_fastpath+0x25/0x2a
      ...
      

      *MDS:*

      ...
      [12720.294647] INFO: task jbd2/vda1-8:268 blocked for more than 120 seconds.
      [12720.297276] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [12720.298494] jbd2/vda1-8     D ffff9fd5f63bd860     0   268      2 0x00000000
      [12720.299642] Call Trace:
      [12720.300874]  [<ffffffff92589179>] schedule+0x29/0x70
      [12720.301645]  [<ffffffff92586e41>] schedule_timeout+0x221/0x2d0
      [12720.306109]  [<ffffffff92588a2d>] io_schedule_timeout+0xad/0x130
      [12720.307026]  [<ffffffff92588ac8>] io_schedule+0x18/0x20
      [12720.307831]  [<ffffffff92587491>] bit_wait_io+0x11/0x50
      [12720.308634]  [<ffffffff92586fb7>] __wait_on_bit+0x67/0x90
      [12720.310235]  [<ffffffff92587121>] out_of_line_wait_on_bit+0x81/0xb0
      [12720.312135]  [<ffffffff9208724a>] __wait_on_buffer+0x2a/0x30
      [12720.313143]  [<ffffffffc049c871>] jbd2_journal_commit_transaction+0x1771/0x19c0 [jbd2]
      [12720.315211]  [<ffffffffc04a1f89>] kjournald2+0xc9/0x260 [jbd2]
      [12720.317980]  [<ffffffff91ec5e61>] kthread+0xd1/0xe0
      ...
      

      *OST:*

      ...
      [12840.193849] INFO: task jbd2/vda1-8:267 blocked for more than 120 seconds.
      [12840.195003] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [12840.196213] jbd2/vda1-8     D ffff937175945860     0   267      2 0x00000000
      [12840.197371] Call Trace:
      [12840.198599]  [<ffffffff88589179>] schedule+0x29/0x70
      [12840.199389]  [<ffffffff88586e41>] schedule_timeout+0x221/0x2d0
      [12840.203947]  [<ffffffff88588a2d>] io_schedule_timeout+0xad/0x130
      [12840.204877]  [<ffffffff88588ac8>] io_schedule+0x18/0x20
      [12840.205687]  [<ffffffff88587491>] bit_wait_io+0x11/0x50
      [12840.206489]  [<ffffffff88586fb7>] __wait_on_bit+0x67/0x90
      [12840.208134]  [<ffffffff88587121>] out_of_line_wait_on_bit+0x81/0xb0
      [12840.210022]  [<ffffffff8808724a>] __wait_on_buffer+0x2a/0x30
      [12840.210938]  [<ffffffffc033ff72>] jbd2_journal_commit_transaction+0xe72/0x19c0 [jbd2]
      [12840.213018]  [<ffffffffc0345f89>] kjournald2+0xc9/0x260 [jbd2]
      [12840.215786]  [<ffffffff87ec5e61>] kthread+0xd1/0xe0
      
      ...
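
The traces above can be checked mechanically for the same symptom. What follows is a minimal sketch, not part of the test framework: it assumes the hung-task message format shown in this ticket, and the names `parse_hung_tasks`, `waits_on_io`, and `SAMPLE` are hypothetical helpers introduced for illustration. It extracts each blocked task and reports whether its call trace passes through `io_schedule`.

```python
import re
from collections import namedtuple

# One record per "INFO: task ... blocked" message (hypothetical helper type).
HungTask = namedtuple("HungTask", "name pid seconds frames")

# Assumed formats, taken from the log excerpts in this ticket.
HEADER = re.compile(
    r"INFO: task (?P<name>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)
FRAME = re.compile(r"\[<[0-9a-f]+>\]\s+(?P<func>[A-Za-z0-9_.]+)\+0x")


def parse_hung_tasks(log_text):
    """Return a list of HungTask records found in kernel log text."""
    tasks = []
    current = None
    for line in log_text.splitlines():
        header = HEADER.search(line)
        if header:
            current = HungTask(
                header["name"], int(header["pid"]), int(header["secs"]), []
            )
            tasks.append(current)
            continue
        frame = FRAME.search(line)
        if frame and current is not None:
            # Frames are stored innermost first, in the order they are logged.
            current.frames.append(frame["func"])
    return tasks


def waits_on_io(task):
    """True if the task's call trace goes through io_schedule."""
    return "io_schedule" in task.frames


# Shortened excerpt of the client1 trace from this ticket.
SAMPLE = """\
[12960.292893] INFO: task jbd2/vda1-8:268 blocked for more than 120 seconds.
[12960.295280] jbd2/vda1-8     D ffffa025761447e0     0   268      2 0x00000000
[12960.296455] Call Trace:
[12960.297713]  [<ffffffffa2789179>] schedule+0x29/0x70
[12960.303979]  [<ffffffffa2788ac8>] io_schedule+0x18/0x20
[12960.310124]  [<ffffffffc03dc871>] jbd2_journal_commit_transaction+0x1771/0x19c0 [jbd2]
"""

if __name__ == "__main__":
    for task in parse_hung_tasks(SAMPLE):
        print(task.name, task.pid, "io wait" if waits_on_io(task) else "other")
```

Running the same function over the console logs of each node would show whether every blocked task is stuck in I/O wait on the local disk, which is the pattern that points to the test hardware rather than to Lustre.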
      

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity test_230d - Timeout occurred after 265 mins, last suite running was sanity

            People

              wc-triage WC Triage
              maloo Maloo