  Lustre / LU-9247

replay-ost-single test_5: test failed to respond and timed out


Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.10.0, Lustre 2.10.1, Lustre 2.11.0, Lustre 2.10.7
    • Component/s: None
    • Environment: onyx-32vm1-8, Full Group test, RHEL7.3/zfs, branch master, v2.9.54, b3541
    • Severity: 3
    • 9223372036854775807

    Description

      https://testing.hpdd.intel.com/test_sessions/afc7f4b0-0af4-11e7-8c9f-5254006e85c2

      It appears that ZFS was hung and caused this timeout. Here are a couple of indications of this:

      test_log:

      Starting ost1: lustre-ost1/ost1 /mnt/lustre-ost1
      CMD: onyx-32vm8 mkdir -p /mnt/lustre-ost1; mount -t lustre lustre-ost1/ost1 /mnt/lustre-ost1
      onyx-32vm8: e2label: No such file or directory while trying to open lustre-ost1/ost1
      onyx-32vm8: Couldn't find valid filesystem superblock.
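
      Not from this run, just a minimal diagnostic sketch: since the mount of lustre-ost1/ost1 failed, the backing pool and dataset on the OST node could be checked directly. The host onyx-32vm8 and the dataset name are taken from the log above; the commands are standard ZFS tooling, not something the test framework ran here.

      # Sketch only: verify that the pool behind the OST is imported and healthy,
      # and that the dataset the mount command refers to actually exists.
      ssh onyx-32vm8 '
        zpool status -v lustre-ost1     # pool state, suspended I/O, device errors
        zfs list lustre-ost1/ost1       # confirm the OST dataset is present
        dmesg | tail -n 50              # recent kernel/ZFS messages around the mount
      '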
      

      OST console:

      10:35:06:[31399.498089] txg_sync        D 0000000000000001     0 27626      2 0x00000080
      10:35:06:[31399.498090]  ffff880049607ac0 0000000000000046 ffff88003d98edd0 ffff880049607fd8
      10:35:06:[31399.498091]  ffff880049607fd8 ffff880049607fd8 ffff88003d98edd0 ffff88007fc16c40
      10:35:06:[31399.498092]  0000000000000000 7fffffffffffffff ffff88005ac587a8 0000000000000001
      10:35:06:[31399.498092] Call Trace:
      10:35:06:[31399.498093]  [<ffffffff8168bac9>] schedule+0x29/0x70
      10:35:06:[31399.498095]  [<ffffffff81689519>] schedule_timeout+0x239/0x2d0
      10:35:06:[31399.498096]  [<ffffffff810c4fe2>] ? default_wake_function+0x12/0x20
      10:35:06:[31399.498098]  [<ffffffff810ba238>] ? __wake_up_common+0x58/0x90
      10:35:06:[31399.498101]  [<ffffffff81060c1f>] ? kvm_clock_get_cycles+0x1f/0x30
      10:35:06:[31399.498103]  [<ffffffff8168b06e>] io_schedule_timeout+0xae/0x130
      10:35:06:[31399.498104]  [<ffffffff810b1416>] ? prepare_to_wait_exclusive+0x56/0x90
      10:35:06:[31399.498106]  [<ffffffff8168b108>] io_schedule+0x18/0x20
      10:35:06:[31399.498109]  [<ffffffffa0677617>] cv_wait_common+0xa7/0x130 [spl]
      10:35:06:[31399.498111]  [<ffffffff810b1720>] ? wake_up_atomic_t+0x30/0x30
      10:35:06:[31399.498114]  [<ffffffffa06776f8>] __cv_wait_io+0x18/0x20 [spl]
      10:35:06:[31399.498150]  [<ffffffffa07d151b>] zio_wait+0x10b/0x1f0 [zfs]
      10:35:06:[31399.498169]  [<ffffffffa075acdf>] dsl_pool_sync+0xbf/0x440 [zfs]
      10:35:06:[31399.498187]  [<ffffffffa0775868>] spa_sync+0x388/0xb50 [zfs]
      10:35:06:[31399.498189]  [<ffffffff810b174b>] ? autoremove_wake_function+0x2b/0x40
      10:35:06:[31399.498191]  [<ffffffff81689c72>] ? mutex_lock+0x12/0x2f
      10:35:06:[31399.498208]  [<ffffffffa07874e5>] txg_sync_thread+0x3c5/0x620 [zfs]
      10:35:06:[31399.498226]  [<ffffffffa0787120>] ? txg_init+0x280/0x280 [zfs]
      10:35:06:[31399.498229]  [<ffffffffa0672851>] thread_generic_wrapper+0x71/0x80 [spl]
      10:35:06:[31399.498232]  [<ffffffffa06727e0>] ? __thread_exit+0x20/0x20 [spl]
      10:35:06:[31399.498234]  [<ffffffff810b064f>] kthread+0xcf/0xe0
      10:35:06:[31399.498235]  [<ffffffff810b0580>] ? kthread_create_on_node+0x140/0x140
      10:35:06:[31399.498237]  [<ffffffff81696958>] ret_from_fork+0x58/0x90
      10:35:06:[31399.498239]  [<ffffffff810b0580>] ? kthread_create_on_node+0x140/0x140
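
      The trace shows txg_sync (PID 27626) blocked in zio_wait() via spa_sync(), i.e. the pool's sync thread stuck waiting on I/O that never completes. Below is a hedged sketch of how that could be confirmed on the OST node; the PID and pool name come from the console output above, and none of this was actually run as part of the ticket.

      # Sketch only: inspect the blocked txg_sync thread and recent txg activity.
      cat /proc/27626/stack                      # live kernel stack of txg_sync
      echo w > /proc/sysrq-trigger               # dump all D-state (blocked) tasks to dmesg
      dmesg | grep -i 'blocked for more than'    # hung-task watchdog messages, if any
      # Only populated when the zfs_txg_history module parameter is non-zero:
      cat /proc/spl/kstat/zfs/lustre-ost1/txgs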
      

Attachments

Issue Links

Activity

People

Assignee: Alex Zhuravlev (bzzz)
Reporter: James Casper (jcasper) (Inactive)
Votes: 0
Watchers: 7

Dates

Created:
Updated:
Resolved: