Lustre / LU-12683

sanity-hsm test_602: MDS hang in ZFS

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Severity: 3

    Description

      [ 8912.216320] Lustre: mdt00_003: service thread pid 24030 was inactive for 60.101 seconds. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
      [ 8999.961057] INFO: task mdt00_003:24030 blocked for more than 120 seconds.
      [ 8999.962271] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [ 8999.963586] mdt00_003       D ffff9992d7dc6180     0 24030      2 0x00000080
      [ 8999.964941] Call Trace:
      [ 8999.965448]  [<ffffffffa1368ed9>] schedule+0x29/0x70
      [ 8999.966536]  [<ffffffffc02d62d5>] cv_wait_common+0x125/0x150 [spl]
      [ 8999.967592]  [<ffffffffa0cc2e70>] ? wake_up_atomic_t+0x30/0x30
      [ 8999.968658]  [<ffffffffc02d6315>] __cv_wait+0x15/0x20 [spl]
      [ 8999.969889]  [<ffffffffc04b2db3>] zrl_add_impl+0x83/0xd0 [zfs]
      [ 8999.971305]  [<ffffffffc03e8451>] dbuf_destroy+0xd1/0x3b0 [zfs]
      [ 8999.972411]  [<ffffffffa0e1c2d6>] ? kfree+0x106/0x140
      [ 8999.973392]  [<ffffffffc04061c8>] dnode_destroy+0x138/0x230 [zfs]
      [ 8999.974492]  [<ffffffffc04075ce>] dnode_hold_impl+0x5fe/0xc40 [zfs]
      [ 8999.975621]  [<ffffffffc03f4905>] ? dmu_object_next+0x95/0x140 [zfs]
      [ 8999.976745]  [<ffffffffc03f4b9f>] dmu_object_alloc_dnsize+0x1ef/0x3e0 [zfs]
      [ 8999.977992]  [<ffffffffc1150df2>] __osd_object_create+0x82/0x170 [osd_zfs]
      [ 8999.979196]  [<ffffffffc115115d>] osd_mkreg+0x7d/0x210 [osd_zfs]
      [ 8999.980342]  [<ffffffffc1427ae9>] ? lod_sub_declare_xattr_set+0xf9/0x300 [lod]
      [ 8999.981594]  [<ffffffffc114d6a6>] osd_create+0x316/0xaf0 [osd_zfs]
      [ 8999.982687]  [<ffffffffc1425675>] lod_sub_create+0x1f5/0x480 [lod]
      [ 8999.983780]  [<ffffffffc1416559>] lod_create+0x69/0x350 [lod]
      [ 8999.984933]  [<ffffffffc1491e08>] mdd_create_object_internal+0xb8/0x280 [mdd]
      [ 8999.986143]  [<ffffffffc147b32d>] mdd_create_object+0x7d/0x8e0 [mdd]
      [ 8999.987234]  [<ffffffffc14859c7>] mdd_create+0xf57/0x1660 [mdd]
      [ 8999.988265]  [<ffffffffc13236a3>] mdt_reint_open+0x2373/0x3360 [mdt]
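
For triage, the blocked thread's path through the stack can be summarized by kernel module. The following is a minimal sketch and not part of the original report: `summarize_trace`, `module_path`, and the regular expression are illustrative names and assumptions about the dmesg frame format shown above. Frames marked `?` are treated as unreliable and skipped, and the sample is only an excerpt of the trace.

```python
import re

# Matches frames like:
#   [ 8999.969889]  [<ffffffffc04b2db3>] zrl_add_impl+0x83/0xd0 [zfs]
# A leading "?" marks a frame the unwinder considers unreliable.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def summarize_trace(text):
    """Return (symbol, module) for each reliable frame, innermost first."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.search(line)
        if m is None or m.group("unreliable"):
            continue
        # Frames without a [module] suffix belong to the core kernel.
        frames.append((m.group("symbol"), m.group("module") or "kernel"))
    return frames


def module_path(frames):
    """Collapse consecutive frames from the same module, innermost first."""
    path = []
    for _, module in frames:
        if not path or path[-1] != module:
            path.append(module)
    return path


# Excerpt of the trace above (assumption: one frame per line, as in dmesg).
SAMPLE = """\
[ 8999.965448]  [<ffffffffa1368ed9>] schedule+0x29/0x70
[ 8999.966536]  [<ffffffffc02d62d5>] cv_wait_common+0x125/0x150 [spl]
[ 8999.967592]  [<ffffffffa0cc2e70>] ? wake_up_atomic_t+0x30/0x30
[ 8999.969889]  [<ffffffffc04b2db3>] zrl_add_impl+0x83/0xd0 [zfs]
[ 8999.977992]  [<ffffffffc1150df2>] __osd_object_create+0x82/0x170 [osd_zfs]
[ 8999.982687]  [<ffffffffc1425675>] lod_sub_create+0x1f5/0x480 [lod]
[ 8999.987234]  [<ffffffffc14859c7>] mdd_create+0xf57/0x1660 [mdd]
[ 8999.988265]  [<ffffffffc13236a3>] mdt_reint_open+0x2373/0x3360 [mdt]
"""

if __name__ == "__main__":
    print(" <- ".join(module_path(summarize_trace(SAMPLE))))
```

On this excerpt the script prints `kernel <- spl <- zfs <- osd_zfs <- lod <- mdd <- mdt`, which matches the trace: a create request enters through mdt and blocks in an spl condition-variable wait beneath zfs.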
      

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/b7fd9ccc-c387-11e9-90ad-52540065bddc

      test_602 failed with the following error:

      Timeout occurred after 207 mins, last suite running was sanity-hsm, restarting cluster to continue tests
      


          People

            Assignee: WC Triage (wc-triage)
            Reporter: Maloo (maloo)