Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-8791

Suspicious deadlock on OST stack

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: Lustre 2.9.0
    • Fix Version/s: Lustre 2.9.0
    • Labels:
    • Severity:
      3
    • Rank (Obsolete):
      9223372036854775807

      Description

      This problem occurred in soaked test.

      There are two suspicious call trace which leads me to suspect there exists a deadlock in the code:

      LNet: Service thread pid 30124 was inactive for 1200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
      LNet: Skipped 3 previous similar messages
      Pid: 30124, comm: ll_ost_io00_116
      
      Call Trace:
       [<ffffffff81178260>] ? cache_alloc_refill+0x1c0/0x240
       [<ffffffff81291489>] ? cpumask_next_and+0x29/0x50
       [<ffffffff8153cc55>] rwsem_down_failed_common+0x95/0x1d0
       [<ffffffff8153cde6>] rwsem_down_read_failed+0x26/0x30
       [<ffffffff8129e8f4>] call_rwsem_down_read_failed+0x14/0x30
       [<ffffffff8153c2e4>] ? down_read+0x24/0x30
       [<ffffffffa0f13917>] osd_attr_get+0x77/0x2e0 [osd_zfs]
       [<ffffffffa1022301>] ofd_attr_handle_ugid+0x161/0x430 [ofd]
       [<ffffffffa1027ebd>] ofd_write_attr_set+0xad/0x9e0 [ofd]
       [<ffffffffa102933f>] ofd_commitrw_write+0x23f/0x10a0 [ofd]
       [<ffffffffa102df1d>] ? ofd_fmd_find_nolock+0xad/0xd0 [ofd]
       [<ffffffffa102a75f>] ofd_commitrw+0x5bf/0xb10 [ofd]
       [<ffffffff81149461>] ? kzfree+0x31/0x40
       [<ffffffffa079d5c1>] ? lprocfs_counter_add+0x151/0x1c0 [obdclass]
       [<ffffffffa0a17ad4>] obd_commitrw+0x114/0x380 [ptlrpc]
       [<ffffffffa0a20870>] tgt_brw_write+0xc70/0x1530 [ptlrpc]
       [<ffffffff8105e9b6>] ? enqueue_task+0x66/0x80
       [<ffffffff8105ab8d>] ? check_preempt_curr+0x6d/0x90
       [<ffffffff810674be>] ? try_to_wake_up+0x24e/0x3e0
       [<ffffffffa0974e20>] ? target_bulk_timeout+0x0/0xc0 [ptlrpc]
       [<ffffffffa0a1f0ac>] tgt_request_handle+0x8ec/0x1440 [ptlrpc]
       [<ffffffffa09cb7b1>] ptlrpc_main+0xd31/0x1800 [ptlrpc]
       [<ffffffff81539b0e>] ? thread_return+0x4e/0x7d0
       [<ffffffffa09caa80>] ? ptlrpc_main+0x0/0x1800 [ptlrpc]
       [<ffffffff810a138e>] kthread+0x9e/0xc0
       [<ffffffff8100c28a>] child_rip+0xa/0x20
       [<ffffffff810a12f0>] ? kthread+0x0/0xc0
       [<ffffffff8100c280>] ? child_rip+0x0/0x20
      

      And then another one:

      Call Trace:
       [<ffffffffa015208c>] ? mlx4_ib_post_send+0x4fc/0x1370 [mlx4_ib]
       [<ffffffff8153cc55>] rwsem_down_failed_common+0x95/0x1d0
       [<ffffffff8153cde6>] rwsem_down_read_failed+0x26/0x30
       [<ffffffff8129e8f4>] call_rwsem_down_read_failed+0x14/0x30
       [<ffffffff8153c2e4>] ? down_read+0x24/0x30
       [<ffffffffa037e9b0>] dmu_buf_hold_array_by_dnode+0x50/0x4a0 [zfs]
       [<ffffffffa037f85b>] dmu_buf_hold_array_by_bonus+0x6b/0x90 [zfs]
       [<ffffffffa0f149af>] osd_bufs_get+0x4bf/0xb50 [osd_zfs]
       [<ffffffffa07c0af6>] ? lu_object_find+0x16/0x20 [obdclass]
       [<ffffffffa102893f>] ofd_preprw_read+0x14f/0x910 [ofd]
       [<ffffffffa102b008>] ofd_preprw+0x358/0x14f0 [ofd]
       [<ffffffffa0a1774f>] obd_preprw+0x10f/0x380 [ptlrpc]
       [<ffffffffa0a217c5>] tgt_brw_read+0x695/0x11d0 [ptlrpc]
       [<ffffffffa09e4194>] ? sptlrpc_svc_alloc_rs+0x74/0x360 [ptlrpc]
       [<ffffffffa09bab8b>] ? lustre_pack_reply_v2+0x1eb/0x280 [ptlrpc]
       [<ffffffffa09bacc6>] ? lustre_pack_reply_flags+0xa6/0x1e0 [ptlrpc]
       [<ffffffffa067ec8a>] ? lc_watchdog_touch+0x7a/0x190 [libcfs]
       [<ffffffffa0a1f0ac>] tgt_request_handle+0x8ec/0x1440 [ptlrpc]
       [<ffffffffa09cb7b1>] ptlrpc_main+0xd31/0x1800 [ptlrpc]
       [<ffffffff8106ee50>] ? pick_next_task_fair+0xd0/0x130
       [<ffffffff81539896>] ? schedule+0x176/0x3a0
       [<ffffffffa09caa80>] ? ptlrpc_main+0x0/0x1800 [ptlrpc]
       [<ffffffff810a138e>] kthread+0x9e/0xc0
       [<ffffffff8100c28a>] child_rip+0xa/0x20
       [<ffffffff810a12f0>] ? kthread+0x0/0xc0
       [<ffffffff8100c280>] ? child_rip+0x0/0x20
      
      Pid: 29297, comm: ll_ost_io00_003
      

      Two more:

      ll_ost_io00_0 D 0000000000000003     0 29439      2 0x00000080
       ffff8803d6a1f6f0 0000000000000046 0000000000000000 ffff8803d6a1f6b4
       0000000000000000 ffff88043fe82400 00006a78551f8e00 ffff8800387f5a00
       0000000000002c62 0000000106f6922c ffff8803d6a1bad8 ffff8803d6a1ffd8
      Call Trace:
       [<ffffffffa0314e5d>] cv_wait_common+0x11d/0x130 [spl]
       [<ffffffff810a1820>] ? autoremove_wake_function+0x0/0x40
       [<ffffffffa0314ec5>] __cv_wait+0x15/0x20 [spl]
       [<ffffffffa03757fb>] dbuf_read+0x3bb/0x850 [zfs]
       [<ffffffff8153afdb>] ? mutex_lock+0x2b/0x50
       [<ffffffffa0372a13>] ? dbuf_find+0x193/0x1a0 [zfs]
       [<ffffffffa0377c78>] dmu_buf_will_dirty+0x58/0xc0 [zfs]
       [<ffffffffa0377d37>] dbuf_new_size+0x57/0x1c0 [zfs]
       [<ffffffffa039259c>] dnode_set_blksz+0x30c/0x330 [zfs]
       [<ffffffffa037cbb8>] dmu_object_set_blocksize+0x48/0x70 [zfs]
       [<ffffffffa0f15e00>] osd_write_commit+0x770/0x970 [osd_zfs]
       [<ffffffffa102980b>] ofd_commitrw_write+0x70b/0x10a0 [ofd]
       [<ffffffffa102a75f>] ofd_commitrw+0x5bf/0xb10 [ofd]
       [<ffffffff81149461>] ? kzfree+0x31/0x40
       [<ffffffffa079d5c1>] ? lprocfs_counter_add+0x151/0x1c0 [obdclass]
       [<ffffffffa0a17ad4>] obd_commitrw+0x114/0x380 [ptlrpc]
       [<ffffffffa0a20870>] tgt_brw_write+0xc70/0x1530 [ptlrpc]
       [<ffffffff8105e9b6>] ? enqueue_task+0x66/0x80
       [<ffffffff8105ab8d>] ? check_preempt_curr+0x6d/0x90
       [<ffffffff810674be>] ? try_to_wake_up+0x24e/0x3e0
       [<ffffffffa0974e20>] ? target_bulk_timeout+0x0/0xc0 [ptlrpc]
       [<ffffffffa0a1f0ac>] tgt_request_handle+0x8ec/0x1440 [ptlrpc]
       [<ffffffffa09cb7b1>] ptlrpc_main+0xd31/0x1800 [ptlrpc]
       [<ffffffff8106ee50>] ? pick_next_task_fair+0xd0/0x130
       [<ffffffff81539896>] ? schedule+0x176/0x3a0
       [<ffffffffa09caa80>] ? ptlrpc_main+0x0/0x1800 [ptlrpc]
       [<ffffffff810a138e>] kthread+0x9e/0xc0
       [<ffffffff8100c28a>] child_rip+0xa/0x20
       [<ffffffff810a12f0>] ? kthread+0x0/0xc0
       [<ffffffff8100c280>] ? child_rip+0x0/0x20
      
      ll_ost_io00_1 D 0000000000000005     0 29763      2 0x00000080
       ffff88021676b768 0000000000000046 0000000000000000 ffff88021676b72c
       ffff880400000000 ffff88043fe82800 00006a78551f88c2 ffff8800387f5a00
       0000000000002c58 0000000106f6922c ffff8803d27b9ad8 ffff88021676bfd8
      Call Trace:
       [<ffffffffa030db8e>] ? spl_kmem_zalloc+0x8e/0x190 [spl]
       [<ffffffff8153cc55>] rwsem_down_failed_common+0x95/0x1d0
       [<ffffffff8153afce>] ? mutex_lock+0x1e/0x50
       [<ffffffff8153cde6>] rwsem_down_read_failed+0x26/0x30
       [<ffffffff8129e8f4>] call_rwsem_down_read_failed+0x14/0x30
       [<ffffffff8153c2e4>] ? down_read+0x24/0x30
       [<ffffffffa0376bb1>] dbuf_dirty+0x381/0x980 [zfs]
       [<ffffffffa037728a>] dbuf_assign_arcbuf+0xda/0x2b0 [zfs]
       [<ffffffffa037f679>] dmu_assign_arcbuf+0x149/0x1f0 [zfs]
       [<ffffffffa0f15ac1>] osd_write_commit+0x431/0x970 [osd_zfs]
       [<ffffffffa102980b>] ofd_commitrw_write+0x70b/0x10a0 [ofd]
       [<ffffffffa102a75f>] ofd_commitrw+0x5bf/0xb10 [ofd]
       [<ffffffff81149461>] ? kzfree+0x31/0x40
       [<ffffffffa079d5c1>] ? lprocfs_counter_add+0x151/0x1c0 [obdclass]
       [<ffffffffa0a17ad4>] obd_commitrw+0x114/0x380 [ptlrpc]
       [<ffffffffa0a20870>] tgt_brw_write+0xc70/0x1530 [ptlrpc]
       [<ffffffff8105e9b6>] ? enqueue_task+0x66/0x80
       [<ffffffff8105ab8d>] ? check_preempt_curr+0x6d/0x90
       [<ffffffff810674be>] ? try_to_wake_up+0x24e/0x3e0
       [<ffffffffa0974e20>] ? target_bulk_timeout+0x0/0xc0 [ptlrpc]
       [<ffffffffa0a1f0ac>] tgt_request_handle+0x8ec/0x1440 [ptlrpc]
       [<ffffffffa09cb7b1>] ptlrpc_main+0xd31/0x1800 [ptlrpc]
       [<ffffffff81539b0e>] ? thread_return+0x4e/0x7d0
       [<ffffffffa09caa80>] ? ptlrpc_main+0x0/0x1800 [ptlrpc]
       [<ffffffff810a138e>] kthread+0x9e/0xc0
       [<ffffffff8100c28a>] child_rip+0xa/0x20
       [<ffffffff810a12f0>] ? kthread+0x0/0xc0
       [<ffffffff8100c280>] ? child_rip+0x0/0x20
      

      Due to this issue, other service threads can't move forward because this tx can never complete.

      The node died when I was trying to collect more information. However, these callback trace are enough to start an initial analysis.

        Attachments

        1. lola-10.lfsck.full-debug.txt.gz
          4.50 MB
        2. lola-10.lfsck.hang.txt.gz
          94 kB
        3. lola-5.mount.fulldebug.txt.gz
          4.49 MB
        4. lola-5.mount.hang.txt.gz
          2.52 MB
        5. lola-8.lfsck.hang.txt.gz
          0.2 kB

          Activity

            People

            • Assignee:
              jay Jinshan Xiong (Inactive)
              Reporter:
              jay Jinshan Xiong (Inactive)
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: