
Backport patches from upstream to resolve deadlock in xattr

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.10.0

    Description

      We need to backport the patches below from upstream to resolve a deadlock on i_data_sem.

      From a521100231f816f8cdd9c8e77da14ff1e42c2b17 Mon Sep 17 00:00:00 2001
      From: Theodore Ts'o <tytso@mit.edu>
      Date: Thu, 4 Sep 2014 18:06:25 -0400
      Subject: [PATCH] ext4: pass allocation_request struct to
      ext4_(alloc,splice)_branch

      Instead of initializing the allocation_request structure in
      ext4_alloc_branch(), set it up in ext4_ind_map_blocks(), and then pass
      it to ext4_alloc_branch() and ext4_splice_branch().

      This allows ext4_ind_map_blocks to pass flags in the allocation
      request structure without having to add Yet Another argument to
      ext4_alloc_branch().

      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Jan Kara <jack@suse.cz>
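
The refactoring described in this commit message can be illustrated with a small user-space sketch. It is not the ext4 code: `toy_alloc_request`, `toy_alloc_branch`, `toy_map_blocks` and the flag value are hypothetical names made up for illustration, and the "allocator" is a placeholder. The point is only that the top-level caller builds the request once, so a new flag travels inside the structure and the helper's signature does not change.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an allocation request structure. */
struct toy_alloc_request {
    unsigned long goal;   /* preferred starting block */
    unsigned int  len;    /* number of blocks wanted */
    unsigned int  flags;  /* hints, filled in once by the top-level caller */
};

#define TOY_FLAG_RESERVED 0x1u  /* illustrative flag, not a kernel constant */

/* Lower-level helper: takes the whole request instead of a growing
 * list of separate arguments. */
static unsigned long toy_alloc_branch(const struct toy_alloc_request *ar)
{
    assert(ar != NULL);
    /* Placeholder policy: a reserved request is served at the goal,
     * any other request is pushed one block further. */
    return (ar->flags & TOY_FLAG_RESERVED) ? ar->goal : ar->goal + 1;
}

/* Top-level mapper: sets up the request and passes it down. */
static unsigned long toy_map_blocks(unsigned long goal, int reserved)
{
    struct toy_alloc_request ar = { .goal = goal, .len = 1, .flags = 0 };

    if (reserved)
        ar.flags |= TOY_FLAG_RESERVED;
    return toy_alloc_branch(&ar);
}
```

Calling `toy_map_blocks(100, 1)` and `toy_map_blocks(100, 0)` exercises both paths without `toy_alloc_branch()` ever needing an extra parameter.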

      From e3cf5d5d9a86df1c5e413bdd3725c25a16ff854c Mon Sep 17 00:00:00 2001
      From: Theodore Ts'o <tytso@mit.edu>
      Date: Thu, 4 Sep 2014 18:07:25 -0400
      Subject: [PATCH] ext4: prepare to drop EXT4_STATE_DELALLOC_RESERVED

      The EXT4_STATE_DELALLOC_RESERVED flag was originally implemented
      because it was too hard to make sure the mballoc and get_block flags
      could be reliably passed down through all of the codepaths that end up
      calling ext4_mb_new_blocks().

      Since then, we have mb_flags passed down through most of the code
      paths, so getting rid of EXT4_STATE_DELALLOC_RESERVED isn't as tricky
      as it used to.

      This commit plumbs in the last of what is required, and then adds a
      WARN_ON check to make sure we haven't missed anything. If this passes
      a full regression test run, we can then drop
      EXT4_STATE_DELALLOC_RESERVED.

      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Jan Kara <jack@suse.cz>
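
A rough user-space model of the transitional check described above, assuming a per-inode state bit and a per-request flag that are supposed to agree. All names here (`toy_inode_state`, `toy_mb_new_blocks`, `TOY_MB_DELALLOC_RESERVED`) are hypothetical, and the warning counter stands in for the kernel's WARN_ON; the actual condition used in the upstream patch is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_MB_DELALLOC_RESERVED 0x1u  /* illustrative request flag */

/* Models the per-inode state that the commit prepares to drop. */
struct toy_inode_state {
    bool delalloc_reserved;
};

/* Counts how often the explicit flag was missing. */
static int toy_missed_plumbing;

/* Allocator entry point: trusts the explicit flag, but still compares
 * it with the old per-inode state to catch call paths that were missed. */
static void toy_mb_new_blocks(const struct toy_inode_state *st,
                              unsigned int flags)
{
    bool flag_set = (flags & TOY_MB_DELALLOC_RESERVED) != 0;

    assert(st != NULL);
    if (st->delalloc_reserved && !flag_set) {
        toy_missed_plumbing++;
        fprintf(stderr, "warning: reserved state set but flag not passed\n");
    }
}
```

If a full test run never increments the counter, every path passes the flag explicitly and the per-inode state is redundant, which is the argument the commit message makes for dropping it later.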

      From 2e81a4eeedcaa66e35f58b81e0755b87057ce392 Mon Sep 17 00:00:00 2001
      From: Jan Kara <jack@suse.cz>
      Date: Thu, 11 Aug 2016 12:38:55 -0400
      Subject: [PATCH] ext4: avoid deadlock when expanding inode size

      When we need to move xattrs into external xattr block, we call
      ext4_xattr_block_set() from ext4_expand_extra_isize_ea(). That may end
      up calling ext4_mark_inode_dirty() again which will recurse back into
      the inode expansion code leading to deadlocks.

      Protect from recursion using EXT4_STATE_NO_EXPAND inode flag and move
      its management into ext4_expand_extra_isize_ea() since its manipulation
      is safe there (due to xattr_sem) from possible races with
      ext4_xattr_set_handle() which plays with it as well.

      CC: stable@vger.kernel.org # 4.4.x
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
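
The recursion this commit guards against can be modelled in a few lines of user-space C. This is a single-threaded sketch under stated assumptions: the `toy_*` functions and `struct toy_inode` are hypothetical stand-ins for the ext4 call chain named above, a plain boolean models EXT4_STATE_NO_EXPAND, and the locking that xattr_sem provides in the kernel is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_inode {
    bool no_expand;     /* models the EXT4_STATE_NO_EXPAND flag */
    int  expand_calls;  /* times the expansion body actually ran */
    int  dirty_calls;   /* times the inode was marked dirty */
};

static void toy_mark_inode_dirty(struct toy_inode *inode);

/* Moving xattrs to an external block dirties the inode again. */
static void toy_xattr_block_set(struct toy_inode *inode)
{
    toy_mark_inode_dirty(inode);
}

/* Expansion path: the flag is set and cleared here, so a nested call
 * made while expansion is in progress returns immediately. */
static void toy_expand_extra_isize_ea(struct toy_inode *inode)
{
    assert(inode != NULL);
    if (inode->no_expand)
        return;                 /* already expanding: do not recurse */

    inode->no_expand = true;
    inode->expand_calls++;
    toy_xattr_block_set(inode); /* may re-enter toy_mark_inode_dirty() */
    inode->no_expand = false;
}

static void toy_mark_inode_dirty(struct toy_inode *inode)
{
    inode->dirty_calls++;
    toy_expand_extra_isize_ea(inode);
}
```

Starting from a zeroed `struct toy_inode`, one call to `toy_mark_inode_dirty()` marks the inode dirty twice but runs the expansion body only once. Without the flag check the same call would recurse without bound.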

          Activity

            [LU-9146] Backport patches from upstream to resolve deadlock in xattr
            Yang Sheng added a comment:

            Just for the record, here is the OSS stack trace from host gio12:

            Feb 22 12:41:41 gio12 kernel: Lustre: Skipped 686202 previous similar messages
            Feb 22 12:41:50 gio12 kernel: Lustre: dtemp-OST001b: recovery is timed out, evict stale exports
            Feb 22 12:41:50 gio12 kernel: Lustre: dtemp-OST001b: disconnecting 1 stale clients
            Feb 22 12:41:50 gio12 kernel: Lustre: dtemp-OST001b: Client a83807d9-ca3b-9fd3-3cbc-1d2b648b12d1 (at 172.22.160.62@o2ib6) reconnecting
            Feb 22 12:41:50 gio12 kernel: Lustre: Skipped 894 previous similar messages
            Feb 22 12:41:52 gio12 kernel: Lustre: dtemp-OST001b: Recovery over after 14:56, of 1435 clients 1434 recovered and 1 was evicted.
            Feb 22 12:41:52 gio12 kernel: Lustre: Skipped 1 previous similar message
            Feb 22 12:41:52 gio12 kernel: Lustre: dtemp-OST001b: deleting orphan objects from 0x0:7139820 to 0x0:7139873
            Feb 22 12:44:16 gio12 kernel: INFO: task ll_ost_io01_002:14056 blocked for more than 120 seconds.
            Feb 22 12:44:16 gio12 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
            Feb 22 12:44:16 gio12 kernel: ll_ost_io01_002 D 0000000000000000     0 14056      2 0x00000080
            Feb 22 12:44:16 gio12 kernel:  ffff881020b9b898 0000000000000046 ffff88102151b980 ffff881020b9bfd8
            Feb 22 12:44:16 gio12 kernel:  ffff881020b9bfd8 ffff881020b9bfd8 ffff88102151b980 ffff88102151b980
            Feb 22 12:44:16 gio12 kernel:  ffff8807a92bda90 fffffffeffffffff ffff8807a92bda98 0000000000000000
            Feb 22 12:44:16 gio12 kernel: Call Trace:
            Feb 22 12:44:16 gio12 kernel:  [<ffffffff8163bf19>] schedule+0x29/0x70
            Feb 22 12:44:16 gio12 kernel:  [<ffffffff8163d8d5>] rwsem_down_read_failed+0xf5/0x170
            Feb 22 12:44:16 gio12 kernel:  [<ffffffff81301e54>] call_rwsem_down_read_failed+0x14/0x30
            Feb 22 12:44:16 gio12 kernel:  [<ffffffff8163b130>] ? down_read+0x20/0x30
            Feb 22 12:44:16 gio12 kernel:  [<ffffffffa0def84b>] ldiskfs_xattr_block_set+0x62b/0xa80 [ldiskfs]
            Feb 22 12:44:16 gio12 kernel:  [<ffffffffa0df09d4>] ldiskfs_expand_extra_isize_ea+0x404/0x810 [ldiskfs]
            Feb 22 12:44:16 gio12 kernel:  [<ffffffffa0df6d9f>] ldiskfs_mark_inode_dirty+0x1af/0x210 [ldiskfs]
            Feb 22 12:44:16 gio12 kernel:  [<ffffffffa0de0884>] ldiskfs_ext_truncate+0x24/0xe0 [ldiskfs]
            Feb 22 12:44:16 gio12 kernel:  [<ffffffffa0df83b7>] ldiskfs_truncate+0x3b7/0x3f0 [ldiskfs]
            Feb 22 12:44:16 gio12 kernel:  [<ffffffffa0e92e08>] osd_punch+0x138/0x5e0 [osd_ldiskfs]
            Feb 22 12:44:16 gio12 kernel:  [<ffffffffa0c84346>] ofd_object_punch+0x6e6/0xc30 [ofd]
            Feb 22 12:44:16 gio12 kernel:  [<ffffffffa0c715e6>] ofd_punch_hdl+0x466/0x720 [ofd]
            Feb 22 12:44:16 gio12 kernel:  [<ffffffffa109bc9b>] tgt_request_handle+0x8fb/0x11f0 [ptlrpc]
            Feb 22 12:44:16 gio12 kernel:  [<ffffffffa103ea3b>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
            Feb 22 12:44:16 gio12 kernel:  [<ffffffffa0815cf8>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
            Feb 22 12:44:16 gio12 kernel:  [<ffffffffa103bb08>] ? ptlrpc_wait_event+0x98/0x330 [ptlrpc]
            Feb 22 12:44:16 gio12 kernel:  [<ffffffff8163e05b>] ? _raw_spin_unlock_irqrestore+0x1b/0x40
            Feb 22 12:44:16 gio12 kernel:  [<ffffffffa1042360>] ptlrpc_main+0xc00/0x1f60 [ptlrpc]
            Feb 22 12:44:16 gio12 kernel:  [<ffffffffa1041760>] ? ptlrpc_register_service+0x1070/0x1070 [ptlrpc]
            Feb 22 12:44:16 gio12 kernel:  [<ffffffff810a5baf>] kthread+0xcf/0xe0
            Feb 22 12:44:16 gio12 kernel:  [<ffffffff810a5ae0>] ? kthread_create_on_node+0x140/0x140
            Feb 22 12:44:16 gio12 kernel:  [<ffffffff81646e58>] ret_from_fork+0x58/0x90
            Feb 22 12:44:16 gio12 kernel:  [<ffffffff810a5ae0>] ? kthread_create_on_node+0x140/0x140
            Feb 22 12:44:16 gio12 kernel: INFO: task jbd2/dm-11-8:15759 blocked for more than 120 seconds.
            Feb 22 12:44:16 gio12 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
            Feb 22 12:44:16 gio12 kernel: jbd2/dm-11-8    D ffff880036446800     0 15759      2 0x00000080
            Feb 22 12:44:16 gio12 kernel:  ffff880fdd21bc88 0000000000000046 ffff88104f747300 ffff880fdd21bfd8
            Feb 22 12:44:16 gio12 kernel:  ffff880fdd21bfd8 ffff880fdd21bfd8 ffff88104f747300 ffff880fdd21bda0
            Feb 22 12:44:16 gio12 kernel:  ffff881016e128c0 ffff88104f747300 ffff880fdd21bd88 ffff880036446800
            Feb 22 12:44:16 gio12 kernel: Call Trace:
            Feb 22 12:44:16 gio12 kernel:  [<ffffffff8163bf19>] schedule+0x29/0x70
            Feb 22 12:44:16 gio12 kernel:  [<ffffffffa018d138>] jbd2_journal_commit_transaction+0x248/0x19e0 [jbd2]
            Feb 22 12:44:16 gio12 kernel:  [<ffffffff810c15fc>] ? update_curr+0xcc/0x150
            Feb 22 12:44:16 gio12 kernel:  [<ffffffff810c1ac6>] ? dequeue_entity+0x106/0x520
            Feb 22 12:44:16 gio12 kernel:  [<ffffffff81013588>] ? __switch_to+0xf8/0x4b0
            Feb 22 12:44:16 gio12 kernel:  [<ffffffff810a6ba0>] ? wake_up_atomic_t+0x30/0x30
            Feb 22 12:44:16 gio12 kernel:  [<ffffffff8108d7be>] ? try_to_del_timer_sync+0x5e/0x90
            Feb 22 12:44:16 gio12 kernel:  [<ffffffffa0192e99>] kjournald2+0xc9/0x260 [jbd2]
            Feb 22 12:44:16 gio12 kernel:  [<ffffffff810a6ba0>] ? wake_up_atomic_t+0x30/0x30
            Feb 22 12:44:16 gio12 kernel:  [<ffffffffa0192dd0>] ? commit_timeout+0x10/0x10 [jbd2]
            Feb 22 12:44:16 gio12 kernel:  [<ffffffff810a5baf>] kthread+0xcf/0xe0
            Feb 22 12:44:16 gio12 kernel:  [<ffffffff810a5ae0>] ? kthread_create_on_node+0x140/0x140
            Feb 22 12:44:16 gio12 kernel:  [<ffffffff81646e58>] ret_from_fork+0x58/0x90
            Feb 22 12:44:16 gio12 kernel:  [<ffffffff810a5ae0>] ? kthread_create_on_node+0x140/0x140
            Feb 22 12:44:16 gio12 kernel: INFO: task kworker/u33:2:28976 blocked for more than 120 seconds.
            Feb 22 12:44:16 gio12 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
            Feb 22 12:44:16 gio12 kernel: kworker/u33:2   D ffff880850ea8030     0 28976      2 0x00000080
            Feb 22 12:44:16 gio12 kernel: Workqueue: writeback bdi_writeback_workfn (flush-253:11)
            Feb 22 12:44:16 gio12 kernel:  ffff880462d2f8e8 0000000000000046 ffff88084f0a8b80 ffff880462d2ffd8
            Feb 22 12:44:16 gio12 kernel:  ffff880462d2ffd8 ffff880462d2ffd8 ffff88084f0a8b80 ffff881016e12800
            Feb 22 12:44:16 gio12 kernel:  ffff881016e12878 000000000d83b523 ffff880036446800 ffff880850ea8030
            Feb 22 12:44:16 gio12 kernel: Call Trace:
            Feb 22 12:44:16 gio12 kernel:  [<ffffffff8163bf19>] schedule+0x29/0x70
            Feb 22 12:44:16 gio12 kernel:  [<ffffffffa018a085>] wait_transaction_locked+0x85/0xd0 [jbd2]
            Feb 22 12:44:16 gio12 kernel:  [<ffffffff810a6ba0>] ? wake_up_atomic_t+0x30/0x30
            Feb 22 12:44:16 gio12 kernel:  [<ffffffffa018a400>] start_this_handle+0x2b0/0x5d0 [jbd2]
            Feb 22 12:44:16 gio12 kernel:  [<ffffffff811c176a>] ? kmem_cache_alloc+0x1ba/0x1d0
            Feb 22 12:44:16 gio12 kernel:  [<ffffffffa018a933>] jbd2__journal_start+0xf3/0x1e0 [jbd2]
            Feb 22 12:44:16 gio12 kernel:  [<ffffffffa0df7254>] ? ldiskfs_writepages+0x454/0xd80 [ldiskfs]
            Feb 22 12:44:16 gio12 kernel:  [<ffffffffa0dd6829>] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
            Feb 22 12:44:16 gio12 kernel:  [<ffffffffa0df7254>] ldiskfs_writepages+0x454/0xd80 [ldiskfs]
            Feb 22 12:44:16 gio12 kernel:  [<ffffffff81174d08>] ? generic_writepages+0x58/0x80
            Feb 22 12:44:17 gio12 kernel:  [<ffffffff81175dae>] do_writepages+0x1e/0x40
            Feb 22 12:44:17 gio12 kernel:  [<ffffffff81208c90>] __writeback_single_inode+0x40/0x220
            Feb 22 12:44:17 gio12 kernel:  [<ffffffff812096fe>] writeback_sb_inodes+0x25e/0x420
            Feb 22 12:44:17 gio12 kernel:  [<ffffffff8120995f>] __writeback_inodes_wb+0x9f/0xd0
            Feb 22 12:44:17 gio12 kernel:  [<ffffffff8120a1a3>] wb_writeback+0x263/0x2f0
            Feb 22 12:44:17 gio12 kernel:  [<ffffffff811f8fac>] ? get_nr_inodes+0x4c/0x70
            Feb 22 12:44:17 gio12 kernel:  [<ffffffff8120c42b>] bdi_writeback_workfn+0x2cb/0x460
            Feb 22 12:44:17 gio12 kernel:  [<ffffffff8109d6bb>] process_one_work+0x17b/0x470
            Feb 22 12:44:17 gio12 kernel:  [<ffffffff8109e48b>] worker_thread+0x11b/0x400
            Feb 22 12:44:17 gio12 kernel:  [<ffffffff8109e370>] ? rescuer_thread+0x400/0x400
            Feb 22 12:44:17 gio12 kernel:  [<ffffffff810a5baf>] kthread+0xcf/0xe0
            Feb 22 12:44:17 gio12 kernel:  [<ffffffff810a5ae0>] ? kthread_create_on_node+0x140/0x140
            Feb 22 12:44:17 gio12 kernel:  [<ffffffff81646e58>] ret_from_fork+0x58/0x90
            Feb 22 12:44:17 gio12 kernel:  [<ffffffff810a5ae0>] ? kthread_create_on_node+0x140/0x140
            Feb 22 12:44:17 gio12 kernel: INFO: task ll_ost_io03_004:32100 blocked for more than 120 seconds.
            Feb 22 12:44:17 gio12 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
            Feb 22 12:44:17 gio12 kernel: ll_ost_io03_004 D ffff881053a24060     0 32100      2 0x00000080
            Feb 22 12:44:17 gio12 kernel:  ffff88031eb6f9f0 0000000000000046 ffff8808a62c2280 ffff88031eb6ffd8
            Feb 22 12:44:17 gio12 kernel:  ffff88031eb6ffd8 ffff88031eb6ffd8 ffff8808a62c2280 ffff881016e12800
            Feb 22 12:44:17 gio12 kernel:  ffff881016e12878 000000000d83b523 ffff880036446800 ffff881053a24060
            Feb 22 12:44:17 gio12 kernel: Call Trace:
            Feb 22 12:44:17 gio12 kernel:  [<ffffffff8163bf19>] schedule+0x29/0x70
            Feb 22 12:44:17 gio12 kernel:  [<ffffffffa018a085>] wait_transaction_locked+0x85/0xd0 [jbd2]
            Feb 22 12:44:17 gio12 kernel:  [<ffffffff810a6ba0>] ? wake_up_atomic_t+0x30/0x30
            Feb 22 12:44:17 gio12 kernel:  [<ffffffffa018a400>] start_this_handle+0x2b0/0x5d0 [jbd2]
            Feb 22 12:44:17 gio12 kernel:  [<ffffffffa0e713d4>] ? osd_declare_xattr_set+0xe4/0x2e0 [osd_ldiskfs]
            Feb 22 12:44:17 gio12 kernel:  [<ffffffff811c176a>] ? kmem_cache_alloc+0x1ba/0x1d0
            Feb 22 12:44:17 gio12 kernel:  [<ffffffffa018a933>] jbd2__journal_start+0xf3/0x1e0 [jbd2]
            Feb 22 12:44:17 gio12 kernel:  [<ffffffffa0e78534>] ? osd_trans_start+0x174/0x410 [osd_ldiskfs]
            Feb 22 12:44:17 gio12 kernel:  [<ffffffffa0dd6829>] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
            Feb 22 12:44:17 gio12 kernel:  [<ffffffffa0e78534>] osd_trans_start+0x174/0x410 [osd_ldiskfs]
            Feb 22 12:44:17 gio12 kernel:  [<ffffffffa0c80d7b>] ofd_trans_start+0x6b/0xe0 [ofd]
            Feb 22 12:44:17 gio12 kernel:  [<ffffffffa0c8428a>] ofd_object_punch+0x62a/0xc30 [ofd]
            Feb 22 12:44:17 gio12 kernel:  [<ffffffffa0c715e6>] ofd_punch_hdl+0x466/0x720 [ofd]
            Feb 22 12:44:17 gio12 kernel:  [<ffffffffa109bc9b>] tgt_request_handle+0x8fb/0x11f0 [ptlrpc]
            Feb 22 12:44:17 gio12 kernel:  [<ffffffffa103ea3b>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
            Feb 22 12:44:17 gio12 kernel:  [<ffffffffa0815cf8>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
            Feb 22 12:44:17 gio12 kernel:  [<ffffffffa103bb08>] ? ptlrpc_wait_event+0x98/0x330 [ptlrpc]
            Feb 22 12:44:17 gio12 kernel:  [<ffffffff810af0e8>] ? __wake_up_common+0x58/0x90
            Feb 22 12:44:17 gio12 kernel:  [<ffffffffa1042360>] ptlrpc_main+0xc00/0x1f60 [ptlrpc]
            Feb 22 12:44:17 gio12 kernel:  [<ffffffffa1041760>] ? ptlrpc_register_service+0x1070/0x1070 [ptlrpc]
            Feb 22 12:44:17 gio12 kernel:  [<ffffffff810a5baf>] kthread+0xcf/0xe0
            Feb 22 12:44:17 gio12 kernel:  [<ffffffff810a5ae0>] ? kthread_create_on_node+0x140/0x140
            Feb 22 12:44:17 gio12 kernel:  [<ffffffff81646e58>] ret_from_fork+0x58/0x90
            Feb 22 12:44:17 gio12 kernel:  [<ffffffff810a5ae0>] ? kthread_create_on_node+0x140/0x140
            Feb 22 12:44:39 gio12 kernel: LustreError: 137-5: dtemp-OST001c_UUID: not available for connect from 172.22.166.12@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
            Feb 22 12:44:39 gio12 kernel: LustreError: Skipped 1679 previous similar messages
            ...
            Feb 22 12:45:42 gio12 kernel: LustreError: 137-5: dtemp-OST001c_UUID: not available for connect from 172.22.166.11@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
            Feb 22 12:46:17 gio12 kernel: INFO: task ll_ost_io01_002:14056 blocked for more than 120 seconds.
            Feb 22 12:46:17 gio12 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
            Feb 22 12:46:17 gio12 kernel: ll_ost_io01_002 D 0000000000000000     0 14056      2 0x00000080
            Feb 22 12:46:17 gio12 kernel:  ffff881020b9b898 0000000000000046 ffff88102151b980 ffff881020b9bfd8
            Feb 22 12:46:17 gio12 kernel:  ffff881020b9bfd8 ffff881020b9bfd8 ffff88102151b980 ffff88102151b980
            Feb 22 12:46:17 gio12 kernel:  ffff8807a92bda90 fffffffeffffffff ffff8807a92bda98 0000000000000000
            Feb 22 12:46:17 gio12 kernel: Call Trace:
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff8163bf19>] schedule+0x29/0x70
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff8163d8d5>] rwsem_down_read_failed+0xf5/0x170
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff81301e54>] call_rwsem_down_read_failed+0x14/0x30
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff8163b130>] ? down_read+0x20/0x30
            Feb 22 12:46:17 gio12 kernel:  [<ffffffffa0def84b>] ldiskfs_xattr_block_set+0x62b/0xa80 [ldiskfs]
            Feb 22 12:46:17 gio12 kernel:  [<ffffffffa0df09d4>] ldiskfs_expand_extra_isize_ea+0x404/0x810 [ldiskfs]
            Feb 22 12:46:17 gio12 kernel:  [<ffffffffa0df6d9f>] ldiskfs_mark_inode_dirty+0x1af/0x210 [ldiskfs]
            Feb 22 12:46:17 gio12 kernel:  [<ffffffffa0de0884>] ldiskfs_ext_truncate+0x24/0xe0 [ldiskfs]
            Feb 22 12:46:17 gio12 kernel:  [<ffffffffa0df83b7>] ldiskfs_truncate+0x3b7/0x3f0 [ldiskfs]
            Feb 22 12:46:17 gio12 kernel:  [<ffffffffa0e92e08>] osd_punch+0x138/0x5e0 [osd_ldiskfs]
            Feb 22 12:46:17 gio12 kernel:  [<ffffffffa0c84346>] ofd_object_punch+0x6e6/0xc30 [ofd]
            Feb 22 12:46:17 gio12 kernel:  [<ffffffffa0c715e6>] ofd_punch_hdl+0x466/0x720 [ofd]
            Feb 22 12:46:17 gio12 kernel:  [<ffffffffa109bc9b>] tgt_request_handle+0x8fb/0x11f0 [ptlrpc]
            Feb 22 12:46:17 gio12 kernel:  [<ffffffffa103ea3b>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
            Feb 22 12:46:17 gio12 kernel:  [<ffffffffa0815cf8>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
            Feb 22 12:46:17 gio12 kernel:  [<ffffffffa103bb08>] ? ptlrpc_wait_event+0x98/0x330 [ptlrpc]
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff8163e05b>] ? _raw_spin_unlock_irqrestore+0x1b/0x40
            Feb 22 12:46:17 gio12 kernel:  [<ffffffffa1042360>] ptlrpc_main+0xc00/0x1f60 [ptlrpc]
            Feb 22 12:46:17 gio12 kernel:  [<ffffffffa1041760>] ? ptlrpc_register_service+0x1070/0x1070 [ptlrpc]
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff810a5baf>] kthread+0xcf/0xe0
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff810a5ae0>] ? kthread_create_on_node+0x140/0x140
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff81646e58>] ret_from_fork+0x58/0x90
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff810a5ae0>] ? kthread_create_on_node+0x140/0x140
            Feb 22 12:46:17 gio12 kernel: INFO: task jbd2/dm-11-8:15759 blocked for more than 120 seconds.
            Feb 22 12:46:17 gio12 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
            Feb 22 12:46:17 gio12 kernel: jbd2/dm-11-8    D ffff880036446800     0 15759      2 0x00000080
            Feb 22 12:46:17 gio12 kernel:  ffff880fdd21bc88 0000000000000046 ffff88104f747300 ffff880fdd21bfd8
            Feb 22 12:46:17 gio12 kernel:  ffff880fdd21bfd8 ffff880fdd21bfd8 ffff88104f747300 ffff880fdd21bda0
            Feb 22 12:46:17 gio12 kernel:  ffff881016e128c0 ffff88104f747300 ffff880fdd21bd88 ffff880036446800
            Feb 22 12:46:17 gio12 kernel: Call Trace:
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff8163bf19>] schedule+0x29/0x70
            Feb 22 12:46:17 gio12 kernel:  [<ffffffffa018d138>] jbd2_journal_commit_transaction+0x248/0x19e0 [jbd2]
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff810c15fc>] ? update_curr+0xcc/0x150
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff810c1ac6>] ? dequeue_entity+0x106/0x520
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff81013588>] ? __switch_to+0xf8/0x4b0
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff810a6ba0>] ? wake_up_atomic_t+0x30/0x30
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff8108d7be>] ? try_to_del_timer_sync+0x5e/0x90
            Feb 22 12:46:17 gio12 kernel:  [<ffffffffa0192e99>] kjournald2+0xc9/0x260 [jbd2]
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff810a6ba0>] ? wake_up_atomic_t+0x30/0x30
            Feb 22 12:46:17 gio12 kernel:  [<ffffffffa0192dd0>] ? commit_timeout+0x10/0x10 [jbd2]
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff810a5baf>] kthread+0xcf/0xe0
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff810a5ae0>] ? kthread_create_on_node+0x140/0x140
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff81646e58>] ret_from_fork+0x58/0x90
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff810a5ae0>] ? kthread_create_on_node+0x140/0x140
            Feb 22 12:46:17 gio12 kernel: INFO: task kworker/u33:2:28976 blocked for more than 120 seconds.
            Feb 22 12:46:17 gio12 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
            Feb 22 12:46:17 gio12 kernel: kworker/u33:2   D ffff880850ea8030     0 28976      2 0x00000080
            Feb 22 12:46:17 gio12 kernel: Workqueue: writeback bdi_writeback_workfn (flush-253:11)
            Feb 22 12:46:17 gio12 kernel:  ffff880462d2f8e8 0000000000000046 ffff88084f0a8b80 ffff880462d2ffd8
            Feb 22 12:46:17 gio12 kernel:  ffff880462d2ffd8 ffff880462d2ffd8 ffff88084f0a8b80 ffff881016e12800
            Feb 22 12:46:17 gio12 kernel:  ffff881016e12878 000000000d83b523 ffff880036446800 ffff880850ea8030
            Feb 22 12:46:17 gio12 kernel: Call Trace:
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff8163bf19>] schedule+0x29/0x70
            Feb 22 12:46:17 gio12 kernel:  [<ffffffffa018a085>] wait_transaction_locked+0x85/0xd0 [jbd2]
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff810a6ba0>] ? wake_up_atomic_t+0x30/0x30
            Feb 22 12:46:17 gio12 kernel:  [<ffffffffa018a400>] start_this_handle+0x2b0/0x5d0 [jbd2]
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff811c176a>] ? kmem_cache_alloc+0x1ba/0x1d0
            Feb 22 12:46:17 gio12 kernel:  [<ffffffffa018a933>] jbd2__journal_start+0xf3/0x1e0 [jbd2]
            Feb 22 12:46:17 gio12 kernel:  [<ffffffffa0df7254>] ? ldiskfs_writepages+0x454/0xd80 [ldiskfs]
            Feb 22 12:46:17 gio12 kernel:  [<ffffffffa0dd6829>] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
            Feb 22 12:46:17 gio12 kernel:  [<ffffffffa0df7254>] ldiskfs_writepages+0x454/0xd80 [ldiskfs]
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff81174d08>] ? generic_writepages+0x58/0x80
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff81175dae>] do_writepages+0x1e/0x40
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff81208c90>] __writeback_single_inode+0x40/0x220
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff812096fe>] writeback_sb_inodes+0x25e/0x420
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff8120995f>] __writeback_inodes_wb+0x9f/0xd0
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff8120a1a3>] wb_writeback+0x263/0x2f0
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff811f8fac>] ? get_nr_inodes+0x4c/0x70
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff8120c42b>] bdi_writeback_workfn+0x2cb/0x460
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff8109d6bb>] process_one_work+0x17b/0x470
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff8109e48b>] worker_thread+0x11b/0x400
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff8109e370>] ? rescuer_thread+0x400/0x400
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff810a5baf>] kthread+0xcf/0xe0
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff810a5ae0>] ? kthread_create_on_node+0x140/0x140
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff81646e58>] ret_from_fork+0x58/0x90
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff810a5ae0>] ? kthread_create_on_node+0x140/0x140
            Feb 22 12:46:17 gio12 kernel: INFO: task ll_ost_io03_004:32100 blocked for more than 120 seconds.
            Feb 22 12:46:17 gio12 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
            Feb 22 12:46:17 gio12 kernel: ll_ost_io03_004 D ffff881053a24060     0 32100      2 0x00000080
            Feb 22 12:46:17 gio12 kernel:  ffff88031eb6f9f0 0000000000000046 ffff8808a62c2280 ffff88031eb6ffd8
            Feb 22 12:46:17 gio12 kernel:  ffff88031eb6ffd8 ffff88031eb6ffd8 ffff8808a62c2280 ffff881016e12800
            Feb 22 12:46:17 gio12 kernel:  ffff881016e12878 000000000d83b523 ffff880036446800 ffff881053a24060
            Feb 22 12:46:17 gio12 kernel: Call Trace:
            Feb 22 12:46:17 gio12 kernel:  [<ffffffff8163bf19>] schedule+0x29/0x70
            Feb 22 12:46:17 gio12 kernel:  [<ffffffffa018a085>] wait_transaction_locked+0x85/0xd0 [jbd2]
            Feb 22 12:46:18 gio12 kernel:  [<ffffffff810a6ba0>] ? wake_up_atomic_t+0x30/0x30
            Feb 22 12:46:18 gio12 kernel:  [<ffffffffa018a400>] start_this_handle+0x2b0/0x5d0 [jbd2]
            Feb 22 12:46:18 gio12 kernel:  [<ffffffffa0e713d4>] ? osd_declare_xattr_set+0xe4/0x2e0 [osd_ldiskfs]
            Feb 22 12:46:18 gio12 kernel:  [<ffffffff811c176a>] ? kmem_cache_alloc+0x1ba/0x1d0
            Feb 22 12:46:18 gio12 kernel:  [<ffffffffa018a933>] jbd2__journal_start+0xf3/0x1e0 [jbd2]
            Feb 22 12:46:18 gio12 kernel:  [<ffffffffa0e78534>] ? osd_trans_start+0x174/0x410 [osd_ldiskfs]
            Feb 22 12:46:18 gio12 kernel:  [<ffffffffa0dd6829>] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
            Feb 22 12:46:18 gio12 kernel:  [<ffffffffa0e78534>] osd_trans_start+0x174/0x410 [osd_ldiskfs]
            Feb 22 12:46:18 gio12 kernel:  [<ffffffffa0c80d7b>] ofd_trans_start+0x6b/0xe0 [ofd]
            Feb 22 12:46:18 gio12 kernel:  [<ffffffffa0c8428a>] ofd_object_punch+0x62a/0xc30 [ofd]
            Feb 22 12:46:18 gio12 kernel:  [<ffffffffa0c715e6>] ofd_punch_hdl+0x466/0x720 [ofd]
            Feb 22 12:46:18 gio12 kernel:  [<ffffffffa109bc9b>] tgt_request_handle+0x8fb/0x11f0 [ptlrpc]
            Feb 22 12:46:18 gio12 kernel:  [<ffffffffa103ea3b>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
            Feb 22 12:46:18 gio12 kernel:  [<ffffffffa0815cf8>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
            Feb 22 12:46:18 gio12 kernel:  [<ffffffffa103bb08>] ? ptlrpc_wait_event+0x98/0x330 [ptlrpc]
            Feb 22 12:46:18 gio12 kernel:  [<ffffffff810af0e8>] ? __wake_up_common+0x58/0x90
            Feb 22 12:46:18 gio12 kernel:  [<ffffffffa1042360>] ptlrpc_main+0xc00/0x1f60 [ptlrpc]
            Feb 22 12:46:18 gio12 kernel:  [<ffffffffa1041760>] ? ptlrpc_register_service+0x1070/0x1070 [ptlrpc]
            Feb 22 12:46:18 gio12 kernel:  [<ffffffff810a5baf>] kthread+0xcf/0xe0
            Feb 22 12:46:18 gio12 kernel:  [<ffffffff810a5ae0>] ? kthread_create_on_node+0x140/0x140
            Feb 22 12:46:18 gio12 kernel:  [<ffffffff81646e58>] ret_from_fork+0x58/0x90
            Feb 22 12:46:18 gio12 kernel:  [<ffffffff810a5ae0>] ? kthread_create_on_node+0x140/0x140
            Feb 22 12:47:09 gio12 kernel: LustreError: 137-5: dtemp-OST001c_UUID: not available for connect from 172.22.166.12@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
            Feb 22 12:48:18 gio12 kernel: INFO: task ll_ost_io01_002:14056 blocked for more than 120 seconds.
            Feb 22 12:48:18 gio12 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
            Feb 22 12:48:18 gio12 kernel: ll_ost_io01_002 D 0000000000000000     0 14056      2 0x00000080
            Feb 22 12:48:18 gio12 kernel:  ffff881020b9b898 0000000000000046 ffff88102151b980 ffff881020b9bfd8
            Feb 22 12:48:18 gio12 kernel:  ffff881020b9bfd8 ffff881020b9bfd8 ffff88102151b980 ffff88102151b980
            Feb 22 12:48:18 gio12 kernel:  ffff8807a92bda90 fffffffeffffffff ffff8807a92bda98 0000000000000000
            Feb 22 12:48:18 gio12 kernel: Call Trace:
            Feb 22 12:48:18 gio12 kernel:  [<ffffffff8163bf19>] schedule+0x29/0x70
            Feb 22 12:48:18 gio12 kernel:  [<ffffffff8163d8d5>] rwsem_down_read_failed+0xf5/0x170
            Feb 22 12:48:18 gio12 kernel:  [<ffffffff81301e54>] call_rwsem_down_read_failed+0x14/0x30
            Feb 22 12:48:18 gio12 kernel:  [<ffffffff8163b130>] ? down_read+0x20/0x30
            Feb 22 12:48:18 gio12 kernel:  [<ffffffffa0def84b>] ldiskfs_xattr_block_set+0x62b/0xa80 [ldiskfs]
            Feb 22 12:48:18 gio12 kernel:  [<ffffffffa0df09d4>] ldiskfs_expand_extra_isize_ea+0x404/0x810 [ldiskfs]
            Feb 22 12:48:18 gio12 kernel:  [<ffffffffa0df6d9f>] ldiskfs_mark_inode_dirty+0x1af/0x210 [ldiskfs]
            Feb 22 12:48:18 gio12 kernel:  [<ffffffffa0de0884>] ldiskfs_ext_truncate+0x24/0xe0 [ldiskfs]
            Feb 22 12:48:18 gio12 kernel:  [<ffffffffa0df83b7>] ldiskfs_truncate+0x3b7/0x3f0 [ldiskfs]
            Feb 22 12:48:18 gio12 kernel:  [<ffffffffa0e92e08>] osd_punch+0x138/0x5e0 [osd_ldiskfs]
            Feb 22 12:48:18 gio12 kernel:  [<ffffffffa0c84346>] ofd_object_punch+0x6e6/0xc30 [ofd]
            Feb 22 12:48:18 gio12 kernel:  [<ffffffffa0c715e6>] ofd_punch_hdl+0x466/0x720 [ofd]
            Feb 22 12:48:18 gio12 kernel:  [<ffffffffa109bc9b>] tgt_request_handle+0x8fb/0x11f0 [ptlrpc]
            Feb 22 12:48:18 gio12 kernel:  [<ffffffffa103ea3b>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
            Feb 22 12:48:18 gio12 kernel:  [<ffffffffa0815cf8>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
            Feb 22 12:48:18 gio12 kernel:  [<ffffffffa103bb08>] ? ptlrpc_wait_event+0x98/0x330 [ptlrpc]
            Feb 22 12:48:18 gio12 kernel:  [<ffffffff8163e05b>] ? _raw_spin_unlock_irqrestore+0x1b/0x40
            Feb 22 12:48:18 gio12 kernel:  [<ffffffffa1042360>] ptlrpc_main+0xc00/0x1f60 [ptlrpc]
            Feb 22 12:48:18 gio12 kernel:  [<ffffffffa1041760>] ? ptlrpc_register_service+0x1070/0x1070 [ptlrpc]
            Feb 22 12:48:18 gio12 kernel:  [<ffffffff810a5baf>] kthread+0xcf/0xe0
            Feb 22 12:48:18 gio12 kernel:  [<ffffffff810a5ae0>] ? kthread_create_on_node+0x140/0x140
            Feb 22 12:48:18 gio12 kernel:  [<ffffffff81646e58>] ret_from_fork+0x58/0x90
            Feb 22 12:48:18 gio12 kernel:  [<ffffffff810a5ae0>] ? kthread_create_on_node+0x140/0x140
            Feb 22 12:48:18 gio12 kernel: INFO: task jbd2/dm-11-8:15759 blocked for more than 120 seconds.
            Feb 22 12:48:18 gio12 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
            Feb 22 12:48:18 gio12 kernel: jbd2/dm-11-8    D ffff880036446800     0 15759      2 0x00000080
            Feb 22 12:48:18 gio12 kernel:  ffff880fdd21bc88 0000000000000046 ffff88104f747300 ffff880fdd21bfd8
            Feb 22 12:48:18 gio12 kernel:  ffff880fdd21bfd8 ffff880fdd21bfd8 ffff88104f747300 ffff880fdd21bda0
            Feb 22 12:48:18 gio12 kernel:  ffff881016e128c0 ffff88104f747300 ffff880fdd21bd88 ffff880036446800
            Feb 22 12:48:18 gio12 kernel: Call Trace:
            Feb 22 12:48:18 gio12 kernel:  [<ffffffff8163bf19>] schedule+0x29/0x70
            Feb 22 12:48:18 gio12 kernel:  [<ffffffffa018d138>] jbd2_journal_commit_transaction+0x248/0x19e0 [jbd2]
            Feb 22 12:48:18 gio12 kernel:  [<ffffffff810c15fc>] ? update_curr+0xcc/0x150
            Feb 22 12:48:18 gio12 kernel:  [<ffffffff810c1ac6>] ? dequeue_entity+0x106/0x520
            Feb 22 12:48:18 gio12 kernel:  [<ffffffff81013588>] ? __switch_to+0xf8/0x4b0
            Feb 22 12:48:18 gio12 kernel:  [<ffffffff810a6ba0>] ? wake_up_atomic_t+0x30/0x30
            Feb 22 12:48:18 gio12 kernel:  [<ffffffff8108d7be>] ? try_to_del_timer_sync+0x5e/0x90
            Feb 22 12:48:18 gio12 kernel:  [<ffffffffa0192e99>] kjournald2+0xc9/0x260 [jbd2]
            Feb 22 12:48:18 gio12 kernel:  [<ffffffff810a6ba0>] ? wake_up_atomic_t+0x30/0x30
            Feb 22 12:48:18 gio12 kernel:  [<ffffffffa0192dd0>] ? commit_timeout+0x10/0x10 [jbd2]
            Feb 22 12:48:18 gio12 kernel:  [<ffffffff810a5baf>] kthread+0xcf/0xe0
            Feb 22 12:48:18 gio12 kernel:  [<ffffffff810a5ae0>] ? kthread_create_on_node+0x140/0x140
            Feb 22 12:48:18 gio12 kernel:  [<ffffffff81646e58>] ret_from_fork+0x58/0x90
            Feb 22 12:48:18 gio12 kernel:  [<ffffffff810a5ae0>] ? kthread_create_on_node+0x140/0x140
            

            The MDS also logged watchdog stack traces for several mdt tasks. The MDT watchdog traces were triggered only once, whereas the OSS logged them repeatedly. Example MDT stack trace:

            Feb 22 03:52:03 gio0 kernel: INFO: task mdt01_003:9154 blocked for more than 120 seconds.
            Feb 22 03:52:03 gio0 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
            Feb 22 03:52:03 gio0 kernel: mdt01_003       D ffff880838cab1d8     0  9154      2 0x00000080
            Feb 22 03:52:03 gio0 kernel:  ffff8810371ab4c8 0000000000000046 ffff8810507eb980 ffff8810371abfd8
            Feb 22 03:52:03 gio0 kernel:  ffff8810371abfd8 ffff8810371abfd8 ffff8810507eb980 ffff8810507eb980
            Feb 22 03:52:03 gio0 kernel:  ffff880838cab1c8 ffff880838cab1d0 ffffffff00000000 ffff880838cab1d8
            Feb 22 03:52:03 gio0 kernel: Call Trace:
            Feb 22 03:52:03 gio0 kernel:  [<ffffffff8163bf19>] schedule+0x29/0x70
            Feb 22 03:52:03 gio0 kernel:  [<ffffffff8163d6d5>] rwsem_down_write_failed+0x115/0x220
            Feb 22 03:52:03 gio0 kernel:  [<ffffffff812134dc>] ? __find_get_block+0xbc/0x120
            Feb 22 03:52:03 gio0 kernel:  [<ffffffff81301e83>] call_rwsem_down_write_failed+0x13/0x20
            Feb 22 03:52:03 gio0 kernel:  [<ffffffff8163b16d>] ? down_write+0x2d/0x30
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa142bba7>] lod_alloc_qos.constprop.15+0x187/0x1400 [lod]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffff8121292d>] ? __brelse+0x3d/0x50
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa10ed65f>] ? ldiskfs_xattr_ibody_get+0xef/0x1a0 [ldiskfs]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa10ec6af>] ? ldiskfs_xattr_find_entry+0x9f/0x130 [ldiskfs]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa142e7fd>] lod_qos_prep_create+0x10cd/0x1fbc [lod]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa11aaac9>] ? osd_declare_qid+0x279/0x4b0 [osd_ldiskfs]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa11aaeae>] ? osd_declare_inode_qid+0x1ae/0x290 [osd_ldiskfs]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa1427fdd>] lod_declare_striped_object+0x1fd/0x810 [lod]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa1171e23>] ? osd_declare_object_create+0x113/0x2b0 [osd_ldiskfs]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa1429661>] lod_declare_object_create+0x231/0x4b0 [lod]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa14836af>] mdd_declare_object_create_internal+0xdf/0x2f0 [mdd]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa1478038>] mdd_declare_create+0x48/0xef0 [mdd]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa1479669>] mdd_create+0x789/0x12a0 [mdd]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa134ed52>] mdt_reint_open+0x1f92/0x2e00 [mdt]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa08aa1a9>] ? upcall_cache_get_entry+0x3e9/0x8e0 [libcfs]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffff812fc212>] ? strlcpy+0x42/0x60
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa1341e30>] mdt_reint_rec+0x80/0x210 [mdt]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa1322921>] mdt_reint_internal+0x5e1/0xb30 [mdt]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa1322fd2>] mdt_intent_reint+0x162/0x420 [mdt]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa0c95797>] ? lustre_msg_buf+0x17/0x60 [ptlrpc]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa13268b5>] mdt_intent_opc+0x215/0xa30 [mdt]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa0c99e30>] ? lustre_swab_ldlm_policy_data+0x30/0x30 [ptlrpc]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa132e478>] mdt_intent_policy+0x138/0x320 [mdt]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa0c491d7>] ldlm_lock_enqueue+0x357/0x9c0 [ptlrpc]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa0c6f7b2>] ldlm_handle_enqueue0+0x4f2/0x16f0 [ptlrpc]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa0c99eb0>] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa0cfcc32>] tgt_enqueue+0x62/0x210 [ptlrpc]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa0d01c9b>] tgt_request_handle+0x8fb/0x11f0 [ptlrpc]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa0ca4a3b>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa089ecf8>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa0ca1b08>] ? ptlrpc_wait_event+0x98/0x330 [ptlrpc]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffff810af0e8>] ? __wake_up_common+0x58/0x90
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa0ca8360>] ptlrpc_main+0xc00/0x1f60 [ptlrpc]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffffa0ca7760>] ? ptlrpc_register_service+0x1070/0x1070 [ptlrpc]
            Feb 22 03:52:03 gio0 kernel:  [<ffffffff810a5baf>] kthread+0xcf/0xe0
            Feb 22 03:52:03 gio0 kernel:  [<ffffffff810a5ae0>] ? kthread_create_on_node+0x140/0x140
            Feb 22 03:52:03 gio0 kernel:  [<ffffffff81646e58>] ret_from_fork+0x58/0x90
            Feb 22 03:52:03 gio0 kernel:  [<ffffffff810a5ae0>] ? kthread_create_on_node+0x140/0x140
            
            green Oleg Drokin added a comment -

            We can do it, but since we always patch ldiskfs ourselves anyway, it has no impact on our patchlessness.
            pjones Peter Jones added a comment -

            Landed for 2.10

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/25595/
            Subject: LU-9146 ldiskfs: backport a few patches to resolve deadlock
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 18120272b73a018a2590f1e5a895331b35df75e9

            Yang Sheng (yang.sheng@intel.com) uploaded a new patch: https://review.whamcloud.com/25595
            Subject: LU-9146 ldiskfs: backport a few patches to resolve deadlock
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 8b514e5fcd9ff38868af33623c80e86072187397

            People

              ys Yang Sheng
              ys Yang Sheng