Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-6904

linkea prepare in mdt_reint_rename cause deadlock in 24 hours failover

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.8.0
    • Lustre 2.8.0
    • None
    • 3
    • 9223372036854775807

    Description

      During 24 hours failover test. MDT stuck at

      LNet: Service thread pid 3156 was inactive for 200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
      Pid: 3156, comm: mdt01_005
      
      Call Trace:
       [<ffffffff81083e1c>] ? lock_timer_base+0x3c/0x70
       [<ffffffff8152a512>] schedule_timeout+0x192/0x2e0
       [<ffffffff81083f30>] ? process_timeout+0x0/0x10
       [<ffffffffa07d9fa9>] ptlrpc_set_wait+0x319/0xa30 [ptlrpc]
       [<ffffffffa07cf520>] ? ptlrpc_interrupted_set+0x0/0x110 [ptlrpc]
       [<ffffffff81061d00>] ? default_wake_function+0x0/0x20
       [<ffffffffa07e61b5>] ? lustre_msg_set_jobid+0xf5/0x130 [ptlrpc]
       [<ffffffffa07da741>] ptlrpc_queue_wait+0x81/0x220 [ptlrpc]
       [<ffffffffa113f271>] osp_remote_sync+0x121/0x190 [osp]
       [<ffffffffa1125b47>] osp_xattr_get+0x737/0x10e0 [osp]
       [<ffffffffa0d2c378>] ? __ldiskfs_journal_stop+0x68/0xa0 [ldiskfs]
       [<ffffffffa104ab25>] lod_xattr_get+0x185/0x760 [lod]
       [<ffffffffa10b8825>] mdd_links_read+0xf5/0x2d0 [mdd]
       [<ffffffffa10bbc4f>] mdd_linkea_prepare+0x39f/0x4d0 [mdd]
       [<ffffffffa10bbea1>] mdd_links_rename+0x121/0x540 [mdd]
       [<ffffffff8152b3a6>] ? down_write+0x16/0x40
       [<ffffffffa10c28ba>] mdd_rename+0x13aa/0x1d50 [mdd]
       [<ffffffffa0f7a105>] mdt_reint_rename_internal+0x1305/0x1a50 [mdt]
       [<ffffffffa0798246>] ? ldlm_lock_enqueue+0x2c6/0x8e0 [ptlrpc]
       [<ffffffffa0f7aa4d>] mdt_reint_rename_or_migrate+0x1fd/0x7e0 [mdt]
       [<ffffffffa07b7160>] ? ldlm_blocking_ast+0x0/0x180 [ptlrpc]
       [<ffffffffa07b8ad0>] ? ldlm_completion_ast+0x0/0x9b0 [ptlrpc]
       [<ffffffffa080cea2>] ? __req_capsule_get+0x162/0x6e0 [ptlrpc]
       [<ffffffff8128b45a>] ? strlcpy+0x4a/0x60
       [<ffffffffa0f7b063>] mdt_reint_rename+0x13/0x20 [mdt]
       [<ffffffffa0f7386d>] mdt_reint_rec+0x5d/0x200 [mdt]
       [<ffffffffa0f5f78b>] mdt_reint_internal+0x62b/0xb80 [mdt]
       [<ffffffffa0f6017b>] mdt_reint+0x6b/0x120 [mdt]
       [<ffffffffa084ef42>] tgt_request_handle+0xa42/0x1230 [ptlrpc]
       [<ffffffffa07f73a1>] ptlrpc_main+0xe41/0x1920 [ptlrpc]
       [<ffffffff810096f0>] ? __switch_to+0xd0/0x320
       [<ffffffff810623a9>] ? find_busiest_queue+0x69/0x150
       [<ffffffff815296ee>] ? thread_return+0x4e/0x770
       [<ffffffffa07f6560>] ? ptlrpc_main+0x0/0x1920 [ptlrpc]
       [<ffffffff8109abf6>] kthread+0x96/0xa0
       [<ffffffff8100c20a>] child_rip+0xa/0x20
       [<ffffffff8109ab60>] ? kthread+0x0/0xa0
       [<ffffffff8100c200>] ? child_rip+0x0/0x20
      

      The thread stuck here because the remote MDT restarts. But it also holds the journal handle, which cause a lot journal thread stuck

      INFO: task jbd2/sdf1-8:2813 blocked for more than 120 seconds.
            Not tainted 2.6.32-431.29.2.el6_lustre.g2382eb0.x86_64 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      jbd2/sdf1-8   D 0000000000000001     0  2813      2 0x00000080
       ffff880fe9511d20 0000000000000046 0000000000000000 0000000000016880
       ffff880fe9511cd0 ffffffff810534d7 0000000000016880 ffff88102e42c080
       ffff88102e42c638 ffff880fe9511fd8 000000000000fbc8 ffff88102e42c638
      Call Trace:
       [<ffffffff810534d7>] ? walk_tg_tree_from+0x67/0xc0
       [<ffffffffa0cd580f>] jbd2_journal_commit_transaction+0x19f/0x1500 [jbd2]
       [<ffffffff810096f0>] ? __switch_to+0xd0/0x320
       [<ffffffff81083e1c>] ? lock_timer_base+0x3c/0x70
       [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40
       [<ffffffffa0cdba58>] kjournald2+0xb8/0x220 [jbd2]
       [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40
       [<ffffffffa0cdb9a0>] ? kjournald2+0x0/0x220 [jbd2]
       [<ffffffff8109abf6>] kthread+0x96/0xa0
       [<ffffffff8100c20a>] child_rip+0xa/0x20
       [<ffffffff8109ab60>] ? kthread+0x0/0xa0
       [<ffffffff8100c200>] ? child_rip+0x0/0x20
      
      INFO: task tdtd-0:2919 blocked for more than 120 seconds.
            Not tainted 2.6.32-431.29.2.el6_lustre.g2382eb0.x86_64 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      tdtd-0        D 0000000000000000     0  2919      2 0x00000080
       ffff880ffbe2dc20 0000000000000046 0000004cffffffff ffffffffa046b27b
       0000000055b31da1 ffff88103152a000 ffff880ffbe2dc40 ffffffffa0c816cf
       ffff880ffbe1b098 ffff880ffbe2dfd8 000000000000fbc8 ffff880ffbe1b098
      Call Trace:
       [<ffffffffa046b27b>] ? cfs_set_ptldebug_header+0x2b/0xc0 [libcfs]
       [<ffffffffa0c816cf>] ? qsd_op_begin+0x5f/0xb70 [lquota]
       [<ffffffff8109b2ce>] ? prepare_to_wait+0x4e/0x80
       [<ffffffffa0cd408a>] start_this_handle+0x25a/0x480 [jbd2]
       [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40
       [<ffffffffa0cd4495>] jbd2_journal_start+0xb5/0x100 [jbd2]
       [<ffffffffa0d2c406>] ldiskfs_journal_start_sb+0x56/0xe0 [ldiskfs]
       [<ffffffffa0d7dbef>] osd_trans_start+0x1df/0x410 [osd_ldiskfs]
       [<ffffffffa0860d6e>] distribute_txn_commit_thread+0xa1e/0x1700 [ptlrpc]
       [<ffffffff81061d00>] ? default_wake_function+0x0/0x20
       [<ffffffffa0860350>] ? distribute_txn_commit_thread+0x0/0x1700 [ptlrpc]
       [<ffffffff8109abf6>] kthread+0x96/0xa0
       [<ffffffff8100c20a>] child_rip+0xa/0x20
       [<ffffffff8109ab60>] ? kthread+0x0/0xa0
       [<ffffffff8100c200>] ? child_rip+0x0/0x20
      INFO: task mdt_out03_003:3164 blocked for more than 120 seconds.
            Not tainted 2.6.32-431.29.2.el6_lustre.g2382eb0.x86_64 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      mdt_out03_003 D 000000000000000f     0  3164      2 0x00000080
       ffff88081a6ada60 0000000000000046 0000000000000000 ffff881029e64000
       ffff88081a6ada20 ffffffffa0db3e79 ffff88081a6ada90 ffffffff815296ee
       ffff880831563058 ffff88081a6adfd8 000000000000fbc8 ffff880831563058
      Call Trace:
       [<ffffffffa0db3e79>] ? osd_declare_qid+0x289/0x480 [osd_ldiskfs]
       [<ffffffff815296ee>] ? thread_return+0x4e/0x770
       [<ffffffffa0cd408a>] start_this_handle+0x25a/0x480 [jbd2]
       [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40
       [<ffffffffa0cd4495>] jbd2_journal_start+0xb5/0x100 [jbd2]
       [<ffffffffa0d2c406>] ldiskfs_journal_start_sb+0x56/0xe0 [ldiskfs]
       [<ffffffffa0d7dbef>] osd_trans_start+0x1df/0x410 [osd_ldiskfs]
       [<ffffffffa0851ffc>] out_tx_end+0x9c/0x5d0 [ptlrpc]
       [<ffffffffa0856bab>] out_handle+0xd9b/0x17e0 [ptlrpc]
       [<ffffffffa079ffb0>] ? target_bulk_timeout+0x0/0xc0 [ptlrpc]
       [<ffffffffa084ef42>] tgt_request_handle+0xa42/0x1230 [ptlrpc]
       [<ffffffffa07f73a1>] ptlrpc_main+0xe41/0x1920 [ptlrpc]
       [<ffffffff81069290>] ? pick_next_task_fair+0xd0/0x130
       [<ffffffff81529466>] ? schedule+0x176/0x3b0
       [<ffffffffa07f6560>] ? ptlrpc_main+0x0/0x1920 [ptlrpc]
       [<ffffffff8109abf6>] kthread+0x96/0xa0
       [<ffffffff8100c20a>] child_rip+0xa/0x20
       [<ffffffff8109ab60>] ? kthread+0x0/0xa0
       [<ffffffff8100c200>] ? child_rip+0x0/0x20
      

      So in rename, it should prepare the linkEA before journal start to avoid holding transaction, while sending RPC.

      Attachments

        Issue Links

          Activity

            [LU-6904] linkea prepare in mdt_reint_rename cause deadlock in 24 hours failover

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/15724/
            Subject: LU-6904 mdd: prepare linkea before declare
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: af46754e1e9c3e66928e13b2301890cfa94ac059

            gerrit Gerrit Updater added a comment - Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/15724/ Subject: LU-6904 mdd: prepare linkea before declare Project: fs/lustre-release Branch: master Current Patch Set: Commit: af46754e1e9c3e66928e13b2301890cfa94ac059

            wangdi (di.wang@intel.com) uploaded a new patch: http://review.whamcloud.com/15724
            Subject: LU-6904 mdd: prepare linkea before declare
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: e62e161dfe66c10160f432d8a6bc75ff91d08003

            gerrit Gerrit Updater added a comment - wangdi (di.wang@intel.com) uploaded a new patch: http://review.whamcloud.com/15724 Subject: LU-6904 mdd: prepare linkea before declare Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: e62e161dfe66c10160f432d8a6bc75ff91d08003

            People

              di.wang Di Wang (Inactive)
              di.wang Di Wang (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: