[LU-6904] linkea prepare in mdt_reint_rename cause deadlock in 24 hours failover Created: 26/Jul/15  Updated: 26/Aug/15  Resolved: 03/Aug/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: Lustre 2.8.0

Type: Bug Priority: Major
Reporter: Di Wang Assignee: Di Wang
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-6831 The ticket for tracking all DNE2 bugs Reopened
is related to LU-6883 replay-single test 73a hang and timeo... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

During 24 hours failover test. MDT stuck at

LNet: Service thread pid 3156 was inactive for 200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Pid: 3156, comm: mdt01_005

Call Trace:
 [<ffffffff81083e1c>] ? lock_timer_base+0x3c/0x70
 [<ffffffff8152a512>] schedule_timeout+0x192/0x2e0
 [<ffffffff81083f30>] ? process_timeout+0x0/0x10
 [<ffffffffa07d9fa9>] ptlrpc_set_wait+0x319/0xa30 [ptlrpc]
 [<ffffffffa07cf520>] ? ptlrpc_interrupted_set+0x0/0x110 [ptlrpc]
 [<ffffffff81061d00>] ? default_wake_function+0x0/0x20
 [<ffffffffa07e61b5>] ? lustre_msg_set_jobid+0xf5/0x130 [ptlrpc]
 [<ffffffffa07da741>] ptlrpc_queue_wait+0x81/0x220 [ptlrpc]
 [<ffffffffa113f271>] osp_remote_sync+0x121/0x190 [osp]
 [<ffffffffa1125b47>] osp_xattr_get+0x737/0x10e0 [osp]
 [<ffffffffa0d2c378>] ? __ldiskfs_journal_stop+0x68/0xa0 [ldiskfs]
 [<ffffffffa104ab25>] lod_xattr_get+0x185/0x760 [lod]
 [<ffffffffa10b8825>] mdd_links_read+0xf5/0x2d0 [mdd]
 [<ffffffffa10bbc4f>] mdd_linkea_prepare+0x39f/0x4d0 [mdd]
 [<ffffffffa10bbea1>] mdd_links_rename+0x121/0x540 [mdd]
 [<ffffffff8152b3a6>] ? down_write+0x16/0x40
 [<ffffffffa10c28ba>] mdd_rename+0x13aa/0x1d50 [mdd]
 [<ffffffffa0f7a105>] mdt_reint_rename_internal+0x1305/0x1a50 [mdt]
 [<ffffffffa0798246>] ? ldlm_lock_enqueue+0x2c6/0x8e0 [ptlrpc]
 [<ffffffffa0f7aa4d>] mdt_reint_rename_or_migrate+0x1fd/0x7e0 [mdt]
 [<ffffffffa07b7160>] ? ldlm_blocking_ast+0x0/0x180 [ptlrpc]
 [<ffffffffa07b8ad0>] ? ldlm_completion_ast+0x0/0x9b0 [ptlrpc]
 [<ffffffffa080cea2>] ? __req_capsule_get+0x162/0x6e0 [ptlrpc]
 [<ffffffff8128b45a>] ? strlcpy+0x4a/0x60
 [<ffffffffa0f7b063>] mdt_reint_rename+0x13/0x20 [mdt]
 [<ffffffffa0f7386d>] mdt_reint_rec+0x5d/0x200 [mdt]
 [<ffffffffa0f5f78b>] mdt_reint_internal+0x62b/0xb80 [mdt]
 [<ffffffffa0f6017b>] mdt_reint+0x6b/0x120 [mdt]
 [<ffffffffa084ef42>] tgt_request_handle+0xa42/0x1230 [ptlrpc]
 [<ffffffffa07f73a1>] ptlrpc_main+0xe41/0x1920 [ptlrpc]
 [<ffffffff810096f0>] ? __switch_to+0xd0/0x320
 [<ffffffff810623a9>] ? find_busiest_queue+0x69/0x150
 [<ffffffff815296ee>] ? thread_return+0x4e/0x770
 [<ffffffffa07f6560>] ? ptlrpc_main+0x0/0x1920 [ptlrpc]
 [<ffffffff8109abf6>] kthread+0x96/0xa0
 [<ffffffff8100c20a>] child_rip+0xa/0x20
 [<ffffffff8109ab60>] ? kthread+0x0/0xa0
 [<ffffffff8100c200>] ? child_rip+0x0/0x20

The thread stuck here because the remote MDT restarts. But it also holds the journal handle, which cause a lot journal thread stuck

INFO: task jbd2/sdf1-8:2813 blocked for more than 120 seconds.
      Not tainted 2.6.32-431.29.2.el6_lustre.g2382eb0.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
jbd2/sdf1-8   D 0000000000000001     0  2813      2 0x00000080
 ffff880fe9511d20 0000000000000046 0000000000000000 0000000000016880
 ffff880fe9511cd0 ffffffff810534d7 0000000000016880 ffff88102e42c080
 ffff88102e42c638 ffff880fe9511fd8 000000000000fbc8 ffff88102e42c638
Call Trace:
 [<ffffffff810534d7>] ? walk_tg_tree_from+0x67/0xc0
 [<ffffffffa0cd580f>] jbd2_journal_commit_transaction+0x19f/0x1500 [jbd2]
 [<ffffffff810096f0>] ? __switch_to+0xd0/0x320
 [<ffffffff81083e1c>] ? lock_timer_base+0x3c/0x70
 [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa0cdba58>] kjournald2+0xb8/0x220 [jbd2]
 [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa0cdb9a0>] ? kjournald2+0x0/0x220 [jbd2]
 [<ffffffff8109abf6>] kthread+0x96/0xa0
 [<ffffffff8100c20a>] child_rip+0xa/0x20
 [<ffffffff8109ab60>] ? kthread+0x0/0xa0
 [<ffffffff8100c200>] ? child_rip+0x0/0x20

INFO: task tdtd-0:2919 blocked for more than 120 seconds.
      Not tainted 2.6.32-431.29.2.el6_lustre.g2382eb0.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
tdtd-0        D 0000000000000000     0  2919      2 0x00000080
 ffff880ffbe2dc20 0000000000000046 0000004cffffffff ffffffffa046b27b
 0000000055b31da1 ffff88103152a000 ffff880ffbe2dc40 ffffffffa0c816cf
 ffff880ffbe1b098 ffff880ffbe2dfd8 000000000000fbc8 ffff880ffbe1b098
Call Trace:
 [<ffffffffa046b27b>] ? cfs_set_ptldebug_header+0x2b/0xc0 [libcfs]
 [<ffffffffa0c816cf>] ? qsd_op_begin+0x5f/0xb70 [lquota]
 [<ffffffff8109b2ce>] ? prepare_to_wait+0x4e/0x80
 [<ffffffffa0cd408a>] start_this_handle+0x25a/0x480 [jbd2]
 [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa0cd4495>] jbd2_journal_start+0xb5/0x100 [jbd2]
 [<ffffffffa0d2c406>] ldiskfs_journal_start_sb+0x56/0xe0 [ldiskfs]
 [<ffffffffa0d7dbef>] osd_trans_start+0x1df/0x410 [osd_ldiskfs]
 [<ffffffffa0860d6e>] distribute_txn_commit_thread+0xa1e/0x1700 [ptlrpc]
 [<ffffffff81061d00>] ? default_wake_function+0x0/0x20
 [<ffffffffa0860350>] ? distribute_txn_commit_thread+0x0/0x1700 [ptlrpc]
 [<ffffffff8109abf6>] kthread+0x96/0xa0
 [<ffffffff8100c20a>] child_rip+0xa/0x20
 [<ffffffff8109ab60>] ? kthread+0x0/0xa0
 [<ffffffff8100c200>] ? child_rip+0x0/0x20
INFO: task mdt_out03_003:3164 blocked for more than 120 seconds.
      Not tainted 2.6.32-431.29.2.el6_lustre.g2382eb0.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mdt_out03_003 D 000000000000000f     0  3164      2 0x00000080
 ffff88081a6ada60 0000000000000046 0000000000000000 ffff881029e64000
 ffff88081a6ada20 ffffffffa0db3e79 ffff88081a6ada90 ffffffff815296ee
 ffff880831563058 ffff88081a6adfd8 000000000000fbc8 ffff880831563058
Call Trace:
 [<ffffffffa0db3e79>] ? osd_declare_qid+0x289/0x480 [osd_ldiskfs]
 [<ffffffff815296ee>] ? thread_return+0x4e/0x770
 [<ffffffffa0cd408a>] start_this_handle+0x25a/0x480 [jbd2]
 [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa0cd4495>] jbd2_journal_start+0xb5/0x100 [jbd2]
 [<ffffffffa0d2c406>] ldiskfs_journal_start_sb+0x56/0xe0 [ldiskfs]
 [<ffffffffa0d7dbef>] osd_trans_start+0x1df/0x410 [osd_ldiskfs]
 [<ffffffffa0851ffc>] out_tx_end+0x9c/0x5d0 [ptlrpc]
 [<ffffffffa0856bab>] out_handle+0xd9b/0x17e0 [ptlrpc]
 [<ffffffffa079ffb0>] ? target_bulk_timeout+0x0/0xc0 [ptlrpc]
 [<ffffffffa084ef42>] tgt_request_handle+0xa42/0x1230 [ptlrpc]
 [<ffffffffa07f73a1>] ptlrpc_main+0xe41/0x1920 [ptlrpc]
 [<ffffffff81069290>] ? pick_next_task_fair+0xd0/0x130
 [<ffffffff81529466>] ? schedule+0x176/0x3b0
 [<ffffffffa07f6560>] ? ptlrpc_main+0x0/0x1920 [ptlrpc]
 [<ffffffff8109abf6>] kthread+0x96/0xa0
 [<ffffffff8100c20a>] child_rip+0xa/0x20
 [<ffffffff8109ab60>] ? kthread+0x0/0xa0
 [<ffffffff8100c200>] ? child_rip+0x0/0x20

So in rename, it should prepare the linkEA before journal start to avoid holding transaction, while sending RPC.



 Comments   
Comment by Gerrit Updater [ 26/Jul/15 ]

wangdi (di.wang@intel.com) uploaded a new patch: http://review.whamcloud.com/15724
Subject: LU-6904 mdd: prepare linkea before declare
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: e62e161dfe66c10160f432d8a6bc75ff91d08003

Comment by Gerrit Updater [ 03/Aug/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/15724/
Subject: LU-6904 mdd: prepare linkea before declare
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: af46754e1e9c3e66928e13b2301890cfa94ac059

Generated at Sat Feb 10 02:04:17 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.