Details
-
Bug
-
Resolution: Fixed
-
Major
-
Lustre 2.8.0
-
None
-
3
-
9223372036854775807
Description
During 24 hours failover test. MDT stuck at
LNet: Service thread pid 3156 was inactive for 200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: Pid: 3156, comm: mdt01_005 Call Trace: [<ffffffff81083e1c>] ? lock_timer_base+0x3c/0x70 [<ffffffff8152a512>] schedule_timeout+0x192/0x2e0 [<ffffffff81083f30>] ? process_timeout+0x0/0x10 [<ffffffffa07d9fa9>] ptlrpc_set_wait+0x319/0xa30 [ptlrpc] [<ffffffffa07cf520>] ? ptlrpc_interrupted_set+0x0/0x110 [ptlrpc] [<ffffffff81061d00>] ? default_wake_function+0x0/0x20 [<ffffffffa07e61b5>] ? lustre_msg_set_jobid+0xf5/0x130 [ptlrpc] [<ffffffffa07da741>] ptlrpc_queue_wait+0x81/0x220 [ptlrpc] [<ffffffffa113f271>] osp_remote_sync+0x121/0x190 [osp] [<ffffffffa1125b47>] osp_xattr_get+0x737/0x10e0 [osp] [<ffffffffa0d2c378>] ? __ldiskfs_journal_stop+0x68/0xa0 [ldiskfs] [<ffffffffa104ab25>] lod_xattr_get+0x185/0x760 [lod] [<ffffffffa10b8825>] mdd_links_read+0xf5/0x2d0 [mdd] [<ffffffffa10bbc4f>] mdd_linkea_prepare+0x39f/0x4d0 [mdd] [<ffffffffa10bbea1>] mdd_links_rename+0x121/0x540 [mdd] [<ffffffff8152b3a6>] ? down_write+0x16/0x40 [<ffffffffa10c28ba>] mdd_rename+0x13aa/0x1d50 [mdd] [<ffffffffa0f7a105>] mdt_reint_rename_internal+0x1305/0x1a50 [mdt] [<ffffffffa0798246>] ? ldlm_lock_enqueue+0x2c6/0x8e0 [ptlrpc] [<ffffffffa0f7aa4d>] mdt_reint_rename_or_migrate+0x1fd/0x7e0 [mdt] [<ffffffffa07b7160>] ? ldlm_blocking_ast+0x0/0x180 [ptlrpc] [<ffffffffa07b8ad0>] ? ldlm_completion_ast+0x0/0x9b0 [ptlrpc] [<ffffffffa080cea2>] ? __req_capsule_get+0x162/0x6e0 [ptlrpc] [<ffffffff8128b45a>] ? strlcpy+0x4a/0x60 [<ffffffffa0f7b063>] mdt_reint_rename+0x13/0x20 [mdt] [<ffffffffa0f7386d>] mdt_reint_rec+0x5d/0x200 [mdt] [<ffffffffa0f5f78b>] mdt_reint_internal+0x62b/0xb80 [mdt] [<ffffffffa0f6017b>] mdt_reint+0x6b/0x120 [mdt] [<ffffffffa084ef42>] tgt_request_handle+0xa42/0x1230 [ptlrpc] [<ffffffffa07f73a1>] ptlrpc_main+0xe41/0x1920 [ptlrpc] [<ffffffff810096f0>] ? __switch_to+0xd0/0x320 [<ffffffff810623a9>] ? find_busiest_queue+0x69/0x150 [<ffffffff815296ee>] ? thread_return+0x4e/0x770 [<ffffffffa07f6560>] ? ptlrpc_main+0x0/0x1920 [ptlrpc] [<ffffffff8109abf6>] kthread+0x96/0xa0 [<ffffffff8100c20a>] child_rip+0xa/0x20 [<ffffffff8109ab60>] ? kthread+0x0/0xa0 [<ffffffff8100c200>] ? child_rip+0x0/0x20
The thread stuck here because the remote MDT restarts. But it also holds the journal handle, which cause a lot journal thread stuck
INFO: task jbd2/sdf1-8:2813 blocked for more than 120 seconds. Not tainted 2.6.32-431.29.2.el6_lustre.g2382eb0.x86_64 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. jbd2/sdf1-8 D 0000000000000001 0 2813 2 0x00000080 ffff880fe9511d20 0000000000000046 0000000000000000 0000000000016880 ffff880fe9511cd0 ffffffff810534d7 0000000000016880 ffff88102e42c080 ffff88102e42c638 ffff880fe9511fd8 000000000000fbc8 ffff88102e42c638 Call Trace: [<ffffffff810534d7>] ? walk_tg_tree_from+0x67/0xc0 [<ffffffffa0cd580f>] jbd2_journal_commit_transaction+0x19f/0x1500 [jbd2] [<ffffffff810096f0>] ? __switch_to+0xd0/0x320 [<ffffffff81083e1c>] ? lock_timer_base+0x3c/0x70 [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40 [<ffffffffa0cdba58>] kjournald2+0xb8/0x220 [jbd2] [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40 [<ffffffffa0cdb9a0>] ? kjournald2+0x0/0x220 [jbd2] [<ffffffff8109abf6>] kthread+0x96/0xa0 [<ffffffff8100c20a>] child_rip+0xa/0x20 [<ffffffff8109ab60>] ? kthread+0x0/0xa0 [<ffffffff8100c200>] ? child_rip+0x0/0x20 INFO: task tdtd-0:2919 blocked for more than 120 seconds. Not tainted 2.6.32-431.29.2.el6_lustre.g2382eb0.x86_64 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. tdtd-0 D 0000000000000000 0 2919 2 0x00000080 ffff880ffbe2dc20 0000000000000046 0000004cffffffff ffffffffa046b27b 0000000055b31da1 ffff88103152a000 ffff880ffbe2dc40 ffffffffa0c816cf ffff880ffbe1b098 ffff880ffbe2dfd8 000000000000fbc8 ffff880ffbe1b098 Call Trace: [<ffffffffa046b27b>] ? cfs_set_ptldebug_header+0x2b/0xc0 [libcfs] [<ffffffffa0c816cf>] ? qsd_op_begin+0x5f/0xb70 [lquota] [<ffffffff8109b2ce>] ? prepare_to_wait+0x4e/0x80 [<ffffffffa0cd408a>] start_this_handle+0x25a/0x480 [jbd2] [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40 [<ffffffffa0cd4495>] jbd2_journal_start+0xb5/0x100 [jbd2] [<ffffffffa0d2c406>] ldiskfs_journal_start_sb+0x56/0xe0 [ldiskfs] [<ffffffffa0d7dbef>] osd_trans_start+0x1df/0x410 [osd_ldiskfs] [<ffffffffa0860d6e>] distribute_txn_commit_thread+0xa1e/0x1700 [ptlrpc] [<ffffffff81061d00>] ? default_wake_function+0x0/0x20 [<ffffffffa0860350>] ? distribute_txn_commit_thread+0x0/0x1700 [ptlrpc] [<ffffffff8109abf6>] kthread+0x96/0xa0 [<ffffffff8100c20a>] child_rip+0xa/0x20 [<ffffffff8109ab60>] ? kthread+0x0/0xa0 [<ffffffff8100c200>] ? child_rip+0x0/0x20 INFO: task mdt_out03_003:3164 blocked for more than 120 seconds. Not tainted 2.6.32-431.29.2.el6_lustre.g2382eb0.x86_64 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. mdt_out03_003 D 000000000000000f 0 3164 2 0x00000080 ffff88081a6ada60 0000000000000046 0000000000000000 ffff881029e64000 ffff88081a6ada20 ffffffffa0db3e79 ffff88081a6ada90 ffffffff815296ee ffff880831563058 ffff88081a6adfd8 000000000000fbc8 ffff880831563058 Call Trace: [<ffffffffa0db3e79>] ? osd_declare_qid+0x289/0x480 [osd_ldiskfs] [<ffffffff815296ee>] ? thread_return+0x4e/0x770 [<ffffffffa0cd408a>] start_this_handle+0x25a/0x480 [jbd2] [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40 [<ffffffffa0cd4495>] jbd2_journal_start+0xb5/0x100 [jbd2] [<ffffffffa0d2c406>] ldiskfs_journal_start_sb+0x56/0xe0 [ldiskfs] [<ffffffffa0d7dbef>] osd_trans_start+0x1df/0x410 [osd_ldiskfs] [<ffffffffa0851ffc>] out_tx_end+0x9c/0x5d0 [ptlrpc] [<ffffffffa0856bab>] out_handle+0xd9b/0x17e0 [ptlrpc] [<ffffffffa079ffb0>] ? target_bulk_timeout+0x0/0xc0 [ptlrpc] [<ffffffffa084ef42>] tgt_request_handle+0xa42/0x1230 [ptlrpc] [<ffffffffa07f73a1>] ptlrpc_main+0xe41/0x1920 [ptlrpc] [<ffffffff81069290>] ? pick_next_task_fair+0xd0/0x130 [<ffffffff81529466>] ? schedule+0x176/0x3b0 [<ffffffffa07f6560>] ? ptlrpc_main+0x0/0x1920 [ptlrpc] [<ffffffff8109abf6>] kthread+0x96/0xa0 [<ffffffff8100c20a>] child_rip+0xa/0x20 [<ffffffff8109ab60>] ? kthread+0x0/0xa0 [<ffffffff8100c200>] ? child_rip+0x0/0x20
So in rename, it should prepare the linkEA before journal start to avoid holding transaction, while sending RPC.