Details
- Bug
- Resolution: Fixed
- Minor
- Lustre 2.7.0
- None
- OpenSFS cluster with two MDSs with one MDT each, three OSSs with two OSTs each and three clients running lustre-master build #2771
- 3
- 16842
Description
I started racer on client c13 and, soon after, client c11 crashed. racer paused for a long time, then resumed running, and now appears to be hung again.
On client c11 console, I saw:
Message from syslogd@c11 at Dec 17 09:45:47 ...
 kernel:LustreError: 26538:0:(osc_object.c:212:osc_object_ast_clear()) ASSERTION( lock->l_granted_mode == lock->l_req_mode ) failed:
Message from syslogd@c11 at Dec 17 09:45:47 ...
 kernel:LustreError: 26538:0:(osc_object.c:212:osc_object_ast_clear()) LBUG
This could be related to one of the other racer tickets, but I couldn't find any other racer ticket that mentions an LBUG in osc_object_ast_clear().
From the crash dmesg on client c11, there are many “fid is insane” errors and a call trace:
…
<3>LustreError: 24520:0:(file.c:3036:ll_migrate()) scratch: migrate 3 , but fid [0x0:0x0:0x0] is insane
<3>LustreError: 24520:0:(file.c:3036:ll_migrate()) Skipped 33 previous similar messages
<3>LustreError: 29819:0:(file.c:3036:ll_migrate()) scratch: migrate 7 , but fid [0x0:0x0:0x0] is insane
<3>LustreError: 29819:0:(file.c:3036:ll_migrate()) Skipped 7 previous similar messages
<0>LustreError: 26538:0:(osc_object.c:212:osc_object_ast_clear()) ASSERTION( lock->l_granted_mode == lock->l_req_mode ) failed:
<0>LustreError: 26538:0:(osc_object.c:212:osc_object_ast_clear()) LBUG
<4>Pid: 26538, comm: lfs
<4>
<4>Call Trace:
<4> [<ffffffffa0fb6895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
<4> [<ffffffffa0fb6e97>] lbug_with_loc+0x47/0xb0 [libcfs]
<4> [<ffffffffa16f20ca>] osc_object_ast_clear+0x12a/0x130 [osc]
<4> [<ffffffffa16f1fa0>] ? osc_object_ast_clear+0x0/0x130 [osc]
<4> [<ffffffffa12bec8f>] ldlm_resource_foreach+0x29f/0x300 [ptlrpc]
<4> [<ffffffffa16f1fa0>] ? osc_object_ast_clear+0x0/0x130 [osc]
<4> [<ffffffffa12bed6a>] ldlm_resource_iterate+0x7a/0x1a0 [ptlrpc]
<4> [<ffffffffa16f1e66>] osc_object_prune+0xd6/0x210 [osc]
<4> [<ffffffff81058bd3>] ? __wake_up+0x53/0x70
<4> [<ffffffffa1118ee5>] cl_object_prune+0x55/0x100 [obdclass]
<4> [<ffffffffa155b32c>] lov_delete_raid0+0xcc/0x3e0 [lov]
<4> [<ffffffff8128ceb6>] ? vsnprintf+0x336/0x5e0
<4> [<ffffffffa155a819>] lov_object_delete+0x69/0x180 [lov]
<4> [<ffffffffa1110141>] lu_object_free+0x81/0x1a0 [obdclass]
<4> [<ffffffffa0fcbdb4>] ? cfs_hash_dual_bd_unlock+0x34/0x60 [libcfs]
<4> [<ffffffffa0fcc4a2>] ? cfs_hash_bd_from_key+0x42/0xd0 [libcfs]
<4> [<ffffffffa11108bd>] lu_object_put+0xad/0x330 [obdclass]
<4> [<ffffffffa1615da2>] ? cl_inode_fini+0x52/0x270 [lustre]
<4> [<ffffffffa11195be>] cl_object_put+0xe/0x10 [obdclass]
<4> [<ffffffffa1615dda>] cl_inode_fini+0x8a/0x270 [lustre]
<4> [<ffffffffa151545e>] ? mdc_null_inode+0x7e/0x1c0 [mdc]
<4> [<ffffffffa15d94fd>] ll_clear_inode+0x25d/0x980 [lustre]
<4> [<ffffffffa15d7ee0>] ? ll_delete_inode+0x0/0x210 [lustre]
<4> [<ffffffff811a654c>] clear_inode+0xac/0x140
<4> [<ffffffffa15d7f44>] ll_delete_inode+0x64/0x210 [lustre]
<4> [<ffffffff811a6c4e>] generic_delete_inode+0xde/0x1d0
<4> [<ffffffff811a6da5>] generic_drop_inode+0x65/0x80
<4> [<ffffffff811a5bf2>] iput+0x62/0x70
<4> [<ffffffffa15c27b7>] ll_migrate+0x437/0x950 [lustre]
<4> [<ffffffffa15ba35e>] ll_dir_ioctl+0x5a6e/0x64d0 [lustre]
<4> [<ffffffff8119f78d>] ? filldir+0x7d/0xe0
<4> [<ffffffffa15f9d40>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
<4> [<ffffffffa15b0bc5>] ? ll_release_page+0x35/0xd0 [lustre]
<4> [<ffffffffa15b0e9f>] ? ll_dir_read+0x23f/0x300 [lustre]
<4> [<ffffffff8119f710>] ? filldir+0x0/0xe0
<4> [<ffffffff8119e4e2>] vfs_ioctl+0x22/0xa0
<4> [<ffffffff8119e684>] do_vfs_ioctl+0x84/0x580
<4> [<ffffffff8119f710>] ? filldir+0x0/0xe0
<4> [<ffffffff8119f972>] ? vfs_readdir+0xa2/0xe0
<4> [<ffffffff8119ec01>] sys_ioctl+0x81/0xa0
<4> [<ffffffff8152c07e>] ? do_device_not_available+0xe/0x10
<4> [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
<4>
<0>Kernel panic - not syncing: LBUG
<4>Pid: 26538, comm: lfs Not tainted 2.6.32-431.29.2.el6.x86_64 #1
<4>Call Trace:
<4> [<ffffffff8152873c>] ? panic+0xa7/0x16f
<4> [<ffffffffa0fb6eeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
<4> [<ffffffffa16f20ca>] ? osc_object_ast_clear+0x12a/0x130 [osc]
<4> [<ffffffffa16f1fa0>] ? osc_object_ast_clear+0x0/0x130 [osc]
<4> [<ffffffffa12bec8f>] ? ldlm_resource_foreach+0x29f/0x300 [ptlrpc]
<4> [<ffffffffa16f1fa0>] ? osc_object_ast_clear+0x0/0x130 [osc]
<4> [<ffffffffa12bed6a>] ? ldlm_resource_iterate+0x7a/0x1a0 [ptlrpc]
<4> [<ffffffffa16f1e66>] ? osc_object_prune+0xd6/0x210 [osc]
<4> [<ffffffff81058bd3>] ? __wake_up+0x53/0x70
<4> [<ffffffffa1118ee5>] ? cl_object_prune+0x55/0x100 [obdclass]
<4> [<ffffffffa155b32c>] ? lov_delete_raid0+0xcc/0x3e0 [lov]
<4> [<ffffffff8128ceb6>] ? vsnprintf+0x336/0x5e0
<4> [<ffffffffa155a819>] ? lov_object_delete+0x69/0x180 [lov]
<4> [<ffffffffa1110141>] ? lu_object_free+0x81/0x1a0 [obdclass]
<4> [<ffffffffa0fcbdb4>] ? cfs_hash_dual_bd_unlock+0x34/0x60 [libcfs]
<3>LustreError: 27936:0:(file.c:3036:ll_migrate()) scratch: migrate sleep , but fid [0x0:0x0:0x0] is insane
<3>LustreError: 27936:0:(file.c:3036:ll_migrate()) Skipped 122 previous similar messages
<4> [<ffffffffa0fcc4a2>] ? cfs_hash_bd_from_key+0x42/0xd0 [libcfs]
<4> [<ffffffffa11108bd>] ? lu_object_put+0xad/0x330 [obdclass]
<4> [<ffffffffa1615da2>] ? cl_inode_fini+0x52/0x270 [lustre]
<4> [<ffffffffa11195be>] ? cl_object_put+0xe/0x10 [obdclass]
<4> [<ffffffffa1615dda>] ? cl_inode_fini+0x8a/0x270 [lustre]
<4> [<ffffffffa151545e>] ? mdc_null_inode+0x7e/0x1c0 [mdc]
<4> [<ffffffffa15d94fd>] ? ll_clear_inode+0x25d/0x980 [lustre]
<4> [<ffffffffa15d7ee0>] ? ll_delete_inode+0x0/0x210 [lustre]
<4> [<ffffffff811a654c>] ? clear_inode+0xac/0x140
<4> [<ffffffffa15d7f44>] ? ll_delete_inode+0x64/0x210 [lustre]
<4> [<ffffffff811a6c4e>] ? generic_delete_inode+0xde/0x1d0
<4> [<ffffffff811a6da5>] ? generic_drop_inode+0x65/0x80
<4> [<ffffffff811a5bf2>] ? iput+0x62/0x70
<4> [<ffffffffa15c27b7>] ? ll_migrate+0x437/0x950 [lustre]
<4> [<ffffffffa15ba35e>] ? ll_dir_ioctl+0x5a6e/0x64d0 [lustre]
<4> [<ffffffff8119f78d>] ? filldir+0x7d/0xe0
<4> [<ffffffffa15f9d40>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
<4> [<ffffffffa15b0bc5>] ? ll_release_page+0x35/0xd0 [lustre]
<4> [<ffffffffa15b0e9f>] ? ll_dir_read+0x23f/0x300 [lustre]
<4> [<ffffffff8119f710>] ? filldir+0x0/0xe0
<4> [<ffffffff8119e4e2>] ? vfs_ioctl+0x22/0xa0
<4> [<ffffffff8119e684>] ? do_vfs_ioctl+0x84/0x580
<4> [<ffffffff8119f710>] ? filldir+0x0/0xe0
<4> [<ffffffff8119f972>] ? vfs_readdir+0xa2/0xe0
<4> [<ffffffff8119ec01>] ? sys_ioctl+0x81/0xa0
<4> [<ffffffff8152c07e>] ? do_device_not_available+0xe/0x10
<4> [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
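When reading the c11 trace, it helps to separate the frames the kernel considers part of the call chain from those marked with `?`, which are addresses found on the stack that the unwinder could not confirm. The following is a minimal sketch of that filtering, not a tool used in this ticket; the names `FRAME_RE`, `parse_frames`, and `reliable_chain` are hypothetical, and the regular expression assumes the `[<addr>] func+0xOFF/0xSIZE [module]` layout seen above.

```python
import re

# One stack frame: "[<addr>] [? ]func+0xoff/0xsize [module]" (module is optional).
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<func>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

def parse_frames(text):
    """Return (function, module or None, is_reliable) for every frame in text."""
    return [
        (m.group("func"), m.group("module"), m.group("unreliable") is None)
        for m in FRAME_RE.finditer(text)
    ]

def reliable_chain(text):
    """Return only the function names of frames not marked with '?'."""
    return [func for func, _module, reliable in parse_frames(text) if reliable]

# A few frames copied from the c11 trace, still flattened onto one line.
sample = (
    "<4> [<ffffffffa0fb6e97>] lbug_with_loc+0x47/0xb0 [libcfs] "
    "<4> [<ffffffffa16f20ca>] osc_object_ast_clear+0x12a/0x130 [osc] "
    "<4> [<ffffffffa16f1fa0>] ? osc_object_ast_clear+0x0/0x130 [osc] "
    "<4> [<ffffffffa12bec8f>] ldlm_resource_foreach+0x29f/0x300 [ptlrpc] "
    "<4> [<ffffffff811a5bf2>] iput+0x62/0x70"
)

print(reliable_chain(sample))
```

Applied to the full first trace, this kind of filter would leave the chain from `sys_ioctl` through `ll_migrate`, `iput`, and `osc_object_prune` down to `osc_object_ast_clear`. In the second (panic) trace every frame carries `?`, so the filter would return nothing there.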
On c13, the client running racer, I see the following in dmesg:
LustreError: 30632:0:(file.c:3036:ll_migrate()) scratch: migrate 11 , but fid [0x0:0x0:0x0] is insane
LustreError: 30632:0:(file.c:3036:ll_migrate()) Skipped 6 previous similar messages
INFO: task dir_create.sh:11791 blocked for more than 120 seconds.
Not tainted 2.6.32-431.29.2.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
dir_create.sh D 0000000000000004 0 11791 11766 0x00000080
ffff8803dfd87818 0000000000000086 ffff8803dfd877f8 ffffffffa08d3a13
0000000000000000 0000000000000000 ffffffffa093db60 ffff8803b26a1c00
ffff8808096f9af8 ffff8803dfd87fd8 000000000000fbc8 ffff8808096f9af8
Call Trace:
[<ffffffffa08d3a13>] ? __req_capsule_get+0x163/0x6d0 [ptlrpc]
[<ffffffff8128abba>] ? strlcpy+0x4a/0x60
[<ffffffff8152a5be>] __mutex_lock_slowpath+0x13e/0x180
[<ffffffffa0ba85d5>] ? mdc_open_pack+0x1d5/0x250 [mdc]
[<ffffffff8152a45b>] mutex_lock+0x2b/0x50
[<ffffffffa0babe72>] mdc_enqueue+0x222/0x1a30 [mdc]
[<ffffffffa0bad862>] mdc_intent_lock+0x1e2/0x5f9 [mdc]
[<ffffffffa12b6d40>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
[<ffffffffa0889f40>] ? ldlm_completion_ast+0x0/0x9a0 [ptlrpc]
[<ffffffffa0b58f1a>] ? lmv_fid_alloc+0x25a/0x3d0 [lmv]
[<ffffffffa0b73aab>] lmv_intent_open+0x31b/0x9f0 [lmv]
[<ffffffffa12b6d40>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
[<ffffffffa0b7445f>] lmv_intent_lock+0x2df/0x11c0 [lmv]
[<ffffffff8116f503>] ? kmem_cache_alloc_trace+0x1a3/0x1b0
[<ffffffffa12b4129>] ? ll_i2suppgid+0x19/0x30 [lustre]
[<ffffffffa12b416e>] ? ll_i2gids+0x2e/0xd0 [lustre]
[<ffffffffa1299a9c>] ? ll_prep_md_op_data+0x22c/0x530 [lustre]
[<ffffffffa12b6d40>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
[<ffffffffa12b8929>] ll_lookup_it+0x249/0x9a0 [lustre]
[<ffffffffa12b9109>] ll_lookup_nd+0x89/0x5e0 [lustre]
[<ffffffff81196492>] __lookup_hash+0x102/0x160
[<ffffffff81196bba>] lookup_hash+0x3a/0x50
[<ffffffff8119ba7e>] do_filp_open+0x2de/0xd20
[<ffffffff8109b39c>] ? remove_wait_queue+0x3c/0x50
[<ffffffff81016c71>] ? fpu_finit+0x21/0x40
[<ffffffff8128f83a>] ? strncpy_from_user+0x4a/0x90
[<ffffffff811a8b82>] ? alloc_fd+0x92/0x160
[<ffffffff81185be9>] do_sys_open+0x69/0x140
[<ffffffff81185d00>] sys_open+0x20/0x30
[<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
INFO: task dir_create.sh:11901 blocked for more than 120 seconds.
Not tainted 2.6.32-431.29.2.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
dir_create.sh D 0000000000000005 0 11901 11766 0x00000080
ffff880819e69b98 0000000000000086 0000004b00000000 ffffffffa12e7983
0000000000000098 0020000000000080 5491c0bf00000005 00000000000b7709
ffff8804b7e5d098 ffff880819e69fd8 000000000000fbc8 ffff8804b7e5d098
Call Trace:
[<ffffffff8152a5be>] __mutex_lock_slowpath+0x13e/0x180
[<ffffffff811a4148>] ? __d_lookup+0xd8/0x150
[<ffffffff8152a45b>] mutex_lock+0x2b/0x50
[<ffffffff811989ab>] do_lookup+0x11b/0x230
[<ffffffff81199100>] __link_path_walk+0x200/0x1000
[<ffffffff8119a1ba>] path_walk+0x6a/0xe0
[<ffffffff8119b99a>] do_filp_open+0x1fa/0xd20
[<ffffffff8109b39c>] ? remove_wait_queue+0x3c/0x50
[<ffffffff81016c71>] ? fpu_finit+0x21/0x40
[<ffffffff8128f83a>] ? strncpy_from_user+0x4a/0x90
[<ffffffff811a8b82>] ? alloc_fd+0x92/0x160
[<ffffffff81185be9>] do_sys_open+0x69/0x140
[<ffffffff81185d00>] sys_open+0x20/0x30
[<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
INFO: task mv:23695 blocked for more than 120 seconds.
Not tainted 2.6.32-431.29.2.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mv D 0000000000000006 0 23695 11847 0x00000080
ffff88080f1a7cd8 0000000000000086 0000000000000000 ffff8807b7706aa0
ffff8807b7706aa0 ffff8807b7706aa0 ffff8807b7706aa0 ffff8807b7706aa0
ffff8807b7707058 ffff88080f1a7fd8 000000000000fbc8 ffff8807b7707058
Call Trace:
[<ffffffff8152a5be>] __mutex_lock_slowpath+0x13e/0x180
[<ffffffff81197501>] ? path_put+0x31/0x40
[<ffffffff8152a45b>] mutex_lock+0x2b/0x50
[<ffffffff811969af>] lock_rename+0x3f/0xe0
[<ffffffff8119a701>] sys_renameat+0x1b1/0x3a0
[<ffffffff8119b502>] ? user_path_at+0x62/0xa0
[<ffffffff8118e754>] ? cp_new_stat+0xe4/0x100
[<ffffffff8118ea86>] ? sys_newlstat+0x36/0x50
[<ffffffff810e1e07>] ? audit_syscall_entry+0x1d7/0x200
[<ffffffff8119a90b>] sys_rename+0x1b/0x20
[<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
INFO: task ls:26715 blocked for more than 120 seconds.
Not tainted 2.6.32-431.29.2.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ls D 0000000000000000 0 26715 11886 0x00000080
ffff8803c17ebb58 0000000000000082 0000004b00000000 ffffffffa12e7983
0000000000000098 0020000000000080 5491c0bf00000000 00000000000bc66a
ffff88047a4ed058 ffff8803c17ebfd8 000000000000fbc8 ffff88047a4ed058
Call Trace:
[<ffffffff8152a5be>] __mutex_lock_slowpath+0x13e/0x180
[<ffffffff811a4148>] ? __d_lookup+0xd8/0x150
[<ffffffff8152a45b>] mutex_lock+0x2b/0x50
[<ffffffff811989ab>] do_lookup+0x11b/0x230
[<ffffffff811996a4>] __link_path_walk+0x7a4/0x1000
[<ffffffffa1269000>] ? return_if_equal+0x0/0x30 [lustre]
[<ffffffff8119a1ba>] path_walk+0x6a/0xe0
[<ffffffff8119a3cb>] filename_lookup+0x6b/0xc0
[<ffffffff81226d56>] ? security_file_alloc+0x16/0x20
[<ffffffff8119b8a4>] do_filp_open+0x104/0xd20
[<ffffffff810ec53e>] ? call_rcu+0xe/0x10
[<ffffffff811a28ef>] ? d_free+0x3f/0x60
[<ffffffff8128f83a>] ? strncpy_from_user+0x4a/0x90
[<ffffffff811a8b82>] ? alloc_fd+0x92/0x160
[<ffffffff81185be9>] do_sys_open+0x69/0x140
[<ffffffff8100c715>] ? math_state_restore+0x45/0x60
[<ffffffff81185d00>] sys_open+0x20/0x30
[<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
INFO: task ls:26717 blocked for more than 120 seconds.
Not tainted 2.6.32-431.29.2.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ls D 0000000000000000 0 26717 11886 0x00000080
ffff8803b26a5b58 0000000000000086 0000004b00000000 ffffffffa12e7983
0000000000000098 0020000000000080 5491c0bf00000000 00000000000acca1
ffff88040b0bd098 ffff8803b26a5fd8 000000000000fbc8 ffff88040b0bd098
Call Trace:
[<ffffffff8152a5be>] __mutex_lock_slowpath+0x13e/0x180
[<ffffffff811a4148>] ? __d_lookup+0xd8/0x150
[<ffffffff8152a45b>] mutex_lock+0x2b/0x50
[<ffffffff811989ab>] do_lookup+0x11b/0x230
[<ffffffff81199100>] __link_path_walk+0x200/0x1000
[<ffffffffa1269000>] ? return_if_equal+0x0/0x30 [lustre]
[<ffffffff8119a1ba>] path_walk+0x6a/0xe0
[<ffffffff8119a3cb>] filename_lookup+0x6b/0xc0
[<ffffffff81226d56>] ? security_file_alloc+0x16/0x20
[<ffffffff8119b8a4>] do_filp_open+0x104/0xd20
[<ffffffff810ec53e>] ? call_rcu+0xe/0x10
[<ffffffff811a28ef>] ? d_free+0x3f/0x60
[<ffffffff8128f83a>] ? strncpy_from_user+0x4a/0x90
[<ffffffff811a8b82>] ? alloc_fd+0x92/0x160
[<ffffffff81185be9>] do_sys_open+0x69/0x140
LustreError: 14431:0:(file.c:3036:ll_migrate()) scratch: migrate 1 , but fid [0x0:0x0:0x0] is insane
LustreError: 14431:0:(file.c:3036:ll_migrate()) Skipped 48 previous similar messages
[<ffffffff8100c715>] ? math_state_restore+0x45/0x60
[<ffffffff81185d00>] sys_open+0x20/0x30
[<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
INFO: task ls:26719 blocked for more than 120 seconds.
Not tainted 2.6.32-431.29.2.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ls D 0000000000000001 0 26719 11886 0x00000080
ffff880813fcdb58 0000000000000086 0000000000000000 ffffffffa12e7983
0000000000000098 0020000000000080 5491c0bf00000001 00000000000afd04
ffff880810bce638 ffff880813fcdfd8 000000000000fbc8 ffff880810bce638
Call Trace:
[<ffffffff8152a5be>] __mutex_lock_slowpath+0x13e/0x180
[<ffffffffa04951a1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
[<ffffffff8152a45b>] mutex_lock+0x2b/0x50
[<ffffffff811989ab>] do_lookup+0x11b/0x230
[<ffffffff811996a4>] __link_path_walk+0x7a4/0x1000
[<ffffffffa1269000>] ? return_if_equal+0x0/0x30 [lustre]
[<ffffffff8119a1ba>] path_walk+0x6a/0xe0
[<ffffffff8119a3cb>] filename_lookup+0x6b/0xc0
[<ffffffff81226d56>] ? security_file_alloc+0x16/0x20
[<ffffffff8119b8a4>] do_filp_open+0x104/0xd20
[<ffffffff810ec53e>] ? call_rcu+0xe/0x10
[<ffffffff811a28ef>] ? d_free+0x3f/0x60
[<ffffffff8128f83a>] ? strncpy_from_user+0x4a/0x90
[<ffffffff811a8b82>] ? alloc_fd+0x92/0x160
[<ffffffff81185be9>] do_sys_open+0x69/0x140
[<ffffffff8100c715>] ? math_state_restore+0x45/0x60
[<ffffffff81185d00>] sys_open+0x20/0x30
[<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
INFO: task ls:26720 blocked for more than 120 seconds.
Not tainted 2.6.32-431.29.2.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ls D 0000000000000000 0 26720 11886 0x00000080
ffff880813d71b58 0000000000000086 0000004b13d71ac8 ffffffffa12e7983
0000000000000098 0020000000000080 5491c0bf00000000 00000000000ad882
ffff8806113d9ab8 ffff880813d71fd8 000000000000fbc8 ffff8806113d9ab8
Call Trace:
[<ffffffff8152a5be>] __mutex_lock_slowpath+0x13e/0x180
[<ffffffff811a4148>] ? __d_lookup+0xd8/0x150
[<ffffffff8152a45b>] mutex_lock+0x2b/0x50
[<ffffffff811989ab>] do_lookup+0x11b/0x230
[<ffffffff81199100>] __link_path_walk+0x200/0x1000
[<ffffffffa0899caf>] ? ptlrpc_request_cache_free+0xbf/0x100 [ptlrpc]
[<ffffffff8119a1ba>] path_walk+0x6a/0xe0
[<ffffffff8119a3cb>] filename_lookup+0x6b/0xc0
[<ffffffff81226d56>] ? security_file_alloc+0x16/0x20
[<ffffffff8119b8a4>] do_filp_open+0x104/0xd20
[<ffffffffa128343c>] ? ll_file_release+0x2fc/0xb40 [lustre]
[<ffffffff8128f83a>] ? strncpy_from_user+0x4a/0x90
[<ffffffff811a8b82>] ? alloc_fd+0x92/0x160
[<ffffffff81185be9>] do_sys_open+0x69/0x140
[<ffffffff8100c715>] ? math_state_restore+0x45/0x60
[<ffffffff81185d00>] sys_open+0x20/0x30
[<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
INFO: task ls:26721 blocked for more than 120 seconds.
Not tainted 2.6.32-431.29.2.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ls D 0000000000000007 0 26721 11886 0x00000080
ffff8808105d5b58 0000000000000082 0000004b00000000 ffffffffa12e7983
0000000000000098 0020000000000080 5491c0bf00000007 00000000000b048e
ffff880521c065f8 ffff8808105d5fd8 000000000000fbc8 ffff880521c065f8
Call Trace:
[<ffffffff8152a5be>] __mutex_lock_slowpath+0x13e/0x180
[<ffffffffa04951a1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
[<ffffffff8152a45b>] mutex_lock+0x2b/0x50
[<ffffffff811989ab>] do_lookup+0x11b/0x230
[<ffffffff811996a4>] __link_path_walk+0x7a4/0x1000
[<ffffffffa1269000>] ? return_if_equal+0x0/0x30 [lustre]
[<ffffffff8119a1ba>] path_walk+0x6a/0xe0
[<ffffffff8119a3cb>] filename_lookup+0x6b/0xc0
[<ffffffff81226d56>] ? security_file_alloc+0x16/0x20
[<ffffffff8119b8a4>] do_filp_open+0x104/0xd20
[<ffffffff810ec53e>] ? call_rcu+0xe/0x10
[<ffffffff811a28ef>] ? d_free+0x3f/0x60
[<ffffffff8128f83a>] ? strncpy_from_user+0x4a/0x90
[<ffffffff811a8b82>] ? alloc_fd+0x92/0x160
[<ffffffff81185be9>] do_sys_open+0x69/0x140
[<ffffffff8100c715>] ? math_state_restore+0x45/0x60
[<ffffffff81185d00>] sys_open+0x20/0x30
[<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
INFO: task ls:26723 blocked for more than 120 seconds.
Not tainted 2.6.32-431.29.2.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ls D 0000000000000006 0 26723 11886 0x00000080
ffff880813d51b58 0000000000000086 0000004b00000000 ffffffffa12e7983
0000000000000098 0020000000000080 5491c0bf00000006 00000000000b1e5f
ffff880763c7dab8 ffff880813d51fd8 000000000000fbc8 ffff880763c7dab8
Call Trace:
[<ffffffff8152a5be>] __mutex_lock_slowpath+0x13e/0x180
[<ffffffff811a4148>] ? __d_lookup+0xd8/0x150
[<ffffffff8152a45b>] mutex_lock+0x2b/0x50
[<ffffffff811989ab>] do_lookup+0x11b/0x230
[<ffffffff81199100>] __link_path_walk+0x200/0x1000
[<ffffffffa1269000>] ? return_if_equal+0x0/0x30 [lustre]
[<ffffffff8119a1ba>] path_walk+0x6a/0xe0
[<ffffffff8119a3cb>] filename_lookup+0x6b/0xc0
[<ffffffff81226d56>] ? security_file_alloc+0x16/0x20
[<ffffffff8119b8a4>] do_filp_open+0x104/0xd20
[<ffffffff810ec53e>] ? call_rcu+0xe/0x10
[<ffffffff811a28ef>] ? d_free+0x3f/0x60
[<ffffffff8128f83a>] ? strncpy_from_user+0x4a/0x90
[<ffffffff811a8b82>] ? alloc_fd+0x92/0x160
[<ffffffff81185be9>] do_sys_open+0x69/0x140
[<ffffffff81185d00>] sys_open+0x20/0x30
[<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
LustreError: 15736:0:(file.c:3036:ll_migrate()) scratch: migrate 10 , but fid [0x0:0x0:0x0] is insane
LustreError: 15736:0:(file.c:3036:ll_migrate()) Skipped 2 previous similar messages
LustreError: 19311:0:(lmv_intent.c:239:lmv_revalidate_slaves()) scratch-clilmv-ffff880806604400: nlink 0 < 2 corrupt stripe 0 [0x380000405:0x1f60:0x0]:[0x380000405:0x1f60:0x0]
LustreError: 19311:0:(llite_lib.c:2399:ll_prep_inode()) new_inode -fatal: rc -5
LustreError: 20039:0:(lmv_intent.c:239:lmv_revalidate_slaves()) scratch-clilmv-ffff880806604400: nlink 0 < 2 corrupt stripe 0 [0x3c0000402:0x1d9b:0x0]:[0x3c0000402:0x1d9b:0x0]
LustreError: 20039:0:(lmv_intent.c:239:lmv_revalidate_slaves()) Skipped 1 previous similar message
LustreError: 20039:0:(llite_lib.c:2399:ll_prep_inode()) new_inode -fatal: rc -5
LustreError: 20039:0:(llite_lib.c:2399:ll_prep_inode()) Skipped 1 previous similar message
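The c13 output repeats the same hung-task report for several processes, so a quick summary of which tasks and PIDs are blocked can make the log easier to scan. This is a rough sketch under the assumption that the log uses the `INFO: task NAME:PID blocked for more than N seconds.` wording shown above; `HUNG_RE` and `hung_tasks` are hypothetical names introduced only for illustration.

```python
import re

# Matches the hung-task header line, e.g.
# "INFO: task ls:26715 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"INFO: task (?P<name>\S+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)

def hung_tasks(dmesg_text):
    """Map each hung task name to the sorted list of distinct PIDs reported."""
    by_name = {}
    for m in HUNG_RE.finditer(dmesg_text):
        by_name.setdefault(m.group("name"), set()).add(int(m.group("pid")))
    return {name: sorted(pids) for name, pids in by_name.items()}

# Header lines copied from the c13 dmesg output.
sample = """\
INFO: task dir_create.sh:11791 blocked for more than 120 seconds.
INFO: task dir_create.sh:11901 blocked for more than 120 seconds.
INFO: task mv:23695 blocked for more than 120 seconds.
INFO: task ls:26715 blocked for more than 120 seconds.
INFO: task ls:26717 blocked for more than 120 seconds.
"""

print(hung_tasks(sample))
```

Because PIDs are collected in a set, a task that is reported repeatedly with the same PID is listed once.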
On the second MDS, I see the following migrate errors and call trace in dmesg:
LustreError: 9049:0:(mdt_reint.c:1523:mdt_reint_migrate_internal()) scratch-MDT0001: parent [0x3c0000400:0x1:0x0] is still on the same MDT, which should be migrated first: rc = -1
LustreError: 9049:0:(mdt_reint.c:1523:mdt_reint_migrate_internal()) Skipped 14 previous similar messages
LustreError: 8154:0:(mdt_reint.c:1160:mdt_reint_link()) scratch-MDT0001: source inode [0x380000405:0x1b7a:0x0] on remote MDT from [0x3c0000402:0x1917:0x0]
LustreError: 8154:0:(mdt_reint.c:1160:mdt_reint_link()) Skipped 52 previous similar messages
LustreError: 9052:0:(mdt_reint.c:1514:mdt_reint_migrate_internal()) scratch-MDT0001: source [0x380000404:0x1d4f:0x0] is on the remote MDT
LustreError: 9052:0:(mdt_reint.c:1514:mdt_reint_migrate_internal()) Skipped 98 previous similar messages
LustreError: 9040:0:(mdd_dir.c:4021:mdd_migrate()) scratch-MDD0001: [0x3c0000402:0x18b7:0x0]16 is already opened count 1: rc = -16
LustreError: 9040:0:(mdd_dir.c:4021:mdd_migrate()) Skipped 19 previous similar messages
LustreError: 9040:0:(mdt_open.c:1580:mdt_cross_open()) scratch-MDT0001: [0x3c0000401:0x1aa4:0x0] doesn't exist!: rc = -14
LustreError: 9040:0:(mdt_open.c:1580:mdt_cross_open()) Skipped 7 previous similar messages
INFO: task mdt01_014:9058 blocked for more than 120 seconds.
Not tainted 2.6.32-431.29.2.el6_lustre.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mdt01_014 D 0000000000000002 0 9058 2 0x00000080
ffff880f67b45a90 0000000000000046 0000000000000000 ffff88053ece9300
ffff88053ece9300 ffff88055da93000 ffff880f67b45a90 ffffffffa08b8b29
ffff880b188de638 ffff880f67b45fd8 000000000000fbc8 ffff880b188de638
Call Trace:
[<ffffffffa08b8b29>] ? lu_object_find_try+0x99/0x2b0 [obdclass]
[<ffffffffa08b8d75>] lu_object_find_at+0x35/0x100 [obdclass]
[<ffffffff81061d00>] ? default_wake_function+0x0/0x20
[<ffffffffa08b8e56>] lu_object_find+0x16/0x20 [obdclass]
[<ffffffffa145d066>] mdt_object_find+0x56/0x170 [mdt]
[<ffffffffa1466272>] mdt_object_find_lock+0x42/0x170 [mdt]
[<ffffffffa14840d8>] mdt_lock_slaves+0x228/0x520 [mdt]
[<ffffffffa1485fb3>] mdt_reint_unlink+0x8c3/0x10c0 [mdt]
[<ffffffffa08d5880>] ? lu_ucred+0x20/0x30 [obdclass]
[<ffffffffa145bed5>] ? mdt_ucred+0x15/0x20 [mdt]
[<ffffffffa147c09d>] mdt_reint_rec+0x5d/0x200 [mdt]
[<ffffffffa146018b>] mdt_reint_internal+0x4cb/0x7a0 [mdt]
[<ffffffffa14609eb>] mdt_reint+0x6b/0x120 [mdt]
[<ffffffffa0ee0ade>] tgt_request_handle+0x6fe/0xaf0 [ptlrpc]
[<ffffffffa0e90411>] ptlrpc_main+0xe41/0x1950 [ptlrpc]
[<ffffffffa0e8f5d0>] ? ptlrpc_main+0x0/0x1950 [ptlrpc]
[<ffffffff8109abf6>] kthread+0x96/0xa0
[<ffffffff8100c20a>] child_rip+0xa/0x20
[<ffffffff8109ab60>] ? kthread+0x0/0xa0
[<ffffffff8100c200>] ? child_rip+0x0/0x20
INFO: task mdt01_014:9058 blocked for more than 120 seconds.
Not tainted 2.6.32-431.29.2.el6_lustre.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mdt01_014 D 0000000000000002 0 9058 2 0x00000080
ffff880f67b45a90 0000000000000046 0000000000000000 ffff88053ece9300
ffff88053ece9300 ffff88055da93000 ffff880f67b45a90 ffffffffa08b8b29
ffff880b188de638 ffff880f67b45fd8 000000000000fbc8 ffff880b188de638
Call Trace:
[<ffffffffa08b8b29>] ? lu_object_find_try+0x99/0x2b0 [obdclass]
[<ffffffffa08b8d75>] lu_object_find_at+0x35/0x100 [obdclass]
[<ffffffff81061d00>] ? default_wake_function+0x0/0x20
[<ffffffffa08b8e56>] lu_object_find+0x16/0x20 [obdclass]
[<ffffffffa145d066>] mdt_object_find+0x56/0x170 [mdt]
[<ffffffffa1466272>] mdt_object_find_lock+0x42/0x170 [mdt]
[<ffffffffa14840d8>] mdt_lock_slaves+0x228/0x520 [mdt]
[<ffffffffa1485fb3>] mdt_reint_unlink+0x8c3/0x10c0 [mdt]
[<ffffffffa08d5880>] ? lu_ucred+0x20/0x30 [obdclass]
[<ffffffffa145bed5>] ? mdt_ucred+0x15/0x20 [mdt]
[<ffffffffa147c09d>] mdt_reint_rec+0x5d/0x200 [mdt]
[<ffffffffa146018b>] mdt_reint_internal+0x4cb/0x7a0 [mdt]
[<ffffffffa14609eb>] mdt_reint+0x6b/0x120 [mdt]
[<ffffffffa0ee0ade>] tgt_request_handle+0x6fe/0xaf0 [ptlrpc]
[<ffffffffa0e90411>] ptlrpc_main+0xe41/0x1950 [ptlrpc]
[<ffffffffa0e8f5d0>] ? ptlrpc_main+0x0/0x1950 [ptlrpc]
[<ffffffff8109abf6>] kthread+0x96/0xa0
[<ffffffff8100c20a>] child_rip+0xa/0x20
[<ffffffff8109ab60>] ? kthread+0x0/0xa0
[<ffffffff8100c200>] ? child_rip+0x0/0x20
Lustre: 9050:0:(service.c:1335:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply
 req@ffff880543aa9c80 x1487703048481864/t0(0) o36->c82a75ed-84d9-3fb3-4192-c864a27ef414@192.168.2.113@o2ib:527/0 lens 488/3128 e 24 to 0 dl 1418838807 ref 2 fl Interpret:/0/0 rc 0/0
INFO: task mdt01_014:9058 blocked for more than 120 seconds.
Not tainted 2.6.32-431.29.2.el6_lustre.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mdt01_014 D 0000000000000002 0 9058 2 0x00000080
ffff880f67b45a90 0000000000000046 0000000000000000 ffff88053ece9300
ffff88053ece9300 ffff88055da93000 ffff880f67b45a90 ffffffffa08b8b29
ffff880b188de638 ffff880f67b45fd8 000000000000fbc8 ffff880b188de638
Call Trace:
[<ffffffffa08b8b29>] ? lu_object_find_try+0x99/0x2b0 [obdclass]
[<ffffffffa08b8d75>] lu_object_find_at+0x35/0x100 [obdclass]
[<ffffffff81061d00>] ? default_wake_function+0x0/0x20
[<ffffffffa08b8e56>] lu_object_find+0x16/0x20 [obdclass]
[<ffffffffa145d066>] mdt_object_find+0x56/0x170 [mdt]
[<ffffffffa1466272>] mdt_object_find_lock+0x42/0x170 [mdt]
[<ffffffffa14840d8>] mdt_lock_slaves+0x228/0x520 [mdt]
[<ffffffffa1485fb3>] mdt_reint_unlink+0x8c3/0x10c0 [mdt]
[<ffffffffa08d5880>] ? lu_ucred+0x20/0x30 [obdclass]
[<ffffffffa145bed5>] ? mdt_ucred+0x15/0x20 [mdt]
[<ffffffffa147c09d>] mdt_reint_rec+0x5d/0x200 [mdt]
[<ffffffffa146018b>] mdt_reint_internal+0x4cb/0x7a0 [mdt]
[<ffffffffa14609eb>] mdt_reint+0x6b/0x120 [mdt]
[<ffffffffa0ee0ade>] tgt_request_handle+0x6fe/0xaf0 [ptlrpc]
[<ffffffffa0e90411>] ptlrpc_main+0xe41/0x1950 [ptlrpc]
[<ffffffffa0e8f5d0>] ? ptlrpc_main+0x0/0x1950 [ptlrpc]
[<ffffffff8109abf6>] kthread+0x96/0xa0
[<ffffffff8100c20a>] child_rip+0xa/0x20
[<ffffffff8109ab60>] ? kthread+0x0/0xa0
[<ffffffff8100c200>] ? child_rip+0x0/0x20
Lustre: scratch-MDT0001: Client c82a75ed-84d9-3fb3-4192-c864a27ef414 (at 192.168.2.113@o2ib) reconnecting
INFO: task mdt01_014:9058 blocked for more than 120 seconds.
Not tainted 2.6.32-431.29.2.el6_lustre.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mdt01_014 D 0000000000000002 0 9058 2 0x00000080
ffff880f67b45a90 0000000000000046 0000000000000000 ffff88053ece9300
ffff88053ece9300 ffff88055da93000 ffff880f67b45a90 ffffffffa08b8b29
ffff880b188de638 ffff880f67b45fd8 000000000000fbc8 ffff880b188de638
Call Trace:
[<ffffffffa08b8b29>] ? lu_object_find_try+0x99/0x2b0 [obdclass]
[<ffffffffa08b8d75>] lu_object_find_at+0x35/0x100 [obdclass]
[<ffffffff81061d00>] ? default_wake_function+0x0/0x20
[<ffffffffa08b8e56>] lu_object_find+0x16/0x20 [obdclass]
[<ffffffffa145d066>] mdt_object_find+0x56/0x170 [mdt]
[<ffffffffa1466272>] mdt_object_find_lock+0x42/0x170 [mdt]
[<ffffffffa14840d8>] mdt_lock_slaves+0x228/0x520 [mdt]
[<ffffffffa1485fb3>] mdt_reint_unlink+0x8c3/0x10c0 [mdt]
[<ffffffffa08d5880>] ? lu_ucred+0x20/0x30 [obdclass]
[<ffffffffa145bed5>] ? mdt_ucred+0x15/0x20 [mdt]
[<ffffffffa147c09d>] mdt_reint_rec+0x5d/0x200 [mdt]
[<ffffffffa146018b>] mdt_reint_internal+0x4cb/0x7a0 [mdt]
[<ffffffffa14609eb>] mdt_reint+0x6b/0x120 [mdt]
[<ffffffffa0ee0ade>] tgt_request_handle+0x6fe/0xaf0 [ptlrpc]
[<ffffffffa0e90411>] ptlrpc_main+0xe41/0x1950 [ptlrpc]
[<ffffffffa0e8f5d0>] ? ptlrpc_main+0x0/0x1950 [ptlrpc]
[<ffffffff8109abf6>] kthread+0x96/0xa0
[<ffffffff8100c20a>] child_rip+0xa/0x20
[<ffffffff8109ab60>] ? kthread+0x0/0xa0
[<ffffffff8100c200>] ? child_rip+0x0/0x20
INFO: task mdt01_014:9058 blocked for more than 120 seconds.
Not tainted 2.6.32-431.29.2.el6_lustre.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mdt01_014 D 0000000000000002 0 9058 2 0x00000080
ffff880f67b45a90 0000000000000046 0000000000000000 ffff88053ece9300
ffff88053ece9300 ffff88055da93000 ffff880f67b45a90 ffffffffa08b8b29
ffff880b188de638 ffff880f67b45fd8 000000000000fbc8 ffff880b188de638
Call Trace:
[<ffffffffa08b8b29>] ? lu_object_find_try+0x99/0x2b0 [obdclass]
[<ffffffffa08b8d75>] lu_object_find_at+0x35/0x100 [obdclass]
[<ffffffff81061d00>] ? default_wake_function+0x0/0x20
[<ffffffffa08b8e56>] lu_object_find+0x16/0x20 [obdclass]
[<ffffffffa145d066>] mdt_object_find+0x56/0x170 [mdt]
[<ffffffffa1466272>] mdt_object_find_lock+0x42/0x170 [mdt]
[<ffffffffa14840d8>] mdt_lock_slaves+0x228/0x520 [mdt]
[<ffffffffa1485fb3>] mdt_reint_unlink+0x8c3/0x10c0 [mdt]
[<ffffffffa08d5880>] ? lu_ucred+0x20/0x30 [obdclass]
[<ffffffffa145bed5>] ? mdt_ucred+0x15/0x20 [mdt]
[<ffffffffa147c09d>] mdt_reint_rec+0x5d/0x200 [mdt]
[<ffffffffa146018b>] mdt_reint_internal+0x4cb/0x7a0 [mdt]
[<ffffffffa14609eb>] mdt_reint+0x6b/0x120 [mdt]
[<ffffffffa0ee0ade>] tgt_request_handle+0x6fe/0xaf0 [ptlrpc]
[<ffffffffa0e90411>] ptlrpc_main+0xe41/0x1950 [ptlrpc]
[<ffffffffa0e8f5d0>] ? ptlrpc_main+0x0/0x1950 [ptlrpc]
[<ffffffff8109abf6>] kthread+0x96/0xa0
[<ffffffff8100c20a>] child_rip+0xa/0x20
[<ffffffff8109ab60>] ? kthread+0x0/0xa0
[<ffffffff8100c200>] ? child_rip+0x0/0x20
INFO: task mdt01_014:9058 blocked for more than 120 seconds.
Not tainted 2.6.32-431.29.2.el6_lustre.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mdt01_014 D 0000000000000002 0 9058 2 0x00000080
ffff880f67b45a90 0000000000000046 0000000000000000 ffff88053ece9300
ffff88053ece9300 ffff88055da93000 ffff880f67b45a90 ffffffffa08b8b29
ffff880b188de638 ffff880f67b45fd8 000000000000fbc8 ffff880b188de638
Call Trace:
[<ffffffffa08b8b29>] ? lu_object_find_try+0x99/0x2b0 [obdclass]
[<ffffffffa08b8d75>] lu_object_find_at+0x35/0x100 [obdclass]
[<ffffffff81061d00>] ? default_wake_function+0x0/0x20
[<ffffffffa08b8e56>] lu_object_find+0x16/0x20 [obdclass]
[<ffffffffa145d066>] mdt_object_find+0x56/0x170 [mdt]
[<ffffffffa1466272>] mdt_object_find_lock+0x42/0x170 [mdt]
[<ffffffffa14840d8>] mdt_lock_slaves+0x228/0x520 [mdt]
[<ffffffffa1485fb3>] mdt_reint_unlink+0x8c3/0x10c0 [mdt]
[<ffffffffa08d5880>] ? lu_ucred+0x20/0x30 [obdclass]
[<ffffffffa145bed5>] ? mdt_ucred+0x15/0x20 [mdt]
[<ffffffffa147c09d>] mdt_reint_rec+0x5d/0x200 [mdt]
[<ffffffffa146018b>] mdt_reint_internal+0x4cb/0x7a0 [mdt]
[<ffffffffa14609eb>] mdt_reint+0x6b/0x120 [mdt]
[<ffffffffa0ee0ade>] tgt_request_handle+0x6fe/0xaf0 [ptlrpc]
[<ffffffffa0e90411>] ptlrpc_main+0xe41/0x1950 [ptlrpc]
[<ffffffffa0e8f5d0>] ? ptlrpc_main+0x0/0x1950 [ptlrpc]
[<ffffffff8109abf6>] kthread+0x96/0xa0
[<ffffffff8100c20a>] child_rip+0xa/0x20
[<ffffffff8109ab60>] ? kthread+0x0/0xa0
[<ffffffff8100c200>] ? child_rip+0x0/0x20
INFO: task mdt01_014:9058 blocked for more than 120 seconds.
Not tainted 2.6.32-431.29.2.el6_lustre.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mdt01_014 D 0000000000000002 0 9058 2 0x00000080
ffff880f67b45a90 0000000000000046 0000000000000000 ffff88053ece9300
ffff88053ece9300 ffff88055da93000 ffff880f67b45a90 ffffffffa08b8b29
ffff880b188de638 ffff880f67b45fd8 000000000000fbc8 ffff880b188de638
Call Trace:
[<ffffffffa08b8b29>] ? lu_object_find_try+0x99/0x2b0 [obdclass]
[<ffffffffa08b8d75>] lu_object_find_at+0x35/0x100 [obdclass]
[<ffffffff81061d00>] ? default_wake_function+0x0/0x20
[<ffffffffa08b8e56>] lu_object_find+0x16/0x20 [obdclass]
[<ffffffffa145d066>] mdt_object_find+0x56/0x170 [mdt]
[<ffffffffa1466272>] mdt_object_find_lock+0x42/0x170 [mdt]
[<ffffffffa14840d8>] mdt_lock_slaves+0x228/0x520 [mdt]
[<ffffffffa1485fb3>] mdt_reint_unlink+0x8c3/0x10c0 [mdt]
[<ffffffffa08d5880>] ? lu_ucred+0x20/0x30 [obdclass]
[<ffffffffa145bed5>] ? mdt_ucred+0x15/0x20 [mdt]
[<ffffffffa147c09d>] mdt_reint_rec+0x5d/0x200 [mdt]
[<ffffffffa146018b>] mdt_reint_internal+0x4cb/0x7a0 [mdt]
[<ffffffffa14609eb>] mdt_reint+0x6b/0x120 [mdt]
[<ffffffffa0ee0ade>] tgt_request_handle+0x6fe/0xaf0 [ptlrpc]
[<ffffffffa0e90411>] ptlrpc_main+0xe41/0x1950 [ptlrpc]
[<ffffffffa0e8f5d0>] ? ptlrpc_main+0x0/0x1950 [ptlrpc]
[<ffffffff8109abf6>] kthread+0x96/0xa0
[<ffffffff8100c20a>] child_rip+0xa/0x20
[<ffffffff8109ab60>] ? kthread+0x0/0xa0
[<ffffffff8100c200>] ? child_rip+0x0/0x20
INFO: task mdt01_014:9058 blocked for more than 120 seconds.
Not tainted 2.6.32-431.29.2.el6_lustre.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mdt01_014 D 0000000000000002 0 9058 2 0x00000080
ffff880f67b45a90 0000000000000046 0000000000000000 ffff88053ece9300
ffff88053ece9300 ffff88055da93000 ffff880f67b45a90 ffffffffa08b8b29
ffff880b188de638 ffff880f67b45fd8 000000000000fbc8 ffff880b188de638
Call Trace:
[<ffffffffa08b8b29>] ? lu_object_find_try+0x99/0x2b0 [obdclass]
[<ffffffffa08b8d75>] lu_object_find_at+0x35/0x100 [obdclass]
[<ffffffff81061d00>] ? default_wake_function+0x0/0x20
[<ffffffffa08b8e56>] lu_object_find+0x16/0x20 [obdclass]
[<ffffffffa145d066>] mdt_object_find+0x56/0x170 [mdt]
[<ffffffffa1466272>] mdt_object_find_lock+0x42/0x170 [mdt]
[<ffffffffa14840d8>] mdt_lock_slaves+0x228/0x520 [mdt]
[<ffffffffa1485fb3>] mdt_reint_unlink+0x8c3/0x10c0 [mdt]
[<ffffffffa08d5880>] ? lu_ucred+0x20/0x30 [obdclass]
[<ffffffffa145bed5>] ? mdt_ucred+0x15/0x20 [mdt]
[<ffffffffa147c09d>] mdt_reint_rec+0x5d/0x200 [mdt]
[<ffffffffa146018b>] mdt_reint_internal+0x4cb/0x7a0 [mdt]
[<ffffffffa14609eb>] mdt_reint+0x6b/0x120 [mdt]
[<ffffffffa0ee0ade>] tgt_request_handle+0x6fe/0xaf0 [ptlrpc]
[<ffffffffa0e90411>] ptlrpc_main+0xe41/0x1950 [ptlrpc]
[<ffffffffa0e8f5d0>] ? ptlrpc_main+0x0/0x1950 [ptlrpc]
[<ffffffff8109abf6>] kthread+0x96/0xa0
[<ffffffff8100c20a>] child_rip+0xa/0x20
[<ffffffff8109ab60>] ? kthread+0x0/0xa0
[<ffffffff8100c200>] ? child_rip+0x0/0x20
INFO: task mdt01_014:9058 blocked for more than 120 seconds.
Not tainted 2.6.32-431.29.2.el6_lustre.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mdt01_014 D 0000000000000002 0 9058 2 0x00000080
ffff880f67b45a90 0000000000000046 0000000000000000 ffff88053ece9300
ffff88053ece9300 ffff88055da93000 ffff880f67b45a90 ffffffffa08b8b29
ffff880b188de638 ffff880f67b45fd8 000000000000fbc8 ffff880b188de638
Call Trace:
[<ffffffffa08b8b29>] ? lu_object_find_try+0x99/0x2b0 [obdclass]
[<ffffffffa08b8d75>] lu_object_find_at+0x35/0x100 [obdclass]
[<ffffffff81061d00>] ? default_wake_function+0x0/0x20
[<ffffffffa08b8e56>] lu_object_find+0x16/0x20 [obdclass]
[<ffffffffa145d066>] mdt_object_find+0x56/0x170 [mdt]
[<ffffffffa1466272>] mdt_object_find_lock+0x42/0x170 [mdt]
[<ffffffffa14840d8>] mdt_lock_slaves+0x228/0x520 [mdt]
[<ffffffffa1485fb3>] mdt_reint_unlink+0x8c3/0x10c0 [mdt]
[<ffffffffa08d5880>] ? lu_ucred+0x20/0x30 [obdclass]
[<ffffffffa145bed5>] ? mdt_ucred+0x15/0x20 [mdt]
[<ffffffffa147c09d>] mdt_reint_rec+0x5d/0x200 [mdt]
[<ffffffffa146018b>] mdt_reint_internal+0x4cb/0x7a0 [mdt]
[<ffffffffa14609eb>] mdt_reint+0x6b/0x120 [mdt]
[<ffffffffa0ee0ade>] tgt_request_handle+0x6fe/0xaf0 [ptlrpc]
[<ffffffffa0e90411>] ptlrpc_main+0xe41/0x1950 [ptlrpc]
[<ffffffffa0e8f5d0>] ? ptlrpc_main+0x0/0x1950 [ptlrpc]
[<ffffffff8109abf6>] kthread+0x96/0xa0
[<ffffffff8100c20a>] child_rip+0xa/0x20
[<ffffffff8109ab60>] ? kthread+0x0/0xa0
[<ffffffff8100c200>] ? child_rip+0x0/0x20
Lustre: scratch-MDT0001: Client c82a75ed-84d9-3fb3-4192-c864a27ef414 (at 192.168.2.113@o2ib) reconnecting
INFO: task mdt01_014:9058 blocked for more than 120 seconds.
Not tainted 2.6.32-431.29.2.el6_lustre.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mdt01_014 D 0000000000000002 0 9058 2 0x00000080
ffff880f67b45a90 0000000000000046 0000000000000000 ffff88053ece9300
ffff88053ece9300 ffff88055da93000 ffff880f67b45a90 ffffffffa08b8b29
ffff880b188de638 ffff880f67b45fd8 000000000000fbc8 ffff880b188de638
Call Trace:
[<ffffffffa08b8b29>] ? lu_object_find_try+0x99/0x2b0 [obdclass]
[<ffffffffa08b8d75>] lu_object_find_at+0x35/0x100 [obdclass]
[<ffffffff81061d00>] ? default_wake_function+0x0/0x20
[<ffffffffa08b8e56>] lu_object_find+0x16/0x20 [obdclass]
[<ffffffffa145d066>] mdt_object_find+0x56/0x170 [mdt]
[<ffffffffa1466272>] mdt_object_find_lock+0x42/0x170 [mdt]
[<ffffffffa14840d8>] mdt_lock_slaves+0x228/0x520 [mdt]
[<ffffffffa1485fb3>] mdt_reint_unlink+0x8c3/0x10c0 [mdt]
[<ffffffffa08d5880>] ? lu_ucred+0x20/0x30 [obdclass]
[<ffffffffa145bed5>] ? mdt_ucred+0x15/0x20 [mdt]
[<ffffffffa147c09d>] mdt_reint_rec+0x5d/0x200 [mdt]
[<ffffffffa146018b>] mdt_reint_internal+0x4cb/0x7a0 [mdt]
[<ffffffffa14609eb>] mdt_reint+0x6b/0x120 [mdt]
[<ffffffffa0ee0ade>] tgt_request_handle+0x6fe/0xaf0 [ptlrpc]
[<ffffffffa0e90411>] ptlrpc_main+0xe41/0x1950 [ptlrpc]
[<ffffffffa0e8f5d0>] ? ptlrpc_main+0x0/0x1950 [ptlrpc]
[<ffffffff8109abf6>] kthread+0x96/0xa0
[<ffffffff8100c20a>] child_rip+0xa/0x20
[<ffffffff8109ab60>] ? kthread+0x0/0xa0
[<ffffffff8100c200>] ? child_rip+0x0/0x20