Lustre / LU-6042

racer test_1: osc_object_ast_clear() LBUG

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.7.0
    • Lustre 2.7.0
    • None
    • OpenSFS cluster with two MDSs with one MDT each, three OSSs with two OSTs each and three clients running lustre-master build #2771
    • 3
    • 16842

    Description

      I started racer on client c13 and, soon after, client c11 crashed. racer paused for a long time, then resumed running, and now appears to be hung again.

      On client c11 console, I saw:

      Message from syslogd@c11 at Dec 17 09:45:47 ...
       kernel:LustreError: 26538:0:(osc_object.c:212:osc_object_ast_clear()) ASSERTION( lock->l_granted_mode == lock->l_req_mode ) failed: 
      
      Message from syslogd@c11 at Dec 17 09:45:47 ...
       kernel:LustreError: 26538:0:(osc_object.c:212:osc_object_ast_clear()) LBUG
      

      This could be related to one of the other racer tickets, but I couldn’t find any other racer ticket that mentions an LBUG in osc_object_ast_clear().
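
When comparing this console output against other racer failures, it can help to filter LustreError lines by the function that reported them. The following is only a minimal sketch: the field layout is inferred from the messages quoted in this ticket, and `parse_lustre_error` and `find_fatal` are hypothetical helper names, not part of any Lustre tool.

```python
import re

# Layout "LustreError: N:N:(file:line:function()) message" is an assumption
# based solely on the console lines quoted in this ticket.
LUSTRE_ERROR_RE = re.compile(
    r"LustreError: (?P<pid>\d+):(?P<aux>\d+):"
    r"\((?P<file>[^:()]+):(?P<line>\d+):(?P<func>\w+)\(\)\) ?(?P<msg>.*)"
)


def parse_lustre_error(text):
    """Split one console line into its fields, or return None if it does not match."""
    m = LUSTRE_ERROR_RE.search(text)
    if m is None:
        return None
    rec = m.groupdict()
    rec["pid"] = int(rec["pid"])
    rec["line"] = int(rec["line"])
    rec["msg"] = rec["msg"].strip()
    return rec


def find_fatal(lines, func):
    """Return ASSERTION or LBUG records that were reported from `func`."""
    hits = []
    for text in lines:
        rec = parse_lustre_error(text)
        if rec is None or rec["func"] != func:
            continue
        if rec["msg"].startswith("ASSERTION") or rec["msg"] == "LBUG":
            hits.append(rec)
    return hits


# Lines copied from the c11 console output in this ticket.
SAMPLE_CONSOLE = [
    " kernel:LustreError: 26538:0:(osc_object.c:212:osc_object_ast_clear()) "
    "ASSERTION( lock->l_granted_mode == lock->l_req_mode ) failed: ",
    " kernel:LustreError: 26538:0:(osc_object.c:212:osc_object_ast_clear()) LBUG",
    "<3>LustreError: 24520:0:(file.c:3036:ll_migrate()) scratch: migrate 3 , "
    "but fid [0x0:0x0:0x0] is insane",
]

if __name__ == "__main__":
    for rec in find_fatal(SAMPLE_CONSOLE, "osc_object_ast_clear"):
        print(rec["pid"], rec["file"], rec["line"], rec["msg"])
```

Running the same filter over a saved console log from another racer ticket would show whether that ticket hit the same assertion in osc_object_ast_clear().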

      From the crash dmesg on client c11, there are many “fid is insane” errors and a call trace:

      …
      <3>LustreError: 24520:0:(file.c:3036:ll_migrate()) scratch: migrate 3 , but fid [0x0:0x0:0x0] is insane
      <3>LustreError: 24520:0:(file.c:3036:ll_migrate()) Skipped 33 previous similar messages
      <3>LustreError: 29819:0:(file.c:3036:ll_migrate()) scratch: migrate 7 , but fid [0x0:0x0:0x0] is insane
      <3>LustreError: 29819:0:(file.c:3036:ll_migrate()) Skipped 7 previous similar messages
      <0>LustreError: 26538:0:(osc_object.c:212:osc_object_ast_clear()) ASSERTION( lock->l_granted_mode == lock->l_req_mode ) failed: 
      <0>LustreError: 26538:0:(osc_object.c:212:osc_object_ast_clear()) LBUG
      <4>Pid: 26538, comm: lfs
      <4>
      <4>Call Trace:
      <4> [<ffffffffa0fb6895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      <4> [<ffffffffa0fb6e97>] lbug_with_loc+0x47/0xb0 [libcfs]
      <4> [<ffffffffa16f20ca>] osc_object_ast_clear+0x12a/0x130 [osc]
      <4> [<ffffffffa16f1fa0>] ? osc_object_ast_clear+0x0/0x130 [osc]
      <4> [<ffffffffa12bec8f>] ldlm_resource_foreach+0x29f/0x300 [ptlrpc]
      <4> [<ffffffffa16f1fa0>] ? osc_object_ast_clear+0x0/0x130 [osc]
      <4> [<ffffffffa12bed6a>] ldlm_resource_iterate+0x7a/0x1a0 [ptlrpc]
      <4> [<ffffffffa16f1e66>] osc_object_prune+0xd6/0x210 [osc]
      <4> [<ffffffff81058bd3>] ? __wake_up+0x53/0x70
      <4> [<ffffffffa1118ee5>] cl_object_prune+0x55/0x100 [obdclass]
      <4> [<ffffffffa155b32c>] lov_delete_raid0+0xcc/0x3e0 [lov]
      <4> [<ffffffff8128ceb6>] ? vsnprintf+0x336/0x5e0
      <4> [<ffffffffa155a819>] lov_object_delete+0x69/0x180 [lov]
      <4> [<ffffffffa1110141>] lu_object_free+0x81/0x1a0 [obdclass]
      <4> [<ffffffffa0fcbdb4>] ? cfs_hash_dual_bd_unlock+0x34/0x60 [libcfs]
      <4> [<ffffffffa0fcc4a2>] ? cfs_hash_bd_from_key+0x42/0xd0 [libcfs]
      <4> [<ffffffffa11108bd>] lu_object_put+0xad/0x330 [obdclass]
      <4> [<ffffffffa1615da2>] ? cl_inode_fini+0x52/0x270 [lustre]
      <4> [<ffffffffa11195be>] cl_object_put+0xe/0x10 [obdclass]
      <4> [<ffffffffa1615dda>] cl_inode_fini+0x8a/0x270 [lustre]
      <4> [<ffffffffa151545e>] ? mdc_null_inode+0x7e/0x1c0 [mdc]
      <4> [<ffffffffa15d94fd>] ll_clear_inode+0x25d/0x980 [lustre]
      <4> [<ffffffffa15d7ee0>] ? ll_delete_inode+0x0/0x210 [lustre]
      <4> [<ffffffff811a654c>] clear_inode+0xac/0x140
      <4> [<ffffffffa15d7f44>] ll_delete_inode+0x64/0x210 [lustre]
      <4> [<ffffffff811a6c4e>] generic_delete_inode+0xde/0x1d0
      <4> [<ffffffff811a6da5>] generic_drop_inode+0x65/0x80
      <4> [<ffffffff811a5bf2>] iput+0x62/0x70
      <4> [<ffffffffa15c27b7>] ll_migrate+0x437/0x950 [lustre]
      <4> [<ffffffffa15ba35e>] ll_dir_ioctl+0x5a6e/0x64d0 [lustre]
      <4> [<ffffffff8119f78d>] ? filldir+0x7d/0xe0
      <4> [<ffffffffa15f9d40>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
      <4> [<ffffffffa15b0bc5>] ? ll_release_page+0x35/0xd0 [lustre]
      <4> [<ffffffffa15b0e9f>] ? ll_dir_read+0x23f/0x300 [lustre]
      <4> [<ffffffff8119f710>] ? filldir+0x0/0xe0
      <4> [<ffffffff8119e4e2>] vfs_ioctl+0x22/0xa0
      <4> [<ffffffff8119e684>] do_vfs_ioctl+0x84/0x580
      <4> [<ffffffff8119f710>] ? filldir+0x0/0xe0
      <4> [<ffffffff8119f972>] ? vfs_readdir+0xa2/0xe0
      <4> [<ffffffff8119ec01>] sys_ioctl+0x81/0xa0
      <4> [<ffffffff8152c07e>] ? do_device_not_available+0xe/0x10
      <4> [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      <4>
      <0>Kernel panic - not syncing: LBUG
      <4>Pid: 26538, comm: lfs Not tainted 2.6.32-431.29.2.el6.x86_64 #1
      <4>Call Trace:
      <4> [<ffffffff8152873c>] ? panic+0xa7/0x16f
      <4> [<ffffffffa0fb6eeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
      <4> [<ffffffffa16f20ca>] ? osc_object_ast_clear+0x12a/0x130 [osc]
      <4> [<ffffffffa16f1fa0>] ? osc_object_ast_clear+0x0/0x130 [osc]
      <4> [<ffffffffa12bec8f>] ? ldlm_resource_foreach+0x29f/0x300 [ptlrpc]
      <4> [<ffffffffa16f1fa0>] ? osc_object_ast_clear+0x0/0x130 [osc]
      <4> [<ffffffffa12bed6a>] ? ldlm_resource_iterate+0x7a/0x1a0 [ptlrpc]
      <4> [<ffffffffa16f1e66>] ? osc_object_prune+0xd6/0x210 [osc]
      <4> [<ffffffff81058bd3>] ? __wake_up+0x53/0x70
      <4> [<ffffffffa1118ee5>] ? cl_object_prune+0x55/0x100 [obdclass]
      <4> [<ffffffffa155b32c>] ? lov_delete_raid0+0xcc/0x3e0 [lov]
      <4> [<ffffffff8128ceb6>] ? vsnprintf+0x336/0x5e0
      <4> [<ffffffffa155a819>] ? lov_object_delete+0x69/0x180 [lov]
      <4> [<ffffffffa1110141>] ? lu_object_free+0x81/0x1a0 [obdclass]
      <4> [<ffffffffa0fcbdb4>] ? cfs_hash_dual_bd_unlock+0x34/0x60 [libcfs]
      <3>LustreError: 27936:0:(file.c:3036:ll_migrate()) scratch: migrate sleep , but fid [0x0:0x0:0x0] is insane
      <3>LustreError: 27936:0:(file.c:3036:ll_migrate()) Skipped 122 previous similar messages
      <4> [<ffffffffa0fcc4a2>] ? cfs_hash_bd_from_key+0x42/0xd0 [libcfs]
      <4> [<ffffffffa11108bd>] ? lu_object_put+0xad/0x330 [obdclass]
      <4> [<ffffffffa1615da2>] ? cl_inode_fini+0x52/0x270 [lustre]
      <4> [<ffffffffa11195be>] ? cl_object_put+0xe/0x10 [obdclass]
      <4> [<ffffffffa1615dda>] ? cl_inode_fini+0x8a/0x270 [lustre]
      <4> [<ffffffffa151545e>] ? mdc_null_inode+0x7e/0x1c0 [mdc]
      <4> [<ffffffffa15d94fd>] ? ll_clear_inode+0x25d/0x980 [lustre]
      <4> [<ffffffffa15d7ee0>] ? ll_delete_inode+0x0/0x210 [lustre]
      <4> [<ffffffff811a654c>] ? clear_inode+0xac/0x140
      <4> [<ffffffffa15d7f44>] ? ll_delete_inode+0x64/0x210 [lustre]
      <4> [<ffffffff811a6c4e>] ? generic_delete_inode+0xde/0x1d0
      <4> [<ffffffff811a6da5>] ? generic_drop_inode+0x65/0x80
      <4> [<ffffffff811a5bf2>] ? iput+0x62/0x70
      <4> [<ffffffffa15c27b7>] ? ll_migrate+0x437/0x950 [lustre]
      <4> [<ffffffffa15ba35e>] ? ll_dir_ioctl+0x5a6e/0x64d0 [lustre]
      <4> [<ffffffff8119f78d>] ? filldir+0x7d/0xe0
      <4> [<ffffffffa15f9d40>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
      <4> [<ffffffffa15b0bc5>] ? ll_release_page+0x35/0xd0 [lustre]
      <4> [<ffffffffa15b0e9f>] ? ll_dir_read+0x23f/0x300 [lustre]
      <4> [<ffffffff8119f710>] ? filldir+0x0/0xe0
      <4> [<ffffffff8119e4e2>] ? vfs_ioctl+0x22/0xa0
      <4> [<ffffffff8119e684>] ? do_vfs_ioctl+0x84/0x580
      <4> [<ffffffff8119f710>] ? filldir+0x0/0xe0
      <4> [<ffffffff8119f972>] ? vfs_readdir+0xa2/0xe0
      <4> [<ffffffff8119ec01>] ? sys_ioctl+0x81/0xa0
      <4> [<ffffffff8152c07e>] ? do_device_not_available+0xe/0x10
      <4> [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
      

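To read the code path out of the first call trace above, one option is to drop the frames prefixed with `?`, which the kernel prints for addresses it found on the stack but could not confirm as part of the call chain. The sketch below assumes the frame layout seen in this dmesg; `reliable_frames` is a hypothetical helper name. It is not useful on the second (panic) trace, where every frame carries the `?` marker.

```python
import re

# Frame layout "[<address>] [? ]symbol+0xOFF/0xSIZE [module]" is taken from
# the trace lines in this ticket; other kernels may format frames differently.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\] (?P<unreliable>\? )?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<module>\w+)\])?"
)


def reliable_frames(trace_lines):
    """Return (symbol, module) pairs for frames that are not marked with '?'."""
    frames = []
    for text in trace_lines:
        m = FRAME_RE.search(text)
        if m and not m.group("unreliable"):
            # Frames without a [module] suffix belong to the core kernel.
            frames.append((m.group("symbol"), m.group("module") or "kernel"))
    return frames


# Innermost frames of the first call trace on c11, copied from the dmesg above.
SAMPLE_TRACE = [
    "<4> [<ffffffffa0fb6895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]",
    "<4> [<ffffffffa0fb6e97>] lbug_with_loc+0x47/0xb0 [libcfs]",
    "<4> [<ffffffffa16f20ca>] osc_object_ast_clear+0x12a/0x130 [osc]",
    "<4> [<ffffffffa16f1fa0>] ? osc_object_ast_clear+0x0/0x130 [osc]",
    "<4> [<ffffffffa12bec8f>] ldlm_resource_foreach+0x29f/0x300 [ptlrpc]",
    "<4> [<ffffffffa16f1fa0>] ? osc_object_ast_clear+0x0/0x130 [osc]",
    "<4> [<ffffffffa12bed6a>] ldlm_resource_iterate+0x7a/0x1a0 [ptlrpc]",
    "<4> [<ffffffffa16f1e66>] osc_object_prune+0xd6/0x210 [osc]",
    "<4> [<ffffffff81058bd3>] ? __wake_up+0x53/0x70",
    "<4> [<ffffffffa1118ee5>] cl_object_prune+0x55/0x100 [obdclass]",
]

if __name__ == "__main__":
    # Print the chain innermost-first, as it appears in the trace.
    for symbol, module in reliable_frames(SAMPLE_TRACE):
        print(f"{symbol} [{module}]")
```

Applied to the full first trace, this leaves the chain from sys_ioctl through ll_migrate, iput and cl_object_prune down to osc_object_ast_clear.
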
      On c13, the client running racer, I see the following in dmesg:

      LustreError: 30632:0:(file.c:3036:ll_migrate()) scratch: migrate 11 , but fid [0x0:0x0:0x0] is insane
      LustreError: 30632:0:(file.c:3036:ll_migrate()) Skipped 6 previous similar messages
      INFO: task dir_create.sh:11791 blocked for more than 120 seconds.
            Not tainted 2.6.32-431.29.2.el6.x86_64 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      dir_create.sh D 0000000000000004     0 11791  11766 0x00000080
       ffff8803dfd87818 0000000000000086 ffff8803dfd877f8 ffffffffa08d3a13
       0000000000000000 0000000000000000 ffffffffa093db60 ffff8803b26a1c00
       ffff8808096f9af8 ffff8803dfd87fd8 000000000000fbc8 ffff8808096f9af8
      Call Trace:
       [<ffffffffa08d3a13>] ? __req_capsule_get+0x163/0x6d0 [ptlrpc]
       [<ffffffff8128abba>] ? strlcpy+0x4a/0x60
       [<ffffffff8152a5be>] __mutex_lock_slowpath+0x13e/0x180
       [<ffffffffa0ba85d5>] ? mdc_open_pack+0x1d5/0x250 [mdc]
       [<ffffffff8152a45b>] mutex_lock+0x2b/0x50
       [<ffffffffa0babe72>] mdc_enqueue+0x222/0x1a30 [mdc]
       [<ffffffffa0bad862>] mdc_intent_lock+0x1e2/0x5f9 [mdc]
       [<ffffffffa12b6d40>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
       [<ffffffffa0889f40>] ? ldlm_completion_ast+0x0/0x9a0 [ptlrpc]
       [<ffffffffa0b58f1a>] ? lmv_fid_alloc+0x25a/0x3d0 [lmv]
       [<ffffffffa0b73aab>] lmv_intent_open+0x31b/0x9f0 [lmv]
       [<ffffffffa12b6d40>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
       [<ffffffffa0b7445f>] lmv_intent_lock+0x2df/0x11c0 [lmv]
       [<ffffffff8116f503>] ? kmem_cache_alloc_trace+0x1a3/0x1b0
       [<ffffffffa12b4129>] ? ll_i2suppgid+0x19/0x30 [lustre]
       [<ffffffffa12b416e>] ? ll_i2gids+0x2e/0xd0 [lustre]
       [<ffffffffa1299a9c>] ? ll_prep_md_op_data+0x22c/0x530 [lustre]
       [<ffffffffa12b6d40>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
       [<ffffffffa12b8929>] ll_lookup_it+0x249/0x9a0 [lustre]
       [<ffffffffa12b9109>] ll_lookup_nd+0x89/0x5e0 [lustre]
       [<ffffffff81196492>] __lookup_hash+0x102/0x160
       [<ffffffff81196bba>] lookup_hash+0x3a/0x50
       [<ffffffff8119ba7e>] do_filp_open+0x2de/0xd20
       [<ffffffff8109b39c>] ? remove_wait_queue+0x3c/0x50
       [<ffffffff81016c71>] ? fpu_finit+0x21/0x40
       [<ffffffff8128f83a>] ? strncpy_from_user+0x4a/0x90
       [<ffffffff811a8b82>] ? alloc_fd+0x92/0x160
       [<ffffffff81185be9>] do_sys_open+0x69/0x140
       [<ffffffff81185d00>] sys_open+0x20/0x30
       [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      INFO: task dir_create.sh:11901 blocked for more than 120 seconds.
            Not tainted 2.6.32-431.29.2.el6.x86_64 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      dir_create.sh D 0000000000000005     0 11901  11766 0x00000080
       ffff880819e69b98 0000000000000086 0000004b00000000 ffffffffa12e7983
       0000000000000098 0020000000000080 5491c0bf00000005 00000000000b7709
       ffff8804b7e5d098 ffff880819e69fd8 000000000000fbc8 ffff8804b7e5d098
      Call Trace:
       [<ffffffff8152a5be>] __mutex_lock_slowpath+0x13e/0x180
       [<ffffffff811a4148>] ? __d_lookup+0xd8/0x150
       [<ffffffff8152a45b>] mutex_lock+0x2b/0x50
       [<ffffffff811989ab>] do_lookup+0x11b/0x230
       [<ffffffff81199100>] __link_path_walk+0x200/0x1000
       [<ffffffff8119a1ba>] path_walk+0x6a/0xe0
       [<ffffffff8119b99a>] do_filp_open+0x1fa/0xd20
       [<ffffffff8109b39c>] ? remove_wait_queue+0x3c/0x50
       [<ffffffff81016c71>] ? fpu_finit+0x21/0x40
       [<ffffffff8128f83a>] ? strncpy_from_user+0x4a/0x90
       [<ffffffff811a8b82>] ? alloc_fd+0x92/0x160
       [<ffffffff81185be9>] do_sys_open+0x69/0x140
       [<ffffffff81185d00>] sys_open+0x20/0x30
       [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      INFO: task mv:23695 blocked for more than 120 seconds.
            Not tainted 2.6.32-431.29.2.el6.x86_64 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      mv            D 0000000000000006     0 23695  11847 0x00000080
       ffff88080f1a7cd8 0000000000000086 0000000000000000 ffff8807b7706aa0
       ffff8807b7706aa0 ffff8807b7706aa0 ffff8807b7706aa0 ffff8807b7706aa0
       ffff8807b7707058 ffff88080f1a7fd8 000000000000fbc8 ffff8807b7707058
      Call Trace:
       [<ffffffff8152a5be>] __mutex_lock_slowpath+0x13e/0x180
       [<ffffffff81197501>] ? path_put+0x31/0x40
       [<ffffffff8152a45b>] mutex_lock+0x2b/0x50
       [<ffffffff811969af>] lock_rename+0x3f/0xe0
       [<ffffffff8119a701>] sys_renameat+0x1b1/0x3a0
       [<ffffffff8119b502>] ? user_path_at+0x62/0xa0
       [<ffffffff8118e754>] ? cp_new_stat+0xe4/0x100
       [<ffffffff8118ea86>] ? sys_newlstat+0x36/0x50
       [<ffffffff810e1e07>] ? audit_syscall_entry+0x1d7/0x200
       [<ffffffff8119a90b>] sys_rename+0x1b/0x20
       [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      INFO: task ls:26715 blocked for more than 120 seconds.
            Not tainted 2.6.32-431.29.2.el6.x86_64 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      ls            D 0000000000000000     0 26715  11886 0x00000080
       ffff8803c17ebb58 0000000000000082 0000004b00000000 ffffffffa12e7983
       0000000000000098 0020000000000080 5491c0bf00000000 00000000000bc66a
       ffff88047a4ed058 ffff8803c17ebfd8 000000000000fbc8 ffff88047a4ed058
      Call Trace:
       [<ffffffff8152a5be>] __mutex_lock_slowpath+0x13e/0x180
       [<ffffffff811a4148>] ? __d_lookup+0xd8/0x150
       [<ffffffff8152a45b>] mutex_lock+0x2b/0x50
       [<ffffffff811989ab>] do_lookup+0x11b/0x230
       [<ffffffff811996a4>] __link_path_walk+0x7a4/0x1000
       [<ffffffffa1269000>] ? return_if_equal+0x0/0x30 [lustre]
       [<ffffffff8119a1ba>] path_walk+0x6a/0xe0
       [<ffffffff8119a3cb>] filename_lookup+0x6b/0xc0
       [<ffffffff81226d56>] ? security_file_alloc+0x16/0x20
       [<ffffffff8119b8a4>] do_filp_open+0x104/0xd20
       [<ffffffff810ec53e>] ? call_rcu+0xe/0x10
       [<ffffffff811a28ef>] ? d_free+0x3f/0x60
       [<ffffffff8128f83a>] ? strncpy_from_user+0x4a/0x90
       [<ffffffff811a8b82>] ? alloc_fd+0x92/0x160
       [<ffffffff81185be9>] do_sys_open+0x69/0x140
       [<ffffffff8100c715>] ? math_state_restore+0x45/0x60
       [<ffffffff81185d00>] sys_open+0x20/0x30
       [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      INFO: task ls:26717 blocked for more than 120 seconds.
            Not tainted 2.6.32-431.29.2.el6.x86_64 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      ls            D 0000000000000000     0 26717  11886 0x00000080
       ffff8803b26a5b58 0000000000000086 0000004b00000000 ffffffffa12e7983
       0000000000000098 0020000000000080 5491c0bf00000000 00000000000acca1
       ffff88040b0bd098 ffff8803b26a5fd8 000000000000fbc8 ffff88040b0bd098
      Call Trace:
       [<ffffffff8152a5be>] __mutex_lock_slowpath+0x13e/0x180
       [<ffffffff811a4148>] ? __d_lookup+0xd8/0x150
       [<ffffffff8152a45b>] mutex_lock+0x2b/0x50
       [<ffffffff811989ab>] do_lookup+0x11b/0x230
       [<ffffffff81199100>] __link_path_walk+0x200/0x1000
       [<ffffffffa1269000>] ? return_if_equal+0x0/0x30 [lustre]
       [<ffffffff8119a1ba>] path_walk+0x6a/0xe0
       [<ffffffff8119a3cb>] filename_lookup+0x6b/0xc0
       [<ffffffff81226d56>] ? security_file_alloc+0x16/0x20
       [<ffffffff8119b8a4>] do_filp_open+0x104/0xd20
       [<ffffffff810ec53e>] ? call_rcu+0xe/0x10
       [<ffffffff811a28ef>] ? d_free+0x3f/0x60
       [<ffffffff8128f83a>] ? strncpy_from_user+0x4a/0x90
       [<ffffffff811a8b82>] ? alloc_fd+0x92/0x160
       [<ffffffff81185be9>] do_sys_open+0x69/0x140
      LustreError: 14431:0:(file.c:3036:ll_migrate()) scratch: migrate 1 , but fid [0x0:0x0:0x0] is insane
      LustreError: 14431:0:(file.c:3036:ll_migrate()) Skipped 48 previous similar messages
       [<ffffffff8100c715>] ? math_state_restore+0x45/0x60
       [<ffffffff81185d00>] sys_open+0x20/0x30
       [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      INFO: task ls:26719 blocked for more than 120 seconds.
            Not tainted 2.6.32-431.29.2.el6.x86_64 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      ls            D 0000000000000001     0 26719  11886 0x00000080
       ffff880813fcdb58 0000000000000086 0000000000000000 ffffffffa12e7983
       0000000000000098 0020000000000080 5491c0bf00000001 00000000000afd04
       ffff880810bce638 ffff880813fcdfd8 000000000000fbc8 ffff880810bce638
      Call Trace:
       [<ffffffff8152a5be>] __mutex_lock_slowpath+0x13e/0x180
       [<ffffffffa04951a1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
       [<ffffffff8152a45b>] mutex_lock+0x2b/0x50
       [<ffffffff811989ab>] do_lookup+0x11b/0x230
       [<ffffffff811996a4>] __link_path_walk+0x7a4/0x1000
       [<ffffffffa1269000>] ? return_if_equal+0x0/0x30 [lustre]
       [<ffffffff8119a1ba>] path_walk+0x6a/0xe0
       [<ffffffff8119a3cb>] filename_lookup+0x6b/0xc0
       [<ffffffff81226d56>] ? security_file_alloc+0x16/0x20
       [<ffffffff8119b8a4>] do_filp_open+0x104/0xd20
       [<ffffffff810ec53e>] ? call_rcu+0xe/0x10
       [<ffffffff811a28ef>] ? d_free+0x3f/0x60
       [<ffffffff8128f83a>] ? strncpy_from_user+0x4a/0x90
       [<ffffffff811a8b82>] ? alloc_fd+0x92/0x160
       [<ffffffff81185be9>] do_sys_open+0x69/0x140
       [<ffffffff8100c715>] ? math_state_restore+0x45/0x60
       [<ffffffff81185d00>] sys_open+0x20/0x30
       [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      INFO: task ls:26720 blocked for more than 120 seconds.
            Not tainted 2.6.32-431.29.2.el6.x86_64 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      ls            D 0000000000000000     0 26720  11886 0x00000080
       ffff880813d71b58 0000000000000086 0000004b13d71ac8 ffffffffa12e7983
       0000000000000098 0020000000000080 5491c0bf00000000 00000000000ad882
       ffff8806113d9ab8 ffff880813d71fd8 000000000000fbc8 ffff8806113d9ab8
      Call Trace:
       [<ffffffff8152a5be>] __mutex_lock_slowpath+0x13e/0x180
       [<ffffffff811a4148>] ? __d_lookup+0xd8/0x150
       [<ffffffff8152a45b>] mutex_lock+0x2b/0x50
       [<ffffffff811989ab>] do_lookup+0x11b/0x230
       [<ffffffff81199100>] __link_path_walk+0x200/0x1000
       [<ffffffffa0899caf>] ? ptlrpc_request_cache_free+0xbf/0x100 [ptlrpc]
       [<ffffffff8119a1ba>] path_walk+0x6a/0xe0
       [<ffffffff8119a3cb>] filename_lookup+0x6b/0xc0
       [<ffffffff81226d56>] ? security_file_alloc+0x16/0x20
       [<ffffffff8119b8a4>] do_filp_open+0x104/0xd20
       [<ffffffffa128343c>] ? ll_file_release+0x2fc/0xb40 [lustre]
       [<ffffffff8128f83a>] ? strncpy_from_user+0x4a/0x90
       [<ffffffff811a8b82>] ? alloc_fd+0x92/0x160
       [<ffffffff81185be9>] do_sys_open+0x69/0x140
       [<ffffffff8100c715>] ? math_state_restore+0x45/0x60
       [<ffffffff81185d00>] sys_open+0x20/0x30
       [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      INFO: task ls:26721 blocked for more than 120 seconds.
            Not tainted 2.6.32-431.29.2.el6.x86_64 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      ls            D 0000000000000007     0 26721  11886 0x00000080
       ffff8808105d5b58 0000000000000082 0000004b00000000 ffffffffa12e7983
       0000000000000098 0020000000000080 5491c0bf00000007 00000000000b048e
       ffff880521c065f8 ffff8808105d5fd8 000000000000fbc8 ffff880521c065f8
      Call Trace:
       [<ffffffff8152a5be>] __mutex_lock_slowpath+0x13e/0x180
       [<ffffffffa04951a1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
       [<ffffffff8152a45b>] mutex_lock+0x2b/0x50
       [<ffffffff811989ab>] do_lookup+0x11b/0x230
       [<ffffffff811996a4>] __link_path_walk+0x7a4/0x1000
       [<ffffffffa1269000>] ? return_if_equal+0x0/0x30 [lustre]
       [<ffffffff8119a1ba>] path_walk+0x6a/0xe0
       [<ffffffff8119a3cb>] filename_lookup+0x6b/0xc0
       [<ffffffff81226d56>] ? security_file_alloc+0x16/0x20
       [<ffffffff8119b8a4>] do_filp_open+0x104/0xd20
       [<ffffffff810ec53e>] ? call_rcu+0xe/0x10
       [<ffffffff811a28ef>] ? d_free+0x3f/0x60
       [<ffffffff8128f83a>] ? strncpy_from_user+0x4a/0x90
       [<ffffffff811a8b82>] ? alloc_fd+0x92/0x160
       [<ffffffff81185be9>] do_sys_open+0x69/0x140
       [<ffffffff8100c715>] ? math_state_restore+0x45/0x60
       [<ffffffff81185d00>] sys_open+0x20/0x30
       [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      INFO: task ls:26723 blocked for more than 120 seconds.
            Not tainted 2.6.32-431.29.2.el6.x86_64 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      ls            D 0000000000000006     0 26723  11886 0x00000080
       ffff880813d51b58 0000000000000086 0000004b00000000 ffffffffa12e7983
       0000000000000098 0020000000000080 5491c0bf00000006 00000000000b1e5f
       ffff880763c7dab8 ffff880813d51fd8 000000000000fbc8 ffff880763c7dab8
      Call Trace:
       [<ffffffff8152a5be>] __mutex_lock_slowpath+0x13e/0x180
       [<ffffffff811a4148>] ? __d_lookup+0xd8/0x150
       [<ffffffff8152a45b>] mutex_lock+0x2b/0x50
       [<ffffffff811989ab>] do_lookup+0x11b/0x230
       [<ffffffff81199100>] __link_path_walk+0x200/0x1000
       [<ffffffffa1269000>] ? return_if_equal+0x0/0x30 [lustre]
       [<ffffffff8119a1ba>] path_walk+0x6a/0xe0
       [<ffffffff8119a3cb>] filename_lookup+0x6b/0xc0
       [<ffffffff81226d56>] ? security_file_alloc+0x16/0x20
       [<ffffffff8119b8a4>] do_filp_open+0x104/0xd20
       [<ffffffff810ec53e>] ? call_rcu+0xe/0x10
       [<ffffffff811a28ef>] ? d_free+0x3f/0x60
       [<ffffffff8128f83a>] ? strncpy_from_user+0x4a/0x90
       [<ffffffff811a8b82>] ? alloc_fd+0x92/0x160
       [<ffffffff81185be9>] do_sys_open+0x69/0x140
       [<ffffffff81185d00>] sys_open+0x20/0x30
       [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      LustreError: 15736:0:(file.c:3036:ll_migrate()) scratch: migrate 10 , but fid [0x0:0x0:0x0] is insane
      LustreError: 15736:0:(file.c:3036:ll_migrate()) Skipped 2 previous similar messages
      LustreError: 19311:0:(lmv_intent.c:239:lmv_revalidate_slaves()) scratch-clilmv-ffff880806604400: nlink 0 < 2 corrupt stripe 0 [0x380000405:0x1f60:0x0]:[0x380000405:0x1f60:0x0]
      LustreError: 19311:0:(llite_lib.c:2399:ll_prep_inode()) new_inode -fatal: rc -5
      LustreError: 20039:0:(lmv_intent.c:239:lmv_revalidate_slaves()) scratch-clilmv-ffff880806604400: nlink 0 < 2 corrupt stripe 0 [0x3c0000402:0x1d9b:0x0]:[0x3c0000402:0x1d9b:0x0]
      LustreError: 20039:0:(lmv_intent.c:239:lmv_revalidate_slaves()) Skipped 1 previous similar message
      LustreError: 20039:0:(llite_lib.c:2399:ll_prep_inode()) new_inode -fatal: rc -5
      LustreError: 20039:0:(llite_lib.c:2399:ll_prep_inode()) Skipped 1 previous similar message
      

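The c13 dmesg above is long, so a quick summary of which tasks the hung-task detector reported can make it easier to scan. This is a rough sketch that assumes the "INFO: task NAME:PID blocked for more than N seconds." wording seen in this log; `summarize_hung_tasks` is a hypothetical helper name.

```python
import re
from collections import Counter

# Wording taken from the hung-task reports in this ticket's dmesg output.
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def summarize_hung_tasks(lines):
    """Count hung-task reports per (command, pid) pair."""
    counts = Counter()
    for text in lines:
        m = HUNG_RE.search(text)
        if m:
            counts[(m.group("comm"), int(m.group("pid")))] += 1
    return counts


def reports_per_command(counts):
    """Collapse per-task counts into per-command totals."""
    totals = Counter()
    for (comm, _pid), n in counts.items():
        totals[comm] += n
    return totals


# First five hung-task reports from the c13 dmesg in this ticket.
SAMPLE_DMESG = [
    "INFO: task dir_create.sh:11791 blocked for more than 120 seconds.",
    "INFO: task dir_create.sh:11901 blocked for more than 120 seconds.",
    "INFO: task mv:23695 blocked for more than 120 seconds.",
    "INFO: task ls:26715 blocked for more than 120 seconds.",
    "INFO: task ls:26717 blocked for more than 120 seconds.",
]

if __name__ == "__main__":
    counts = summarize_hung_tasks(SAMPLE_DMESG)
    for comm, n in sorted(reports_per_command(counts).items()):
        print(f"{comm}: {n} report(s)")
```
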
      On the second MDS, I see the following migrate errors and call trace in dmesg:

      LustreError: 9049:0:(mdt_reint.c:1523:mdt_reint_migrate_internal()) scratch-MDT0001: parent [0x3c0000400:0x1:0x0] is still on the same MDT, which should be migrated first: rc = -1
      LustreError: 9049:0:(mdt_reint.c:1523:mdt_reint_migrate_internal()) Skipped 14 previous similar messages
      LustreError: 8154:0:(mdt_reint.c:1160:mdt_reint_link()) scratch-MDT0001: source inode [0x380000405:0x1b7a:0x0] on remote MDT from [0x3c0000402:0x1917:0x0]
      LustreError: 8154:0:(mdt_reint.c:1160:mdt_reint_link()) Skipped 52 previous similar messages
      LustreError: 9052:0:(mdt_reint.c:1514:mdt_reint_migrate_internal()) scratch-MDT0001: source [0x380000404:0x1d4f:0x0] is on the remote MDT
      LustreError: 9052:0:(mdt_reint.c:1514:mdt_reint_migrate_internal()) Skipped 98 previous similar messages
      LustreError: 9040:0:(mdd_dir.c:4021:mdd_migrate()) scratch-MDD0001: [0x3c0000402:0x18b7:0x0]16 is already opened count 1: rc = -16
      LustreError: 9040:0:(mdd_dir.c:4021:mdd_migrate()) Skipped 19 previous similar messages
      LustreError: 9040:0:(mdt_open.c:1580:mdt_cross_open()) scratch-MDT0001: [0x3c0000401:0x1aa4:0x0] doesn't exist!: rc = -14
      LustreError: 9040:0:(mdt_open.c:1580:mdt_cross_open()) Skipped 7 previous similar messages
      INFO: task mdt01_014:9058 blocked for more than 120 seconds.
            Not tainted 2.6.32-431.29.2.el6_lustre.x86_64 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      mdt01_014     D 0000000000000002     0  9058      2 0x00000080
       ffff880f67b45a90 0000000000000046 0000000000000000 ffff88053ece9300
       ffff88053ece9300 ffff88055da93000 ffff880f67b45a90 ffffffffa08b8b29
       ffff880b188de638 ffff880f67b45fd8 000000000000fbc8 ffff880b188de638
      Call Trace:
       [<ffffffffa08b8b29>] ? lu_object_find_try+0x99/0x2b0 [obdclass]
       [<ffffffffa08b8d75>] lu_object_find_at+0x35/0x100 [obdclass]
       [<ffffffff81061d00>] ? default_wake_function+0x0/0x20
       [<ffffffffa08b8e56>] lu_object_find+0x16/0x20 [obdclass]
       [<ffffffffa145d066>] mdt_object_find+0x56/0x170 [mdt]
       [<ffffffffa1466272>] mdt_object_find_lock+0x42/0x170 [mdt]
       [<ffffffffa14840d8>] mdt_lock_slaves+0x228/0x520 [mdt]
       [<ffffffffa1485fb3>] mdt_reint_unlink+0x8c3/0x10c0 [mdt]
       [<ffffffffa08d5880>] ? lu_ucred+0x20/0x30 [obdclass]
       [<ffffffffa145bed5>] ? mdt_ucred+0x15/0x20 [mdt]
       [<ffffffffa147c09d>] mdt_reint_rec+0x5d/0x200 [mdt]
       [<ffffffffa146018b>] mdt_reint_internal+0x4cb/0x7a0 [mdt]
       [<ffffffffa14609eb>] mdt_reint+0x6b/0x120 [mdt]
       [<ffffffffa0ee0ade>] tgt_request_handle+0x6fe/0xaf0 [ptlrpc]
       [<ffffffffa0e90411>] ptlrpc_main+0xe41/0x1950 [ptlrpc]
       [<ffffffffa0e8f5d0>] ? ptlrpc_main+0x0/0x1950 [ptlrpc]
       [<ffffffff8109abf6>] kthread+0x96/0xa0
       [<ffffffff8100c20a>] child_rip+0xa/0x20
       [<ffffffff8109ab60>] ? kthread+0x0/0xa0
       [<ffffffff8100c200>] ? child_rip+0x0/0x20
      INFO: task mdt01_014:9058 blocked for more than 120 seconds.
            Not tainted 2.6.32-431.29.2.el6_lustre.x86_64 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      mdt01_014     D 0000000000000002     0  9058      2 0x00000080
       ffff880f67b45a90 0000000000000046 0000000000000000 ffff88053ece9300
       ffff88053ece9300 ffff88055da93000 ffff880f67b45a90 ffffffffa08b8b29
       ffff880b188de638 ffff880f67b45fd8 000000000000fbc8 ffff880b188de638
      Call Trace:
       [<ffffffffa08b8b29>] ? lu_object_find_try+0x99/0x2b0 [obdclass]
       [<ffffffffa08b8d75>] lu_object_find_at+0x35/0x100 [obdclass]
       [<ffffffff81061d00>] ? default_wake_function+0x0/0x20
       [<ffffffffa08b8e56>] lu_object_find+0x16/0x20 [obdclass]
       [<ffffffffa145d066>] mdt_object_find+0x56/0x170 [mdt]
       [<ffffffffa1466272>] mdt_object_find_lock+0x42/0x170 [mdt]
       [<ffffffffa14840d8>] mdt_lock_slaves+0x228/0x520 [mdt]
       [<ffffffffa1485fb3>] mdt_reint_unlink+0x8c3/0x10c0 [mdt]
       [<ffffffffa08d5880>] ? lu_ucred+0x20/0x30 [obdclass]
       [<ffffffffa145bed5>] ? mdt_ucred+0x15/0x20 [mdt]
       [<ffffffffa147c09d>] mdt_reint_rec+0x5d/0x200 [mdt]
       [<ffffffffa146018b>] mdt_reint_internal+0x4cb/0x7a0 [mdt]
       [<ffffffffa14609eb>] mdt_reint+0x6b/0x120 [mdt]
       [<ffffffffa0ee0ade>] tgt_request_handle+0x6fe/0xaf0 [ptlrpc]
       [<ffffffffa0e90411>] ptlrpc_main+0xe41/0x1950 [ptlrpc]
       [<ffffffffa0e8f5d0>] ? ptlrpc_main+0x0/0x1950 [ptlrpc]
       [<ffffffff8109abf6>] kthread+0x96/0xa0
       [<ffffffff8100c20a>] child_rip+0xa/0x20
       [<ffffffff8109ab60>] ? kthread+0x0/0xa0
       [<ffffffff8100c200>] ? child_rip+0x0/0x20
      Lustre: 9050:0:(service.c:1335:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply
        req@ffff880543aa9c80 x1487703048481864/t0(0) o36->c82a75ed-84d9-3fb3-4192-c864a27ef414@192.168.2.113@o2ib:527/0 lens 488/3128 e 24 to 0 dl 1418838807 ref 2 fl Interpret:/0/0 rc 0/0
      INFO: task mdt01_014:9058 blocked for more than 120 seconds.
            Not tainted 2.6.32-431.29.2.el6_lustre.x86_64 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      mdt01_014     D 0000000000000002     0  9058      2 0x00000080
       ffff880f67b45a90 0000000000000046 0000000000000000 ffff88053ece9300
       ffff88053ece9300 ffff88055da93000 ffff880f67b45a90 ffffffffa08b8b29
       ffff880b188de638 ffff880f67b45fd8 000000000000fbc8 ffff880b188de638
      Call Trace:
       [<ffffffffa08b8b29>] ? lu_object_find_try+0x99/0x2b0 [obdclass]
       [<ffffffffa08b8d75>] lu_object_find_at+0x35/0x100 [obdclass]
       [<ffffffff81061d00>] ? default_wake_function+0x0/0x20
       [<ffffffffa08b8e56>] lu_object_find+0x16/0x20 [obdclass]
       [<ffffffffa145d066>] mdt_object_find+0x56/0x170 [mdt]
       [<ffffffffa1466272>] mdt_object_find_lock+0x42/0x170 [mdt]
       [<ffffffffa14840d8>] mdt_lock_slaves+0x228/0x520 [mdt]
       [<ffffffffa1485fb3>] mdt_reint_unlink+0x8c3/0x10c0 [mdt]
       [<ffffffffa08d5880>] ? lu_ucred+0x20/0x30 [obdclass]
       [<ffffffffa145bed5>] ? mdt_ucred+0x15/0x20 [mdt]
       [<ffffffffa147c09d>] mdt_reint_rec+0x5d/0x200 [mdt]
       [<ffffffffa146018b>] mdt_reint_internal+0x4cb/0x7a0 [mdt]
       [<ffffffffa14609eb>] mdt_reint+0x6b/0x120 [mdt]
       [<ffffffffa0ee0ade>] tgt_request_handle+0x6fe/0xaf0 [ptlrpc]
       [<ffffffffa0e90411>] ptlrpc_main+0xe41/0x1950 [ptlrpc]
       [<ffffffffa0e8f5d0>] ? ptlrpc_main+0x0/0x1950 [ptlrpc]
       [<ffffffff8109abf6>] kthread+0x96/0xa0
       [<ffffffff8100c20a>] child_rip+0xa/0x20
       [<ffffffff8109ab60>] ? kthread+0x0/0xa0
       [<ffffffff8100c200>] ? child_rip+0x0/0x20
      Lustre: scratch-MDT0001: Client c82a75ed-84d9-3fb3-4192-c864a27ef414 (at 192.168.2.113@o2ib) reconnecting
      INFO: task mdt01_014:9058 blocked for more than 120 seconds.
            Not tainted 2.6.32-431.29.2.el6_lustre.x86_64 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      mdt01_014     D 0000000000000002     0  9058      2 0x00000080
       ffff880f67b45a90 0000000000000046 0000000000000000 ffff88053ece9300
       ffff88053ece9300 ffff88055da93000 ffff880f67b45a90 ffffffffa08b8b29
       ffff880b188de638 ffff880f67b45fd8 000000000000fbc8 ffff880b188de638
      Call Trace:
       [<ffffffffa08b8b29>] ? lu_object_find_try+0x99/0x2b0 [obdclass]
       [<ffffffffa08b8d75>] lu_object_find_at+0x35/0x100 [obdclass]
       [<ffffffff81061d00>] ? default_wake_function+0x0/0x20
       [<ffffffffa08b8e56>] lu_object_find+0x16/0x20 [obdclass]
       [<ffffffffa145d066>] mdt_object_find+0x56/0x170 [mdt]
       [<ffffffffa1466272>] mdt_object_find_lock+0x42/0x170 [mdt]
       [<ffffffffa14840d8>] mdt_lock_slaves+0x228/0x520 [mdt]
       [<ffffffffa1485fb3>] mdt_reint_unlink+0x8c3/0x10c0 [mdt]
       [<ffffffffa08d5880>] ? lu_ucred+0x20/0x30 [obdclass]
       [<ffffffffa145bed5>] ? mdt_ucred+0x15/0x20 [mdt]
       [<ffffffffa147c09d>] mdt_reint_rec+0x5d/0x200 [mdt]
       [<ffffffffa146018b>] mdt_reint_internal+0x4cb/0x7a0 [mdt]
       [<ffffffffa14609eb>] mdt_reint+0x6b/0x120 [mdt]
       [<ffffffffa0ee0ade>] tgt_request_handle+0x6fe/0xaf0 [ptlrpc]
       [<ffffffffa0e90411>] ptlrpc_main+0xe41/0x1950 [ptlrpc]
       [<ffffffffa0e8f5d0>] ? ptlrpc_main+0x0/0x1950 [ptlrpc]
       [<ffffffff8109abf6>] kthread+0x96/0xa0
       [<ffffffff8100c20a>] child_rip+0xa/0x20
       [<ffffffff8109ab60>] ? kthread+0x0/0xa0
       [<ffffffff8100c200>] ? child_rip+0x0/0x20
      INFO: task mdt01_014:9058 blocked for more than 120 seconds.
            Not tainted 2.6.32-431.29.2.el6_lustre.x86_64 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      mdt01_014     D 0000000000000002     0  9058      2 0x00000080
       ffff880f67b45a90 0000000000000046 0000000000000000 ffff88053ece9300
       ffff88053ece9300 ffff88055da93000 ffff880f67b45a90 ffffffffa08b8b29
       ffff880b188de638 ffff880f67b45fd8 000000000000fbc8 ffff880b188de638
      Call Trace:
       [<ffffffffa08b8b29>] ? lu_object_find_try+0x99/0x2b0 [obdclass]
       [<ffffffffa08b8d75>] lu_object_find_at+0x35/0x100 [obdclass]
       [<ffffffff81061d00>] ? default_wake_function+0x0/0x20
       [<ffffffffa08b8e56>] lu_object_find+0x16/0x20 [obdclass]
       [<ffffffffa145d066>] mdt_object_find+0x56/0x170 [mdt]
       [<ffffffffa1466272>] mdt_object_find_lock+0x42/0x170 [mdt]
       [<ffffffffa14840d8>] mdt_lock_slaves+0x228/0x520 [mdt]
       [<ffffffffa1485fb3>] mdt_reint_unlink+0x8c3/0x10c0 [mdt]
       [<ffffffffa08d5880>] ? lu_ucred+0x20/0x30 [obdclass]
       [<ffffffffa145bed5>] ? mdt_ucred+0x15/0x20 [mdt]
       [<ffffffffa147c09d>] mdt_reint_rec+0x5d/0x200 [mdt]
       [<ffffffffa146018b>] mdt_reint_internal+0x4cb/0x7a0 [mdt]
       [<ffffffffa14609eb>] mdt_reint+0x6b/0x120 [mdt]
       [<ffffffffa0ee0ade>] tgt_request_handle+0x6fe/0xaf0 [ptlrpc]
       [<ffffffffa0e90411>] ptlrpc_main+0xe41/0x1950 [ptlrpc]
       [<ffffffffa0e8f5d0>] ? ptlrpc_main+0x0/0x1950 [ptlrpc]
       [<ffffffff8109abf6>] kthread+0x96/0xa0
       [<ffffffff8100c20a>] child_rip+0xa/0x20
       [<ffffffff8109ab60>] ? kthread+0x0/0xa0
       [<ffffffff8100c200>] ? child_rip+0x0/0x20
      Lustre: scratch-MDT0001: Client c82a75ed-84d9-3fb3-4192-c864a27ef414 (at 192.168.2.113@o2ib) reconnecting
      INFO: task mdt01_014:9058 blocked for more than 120 seconds.
            Not tainted 2.6.32-431.29.2.el6_lustre.x86_64 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      mdt01_014     D 0000000000000002     0  9058      2 0x00000080
       ffff880f67b45a90 0000000000000046 0000000000000000 ffff88053ece9300
       ffff88053ece9300 ffff88055da93000 ffff880f67b45a90 ffffffffa08b8b29
       ffff880b188de638 ffff880f67b45fd8 000000000000fbc8 ffff880b188de638
      Call Trace:
       [<ffffffffa08b8b29>] ? lu_object_find_try+0x99/0x2b0 [obdclass]
       [<ffffffffa08b8d75>] lu_object_find_at+0x35/0x100 [obdclass]
       [<ffffffff81061d00>] ? default_wake_function+0x0/0x20
       [<ffffffffa08b8e56>] lu_object_find+0x16/0x20 [obdclass]
       [<ffffffffa145d066>] mdt_object_find+0x56/0x170 [mdt]
       [<ffffffffa1466272>] mdt_object_find_lock+0x42/0x170 [mdt]
       [<ffffffffa14840d8>] mdt_lock_slaves+0x228/0x520 [mdt]
       [<ffffffffa1485fb3>] mdt_reint_unlink+0x8c3/0x10c0 [mdt]
       [<ffffffffa08d5880>] ? lu_ucred+0x20/0x30 [obdclass]
       [<ffffffffa145bed5>] ? mdt_ucred+0x15/0x20 [mdt]
       [<ffffffffa147c09d>] mdt_reint_rec+0x5d/0x200 [mdt]
       [<ffffffffa146018b>] mdt_reint_internal+0x4cb/0x7a0 [mdt]
       [<ffffffffa14609eb>] mdt_reint+0x6b/0x120 [mdt]
       [<ffffffffa0ee0ade>] tgt_request_handle+0x6fe/0xaf0 [ptlrpc]
       [<ffffffffa0e90411>] ptlrpc_main+0xe41/0x1950 [ptlrpc]
       [<ffffffffa0e8f5d0>] ? ptlrpc_main+0x0/0x1950 [ptlrpc]
       [<ffffffff8109abf6>] kthread+0x96/0xa0
       [<ffffffff8100c20a>] child_rip+0xa/0x20
       [<ffffffff8109ab60>] ? kthread+0x0/0xa0
       [<ffffffff8100c200>] ? child_rip+0x0/0x20
      

          People

            bobijam Zhenyu Xu
            jamesanunez James Nunez (Inactive)