Lustre: DEBUG MARKER: Sun Feb 9 13:35:01 2014
Lustre: DEBUG MARKER: Sun Feb 9 13:40:01 2014
INFO: task jbd2/dm-2-8:29225 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
jbd2/dm-2-8 D 000000000000000f 0 29225 2 0x00000000
 ffff88107a343d20 0000000000000046 ffff88107a343ce8 ffff88107a343ce4
 0000000000012e40 ffff88189c4cdb00 ffff88109c612e40 0000000000000400
 ffff88107c3bb320 ffff88107a343fd8 000000000000db00 ffff88107c3bb320
Call Trace:
 [] jbd2_journal_commit_transaction+0x19f/0x1530 [jbd2]
 [] ? __switch_to+0xd0/0x320
 [] ? lock_timer_base+0x3c/0x70
 [] ? autoremove_wake_function+0x0/0x40
 [] kjournald2+0xb8/0x220 [jbd2]
 [] ? autoremove_wake_function+0x0/0x40
 [] ? kjournald2+0x0/0x220 [jbd2]
 [] kthread+0x96/0xa0
 [] child_rip+0xa/0x20
 [] ? kthread+0x0/0xa0
 [] ? child_rip+0x0/0x20
INFO: task ldlm_cn_00:29262 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ldlm_cn_00 D 0000000000000013 0 29262 2 0x00000000
 ffff881f1be2bb10 0000000000000046 0000000000000000 ffffffffa08ab080
 ffff8820000000c0 ffff881f1be2bfd8 000000000000db00 0000000000000000
 ffff8820514e13a0 ffff881f1be2bfd8 000000000000db00 ffff8820514e13a0
Call Trace:
 [] ? kiblnd_send+0x2a0/0x9e0 [ko2iblnd]
 [] start_this_handle+0x27a/0x4f0 [jbd2]
 [] ? autoremove_wake_function+0x0/0x40
 [] jbd2_journal_start+0xd0/0x110 [jbd2]
 [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs]
 [] fsfilt_ldiskfs_start+0x77/0x5e0 [fsfilt_ldiskfs]
 [] llog_origin_handle_cancel+0x4b0/0xd70 [ptlrpc]
 [] ? cfs_alloc+0x63/0x90 [libcfs]
 [] ? keys_fill+0x6f/0x1a0 [obdclass]
 [] ldlm_cancel_handler+0x1bf/0x5e0 [ptlrpc]
 [] ptlrpc_main+0xd16/0x1a80 [ptlrpc]
 [] ? __switch_to+0x1ac/0x320
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] child_rip+0xa/0x20
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? child_rip+0x0/0x20
INFO: task ldlm_cn_01:29264 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ldlm_cn_01 D 000000000000001e 0 29264 2 0x00000000
 ffff881f1be2fb10 0000000000000046 0000000000000000 ffffffffa08ab080
 ffff8820000000c0 ffff881f1be2ffd8 000000000000db00 0000000000000000
 ffff8820514e1af0 ffff881f1be2ffd8 000000000000db00 ffff8820514e1af0
Call Trace:
 [] ? kiblnd_send+0x2a0/0x9e0 [ko2iblnd]
 [] start_this_handle+0x27a/0x4f0 [jbd2]
 [] ? autoremove_wake_function+0x0/0x40
 [] jbd2_journal_start+0xd0/0x110 [jbd2]
 [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs]
 [] fsfilt_ldiskfs_start+0x77/0x5e0 [fsfilt_ldiskfs]
 [] llog_origin_handle_cancel+0x4b0/0xd70 [ptlrpc]
 [] ? cfs_alloc+0x63/0x90 [libcfs]
 [] ? keys_fill+0x6f/0x1a0 [obdclass]
 [] ldlm_cancel_handler+0x1bf/0x5e0 [ptlrpc]
 [] ptlrpc_main+0xd16/0x1a80 [ptlrpc]
 [] ? __switch_to+0x1ac/0x320
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] child_rip+0xa/0x20
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? child_rip+0x0/0x20
INFO: task ldlm_cb_00:29265 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ldlm_cb_00 D 000000000000000b 0 29265 2 0x00000000
 ffff881f1be33b00 0000000000000046 000500030a653ea5 0000000000003039
 ffff88189c412e40 ffff88087a7a3880 ffff881b3b062400 ffff881b3b062400
 ffff8820793ca360 ffff881f1be33fd8 000000000000db00 ffff8820793ca360
Call Trace:
 [] rwsem_down_failed_common+0x95/0x1f0
 [] ? cfs_hash_lookup+0x82/0xa0 [libcfs]
 [] rwsem_down_write_failed+0x23/0x30
 [] call_rwsem_down_write_failed+0x13/0x20
 [] ? down_write+0x32/0x40
 [] dqacq_handler+0x35e/0xd20 [lquota]
 [] ? __req_capsule_get+0x176/0x750 [ptlrpc]
 [] ? lustre_swab_qdata+0x0/0x30 [ptlrpc]
 [] target_handle_dqacq_callback+0x668/0xb90 [ptlrpc]
 [] ? dqacq_handler+0x0/0xd20 [lquota]
 [] ldlm_callback_handler+0xa17/0x1ff0 [ptlrpc]
 [] ? keys_fill+0x6f/0x1a0 [obdclass]
 [] ? lustre_msg_get_transno+0x8c/0x100 [ptlrpc]
 [] ? ptlrpc_update_export_timer+0x4b/0x4a0 [ptlrpc]
 [] ptlrpc_main+0xd16/0x1a80 [ptlrpc]
 [] ? __switch_to+0x1ac/0x320
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] child_rip+0xa/0x20
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? child_rip+0x0/0x20
INFO: task ldlm_cb_01:29266 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ldlm_cb_01 D 0000000000000015 0 29266 2 0x00000000
 ffff881f1be37b00 0000000000000046 0000000000000000 0000000000003039
 ffff88189c452e40 ffff88087a7a3880 ffff880b39cb3800 ffff880b39cb3800
 ffff8820793caab0 ffff881f1be37fd8 000000000000db00 ffff8820793caab0
Call Trace:
 [] rwsem_down_failed_common+0x95/0x1f0
 [] ? cfs_hash_lookup+0x82/0xa0 [libcfs]
 [] rwsem_down_write_failed+0x23/0x30
 [] call_rwsem_down_write_failed+0x13/0x20
 [] ? down_write+0x32/0x40
 [] dqacq_handler+0x35e/0xd20 [lquota]
 [] ? __req_capsule_get+0x176/0x750 [ptlrpc]
 [] ? lustre_swab_qdata+0x0/0x30 [ptlrpc]
 [] target_handle_dqacq_callback+0x668/0xb90 [ptlrpc]
 [] ? dqacq_handler+0x0/0xd20 [lquota]
 [] ldlm_callback_handler+0xa17/0x1ff0 [ptlrpc]
 [] ? keys_fill+0x6f/0x1a0 [obdclass]
 [] ? lustre_msg_get_transno+0x8c/0x100 [ptlrpc]
 [] ? ptlrpc_update_export_timer+0x4b/0x4a0 [ptlrpc]
 [] ptlrpc_main+0xd16/0x1a80 [ptlrpc]
 [] ? __switch_to+0x1ac/0x320
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] child_rip+0xa/0x20
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? child_rip+0x0/0x20
INFO: task mdt_01:29286 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mdt_01 D 0000000000000019 0 29286 2 0x00000000
 ffff8801288079b0 0000000000000046 0000000000000000 ffff88121a38ab28
 ffff88121a38aad0 ffffffffa0c5364d 0000000000000004 ffff880dc8a8c2b0
 ffff88087505ea70 ffff880128807fd8 000000000000db00 ffff88087505ea70
Call Trace:
 [] start_this_handle+0x27a/0x4f0 [jbd2]
 [] ? generic_getxattr+0x87/0x90
 [] ? autoremove_wake_function+0x0/0x40
 [] jbd2_journal_start+0xd0/0x110 [jbd2]
 [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs]
 [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs]
 [] mdd_trans_start+0x33/0x40 [mdd]
 [] mdd_unlink+0x21b/0x940 [mdd]
 [] ? lustre_msg_get_versions+0xa4/0x120 [ptlrpc]
 [] cml_unlink+0x97/0x200 [cmm]
 [] ? mdt_version_get_save+0x91/0xd0 [mdt]
 [] mdt_reint_unlink+0x634/0x9b0 [mdt]
 [] mdt_reint_rec+0x41/0xe0 [mdt]
 [] mdt_reint_internal+0x544/0x8e0 [mdt]
 [] mdt_reint+0x44/0xe0 [mdt]
 [] mdt_handle_common+0x932/0x1750 [mdt]
 [] mdt_regular_handle+0x15/0x20 [mdt]
 [] ptlrpc_main+0xd16/0x1a80 [ptlrpc]
 [] ? __switch_to+0x1ac/0x320
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] child_rip+0xa/0x20
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? child_rip+0x0/0x20
INFO: task mdt_rdpg_00:29287 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mdt_rdpg_00 D 000000000000001e 0 29287 2 0x00000000
 ffff8803e5acba00 0000000000000046 ffff8803e5acb9c8 ffff8803e5acb9c4
 ffff88087a7a3280 ffff88109c7cdb00 ffff88109c612e40 0000000000000480
 ffff88086b32a3e0 ffff8803e5acbfd8 000000000000db00 ffff88086b32a3e0
Call Trace:
 [] start_this_handle+0x27a/0x4f0 [jbd2]
 [] ? autoremove_wake_function+0x0/0x40
 [] jbd2_journal_start+0xd0/0x110 [jbd2]
 [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs]
 [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs]
 [] mdd_trans_start+0x33/0x40 [mdd]
 [] mdd_attr_set+0x382/0x1d90 [mdd]
 [] ? null_alloc_rs+0x1ab/0x3b0 [ptlrpc]
 [] ? sptlrpc_svc_alloc_rs+0x74/0x2d0 [ptlrpc]
 [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc]
 [] cml_attr_set+0x66/0x1a0 [cmm]
 [] ? lustre_pack_reply_flags+0xb6/0x210 [ptlrpc]
 [] mdt_mfd_close+0x4de/0x700 [mdt]
 [] mdt_close+0x66a/0x850 [mdt]
 [] mdt_handle_common+0x932/0x1750 [mdt]
 [] mdt_readpage_handle+0x15/0x20 [mdt]
 [] ptlrpc_main+0xd16/0x1a80 [ptlrpc]
 [] ? __switch_to+0x1ac/0x320
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] child_rip+0xa/0x20
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? child_rip+0x0/0x20
INFO: task ldlm_cb_02:30327 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ldlm_cb_02 D 0000000000000014 0 30327 2 0x00000000
 ffff8805c607bb00 0000000000000046 000500020a643ec0 0000000000003039
 ffff88002834dde0 ffff88087a7a3280 ffff8802156b8000 ffff8802156b8000
 ffff8804dcb84320 ffff8805c607bfd8 000000000000db00 ffff8804dcb84320
Call Trace:
 [] rwsem_down_failed_common+0x95/0x1f0
 [] ? cfs_hash_lookup+0x82/0xa0 [libcfs]
 [] rwsem_down_write_failed+0x23/0x30
 [] call_rwsem_down_write_failed+0x13/0x20
 [] ? down_write+0x32/0x40
 [] dqacq_handler+0x35e/0xd20 [lquota]
 [] ? __req_capsule_get+0x176/0x750 [ptlrpc]
 [] ? lustre_swab_qdata+0x0/0x30 [ptlrpc]
 [] target_handle_dqacq_callback+0x668/0xb90 [ptlrpc]
 [] ? dqacq_handler+0x0/0xd20 [lquota]
 [] ldlm_callback_handler+0xa17/0x1ff0 [ptlrpc]
 [] ? keys_fill+0x6f/0x1a0 [obdclass]
 [] ? lustre_msg_get_transno+0x8c/0x100 [ptlrpc]
 [] ? ptlrpc_update_export_timer+0x4b/0x4a0 [ptlrpc]
 [] ptlrpc_main+0xd16/0x1a80 [ptlrpc]
 [] ? __switch_to+0x1ac/0x320
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] child_rip+0xa/0x20
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? child_rip+0x0/0x20
INFO: task ldlm_cb_03:30332 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ldlm_cb_03 D 000000000000000c 0 30332 2 0x00000000
 ffff88077cb73b00 0000000000000046 ffff88077cb73ac8 ffff88077cb73ac4
 ffff88077cb73ab0 ffff8800282cdb00 ffff880028292e40 0000000000000400
 ffff88086ae21320 ffff88077cb73fd8 000000000000db00 ffff88086ae21320
Call Trace:
 [] rwsem_down_failed_common+0x95/0x1f0
 [] ? cfs_hash_lookup+0x82/0xa0 [libcfs]
 [] rwsem_down_write_failed+0x23/0x30
 [] call_rwsem_down_write_failed+0x13/0x20
 [] ? down_write+0x32/0x40
 [] dqacq_handler+0x35e/0xd20 [lquota]
 [] ? __req_capsule_get+0x176/0x750 [ptlrpc]
 [] ? lustre_swab_qdata+0x0/0x30 [ptlrpc]
 [] target_handle_dqacq_callback+0x668/0xb90 [ptlrpc]
 [] ? dqacq_handler+0x0/0xd20 [lquota]
 [] ldlm_callback_handler+0xa17/0x1ff0 [ptlrpc]
 [] ? keys_fill+0x6f/0x1a0 [obdclass]
 [] ? lustre_msg_get_transno+0x8c/0x100 [ptlrpc]
 [] ? ptlrpc_update_export_timer+0x4b/0x4a0 [ptlrpc]
 [] ptlrpc_main+0xd16/0x1a80 [ptlrpc]
 [] ? __switch_to+0x1ac/0x320
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] child_rip+0xa/0x20
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? child_rip+0x0/0x20
INFO: task ldlm_cb_04:30351 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ldlm_cb_04 D 0000000000000003 0 30351 2 0x00000000
 ffff880dfa31fb00 0000000000000046 ffff880d00000000 000000030000000d
 ffff88189c40dde0 ffff881e69f5f400 ffff88087a7a3280 0000000000000000
 ffff88107b689360 ffff880dfa31ffd8 000000000000db00 ffff88107b689360
Call Trace:
 [] rwsem_down_failed_common+0x95/0x1f0
 [] ? cfs_hash_lookup+0x82/0xa0 [libcfs]
 [] rwsem_down_write_failed+0x23/0x30
 [] call_rwsem_down_write_failed+0x13/0x20
 [] ? down_write+0x32/0x40
 [] dqacq_handler+0x35e/0xd20 [lquota]
 [] ? __req_capsule_get+0x176/0x750 [ptlrpc]
 [] ? lustre_swab_qdata+0x0/0x30 [ptlrpc]
 [] target_handle_dqacq_callback+0x668/0xb90 [ptlrpc]
 [] ? dqacq_handler+0x0/0xd20 [lquota]
 [] ldlm_callback_handler+0xa17/0x1ff0 [ptlrpc]
 [] ? keys_fill+0x6f/0x1a0 [obdclass]
 [] ? lustre_msg_get_transno+0x8c/0x100 [ptlrpc]
 [] ? ptlrpc_update_export_timer+0x4b/0x4a0 [ptlrpc]
 [] ptlrpc_main+0xd16/0x1a80 [ptlrpc]
 [] ? __switch_to+0x1ac/0x320
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] child_rip+0xa/0x20
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? child_rip+0x0/0x20
LustreError: 15094:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391949472, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff8806e2389000 lock: ffff88114f4bd900/0x7369ae788935da03 lrc: 3/0,1 mode: --/EX res: 3/1 bits 0x2 rrc: 9 type: IBT flags: 0x4004030 remote: 0x0 expref: -99 pid: 15094 timeout: 0
LustreError: dumping log to /tmp/lustre-log.1391949672.15094
LustreError: 15446:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391949477, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff8806e2389000 lock: ffff881ea4f39d80/0x7369ae7889362a98 lrc: 3/0,1 mode: --/EX res: 3/1 bits 0x2 rrc: 9 type: IBT flags: 0x4004030 remote: 0x0 expref: -99 pid: 15446 timeout: 0
LustreError: 15547:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391949478, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff8806e2389000 lock: ffff8805e06a5d80/0x7369ae7889363e17 lrc: 3/0,1 mode: --/EX res: 3/1 bits 0x2 rrc: 9 type: IBT flags: 0x4004030 remote: 0x0 expref: -99 pid: 15547 timeout: 0
LustreError: 15245:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391949560, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff8806e2389000 lock: ffff881146312000/0x7369ae788939cdfa lrc: 3/0,1 mode: --/EX res: 3/1 bits 0x2 rrc: 9 type: IBT flags: 0x4004030 remote: 0x0 expref: -99 pid: 15245 timeout: 0
LustreError: 15497:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391949568, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff8806e2389000 lock: ffff880df5efeb40/0x7369ae788939ff32 lrc: 3/0,1 mode: --/EX res: 3/1 bits 0x2 rrc: 9 type: IBT flags: 0x4004030 remote: 0x0 expref: -99 pid: 15497 timeout: 0
LustreError: 15497:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) Skipped 2 previous similar messages
Lustre: DEBUG MARKER: Sun Feb 9 13:45:01 2014
Lustre: DEBUG MARKER: Sun Feb 9 13:50:01 2014
Lustre: Service thread pid 14935 was inactive for 800.00s. The thread might be hung, or it might only be slow and will resume later.
Dumping the stack trace for debugging purposes:
Pid: 14935, comm: mdt_rdpg_36
Call Trace:
 [] ? wake_up_process+0x15/0x20
 [] ? __mutex_unlock_slowpath+0x44/0x60
 [] ? fair_check_preempt_wakeup+0xb2/0x180
 [] start_this_handle+0x27a/0x4f0 [jbd2]
 [] ? autoremove_wake_function+0x0/0x40
 [] jbd2_journal_start+0xd0/0x110 [jbd2]
 [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs]
 [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs]
 [] mdt_trans_start+0x5c/0x70 [mdt]
 [] mdt_empty_transno+0x45b/0x550 [mdt]
 [] ? mdt_handle_last_unlink+0x1e7/0x4d0 [mdt]
 [] mdt_close+0x692/0x850 [mdt]
 [] mdt_handle_common+0x932/0x1750 [mdt]
 [] mdt_readpage_handle+0x15/0x20 [mdt]
 [] ptlrpc_main+0xd16/0x1a80 [ptlrpc]
 [] ? __switch_to+0x1ac/0x320
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] child_rip+0xa/0x20
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? child_rip+0x0/0x20
LustreError: dumping log to /tmp/lustre-log.1391950269.14935
Lustre: Service thread pid 15570 was inactive for 800.04s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Pid: 15570, comm: mdt_488
Call Trace:
 [] ? kiblnd_post_tx_locked+0x4e7/0x9e0 [ko2iblnd]
 [] start_this_handle+0x27a/0x4f0 [jbd2]
 [] ? autoremove_wake_function+0x0/0x40
 [] jbd2_journal_start+0xd0/0x110 [jbd2]
 [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs]
 [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs]
 [] mdd_trans_start+0x33/0x40 [mdd]
 [] mdd_attr_set+0x382/0x1d90 [mdd]
 [] ? ldlm_completion_ast+0x0/0x730 [ptlrpc]
 [] ? mdt_blocking_ast+0x0/0x2a0 [mdt]
 [] ? mdt_object_lock+0x320/0xb70 [mdt]
 [] ? ldlm_completion_ast+0x0/0x730 [ptlrpc]
 [] cml_attr_set+0x66/0x1a0 [cmm]
 [] mdt_attr_set+0x2a8/0x590 [mdt]
 [] mdt_reint_setattr+0x365/0x1310 [mdt]
 [] mdt_reint_rec+0x41/0xe0 [mdt]
 [] mdt_reint_internal+0x544/0x8e0 [mdt]
 [] mdt_reint+0x44/0xe0 [mdt]
 [] mdt_handle_common+0x932/0x1750 [mdt]
 [] mdt_regular_handle+0x15/0x20 [mdt]
 [] ptlrpc_main+0xd16/0x1a80 [ptlrpc]
 [] ? __switch_to+0x1ac/0x320
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] child_rip+0xa/0x20
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? child_rip+0x0/0x20
Pid: 31764, comm: ldlm_cb_88
Call Trace:
 [] rwsem_down_failed_common+0x95/0x1f0
 [] ? cfs_hash_lookup+0x82/0xa0 [libcfs]
 [] rwsem_down_write_failed+0x23/0x30
 [] call_rwsem_down_write_failed+0x13/0x20
 [] ? down_write+0x32/0x40
 [] dqacq_handler+0x35e/0xd20 [lquota]
 [] ? __req_capsule_get+0x176/0x750 [ptlrpc]
 [] ? lustre_swab_qdata+0x0/0x30 [ptlrpc]
 [] target_handle_dqacq_callback+0x668/0xb90 [ptlrpc]
 [] ? dqacq_handler+0x0/0xd20 [lquota]
 [] ldlm_callback_handler+0xa17/0x1ff0 [ptlrpc]
 [] ? keys_fill+0x6f/0x1a0 [obdclass]
 [] ? lustre_msg_get_transno+0x8c/0x100 [ptlrpc]
 [] ? ptlrpc_update_export_timer+0x4b/0x4a0 [ptlrpc]
 [] ptlrpc_main+0xd16/0x1a80 [ptlrpc]
 [] ? __switch_to+0x1ac/0x320
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] child_rip+0xa/0x20
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? child_rip+0x0/0x20
Pid: 30448, comm: ldlm_cb_07
Call Trace:
 [] rwsem_down_failed_common+0x95/0x1f0
 [] ? cfs_hash_lookup+0x82/0xa0 [libcfs]
 [] rwsem_down_write_failed+0x23/0x30
 [] call_rwsem_down_write_failed+0x13/0x20
 [] ? down_write+0x32/0x40
 [] dqacq_handler+0x35e/0xd20 [lquota]
 [] ? __req_capsule_get+0x176/0x750 [ptlrpc]
 [] ? lustre_swab_qdata+0x0/0x30 [ptlrpc]
 [] target_handle_dqacq_callback+0x668/0xb90 [ptlrpc]
 [] ? dqacq_handler+0x0/0xd20 [lquota]
 [] ldlm_callback_handler+0xa17/0x1ff0 [ptlrpc]
 [] ? keys_fill+0x6f/0x1a0 [obdclass]
 [] ? lustre_msg_get_transno+0x8c/0x100 [ptlrpc]
 [] ? ptlrpc_update_export_timer+0x4b/0x4a0 [ptlrpc]
 [] ptlrpc_main+0xd16/0x1a80 [ptlrpc]
 [] ? __switch_to+0x1ac/0x320
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] child_rip+0xa/0x20
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? child_rip+0x0/0x20
Pid: 15276, comm: mdt_245
Call Trace:
 [] start_this_handle+0x27a/0x4f0 [jbd2]
 [] ? generic_getxattr+0x87/0x90
 [] ? autoremove_wake_function+0x0/0x40
 [] jbd2_journal_start+0xd0/0x110 [jbd2]
 [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs]
 [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs]
 [] mdd_trans_start+0x33/0x40 [mdd]
 [] mdd_rename+0x1ff/0x22b0 [mdd]
 [] ? mdd_la_get+0xad/0xb0 [mdd]
 [] ? mdd_iattr_get+0x120/0x280 [mdd]
 [] ? lu_object_put+0x96/0x290 [obdclass]
 [] ? cmm_mode_get+0x103/0x2e0 [cmm]
 [] cml_rename+0x2c4/0x9c0 [cmm]
 [] ? mdt_rename_sanity+0x177/0x510 [mdt]
 [] mdt_reint_rename+0x1c41/0x21a0 [mdt]
 [] ? mdt_root_squash+0x2c/0x3e0 [mdt]
 [] mdt_reint_rec+0x41/0xe0 [mdt]
 [] mdt_reint_internal+0x544/0x8e0 [mdt]
 [] mdt_reint+0x44/0xe0 [mdt]
 [] mdt_handle_common+0x932/0x1750 [mdt]
 [] mdt_regular_handle+0x15/0x20 [mdt]
 [] ptlrpc_main+0xd16/0x1a80 [ptlrpc]
 [] ? __switch_to+0x1ac/0x320
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] child_rip+0xa/0x20
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? child_rip+0x0/0x20
Lustre: Service thread pid 15480 was inactive for 800.07s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Lustre: Service thread pid 15414 was inactive for 800.10s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
LustreError: dumping log to /tmp/lustre-log.1391950269.14938
LustreError: dumping log to /tmp/lustre-log.1391950269.16531
LustreError: dumping log to /tmp/lustre-log.1391950270.15246
Lustre: Service thread pid 31755 was inactive for 800.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Lustre: Skipped 233 previous similar messages
LustreError: dumping log to /tmp/lustre-log.1391950270.31755
Lustre: Service thread pid 15094 was inactive for 800.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
LustreError: dumping log to /tmp/lustre-log.1391950272.15094
LustreError: dumping log to /tmp/lustre-log.1391950272.15405
LustreError: dumping log to /tmp/lustre-log.1391950272.15385
LustreError: dumping log to /tmp/lustre-log.1391950272.15550
LustreError: dumping log to /tmp/lustre-log.1391950272.15388
LustreError: dumping log to /tmp/lustre-log.1391950272.15237
LustreError: dumping log to /tmp/lustre-log.1391950272.15470
LustreError: dumping log to /tmp/lustre-log.1391950273.15330
LustreError: dumping log to /tmp/lustre-log.1391950273.15510
Lustre: Service thread pid 15259 was inactive for 800.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Lustre: Skipped 11 previous similar messages
LustreError: dumping log to /tmp/lustre-log.1391950275.15259
LustreError: dumping log to /tmp/lustre-log.1391950276.2143
LustreError: dumping log to /tmp/lustre-log.1391950277.15446
LustreError: dumping log to /tmp/lustre-log.1391950277.14975
LustreError: dumping log to /tmp/lustre-log.1391950278.15547
LustreError: dumping log to /tmp/lustre-log.1391950278.15273
Lustre: Service thread pid 15076 was inactive for 800.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Lustre: Skipped 6 previous similar messages LustreError: dumping log to /tmp/lustre-log.1391950280.15076 LustreError: dumping log to /tmp/lustre-log.1391950280.15500 LustreError: dumping log to /tmp/lustre-log.1391950281.15119 LustreError: dumping log to /tmp/lustre-log.1391950281.16143 LustreError: dumping log to /tmp/lustre-log.1391950281.15049 LustreError: dumping log to /tmp/lustre-log.1391950283.16603 LustreError: dumping log to /tmp/lustre-log.1391950284.16148 LustreError: dumping log to /tmp/lustre-log.1391950284.14962 LustreError: dumping log to /tmp/lustre-log.1391950285.16146 LustreError: dumping log to /tmp/lustre-log.1391950285.15614 LustreError: dumping log to /tmp/lustre-log.1391950285.16601 LustreError: dumping log to /tmp/lustre-log.1391950285.16557 LustreError: dumping log to /tmp/lustre-log.1391950286.15098 LustreError: dumping log to /tmp/lustre-log.1391950286.15050 LustreError: dumping log to /tmp/lustre-log.1391950287.15166 LustreError: dumping log to /tmp/lustre-log.1391950288.15419 Lustre: Service thread pid 14941 was inactive for 800.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. Lustre: Skipped 41 previous similar messages LustreError: dumping log to /tmp/lustre-log.1391950289.14941 LustreError: dumping log to /tmp/lustre-log.1391950294.15543 LustreError: dumping log to /tmp/lustre-log.1391950295.16119 LustreError: dumping log to /tmp/lustre-log.1391950297.15249 LustreError: dumping log to /tmp/lustre-log.1391950300.15008 LustreError: dumping log to /tmp/lustre-log.1391950301.15151 LustreError: dumping log to /tmp/lustre-log.1391950302.15558 LustreError: dumping log to /tmp/lustre-log.1391950302.15597 Lustre: Service thread pid 15252 was inactive for 800.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. Lustre: Skipped 70 previous similar messages LustreError: dumping log to /tmp/lustre-log.1391950305.15252 LustreError: dumping log to /tmp/lustre-log.1391950314.2126 LustreError: dumping log to /tmp/lustre-log.1391950317.15343 LustreError: dumping log to /tmp/lustre-log.1391950317.15386 LustreError: dumping log to /tmp/lustre-log.1391950318.15025 LustreError: dumping log to /tmp/lustre-log.1391950318.2162 LustreError: dumping log to /tmp/lustre-log.1391950322.15425 LustreError: dumping log to /tmp/lustre-log.1391950330.15290 LustreError: dumping log to /tmp/lustre-log.1391950330.15441 LustreError: dumping log to /tmp/lustre-log.1391950331.15021 LustreError: dumping log to /tmp/lustre-log.1391950335.15027 Lustre: Service thread pid 15568 was inactive for 800.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
Lustre: Skipped 12 previous similar messages LustreError: dumping log to /tmp/lustre-log.1391950345.15568 LustreError: dumping log to /tmp/lustre-log.1391950346.15468 LustreError: dumping log to /tmp/lustre-log.1391950360.15245 LustreError: dumping log to /tmp/lustre-log.1391950360.15439 LustreError: dumping log to /tmp/lustre-log.1391950361.15284 LustreError: dumping log to /tmp/lustre-log.1391950362.15448 LustreError: dumping log to /tmp/lustre-log.1391950362.15406 LustreError: dumping log to /tmp/lustre-log.1391950363.15288 LustreError: dumping log to /tmp/lustre-log.1391950364.15437 LustreError: dumping log to /tmp/lustre-log.1391950365.15255 LustreError: dumping log to /tmp/lustre-log.1391950368.15497 LustreError: dumping log to /tmp/lustre-log.1391950371.15109 LustreError: dumping log to /tmp/lustre-log.1391950375.16111 LustreError: dumping log to /tmp/lustre-log.1391950376.14925 LustreError: dumping log to /tmp/lustre-log.1391950383.15408 LustreError: dumping log to /tmp/lustre-log.1391950383.14950 LustreError: dumping log to /tmp/lustre-log.1391950384.15226 LustreError: dumping log to /tmp/lustre-log.1391950384.15393 LustreError: dumping log to /tmp/lustre-log.1391950384.15536 LustreError: dumping log to /tmp/lustre-log.1391950384.15563 LustreError: dumping log to /tmp/lustre-log.1391950384.15374 LustreError: dumping log to /tmp/lustre-log.1391950384.15222 LustreError: dumping log to /tmp/lustre-log.1391950384.15287 LustreError: dumping log to /tmp/lustre-log.1391950385.15204 LustreError: dumping log to /tmp/lustre-log.1391950385.15553 LustreError: dumping log to /tmp/lustre-log.1391950385.15238 LustreError: dumping log to /tmp/lustre-log.1391950385.15359 LustreError: dumping log to /tmp/lustre-log.1391950385.14995 LustreError: dumping log to /tmp/lustre-log.1391950385.15032 LustreError: dumping log to /tmp/lustre-log.1391950385.15371 LustreError: dumping log to /tmp/lustre-log.1391950385.15554 LustreError: dumping log to /tmp/lustre-log.1391950385.15411 LustreError: dumping log to /tmp/lustre-log.1391950385.15493 LustreError: dumping log to /tmp/lustre-log.1391950385.15058 LustreError: dumping log to /tmp/lustre-log.1391950385.15144 LustreError: dumping log to /tmp/lustre-log.1391950385.15589 LustreError: dumping log to /tmp/lustre-log.1391950386.15267 LustreError: dumping log to /tmp/lustre-log.1391950386.15045 LustreError: dumping log to /tmp/lustre-log.1391950386.14994 LustreError: dumping log to /tmp/lustre-log.1391950386.14954 LustreError: dumping log to /tmp/lustre-log.1391950386.15362 LustreError: dumping log to /tmp/lustre-log.1391950387.15244 LustreError: dumping log to /tmp/lustre-log.1391950387.14943 LustreError: dumping log to /tmp/lustre-log.1391950402.15011 LustreError: dumping log to /tmp/lustre-log.1391950405.16084 LustreError: dumping log to /tmp/lustre-log.1391950408.15059 Lustre: Service thread pid 15569 was inactive for 800.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
Lustre: Skipped 57 previous similar messages
LustreError: dumping log to /tmp/lustre-log.1391950419.15569
LustreError: dumping log to /tmp/lustre-log.1391950420.15187
LustreError: dumping log to /tmp/lustre-log.1391950422.15572
LustreError: dumping log to /tmp/lustre-log.1391950423.15327
LustreError: dumping log to /tmp/lustre-log.1391950435.15161
LustreError: dumping log to /tmp/lustre-log.1391950435.15526
LustreError: dumping log to /tmp/lustre-log.1391950441.15218
LustreError: dumping log to /tmp/lustre-log.1391950458.15360
LustreError: dumping log to /tmp/lustre-log.1391950501.15581
LustreError: dumping log to /tmp/lustre-log.1391950501.15342
Lustre: DEBUG MARKER: Sun Feb 9 13:55:01 2014
Lustre: Service thread pid 14972 was inactive for 800.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Lustre: Skipped 19 previous similar messages
LustreError: dumping log to /tmp/lustre-log.1391950568.14972
Lustre: 14924:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-559), not sending early reply req@ffff8810e0a6a800 x1458487679306033/t0(0) o35->8b5deb59-2774-a88d-1f63-34c13a33830b@JO.BOO.II.F@o2ib2:0/0 lens 360/9672 e 5 to 0 dl 1391950632 ref 2 fl Interpret:/0/0 rc 0/0
Lustre: 14924:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-559), not sending early reply req@ffff8816821adc00 x1458487689517477/t0(0) o35->899fd913-3e2e-d480-5a8e-511aa57ea18d@JO.BOO.II.T@o2ib2:0/0 lens 360/9672 e 5 to 0 dl 1391950632 ref 2 fl Interpret:/0/0 rc 0/0
Lustre: 14924:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-559), not sending early reply req@ffff8812971fd000 x1459155716325131/t0(0) o35->47eeb18b-57b8-aade-f081-cbd5ccf3eb8a@JO.BOO.AL.PB@o2ib2:0/0 lens 360/9672 e 5 to 0 dl 1391950633 ref 2 fl Interpret:/0/0 rc 0/0
Lustre: 14924:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 238 previous similar messages
Lustre: 32035:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-559), not sending early reply req@ffff8813930dfc00 x1458475096510165/t0(0) o601->LOV_OSC_UUID@JO.BOO.AL.BZB@o2ib2:0/0 lens 224/224 e 5 to 0 dl 1391950634 ref 2 fl Interpret:/0/0 rc 0/0
Lustre: 32035:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 74 previous similar messages
Lustre: 15458:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-559), not sending early reply req@ffff88175f5ca400 x1458479645846443/t0(0) o36->6db3b954-bb48-db6b-032a-d21c9c354193@JO.BOO.PI.BIP@o2ib2:0/0 lens 576/9672 e 5 to 0 dl 1391950636 ref 2 fl Interpret:/0/0 rc 0/0
Lustre: 15458:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 73 previous similar messages
Lustre: 14924:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-522), not sending early reply req@ffff88157ba2b400 x1458492268449907/t0(0) o35->79f7512c-7a40-b7fa-b7da-e10592a81c3c@JO.BOO.LZ.LOZ@o2ib2:0/0 lens 360/9672 e 4 to 0 dl 1391950641 ref 2 fl Interpret:/0/0 rc 0/0
Lustre: 14924:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 34 previous similar messages
Lustre: 32035:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-522), not sending early reply req@ffff8816dd358400 x1458475045713550/t0(0) o601->LOV_OSC_UUID@JO.BOO.AL.BAW@o2ib2:0/0 lens 224/0 e 4 to 0 dl 1391950649 ref 2 fl New:/0/ffffffff rc 0/-1
Lustre: 32035:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 57 previous similar messages
Lustre: Service thread pid 2151 was inactive for 800.00s.
The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: Lustre: Skipped 3 previous similar messages Pid: 2151, comm: mdt_rdpg_458 Call Trace: [] ? kiblnd_check_sends+0x2da/0x670 [ko2iblnd] [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdd_trans_start+0x33/0x40 [mdd] [] mdd_attr_set+0x382/0x1d90 [mdd] [] ? null_alloc_rs+0x1ab/0x3b0 [ptlrpc] [] ? sptlrpc_svc_alloc_rs+0x74/0x2d0 [ptlrpc] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] cml_attr_set+0x66/0x1a0 [cmm] [] ? lustre_pack_reply_flags+0xb6/0x210 [ptlrpc] [] mdt_mfd_close+0x4de/0x700 [mdt] [] mdt_close+0x66a/0x850 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_readpage_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 LustreError: dumping log to /tmp/lustre-log.1391950650.2151 Lustre: 14924:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-559), not sending early reply req@ffff8816dd211000 x1458479645846476/t0(0) o35->6db3b954-bb48-db6b-032a-d21c9c354193@JO.BOO.PI.BIP@o2ib2:0/0 lens 360/9672 e 5 to 0 dl 1391950666 ref 2 fl Interpret:/0/0 rc 0/0 Lustre: 14924:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 74 previous similar messages Lustre: 32035:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-546), not sending early reply req@ffff8816e431d000 x1458475058409001/t0(0) o601->LOV_OSC_UUID@JO.BOO.AL.BAI@o2ib2:0/0 lens 224/0 e 4 to 0 dl 1391950701 ref 2 fl New:/0/ffffffff rc 0/-1 Lustre: 32035:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 37 previous similar messages Lustre: Service thread pid 15492 was inactive for 800.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: Pid: 15492, comm: mdt_410 Call Trace: [] ? read_tsc+0x9/0x20 [] ? getnstimeofday+0x60/0xf0 [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdd_trans_start+0x33/0x40 [mdd] [] mdd_create+0x879/0x1a90 [mdd] [] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [] ? lustre_msg_buf+0x5d/0x60 [ptlrpc] [] ? __req_capsule_get+0x176/0x750 [ptlrpc] [] cml_create+0x97/0x250 [cmm] [] ? mdt_version_get_check_save+0x6e/0xf0 [mdt] [] mdt_md_create+0x4df/0x6c0 [mdt] [] ? __ldlm_handle2lock+0x39/0x330 [ptlrpc] [] mdt_reint_create+0x1b8/0x780 [mdt] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_reint+0x44/0xe0 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 LustreError: dumping log to /tmp/lustre-log.1391950740.15492 Lustre: Service thread pid 15012 was inactive for 800.00s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: Pid: 15012, comm: mdt_75 Call Trace: [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdd_trans_start+0x33/0x40 [mdd] [] mdd_create+0x879/0x1a90 [mdd] [] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [] cml_create+0x97/0x250 [cmm] [] ? mdt_version_get_save+0x91/0xd0 [mdt] [] mdt_reint_open+0x1939/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 LustreError: dumping log to /tmp/lustre-log.1391950751.15012 Lustre: 15346:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-559), not sending early reply req@ffff88180cda6c00 x1459155716325938/t0(0) o101->47eeb18b-57b8-aade-f081-cbd5ccf3eb8a@JO.BOO.AL.PB@o2ib2:0/0 lens 592/4936 e 3 to 0 dl 1391950766 ref 2 fl Interpret:/0/0 rc 0/0 Lustre: 15346:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 71 previous similar messages Lustre: Service thread pid 15280 was inactive for 800.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: Pid: 15280, comm: mdt_249 Call Trace: [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdd_trans_start+0x33/0x40 [mdd] [] mdd_create+0x879/0x1a90 [mdd] [] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [] cml_create+0x97/0x250 [cmm] [] ? mdt_version_get_save+0x91/0xd0 [mdt] [] mdt_reint_open+0x1939/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 LustreError: dumping log to /tmp/lustre-log.1391950776.15280 Pid: 15372, comm: mdt_327 Call Trace: [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdd_trans_start+0x33/0x40 [mdd] [] mdd_create+0x879/0x1a90 [mdd] [] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [] cml_create+0x97/0x250 [cmm] [] ? mdt_version_get_save+0x91/0xd0 [mdt] [] mdt_reint_open+0x1939/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? 
md_ucred+0x1e/0x60 [mdd] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 Lustre: scratch3-MDT0000: Client 67e7feff-aff2-d227-7102-36f9d3fdb0cb (at JO.BOO.IW.FA@o2ib2) reconnecting Lustre: scratch3-MDT0000: Client baf7b404-d439-a053-6807-d12987ab1cf0 (at JO.BOO.IW.ZZ@o2ib2) refused reconnection, still busy with 2 active RPCs Lustre: Skipped 1 previous similar message Lustre: Skipped 11 previous similar messages Lustre: DEBUG MARKER: Sun Feb 9 14:00:01 2014 Lustre: scratch3-MDT0000: Client 2ac9cca2-9866-2930-8003-dab65469a77d (at JO.BOO.II.BLI@o2ib2) reconnecting Lustre: Skipped 107 previous similar messages LustreError: dumping log to /tmp/lustre-log.1391950822.15150 Lustre: scratch3-MDT0000: Client aceda34b-f506-2e09-96ac-348d9b543a2e (at JO.BOO.WI.PF@o2ib2) reconnecting Lustre: Skipped 98 previous similar messages Lustre: scratch3-MDT0000: Client de2c8a2b-f9c9-9fb0-069b-42a28b49f6d9 (at JO.BOO.PO.ZZ@o2ib2) reconnecting Lustre: Skipped 26 previous similar messages Lustre: scratch3-MDT0000: Client 03ed3959-a45c-48d3-7f00-05cc81238b14 (at JO.BOO.AO.IT@o2ib2) refused reconnection, still busy with 1 active RPCs Lustre: Skipped 245 previous similar messages Lustre: 15552:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff881826b25c00 x1458491541058867/t0(0) o101->a1d4d1e2-a2f8-9243-934b-44099981974c@JO.BOO.WZ.BAW@o2ib2:0/0 lens 568/4936 e 0 to 0 dl 1391950906 ref 2 fl Interpret:/0/0 rc 0/0 Lustre: 15552:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 16 previous similar messages Lustre: scratch3-MDT0000: Client 452a2d32-eb8d-d4da-7ebc-bec5b6b86004 (at JO.BOO.IO.TP@o2ib2) reconnecting Lustre: Skipped 277 previous similar messages Lustre: scratch3-MDT0000: Client 15925ee8-8491-8b61-c113-468ed5796e5b (at JO.BOO.AO.LT@o2ib2) refused reconnection, still busy with 1 active RPCs Lustre: Skipped 534 previous similar messages Lustre: scratch3-MDT0000: Client d3765dbe-438a-a098-8119-64024a8b00ad (at JO.BOO.PL.BLB@o2ib2) reconnecting Lustre: Skipped 386 previous similar messages Lustre: DEBUG MARKER: Sun Feb 9 14:05:01 2014 Lustre: 14924:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-4), not sending early reply req@ffff8813930df000 x1458479675272398/t0(0) o35->130fa61e-3aaf-72bb-ced8-594b05abc2dc@JO.BOO.PI.IW@o2ib2:0/0 lens 360/9672 e 1 to 0 dl 1391951174 ref 2 fl Interpret:/0/0 rc 0/0 Lustre: 14924:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 19 previous similar messages Lustre: scratch3-MDT0000: Client 0dba972b-d13c-b5e4-c873-b8f884ad549b (at JO.BOO.PI.LBI@o2ib2) refused reconnection, still busy with 1 active RPCs Lustre: Skipped 989 previous similar messages Lustre: Service thread pid 15207 was inactive for 1200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: Lustre: Skipped 1 previous similar message Pid: 15207, comm: mdt_176 Call Trace: [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? 
autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdd_trans_start+0x33/0x40 [mdd] [] mdd_create+0x879/0x1a90 [mdd] [] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [] cml_create+0x97/0x250 [cmm] [] ? mdt_version_get_save+0x91/0xd0 [mdt] [] mdt_reint_open+0x1939/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 LustreError: dumping log to /tmp/lustre-log.1391951351.15207 Pid: 29285, comm: mdt_00 Call Trace: [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdd_trans_start+0x33/0x40 [mdd] [] mdd_create+0x879/0x1a90 [mdd] [] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [] cml_create+0x97/0x250 [cmm] [] ? mdt_version_get_save+0x91/0xd0 [mdt] [] mdt_reint_open+0x1939/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 LustreError: dumping log to /tmp/lustre-log.1391951358.29285 Lustre: Service thread pid 15088 was inactive for 1200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: Lustre: Skipped 1 previous similar message Pid: 15088, comm: mdt_115 Call Trace: [] ? transfer_objects+0x5c/0x80 [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdd_trans_start+0x33/0x40 [mdd] [] mdd_create+0x879/0x1a90 [mdd] [] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [] cml_create+0x97/0x250 [cmm] [] ? mdt_version_get_save+0x91/0xd0 [mdt] [] mdt_reint_open+0x1939/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? 
ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 LustreError: dumping log to /tmp/lustre-log.1391951387.15088 Lustre: scratch3-MDT0000: Client d3765dbe-438a-a098-8119-64024a8b00ad (at JO.BOO.PL.BLB@o2ib2) reconnecting Lustre: Skipped 990 previous similar messages Lustre: DEBUG MARKER: Sun Feb 9 14:10:01 2014 Pid: 15007, comm: mdt_70 Call Trace: [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdt_trans_start+0x5c/0x70 [mdt] [] mdt_empty_transno+0x45b/0x550 [mdt] [] mdt_finish_open+0xf03/0x1660 [mdt] [] mdt_reint_open+0x1519/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 LustreError: dumping log to /tmp/lustre-log.1391951405.15007 Pid: 15250, comm: mdt_219 Call Trace: [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? rwsem_down_failed_common+0x95/0x1f0 [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdt_trans_start+0x5c/0x70 [mdt] [] mdt_empty_transno+0x45b/0x550 [mdt] [] mdt_finish_open+0xf03/0x1660 [mdt] [] mdt_reint_open+0x1519/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 Lustre: Service thread pid 14953 was inactive for 1200.36s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
Lustre: Skipped 7 previous similar messages
LustreError: dumping log to /tmp/lustre-log.1391951419.14974
LustreError: 0:0:(ldlm_lockd.c:358:waiting_locks_callback()) ### lock callback timer expired after 4410s: evicting client at JO.BOO.AL.PB@o2ib2 ns: mdt-ffff8806e2389000 lock: ffff880311715000/0x7369ae786693c703 lrc: 3/0,0 mode: PR/PR res: 10886303477/4041 bits 0x3 rrc: 48 type: IBT flags: 0x4000020 remote: 0x718cf0f0f65c84e2 expref: 97839 pid: 15535 timeout: 4399891556
Lustre: 32035:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff881f350ae000 x1458475032966177/t0(0) o601->LOV_OSC_UUID@JO.BOB.AL.BZZ@o2ib3:0/0 lens 224/0 e 0 to 0 dl 1391951691 ref 2 fl New:/0/ffffffff rc 0/-1
Lustre: 32035:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 58 previous similar messages
Lustre: 15402:0:(ldlm_lib.c:952:target_handle_connect()) scratch3-MDT0000: connection from 47eeb18b-57b8-aade-f081-cbd5ccf3eb8a@JO.BOO.AL.PB@o2ib2 t410202353499 exp (null) cur 1391951697 last 0
Lustre: DEBUG MARKER: Sun Feb 9 14:15:01 2014
Lustre: Service thread pid 15594 was inactive for 1194.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Lustre: Skipped 2 previous similar messages
Pid: 15594, comm: mdt_rdpg_176
Call Trace:
 [] ? kiblnd_check_sends+0x2da/0x670 [ko2iblnd]
 [] start_this_handle+0x27a/0x4f0 [jbd2]
 [] ? autoremove_wake_function+0x0/0x40
 [] jbd2_journal_start+0xd0/0x110 [jbd2]
 [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs]
 [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs]
 [] mdd_trans_start+0x33/0x40 [mdd]
 [] mdd_attr_set+0x382/0x1d90 [mdd]
 [] ? null_alloc_rs+0x1ab/0x3b0 [ptlrpc]
 [] ? sptlrpc_svc_alloc_rs+0x74/0x2d0 [ptlrpc]
 [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc]
 [] cml_attr_set+0x66/0x1a0 [cmm]
 [] mdt_mfd_close+0x4de/0x700 [mdt]
 [] mdt_close+0x66a/0x850 [mdt]
 [] mdt_handle_common+0x932/0x1750 [mdt]
 [] mdt_readpage_handle+0x15/0x20 [mdt]
 [] ptlrpc_main+0xd16/0x1a80 [ptlrpc]
 [] ? __switch_to+0x1ac/0x320
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] child_rip+0xa/0x20
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc]
 [] ? child_rip+0x0/0x20
LustreError: dumping log to /tmp/lustre-log.1391951759.15594
LustreError: 15366:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391951562, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff8806e2389000 lock: ffff88090218cb40/0x7369ae78893cc61b lrc: 3/0,1 mode: --/CW res: 10886303477/4041 bits 0x2 rrc: 48 type: IBT flags: 0x4004030 remote: 0x0 expref: -99 pid: 15366 timeout: 0
LustreError: 15366:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) Skipped 27 previous similar messages
LustreError: 15057:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391951662, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff8806e2389000 lock: ffff88046682a480/0x7369ae78893cd2d2 lrc: 3/0,1 mode: --/PW res: 10886303477/4041 bits 0x2 rrc: 2 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 15057 timeout: 0
LustreError: 15057:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) Skipped 1 previous similar message
Lustre: Service thread pid 15213 was inactive for 1200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Pid: 15213, comm: mdt_182
Call Trace:
 [] start_this_handle+0x27a/0x4f0 [jbd2]
 [] ?
autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdd_trans_start+0x33/0x40 [mdd] [] mdd_create+0x879/0x1a90 [mdd] [] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [] cml_create+0x97/0x250 [cmm] [] ? mdt_version_get_save+0x91/0xd0 [mdt] [] mdt_reint_open+0x1939/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 LustreError: dumping log to /tmp/lustre-log.1391951866.15213 Pid: 32565, comm: mdt_02 Call Trace: [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdd_trans_start+0x33/0x40 [mdd] [] mdd_create+0x879/0x1a90 [mdd] [] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [] cml_create+0x97/0x250 [cmm] [] ? mdt_version_get_save+0x91/0xd0 [mdt] [] mdt_reint_open+0x1939/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 Pid: 14987, comm: mdt_50 Call Trace: [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdd_trans_start+0x33/0x40 [mdd] [] mdd_create+0x879/0x1a90 [mdd] [] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [] cml_create+0x97/0x250 [cmm] [] ? mdt_version_get_save+0x91/0xd0 [mdt] [] mdt_reint_open+0x1939/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 Pid: 15369, comm: mdt_324 Call Trace: [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? 
autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdd_trans_start+0x33/0x40 [mdd] [] mdd_create+0x879/0x1a90 [mdd] [] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [] cml_create+0x97/0x250 [cmm] [] ? mdt_version_get_save+0x91/0xd0 [mdt] [] mdt_reint_open+0x1939/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 LustreError: dumping log to /tmp/lustre-log.1391951868.5670 LustreError: dumping log to /tmp/lustre-log.1391951868.27469 LustreError: dumping log to /tmp/lustre-log.1391951868.7634 LustreError: dumping log to /tmp/lustre-log.1391951868.7621 LustreError: dumping log to /tmp/lustre-log.1391951868.7624 LustreError: dumping log to /tmp/lustre-log.1391951868.7636 LustreError: dumping log to /tmp/lustre-log.1391951869.21614 LustreError: dumping log to /tmp/lustre-log.1391951869.27488 LustreError: dumping log to /tmp/lustre-log.1391951869.19833 LustreError: dumping log to /tmp/lustre-log.1391951869.21612 LustreError: dumping log to /tmp/lustre-log.1391951869.7628 LustreError: dumping log to /tmp/lustre-log.1391951869.7673 LustreError: dumping log to /tmp/lustre-log.1391951869.7649 LustreError: dumping log to /tmp/lustre-log.1391951869.17284 LustreError: dumping log to /tmp/lustre-log.1391951869.7627 LustreError: dumping log to /tmp/lustre-log.1391951869.7638 LustreError: dumping log to /tmp/lustre-log.1391951869.27475 LustreError: dumping log to /tmp/lustre-log.1391951869.27474 LustreError: dumping log to /tmp/lustre-log.1391951869.27490 LustreError: dumping log to /tmp/lustre-log.1391951869.7647 LustreError: dumping log to /tmp/lustre-log.1391951869.27494 LustreError: dumping log to /tmp/lustre-log.1391951869.7671 LustreError: dumping log to /tmp/lustre-log.1391951869.7630 LustreError: dumping log to /tmp/lustre-log.1391951869.7622 LustreError: dumping log to /tmp/lustre-log.1391951869.27493 LustreError: dumping log to /tmp/lustre-log.1391951869.7665 LustreError: dumping log to /tmp/lustre-log.1391951869.21608 LustreError: dumping log to /tmp/lustre-log.1391951869.27476 LustreError: dumping log to /tmp/lustre-log.1391951869.17283 LustreError: dumping log to /tmp/lustre-log.1391951869.27481 LustreError: dumping log to /tmp/lustre-log.1391951869.7643 LustreError: dumping log to /tmp/lustre-log.1391951869.27483 LustreError: dumping log to /tmp/lustre-log.1391951869.7620 LustreError: dumping log to /tmp/lustre-log.1391951869.27485 LustreError: dumping log to /tmp/lustre-log.1391951869.5672 LustreError: dumping log to /tmp/lustre-log.1391951869.7659 LustreError: dumping log to /tmp/lustre-log.1391951869.27486 LustreError: dumping log to /tmp/lustre-log.1391951869.5668 LustreError: dumping log to /tmp/lustre-log.1391951869.7667 LustreError: dumping log to /tmp/lustre-log.1391951869.7657 LustreError: dumping log to /tmp/lustre-log.1391951869.19838 LustreError: dumping log to 
/tmp/lustre-log.1391951869.27484 LustreError: dumping log to /tmp/lustre-log.1391951869.5671 LustreError: dumping log to /tmp/lustre-log.1391951869.27482 LustreError: dumping log to /tmp/lustre-log.1391951869.19834 LustreError: dumping log to /tmp/lustre-log.1391951869.29262 LustreError: dumping log to /tmp/lustre-log.1391951869.7663 LustreError: dumping log to /tmp/lustre-log.1391951869.7650 LustreError: dumping log to /tmp/lustre-log.1391951869.7625 LustreError: dumping log to /tmp/lustre-log.1391951870.19836 LustreError: dumping log to /tmp/lustre-log.1391951870.13442 LustreError: dumping log to /tmp/lustre-log.1391951870.7631 LustreError: dumping log to /tmp/lustre-log.1391951870.19837 LustreError: dumping log to /tmp/lustre-log.1391951870.7646 LustreError: dumping log to /tmp/lustre-log.1391951870.19835 LustreError: dumping log to /tmp/lustre-log.1391951870.7629 LustreError: dumping log to /tmp/lustre-log.1391951870.21617 LustreError: dumping log to /tmp/lustre-log.1391951870.7654 LustreError: dumping log to /tmp/lustre-log.1391951870.19828 LustreError: dumping log to /tmp/lustre-log.1391951870.14979 LustreError: dumping log to /tmp/lustre-log.1391951870.14980 LustreError: dumping log to /tmp/lustre-log.1391951870.14976 LustreError: dumping log to /tmp/lustre-log.1391951871.7674 LustreError: dumping log to /tmp/lustre-log.1391951871.7626 LustreError: dumping log to /tmp/lustre-log.1391951871.21606 LustreError: dumping log to /tmp/lustre-log.1391951871.7669 LustreError: dumping log to /tmp/lustre-log.1391951871.7623 LustreError: dumping log to /tmp/lustre-log.1391951871.7641 LustreError: dumping log to /tmp/lustre-log.1391951871.19832 LustreError: dumping log to /tmp/lustre-log.1391951871.21609 LustreError: dumping log to /tmp/lustre-log.1391951871.19830 LustreError: dumping log to /tmp/lustre-log.1391951871.27472 LustreError: dumping log to /tmp/lustre-log.1391951871.7655 LustreError: dumping log to /tmp/lustre-log.1391951871.27495 LustreError: dumping log to /tmp/lustre-log.1391951871.17285 LustreError: dumping log to /tmp/lustre-log.1391951871.7648 LustreError: dumping log to /tmp/lustre-log.1391951871.30670 LustreError: dumping log to /tmp/lustre-log.1391951871.7639 LustreError: dumping log to /tmp/lustre-log.1391951871.7653 LustreError: dumping log to /tmp/lustre-log.1391951871.7666 LustreError: dumping log to /tmp/lustre-log.1391951871.21616 LustreError: dumping log to /tmp/lustre-log.1391951871.27479 LustreError: dumping log to /tmp/lustre-log.1391951871.17287 LustreError: dumping log to /tmp/lustre-log.1391951871.7645 LustreError: dumping log to /tmp/lustre-log.1391951871.27471 LustreError: dumping log to /tmp/lustre-log.1391951871.21615 LustreError: dumping log to /tmp/lustre-log.1391951871.27492 Lustre: scratch3-MDT0000: Client 06dea7f9-40bf-144a-84c6-6fb134468496 (at JO.BOO.WW.AT@o2ib2) refused reconnection, still busy with 1 active RPCs Lustre: Skipped 2119 previous similar messages Lustre: scratch3-MDT0000: Client d3765dbe-438a-a098-8119-64024a8b00ad (at JO.BOO.PL.BLB@o2ib2) reconnecting Lustre: Skipped 2117 previous similar messages Lustre: DEBUG MARKER: Sun Feb 9 14:20:01 2014 LustreError: 0:0:(ldlm_lockd.c:358:waiting_locks_callback()) ### lock callback timer expired after 55032s: evicting client at JO.BOO.AL.PW@o2ib2 ns: mdt-ffff8806e2389000 lock: ffff8814f7b00900/0x7369ae77eabc2e76 lrc: 3/0,0 mode: PR/PR res: 10889892383/32803 bits 0x2 rrc: 4 type: IBT flags: 0x20 remote: 0x7e29921093562dfd expref: 128562 pid: 15477 timeout: 4399932492 Lustre: scratch3-MDT0000: Export 
ffff880dca6f7400 already connecting from JO.BOO.AL.PB@o2ib2
LustreError: 15503:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391951971, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff8806e2389000 lock: ffff8812e5f56b40/0x7369ae78893cd8f2 lrc: 3/0,1 mode: --/CW res: 10889892383/32803 bits 0x2 rrc: 4 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 15503 timeout: 0
Lustre: Service thread pid 14961 was inactive for 1200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Lustre: Skipped 3 previous similar messages
Pid: 14961, comm: mdt_31 Call Trace: [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdd_trans_start+0x33/0x40 [mdd] [] mdd_create+0x879/0x1a90 [mdd] [] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [] cml_create+0x97/0x250 [cmm] [] ? mdt_version_get_save+0x91/0xd0 [mdt] [] mdt_reint_open+0x1939/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20
LustreError: dumping log to /tmp/lustre-log.1391952228.14961
Lustre: DEBUG MARKER: Sun Feb 9 14:25:01 2014
Lustre: 32035:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-146), not sending early reply req@ffff8813868c5c00 x1458475080125123/t0(0) o601->LOV_OSC_UUID@JO.BOO.AL.BFF@o2ib2:0/0 lens 224/0 e 0 to 0 dl 1391952312 ref 2 fl New:/0/ffffffff rc 0/-1
Lustre: 32035:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 3 previous similar messages
Lustre: scratch3-MDT0000: Export ffff880dca6f7400 already connecting from JO.BOO.AL.PB@o2ib2
Lustre: scratch3-MDT0000: Client 92a22e96-33db-3fa7-e277-a1ddfeea3acc (at JO.BOO.IA.I@o2ib2) refused reconnection, still busy with 1 active RPCs
Lustre: Skipped 2166 previous similar messages
Lustre: scratch3-MDT0000: Client b3ae13e9-8b45-1d12-a0e0-b868baba3d5e (at JO.BOO.PI.PW@o2ib2) reconnecting
Lustre: Skipped 2164 previous similar messages
Lustre: scratch3-MDT0000: haven't heard from client 47eeb18b-57b8-aade-f081-cbd5ccf3eb8a (at (no nid)) in 902 seconds. I think it's dead, and I am evicting it. exp ffff880dca6f7400, cur 1391952599 expire 1391951999 last 1391951697
Lustre: DEBUG MARKER: Sun Feb 9 14:30:01 2014
Lustre: Service thread pid 15270 was inactive for 1200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Pid: 15270, comm: mdt_239 Call Trace: [] ? ldlm_expired_completion_wait+0x0/0x260 [ptlrpc] [] cfs_waitq_wait+0xe/0x10 [libcfs] [] ldlm_completion_ast+0x52a/0x730 [ptlrpc] [] ? default_wake_function+0x0/0x20 [] ldlm_cli_enqueue_local+0x1e6/0x560 [ptlrpc] [] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [] ?
mdt_blocking_ast+0x0/0x2a0 [mdt] [] mdt_object_lock+0x320/0xb70 [mdt] [] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [] mdt_getattr_name_lock+0xe52/0x18b0 [mdt] [] ? lustre_msg_buf+0x5d/0x60 [ptlrpc] [] ? __req_capsule_get+0x176/0x750 [ptlrpc] [] ? lustre_msg_get_flags+0x34/0xb0 [ptlrpc] [] mdt_intent_getattr+0x2cd/0x4a0 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 LustreError: dumping log to /tmp/lustre-log.1391952762.15270 Pid: 15333, comm: mdt_288 Call Trace: [] ? schedule_timeout+0x198/0x2d0 [] ? ldlm_expired_completion_wait+0x0/0x260 [ptlrpc] [] cfs_waitq_wait+0xe/0x10 [libcfs] [] ldlm_completion_ast+0x52a/0x730 [ptlrpc] [] ? default_wake_function+0x0/0x20 [] ldlm_cli_enqueue_local+0x1e6/0x560 [ptlrpc] [] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [] mdt_object_lock+0x287/0xb70 [mdt] [] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [] mdt_object_find_lock+0x61/0x170 [mdt] [] mdt_reint_open+0x20d/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] ? mdt_ucred+0x15/0x20 [mdt] [] ? mdt_root_squash+0x2c/0x3e0 [mdt] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 Pid: 15338, comm: mdt_293 Call Trace: [] ? schedule_timeout+0x198/0x2d0 [] ? ldlm_expired_completion_wait+0x0/0x260 [ptlrpc] [] cfs_waitq_wait+0xe/0x10 [libcfs] [] ldlm_completion_ast+0x52a/0x730 [ptlrpc] [] ? default_wake_function+0x0/0x20 [] ldlm_cli_enqueue_local+0x1e6/0x560 [ptlrpc] [] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [] mdt_object_lock+0x287/0xb70 [mdt] [] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [] mdt_object_find_lock+0x61/0x170 [mdt] [] mdt_reint_open+0x20d/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] ? mdt_ucred+0x15/0x20 [mdt] [] ? mdt_root_squash+0x2c/0x3e0 [mdt] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 Pid: 15194, comm: mdt_165 Call Trace: [] ? schedule_timeout+0x198/0x2d0 [] ? 
ldlm_expired_completion_wait+0x0/0x260 [ptlrpc] [] cfs_waitq_wait+0xe/0x10 [libcfs] [] ldlm_completion_ast+0x52a/0x730 [ptlrpc] [] ? default_wake_function+0x0/0x20 [] ldlm_cli_enqueue_local+0x1e6/0x560 [ptlrpc] [] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [] mdt_object_lock+0x320/0xb70 [mdt] [] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [] mdt_getattr_name_lock+0xe52/0x18b0 [mdt] [] ? lustre_msg_buf+0x5d/0x60 [ptlrpc] [] ? __req_capsule_get+0x176/0x750 [ptlrpc] [] ? lustre_msg_get_flags+0x34/0xb0 [ptlrpc] [] mdt_intent_getattr+0x2cd/0x4a0 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 Pid: 15530, comm: mdt_448 Call Trace: [] ? schedule_timeout+0x198/0x2d0 [] ? ldlm_expired_completion_wait+0x0/0x260 [ptlrpc] [] cfs_waitq_wait+0xe/0x10 [libcfs] [] ldlm_completion_ast+0x52a/0x730 [ptlrpc] [] ? default_wake_function+0x0/0x20 [] ldlm_cli_enqueue_local+0x1e6/0x560 [ptlrpc] [] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [] mdt_object_lock+0x320/0xb70 [mdt] [] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [] mdt_getattr_name_lock+0xe52/0x18b0 [mdt] [] ? lustre_msg_buf+0x5d/0x60 [ptlrpc] [] ? __req_capsule_get+0x176/0x750 [ptlrpc] [] ? lustre_msg_get_flags+0x34/0xb0 [ptlrpc] [] mdt_intent_getattr+0x2cd/0x4a0 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 Lustre: Service thread pid 15184 was inactive for 1201.06s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. Lustre: Skipped 152 previous similar messages Lustre: Service thread pid 15402 was inactive for 1200.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. Lustre: 14998:0:(ldlm_lib.c:952:target_handle_connect()) scratch3-MDT0000: connection from 47eeb18b-57b8-aade-f081-cbd5ccf3eb8a@JO.BOO.AL.PB@o2ib2 t410202353499 exp (null) cur 1391952897 last 0 Lustre: Skipped 25 previous similar messages LustreError: dumping log to /tmp/lustre-log.1391952897.15402 Lustre: DEBUG MARKER: Sun Feb 9 14:35:01 2014 Lustre: 32035:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-146), not sending early reply req@ffff881bf0b4f000 x1458475046321479/t0(0) o601->LOV_OSC_UUID@JO.BOB.AL.BFZ@o2ib3:0/0 lens 224/0 e 0 to 0 dl 1391953083 ref 2 fl New:/0/ffffffff rc 0/-1 Lustre: 32035:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 79 previous similar messages Lustre: scratch3-MDT0000: Client 562f838e-8ff2-30df-4042-f28793c773bd (at JO.BOO.IA.P@o2ib2) refused reconnection, still busy with 1 active RPCs Lustre: Skipped 2220 previous similar messages Pid: 15503, comm: mdt_421 Call Trace: [] ? schedule_timeout+0x198/0x2d0 [] ? 
ldlm_expired_completion_wait+0x0/0x260 [ptlrpc] [] cfs_waitq_wait+0xe/0x10 [libcfs] [] ldlm_completion_ast+0x52a/0x730 [ptlrpc] [] ? default_wake_function+0x0/0x20 [] ldlm_cli_enqueue_local+0x1e6/0x560 [ptlrpc] [] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [] mdt_object_lock+0x287/0xb70 [mdt] [] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [] mdt_object_find_lock+0x61/0x170 [mdt] [] mdt_reint_open+0x20d/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] ? mdt_ucred+0x15/0x20 [mdt] [] ? mdt_root_squash+0x2c/0x3e0 [mdt] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 LustreError: dumping log to /tmp/lustre-log.1391953171.15503 Lustre: scratch3-MDT0000: Client f092887d-ddd2-5715-76ce-503327fbc30b (at JO.BOO.II.BTF@o2ib2) reconnecting Lustre: Skipped 2219 previous similar messages Lustre: DEBUG MARKER: Sun Feb 9 14:40:01 2014 Pid: 15445, comm: mdt_rdpg_157 Call Trace: [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdt_trans_start+0x5c/0x70 [mdt] [] mdt_empty_transno+0x45b/0x550 [mdt] [] ? mdt_handle_last_unlink+0x1e7/0x4d0 [mdt] [] mdt_close+0x692/0x850 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_readpage_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 LustreError: dumping log to /tmp/lustre-log.1391953255.15445 Pid: 15090, comm: mdt_rdpg_82 Call Trace: [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdt_trans_start+0x5c/0x70 [mdt] [] mdt_empty_transno+0x45b/0x550 [mdt] [] ? mdt_handle_last_unlink+0x1e7/0x4d0 [mdt] [] mdt_close+0x692/0x850 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_readpage_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 Pid: 15173, comm: mdt_rdpg_127 Call Trace: [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdt_trans_start+0x5c/0x70 [mdt] [] mdt_empty_transno+0x45b/0x550 [mdt] [] ? mdt_handle_last_unlink+0x1e7/0x4d0 [mdt] [] mdt_close+0x692/0x850 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_readpage_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? 
ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 Pid: 16591, comm: mdt_rdpg_402 Call Trace: [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdt_trans_start+0x5c/0x70 [mdt] [] mdt_empty_transno+0x45b/0x550 [mdt] [] ? mdt_handle_last_unlink+0x1e7/0x4d0 [mdt] [] mdt_close+0x692/0x850 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_readpage_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 Lustre: Service thread pid 16567 was inactive for 1200.49s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. Lustre: scratch3-MDT0000: Export ffff8816ca3fcc00 already connecting from JO.BOO.AL.PB@o2ib2 LustreError: 14908:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391953267, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff8806e2389000 lock: ffff88119993a6c0/0x7369ae78893cef18 lrc: 3/0,1 mode: --/CW res: 10850421785/63527 bits 0x2 rrc: 22 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 14908 timeout: 0 Lustre: DEBUG MARKER: Sun Feb 9 14:45:01 2014 Lustre: scratch3-MDT0000: Export ffff8816ca3fcc00 already connecting from JO.BOO.AL.PB@o2ib2 Lustre: scratch3-MDT0000: Client 52d96c29-6ed7-d075-1959-9c626ff52468 (at JO.BOO.WW.AA@o2ib2) refused reconnection, still busy with 1 active RPCs Lustre: Skipped 2662 previous similar messages LustreError: 0:0:(ldlm_lockd.c:358:waiting_locks_callback()) ### lock callback timer expired after 86349s: evicting client at JO.BOO.PI.WW@o2ib2 ns: mdt-ffff8806e2389000 lock: ffff8810bc2e1d80/0x7369ae7756e3ab73 lrc: 3/0,0 mode: PR/PR res: 10700123913/3552 bits 0x3 rrc: 19 type: IBT flags: 0x4000020 remote: 0xad4729460c249c89 expref: 2312 pid: 15348 timeout: 4400103692 Lustre: scratch3-MDT0000: Client 29fc07aa-935f-6679-a97a-3324e1a06655 (at JO.BOO.PB.LA@o2ib2) reconnecting Lustre: Skipped 2772 previous similar messages Lustre: DEBUG MARKER: Sun Feb 9 14:50:01 2014 Lustre: 32035:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-146), not sending early reply req@ffff881ba58b6000 x1458475046321490/t0(0) o601->LOV_OSC_UUID@JO.BOB.AL.BFZ@o2ib3:0/0 lens 224/0 e 0 to 0 dl 1391953886 ref 2 fl New:/0/ffffffff rc 0/-1 Lustre: 32035:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 153 previous similar messages LustreError: 14948:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391953683, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff8806e2389000 lock: ffff8808528d4240/0x7369ae78893cf6f8 lrc: 3/0,1 mode: --/CW res: 10700123913/3552 bits 0x2 rrc: 19 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 14948 timeout: 0 Lustre: Service thread pid 15272 was inactive for 1200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: Lustre: Skipped 9 previous similar messages Pid: 15272, comm: mdt_241 Call Trace: [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? 
autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdd_trans_start+0x33/0x40 [mdd] [] mdd_create+0x879/0x1a90 [mdd] [] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [] cml_create+0x97/0x250 [cmm] [] ? mdt_version_get_save+0x91/0xd0 [mdt] [] mdt_reint_open+0x1939/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 LustreError: dumping log to /tmp/lustre-log.1391953931.15272 Lustre: Service thread pid 14998 was inactive for 1200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: Lustre: scratch3-MDT0000: Export ffff8816ca3fcc00 already connecting from JO.BOO.AL.PB@o2ib2 Pid: 14998, comm: mdt_61 Call Trace: [] ? down_trylock+0x37/0x50 [] ? try_acquire_console_sem+0x15/0x60 [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdt_trans_start+0x5c/0x70 [mdt] [] mdt_client_new+0x239/0x7c0 [mdt] [] mdt_obd_connect+0x289/0x440 [mdt] [] target_handle_connect+0xd4b/0x2de0 [ptlrpc] [] ? enqueue_task+0x43/0x90 [] ? check_preempt_curr+0x6d/0x90 [] ? try_to_wake_up+0xb0/0x2d0 [] mdt_connect+0x3a/0x510 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 LustreError: dumping log to /tmp/lustre-log.1391954097.14998 Lustre: DEBUG MARKER: Sun Feb 9 14:55:01 2014 Lustre: Service thread pid 15490 was inactive for 800.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: Pid: 15490, comm: mdt_408 Call Trace: [] ? ldlm_srv_pool_recalc+0x33/0x290 [ptlrpc] [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdt_trans_start+0x5c/0x70 [mdt] [] mdt_empty_transno+0x45b/0x550 [mdt] [] mdt_finish_open+0xf03/0x1660 [mdt] [] mdt_reint_open+0x1519/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? 
ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 LustreError: dumping log to /tmp/lustre-log.1391954269.15490 Pid: 15292, comm: mdt_261 Call Trace: [] ? ldlm_srv_pool_recalc+0x33/0x290 [ptlrpc] [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdt_trans_start+0x5c/0x70 [mdt] [] mdt_empty_transno+0x45b/0x550 [mdt] [] mdt_finish_open+0xf03/0x1660 [mdt] [] mdt_reint_open+0x1519/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 Pid: 14971, comm: mdt_41 Call Trace: [] ? ldlm_srv_pool_recalc+0x33/0x290 [ptlrpc] [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdt_trans_start+0x5c/0x70 [mdt] [] mdt_empty_transno+0x45b/0x550 [mdt] [] mdt_finish_open+0xf03/0x1660 [mdt] [] mdt_reint_open+0x1519/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 LustreError: dumping log to /tmp/lustre-log.1391954272.14971 Pid: 15522, comm: mdt_440 Call Trace: [] ? ldlm_srv_pool_recalc+0x33/0x290 [ptlrpc] [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdt_trans_start+0x5c/0x70 [mdt] [] mdt_empty_transno+0x45b/0x550 [mdt] [] mdt_finish_open+0xf03/0x1660 [mdt] [] mdt_reint_open+0x1519/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? 
child_rip+0x0/0x20 Pid: 15146, comm: mdt_141 Call Trace: [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdt_trans_start+0x5c/0x70 [mdt] [] mdt_empty_transno+0x45b/0x550 [mdt] [] mdt_finish_open+0xf03/0x1660 [mdt] [] mdt_reint_open+0x1519/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 Lustre: scratch3-MDT0000: Client aceda34b-f506-2e09-96ac-348d9b543a2e (at JO.BOO.WI.PF@o2ib2) refused reconnection, still busy with 1 active RPCs Lustre: Skipped 3223 previous similar messages LustreError: 15475:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391954149, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff8806e2389000 lock: ffff880dfa17e240/0x7369ae78893cfefb lrc: 3/0,1 mode: --/CW res: 10873478136/11183 bits 0x2 rrc: 18 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 15475 timeout: 0 Lustre: scratch3-MDT0000: Client 161e69fe-af57-54e2-c9c7-6422158e6231 (at JO.BOO.PI.F@o2ib2) reconnecting Lustre: Skipped 3228 previous similar messages Lustre: DEBUG MARKER: Sun Feb 9 15:00:01 2014 Lustre: Service thread pid 14908 was inactive for 1200.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. Lustre: Skipped 105 previous similar messages LustreError: dumping log to /tmp/lustre-log.1391954467.14908 Lustre: Service thread pid 16576 was inactive for 1200.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. LustreError: dumping log to /tmp/lustre-log.1391954522.16576 Lustre: 27489:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-300), not sending early reply req@ffff881812e18000 x1458488085933984/t0(0) o103->e40f31b4-4d0f-ebac-5f62-82b5bf940e81@JO.BOO.IA.BIW@o2ib2:0/0 lens 296/0 e 2 to 0 dl 1391954588 ref 2 fl New:H/0/ffffffff rc 0/-1 Lustre: 27489:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 173 previous similar messages Lustre: DEBUG MARKER: Sun Feb 9 15:05:01 2014 Lustre: Service thread pid 14948 was inactive for 1200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: Lustre: Skipped 4 previous similar messages Pid: 14948, comm: mdt_20 Call Trace: [] ? schedule_timeout+0x198/0x2d0 [] ? ldlm_expired_completion_wait+0x0/0x260 [ptlrpc] [] cfs_waitq_wait+0xe/0x10 [libcfs] [] ldlm_completion_ast+0x52a/0x730 [ptlrpc] [] ? default_wake_function+0x0/0x20 [] ldlm_cli_enqueue_local+0x1e6/0x560 [ptlrpc] [] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [] mdt_object_lock+0x287/0xb70 [mdt] [] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [] mdt_object_find_lock+0x61/0x170 [mdt] [] mdt_reint_open+0x20d/0x24e0 [mdt] [] ? 
lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] ? mdt_ucred+0x15/0x20 [mdt] [] ? mdt_root_squash+0x2c/0x3e0 [mdt] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 LustreError: dumping log to /tmp/lustre-log.1391954883.14948 Lustre: DEBUG MARKER: Sun Feb 9 15:10:01 2014 Pid: 15211, comm: mdt_180 Call Trace: [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdd_trans_start+0x33/0x40 [mdd] [] mdd_create+0x879/0x1a90 [mdd] [] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [] cml_create+0x97/0x250 [cmm] [] ? mdt_version_get_save+0x91/0xd0 [mdt] [] mdt_reint_open+0x1939/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 LustreError: dumping log to /tmp/lustre-log.1391955181.15211 Lustre: 15277:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff88069d3c2050 x1458580323451793/t0(0) o49->567638be-0c6e-6fec-b19e-cf333836b01c@JO.BOO.WZ.BWI@o2ib2:0/0 lens 440/0 e 0 to 0 dl 1391955215 ref 2 fl New:/0/ffffffff rc 0/-1 Lustre: 15277:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 61 previous similar messages Lustre: DEBUG MARKER: Sun Feb 9 15:15:01 2014 Pid: 15475, comm: mdt_396 Call Trace: [] ? schedule_timeout+0x198/0x2d0 [] ? ldlm_expired_completion_wait+0x0/0x260 [ptlrpc] [] cfs_waitq_wait+0xe/0x10 [libcfs] [] ldlm_completion_ast+0x52a/0x730 [ptlrpc] [] ? default_wake_function+0x0/0x20 [] ldlm_cli_enqueue_local+0x1e6/0x560 [ptlrpc] [] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [] mdt_object_lock+0x287/0xb70 [mdt] [] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [] mdt_object_find_lock+0x61/0x170 [mdt] [] mdt_reint_open+0x20d/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] ? mdt_ucred+0x15/0x20 [mdt] [] ? mdt_root_squash+0x2c/0x3e0 [mdt] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? 
__switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 LustreError: dumping log to /tmp/lustre-log.1391955349.15475 Lustre: DEBUG MARKER: Sun Feb 9 15:20:01 2014 Lustre: Service thread pid 15317 was inactive for 1200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: Lustre: Skipped 2 previous similar messages Pid: 15317, comm: mdt_272 Call Trace: [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? try_to_wake_up+0x195/0x2d0 [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdt_trans_start+0x5c/0x70 [mdt] [] mdt_empty_transno+0x45b/0x550 [mdt] [] mdt_finish_open+0xf03/0x1660 [mdt] [] mdt_reint_open+0x1519/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 LustreError: dumping log to /tmp/lustre-log.1391955660.15317 Pid: 15015, comm: mdt_78 Call Trace: [] ? ldlm_srv_pool_recalc+0x33/0x290 [ptlrpc] [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdt_trans_start+0x5c/0x70 [mdt] [] mdt_empty_transno+0x45b/0x550 [mdt] [] mdt_finish_open+0xf03/0x1660 [mdt] [] mdt_reint_open+0x1519/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 Pid: 15534, comm: mdt_452 Call Trace: [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdt_trans_start+0x5c/0x70 [mdt] [] mdt_empty_transno+0x45b/0x550 [mdt] [] mdt_finish_open+0xf03/0x1660 [mdt] [] mdt_reint_open+0x1519/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? 
md_ucred+0x1e/0x60 [mdd] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 Pid: 14985, comm: mdt_48 Call Trace: [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdt_trans_start+0x5c/0x70 [mdt] [] mdt_empty_transno+0x45b/0x550 [mdt] [] mdt_finish_open+0xf03/0x1660 [mdt] [] mdt_reint_open+0x1519/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 Pid: 15347, comm: mdt_302 Call Trace: [] start_this_handle+0x27a/0x4f0 [jbd2] [] ? autoremove_wake_function+0x0/0x40 [] jbd2_journal_start+0xd0/0x110 [jbd2] [] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs] [] osd_trans_start+0x134/0x5b0 [osd_ldiskfs] [] mdt_trans_start+0x5c/0x70 [mdt] [] mdt_empty_transno+0x45b/0x550 [mdt] [] mdt_finish_open+0xf03/0x1660 [mdt] [] mdt_reint_open+0x1519/0x24e0 [mdt] [] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [] ? md_ucred+0x1e/0x60 [mdd] [] mdt_reint_rec+0x41/0xe0 [mdt] [] mdt_reint_internal+0x544/0x8e0 [mdt] [] mdt_intent_reint+0x1ed/0x530 [mdt] [] mdt_intent_policy+0x379/0x690 [mdt] [] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [] mdt_enqueue+0x46/0x130 [mdt] [] mdt_handle_common+0x932/0x1750 [mdt] [] mdt_regular_handle+0x15/0x20 [mdt] [] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [] ? __switch_to+0x1ac/0x320 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] child_rip+0xa/0x20 [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [] ? child_rip+0x0/0x20 Lustre: Service thread pid 15061 was inactive for 1200.88s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. Lustre: Skipped 109 previous similar messages Lustre: 15277:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff8814ef6dac00 x1458492806469463/t0(0) o400->a4ef1605-1d2a-072e-4c48-c7ee172e183c@JO.BOO.II.BII@o2ib2:0/0 lens 192/0 e 0 to 0 dl 1391955816 ref 2 fl New:H/0/ffffffff rc 0/-1 Lustre: 15277:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 28676 previous similar messages Lustre: Service thread pid 15471 was inactive for 1200.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
Lustre: Skipped 210 previous similar messages
LustreError: dumping log to /tmp/lustre-log.1391955873.15471
Lustre: DEBUG MARKER: Sun Feb 9 15:25:02 2014
Lustre: DEBUG MARKER: Sun Feb 9 15:30:01 2014
Lustre: 32035:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-146), not sending early reply req@ffff8816a7835400 x1458475031788276/t0(0) o601->LOV_OSC_UUID@JO.BOO.AL.BFO@o2ib2:0/0 lens 224/0 e 0 to 0 dl 1391956425 ref 2 fl New:/0/ffffffff rc 0/-1
Lustre: 32035:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 8989 previous similar messages
Lustre: DEBUG MARKER: Sun Feb 9 15:35:01 2014
Uhhuh. NMI received for unknown reason 2d on CPU 0. Do you have a strange power saving mode enabled?
Kernel panic - not syncing: NMI: Not continuing
Pid: 0, comm: swapper Not tainted 2.6.32-220.23.1.bl6.Bull.28.10.x86_64 #1
Call Trace: [] ? panic+0x78/0x143 [] ? do_nmi+0x240/0x2b0 [] ? nmi+0x20/0x30 [] ? intel_idle+0xb1/0x170 <> [] ? cpuidle_idle_call+0xa7/0x140 [] ? cpu_idle+0xb6/0x110 [] ? rest_init+0x7a/0x80 [] ? start_kernel+0x481/0x48d [] ? x86_64_start_reservations+0x125/0x129 [] ? x86_64_start_kernel+0xfa/0x109