[632161.364661] Lustre: DEBUG MARKER: Tue Feb 4 18:05:01 2014 [632161.364663] [632460.802073] Lustre: DEBUG MARKER: Tue Feb 4 18:10:01 2014 [632460.802075] [632567.610799] LustreError: 17692:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391533708, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff881be2e73000 lock: ffff881b87de2b40/0x1bd0ca3af43d5e49 lrc: 3/1,0 mode: --/PR res: 8954331488/1 bits 0x3 rrc: 15 type: IBT flags: 0x4004030 remote: 0x0 expref: -99 pid: 17692 timeout: 0 [632567.647006] LustreError: dumping log to /tmp/lustre-log.1391533908.17692 [632638.859594] LustreError: 18030:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391533779, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff881be2e73000 lock: ffff881a2abc9b40/0x1bd0ca3af43ef442 lrc: 3/0,1 mode: --/CW res: 8954331488/1 bits 0x2 rrc: 15 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 18030 timeout: 0 [632760.251209] Lustre: DEBUG MARKER: Tue Feb 4 18:15:01 2014 [632760.251211] [632975.329109] Lustre: Service thread pid 5468 was inactive for 800.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [632975.346428] Pid: 5468, comm: mdt_09 [632975.350124] [632975.350125] Call Trace: [632975.357134] [<ffffffff811206d9>] ? zone_statistics+0x99/0xc0 [632975.364429] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [632975.372126] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [632975.380039] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [632975.387640] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [632975.395354] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [632975.402672] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [632975.410548] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [632975.418414] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [632975.426642] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [632975.434659] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [632975.442085] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [632975.449529] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [632975.458680] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [632975.466371] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [632975.474118] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [632975.481524] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [632975.490140] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [632975.497435] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [632975.505647] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [632975.513620] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [632975.522233] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [632975.529402] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [632975.538141] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [632975.546285] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [632975.554153] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [632975.562129] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [632975.570341] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [632975.578819] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [632975.586163] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [632975.594226] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [632975.602081] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [632975.609829] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [632975.616941] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [632975.624705] [<ffffffff8100412a>] child_rip+0xa/0x20 [632975.631221] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [632975.638999] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [632975.646737] [<ffffffff81004120>] ? child_rip+0x0/0x20 [632975.653354] [632975.656317] LustreError: dumping log to /tmp/lustre-log.1391534317.5468 [632991.209308] Lustre: Service thread pid 18006 was inactive for 800.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [632991.226730] Pid: 18006, comm: mdt_53 [632991.230528] [632991.230529] Call Trace: [632991.237543] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [632991.245231] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [632991.253158] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [632991.260757] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [632991.268521] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [632991.275867] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [632991.283730] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [632991.291598] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [632991.299822] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [632991.307853] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [632991.315300] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [632991.322721] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [632991.331963] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [632991.339720] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [632991.347455] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [632991.354880] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [632991.363472] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [632991.370747] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [632991.379016] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [632991.386884] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [632991.395579] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [632991.402685] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [632991.410076] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [632991.418113] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [632991.426026] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [632991.433986] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [632991.442157] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [632991.450602] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [632991.457944] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [632991.465983] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [632991.473874] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [632991.481634] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [632991.488669] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [632991.496336] [<ffffffff8100412a>] child_rip+0xa/0x20 [632991.502812] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [632991.510558] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [632991.518305] [<ffffffff81004120>] ? child_rip+0x0/0x20 [632991.524957] [632991.527972] LustreError: dumping log to /tmp/lustre-log.1391534332.18006 [633005.592384] Lustre: Service thread pid 18035 was inactive for 800.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [633005.609800] Pid: 18035, comm: mdt_82 [633005.613610] [633005.613611] Call Trace: [633005.620691] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [633005.628408] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [633005.636429] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [633005.644058] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [633005.651823] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [633005.659188] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [633005.667039] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [633005.674900] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [633005.683140] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [633005.691203] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [633005.698601] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [633005.706033] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [633005.715221] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [633005.722949] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [633005.730752] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [633005.738212] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [633005.746792] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [633005.754165] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [633005.762395] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [633005.770309] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [633005.778970] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [633005.786087] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [633005.793534] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [633005.801600] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [633005.809490] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [633005.817476] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [633005.825713] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [633005.834226] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [633005.841601] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [633005.849680] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [633005.857591] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [633005.865386] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [633005.872547] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [633005.880316] [<ffffffff8100412a>] child_rip+0xa/0x20 [633005.886852] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [633005.894698] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [633005.902457] [<ffffffff81004120>] ? child_rip+0x0/0x20 [633005.909134] [633005.912174] LustreError: dumping log to /tmp/lustre-log.1391534347.18035 [633057.697486] Lustre: Service thread pid 17693 was inactive for 800.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [633057.714922] Pid: 17693, comm: mdt_135 [633057.718811] [633057.718812] Call Trace: [633057.725807] [<ffffffff811206d9>] ? zone_statistics+0x99/0xc0 [633057.733094] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [633057.740782] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [633057.748746] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [633057.756350] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [633057.764045] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [633057.771412] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [633057.779277] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [633057.787167] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [633057.795360] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [633057.803455] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [633057.810919] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [633057.818349] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [633057.827500] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [633057.835133] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [633057.842974] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [633057.850407] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [633057.858978] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [633057.866152] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [633057.874398] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [633057.882320] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [633057.891006] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [633057.898116] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [633057.905536] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [633057.913668] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [633057.921574] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [633057.929589] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [633057.937814] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [633057.946259] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [633057.953606] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [633057.961674] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [633057.969567] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [633057.977357] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [633057.984462] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [633057.992297] [<ffffffff8100412a>] child_rip+0xa/0x20 [633057.998846] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [633058.006725] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [633058.014500] [<ffffffff81004120>] ? child_rip+0x0/0x20 [633058.021198] [633058.024232] LustreError: dumping log to /tmp/lustre-log.1391534399.17693 [633060.258826] Lustre: DEBUG MARKER: Tue Feb 4 18:20:01 2014 [633060.258828] [633060.880733] Lustre: Service thread pid 18065 was inactive for 800.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [633060.898239] Pid: 18065, comm: mdt_112 [633060.902174] [633060.902175] Call Trace: [633060.909189] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [633060.916873] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [633060.924850] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [633060.932458] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [633060.940138] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [633060.947483] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [633060.955416] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [633060.963313] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [633060.971564] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [633060.979593] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [633060.987099] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [633060.994559] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [633061.004907] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [633061.011556] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [633061.019326] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [633061.026813] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [633061.035425] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [633061.042729] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [633061.050984] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [633061.058878] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [633061.067593] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [633061.074753] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [633061.082184] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [633061.090229] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [633061.098013] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [633061.106002] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [633061.114254] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [633061.122740] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [633061.130151] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [633061.138120] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [633061.146019] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [633061.153798] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [633061.160974] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [633061.168713] [<ffffffff8100412a>] child_rip+0xa/0x20 [633061.175259] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [633061.183079] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [633061.190935] [<ffffffff81004120>] ? child_rip+0x0/0x20 [633061.197585] [633061.200668] LustreError: dumping log to /tmp/lustre-log.1391534402.18065 [633068.475892] Lustre: Service thread pid 18056 was inactive for 800.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [633068.489175] LustreError: dumping log to /tmp/lustre-log.1391534410.18056 [633079.104838] Lustre: Service thread pid 18004 was inactive for 800.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [633079.118127] LustreError: dumping log to /tmp/lustre-log.1391534420.18004 [633144.068416] Lustre: Service thread pid 16397 was inactive for 800.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [633144.081719] LustreError: dumping log to /tmp/lustre-log.1391534485.16397 [633166.434484] Lustre: Service thread pid 17692 was inactive for 800.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [633166.447830] LustreError: dumping log to /tmp/lustre-log.1391534508.17692 [633198.680060] LustreError: 16398:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391534240, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff881be2e73000 lock: ffff881b82356480/0x1bd0ca3af44da1d6 lrc: 3/1,0 mode: --/PR res: 8954594907/1 bits 0x3 rrc: 4 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 16398 timeout: 0 [633237.683431] Lustre: Service thread pid 18030 was inactive for 800.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [633237.696713] LustreError: dumping log to /tmp/lustre-log.1391534579.18030 [633284.351832] Lustre: Service thread pid 16437 was inactive for 800.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [633284.369256] Pid: 16437, comm: mdt_29 [633284.373130] [633284.373130] Call Trace: [633284.380059] [<ffffffff811206d9>] ? zone_statistics+0x99/0xc0 [633284.387414] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [633284.395131] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [633284.403073] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [633284.410605] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [633284.418374] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [633284.425722] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [633284.433633] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [633284.441425] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [633284.449536] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [633284.456990] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [633284.464481] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [633284.473621] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [633284.481245] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [633284.489089] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [633284.496520] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [633284.505135] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [633284.512343] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [633284.520608] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [633284.528528] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [633284.537176] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [633284.544285] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [633284.551647] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [633284.559754] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [633284.567621] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [633284.575610] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [633284.583823] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [633284.592277] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [633284.599548] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [633284.607727] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [633284.615616] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [633284.623298] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [633284.630403] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [633284.638211] [<ffffffff8100412a>] child_rip+0xa/0x20 [633284.644728] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [633284.652515] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [633284.660184] [<ffffffff81004120>] ? child_rip+0x0/0x20 [633284.666906] [633284.669861] LustreError: dumping log to /tmp/lustre-log.1391534626.16437 [633333.634972] Lustre: 18015:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-559), not sending early reply [633333.634976] req@ffff8819f0d96400 x1458479661617350/t0(0) o101->18d50e79-e7ff-2a23-0203-d8ade50c06c4@JO.BOO.PI.WA@o2ib2:0/0 lens 656/4936 e 5 to 0 dl 1391534680 ref 2 fl Interpret:/0/0 rc 0/0 [633359.707561] Lustre: DEBUG MARKER: Tue Feb 4 18:25:01 2014 [633359.707563] [633376.471910] LustreError: 18068:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391534518, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff881be2e73000 lock: ffff881a8471a900/0x1bd0ca3af44dca6a lrc: 3/1,0 mode: --/PR res: 8952384457/22571 bits 0x3 rrc: 3 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 18068 timeout: 0 [633376.550661] Lustre: 16430:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-586), not sending early reply [633376.550664] req@ffff88205e4be050 x1458488109435567/t0(0) o101->e542ecba-5c5a-3526-16a2-ba252dad69e8@JO.BOO.II.B@o2ib2:0/0 lens 640/4936 e 4 to 0 dl 1391534723 ref 2 fl Interpret:/0/0 rc 0/0 [633391.521251] Lustre: 16430:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-586), not sending early reply [633391.521255] req@ffff881bc913c800 x1458669928753983/t0(0) o101->2735fb90-b05e-3f85-5a9d-e2389c950e23@JO.BOO.PI.FL@o2ib2:0/0 lens 640/4936 e 4 to 0 dl 1391534738 ref 2 fl Interpret:/0/0 rc 0/0 [633416.473937] Lustre: 16430:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-559), not sending early reply [633416.473941] req@ffff881d05942850 x1458479661383360/t0(0) o101->1149a764-bf57-dbdd-b71c-b33708ee8303@JO.BOO.PI.FW@o2ib2:0/0 lens 640/4936 e 3 to 0 dl 1391534763 ref 2 fl Interpret:/0/0 rc 0/0 [633419.468066] Lustre: 18020:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-559), not sending early reply [633419.468069] req@ffff881ac005c000 x1458479123507454/t0(0) o101->e3b65402-5a14-5932-6e46-c3e70901f1c1@JO.BOO.PZ.BIP@o2ib2:0/0 lens 648/4936 e 3 to 0 dl 1391534766 ref 2 fl Interpret:/0/0 rc 0/0 [633427.450927] Lustre: 18064:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-559), not sending early reply [633427.450931] req@ffff8819f0685400 x1458479384418998/t0(0) o101->6bd04180-4e0e-af62-e680-77d6e3b5d5c4@JO.BOO.AL.FL@o2ib2:0/0 lens 624/4936 e 3 to 0 dl 1391534774 ref 2 fl Interpret:/0/0 rc 0/0 [633437.431066] Lustre: 18064:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-559), not sending early reply [633437.431070] req@ffff8819ee255c00 x1458479616376562/t0(0) o101->fa1ca0cd-dbb7-2488-9baf-a2b84717a6c4@JO.BOO.PI.WF@o2ib2:0/0 lens 640/4936 e 3 to 0 dl 1391534784 ref 2 fl Interpret:/0/0 rc 0/0 [633444.257637] Lustre: Service thread pid 18048 was inactive for 800.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [633444.275046] Pid: 18048, comm: mdt_95 [633444.278839] [633444.278840] Call Trace: [633444.285862] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [633444.293565] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [633444.301472] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [633444.309075] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [633444.316836] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [633444.324176] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [633444.332099] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [633444.340006] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [633444.348243] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [633444.356284] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [633444.363727] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [633444.371147] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [633444.380341] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [633444.388087] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [633444.395864] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [633444.403284] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [633444.411848] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [633444.419130] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [633444.427316] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [633444.435245] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [633444.443882] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [633444.450999] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [633444.458464] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [633444.466495] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [633444.474357] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [633444.482336] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [633444.490584] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [633444.499085] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [633444.506425] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [633444.514456] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [633444.522374] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [633444.530148] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [633444.537233] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [633444.546154] [<ffffffff8100412a>] child_rip+0xa/0x20 [633444.551504] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [633444.559345] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [633444.567080] [<ffffffff81004120>] ? child_rip+0x0/0x20 [633444.573738] [633444.576753] LustreError: dumping log to /tmp/lustre-log.1391534786.18048 [633460.385978] Lustre: 16430:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-494), not sending early reply [633460.385981] req@ffff881a30ca0000 x1458479661811495/t0(0) o101->18d50e79-e7ff-2a23-0203-d8ade50c06c4@JO.BOO.PI.WA@o2ib2:0/0 lens 552/4808 e 2 to 0 dl 1391534807 ref 2 fl Interpret:/0/0 rc 0/0 [633460.415682] Lustre: 16430:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message [633502.436190] Lustre: store1-MDT0000: Client 18d50e79-e7ff-2a23-0203-d8ade50c06c4 (at JO.BOO.PI.WA@o2ib2) reconnecting [633502.446967] Lustre: store1-MDT0000: Client 18d50e79-e7ff-2a23-0203-d8ade50c06c4 (at JO.BOO.PI.WA@o2ib2) refused reconnection, still busy with 2 active RPCs [633526.256584] Lustre: 18020:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-146), not sending early reply [633526.256588] req@ffff881bc20d0c00 x1458479612436527/t0(0) o101->a4ef4c3a-ff9b-0ac2-ef60-870a0df43748@JO.BOO.PI.LBT@o2ib2:0/0 lens 624/4936 e 0 to 0 dl 1391534873 ref 2 fl Interpret:/0/0 rc 0/0 [633526.286377] Lustre: 18020:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message [633541.486773] Lustre: store1-MDT0000: Client a4ef4c3a-ff9b-0ac2-ef60-870a0df43748 (at JO.BOO.PI.LBT@o2ib2) reconnecting [633541.497618] Lustre: store1-MDT0000: Client a4ef4c3a-ff9b-0ac2-ef60-870a0df43748 (at JO.BOO.PI.LBT@o2ib2) refused reconnection, still busy with 1 active RPCs [633546.157428] Lustre: store1-MDT0000: Client e542ecba-5c5a-3526-16a2-ba252dad69e8 (at JO.BOO.II.B@o2ib2) reconnecting [633546.168096] Lustre: store1-MDT0000: Client e542ecba-5c5a-3526-16a2-ba252dad69e8 (at JO.BOO.II.B@o2ib2) refused reconnection, still busy with 1 active RPCs [633560.625291] Lustre: store1-MDT0000: Client 2735fb90-b05e-3f85-5a9d-e2389c950e23 (at JO.BOO.PI.FL@o2ib2) reconnecting [633560.636055] Lustre: store1-MDT0000: Client 2735fb90-b05e-3f85-5a9d-e2389c950e23 (at JO.BOO.PI.FL@o2ib2) refused reconnection, still busy with 1 active RPCs [633583.004964] Lustre: store1-MDT0000: Client e5e726f5-0a7d-e402-0f4c-7b4ba1709161 (at JO.BOO.AL.PL@o2ib2) reconnecting [633583.015741] Lustre: store1-MDT0000: Client e5e726f5-0a7d-e402-0f4c-7b4ba1709161 (at JO.BOO.AL.PL@o2ib2) refused reconnection, still busy with 2 active RPCs [633589.047682] Lustre: store1-MDT0000: Client e3b65402-5a14-5932-6e46-c3e70901f1c1 (at JO.BOO.PZ.BIP@o2ib2) refused reconnection, still busy with 1 active RPCs [633589.061936] Lustre: Skipped 2 previous similar messages [633596.483338] Lustre: store1-MDT0000: Client 6bd04180-4e0e-af62-e680-77d6e3b5d5c4 (at JO.BOO.AL.FL@o2ib2) reconnecting [633596.494116] Lustre: Skipped 3 previous similar messages [633602.233884] Lustre: store1-MDT0000: Client 18d50e79-e7ff-2a23-0203-d8ade50c06c4 (at JO.BOO.PI.WA@o2ib2) refused reconnection, still busy with 2 active RPCs [633602.248062] Lustre: Skipped 1 previous similar message [633614.185858] Lustre: store1-MDT0000: Client 13dc96fd-ec54-eff5-5b15-3e6486e67833 (at JO.BOO.IZ.BLF@o2ib2) reconnecting [633614.196704] Lustre: Skipped 2 previous similar messages [633630.052756] Lustre: 18053:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-386), not sending early reply [633630.052760] req@ffff881d67a64850 x1458479041820223/t0(0) o101->4c8862c3-cd29-7073-cc7f-8d955ff73907@JO.BOO.AO.BF@o2ib2:0/0 lens 648/4936 e 1 to 0 dl 1391534977 ref 2 fl Interpret:/0/0 rc 0/0 [633630.082384] Lustre: 18053:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 3 previous similar messages [633639.693692] Lustre: store1-MDT0000: Client 2ab1748d-72c2-ff52-143f-f323a2318337 (at JO.BOO.WL.ZT@o2ib2) refused reconnection, still busy with 1 active RPCs [633639.707866] Lustre: Skipped 2 previous similar messages [633652.040596] Lustre: store1-MDT0000: Client 675d8b23-9bc8-377b-75be-240ab911afc5 (at JO.BOO.IP.WL@o2ib2) reconnecting [633652.051345] Lustre: Skipped 3 previous similar messages [633659.145289] Lustre: DEBUG MARKER: Tue Feb 4 18:30:01 2014 [633659.145291] [633682.819330] Lustre: store1-MDT0000: Client e5e726f5-0a7d-e402-0f4c-7b4ba1709161 (at JO.BOO.AL.PL@o2ib2) refused reconnection, still busy with 2 active RPCs [633682.833538] Lustre: Skipped 5 previous similar messages [633739.492472] Lustre: store1-MDT0000: Client 2ab1748d-72c2-ff52-143f-f323a2318337 (at JO.BOO.WL.ZT@o2ib2) reconnecting [633739.503294] Lustre: Skipped 11 previous similar messages [633751.839199] Lustre: store1-MDT0000: Client 675d8b23-9bc8-377b-75be-240ab911afc5 (at JO.BOO.IP.WL@o2ib2) refused reconnection, still busy with 1 active RPCs [633751.853398] Lustre: Skipped 12 previous similar messages [633882.408913] Lustre: store1-MDT0000: Client e5e726f5-0a7d-e402-0f4c-7b4ba1709161 (at JO.BOO.AL.PL@o2ib2) reconnecting [633882.419680] Lustre: Skipped 23 previous similar messages [633882.425193] Lustre: store1-MDT0000: Client e5e726f5-0a7d-e402-0f4c-7b4ba1709161 (at JO.BOO.AL.PL@o2ib2) refused reconnection, still busy with 2 active RPCs [633882.440697] Lustre: Skipped 19 previous similar messages [633925.472351] Lustre: 18040:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply [633925.472355] req@ffff881a02538c00 x1458479209379728/t0(0) o101->e5e726f5-0a7d-e402-0f4c-7b4ba1709161@JO.BOO.AL.PL@o2ib2:0/0 lens 560/4808 e 0 to 0 dl 1391535273 ref 2 fl Interpret:/0/0 rc 0/0 [633925.501975] Lustre: 18040:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 3 previous similar messages [633958.593639] Lustre: DEBUG MARKER: Tue Feb 4 18:35:01 2014 [633958.593641] [633973.588293] Lustre: Service thread pid 17998 was inactive for 1194.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [633973.605836] Pid: 17998, comm: mdt_45 [633973.609624] [633973.609624] Call Trace: [633973.616663] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [633973.624364] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [633973.632279] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [633973.639885] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [633973.647574] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [633973.654919] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [633973.662838] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [633973.670760] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [633973.678946] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [633973.686980] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [633973.694435] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [633973.701861] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [633973.711003] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [633973.718699] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [633973.726468] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [633973.733900] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [633973.742472] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [633973.749751] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [633973.757967] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [633973.765891] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [633973.774583] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [633973.781735] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [633973.789165] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [633973.797199] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [633973.805075] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [633973.813086] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [633973.821286] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [633973.829739] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [633973.837080] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [633973.845099] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [633973.852970] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [633973.860723] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [633973.867870] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [633973.875597] [<ffffffff8100412a>] child_rip+0xa/0x20 [633973.882115] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [633973.889888] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [633973.897697] [<ffffffff81004120>] ? child_rip+0x0/0x20 [633973.904334] [633973.907347] LustreError: dumping log to /tmp/lustre-log.1391535317.17998 [634017.102383] Lustre: Service thread pid 18070 was inactive for 1200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [634017.119879] Pid: 18070, comm: mdt_117 [634017.123752] [634017.123753] Call Trace: [634017.130776] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [634017.138468] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [634017.146382] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [634017.153980] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [634017.161676] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [634017.169028] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [634017.176900] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [634017.184841] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [634017.193037] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [634017.201070] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [634017.208517] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [634017.215936] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [634017.225046] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [634017.232704] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [634017.240461] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [634017.247905] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [634017.256463] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [634017.263717] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [634017.271879] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [634017.279803] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [634017.288501] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [634017.295688] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [634017.303118] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [634017.311151] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [634017.319033] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [634017.327008] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [634017.335235] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [634017.343660] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [634017.351000] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [634017.359045] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [634017.366984] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [634017.374738] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [634017.381811] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [634017.389544] [<ffffffff8100412a>] child_rip+0xa/0x20 [634017.397353] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [634017.405225] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [634017.412978] [<ffffffff81004120>] ? child_rip+0x0/0x20 [634017.419616] [634017.422638] LustreError: dumping log to /tmp/lustre-log.1391535360.18070 [634019.718299] Pid: 16436, comm: mdt_28 [634019.722036] [634019.722037] Call Trace: [634019.726301] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [634019.732653] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [634019.739322] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [634019.746917] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [634019.754626] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [634019.761963] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [634019.769853] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [634019.777739] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [634019.785958] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [634019.793987] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [634019.801489] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [634019.808985] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [634019.818123] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [634019.825765] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [634019.833615] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [634019.841059] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [634019.849617] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [634019.856880] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [634019.865122] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [634019.873045] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [634019.881760] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [634019.888863] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [634019.896221] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [634019.904326] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [634019.912195] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [634019.920161] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [634019.928463] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [634019.936844] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [634019.944268] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [634019.952314] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [634019.960200] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [634019.967937] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [634019.974985] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [634019.982791] [<ffffffff8100412a>] child_rip+0xa/0x20 [634019.989403] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [634019.997102] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [634020.004912] [<ffffffff81004120>] ? child_rip+0x0/0x20 [634020.011564] [634020.014513] LustreError: dumping log to /tmp/lustre-log.1391535363.16436 [634029.179184] Lustre: 31628:0:(ldlm_lib.c:952:target_handle_connect()) MGS: connection from 504837a1-69df-e6aa-d9dc-333c8877259d@JO.BOO.PT.BO@o2ib8 t0 exp (null) cur 1391535372 last 0 [634029.195719] Lustre: 31628:0:(ldlm_lib.c:952:target_handle_connect()) Skipped 1 previous similar message [634030.890833] Lustre: 16430:0:(ldlm_lib.c:952:target_handle_connect()) store1-MDT0000: connection from eeb59a3a-9b88-2571-381f-49d4b4d75070@JO.BOO.PT.BB@o2ib8 t0 exp (null) cur 1391535374 last 0 [634030.908393] Lustre: 16430:0:(ldlm_lib.c:952:target_handle_connect()) Skipped 1 previous similar message [634086.147798] Lustre: Service thread pid 7699 was inactive for 1200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [634086.165215] Lustre: Skipped 1 previous similar message [634086.170614] Pid: 7699, comm: mdt_17 [634086.175614] [634086.175615] Call Trace: [634086.182647] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [634086.190347] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [634086.198295] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [634086.205895] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [634086.213572] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [634086.220916] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [634086.228780] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [634086.236645] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [634086.244848] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [634086.252889] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [634086.260398] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [634086.267814] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [634086.276949] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [634086.284660] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [634086.292433] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [634086.299863] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [634086.308409] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [634086.315699] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [634086.323897] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [634086.331822] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [634086.340460] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [634086.347579] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [634086.355008] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [634086.363042] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [634086.370908] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [634086.378878] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [634086.387100] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [634086.395628] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [634086.402976] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [634086.411026] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [634086.418953] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [634086.426754] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [634086.433863] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [634086.441644] [<ffffffff8100412a>] child_rip+0xa/0x20 [634086.448199] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [634086.456024] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [634086.463714] [<ffffffff81004120>] ? child_rip+0x0/0x20 [634086.470368] [634086.473393] LustreError: dumping log to /tmp/lustre-log.1391535430.7699 [634096.915546] Pid: 16398, comm: mdt_20 [634096.919330] [634096.919331] Call Trace: [634096.923615] [<ffffffff81486598>] ? schedule_timeout+0x198/0x2d0 [634096.931217] [<ffffffffa075bda0>] ? ldlm_expired_completion_wait+0x0/0x260 [ptlrpc] [634096.940496] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [634096.948350] [<ffffffffa075ff2a>] ldlm_completion_ast+0x52a/0x730 [ptlrpc] [634096.956681] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [634096.964475] [<ffffffffa075f686>] ldlm_cli_enqueue_local+0x1e6/0x560 [ptlrpc] [634096.973278] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [634096.981727] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [634096.989718] [<ffffffffa0d07290>] mdt_object_lock+0x320/0xb70 [mdt] [634096.997556] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [634097.005519] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [634097.013908] [<ffffffffa0d17c92>] mdt_getattr_name_lock+0xe52/0x18b0 [mdt] [634097.022403] [<ffffffffa078611d>] ? lustre_msg_buf+0x5d/0x60 [ptlrpc] [634097.030397] [<ffffffffa07b1596>] ? __req_capsule_get+0x176/0x750 [ptlrpc] [634097.038964] [<ffffffffa07883a4>] ? lustre_msg_get_flags+0x34/0xb0 [ptlrpc] [634097.047464] [<ffffffffa0d18c4d>] mdt_intent_getattr+0x2cd/0x4a0 [mdt] [634097.055655] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [634097.063650] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [634097.072012] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [634097.080581] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [634097.087989] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [634097.096131] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [634097.104044] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [634097.111889] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [634097.119127] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [634097.126937] [<ffffffff8100412a>] child_rip+0xa/0x20 [634097.133544] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [634097.141522] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [634097.149400] [<ffffffff81004120>] ? child_rip+0x0/0x20 [634097.156107] [634097.159221] LustreError: dumping log to /tmp/lustre-log.1391535440.16398 [634138.708427] Lustre: store1-MDT0000: Client 2ab1748d-72c2-ff52-143f-f323a2318337 (at JO.BOO.WL.ZT@o2ib2) reconnecting [634138.719225] Lustre: Skipped 45 previous similar messages [634138.724792] Lustre: store1-MDT0000: Client 2ab1748d-72c2-ff52-143f-f323a2318337 (at JO.BOO.WL.ZT@o2ib2) refused reconnection, still busy with 1 active RPCs [634138.740242] Lustre: Skipped 45 previous similar messages [634174.613000] Lustre: Service thread pid 18018 was inactive for 1200.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [634174.626467] LustreError: dumping log to /tmp/lustre-log.1391535518.18018 [634258.030782] Lustre: DEBUG MARKER: Tue Feb 4 18:40:01 2014 [634258.030784] [634318.321106] Lustre: 17994:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply [634318.321110] req@ffff881d37d0c850 x1458479378419163/t0(0) o101->693cf41f-0ff5-9972-8e01-a0360c9427ab@JO.BOO.AL.FB@o2ib2:0/0 lens 640/4936 e 0 to 0 dl 1391535667 ref 2 fl Interpret:/0/0 rc 0/0 [634318.350761] Lustre: 17994:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message [634374.490746] Lustre: Service thread pid 18068 was inactive for 1200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [634374.508379] Lustre: Skipped 1 previous similar message [634374.513730] Pid: 18068, comm: mdt_115 [634374.518905] [634374.518906] Call Trace: [634374.525910] [<ffffffff81486598>] ? schedule_timeout+0x198/0x2d0 [634374.533502] [<ffffffffa075bda0>] ? ldlm_expired_completion_wait+0x0/0x260 [ptlrpc] [634374.542686] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [634374.550300] [<ffffffffa075ff2a>] ldlm_completion_ast+0x52a/0x730 [ptlrpc] [634374.558664] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [634374.566400] [<ffffffffa075f686>] ldlm_cli_enqueue_local+0x1e6/0x560 [ptlrpc] [634374.575114] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [634374.583430] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [634374.591266] [<ffffffffa0d07290>] mdt_object_lock+0x320/0xb70 [mdt] [634374.599035] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [634374.608259] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [634374.616682] [<ffffffffa0d17c92>] mdt_getattr_name_lock+0xe52/0x18b0 [mdt] [634374.625119] [<ffffffffa078611d>] ? lustre_msg_buf+0x5d/0x60 [ptlrpc] [634374.633107] [<ffffffffa07b1596>] ? __req_capsule_get+0x176/0x750 [ptlrpc] [634374.641542] [<ffffffffa07883a4>] ? lustre_msg_get_flags+0x34/0xb0 [ptlrpc] [634374.649990] [<ffffffffa0d18c4d>] mdt_intent_getattr+0x2cd/0x4a0 [mdt] [634374.658025] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [634374.665992] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [634374.674212] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [634374.682655] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [634374.690004] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [634374.698028] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [634374.705909] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [634374.713661] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [634374.720773] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [634374.728526] [<ffffffff8100412a>] child_rip+0xa/0x20 [634374.735050] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [634374.742793] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [634374.750588] [<ffffffff81004120>] ? child_rip+0x0/0x20 [634374.757240] [634374.760262] LustreError: dumping log to /tmp/lustre-log.1391535718.18068 [634378.632140] Pid: 18042, comm: mdt_89 [634378.635880] [634378.635881] Call Trace: [634378.640176] [<ffffffff811206d9>] ? zone_statistics+0x99/0xc0 [634378.647464] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [634378.655229] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [634378.663158] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [634378.670784] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [634378.678518] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [634378.685887] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [634378.693790] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [634378.701695] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [634378.709926] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [634378.717987] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [634378.725502] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [634378.732939] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [634378.742121] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [634378.749914] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [634378.757726] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [634378.765254] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [634378.773904] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [634378.781260] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [634378.789506] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [634378.797499] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [634378.806275] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [634378.813477] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [634378.820992] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [634378.829064] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [634378.836990] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [634378.845051] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [634378.853298] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [634378.861881] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [634378.869265] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [634378.877426] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [634378.885342] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [634378.893115] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [634378.900263] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [634378.908036] [<ffffffff8100412a>] child_rip+0xa/0x20 [634378.914574] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [634378.922386] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [634378.930230] [<ffffffff81004120>] ? child_rip+0x0/0x20 [634378.936921] [634378.939965] LustreError: dumping log to /tmp/lustre-log.1391535723.18042 [634556.481736] Lustre: DEBUG MARKER: Tue Feb 4 18:45:01 2014 [634556.481738] [634650.072667] Lustre: store1-MDT0000: Client 675d8b23-9bc8-377b-75be-240ab911afc5 (at JO.BOO.IP.WL@o2ib2) reconnecting [634650.083425] Lustre: Skipped 98 previous similar messages [634650.089029] Lustre: store1-MDT0000: Client 675d8b23-9bc8-377b-75be-240ab911afc5 (at JO.BOO.IP.WL@o2ib2) refused reconnection, still busy with 1 active RPCs [634650.104577] Lustre: Skipped 98 previous similar messages [634767.238752] Lustre: Service thread pid 18020 was inactive for 1200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [634767.256262] Lustre: Skipped 1 previous similar message [634767.261610] Pid: 18020, comm: mdt_67 [634767.266717] [634767.266718] Call Trace: [634767.273729] [<ffffffff811206d9>] ? zone_statistics+0x99/0xc0 [634767.281022] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [634767.288723] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [634767.296649] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [634767.304370] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [634767.312281] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [634767.319747] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [634767.327569] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [634767.335564] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [634767.343880] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [634767.352046] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [634767.359550] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [634767.366975] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [634767.376337] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [634767.384212] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [634767.392232] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [634767.399811] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [634767.408363] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [634767.415695] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [634767.424075] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [634767.432135] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [634767.440874] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [634767.447989] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [634767.455491] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [634767.463829] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [634767.471895] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [634767.480022] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [634767.488266] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [634767.496706] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [634767.504185] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [634767.512340] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [634767.520386] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [634767.528135] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [634767.535282] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [634767.543253] [<ffffffff8100412a>] child_rip+0xa/0x20 [634767.549818] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [634767.557607] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [634767.565440] [<ffffffff81004120>] ? child_rip+0x0/0x20 [634767.572314] [634767.575341] LustreError: dumping log to /tmp/lustre-log.1391536112.18020 [634855.918954] Lustre: DEBUG MARKER: Tue Feb 4 18:50:01 2014 [634855.918956] [635155.367828] Lustre: DEBUG MARKER: Tue Feb 4 18:55:01 2014 [635155.367831] [635257.287162] Lustre: store1-MDT0000: Client 2735fb90-b05e-3f85-5a9d-e2389c950e23 (at JO.BOO.PI.FL@o2ib2) reconnecting [635257.297935] Lustre: Skipped 114 previous similar messages [635257.303623] Lustre: store1-MDT0000: Client 2735fb90-b05e-3f85-5a9d-e2389c950e23 (at JO.BOO.PI.FL@o2ib2) refused reconnection, still busy with 1 active RPCs [635257.319291] Lustre: Skipped 114 previous similar messages [635454.804854] Lustre: DEBUG MARKER: Tue Feb 4 19:00:01 2014 [635454.804856] [635754.278645] Lustre: DEBUG MARKER: Tue Feb 4 19:05:01 2014 [635754.278648] [635858.497785] Lustre: store1-MDT0000: Client 8e81a101-3f17-b68e-df3e-cbf2a3e18200 (at VYD.DF.BBO.W@o2ib) reconnecting [635858.508492] Lustre: Skipped 114 previous similar messages [635858.514078] Lustre: store1-MDT0000: Client 8e81a101-3f17-b68e-df3e-cbf2a3e18200 (at VYD.DF.BBO.W@o2ib) refused reconnection, still busy with 1 active RPCs [635858.529477] Lustre: Skipped 114 previous similar messages [636053.715550] Lustre: DEBUG MARKER: Tue Feb 4 19:10:01 2014 [636053.715552] [636353.164287] Lustre: DEBUG MARKER: Tue Feb 4 19:15:01 2014 [636353.164290] [636477.313308] Lustre: store1-MDT0000: Client e5e726f5-0a7d-e402-0f4c-7b4ba1709161 (at JO.BOO.AL.PL@o2ib2) reconnecting [636477.324061] Lustre: Skipped 114 previous similar messages [636477.329656] Lustre: store1-MDT0000: Client e5e726f5-0a7d-e402-0f4c-7b4ba1709161 (at JO.BOO.AL.PL@o2ib2) refused reconnection, still busy with 2 active RPCs [636477.345148] Lustre: Skipped 114 previous similar messages [636652.601209] Lustre: DEBUG MARKER: Tue Feb 4 19:20:01 2014 [636652.601211] [636952.049958] Lustre: DEBUG MARKER: Tue Feb 4 19:25:01 2014 [636952.049960] [637078.743903] Lustre: store1-MDT0000: Client 5db15b6d-96b9-01c3-2be8-f827bbca027e (at JO.BOO.PW.AA@o2ib2) reconnecting [637078.754666] Lustre: Skipped 114 previous similar messages [637078.760260] Lustre: store1-MDT0000: Client 5db15b6d-96b9-01c3-2be8-f827bbca027e (at JO.BOO.PW.AA@o2ib2) refused reconnection, still busy with 1 active RPCs [637078.775745] Lustre: Skipped 114 previous similar messages [637251.487102] Lustre: DEBUG MARKER: Tue Feb 4 19:30:01 2014 [637251.487104] [637550.935936] Lustre: DEBUG MARKER: Tue Feb 4 19:35:01 2014 [637550.935939] [637677.648309] Lustre: store1-MDT0000: Client 1149a764-bf57-dbdd-b71c-b33708ee8303 (at JO.BOO.PI.FW@o2ib2) reconnecting [637677.659073] Lustre: Skipped 114 previous similar messages [637677.664775] Lustre: store1-MDT0000: Client 1149a764-bf57-dbdd-b71c-b33708ee8303 (at JO.BOO.PI.FW@o2ib2) refused reconnection, still busy with 1 active RPCs [637677.680235] Lustre: Skipped 114 previous similar messages [637850.373431] Lustre: DEBUG MARKER: Tue Feb 4 19:40:01 2014 [637850.373432] [637879.405374] LustreError: 26:0:(ldlm_lockd.c:358:waiting_locks_callback()) ### lock callback timer expired after 4228s: evicting client at VYD.DF.BBO.W@o2ib ns: mdt-ffff881be2e73000 lock: ffff881aa2082240/0x1bd0ca3af44e4473 lrc: 3/0,0 mode: PR/PR res: 8589961617/143 bits 0x3 rrc: 2 type: IBT flags: 0x4000020 remote: 0x84274abcffdcbcdf expref: 3206 pid: 18053 timeout: 4358850468 [637954.384558] Lustre: 17994:0:(ldlm_lib.c:952:target_handle_connect()) store1-MDT0000: connection from 8e81a101-3f17-b68e-df3e-cbf2a3e18200@VYD.DF.BBO.W@o2ib t283472694087 exp (null) cur 1391539305 last 0 [638045.139683] Lustre: Service thread pid 16438 was inactive for 800.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [638045.157202] Pid: 16438, comm: mdt_30 [638045.160992] [638045.160992] Call Trace: [638045.168001] [<ffffffff811206d9>] ? zone_statistics+0x99/0xc0 [638045.175282] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [638045.182997] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [638045.190859] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [638045.198450] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [638045.206112] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [638045.213492] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [638045.221375] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [638045.229238] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [638045.237461] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [638045.245576] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [638045.253101] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [638045.260503] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [638045.269705] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [638045.277340] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [638045.285185] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [638045.292621] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [638045.301258] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [638045.308506] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [638045.316687] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [638045.324632] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [638045.333325] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [638045.340428] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [638045.347808] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [638045.355916] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [638045.363811] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [638045.371814] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [638045.380042] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [638045.388487] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [638045.395839] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [638045.403946] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [638045.411815] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [638045.419541] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [638045.426705] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [638045.434522] [<ffffffff8100412a>] child_rip+0xa/0x20 [638045.440998] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [638045.448830] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [638045.456572] [<ffffffff81004120>] ? child_rip+0x0/0x20 [638045.463300] [638045.466189] LustreError: dumping log to /tmp/lustre-log.1391539396.16438 [638149.822538] Lustre: DEBUG MARKER: Tue Feb 4 19:45:01 2014 [638149.822540] [638279.839536] Lustre: store1-MDT0000: Client e3b65402-5a14-5932-6e46-c3e70901f1c1 (at JO.BOO.PZ.BIP@o2ib2) reconnecting [638279.850380] Lustre: Skipped 110 previous similar messages [638279.855995] Lustre: store1-MDT0000: Client e3b65402-5a14-5932-6e46-c3e70901f1c1 (at JO.BOO.PZ.BIP@o2ib2) refused reconnection, still busy with 1 active RPCs [638279.871559] Lustre: Skipped 110 previous similar messages [638403.136310] Lustre: 17994:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-559), not sending early reply [638403.136314] req@ffff881bc91edc00 x1458479596494849/t0(0) o101->1125649f-888f-3681-3678-df6fa83c809a@JO.BOO.PL.BTZ@o2ib2:0/0 lens 656/4936 e 5 to 0 dl 1391539760 ref 2 fl Interpret:/0/0 rc 0/0 [638449.259983] Lustre: DEBUG MARKER: Tue Feb 4 19:50:01 2014 [638449.259985] [638546.324944] Lustre: Service thread pid 20522 was inactive for 800.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [638546.342369] Pid: 20522, comm: mdt_126 [638546.346251] [638546.346252] Call Trace: [638546.353260] [<ffffffff811206d9>] ? zone_statistics+0x99/0xc0 [638546.360546] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [638546.368216] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [638546.376102] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [638546.383705] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [638546.391401] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [638546.398744] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [638546.406605] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [638546.414482] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [638546.422684] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [638546.430722] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [638546.438273] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [638546.445709] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [638546.454768] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [638546.462557] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [638546.470397] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [638546.477843] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [638546.486393] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [638546.493643] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [638546.501885] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [638546.509721] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [638546.518468] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [638546.525539] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [638546.532968] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [638546.541003] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [638546.548865] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [638546.556920] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [638546.565148] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [638546.573659] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [638546.581000] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [638546.588976] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [638546.596940] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [638546.604620] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [638546.611803] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [638546.619589] [<ffffffff8100412a>] child_rip+0xa/0x20 [638546.626089] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [638546.633799] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [638546.641616] [<ffffffff81004120>] ? child_rip+0x0/0x20 [638546.648308] [638546.651266] LustreError: dumping log to /tmp/lustre-log.1391539899.20522 [638623.702964] Lustre: 18040:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-146), not sending early reply [638623.702968] req@ffff8819eda58800 x1458484840594711/t0(0) o101->13abaeb1-1e04-a145-b26f-5d0f29d6d2cc@JO.BOO.AO.BO@o2ib8:0/0 lens 648/4936 e 0 to 0 dl 1391539981 ref 2 fl Interpret:/0/0 rc 0/0 [638718.137424] LustreError: 18015:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391539870, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff881be2e73000 lock: ffff881a7057a000/0x1bd0ca3af45e37f7 lrc: 3/1,0 mode: --/PR res: 8932845821/9714 bits 0x3 rrc: 3 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 18015 timeout: 0 [638727.908224] LustreError: 18076:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391539880, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff881be2e73000 lock: ffff881a2d3e6000/0x1bd0ca3af45e5508 lrc: 3/1,0 mode: --/PR res: 8951399136/1269 bits 0x3 rrc: 4 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 18076 timeout: 0 [638727.943959] LustreError: 18076:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) Skipped 1 previous similar message [638748.708638] Lustre: DEBUG MARKER: Tue Feb 4 19:55:01 2014 [638748.708640] [638882.712590] Lustre: store1-MDT0000: Client 8a3976d1-d35b-31b0-a9b9-8ee8f50a7e65 (at JO.BOO.AO.FW@o2ib2) reconnecting [638882.723354] Lustre: Skipped 115 previous similar messages [638882.728950] Lustre: store1-MDT0000: Client 8a3976d1-d35b-31b0-a9b9-8ee8f50a7e65 (at JO.BOO.AO.FW@o2ib2) refused reconnection, still busy with 1 active RPCs [638882.744506] Lustre: Skipped 115 previous similar messages [639048.145194] Lustre: DEBUG MARKER: Tue Feb 4 20:00:01 2014 [639048.145196] [639071.113967] Lustre: Service thread pid 18023 was inactive for 1194.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [639071.131441] Pid: 18023, comm: mdt_70 [639071.135291] [639071.135292] Call Trace: [639071.142323] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [639071.150112] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [639071.157980] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [639071.165603] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [639071.173315] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [639071.180678] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [639071.188607] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [639071.196438] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [639071.204674] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [639071.212726] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [639071.220191] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [639071.227692] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [639071.236800] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [639071.244546] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [639071.252343] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [639071.259801] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [639071.268386] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [639071.275693] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [639071.283994] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [639071.291896] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [639071.300577] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [639071.307657] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [639071.315122] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [639071.323179] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [639071.331061] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [639071.339042] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [639071.347278] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [639071.355753] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [639071.363129] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [639071.372441] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [639071.379088] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [639071.386982] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [639071.394061] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [639071.401770] [<ffffffff8100412a>] child_rip+0xa/0x20 [639071.408386] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [639071.416178] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [639071.423950] [<ffffffff81004120>] ? child_rip+0x0/0x20 [639071.430546] [639071.433667] LustreError: dumping log to /tmp/lustre-log.1391540424.18023 [639266.410349] Lustre: 18013:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply [639266.410353] req@ffff881bc933b000 x1458751573815148/t0(0) o101->8e81a101-3f17-b68e-df3e-cbf2a3e18200@VYD.DF.BBO.W@o2ib:0/0 lens 544/4808 e 0 to 0 dl 1391540625 ref 2 fl Interpret:/0/0 rc 0/0 [639266.439940] Lustre: 18013:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message [639347.618777] Lustre: DEBUG MARKER: Tue Feb 4 20:05:01 2014 [639347.618779] [639481.680189] Lustre: store1-MDT0000: Client 8e81a101-3f17-b68e-df3e-cbf2a3e18200 (at VYD.DF.BBO.W@o2ib) reconnecting [639481.690869] Lustre: Skipped 129 previous similar messages [639481.696460] Lustre: store1-MDT0000: Client 8e81a101-3f17-b68e-df3e-cbf2a3e18200 (at VYD.DF.BBO.W@o2ib) refused reconnection, still busy with 3 active RPCs [639481.711949] Lustre: Skipped 129 previous similar messages [639647.055882] Lustre: DEBUG MARKER: Tue Feb 4 20:10:01 2014 [639647.055885] [639716.176749] Lustre: Service thread pid 18015 was inactive for 1200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [639716.194282] Pid: 18015, comm: mdt_62 [639716.198209] [639716.198210] Call Trace: [639716.205173] [<ffffffff81486598>] ? schedule_timeout+0x198/0x2d0 [639716.212881] [<ffffffffa075bda0>] ? ldlm_expired_completion_wait+0x0/0x260 [ptlrpc] [639716.222169] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [639716.229627] [<ffffffffa075ff2a>] ldlm_completion_ast+0x52a/0x730 [ptlrpc] [639716.237994] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [639716.245701] [<ffffffffa075f686>] ldlm_cli_enqueue_local+0x1e6/0x560 [ptlrpc] [639716.254427] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [639716.262825] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [639716.270635] [<ffffffffa0d07290>] mdt_object_lock+0x320/0xb70 [mdt] [639716.278463] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [639716.286304] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [639716.294844] [<ffffffffa0d17c92>] mdt_getattr_name_lock+0xe52/0x18b0 [mdt] [639716.303246] [<ffffffffa078611d>] ? lustre_msg_buf+0x5d/0x60 [ptlrpc] [639716.311164] [<ffffffffa07b1596>] ? __req_capsule_get+0x176/0x750 [ptlrpc] [639716.319661] [<ffffffffa07883a4>] ? lustre_msg_get_flags+0x34/0xb0 [ptlrpc] [639716.327968] [<ffffffffa0d18c4d>] mdt_intent_getattr+0x2cd/0x4a0 [mdt] [639716.336017] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [639716.343952] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [639716.352292] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [639716.360725] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [639716.368042] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [639716.376032] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [639716.384097] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [639716.391772] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [639716.398837] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [639716.406768] [<ffffffff8100412a>] child_rip+0xa/0x20 [639716.413089] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [639716.421032] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [639716.428664] [<ffffffff81004120>] ? child_rip+0x0/0x20 [639716.435271] [639716.438317] LustreError: dumping log to /tmp/lustre-log.1391541071.18015 [639716.447150] Pid: 17994, comm: mdt_41 [639716.451644] [639716.451644] Call Trace: [639716.458888] [<ffffffff81486598>] ? schedule_timeout+0x198/0x2d0 [639716.466356] [<ffffffffa075bda0>] ? ldlm_expired_completion_wait+0x0/0x260 [ptlrpc] [639716.475715] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [639716.483328] [<ffffffffa075ff2a>] ldlm_completion_ast+0x52a/0x730 [ptlrpc] [639716.491659] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [639716.499490] [<ffffffffa075f686>] ldlm_cli_enqueue_local+0x1e6/0x560 [ptlrpc] [639716.508050] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [639716.516317] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [639716.524398] [<ffffffffa0d07290>] mdt_object_lock+0x320/0xb70 [mdt] [639716.532135] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [639716.540059] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [639716.548400] [<ffffffffa0d17c92>] mdt_getattr_name_lock+0xe52/0x18b0 [mdt] [639716.556852] [<ffffffffa078611d>] ? lustre_msg_buf+0x5d/0x60 [ptlrpc] [639716.564562] [<ffffffffa07b1596>] ? __req_capsule_get+0x176/0x750 [ptlrpc] [639716.573180] [<ffffffffa07883a4>] ? lustre_msg_get_flags+0x34/0xb0 [ptlrpc] [639716.581616] [<ffffffffa0d18c4d>] mdt_intent_getattr+0x2cd/0x4a0 [mdt] [639716.589656] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [639716.597579] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [639716.605596] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [639716.614280] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [639716.621672] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [639716.629655] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [639716.637508] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [639716.645063] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [639716.652402] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [639716.660155] [<ffffffff8100412a>] child_rip+0xa/0x20 [639716.666711] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [639716.674218] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [639716.682229] [<ffffffff81004120>] ? child_rip+0x0/0x20 [639716.688810] [639725.947543] Pid: 18076, comm: mdt_123 [639725.951434] [639725.951435] Call Trace: [639725.955673] [<ffffffff81486598>] ? schedule_timeout+0x198/0x2d0 [639725.961935] [<ffffffffa075bda0>] ? ldlm_expired_completion_wait+0x0/0x260 [ptlrpc] [639725.969933] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [639725.977535] [<ffffffffa075ff2a>] ldlm_completion_ast+0x52a/0x730 [ptlrpc] [639725.985990] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [639725.993628] [<ffffffffa075f686>] ldlm_cli_enqueue_local+0x1e6/0x560 [ptlrpc] [639726.002388] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [639726.010656] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [639726.018504] [<ffffffffa0d07290>] mdt_object_lock+0x320/0xb70 [mdt] [639726.026269] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [639726.034179] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [639726.042556] [<ffffffffa0d17c92>] mdt_getattr_name_lock+0xe52/0x18b0 [mdt] [639726.050957] [<ffffffffa078611d>] ? lustre_msg_buf+0x5d/0x60 [ptlrpc] [639726.058910] [<ffffffffa07b1596>] ? __req_capsule_get+0x176/0x750 [ptlrpc] [639726.067282] [<ffffffffa07883a4>] ? lustre_msg_get_flags+0x34/0xb0 [ptlrpc] [639726.075816] [<ffffffffa0d18c4d>] mdt_intent_getattr+0x2cd/0x4a0 [mdt] [639726.083778] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [639726.091749] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [639726.100014] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [639726.108447] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [639726.115785] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [639726.123847] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [639726.131742] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [639726.139484] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [639726.146598] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [639726.154433] [<ffffffff8100412a>] child_rip+0xa/0x20 [639726.160888] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [639726.168665] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [639726.176404] [<ffffffff81004120>] ? child_rip+0x0/0x20 [639726.183274] [639726.186106] LustreError: dumping log to /tmp/lustre-log.1391541080.18076 [639834.534224] Pid: 18040, comm: mdt_87 [639834.538023] [639834.538024] Call Trace: [639834.542299] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [639834.548797] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [639834.555580] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [639834.563136] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [639834.570966] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [639834.578331] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [639834.586231] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [639834.594137] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [639834.602294] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [639834.610429] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [639834.617889] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [639834.625370] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [639834.634490] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [639834.642177] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [639834.650112] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [639834.657541] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [639834.666186] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [639834.673390] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [639834.681691] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [639834.689628] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [639834.698381] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [639834.705517] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [639834.712915] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [639834.721053] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [639834.728959] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [639834.736974] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [639834.745208] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [639834.753693] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [639834.761050] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [639834.769123] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [639834.777025] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [639834.784851] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [639834.791932] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [639834.799814] [<ffffffff8100412a>] child_rip+0xa/0x20 [639834.806391] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [639834.814244] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [639834.821932] [<ffffffff81004120>] ? child_rip+0x0/0x20 [639834.828680] [639834.832686] LustreError: dumping log to /tmp/lustre-log.1391541189.18040 [639946.504294] Lustre: DEBUG MARKER: Tue Feb 4 20:15:01 2014 [639946.504296] [639951.225793] Pid: 18066, comm: mdt_113 [639951.229740] [639951.229741] Call Trace: [639951.233998] [<ffffffff811206d9>] ? zone_statistics+0x99/0xc0 [639951.240198] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [639951.247006] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [639951.254887] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [639951.262662] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [639951.270361] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [639951.277933] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [639951.285982] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [639951.293847] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [639951.302373] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [639951.310672] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [639951.318412] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [639951.326062] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [639951.335270] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [639951.343188] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [639951.351289] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [639951.359009] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [639951.367935] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [639951.375268] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [639951.383644] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [639951.391802] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [639951.400753] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [639951.408112] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [639951.415578] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [639951.423662] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [639951.431811] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [639951.439808] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [639951.448027] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [639951.456859] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [639951.464117] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [639951.472433] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [639951.480606] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [639951.488715] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [639951.496146] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [639951.503850] [<ffffffff8100412a>] child_rip+0xa/0x20 [639951.510589] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [639951.518647] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [639951.526724] [<ffffffff81004120>] ? child_rip+0x0/0x20 [639951.533346] [639951.536580] LustreError: dumping log to /tmp/lustre-log.1391541306.18066 [640087.326828] Lustre: store1-MDT0000: Client 4c8862c3-cd29-7073-cc7f-8d955ff73907 (at JO.BOO.AO.BF@o2ib2) reconnecting [640087.337654] Lustre: Skipped 139 previous similar messages [640087.343275] Lustre: store1-MDT0000: Client 4c8862c3-cd29-7073-cc7f-8d955ff73907 (at JO.BOO.AO.BF@o2ib2) refused reconnection, still busy with 1 active RPCs [640087.358777] Lustre: Skipped 139 previous similar messages [640245.941349] Lustre: DEBUG MARKER: Tue Feb 4 20:20:01 2014 [640245.941351] [640352.257184] LustreError: 18010:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391541508, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff881be2e73000 lock: ffff881b974ae000/0x1bd0ca3af45f89da lrc: 3/1,0 mode: --/PR res: 8952384457/22571 bits 0x3 rrc: 4 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 18010 timeout: 0 [640545.391108] Lustre: DEBUG MARKER: Tue Feb 4 20:25:01 2014 [640545.391110] [640629.003435] LustreError: 0:0:(ldlm_lockd.c:358:waiting_locks_callback()) ### lock callback timer expired after 1583s: evicting client at VYD.DF.BBO.W@o2ib ns: mdt-ffff881be2e73000 lock: ffff8819ed0f4d80/0x1bd0ca3af45edaa9 lrc: 3/0,0 mode: PR/PR res: 8589961617/143 bits 0x3 rrc: 3 type: IBT flags: 0x4000020 remote: 0x84274abcffeb9f01 expref: 3212 pid: 18064 timeout: 4359125904 [640629.037160] LustreError: 0:0:(ldlm_lockd.c:358:waiting_locks_callback()) ### lock callback timer expired after 2755s: evicting client at JO.BOO.AO.BO@o2ib8 ns: mdt-ffff881be2e73000 lock: ffff881b7def3d80/0x1bd0ca3af4522f2c lrc: 3/0,0 mode: PR/PR res: 8589961617/143 bits 0x3 rrc: 3 type: IBT flags: 0x4000020 remote: 0xcf8cf76f62c63077 expref: 12 pid: 18023 timeout: 4359125904 [640635.080644] Lustre: 5462:0:(ldlm_lib.c:952:target_handle_connect()) store1-MDT0000: connection from 13abaeb1-1e04-a145-b26f-5d0f29d6d2cc@JO.BOO.AO.BO@o2ib8 t0 exp (null) cur 1391541991 last 0 [640679.331163] Lustre: 18032:0:(ldlm_lib.c:952:target_handle_connect()) store1-MDT0000: connection from 8e81a101-3f17-b68e-df3e-cbf2a3e18200@VYD.DF.BBO.W@o2ib t283472694087 exp (null) cur 1391542035 last 0 [640686.180486] Lustre: store1-MDT0000: Client 4c8862c3-cd29-7073-cc7f-8d955ff73907 (at JO.BOO.AO.BF@o2ib2) reconnecting [640686.191271] Lustre: Skipped 123 previous similar messages [640686.196862] Lustre: store1-MDT0000: Client 4c8862c3-cd29-7073-cc7f-8d955ff73907 (at JO.BOO.AO.BF@o2ib2) refused reconnection, still busy with 1 active RPCs [640686.212380] Lustre: Skipped 123 previous similar messages [640722.766507] Lustre: MGS: haven't heard from client b9f9c88e-6ca9-0ab5-5dcb-aead6513ca41 (at JO.BOO.AL.FB@o2ib2) in 902 seconds. I think it's dead, and I am evicting it. exp ffff8819df5f3000, cur 1391542079 expire 1391541479 last 1391541177 [640789.687755] LustreError: 0:0:(ldlm_lockd.c:358:waiting_locks_callback()) ### lock callback timer expired after 15032s: evicting client at JO.BOO.AO.WW@o2ib2 ns: mdt-ffff881be2e73000 lock: ffff88082f4756c0/0x1bd0ca3af350df1f lrc: 3/0,0 mode: PR/PR res: 8952604482/2755 bits 0x3 rrc: 2 type: IBT flags: 0x4000020 remote: 0x5bcece9813564154 expref: 583 pid: 18004 timeout: 4359142053 [640814.572851] Lustre: 18038:0:(ldlm_lib.c:952:target_handle_connect()) store1-MDT0000: connection from 2fd75f33-abb1-54d3-a78e-b54fab31bf48@JO.BOO.AO.WW@o2ib2 t283472590519 exp (null) cur 1391542171 last 0 [640844.827603] Lustre: DEBUG MARKER: Tue Feb 4 20:30:01 2014 [640844.827605] [640934.573155] LustreError: 18016:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391541991, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff881be2e73000 lock: ffff881b8dec86c0/0x1bd0ca3af45f98f2 lrc: 3/1,0 mode: --/PR res: 8952384136/21 bits 0x3 rrc: 4 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 18016 timeout: 0 [640951.080716] Lustre: Service thread pid 18010 was inactive for 800.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [640951.098180] Lustre: Skipped 4 previous similar messages [640951.103696] Pid: 18010, comm: mdt_57 [640951.108716] [640951.108717] Call Trace: [640951.115912] [<ffffffffa075bda0>] ? ldlm_expired_completion_wait+0x0/0x260 [ptlrpc] [640951.125109] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [640951.132782] [<ffffffffa075ff2a>] ldlm_completion_ast+0x52a/0x730 [ptlrpc] [640951.141219] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [640951.148775] [<ffffffffa075f686>] ldlm_cli_enqueue_local+0x1e6/0x560 [ptlrpc] [640951.157575] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [640951.165935] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [640951.173827] [<ffffffffa0d07290>] mdt_object_lock+0x320/0xb70 [mdt] [640951.181643] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [640951.189454] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [640951.197954] [<ffffffffa0d17c92>] mdt_getattr_name_lock+0xe52/0x18b0 [mdt] [640951.206402] [<ffffffffa078611d>] ? lustre_msg_buf+0x5d/0x60 [ptlrpc] [640951.214375] [<ffffffffa07b1596>] ? __req_capsule_get+0x176/0x750 [ptlrpc] [640951.222733] [<ffffffffa07883a4>] ? lustre_msg_get_flags+0x34/0xb0 [ptlrpc] [640951.231196] [<ffffffffa0d18c4d>] mdt_intent_getattr+0x2cd/0x4a0 [mdt] [640951.239158] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [640951.247275] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [640951.255462] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [640951.263983] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [640951.271370] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [640951.279344] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [640951.287272] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [640951.295070] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [640951.302218] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [640951.309957] [<ffffffff8100412a>] child_rip+0xa/0x20 [640951.316488] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [640951.324258] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [640951.331992] [<ffffffff81004120>] ? child_rip+0x0/0x20 [640951.338645] [640951.341761] LustreError: dumping log to /tmp/lustre-log.1391542308.18010 [640994.934574] LustreError: 18041:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391542152, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff881be2e73000 lock: ffff881a5b7a0000/0x1bd0ca3af45fab9f lrc: 3/1,0 mode: --/PR res: 8949924332/3357 bits 0x3 rrc: 4 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 18041 timeout: 0 [640994.970382] LustreError: 18041:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) Skipped 1 previous similar message [641014.296536] LustreError: 18038:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391542171, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff881be2e73000 lock: ffff88170d98fb40/0x1bd0ca3af45fce4b lrc: 3/1,0 mode: --/PR res: 8952384734/771 bits 0x3 rrc: 4 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 18038 timeout: 0 [641144.275959] Lustre: DEBUG MARKER: Tue Feb 4 20:35:01 2014 [641144.275961] [641287.225379] Lustre: store1-MDT0000: Client 18d50e79-e7ff-2a23-0203-d8ade50c06c4 (at JO.BOO.PI.WA@o2ib2) reconnecting [641287.236142] Lustre: Skipped 109 previous similar messages [641287.241747] Lustre: store1-MDT0000: Client 18d50e79-e7ff-2a23-0203-d8ade50c06c4 (at JO.BOO.PI.WA@o2ib2) refused reconnection, still busy with 2 active RPCs [641287.257444] Lustre: Skipped 109 previous similar messages [641309.536536] Lustre: 18052:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-559), not sending early reply [641309.536539] req@ffff8819ee040c00 x1458479249836811/t0(0) o101->fa289a64-81d7-ab30-f92c-63cc786af9c0@JO.BOO.AL.PW@o2ib2:0/0 lens 560/4808 e 5 to 0 dl 1391542672 ref 2 fl Interpret:/0/0 rc 0/0 [641309.566191] Lustre: 18052:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 4 previous similar messages [641427.514719] Lustre: Service thread pid 18064 was inactive for 800.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [641427.532099] Pid: 18064, comm: mdt_111 [641427.536007] [641427.536008] Call Trace: [641427.544173] [<ffffffff8119605f>] ? __find_get_block_slow+0xaf/0x130 [641427.550971] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [641427.558695] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [641427.566666] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [641427.574231] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [641427.581962] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [641427.589308] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [641427.597192] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [641427.605123] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [641427.613341] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [641427.621400] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [641427.628881] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [641427.636266] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [641427.645390] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [641427.653118] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [641427.660893] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [641427.668419] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [641427.676981] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [641427.684247] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [641427.692495] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [641427.700481] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [641427.709147] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [641427.716273] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [641427.723689] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [641427.731756] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [641427.739617] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [641427.747638] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [641427.755870] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [641427.764291] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [641427.771664] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [641427.779781] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [641427.787750] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [641427.795526] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [641427.802639] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [641427.810463] [<ffffffff8100412a>] child_rip+0xa/0x20 [641427.817015] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [641427.824716] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [641427.832401] [<ffffffff81004120>] ? child_rip+0x0/0x20 [641427.839054] [641427.842033] LustreError: dumping log to /tmp/lustre-log.1391542785.18064 [641433.552924] Pid: 5462, comm: mdt_03 [641433.556582] [641433.556583] Call Trace: [641433.562274] [<ffffffff811206d9>] ? zone_statistics+0x99/0xc0 [641433.569513] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [641433.577246] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [641433.585193] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [641433.592801] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [641433.600538] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [641433.607926] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [641433.615831] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [641433.623740] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [641433.631935] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [641433.639977] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [641433.647424] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [641433.654894] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [641433.664130] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [641433.671822] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [641433.679619] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [641433.687067] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [641433.695675] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [641433.702989] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [641433.711187] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [641433.719122] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [641433.727765] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [641433.734915] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [641433.742399] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [641433.750392] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [641433.758306] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [641433.766333] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [641433.774636] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [641433.783141] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [641433.790410] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [641433.798577] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [641433.806496] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [641433.814226] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [641433.821343] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [641433.829093] [<ffffffff8100412a>] child_rip+0xa/0x20 [641433.835670] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [641433.843482] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [641433.851154] [<ffffffff81004120>] ? child_rip+0x0/0x20 [641433.857874] [641433.860825] LustreError: dumping log to /tmp/lustre-log.1391542791.5462 [641433.869747] Pid: 18016, comm: mdt_63 [641433.874181] [641433.874182] Call Trace: [641433.881201] [<ffffffff81486598>] ? schedule_timeout+0x198/0x2d0 [641433.888890] [<ffffffffa075bda0>] ? ldlm_expired_completion_wait+0x0/0x260 [ptlrpc] [641433.898134] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [641433.905826] [<ffffffffa075ff2a>] ldlm_completion_ast+0x52a/0x730 [ptlrpc] [641433.914262] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [641433.921989] [<ffffffffa075f686>] ldlm_cli_enqueue_local+0x1e6/0x560 [ptlrpc] [641433.930711] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [641433.939037] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [641433.946930] [<ffffffffa0d07290>] mdt_object_lock+0x320/0xb70 [mdt] [641433.954753] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [641433.962686] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [641433.971040] [<ffffffffa0d17c92>] mdt_getattr_name_lock+0xe52/0x18b0 [mdt] [641433.979484] [<ffffffffa078611d>] ? lustre_msg_buf+0x5d/0x60 [ptlrpc] [641433.987478] [<ffffffffa07b1596>] ? __req_capsule_get+0x176/0x750 [ptlrpc] [641433.995908] [<ffffffffa07883a4>] ? lustre_msg_get_flags+0x34/0xb0 [ptlrpc] [641434.004396] [<ffffffffa0d18c4d>] mdt_intent_getattr+0x2cd/0x4a0 [mdt] [641434.012515] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [641434.020414] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [641434.028698] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [641434.037173] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [641434.044574] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [641434.052653] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [641434.060530] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [641434.068327] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [641434.075491] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [641434.083273] [<ffffffff8100412a>] child_rip+0xa/0x20 [641434.089782] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [641434.097616] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [641434.105404] [<ffffffff81004120>] ? child_rip+0x0/0x20 [641434.112075] [641443.713003] Lustre: DEBUG MARKER: Tue Feb 4 20:40:01 2014 [641443.713005] [641540.083617] Lustre: 18013:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-146), not sending early reply [641540.083621] req@ffff8819ecd46400 x1458751573825958/t0(0) o101->8e81a101-3f17-b68e-df3e-cbf2a3e18200@VYD.DF.BBO.W@o2ib:0/0 lens 544/4808 e 0 to 0 dl 1391542903 ref 2 fl Interpret:/0/0 rc 0/0 [641743.172256] Lustre: DEBUG MARKER: Tue Feb 4 20:45:01 2014 [641743.172258] [641807.618045] Lustre: 1356:0:(ldlm_lib.c:952:target_handle_connect()) MGS: connection from f3ce3ee7-e532-e3d4-eb30-5cd162343de7@JO.BOO.AL.FL@o2ib2 t0 exp (null) cur 1391543166 last 0 [641807.907238] Lustre: 18039:0:(ldlm_lib.c:952:target_handle_connect()) store1-MDT0000: connection from 15341dcf-bce6-041e-b8a9-0f71609e0acf@JO.BOO.AL.FL@o2ib2 t0 exp (null) cur 1391543166 last 0 [641890.818078] Lustre: store1-MDT0000: Client fa1ca0cd-dbb7-2488-9baf-a2b84717a6c4 (at JO.BOO.PI.WF@o2ib2) reconnecting [641890.828855] Lustre: Skipped 125 previous similar messages [641890.834517] Lustre: store1-MDT0000: Client fa1ca0cd-dbb7-2488-9baf-a2b84717a6c4 (at JO.BOO.PI.WF@o2ib2) refused reconnection, still busy with 1 active RPCs [641890.850074] Lustre: Skipped 125 previous similar messages [641910.841671] Lustre: 31628:0:(ldlm_lib.c:952:target_handle_connect()) MGS: connection from c3e21cdc-5c14-c1b6-48c2-146cf299df46@JO.BOO.AL.PL@o2ib2 t0 exp (null) cur 1391543269 last 0 [641986.985635] Lustre: Service thread pid 18041 was inactive for 1194.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [641987.003134] Lustre: Skipped 2 previous similar messages [641987.008562] Pid: 18041, comm: mdt_88 [641987.013671] [641987.013672] Call Trace: [641987.019325] [<ffffffff81486598>] ? schedule_timeout+0x198/0x2d0 [641987.026965] [<ffffffffa075bda0>] ? ldlm_expired_completion_wait+0x0/0x260 [ptlrpc] [641987.036141] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [641987.043860] [<ffffffffa075ff2a>] ldlm_completion_ast+0x52a/0x730 [ptlrpc] [641987.052253] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [641987.060006] [<ffffffffa075f686>] ldlm_cli_enqueue_local+0x1e6/0x560 [ptlrpc] [641987.068823] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [641987.077248] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [641987.085118] [<ffffffffa0d07290>] mdt_object_lock+0x320/0xb70 [mdt] [641987.092924] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [641987.100841] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [641987.109267] [<ffffffffa0d17c92>] mdt_getattr_name_lock+0xe52/0x18b0 [mdt] [641987.117698] [<ffffffffa078611d>] ? lustre_msg_buf+0x5d/0x60 [ptlrpc] [641987.125678] [<ffffffffa07b1596>] ? __req_capsule_get+0x176/0x750 [ptlrpc] [641987.134083] [<ffffffffa07883a4>] ? lustre_msg_get_flags+0x34/0xb0 [ptlrpc] [641987.142554] [<ffffffffa0d18c4d>] mdt_intent_getattr+0x2cd/0x4a0 [mdt] [641987.150627] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [641987.158634] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [641987.166897] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [641987.175388] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [641987.182771] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [641987.190843] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [641987.198767] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [641987.206539] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [641987.213683] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [641987.222688] [<ffffffff8100412a>] child_rip+0xa/0x20 [641987.229231] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [641987.237037] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [641987.244803] [<ffffffff81004120>] ? child_rip+0x0/0x20 [641987.251482] [641987.254511] LustreError: dumping log to /tmp/lustre-log.1391543346.18041 [641987.263418] Pid: 18032, comm: mdt_79 [641987.267937] [641987.267938] Call Trace: [641987.273577] [<ffffffff81486598>] ? schedule_timeout+0x198/0x2d0 [641987.281229] [<ffffffffa075bda0>] ? ldlm_expired_completion_wait+0x0/0x260 [ptlrpc] [641987.290580] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [641987.298427] [<ffffffffa075ff2a>] ldlm_completion_ast+0x52a/0x730 [ptlrpc] [641987.306908] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [641987.314657] [<ffffffffa075f686>] ldlm_cli_enqueue_local+0x1e6/0x560 [ptlrpc] [641987.323400] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [641987.331890] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [641987.339881] [<ffffffffa0d07290>] mdt_object_lock+0x320/0xb70 [mdt] [641987.347796] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [641987.355761] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [641987.364201] [<ffffffffa0d17c92>] mdt_getattr_name_lock+0xe52/0x18b0 [mdt] [641987.372703] [<ffffffffa078611d>] ? lustre_msg_buf+0x5d/0x60 [ptlrpc] [641987.380806] [<ffffffffa07b1596>] ? __req_capsule_get+0x176/0x750 [ptlrpc] [641987.389287] [<ffffffffa07883a4>] ? lustre_msg_get_flags+0x34/0xb0 [ptlrpc] [641987.397865] [<ffffffffa0d18c4d>] mdt_intent_getattr+0x2cd/0x4a0 [mdt] [641987.406017] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [641987.414012] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [641987.422261] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [641987.430833] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [641987.438304] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [641987.446465] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [641987.454381] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [641987.462157] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [641987.469390] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [641987.477261] [<ffffffff8100412a>] child_rip+0xa/0x20 [641987.483807] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [641987.491613] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [641987.499506] [<ffffffff81004120>] ? child_rip+0x0/0x20 [641987.506244] [642006.347600] Pid: 18038, comm: mdt_85 [642006.351348] [642006.351349] Call Trace: [642006.357058] [<ffffffff81486598>] ? schedule_timeout+0x198/0x2d0 [642006.364681] [<ffffffffa075bda0>] ? ldlm_expired_completion_wait+0x0/0x260 [ptlrpc] [642006.373855] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [642006.381502] [<ffffffffa075ff2a>] ldlm_completion_ast+0x52a/0x730 [ptlrpc] [642006.389853] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [642006.397566] [<ffffffffa075f686>] ldlm_cli_enqueue_local+0x1e6/0x560 [ptlrpc] [642006.406296] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [642006.414620] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [642006.422464] [<ffffffffa0d07290>] mdt_object_lock+0x320/0xb70 [mdt] [642006.430251] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [642006.438127] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [642006.446526] [<ffffffffa0d17c92>] mdt_getattr_name_lock+0xe52/0x18b0 [mdt] [642006.454933] [<ffffffffa078611d>] ? lustre_msg_buf+0x5d/0x60 [ptlrpc] [642006.462917] [<ffffffffa07b1596>] ? __req_capsule_get+0x176/0x750 [ptlrpc] [642006.471323] [<ffffffffa07883a4>] ? lustre_msg_get_flags+0x34/0xb0 [ptlrpc] [642006.479768] [<ffffffffa0d18c4d>] mdt_intent_getattr+0x2cd/0x4a0 [mdt] [642006.487800] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [642006.495772] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [642006.503992] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [642006.512450] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [642006.519814] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [642006.527885] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [642006.535770] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [642006.543526] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [642006.550641] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [642006.558374] [<ffffffff8100412a>] child_rip+0xa/0x20 [642006.564885] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [642006.572670] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [642006.580467] [<ffffffff81004120>] ? child_rip+0x0/0x20 [642006.587109] [642006.590133] LustreError: dumping log to /tmp/lustre-log.1391543365.18038 [642008.333688] Pid: 18062, comm: mdt_109 [642008.337514] [642008.337515] Call Trace: [642008.343225] [<ffffffff811206d9>] ? zone_statistics+0x99/0xc0 [642008.350524] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [642008.358274] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [642008.366195] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [642008.373788] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [642008.381492] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [642008.388844] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [642008.396749] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [642008.404614] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [642008.412815] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [642008.420856] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [642008.428347] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [642008.435795] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [642008.444942] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [642008.452639] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [642008.460420] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [642008.467905] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [642008.476462] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [642008.483717] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [642008.491940] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [642008.499830] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [642008.508520] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [642008.515622] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [642008.523054] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [642008.531120] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [642008.538991] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [642008.546993] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [642008.555173] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [642008.563638] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [642008.571018] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [642008.579064] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [642008.586946] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [642008.594693] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [642008.601809] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [642008.609588] [<ffffffff8100412a>] child_rip+0xa/0x20 [642008.616095] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [642008.623884] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [642008.631642] [<ffffffff81004120>] ? child_rip+0x0/0x20 [642008.638327] [642008.641346] LustreError: dumping log to /tmp/lustre-log.1391543367.18062 [642042.608640] Lustre: DEBUG MARKER: Tue Feb 4 20:50:01 2014 [642042.608642] [642342.057541] Lustre: DEBUG MARKER: Tue Feb 4 20:55:01 2014 [642342.057543] [642490.560696] Lustre: store1-MDT0000: Client a2ab85c1-cfa5-4978-d0ba-45042f8f50fb (at JO.BOO.PI.TB@o2ib2) reconnecting [642490.571462] Lustre: Skipped 138 previous similar messages [642490.577073] Lustre: store1-MDT0000: Client a2ab85c1-cfa5-4978-d0ba-45042f8f50fb (at JO.BOO.PI.TB@o2ib2) refused reconnection, still busy with 1 active RPCs [642490.592589] Lustre: Skipped 138 previous similar messages [642641.493724] Lustre: DEBUG MARKER: Tue Feb 4 21:00:01 2014 [642641.493725] [642940.967293] Lustre: DEBUG MARKER: Tue Feb 4 21:05:01 2014 [642940.967295] [643093.179300] Lustre: store1-MDT0000: Client e0d548f2-83b6-24ec-ae3d-6b6344ac6d94 (at JO.BOO.PI.WF@o2ib8) reconnecting [643093.190062] Lustre: Skipped 138 previous similar messages [643093.195706] Lustre: store1-MDT0000: Client e0d548f2-83b6-24ec-ae3d-6b6344ac6d94 (at JO.BOO.PI.WF@o2ib8) refused reconnection, still busy with 1 active RPCs [643093.211198] Lustre: Skipped 138 previous similar messages [643240.403482] Lustre: DEBUG MARKER: Tue Feb 4 21:10:01 2014 [643240.403484] [643538.860222] Lustre: DEBUG MARKER: Tue Feb 4 21:15:01 2014 [643538.860224] [643692.007554] Lustre: store1-MDT0000: Client e0d548f2-83b6-24ec-ae3d-6b6344ac6d94 (at JO.BOO.PI.WF@o2ib8) reconnecting [643692.018313] Lustre: Skipped 137 previous similar messages [643694.397978] Lustre: store1-MDT0000: Client 13dc96fd-ec54-eff5-5b15-3e6486e67833 (at JO.BOO.IZ.BLF@o2ib2) refused reconnection, still busy with 1 active RPCs [643694.412261] Lustre: Skipped 138 previous similar messages [643838.888402] Lustre: DEBUG MARKER: Tue Feb 4 21:20:01 2014 [643838.888405] [643870.079040] Lustre: 1356:0:(ldlm_lib.c:952:target_handle_connect()) MGS: connection from fe219c6c-d540-8606-7c8b-988728ce190b@JO.BOO.AL.FB@o2ib2 t0 exp (null) cur 1391545232 last 0 [643870.095418] Lustre: 1356:0:(ldlm_lib.c:952:target_handle_connect()) Skipped 1 previous similar message [643870.399045] Lustre: 18017:0:(ldlm_lib.c:952:target_handle_connect()) store1-MDT0000: connection from c4dcbec1-4ccf-a60d-580a-b9182019a383@JO.BOO.AL.FB@o2ib2 t0 exp (null) cur 1391545233 last 0 [644138.336659] Lustre: DEBUG MARKER: Tue Feb 4 21:25:01 2014 [644138.336661] [644255.472507] Lustre: 31627:0:(ldlm_lib.c:952:target_handle_connect()) MGS: connection from 2f2339c9-621f-28ba-e7c5-6e789d4071d2@JO.BOO.AL.PI@o2ib2 t0 exp (null) cur 1391545619 last 0 [644293.221688] Lustre: store1-MDT0000: Client 13dc96fd-ec54-eff5-5b15-3e6486e67833 (at JO.BOO.IZ.BLF@o2ib2) reconnecting [644293.232598] Lustre: Skipped 138 previous similar messages [644293.238212] Lustre: store1-MDT0000: Client 13dc96fd-ec54-eff5-5b15-3e6486e67833 (at JO.BOO.IZ.BLF@o2ib2) refused reconnection, still busy with 1 active RPCs [644293.253783] Lustre: Skipped 137 previous similar messages [644329.144167] Lustre: 18017:0:(ldlm_lib.c:952:target_handle_connect()) store1-MDT0000: connection from 9464fe0f-ac73-1aba-8622-b071ffdde55c@JO.BOO.AL.PI@o2ib2 t0 exp (null) cur 1391545692 last 0 [644437.773645] Lustre: DEBUG MARKER: Tue Feb 4 21:30:01 2014 [644437.773647] [644737.222794] Lustre: DEBUG MARKER: Tue Feb 4 21:35:01 2014 [644737.222796] [644771.086687] Lustre: Service thread pid 3706 was inactive for 800.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [644771.104025] Lustre: Skipped 3 previous similar messages [644771.109472] Pid: 3706, comm: mdt_11 [644771.114476] [644771.114477] Call Trace: [644771.121479] [<ffffffff811206d9>] ? zone_statistics+0x99/0xc0 [644771.128760] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [644771.136459] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [644771.144370] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [644771.151975] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [644771.159678] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [644771.167023] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [644771.174890] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [644771.182761] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [644771.190963] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [644771.199005] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [644771.206499] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [644771.213875] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [644771.223033] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [644771.230732] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [644771.238549] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [644771.245975] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [644771.254550] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [644771.261828] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [644771.270065] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [644771.278009] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [644771.286714] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [644771.293822] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [644771.301252] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [644771.309286] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [644771.317190] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [644771.325179] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [644771.333406] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [644771.341839] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [644771.349198] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [644771.357228] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [644771.365172] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [644771.372930] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [644771.380032] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [644771.387778] [<ffffffff8100412a>] child_rip+0xa/0x20 [644771.394290] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [644771.402103] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [644771.409860] [<ffffffff81004120>] ? child_rip+0x0/0x20 [644771.416513] [644771.419536] LustreError: dumping log to /tmp/lustre-log.1391546136.3706 [644917.543192] Lustre: store1-MDT0000: Client 2ab1748d-72c2-ff52-143f-f323a2318337 (at JO.BOO.WL.ZT@o2ib2) reconnecting [644917.553993] Lustre: Skipped 138 previous similar messages [644917.559642] Lustre: store1-MDT0000: Client 2ab1748d-72c2-ff52-143f-f323a2318337 (at JO.BOO.WL.ZT@o2ib2) refused reconnection, still busy with 1 active RPCs [644917.575084] Lustre: Skipped 138 previous similar messages [645036.660141] Lustre: DEBUG MARKER: Tue Feb 4 21:40:01 2014 [645036.660143] [645128.742926] Lustre: 18265:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-559), not sending early reply [645128.742930] req@ffff881a2d1c1800 x1458479725293089/t0(0) o101->401c4787-3769-68ef-1942-ee0f5753e2c3@JO.BOO.PI.B@o2ib2:0/0 lens 624/4936 e 5 to 0 dl 1391546499 ref 2 fl Interpret:/0/0 rc 0/0 [645128.772441] Lustre: 18265:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 6 previous similar messages [645336.108734] Lustre: DEBUG MARKER: Tue Feb 4 21:45:01 2014 [645336.108736] [645517.364139] Lustre: store1-MDT0000: Client 8d5c25ee-0d04-f6a2-5a7c-4d3b6e9c8a36 (at JO.BOO.PB.AL@o2ib2) reconnecting [645517.374933] Lustre: Skipped 137 previous similar messages [645517.380540] Lustre: store1-MDT0000: Client 8d5c25ee-0d04-f6a2-5a7c-4d3b6e9c8a36 (at JO.BOO.PB.AL@o2ib2) refused reconnection, still busy with 1 active RPCs [645517.395990] Lustre: Skipped 137 previous similar messages [645635.546161] Lustre: DEBUG MARKER: Tue Feb 4 21:50:01 2014 [645635.546163] [645934.995099] Lustre: DEBUG MARKER: Tue Feb 4 21:55:01 2014 [645934.995101] [645971.048223] Lustre: MGS: haven't heard from client 35cfa559-c4bd-b609-a77f-bc51dc4039a4 (at JO.BOO.AL.PW@o2ib2) in 902 seconds. I think it's dead, and I am evicting it. exp ffff880f44d16800, cur 1391547338 expire 1391546738 last 1391546436 [645971.069785] Lustre: Skipped 5 previous similar messages [645971.075249] Lustre: store1-MDT0000: haven't heard from client fa289a64-81d7-ab30-f92c-63cc786af9c0 (at JO.BOO.AL.PW@o2ib2) in 902 seconds. I think it's dead, and I am evicting it. exp ffff8819e2123c00, cur 1391547338 expire 1391546738 last 1391546436 [646116.797045] Lustre: store1-MDT0000: Client a4ef4c3a-ff9b-0ac2-ef60-870a0df43748 (at JO.BOO.PI.LBT@o2ib2) reconnecting [646116.807892] Lustre: Skipped 138 previous similar messages [646116.813502] Lustre: store1-MDT0000: Client a4ef4c3a-ff9b-0ac2-ef60-870a0df43748 (at JO.BOO.PI.LBT@o2ib2) refused reconnection, still busy with 1 active RPCs [646116.829117] Lustre: Skipped 138 previous similar messages [646234.430658] Lustre: DEBUG MARKER: Tue Feb 4 22:00:01 2014 [646234.430661] [646533.904329] Lustre: DEBUG MARKER: Tue Feb 4 22:05:01 2014 [646533.904331] [646719.442651] Lustre: store1-MDT0000: Client 56711354-b2c7-2b5a-d4cc-9fc38fce9702 (at JO.BOO.PO.BIO@o2ib2) reconnecting [646719.453482] Lustre: Skipped 138 previous similar messages [646719.459121] Lustre: store1-MDT0000: Client 56711354-b2c7-2b5a-d4cc-9fc38fce9702 (at JO.BOO.PO.BIO@o2ib2) refused reconnection, still busy with 1 active RPCs [646719.474689] Lustre: Skipped 138 previous similar messages [646833.341566] Lustre: DEBUG MARKER: Tue Feb 4 22:10:01 2014 [646833.341568] [647131.792282] Lustre: DEBUG MARKER: Tue Feb 4 22:15:01 2014 [647131.792284] [647319.105777] Lustre: store1-MDT0000: Client e542ecba-5c5a-3526-16a2-ba252dad69e8 (at JO.BOO.II.B@o2ib2) reconnecting [647319.116456] Lustre: Skipped 138 previous similar messages [647319.122064] Lustre: store1-MDT0000: Client e542ecba-5c5a-3526-16a2-ba252dad69e8 (at JO.BOO.II.B@o2ib2) refused reconnection, still busy with 1 active RPCs [647319.137488] Lustre: Skipped 138 previous similar messages [647431.229024] Lustre: DEBUG MARKER: Tue Feb 4 22:20:01 2014 [647431.229026] [647532.699302] Lustre: 1355:0:(ldlm_lib.c:952:target_handle_connect()) MGS: connection from 29a310b3-f074-69a9-6f19-2bd82b2a69f3@JO.BOO.AL.PW@o2ib2 t0 exp (null) cur 1391548902 last 0 [647532.981718] Lustre: 18055:0:(ldlm_lib.c:952:target_handle_connect()) store1-MDT0000: connection from 70c2987a-c22f-f255-ff92-b918b282f8e7@JO.BOO.AL.PW@o2ib2 t0 exp (null) cur 1391548903 last 0 [647730.678085] Lustre: DEBUG MARKER: Tue Feb 4 22:25:01 2014 [647730.678087] [647923.996238] Lustre: store1-MDT0000: Client 675d8b23-9bc8-377b-75be-240ab911afc5 (at JO.BOO.IP.WL@o2ib2) reconnecting [647924.007014] Lustre: Skipped 138 previous similar messages [647924.012706] Lustre: store1-MDT0000: Client 675d8b23-9bc8-377b-75be-240ab911afc5 (at JO.BOO.IP.WL@o2ib2) refused reconnection, still busy with 1 active RPCs [647924.028178] Lustre: Skipped 138 previous similar messages [648030.115124] Lustre: DEBUG MARKER: Tue Feb 4 22:30:01 2014 [648030.115126] [648035.672072] Lustre: Service thread pid 18053 was inactive for 800.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [648035.689897] Pid: 18053, comm: mdt_100 [648035.693899] [648035.693900] Call Trace: [648035.700906] [<ffffffff811206d9>] ? zone_statistics+0x99/0xc0 [648035.708375] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [648035.716388] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [648035.724517] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [648035.732283] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [648035.740037] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [648035.747681] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [648035.755855] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [648035.763973] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [648035.772234] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [648035.780303] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [648035.787802] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [648035.795422] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [648035.804669] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [648035.812484] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [648035.820184] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [648035.827783] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [648035.836628] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [648035.844273] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [648035.852587] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [648035.860512] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [648035.869420] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [648035.876823] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [648035.884554] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [648035.892700] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [648035.900555] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [648035.908761] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [648035.917187] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [648035.925642] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [648035.932956] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [648035.940940] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [648035.948816] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [648035.956559] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [648035.963635] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [648035.971356] [<ffffffff8100412a>] child_rip+0xa/0x20 [648035.977870] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [648035.985670] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [648035.993395] [<ffffffff81004120>] ? child_rip+0x0/0x20 [648036.000051] [648036.003075] LustreError: dumping log to /tmp/lustre-log.1391549407.18053 [648329.563838] Lustre: DEBUG MARKER: Tue Feb 4 22:35:01 2014 [648329.563840] [648394.207302] Lustre: 16430:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-559), not sending early reply [648394.207306] req@ffff881bc20ac000 x1458479203416411/t0(0) o101->5933b4ef-147e-8706-4997-cd4fd0058ade@JO.BOO.AL.PB@o2ib2:0/0 lens 632/4936 e 5 to 0 dl 1391549770 ref 2 fl Interpret:/0/0 rc 0/0 [648531.211810] Lustre: store1-MDT0000: Client 2735fb90-b05e-3f85-5a9d-e2389c950e23 (at JO.BOO.PI.FL@o2ib2) reconnecting [648531.222574] Lustre: Skipped 138 previous similar messages [648531.228203] Lustre: store1-MDT0000: Client 2735fb90-b05e-3f85-5a9d-e2389c950e23 (at JO.BOO.PI.FL@o2ib2) refused reconnection, still busy with 1 active RPCs [648531.243690] Lustre: Skipped 138 previous similar messages [648629.001041] Lustre: DEBUG MARKER: Tue Feb 4 22:40:01 2014 [648629.001044] [648928.449718] Lustre: DEBUG MARKER: Tue Feb 4 22:45:01 2014 [648928.449720] [649139.955104] Lustre: store1-MDT0000: Client 8e81a101-3f17-b68e-df3e-cbf2a3e18200 (at VYD.DF.BBO.W@o2ib) reconnecting [649139.965798] Lustre: Skipped 144 previous similar messages [649139.971397] Lustre: store1-MDT0000: Client 8e81a101-3f17-b68e-df3e-cbf2a3e18200 (at VYD.DF.BBO.W@o2ib) refused reconnection, still busy with 2 active RPCs [649139.986901] Lustre: Skipped 144 previous similar messages [649227.886520] Lustre: DEBUG MARKER: Tue Feb 4 22:50:01 2014 [649227.886522] [649527.335225] Lustre: DEBUG MARKER: Tue Feb 4 22:55:01 2014 [649527.335227] [649748.212027] Lustre: store1-MDT0000: Client 6c19fd7a-808b-92cf-2bae-1b52e52383bf (at JO.BOO.AO.WA@o2ib8) reconnecting [649748.222778] Lustre: Skipped 144 previous similar messages [649748.228459] Lustre: store1-MDT0000: Client 6c19fd7a-808b-92cf-2bae-1b52e52383bf (at JO.BOO.AO.WA@o2ib8) refused reconnection, still busy with 1 active RPCs [649748.243942] Lustre: Skipped 144 previous similar messages [649826.772204] Lustre: DEBUG MARKER: Tue Feb 4 23:00:01 2014 [649826.772206] [650126.246477] Lustre: DEBUG MARKER: Tue Feb 4 23:05:01 2014 [650126.246479] [650349.007036] Lustre: store1-MDT0000: Client 1125649f-888f-3681-3678-df6fa83c809a (at JO.BOO.PL.BTZ@o2ib2) reconnecting [650349.017890] Lustre: Skipped 138 previous similar messages [650349.023484] Lustre: store1-MDT0000: Client 1125649f-888f-3681-3678-df6fa83c809a (at JO.BOO.PL.BTZ@o2ib2) refused reconnection, still busy with 1 active RPCs [650349.039064] Lustre: Skipped 138 previous similar messages [650414.777560] LustreError: 0:0:(ldlm_lockd.c:358:waiting_locks_callback()) ### lock callback timer expired after 9588s: evicting client at VYD.DF.BBO.W@o2ib ns: mdt-ffff881be2e73000 lock: ffff881a5ae73240/0x1bd0ca3af460687c lrc: 3/0,0 mode: PR/PR res: 8589961617/1344 bits 0x3 rrc: 3 type: IBT flags: 0x4000020 remote: 0x84274abcffec208f expref: 3208 pid: 17690 timeout: 4360106426 [650425.683560] Lustre: DEBUG MARKER: Tue Feb 4 23:10:01 2014 [650425.683562] [650437.406946] Lustre: 18009:0:(ldlm_lib.c:952:target_handle_connect()) store1-MDT0000: connection from 8e81a101-3f17-b68e-df3e-cbf2a3e18200@VYD.DF.BBO.W@o2ib t283472722244 exp (null) cur 1391551813 last 0 [650560.114529] Lustre: MGS: haven't heard from client f36735d4-2e11-73ef-5d96-8b365db7ff8b (at JO.BOO.AL.PB@o2ib2) in 902 seconds. I think it's dead, and I am evicting it. exp ffff880841c26000, cur 1391551936 expire 1391551336 last 1391551034 [650725.132030] Lustre: DEBUG MARKER: Tue Feb 4 23:15:01 2014 [650725.132032] [650908.608434] Lustre: Service thread pid 18031 was inactive for 800.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [650908.625850] Pid: 18031, comm: mdt_78 [650908.629690] [650908.629691] Call Trace: [650908.636684] [<ffffffff811206d9>] ? zone_statistics+0x99/0xc0 [650908.643965] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [650908.651676] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [650908.659630] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [650908.667226] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [650908.674925] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [650908.682283] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [650908.690176] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [650908.698033] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [650908.706242] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [650908.714277] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [650908.721809] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [650908.729238] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [650908.738373] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [650908.746082] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [650908.753865] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [650908.761300] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [650908.769974] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [650908.777214] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [650908.785406] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [650908.793341] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [650908.802017] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [650908.809157] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [650908.816584] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [650908.824626] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [650908.832491] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [650908.840458] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [650908.848666] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [650908.857138] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [650908.864481] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [650908.872514] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [650908.880404] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [650908.888160] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [650908.895264] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [650908.903016] [<ffffffff8100412a>] child_rip+0xa/0x20 [650908.909591] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [650908.917368] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [650908.925110] [<ffffffff81004120>] ? child_rip+0x0/0x20 [650908.931769] [650908.934790] LustreError: dumping log to /tmp/lustre-log.1391552285.18031 [650951.495597] Lustre: store1-MDT0000: Client 5db15b6d-96b9-01c3-2be8-f827bbca027e (at JO.BOO.PW.AA@o2ib2) reconnecting [650951.506365] Lustre: Skipped 132 previous similar messages [650951.511965] Lustre: store1-MDT0000: Client 5db15b6d-96b9-01c3-2be8-f827bbca027e (at JO.BOO.PW.AA@o2ib2) refused reconnection, still busy with 1 active RPCs [650951.527434] Lustre: Skipped 132 previous similar messages [651024.568955] Lustre: DEBUG MARKER: Tue Feb 4 23:20:01 2014 [651024.568957] [651213.658033] Lustre: Service thread pid 18052 was inactive for 800.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [651213.675696] Pid: 18052, comm: mdt_99 [651213.679677] [651213.679678] Call Trace: [651213.686745] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [651213.694640] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [651213.702706] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [651213.710493] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [651213.718301] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [651213.725604] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [651213.733663] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [651213.741691] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [651213.750018] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [651213.758147] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [651213.765628] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [651213.773272] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [651213.782621] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [651213.790405] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [651213.798245] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [651213.805688] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [651213.814557] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [651213.821992] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [651213.830326] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [651213.838449] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [651213.847075] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [651213.854310] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [651213.861876] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [651213.870067] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [651213.878005] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [651213.885990] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [651213.894518] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [651213.903284] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [651213.910735] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [651213.918788] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [651213.926776] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [651213.934670] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [651213.941918] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [651213.949817] [<ffffffff8100412a>] child_rip+0xa/0x20 [651213.956362] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [651213.964295] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [651213.972161] [<ffffffff81004120>] ? child_rip+0x0/0x20 [651213.978834] [651213.981876] LustreError: dumping log to /tmp/lustre-log.1391552591.18052 [651213.990863] Pid: 3987, comm: mdt_02 [651213.995394] [651213.995394] Call Trace: [651214.002549] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [651214.010406] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [651214.018478] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [651214.026104] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [651214.033950] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [651214.041377] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [651214.049368] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [651214.057279] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [651214.065617] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [651214.073878] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [651214.081470] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [651214.089110] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [651214.098381] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [651214.106219] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [651214.114240] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [651214.121843] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [651214.130590] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [651214.139402] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [651214.147738] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [651214.155726] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [651214.164610] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [651214.171926] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [651214.179641] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [651214.187907] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [651214.195870] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [651214.204158] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [651214.212571] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [651214.221224] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [651214.228827] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [651214.236931] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [651214.244920] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [651214.252893] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [651214.260286] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [651214.268187] [<ffffffff8100412a>] child_rip+0xa/0x20 [651214.274784] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [651214.282868] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [651214.290788] [<ffffffff81004120>] ? child_rip+0x0/0x20 [651214.297609] [651267.502204] Lustre: 18013:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-559), not sending early reply [651267.502208] req@ffff8819ee1e2400 x1459140931159355/t0(0) o101->c4dcbec1-4ccf-a60d-580a-b9182019a383@JO.BOO.AL.FB@o2ib2:0/0 lens 640/4936 e 5 to 0 dl 1391552649 ref 2 fl Interpret:/0/0 rc 0/0 [651324.018575] Lustre: DEBUG MARKER: Tue Feb 4 23:25:01 2014 [651324.018577] [651347.595300] LustreError: 6792:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391552525, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff881be2e73000 lock: ffff881a847f6900/0x1bd0ca3af4671658 lrc: 3/1,0 mode: --/PR res: 8949924332/3357 bits 0x3 rrc: 5 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 6792 timeout: 0 [651381.278780] Lustre: 18072:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-368), not sending early reply [651381.278784] req@ffff8819ec0bac00 x1458488051871829/t0(0) o101->56a9f9ba-9ef4-3407-1ea9-1acaa222b873@JO.BOO.IA.BAP@o2ib2:0/0 lens 632/4936 e 1 to 0 dl 1391552763 ref 2 fl Interpret:/0/0 rc 0/0 [651550.402068] Lustre: store1-MDT0000: Client 1149a764-bf57-dbdd-b71c-b33708ee8303 (at JO.BOO.PI.FW@o2ib2) reconnecting [651550.412835] Lustre: Skipped 136 previous similar messages [651550.418512] Lustre: store1-MDT0000: Client 1149a764-bf57-dbdd-b71c-b33708ee8303 (at JO.BOO.PI.FW@o2ib2) refused reconnection, still busy with 1 active RPCs [651550.433984] Lustre: Skipped 136 previous similar messages [651564.521661] LustreError: 0:0:(ldlm_lockd.c:358:waiting_locks_callback()) ### lock callback timer expired after 19313s: evicting client at JO.BOO.PI.FW@o2ib2 ns: mdt-ffff881be2e73000 lock: ffff880e87538d80/0x1bd0ca3af4392e4d lrc: 3/0,0 mode: PR/PR res: 8952384428/56271 bits 0x3 rrc: 2 type: IBT flags: 0x4000020 remote: 0x398c953ae47280a expref: 52 pid: 16397 timeout: 4360221623 [651623.454896] Lustre: DEBUG MARKER: Tue Feb 4 23:30:01 2014 [651623.454899] [651650.206180] Lustre: 18061:0:(ldlm_lib.c:952:target_handle_connect()) store1-MDT0000: connection from 1149a764-bf57-dbdd-b71c-b33708ee8303@JO.BOO.PI.FW@o2ib2 t0 exp (null) cur 1391553028 last 0 [651748.157883] LustreError: 0:0:(ldlm_lockd.c:358:waiting_locks_callback()) ### lock callback timer expired after 9288s: evicting client at JO.BOO.PI.B@o2ib2 ns: mdt-ffff881be2e73000 lock: ffff880f20293b40/0x1bd0ca3af46306ca lrc: 3/0,0 mode: PR/PR res: 8952384428/56439 bits 0x3 rrc: 2 type: IBT flags: 0x4000020 remote: 0xc39ea601781957b1 expref: 38 pid: 18009 timeout: 4360240011 [651785.396008] Lustre: 18039:0:(ldlm_lib.c:952:target_handle_connect()) store1-MDT0000: connection from 401c4787-3769-68ef-1942-ee0f5753e2c3@JO.BOO.PI.B@o2ib2 t0 exp (null) cur 1391553163 last 0 [651892.554261] Lustre: 17419:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-146), not sending early reply [651892.554265] req@ffff881cdd981850 x1458751573887771/t0(0) o101->8e81a101-3f17-b68e-df3e-cbf2a3e18200@VYD.DF.BBO.W@o2ib:0/0 lens 544/4808 e 0 to 0 dl 1391553276 ref 2 fl Interpret:/0/0 rc 0/0 [651892.583869] Lustre: 17419:0:(service.c:1035:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message [651922.904360] Lustre: DEBUG MARKER: Tue Feb 4 23:35:01 2014 [651922.904362] [652149.545785] Lustre: store1-MDT0000: Client 13abaeb1-1e04-a145-b26f-5d0f29d6d2cc (at JO.BOO.AO.BO@o2ib8) reconnecting [652149.556627] Lustre: Skipped 143 previous similar messages [652149.562236] Lustre: store1-MDT0000: Client 13abaeb1-1e04-a145-b26f-5d0f29d6d2cc (at JO.BOO.AO.BO@o2ib8) refused reconnection, still busy with 2 active RPCs [652149.577756] Lustre: Skipped 143 previous similar messages [652222.341720] Lustre: DEBUG MARKER: Tue Feb 4 23:40:01 2014 [652222.341722] [652339.648803] Lustre: Service thread pid 6792 was inactive for 1194.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [652339.666332] Lustre: Skipped 1 previous similar message [652339.671664] Pid: 6792, comm: mdt_14 [652339.676701] [652339.676702] Call Trace: [652339.682354] [<ffffffff81486598>] ? schedule_timeout+0x198/0x2d0 [652339.689984] [<ffffffffa075bda0>] ? ldlm_expired_completion_wait+0x0/0x260 [ptlrpc] [652339.699252] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [652339.707034] [<ffffffffa075ff2a>] ldlm_completion_ast+0x52a/0x730 [ptlrpc] [652339.715432] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [652339.723232] [<ffffffffa075f686>] ldlm_cli_enqueue_local+0x1e6/0x560 [ptlrpc] [652339.731978] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [652339.740409] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [652339.748272] [<ffffffffa0d07290>] mdt_object_lock+0x320/0xb70 [mdt] [652339.756104] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [652339.764019] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [652339.772413] [<ffffffffa0d17c92>] mdt_getattr_name_lock+0xe52/0x18b0 [mdt] [652339.780861] [<ffffffffa078611d>] ? lustre_msg_buf+0x5d/0x60 [ptlrpc] [652339.788844] [<ffffffffa07b1596>] ? __req_capsule_get+0x176/0x750 [ptlrpc] [652339.797250] [<ffffffffa07883a4>] ? lustre_msg_get_flags+0x34/0xb0 [ptlrpc] [652339.805733] [<ffffffffa0d18c4d>] mdt_intent_getattr+0x2cd/0x4a0 [mdt] [652339.813854] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [652339.821888] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [652339.830137] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [652339.838623] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [652339.846016] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [652339.854079] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [652339.861998] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [652339.869778] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [652339.876925] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [652339.884706] [<ffffffff8100412a>] child_rip+0xa/0x20 [652339.891246] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [652339.899061] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [652339.906874] [<ffffffff81004120>] ? child_rip+0x0/0x20 [652339.913548] [652339.916588] LustreError: dumping log to /tmp/lustre-log.1391553719.6792 [652521.790698] Lustre: DEBUG MARKER: Tue Feb 4 23:45:01 2014 [652521.790701] [652585.103629] Lustre: 17690:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply [652585.103634] req@ffff881bbd2d7c00 x1458479661721993/t0(0) o101->1149a764-bf57-dbdd-b71c-b33708ee8303@JO.BOO.PI.FW@o2ib2:0/0 lens 656/4936 e 0 to 0 dl 1391553970 ref 2 fl Interpret:/0/0 rc 0/0 [652727.234732] LustreError: 14:0:(ldlm_lockd.c:358:waiting_locks_callback()) ### lock callback timer expired after 889s: evicting client at VYD.DF.BBO.W@o2ib ns: mdt-ffff881be2e73000 lock: ffff881b85cedd80/0x1bd0ca3af46faed5 lrc: 3/0,0 mode: PR/PR res: 8908553484/63357 bits 0x3 rrc: 2 type: IBT flags: 0x4000020 remote: 0x84274abcfff00591 expref: 3205 pid: 18009 timeout: 4360338140 [652748.373326] Lustre: store1-MDT0000: Client 13abaeb1-1e04-a145-b26f-5d0f29d6d2cc (at JO.BOO.AO.BO@o2ib8) reconnecting [652748.384137] Lustre: Skipped 145 previous similar messages [652751.409913] Lustre: store1-MDT0000: Client e3b65402-5a14-5932-6e46-c3e70901f1c1 (at JO.BOO.PZ.BIP@o2ib2) refused reconnection, still busy with 1 active RPCs [652751.424130] Lustre: Skipped 146 previous similar messages [652805.773403] Lustre: 17690:0:(ldlm_lib.c:952:target_handle_connect()) store1-MDT0000: connection from 8e81a101-3f17-b68e-df3e-cbf2a3e18200@VYD.DF.BBO.W@o2ib t283472751740 exp (null) cur 1391554186 last 0 [652821.227198] Lustre: DEBUG MARKER: Tue Feb 4 23:50:01 2014 [652821.227200] [653008.631572] LustreError: 17690:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391554189, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff881be2e73000 lock: ffff881a08465240/0x1bd0ca3af46ffa2a lrc: 3/1,0 mode: --/PR res: 8949924332/3356 bits 0x3 rrc: 3 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 17690 timeout: 0 [653034.520705] Lustre: Service thread pid 18039 was inactive for 1200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [653034.538203] Pid: 18039, comm: mdt_86 [653034.541998] [653034.541999] Call Trace: [653034.549012] [<ffffffff811206d9>] ? zone_statistics+0x99/0xc0 [653034.556297] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [653034.564041] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [653034.571970] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [653034.579575] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [653034.587286] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [653034.594660] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [653034.602527] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [653034.610413] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [653034.618605] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [653034.626661] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [653034.634118] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [653034.641517] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [653034.650668] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [653034.658364] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [653034.666172] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [653034.673627] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [653034.682162] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [653034.689448] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [653034.697642] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [653034.705630] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [653034.714317] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [653034.722767] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [653034.730312] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [653034.738358] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [653034.746253] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [653034.754257] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [653034.762544] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [653034.771058] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [653034.778361] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [653034.786437] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [653034.794384] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [653034.802160] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [653034.809269] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [653034.817053] [<ffffffff8100412a>] child_rip+0xa/0x20 [653034.823608] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [653034.831422] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [653034.839225] [<ffffffff81004120>] ? child_rip+0x0/0x20 [653034.845880] [653034.848893] LustreError: dumping log to /tmp/lustre-log.1391554415.18039 [653091.279206] LustreError: 18265:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391554272, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff881be2e73000 lock: ffff880e88ac3b40/0x1bd0ca3af4705c46 lrc: 3/1,0 mode: --/PR res: 8952384734/771 bits 0x3 rrc: 5 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 18265 timeout: 0 [653106.698905] LustreError: 18072:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391554287, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff881be2e73000 lock: ffff881a2d130900/0x1bd0ca3af4706315 lrc: 3/1,0 mode: --/PR res: 8951399136/1269 bits 0x3 rrc: 4 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 18072 timeout: 0 [653120.252896] Lustre: DEBUG MARKER: Tue Feb 4 23:55:01 2014 [653120.252898] [653316.516713] LustreError: 18013:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391554497, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff881be2e73000 lock: ffff881a5b788240/0x1bd0ca3af4709f36 lrc: 3/1,0 mode: --/PR res: 8952384752/641 bits 0x3 rrc: 3 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 18013 timeout: 0 [653350.233179] Lustre: store1-MDT0000: Client e3b65402-5a14-5932-6e46-c3e70901f1c1 (at JO.BOO.PZ.BIP@o2ib2) reconnecting [653350.244033] Lustre: Skipped 144 previous similar messages [653351.105619] Lustre: store1-MDT0000: Client 2fd75f33-abb1-54d3-a78e-b54fab31bf48 (at JO.BOO.AO.WW@o2ib2) refused reconnection, still busy with 2 active RPCs [653351.119780] Lustre: Skipped 144 previous similar messages [653419.689514] Lustre: DEBUG MARKER: Wed Feb 5 00:00:01 2014 [653419.689517] [653557.403487] Lustre: 16134:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply [653557.403491] req@ffff881bbd1c2400 x1458751573928248/t0(0) o101->8e81a101-3f17-b68e-df3e-cbf2a3e18200@VYD.DF.BBO.W@o2ib:0/0 lens 544/4808 e 0 to 0 dl 1391554944 ref 2 fl Interpret:/0/0 rc 0/0 [653628.264288] Lustre: 16430:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply [653628.264292] req@ffff881b83378c00 x1458479672303182/t0(0) o101->ec374fc7-4ce6-8d3c-2b0a-a0ab2cfb256e@JO.BOO.PI.BWW@o2ib2:0/0 lens 632/4936 e 0 to 0 dl 1391555015 ref 2 fl Interpret:/0/0 rc 0/0 [653640.240716] Lustre: 16430:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply [653640.240720] req@ffff881bc1c33000 x1458751573930599/t0(0) o101->8e81a101-3f17-b68e-df3e-cbf2a3e18200@VYD.DF.BBO.W@o2ib:0/0 lens 544/4808 e 0 to 0 dl 1391555027 ref 2 fl Interpret:/0/0 rc 0/0 [653655.211788] Lustre: 16430:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply [653655.211792] req@ffff8819efd94c00 x1458751573930756/t0(0) o101->8e81a101-3f17-b68e-df3e-cbf2a3e18200@VYD.DF.BBO.W@o2ib:0/0 lens 544/4808 e 0 to 0 dl 1391555042 ref 2 fl Interpret:/0/0 rc 0/0 [653719.171144] Lustre: DEBUG MARKER: Wed Feb 5 00:05:01 2014 [653719.171146] [653915.340257] Lustre: Service thread pid 18013 was inactive for 800.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [653915.357677] Pid: 18013, comm: mdt_60 [653915.361468] [653915.361469] Call Trace: [653915.368478] [<ffffffff81486598>] ? schedule_timeout+0x198/0x2d0 [653915.376069] [<ffffffffa075bda0>] ? ldlm_expired_completion_wait+0x0/0x260 [ptlrpc] [653915.385231] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [653915.392863] [<ffffffffa075ff2a>] ldlm_completion_ast+0x52a/0x730 [ptlrpc] [653915.401225] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [653915.408933] [<ffffffffa075f686>] ldlm_cli_enqueue_local+0x1e6/0x560 [ptlrpc] [653915.417653] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [653915.425972] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [653915.433808] [<ffffffffa0d07290>] mdt_object_lock+0x320/0xb70 [mdt] [653915.441583] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [653915.449476] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [653915.457843] [<ffffffffa0d17c92>] mdt_getattr_name_lock+0xe52/0x18b0 [mdt] [653915.466247] [<ffffffffa078611d>] ? lustre_msg_buf+0x5d/0x60 [ptlrpc] [653915.474201] [<ffffffffa07b1596>] ? __req_capsule_get+0x176/0x750 [ptlrpc] [653915.482577] [<ffffffffa07883a4>] ? lustre_msg_get_flags+0x34/0xb0 [ptlrpc] [653915.491030] [<ffffffffa0d18c4d>] mdt_intent_getattr+0x2cd/0x4a0 [mdt] [653915.499066] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [653915.507041] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [653915.515245] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [653915.523708] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [653915.531050] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [653915.539160] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [653915.547055] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [653915.554799] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [653915.561980] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [653915.569782] [<ffffffff8100412a>] child_rip+0xa/0x20 [653915.576306] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [653915.584078] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [653915.591821] [<ffffffff81004120>] ? child_rip+0x0/0x20 [653915.598468] [653915.601496] LustreError: dumping log to /tmp/lustre-log.1391555298.18013 [653949.929230] Lustre: store1-MDT0000: Client 2fd75f33-abb1-54d3-a78e-b54fab31bf48 (at JO.BOO.AO.WW@o2ib2) reconnecting [653949.939996] Lustre: Skipped 152 previous similar messages [653949.945595] Lustre: store1-MDT0000: Client 2fd75f33-abb1-54d3-a78e-b54fab31bf48 (at JO.BOO.AO.WW@o2ib2) refused reconnection, still busy with 2 active RPCs [653949.961102] Lustre: Skipped 151 previous similar messages [654005.793658] Lustre: Service thread pid 18002 was inactive for 800.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [654005.811067] Pid: 18002, comm: mdt_49 [654005.814936] [654005.814936] Call Trace: [654005.821875] [<ffffffff811206d9>] ? zone_statistics+0x99/0xc0 [654005.829223] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [654005.837008] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [654005.844895] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [654005.852454] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [654005.860152] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [654005.867491] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [654005.875354] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [654005.883225] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [654005.891433] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [654005.899546] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [654005.906993] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [654005.914407] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [654005.923550] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [654005.931186] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [654005.939033] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [654005.946470] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [654005.955036] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [654005.962245] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [654005.970524] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [654005.978437] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [654005.987068] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [654005.994221] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [654006.001593] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [654006.009696] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [654006.017562] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [654006.025533] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [654006.033782] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [654006.042156] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [654006.049576] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [654006.057529] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [654006.065494] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [654006.073243] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [654006.080292] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [654006.088113] [<ffffffff8100412a>] child_rip+0xa/0x20 [654006.094699] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [654006.102404] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [654006.110215] [<ffffffff81004120>] ? child_rip+0x0/0x20 [654006.116860] [654006.119819] LustreError: dumping log to /tmp/lustre-log.1391555388.18002 [654006.670869] Lustre: Service thread pid 17690 was inactive for 1200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [654006.688416] Pid: 17690, comm: mdt_132 [654006.692297] [654006.692297] Call Trace: [654006.699292] [<ffffffff81486598>] ? schedule_timeout+0x198/0x2d0 [654006.706836] [<ffffffffa075bda0>] ? ldlm_expired_completion_wait+0x0/0x260 [ptlrpc] [654006.716037] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [654006.723732] [<ffffffffa075ff2a>] ldlm_completion_ast+0x52a/0x730 [ptlrpc] [654006.732093] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [654006.739819] [<ffffffffa075f686>] ldlm_cli_enqueue_local+0x1e6/0x560 [ptlrpc] [654006.748526] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [654006.756816] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [654006.764683] [<ffffffffa0d07290>] mdt_object_lock+0x320/0xb70 [mdt] [654006.772471] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [654006.780355] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [654006.790046] [<ffffffffa0d17c92>] mdt_getattr_name_lock+0xe52/0x18b0 [mdt] [654006.798516] [<ffffffffa078611d>] ? lustre_msg_buf+0x5d/0x60 [ptlrpc] [654006.806540] [<ffffffffa07b1596>] ? __req_capsule_get+0x176/0x750 [ptlrpc] [654006.814931] [<ffffffffa07883a4>] ? lustre_msg_get_flags+0x34/0xb0 [ptlrpc] [654006.823417] [<ffffffffa0d18c4d>] mdt_intent_getattr+0x2cd/0x4a0 [mdt] [654006.831422] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [654006.839460] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [654006.847668] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [654006.856124] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [654006.863460] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [654006.871504] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [654006.879325] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [654006.887149] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [654006.894260] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [654006.901999] [<ffffffff8100412a>] child_rip+0xa/0x20 [654006.908446] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [654006.916286] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [654006.924035] [<ffffffff81004120>] ? child_rip+0x0/0x20 [654006.930623] [654006.933717] LustreError: dumping log to /tmp/lustre-log.1391555389.17690 [654018.607895] Lustre: DEBUG MARKER: Wed Feb 5 00:10:01 2014 [654018.607897] [654077.303507] Lustre: Service thread pid 17419 was inactive for 1200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [654077.321125] Pid: 17419, comm: mdt_36 [654077.324933] [654077.324934] Call Trace: [654077.331908] [<ffffffff811206d9>] ? zone_statistics+0x99/0xc0 [654077.339217] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [654077.346934] [<ffffffffa0a9a7fe>] qos_statfs_update+0x7fe/0xa70 [lov] [654077.354878] [<ffffffff8114bbca>] ? cache_alloc_refill+0x9a/0x250 [654077.362462] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [654077.370167] [<ffffffffa0a9b24a>] alloc_qos+0x1aa/0x2190 [lov] [654077.377486] [<ffffffffa0aa111f>] ? lsm_alloc_plain+0xff/0x930 [lov] [654077.385346] [<ffffffffa0a9e1ae>] qos_prep_create+0x1ee/0x2380 [lov] [654077.393248] [<ffffffffa0994efd>] ? quota_search_lqs+0x9d/0x660 [lquota] [654077.401480] [<ffffffffa0a98f1a>] lov_prep_create_set+0xea/0x390 [lov] [654077.409538] [<ffffffffa0a7fb7d>] lov_create+0x1ad/0x1400 [lov] [654077.417005] [<ffffffffa0cac0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] [654077.424447] [<ffffffffa0c035c3>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs] [654077.433622] [<ffffffffa0ccc916>] ? mdd_read_unlock+0x26/0x30 [mdd] [654077.441343] [<ffffffffa0cb090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] [654077.449150] [<ffffffffa0cc2871>] mdd_create+0xf81/0x1a90 [mdd] [654077.456597] [<ffffffffa0c05414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] [654077.465183] [<ffffffffa0d8f407>] cml_create+0x97/0x250 [cmm] [654077.472500] [<ffffffffa0d20611>] ? mdt_version_get_save+0x91/0xd0 [mdt] [654077.480656] [<ffffffffa0d36dc9>] mdt_reint_open+0x1939/0x24e0 [mdt] [654077.488622] [<ffffffffa0786d24>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] [654077.497169] [<ffffffffa0cc556e>] ? md_ucred+0x1e/0x60 [mdd] [654077.504283] [<ffffffffa0d1ecb1>] mdt_reint_rec+0x41/0xe0 [mdt] [654077.511733] [<ffffffffa0d15ed4>] mdt_reint_internal+0x544/0x8e0 [mdt] [654077.519804] [<ffffffffa0d1653d>] mdt_intent_reint+0x1ed/0x530 [mdt] [654077.527690] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [654077.535674] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [654077.543915] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [654077.552397] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [654077.559743] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [654077.567820] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [654077.575744] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [654077.583555] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [654077.590641] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [654077.598469] [<ffffffff8100412a>] child_rip+0xa/0x20 [654077.604976] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [654077.612759] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [654077.620522] [<ffffffff81004120>] ? child_rip+0x0/0x20 [654077.627196] [654077.630229] LustreError: dumping log to /tmp/lustre-log.1391555460.17419 [654089.318466] Lustre: Service thread pid 18265 was inactive for 1200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [654089.335942] Pid: 18265, comm: mdt_124 [654089.339839] [654089.339840] Call Trace: [654089.346886] [<ffffffff81486598>] ? schedule_timeout+0x198/0x2d0 [654089.354491] [<ffffffffa075bda0>] ? ldlm_expired_completion_wait+0x0/0x260 [ptlrpc] [654089.363685] [<ffffffffa051b6de>] cfs_waitq_wait+0xe/0x10 [libcfs] [654089.371351] [<ffffffffa075ff2a>] ldlm_completion_ast+0x52a/0x730 [ptlrpc] [654089.379758] [<ffffffff8104a320>] ? default_wake_function+0x0/0x20 [654089.387504] [<ffffffffa075f686>] ldlm_cli_enqueue_local+0x1e6/0x560 [ptlrpc] [654089.396234] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [654089.404567] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [654089.412449] [<ffffffffa0d07290>] mdt_object_lock+0x320/0xb70 [mdt] [654089.420247] [<ffffffffa0d04be0>] ? mdt_blocking_ast+0x0/0x2a0 [mdt] [654089.428158] [<ffffffffa075fa00>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc] [654089.436551] [<ffffffffa0d17c92>] mdt_getattr_name_lock+0xe52/0x18b0 [mdt] [654089.444933] [<ffffffffa078611d>] ? lustre_msg_buf+0x5d/0x60 [ptlrpc] [654089.452899] [<ffffffffa07b1596>] ? __req_capsule_get+0x176/0x750 [ptlrpc] [654089.461296] [<ffffffffa07883a4>] ? lustre_msg_get_flags+0x34/0xb0 [ptlrpc] [654089.469770] [<ffffffffa0d18c4d>] mdt_intent_getattr+0x2cd/0x4a0 [mdt] [654089.477834] [<ffffffffa0d14c09>] mdt_intent_policy+0x379/0x690 [mdt] [654089.485827] [<ffffffffa07423c1>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] [654089.494053] [<ffffffffa07683cd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] [654089.502562] [<ffffffffa0d15586>] mdt_enqueue+0x46/0x130 [mdt] [654089.509927] [<ffffffffa0d0a762>] mdt_handle_common+0x932/0x1750 [mdt] [654089.518027] [<ffffffffa0d0b655>] mdt_regular_handle+0x15/0x20 [mdt] [654089.525941] [<ffffffffa07974f6>] ptlrpc_main+0xd16/0x1a80 [ptlrpc] [654089.533705] [<ffffffff810017cc>] ? __switch_to+0x1ac/0x320 [654089.540952] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [654089.548728] [<ffffffff8100412a>] child_rip+0xa/0x20 [654089.555197] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [654089.563087] [<ffffffffa07967e0>] ? ptlrpc_main+0x0/0x1a80 [ptlrpc] [654089.570845] [<ffffffff81004120>] ? child_rip+0x0/0x20 [654089.577460] [654089.580502] LustreError: dumping log to /tmp/lustre-log.1391555472.18265 [654104.738173] Lustre: Service thread pid 18072 was inactive for 1200.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [654104.751525] LustreError: dumping log to /tmp/lustre-log.1391555487.18072 [654287.968214] Lustre: 18055:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-574), not sending early reply [654287.968218] req@ffff881d05b5d850 x1458751573932601/t0(0) o101->8e81a101-3f17-b68e-df3e-cbf2a3e18200@VYD.DF.BBO.W@o2ib:0/0 lens 560/4808 e 3 to 0 dl 1391555676 ref 2 fl Interpret:/0/0 rc 0/0 [654318.056106] Lustre: DEBUG MARKER: Wed Feb 5 00:15:01 2014 [654318.056108] [654378.790258] Lustre: 18025:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-574), not sending early reply [654378.790262] req@ffff88205e64c050 x1458479034465361/t0(0) o101->10e845e4-82fe-2a77-d272-bd68fb30aaf8@JO.BOO.AO.LT@o2ib2:0/0 lens 648/4936 e 3 to 0 dl 1391555767 ref 2 fl Interpret:/0/0 rc 0/0 [654437.873696] LustreError: 0:0:(ldlm_lockd.c:358:waiting_locks_callback()) ### lock callback timer expired after 1602s: evicting client at VYD.DF.BBO.W@o2ib ns: mdt-ffff881be2e73000 lock: ffff881a05d81900/0x1bd0ca3af47033b9 lrc: 3/0,0 mode: PR/PR res: 8908553484/63357 bits 0x3 rrc: 2 type: IBT flags: 0x4000020 remote: 0x84274abcfff02b54 expref: 2875 pid: 17419 timeout: 4360509544 [654470.802013] Lustre: 18055:0:(ldlm_lib.c:952:target_handle_connect()) store1-MDT0000: connection from 8e81a101-3f17-b68e-df3e-cbf2a3e18200@VYD.DF.BBO.W@o2ib t283472751740 exp (null) cur 1391555854 last 0 [654551.935757] Lustre: store1-MDT0000: Client 8a3976d1-d35b-31b0-a9b9-8ee8f50a7e65 (at JO.BOO.AO.FW@o2ib2) reconnecting [654551.946518] Lustre: Skipped 156 previous similar messages [654551.952107] Lustre: store1-MDT0000: Client 8a3976d1-d35b-31b0-a9b9-8ee8f50a7e65 (at JO.BOO.AO.FW@o2ib2) refused reconnection, still busy with 1 active RPCs [654551.967665] Lustre: Skipped 156 previous similar messages [654617.130776] Lustre: DEBUG MARKER: Wed Feb 5 00:20:01 2014 [654617.130778] [654672.911938] LustreError: 16134:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391555856, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff881be2e73000 lock: ffff88171e709240/0x1bd0ca3af471c3ed lrc: 3/1,0 mode: --/PR res: 8692249185/7 bits 0x3 rrc: 3 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 16134 timeout: 0 [654673.660473] LustreError: 18055:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391555857, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff881be2e73000 lock: ffff8819ed4d9d80/0x1bd0ca3af471c893 lrc: 3/1,0 mode: --/PR res: 8952384752/641 bits 0x3 rrc: 4 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 18055 timeout: 0 [654685.557140] LustreError: 18017:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1391555869, 200s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff881be2e73000 lock: ffff881af9c456c0/0x1bd0ca3af471f6e5 lrc: 3/1,0 mode: --/PR res: 8952384754/246 bits 0x3 rrc: 4 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 18017 timeout: 0 [654916.579503] Lustre: DEBUG MARKER: Wed Feb 5 00:25:01 2014 [654916.579505] [655157.826893] Lustre: store1-MDT0000: Client 4c8862c3-cd29-7073-cc7f-8d955ff73907 (at JO.BOO.AO.BF@o2ib2) reconnecting [655157.837717] Lustre: Skipped 156 previous similar messages [655157.843416] Lustre: store1-MDT0000: Client 4c8862c3-cd29-7073-cc7f-8d955ff73907 (at JO.BOO.AO.BF@o2ib2) refused reconnection, still busy with 1 active RPCs [655157.858878] Lustre: Skipped 156 previous similar messages [655216.017231] Lustre: DEBUG MARKER: Wed Feb 5 00:30:01 2014 [655216.017233] [655221.384497] Lustre: 18005:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply [655221.384501] req@ffff8819ed01dc00 x1458751573938948/t0(0) o101->8e81a101-3f17-b68e-df3e-cbf2a3e18200@VYD.DF.BBO.W@o2ib:0/0 lens 544/4808 e 0 to 0 dl 1391556611 ref 2 fl Interpret:/0/0 rc 0/0 [655222.384174] Lustre: 5463:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply [655222.384178] req@ffff881d678d6850 x1458751573939025/t0(0) o101->8e81a101-3f17-b68e-df3e-cbf2a3e18200@VYD.DF.BBO.W@o2ib:0/0 lens 560/4808 e 0 to 0 dl 1391556612 ref 2 fl Interpret:/0/0 rc 0/0 [655234.359002] Lustre: 18061:0:(service.c:1035:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply [655234.359006] req@ffff881d67b7f850 x1458751573939822/t0(0) o101->8e81a101-3f17-b68e-df3e-cbf2a3e18200@VYD.DF.BBO.W@o2ib:0/0 lens 544/4808 e 0 to 0 dl 1391556624 ref 2 fl Interpret:/0/0 rc 0/0 [655444.910448] Uhhuh. NMI received for unknown reason 3d on CPU 0. [655444.916528] Do you have a strange power saving mode enabled? [655444.922337] Kernel panic - not syncing: NMI: Not continuing [655444.928060] Pid: 0, comm: swapper Not tainted 2.6.32-220.23.1.bl6.Bull.28.10.x86_64 #1 [655444.936173] Call Trace: [655444.938768] <NMI> [<ffffffff814851a0>] ? panic+0x78/0x143 [655444.944518] [<ffffffff81488ff0>] ? do_nmi+0x240/0x2b0 [655444.949801] [<ffffffff81488830>] ? nmi+0x20/0x30 [655444.954655] [<ffffffff812a9831>] ? intel_idle+0xb1/0x170 [655444.960200] <<EOE>> [<ffffffff813aa267>] ? cpuidle_idle_call+0xa7/0x140 [655444.967165] [<ffffffff81001e06>] ? cpu_idle+0xb6/0x110 [655444.972536] [<ffffffff8146ff2a>] ? rest_init+0x7a/0x80 [655444.977909] [<ffffffff81adff8c>] ? start_kernel+0x481/0x48d [655444.983721] [<ffffffff81adf31e>] ? x86_64_start_reservations+0x125/0x129 [655444.990651] [<ffffffff81adf41c>] ? x86_64_start_kernel+0xfa/0x109