ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0002_UUID: not available for connect from 10.20.30.72@o2ib (no target)
LustreError: Skipped 205 previous similar messages
Lustre: scratch-OST0004: haven't heard from client 51b6f8ba-0d6a-0cfd-1b31-bef2e50d9802 (at 10.20.30.1@o2ib) in 233 seconds. I think it's dead, and I am evicting it. exp ffff8808730b9c00, cur 1433843261 expire 1433843111 last 1433843028
Lustre: scratch-OST0005: haven't heard from client 51b6f8ba-0d6a-0cfd-1b31-bef2e50d9802 (at 10.20.30.1@o2ib) in 233 seconds. I think it's dead, and I am evicting it. exp ffff880861860c00, cur 1433843261 expire 1433843111 last 1433843028
Lustre: Skipped 11 previous similar messages
Lustre: scratch-OST0004: haven't heard from client c87da71c-4e3e-93cc-59d4-22b12b4bae76 (at 10.20.30.13@o2ib) in 233 seconds. I think it's dead, and I am evicting it. exp ffff881057828400, cur 1433843361 expire 1433843211 last 1433843128
Lustre: Skipped 23 previous similar messages
Lustre: scratch-OST0004: haven't heard from client f33f2984-e74c-9f84-4fa5-663d1a1ef177 (at 10.20.30.34@o2ib) in 233 seconds. I think it's dead, and I am evicting it. exp ffff88107172e800, cur 1433843461 expire 1433843311 last 1433843228
Lustre: Skipped 65 previous similar messages
Lustre: scratch-OST0004: haven't heard from client 9225e48b-3c9e-b2b2-87aa-a9b17c9e9523 (at 10.20.30.59@o2ib) in 233 seconds. I think it's dead, and I am evicting it. exp ffff880829a2c800, cur 1433843561 expire 1433843411 last 1433843328
Lustre: Skipped 74 previous similar messages
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0001_UUID: not available for connect from 10.20.30.71@o2ib (no target)
LustreError: Skipped 2 previous similar messages
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0001_UUID: not available for connect from 10.20.30.1@o2ib (no target)
LustreError: Skipped 2 previous similar messages
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0000_UUID: not available for connect from 10.20.30.2@o2ib (no target)
LustreError: Skipped 2 previous similar messages
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0000_UUID: not available for connect from 10.20.30.3@o2ib (no target)
LustreError: Skipped 2 previous similar messages
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0000_UUID: not available for connect from 10.20.30.4@o2ib (no target)
LustreError: Skipped 2 previous similar messages
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0002_UUID: not available for connect from 10.20.30.9@o2ib (no target)
LustreError: Skipped 5 previous similar messages
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0000_UUID: not available for connect from 10.20.30.10@o2ib (no target)
LustreError: Skipped 5 previous similar messages
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0000_UUID: not available for connect from 10.20.30.13@o2ib (no target)
LustreError: Skipped 8 previous similar messages
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0001_UUID: not available for connect from 10.20.30.12@o2ib (no target)
LustreError: Skipped 20 previous similar messages
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0000_UUID: not available for connect from 10.20.30.36@o2ib (no target)
LustreError: Skipped 41 previous similar messages
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0000_UUID: not available for connect from 10.20.30.60@o2ib (no target)
LustreError: Skipped 80 previous similar messages
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
ib_send_cm_mra: cm_id_priv->id.state: 0x6
Lustre: scratch-OST0005: haven't heard from client a96ebeb0-a111-96e6-06c5-a514a12d0065 (at 10.20.30.41@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881057ef8800, cur 1433887413 expire 1433887263 last 1433887186
Lustre: Skipped 29 previous similar messages
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0000_UUID: not available for connect from 10.20.30.41@o2ib (no target)
LustreError: Skipped 32 previous similar messages
Lustre: scratch-OST0003: haven't heard from client 29b639ea-94fa-9f0b-daca-494c4f3196b0 (at 10.20.30.34@o2ib) in 228 seconds. I think it's dead, and I am evicting it. exp ffff881072ba0c00, cur 1435436728 expire 1435436578 last 1435436500
Lustre: Skipped 2 previous similar messages
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0000_UUID: not available for connect from 10.20.30.34@o2ib (no target)
LustreError: Skipped 2 previous similar messages
LNetError: 3890:0:(o2iblnd_cb.c:3012:kiblnd_check_txs_locked()) Timed out tx: active_txs, 1 seconds
LNetError: 3890:0:(o2iblnd_cb.c:3075:kiblnd_check_conns()) Timed out RDMA with 10.20.30.41@o2ib (24): c: 5, oc: 0, rc: 7
LustreError: 3890:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff880eadf8f9c0
LNet: Service thread pid 5085 was inactive for 200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Pid: 5085, comm: ll_ost_io03_001
Call Trace:
[] ? lock_timer_base+0x3c/0x70
[] schedule_timeout+0x192/0x2e0
[] ? process_timeout+0x0/0x10
[] cfs_waitq_timedwait+0x11/0x20 [libcfs]
[] target_bulk_io+0x3b8/0x910 [ptlrpc]
[] ? cfs_free+0xe/0x10 [libcfs]
[] ? cfs_crypto_unregister+0x0/0x60 [libcfs]
[] ? default_wake_function+0x0/0x20
[] ost_brw_read+0x103d/0x1340 [ost]
[] ? target_bulk_timeout+0x0/0xc0 [ptlrpc]
[] ? lustre_msg_buf+0x55/0x60 [ptlrpc]
[] ? class_handle2object+0x95/0x190 [obdclass]
[] ? lustre_msg_get_version+0x8c/0x100 [ptlrpc]
[] ? lustre_msg_check_version+0xe8/0x100 [ptlrpc]
[] ost_handle+0x2ac8/0x48e0 [ost]
[] ? ptlrpc_update_export_timer+0x4b/0x560 [ptlrpc]
[] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
[] ? cfs_timer_arm+0xe/0x10 [libcfs]
[] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
[] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
[] ? __wake_up+0x53/0x70
[] ptlrpc_main+0xace/0x1700 [ptlrpc]
[] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
[] child_rip+0xa/0x20
[] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
[] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
[] ? child_rip+0x0/0x20
LustreError: dumping log to /tmp/lustre-log.1436549578.5085
Lustre: scratch-OST0005: haven't heard from client cbb135df-3373-9291-449d-9a511b2fe23c (at 10.20.30.41@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881055a1e800, cur 1436549589 expire 1436549439 last 1436549362
Lustre: Skipped 2 previous similar messages
Lustre: scratch-OST0004: haven't heard from client cbb135df-3373-9291-449d-9a511b2fe23c (at 10.20.30.41@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881073668c00, cur 1436549604 expire 1436549454 last 1436549377
Lustre: Skipped 1 previous similar message
LustreError: 5085:0:(ldlm_lib.c:2725:target_bulk_io()) @@@ Eviction on bulk PUT req@ffff880db1715800 x1503539821929992/t0(0) o3->cbb135df-3373-9291-449d-9a511b2fe23c@10.20.30.41@o2ib:0/0 lens 488/432 e 4 to 0 dl 1436549774 ref 1 fl Interpret:/0/0 rc 0/0
Lustre: scratch-OST0004: Bulk IO read error with cbb135df-3373-9291-449d-9a511b2fe23c (at 10.20.30.41@o2ib), client will retry: rc -107
LNet: Service thread pid 5085 completed after 226.00s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0001_UUID: not available for connect from 10.20.30.41@o2ib (no target)
LustreError: Skipped 1 previous similar message
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0002_UUID: not available for connect from 10.20.30.69@o2ib (no target)
LustreError: Skipped 2 previous similar messages
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0000_UUID: not available for connect from 10.20.30.69@o2ib (no target)
LustreError: Skipped 2 previous similar messages
Lustre: scratch-OST0005: haven't heard from client f549c5b5-60c4-e42a-6ff1-83794d025b03 (at 10.20.30.41@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff880ec8252000, cur 1439184666 expire 1439184516 last 1439184439
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0001_UUID: not available for connect from 10.20.30.41@o2ib (no target)
LustreError: Skipped 2 previous similar messages
Lustre: scratch-OST0004: haven't heard from client 4e2c004b-610e-7f62-9f63-12c47f58485c (at 10.20.30.34@o2ib) in 228 seconds. I think it's dead, and I am evicting it. exp ffff880aec98d000, cur 1439229126 expire 1439228976 last 1439228898
Lustre: Skipped 2 previous similar messages
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0001_UUID: not available for connect from 10.20.30.34@o2ib (no target)
LustreError: Skipped 2 previous similar messages
LustreError: 1896:0:(ost_handler.c:1764:ost_blocking_ast()) Error -2 syncing data on lock cancel
LustreError: 1894:0:(ost_handler.c:1764:ost_blocking_ast()) Error -2 syncing data on lock cancel
LustreError: 1896:0:(ost_handler.c:1764:ost_blocking_ast()) Error -2 syncing data on lock cancel
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0001_UUID: not available for connect from 10.20.30.69@o2ib (no target)
LustreError: Skipped 2 previous similar messages
Lustre: scratch-OST0004: haven't heard from client 657c9d22-2f12-2e18-ddd9-50bcc79a7233 (at 10.20.30.41@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff880e7b3fbc00, cur 1442958520 expire 1442958370 last 1442958293
Lustre: Skipped 2 previous similar messages
Lustre: scratch-OST0003: haven't heard from client 657c9d22-2f12-2e18-ddd9-50bcc79a7233 (at 10.20.30.41@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff880a4cce2800, cur 1442958520 expire 1442958370 last 1442958293
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0001_UUID: not available for connect from 10.20.30.41@o2ib (no target)
LustreError: Skipped 2 previous similar messages
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0001_UUID: not available for connect from 10.20.30.69@o2ib (no target)
LustreError: Skipped 2 previous similar messages
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0001_UUID: not available for connect from 10.20.30.69@o2ib (no target)
LustreError: Skipped 2 previous similar messages
LustreError: 5039:0:(ost_handler.c:1764:ost_blocking_ast()) Error -2 syncing data on lock cancel
LustreError: 5041:0:(ost_handler.c:1764:ost_blocking_ast()) Error -2 syncing data on lock cancel
LustreError: 16847:0:(ost_handler.c:1764:ost_blocking_ast()) Error -2 syncing data on lock cancel
LustreError: 16847:0:(ost_handler.c:1764:ost_blocking_ast()) Skipped 1 previous similar message
LustreError: 16847:0:(ost_handler.c:1764:ost_blocking_ast()) Error -2 syncing data on lock cancel
Lustre: scratch-OST0003: haven't heard from client b8344be8-3d24-b225-9d20-5667496f777b (at 10.20.30.41@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88094f204c00, cur 1443756073 expire 1443755923 last 1443755846
Lustre: Skipped 1 previous similar message
Lustre: scratch-OST0005: haven't heard from client b8344be8-3d24-b225-9d20-5667496f777b (at 10.20.30.41@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff880e426b8800, cur 1443756074 expire 1443755924 last 1443755847
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0002_UUID: not available for connect from 10.20.30.41@o2ib (no target)
LustreError: Skipped 2 previous similar messages
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0001_UUID: not available for connect from 10.20.30.69@o2ib (no target)
LustreError: Skipped 2 previous similar messages
Lustre: scratch-OST0003: haven't heard from client 619f3599-1965-5d09-60d9-764f6ea8712f (at 10.20.30.34@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff880eb274bc00, cur 1444579978 expire 1444579828 last 1444579751
Lustre: Skipped 1 previous similar message
Lustre: scratch-OST0004: haven't heard from client 619f3599-1965-5d09-60d9-764f6ea8712f (at 10.20.30.34@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff880eac785400, cur 1444579978 expire 1444579828 last 1444579751
Lustre: scratch-OST0005: haven't heard from client 619f3599-1965-5d09-60d9-764f6ea8712f (at 10.20.30.34@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88103914ac00, cur 1444579986 expire 1444579836 last 1444579759
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0001_UUID: not available for connect from 10.20.30.34@o2ib (no target)
LustreError: Skipped 2 previous similar messages
ib_send_cm_mra: cm_id_priv->id.state: 0x6
LustreError: 137-5: scratch-OST0000_UUID: not available for connect from 10.20.30.41@o2ib (no target)
LustreError: Skipped 2 previous similar messages
Lustre: scratch-OST0005: haven't heard from client 49628f1f-dc9d-f93b-1178-45e0fc7cc42f (at 10.20.30.41@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff880ba6665400, cur 1444712827 expire 1444712677 last 1444712600
[root@mds2 ~]#
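
In the eviction messages in this log, the reported silence interval appears to equal `cur` minus `last` (for example 1433843261 - 1433843028 = 233), and `cur` minus `expire` is 150 in the entries shown. The following is a minimal Python sketch for tallying evictions per client NID and checking that relationship. It assumes the message wording seen in this output; the regular expression and the function name `summarize_evictions` are illustrative and not part of any Lustre tool.

```python
import re
from collections import Counter

# Matches the eviction message layout seen in this log. This is an assumption:
# other Lustre versions may word the message differently.
EVICT_RE = re.compile(
    r"(?P<target>\S+): haven't heard from client (?P<uuid>[0-9a-f-]+) "
    r"\(at (?P<nid>\S+)\) in (?P<secs>\d+) seconds\..*?"
    r"cur (?P<cur>\d+) expire (?P<expire>\d+) last (?P<last>\d+)",
    re.DOTALL,
)


def summarize_evictions(text):
    """Return (evictions per NID, records where secs != cur - last)."""
    counts = Counter()
    mismatches = []
    for m in EVICT_RE.finditer(text):
        counts[m["nid"]] += 1
        if int(m["cur"]) - int(m["last"]) != int(m["secs"]):
            mismatches.append(m.groupdict())
    return counts, mismatches


# One message copied from the log above.
sample = (
    "Lustre: scratch-OST0004: haven't heard from client "
    "51b6f8ba-0d6a-0cfd-1b31-bef2e50d9802 (at 10.20.30.1@o2ib) in 233 seconds. "
    "I think it's dead, and I am evicting it. exp ffff8808730b9c00, "
    "cur 1433843261 expire 1433843111 last 1433843028"
)
counts, mismatches = summarize_evictions(sample)
print(counts)
print(mismatches)
```

Feeding the whole log text to `summarize_evictions` would give a per-NID count, which may help show whether a few clients (such as 10.20.30.41@o2ib and 10.20.30.34@o2ib here) account for most of the later evictions.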