Jun 10 09:53:14 sandy776 kernel: LustreError: 2857:0:(o2iblnd_cb.c:1093:kiblnd_init_rdma()) RDMA too fragmented for 10.3.6.25@o2ib (256): 128/255 src 128/256 dst frags
Jun 10 09:53:14 sandy776 kernel: LustreError: 2857:0:(o2iblnd_cb.c:1589:kiblnd_reply()) Can't setup rdma for GET from 10.3.6.25@o2ib: -90
Jun 10 09:53:14 sandy776 kernel: LustreError: 2857:0:(events.c:199:client_bulk_callback()) event type 0, status -5, desc ffff88104c47a000
Jun 10 09:53:14 sandy776 kernel: Lustre: 4537:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1437440613076982 sent from lnec-OST000f-osc-ffff8808599d1000 to NID 10.3.6.25@o2ib 0s ago has failed due to network error (44s prior to deadline).
Jun 10 09:53:14 sandy776 kernel: req@ffff880854952400 x1437440613076982/t0 o4->lnec-OST000f_UUID@10.3.6.25@o2ib:6/4 lens 448/608 e 0 to 1 dl 1370850838 ref 1 fl Rpc:/0/0 rc 0/0
Jun 10 09:53:14 sandy776 kernel: Lustre: lnec-OST000f-osc-ffff8808599d1000: Connection to service lnec-OST000f via nid 10.3.6.25@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Jun 10 09:53:14 sandy776 kernel: LustreError: 2837:0:(o2iblnd_cb.c:1093:kiblnd_init_rdma()) RDMA too fragmented for 10.3.6.25@o2ib (256): 128/255 src 128/255 dst frags
Jun 10 09:53:14 sandy776 kernel: LustreError: 2837:0:(o2iblnd_cb.c:1589:kiblnd_reply()) Can't setup rdma for GET from 10.3.6.25@o2ib: -90
Jun 10 09:53:14 sandy776 kernel: LustreError: 2837:0:(events.c:199:client_bulk_callback()) event type 0, status -5, desc ffff88080c248000
Jun 10 09:53:14 sandy776 kernel: Lustre: 4547:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1437440613076974 sent from lnec-OST000d-osc-ffff8808599d1000 to NID 10.3.6.25@o2ib 0s ago has failed due to network error (44s prior to deadline).
Jun 10 09:53:14 sandy776 kernel: req@ffff881069a82400 x1437440613076974/t0 o4->lnec-OST000d_UUID@10.3.6.25@o2ib:6/4 lens 448/608 e 0 to 1 dl 1370850838 ref 1 fl Rpc:/0/0 rc 0/0
Jun 10 09:53:14 sandy776 kernel: Lustre: lnec-OST000d-osc-ffff8808599d1000: Connection to service lnec-OST000d via nid 10.3.6.25@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Jun 10 09:53:14 sandy776 kernel: LustreError: 2850:0:(events.c:199:client_bulk_callback()) event type 0, status -5, desc ffff881052734000
Jun 10 09:53:14 sandy776 kernel: LustreError: 2842:0:(events.c:199:client_bulk_callback()) event type 0, status -5, desc ffff88080c140000
Jun 10 09:53:14 sandy776 kernel: LustreError: 2841:0:(events.c:199:client_bulk_callback()) event type 0, status -5, desc ffff880819360000
Jun 10 09:53:14 sandy776 kernel: LustreError: 2826:0:(events.c:199:client_bulk_callback()) event type 0, status -5, desc ffff8808606c0000
Jun 10 09:53:14 sandy776 kernel: LustreError: 2840:0:(events.c:199:client_bulk_callback()) event type 0, status -5, desc ffff881052702000
Jun 10 09:53:14 sandy776 kernel: LustreError: 4482:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -11 from cancel RPC: canceling anyway
Jun 10 09:53:14 sandy776 kernel: LustreError: 4502:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -11
Jun 10 09:53:14 sandy776 kernel: LustreError: 11-0: an error occurred while communicating with 10.3.6.25@o2ib. The ost_connect operation failed with -16
Jun 10 09:53:14 sandy776 kernel: LustreError: 4482:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Skipped 1 previous similar message
Jun 10 09:53:14 sandy776 kernel: LustreError: 4482:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -11
Jun 10 09:53:14 sandy776 kernel: LustreError: 4490:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -11 from cancel RPC: canceling anyway
Jun 10 09:53:14 sandy776 kernel: LustreError: 4490:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Skipped 1 previous similar message
Jun 10 09:53:14 sandy776 kernel: LustreError: 4482:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) Skipped 2 previous similar messages
Jun 10 09:53:14 sandy776 kernel: LustreError: 2855:0:(events.c:199:client_bulk_callback()) event type 0, status -5, desc ffff88080c310000
Jun 10 09:53:14 sandy776 kernel: LustreError: 2827:0:(events.c:199:client_bulk_callback()) event type 0, status -5, desc ffff88104c63a000
Jun 10 09:53:14 sandy776 kernel: LustreError: 11-0: an error occurred while communicating with 10.3.6.25@o2ib. The ost_connect operation failed with -16
Jun 10 09:53:14 sandy776 kernel: LustreError: 2854:0:(events.c:199:client_bulk_callback()) event type 0, status -5, desc ffff881073248000
Jun 10 09:53:14 sandy776 kernel: LustreError: 2853:0:(events.c:199:client_bulk_callback()) event type 0, status -5, desc ffff88080c2cc000
Jun 10 09:53:21 sandy776 kernel: LustreError: 11-0: an error occurred while communicating with 10.3.6.25@o2ib. The ost_connect operation failed with -16
Jun 10 09:53:21 sandy776 kernel: LustreError: Skipped 2 previous similar messages
Jun 10 09:53:28 sandy776 kernel: LustreError: 11-0: an error occurred while communicating with 10.3.6.25@o2ib. The ost_connect operation failed with -16
Jun 10 09:53:28 sandy776 kernel: LustreError: Skipped 3 previous similar messages
Jun 10 09:53:35 sandy776 kernel: LustreError: 11-0: an error occurred while communicating with 10.3.6.25@o2ib. The ost_connect operation failed with -16
Jun 10 09:53:35 sandy776 kernel: LustreError: Skipped 3 previous similar messages
Jun 10 09:53:42 sandy776 kernel: LustreError: 11-0: an error occurred while communicating with 10.3.6.25@o2ib. The ost_connect operation failed with -16
Jun 10 09:53:42 sandy776 kernel: LustreError: Skipped 3 previous similar messages
Jun 10 09:53:56 sandy776 kernel: LustreError: 11-0: an error occurred while communicating with 10.3.6.25@o2ib. The ost_connect operation failed with -16
Jun 10 09:53:56 sandy776 kernel: LustreError: Skipped 7 previous similar messages
Jun 10 09:53:58 sandy776 kernel: Lustre: 4512:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1437440613076981 sent from lnec-OST000d-osc-ffff8808599d1000 to NID 10.3.6.25@o2ib 44s ago has timed out (44s prior to deadline).
Jun 10 09:53:58 sandy776 kernel: req@ffff880854945c00 x1437440613076981/t0 o4->lnec-OST000d_UUID@10.3.6.25@o2ib:6/4 lens 448/608 e 0 to 1 dl 1370850838 ref 1 fl Rpc:/0/0 rc 0/0
Jun 10 09:53:58 sandy776 kernel: Lustre: 4512:0:(client.c:1529:ptlrpc_expire_one_request()) Skipped 9 previous similar messages
Jun 10 09:54:13 sandy776 ntpd[3636]: synchronized to 192.168.120.2, stratum 4
Jun 10 09:54:17 sandy776 kernel: LustreError: 11-0: an error occurred while communicating with 10.3.6.25@o2ib. The ost_connect operation failed with -16
Jun 10 09:54:17 sandy776 kernel: LustreError: Skipped 11 previous similar messages
Jun 10 09:54:17 sandy776 kernel: Lustre: lnec-OST000f-osc-ffff8808599d1000: Connection restored to service lnec-OST000f using nid 10.3.6.25@o2ib.
Jun 10 09:54:17 sandy776 kernel: LustreError: 2833:0:(o2iblnd_cb.c:1093:kiblnd_init_rdma()) RDMA too fragmented for 10.3.6.25@o2ib (256): 128/255 src 128/256 dst frags
Jun 10 09:54:17 sandy776 kernel: LustreError: 2833:0:(o2iblnd_cb.c:1093:kiblnd_init_rdma()) Skipped 9 previous similar messages
Jun 10 09:54:17 sandy776 kernel: LustreError: 2833:0:(o2iblnd_cb.c:1589:kiblnd_reply()) Can't setup rdma for GET from 10.3.6.25@o2ib: -90
Jun 10 09:54:17 sandy776 kernel: LustreError: 2833:0:(o2iblnd_cb.c:1589:kiblnd_reply()) Skipped 9 previous similar messages
Jun 10 09:54:17 sandy776 kernel: LustreError: 2833:0:(events.c:199:client_bulk_callback()) event type 0, status -5, desc ffff881052708000
Jun 10 09:54:17 sandy776 kernel: LustreError: 2843:0:(events.c:199:client_bulk_callback()) event type 0, status -5, desc ffff881052480000
Jun 10 09:54:17 sandy776 kernel: Lustre: 4537:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1437440613077336 sent from lnec-OST000f-osc-ffff8808599d1000 to NID 10.3.6.25@o2ib 0s ago has failed due to network error (91s prior to deadline).
Jun 10 09:54:17 sandy776 kernel: req@ffff880854951c00 x1437440613077336/t0 o4->lnec-OST000f_UUID@10.3.6.25@o2ib:6/4 lens 448/608 e 1 to 1 dl 1370850948 ref 1 fl Rpc:/2/0 rc -11/0
Jun 10 09:54:17 sandy776 kernel: Lustre: 4537:0:(client.c:1529:ptlrpc_expire_one_request()) Skipped 3 previous similar messages
Jun 10 09:54:17 sandy776 kernel: Lustre: lnec-OST000f-osc-ffff8808599d1000: Connection to service lnec-OST000f via nid 10.3.6.25@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Jun 10 09:54:17 sandy776 kernel: Lustre: Skipped 2 previous similar messages
Jun 10 09:54:17 sandy776 kernel: Lustre: Server lnec-OST000f_UUID version (2.1.5.0) is much newer than client version (1.8.9)
Jun 10 09:54:17 sandy776 kernel: Lustre: Skipped 32 previous similar messages
Jun 10 09:54:18 sandy776 ntpd[3636]: synchronized to 192.168.120.3, stratum 4
Jun 10 09:54:52 sandy776 kernel: LustreError: 11-0: an error occurred while communicating with 10.3.6.25@o2ib. The ost_connect operation failed with -16
Jun 10 09:54:52 sandy776 kernel: LustreError: Skipped 17 previous similar messages
Jun 10 09:54:59 sandy776 kernel: LustreError: 167-0: This client was evicted by lnec-OST000c; in progress operations using this service will fail.
Jun 10 09:54:59 sandy776 kernel: Lustre: Server lnec-OST000c_UUID version (2.1.5.0) is much newer than client version (1.8.9)
Jun 10 09:54:59 sandy776 kernel: LustreError: 4547:0:(client.c:859:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff881069a82400 x1437440613076974/t0 o4->lnec-OST000d_UUID@10.3.6.25@o2ib:6/4 lens 448/608 e 0 to 1 dl 1370850838 ref 1 fl Rpc:EX/0/0 rc -4/0
Jun 10 09:54:59 sandy776 kernel: Lustre: lnec-OST000c-osc-ffff8808599d1000: Connection restored to service lnec-OST000c using nid 10.3.6.25@o2ib.
Jun 10 09:55:01 sandy776 kernel: Lustre: 4537:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1437440613077337 sent from lnec-OST000f-osc-ffff8808599d1000 to NID 10.3.6.25@o2ib 44s ago has timed out (44s prior to deadline).
Jun 10 09:55:01 sandy776 kernel: req@ffff880854952400 x1437440613077337/t0 o4->lnec-OST000f_UUID@10.3.6.25@o2ib:6/4 lens 448/608 e 0 to 1 dl 1370850901 ref 1 fl Rpc:/2/0 rc 0/0
Jun 10 09:55:01 sandy776 kernel: Lustre: 4537:0:(client.c:1529:ptlrpc_expire_one_request()) Skipped 1 previous similar message
Jun 10 09:55:19 sandy776 ntpd[3636]: synchronized to 192.168.142.158, stratum 4
Jun 10 09:55:20 sandy776 kernel: Lustre: lnec-OST000f-osc-ffff8808599d1000: Connection restored to service lnec-OST000f using nid 10.3.6.25@o2ib.
Jun 10 09:55:20 sandy776 kernel: Lustre: Skipped 2 previous similar messages
Jun 10 09:55:20 sandy776 kernel: Lustre: Server lnec-OST000f_UUID version (2.1.5.0) is much newer than client version (1.8.9)
Jun 10 09:55:20 sandy776 kernel: Lustre: Skipped 2 previous similar messages