Mar 2 23:31:06 c6 kernel: Lustre: 50154:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1459138783982664 sent from dc2-OST0058-osc-ffff886012dbc000 to NID 10.10.0.9@o2ib 7s ago has timed out (7s prior to deadline).
Mar 2 23:31:06 c6 kernel: Lustre: dc2-OST0058-osc-ffff886012dbc000: Connection to service dc2-OST0058 via nid 10.10.0.9@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 2 23:31:07 c6 kernel: Lustre: dc2-OST0058-osc-ffff886012dbc000: Connection restored to service dc2-OST0058 using nid 10.10.0.9@o2ib.
Mar 2 23:31:08 c6 kernel: Lustre: Server dc2-OST0058_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 3 16:49:05 c6 kernel: Lustre: 6527:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1459138785397358 sent from dc2-OST0082-osc-ffff886012dbc000 to NID 10.10.0.13@o2ib 17s ago has timed out (17s prior to deadline).
Mar 3 16:49:05 c6 kernel: Lustre: dc2-OST0082-osc-ffff886012dbc000: Connection to service dc2-OST0082 via nid 10.10.0.13@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 3 16:49:06 c6 kernel: Lustre: 6527:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1459138785397361 sent from dc2-OST0085-osc-ffff886012dbc000 to NID 10.10.0.13@o2ib 19s ago has timed out (17s prior to deadline).
Mar 3 16:49:09 c6 kernel: Lustre: dc2-OST0082-osc-ffff886012dbc000: Connection restored to service dc2-OST0082 using nid 10.10.0.13@o2ib.
Mar 3 16:49:09 c6 kernel: Lustre: Server dc2-OST0082_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 3 16:49:09 c6 kernel: Lustre: dc2-OST0085-osc-ffff886012dbc000: Connection to service dc2-OST0085 via nid 10.10.0.13@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 3 16:49:09 c6 kernel: Lustre: 6527:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1459138785397362 sent from dc2-OST0086-osc-ffff886012dbc000 to NID 10.10.0.13@o2ib 22s ago has timed out (17s prior to deadline).
Mar 3 16:49:12 c6 kernel: Lustre: dc2-OST0085-osc-ffff886012dbc000: Connection restored to service dc2-OST0085 using nid 10.10.0.13@o2ib.
Mar 3 16:49:12 c6 kernel: Lustre: Server dc2-OST0085_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 3 16:49:12 c6 kernel: Lustre: dc2-OST0086-osc-ffff886012dbc000: Connection to service dc2-OST0086 via nid 10.10.0.13@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 3 16:49:13 c6 kernel: Lustre: 6527:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1459138785397364 sent from dc2-OST0088-osc-ffff886012dbc000 to NID 10.10.0.13@o2ib 26s ago has timed out (17s prior to deadline).
Mar 3 16:49:15 c6 kernel: Lustre: dc2-OST0086-osc-ffff886012dbc000: Connection restored to service dc2-OST0086 using nid 10.10.0.13@o2ib.
Mar 3 16:49:15 c6 kernel: Lustre: Server dc2-OST0086_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 3 16:49:15 c6 kernel: Lustre: dc2-OST0088-osc-ffff886012dbc000: Connection to service dc2-OST0088 via nid 10.10.0.13@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 3 16:49:17 c6 kernel: Lustre: dc2-OST0088-osc-ffff886012dbc000: Connection restored to service dc2-OST0088 using nid 10.10.0.13@o2ib.
Mar 3 16:49:17 c6 kernel: Lustre: Server dc2-OST0088_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 3 16:52:09 c6 kernel: Lustre: 59481:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1459138785400496 sent from dc2-OST001c-osc-ffff886012dbc000 to NID 10.10.0.3@o2ib 8s ago has timed out (8s prior to deadline).
Mar 3 16:52:09 c6 kernel: Lustre: dc2-OST001c-osc-ffff886012dbc000: Connection to service dc2-OST001c via nid 10.10.0.3@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 3 16:52:11 c6 kernel: LustreError: 59481:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -11 from cancel RPC: canceling anyway
Mar 3 16:52:11 c6 kernel: LustreError: 59481:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -11
Mar 3 16:52:13 c6 kernel: Lustre: dc2-OST001c-osc-ffff886012dbc000: Connection restored to service dc2-OST001c using nid 10.10.0.3@o2ib.
Mar 3 16:52:13 c6 kernel: Lustre: Server dc2-OST001c_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 3 17:22:26 c6 kernel: Lustre: 6527:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1459138785418548 sent from dc2-OST0089-osc-ffff886012dbc000 to NID 10.10.0.14@o2ib 17s ago has timed out (17s prior to deadline).
Mar 3 17:22:26 c6 kernel: Lustre: dc2-OST0089-osc-ffff886012dbc000: Connection to service dc2-OST0089 via nid 10.10.0.14@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 3 17:22:27 c6 kernel: Lustre: 6527:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1459138785418549 sent from dc2-OST008a-osc-ffff886012dbc000 to NID 10.10.0.14@o2ib 20s ago has timed out (17s prior to deadline).
Mar 3 17:22:29 c6 kernel: Lustre: dc2-OST0089-osc-ffff886012dbc000: Connection restored to service dc2-OST0089 using nid 10.10.0.14@o2ib.
Mar 3 17:22:29 c6 kernel: Lustre: Server dc2-OST0089_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 3 17:22:29 c6 kernel: Lustre: dc2-OST008a-osc-ffff886012dbc000: Connection to service dc2-OST008a via nid 10.10.0.14@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 3 17:22:32 c6 kernel: Lustre: 6527:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1459138785418550 sent from dc2-OST008b-osc-ffff886012dbc000 to NID 10.10.0.14@o2ib 23s ago has timed out (17s prior to deadline).
Mar 3 17:22:32 c6 kernel: Lustre: dc2-OST008a-osc-ffff886012dbc000: Connection restored to service dc2-OST008a using nid 10.10.0.14@o2ib.
Mar 3 17:22:32 c6 kernel: Lustre: Server dc2-OST008a_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 3 17:22:32 c6 kernel: Lustre: dc2-OST008b-osc-ffff886012dbc000: Connection to service dc2-OST008b via nid 10.10.0.14@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 3 17:22:33 c6 kernel: Lustre: 6527:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1459138785418551 sent from dc2-OST008c-osc-ffff886012dbc000 to NID 10.10.0.14@o2ib 26s ago has timed out (17s prior to deadline).
Mar 3 17:22:36 c6 kernel: Lustre: dc2-OST008b-osc-ffff886012dbc000: Connection restored to service dc2-OST008b using nid 10.10.0.14@o2ib.
Mar 3 17:22:36 c6 kernel: Lustre: Server dc2-OST008b_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 3 17:22:36 c6 kernel: Lustre: dc2-OST008c-osc-ffff886012dbc000: Connection to service dc2-OST008c via nid 10.10.0.14@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 3 17:22:37 c6 kernel: Lustre: dc2-OST008c-osc-ffff886012dbc000: Connection restored to service dc2-OST008c using nid 10.10.0.14@o2ib.
Mar 3 17:22:37 c6 kernel: Lustre: Server dc2-OST008c_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 3 17:26:59 c6 kernel: Lustre: 6539:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1459138785424846 sent from dc2-OST003b-osc-ffff886012dbc000 to NID 10.10.0.6@o2ib 7s ago has timed out (7s prior to deadline).
Mar 3 17:26:59 c6 kernel: Lustre: 6539:0:(client.c:1529:ptlrpc_expire_one_request()) Skipped 5 previous similar messages
Mar 3 17:26:59 c6 kernel: Lustre: dc2-OST003b-osc-ffff886012dbc000: Connection to service dc2-OST003b via nid 10.10.0.6@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 3 17:27:00 c6 kernel: Lustre: Skipped 5 previous similar messages
Mar 3 17:27:02 c6 kernel: LustreError: 6539:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -11 from cancel RPC: canceling anyway
Mar 3 17:27:02 c6 kernel: Lustre: dc2-OST003b-osc-ffff886012dbc000: Connection restored to service dc2-OST003b using nid 10.10.0.6@o2ib.
Mar 3 17:27:02 c6 kernel: Lustre: Skipped 5 previous similar messages
Mar 3 17:27:02 c6 kernel: Lustre: Server dc2-OST003b_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 3 17:27:02 c6 kernel: Lustre: Skipped 5 previous similar messages
Mar 3 17:27:02 c6 kernel: LustreError: 6539:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -11
Mar 3 17:31:12 c6 kernel: Lustre: 50155:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1459138785426835 sent from dc2-OST0020-osc-ffff886012dbc000 to NID 10.10.0.4@o2ib 48s ago has timed out (48s prior to deadline).
Mar 3 17:31:12 c6 kernel: Lustre: dc2-OST0020-osc-ffff886012dbc000: Connection to service dc2-OST0020 via nid 10.10.0.4@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 3 17:31:14 c6 kernel: LustreError: 50155:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -11 from cancel RPC: canceling anyway
Mar 3 17:31:14 c6 kernel: Lustre: dc2-OST0020-osc-ffff886012dbc000: Connection restored to service dc2-OST0020 using nid 10.10.0.4@o2ib.
Mar 3 17:31:14 c6 kernel: Lustre: Server dc2-OST0020_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 3 17:31:14 c6 kernel: LustreError: 50155:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -11
Mar 3 17:49:56 c6 kernel: Lustre: 56794:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1459138785438510 sent from dc2-OST0097-osc-ffff886012dbc000 to NID 10.10.0.15@o2ib 7s ago has timed out (7s prior to deadline).
Mar 3 17:49:56 c6 kernel: Lustre: dc2-OST0097-osc-ffff886012dbc000: Connection to service dc2-OST0097 via nid 10.10.0.15@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 3 17:49:59 c6 kernel: LustreError: 56794:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -11 from cancel RPC: canceling anyway
Mar 3 17:49:59 c6 kernel: Lustre: dc2-OST0097-osc-ffff886012dbc000: Connection restored to service dc2-OST0097 using nid 10.10.0.15@o2ib.
Mar 3 17:49:59 c6 kernel: Lustre: Server dc2-OST0097_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 3 17:49:59 c6 kernel: LustreError: 56794:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -11
Mar 3 19:10:55 c6 kernel: Lustre: 56799:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1459138785486703 sent from dc2-OST000e-osc-ffff886012dbc000 to NID 10.10.0.2@o2ib 8s ago has timed out (8s prior to deadline).
Mar 3 19:10:55 c6 kernel: Lustre: dc2-OST000e-osc-ffff886012dbc000: Connection to service dc2-OST000e via nid 10.10.0.2@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 3 19:10:57 c6 kernel: LustreError: 56799:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -11 from cancel RPC: canceling anyway
Mar 3 19:10:57 c6 kernel: Lustre: dc2-OST000e-osc-ffff886012dbc000: Connection restored to service dc2-OST000e using nid 10.10.0.2@o2ib.
Mar 3 19:10:57 c6 kernel: Lustre: Server dc2-OST000e_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 3 19:10:57 c6 kernel: LustreError: 56799:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -11
Mar 3 19:11:10 c6 kernel: Lustre: 6527:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1459138785486766 sent from dc2-OST003c-osc-ffff886012dbc000 to NID 10.10.0.6@o2ib 17s ago has timed out (17s prior to deadline).
Mar 3 19:11:10 c6 kernel: Lustre: dc2-OST003c-osc-ffff886012dbc000: Connection to service dc2-OST003c via nid 10.10.0.6@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 3 19:11:11 c6 kernel: Lustre: dc2-OST003c-osc-ffff886012dbc000: Connection restored to service dc2-OST003c using nid 10.10.0.6@o2ib.
Mar 3 19:11:11 c6 kernel: Lustre: Server dc2-OST003c_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 4 07:58:52 c6 kernel: LustreError: 1434:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -108 from cancel RPC: canceling anyway
Mar 4 07:58:52 c6 kernel: LustreError: 1434:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -108
Mar 4 07:58:53 c6 kernel: LustreError: 1434:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -108 from cancel RPC: canceling anyway
Mar 4 07:58:53 c6 kernel: LustreError: 1434:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -108
Mar 4 07:58:54 c6 kernel: LustreError: 1434:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -108 from cancel RPC: canceling anyway
Mar 4 07:58:54 c6 kernel: LustreError: 1434:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -108
Mar 4 07:58:55 c6 kernel: LustreError: 11-0: an error occurred while communicating with 149.165.235.173@tcp. The mgs_disconnect operation failed with -107
Mar 4 07:58:55 c6 kernel: Lustre: client client(ffff886012794c00) umount complete
Mar 4 08:00:13 c6 kernel: LustreError: 1446:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -108 from cancel RPC: canceling anyway
Mar 4 08:00:13 c6 kernel: LustreError: 1446:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Skipped 27 previous similar messages
Mar 4 08:00:13 c6 kernel: LustreError: 1446:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -108
Mar 4 08:00:14 c6 kernel: LustreError: 1446:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) Skipped 27 previous similar messages
Mar 4 08:00:15 c6 kernel: LustreError: 1446:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -108 from cancel RPC: canceling anyway
Mar 4 08:00:15 c6 kernel: LustreError: 1446:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Skipped 8903 previous similar messages
Mar 4 08:00:16 c6 kernel: LustreError: 1446:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -108
Mar 4 08:00:16 c6 kernel: LustreError: 1446:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) Skipped 8903 previous similar messages
Mar 4 08:00:19 c6 kernel: LustreError: 1446:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -108 from cancel RPC: canceling anyway
Mar 4 08:00:19 c6 kernel: LustreError: 1446:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Skipped 56338 previous similar messages
Mar 4 08:00:20 c6 kernel: LustreError: 1446:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -108
Mar 4 08:00:20 c6 kernel: LustreError: 1446:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) Skipped 56338 previous similar messages
Mar 4 08:00:27 c6 kernel: LustreError: 1446:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -108 from cancel RPC: canceling anyway
Mar 4 08:00:27 c6 kernel: LustreError: 1446:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Skipped 168569 previous similar messages
Mar 4 08:00:28 c6 kernel: LustreError: 1446:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -108
Mar 4 08:00:28 c6 kernel: LustreError: 1446:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) Skipped 168652 previous similar messages
Mar 4 08:00:43 c6 kernel: LustreError: 1446:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -108 from cancel RPC: canceling anyway
Mar 4 08:00:43 c6 kernel: LustreError: 1446:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Skipped 415739 previous similar messages
Mar 4 08:00:43 c6 kernel: LustreError: 1446:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -108
Mar 4 08:00:44 c6 kernel: LustreError: 1446:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) Skipped 415656 previous similar messages
Mar 4 08:01:15 c6 kernel: LustreError: 1446:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -108 from cancel RPC: canceling anyway
Mar 4 08:01:15 c6 kernel: LustreError: 1446:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Skipped 961835 previous similar messages
Mar 4 08:01:16 c6 kernel: LustreError: 1446:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -108
Mar 4 08:01:16 c6 kernel: LustreError: 1446:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) Skipped 962837 previous similar messages
Mar 4 08:01:24 c6 kernel: Lustre: client dc2-client(ffff886012dbc000) umount complete
Mar 4 16:09:25 c6 kernel: Lustre: Acceptor stopping
Mar 4 16:09:28 c6 kernel: Lustre: Removed LNI 149.165.226.206@tcp
Mar 4 16:12:25 c6 kernel: Lustre: Build Version: v1_8_9_WC1-g171bd56-CHANGED-2.6.32-431.3.1.el6.x86_64
Mar 4 16:12:25 c6 kernel: Lustre: Added LNI 149.165.226.206@tcp [8/256/0/0]
Mar 4 16:12:25 c6 kernel: Lustre: Accept secure, port 988
Mar 4 16:12:27 c6 kernel: Lustre: Lustre Client File System; http://www.lustre.org/
Mar 4 16:12:27 c6 kernel: Lustre: MGC10.10.0.171@o2ib: Reactivating import
Mar 4 16:12:27 c6 kernel: Lustre: Server MGS version (2.1.6.0) is much newer than client version (1.8.9)
Mar 4 16:12:28 c6 kernel: Lustre: 3561:0:(obd_config.c:1130:class_config_llog_handler()) skipping 'lmv' config: cmd=cf001,clilmv:lmv
Mar 4 16:12:28 c6 kernel: Lustre: 3561:0:(obd_config.c:1130:class_config_llog_handler()) skipping 'lmv' config: cmd=cf003,clilmv:
Mar 4 16:12:29 c6 kernel: Lustre: Server dc2-MDT0000_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 4 16:12:29 c6 kernel: Lustre: client supports 64-bits dir hash/offset!
Mar 4 16:12:30 c6 kernel: Lustre: Client dc2-client(ffff88200f247c00) mount complete
Mar 4 16:12:30 c6 kernel: Lustre: Server dc2-OST0000_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 4 16:12:34 c6 kernel: Lustre: This looks like an old mount command; I will try to contact MDT 'mds-wan' for profile 'client'
Mar 4 16:12:34 c6 kernel: Lustre: MGC149.165.235.173@tcp: Reactivating import
Mar 4 16:12:34 c6 kernel: Lustre: setting import MGS INACTIVE by administrator request
Mar 4 16:12:35 c6 kernel: Lustre: Client client(ffff884011ab8400) mount complete
Mar 4 17:58:16 c6 kernel: Lustre: 3541:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1461680912521641 sent from dc2-OST002f-osc-ffff88200f247c00 to NID 10.10.0.5@o2ib 17s ago has timed out (17s prior to deadline).
Mar 4 17:58:16 c6 kernel: Lustre: dc2-OST002f-osc-ffff88200f247c00: Connection to service dc2-OST002f via nid 10.10.0.5@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 4 17:58:17 c6 kernel: Lustre: dc2-OST002f-osc-ffff88200f247c00: Connection restored to service dc2-OST002f using nid 10.10.0.5@o2ib.
Mar 4 17:58:17 c6 kernel: Lustre: Server dc2-OST002f_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 4 17:58:18 c6 kernel: Lustre: Skipped 167 previous similar messages
Mar 4 18:58:31 c6 kernel: Lustre: Build Version: v1_8_9_WC1-g171bd56-CHANGED-2.6.32-431.3.1.el6.x86_64
Mar 4 18:58:31 c6 kernel: Lustre: Added LNI 149.165.226.206@tcp [8/256/0/0]
Mar 4 18:58:32 c6 kernel: Lustre: Accept secure, port 988
Mar 4 18:58:33 c6 kernel: Lustre: Lustre Client File System; http://www.lustre.org/
Mar 4 18:58:34 c6 kernel: Lustre: This looks like an old mount command; I will try to contact MDT 'mds-wan' for profile 'client'
Mar 4 18:58:34 c6 kernel: Lustre: MGC149.165.235.173@tcp: Reactivating import
Mar 4 18:58:34 c6 kernel: Lustre: setting import MGS INACTIVE by administrator request
Mar 4 18:58:34 c6 kernel: Lustre: Client client(ffff887fff1a4800) mount complete
Mar 4 18:58:41 c6 kernel: Lustre: MGC10.10.0.171@o2ib: Reactivating import
Mar 4 18:58:41 c6 kernel: Lustre: Server MGS version (2.1.6.0) is much newer than client version (1.8.9)
Mar 4 18:58:43 c6 kernel: Lustre: 3416:0:(obd_config.c:1130:class_config_llog_handler()) skipping 'lmv' config: cmd=cf001,clilmv:lmv
Mar 4 18:58:43 c6 kernel: Lustre: 3416:0:(obd_config.c:1130:class_config_llog_handler()) skipping 'lmv' config: cmd=cf014,clilmv:dc2-MDT0000_UUID
Mar 4 18:58:43 c6 kernel: Lustre: 3416:0:(obd_config.c:1130:class_config_llog_handler()) Skipped 1 previous similar message
Mar 4 18:58:44 c6 kernel: Lustre: Server dc2-MDT0000_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 4 18:58:44 c6 kernel: Lustre: client supports 64-bits dir hash/offset!
Mar 4 18:58:44 c6 kernel: Lustre: Client dc2-client(ffff8880133b3400) mount complete
Mar 4 18:58:44 c6 kernel: Lustre: Server dc2-OST0000_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 5 05:40:54 c6 kernel: Lustre: 3378:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1461691363080775 sent from MDC_mds03.local_mds-wan_MNT_client-ffff887fff1a4800 to NID 149.165.235.173@tcp 17s ago has timed out (17s prior to deadline).
Mar 5 05:40:54 c6 kernel: Lustre: MDC_mds03.local_mds-wan_MNT_client-ffff887fff1a4800: Connection to service mds-wan via nid 149.165.235.173@tcp was lost; in progress operations using this service will wait for recovery to complete.
Mar 5 05:41:02 c6 kernel: Lustre: 3379:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1461691363080994 sent from MDC_mds03.local_mds-wan_MNT_client-ffff887fff1a4800 to NID 149.165.235.173@tcp 6s ago has timed out (6s prior to deadline).
Mar 5 05:41:10 c6 kernel: Lustre: 3379:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1461691363081213 sent from MDC_mds03.local_mds-wan_MNT_client-ffff887fff1a4800 to NID 149.165.235.174@tcp 6s ago has timed out (6s prior to deadline).
Mar 5 05:41:11 c6 kernel: Lustre: 3380:0:(import.c:517:import_select_connection()) MDC_mds03.local_mds-wan_MNT_client-ffff887fff1a4800: tried all connections, increasing latency to 2s
Mar 5 05:41:11 c6 kernel: Lustre: MDC_mds03.local_mds-wan_MNT_client-ffff887fff1a4800: Connection restored to service mds-wan using nid 149.165.235.173@tcp.
Mar 5 09:58:26 c6 kernel: LustreError: 11-0: an error occurred while communicating with 10.10.0.3@o2ib. The ost_write operation failed with -107
Mar 5 09:58:26 c6 kernel: Lustre: dc2-OST001f-osc-ffff8880133b3400: Connection to service dc2-OST001f via nid 10.10.0.3@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 5 09:58:28 c6 kernel: LustreError: 167-0: This client was evicted by dc2-OST001f; in progress operations using this service will fail.
Mar 5 09:58:29 c6 kernel: Lustre: Server dc2-OST001f_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 5 09:58:29 c6 kernel: LustreError: 3378:0:(client.c:859:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff881fcf164400 x1461691363490723/t0 o4->dc2-OST001f_UUID@10.10.0.3@o2ib:6/4 lens 448/608 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0
Mar 5 09:58:29 c6 kernel: Lustre: Skipped 167 previous similar messages
Mar 5 09:58:30 c6 kernel: LustreError: 11196:0:(ldlm_resource.c:521:ldlm_namespace_cleanup()) Namespace dc2-OST001f-osc-ffff8880133b3400 resource refcount nonzero (1) after lock cleanup; forcing cleanup.
Mar 5 09:58:30 c6 kernel: LustreError: 11137:0:(llite_mmap.c:210:ll_tree_unlock()) couldn't unlock -5
Mar 5 09:58:30 c6 kernel: LustreError: 11196:0:(ldlm_resource.c:526:ldlm_namespace_cleanup()) Resource: ffff881822142840 (23321452/0/0/0) (rc: 1)
Mar 5 09:58:31 c6 kernel: Lustre: dc2-OST001f-osc-ffff8880133b3400: Connection restored to service dc2-OST001f using nid 10.10.0.3@o2ib.
Mar 5 10:00:26 c6 kernel: Lustre: dc2-OST0071-osc-ffff8880133b3400: Connection to service dc2-OST0071 via nid 10.10.0.11@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 5 10:00:26 c6 kernel: LustreError: 11141:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -107 from cancel RPC: canceling anyway
Mar 5 10:00:31 c6 kernel: LustreError: 167-0: This client was evicted by dc2-OST0071; in progress operations using this service will fail.
Mar 5 10:00:31 c6 kernel: Lustre: Server dc2-OST0071_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 5 10:00:31 c6 kernel: LustreError: 3378:0:(client.c:859:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff881fcb891000 x1461691363510571/t0 o4->dc2-OST0071_UUID@10.10.0.11@o2ib:6/4 lens 448/608 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0
Mar 5 10:00:31 c6 kernel: LustreError: 3378:0:(client.c:859:ptlrpc_import_delay_req()) Skipped 10 previous similar messages
Mar 5 10:00:31 c6 kernel: LustreError: 11236:0:(ldlm_resource.c:521:ldlm_namespace_cleanup()) Namespace dc2-OST0071-osc-ffff8880133b3400 resource refcount nonzero (2) after lock cleanup; forcing cleanup.
Mar 5 10:00:31 c6 kernel: LustreError: 11236:0:(ldlm_resource.c:526:ldlm_namespace_cleanup()) Resource: ffff8815fbef3500 (22484368/0/0/0) (rc: 2)
Mar 5 10:00:31 c6 kernel: Lustre: dc2-OST0071-osc-ffff8880133b3400: Connection restored to service dc2-OST0071 using nid 10.10.0.11@o2ib.
Mar 5 10:00:31 c6 kernel: LustreError: 11137:0:(llite_mmap.c:210:ll_tree_unlock()) couldn't unlock -5
Mar 5 10:00:31 c6 kernel: LustreError: 11141:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -107
Mar 5 10:08:18 c6 kernel: LustreError: 11-0: an error occurred while communicating with 10.10.0.13@o2ib. The ost_write operation failed with -107
Mar 5 10:08:18 c6 kernel: Lustre: dc2-OST0080-osc-ffff8880133b3400: Connection to service dc2-OST0080 via nid 10.10.0.13@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 5 10:08:20 c6 kernel: LustreError: 167-0: This client was evicted by dc2-OST0080; in progress operations using this service will fail.
Mar 5 10:08:22 c6 kernel: Lustre: Server dc2-OST0080_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 5 10:08:22 c6 kernel: LustreError: 3378:0:(client.c:859:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff881fcf589000 x1461691363597566/t0 o4->dc2-OST0080_UUID@10.10.0.13@o2ib:6/4 lens 448/608 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0
Mar 5 10:08:22 c6 kernel: LustreError: 3378:0:(client.c:859:ptlrpc_import_delay_req()) Skipped 2 previous similar messages
Mar 5 10:08:22 c6 kernel: Lustre: dc2-OST0080-osc-ffff8880133b3400: Connection restored to service dc2-OST0080 using nid 10.10.0.13@o2ib.
Mar 5 10:12:14 c6 kernel: Lustre: 11137:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1461691363632107 sent from dc2-OST0061-osc-ffff8880133b3400 to NID 10.10.0.10@o2ib 7s ago has timed out (7s prior to deadline).
Mar 5 10:12:14 c6 kernel: Lustre: dc2-OST0061-osc-ffff8880133b3400: Connection to service dc2-OST0061 via nid 10.10.0.10@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 5 10:12:15 c6 kernel: Lustre: dc2-OST0061-osc-ffff8880133b3400: Connection restored to service dc2-OST0061 using nid 10.10.0.10@o2ib.
Mar 5 10:12:15 c6 kernel: Lustre: Server dc2-OST0061_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 5 10:13:56 c6 kernel: LustreError: 11-0: an error occurred while communicating with 10.10.0.10@o2ib. The obd_ping operation failed with -107
Mar 5 10:13:56 c6 kernel: Lustre: dc2-OST0061-osc-ffff8880133b3400: Connection to service dc2-OST0061 via nid 10.10.0.10@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 5 10:13:57 c6 kernel: LustreError: 167-0: This client was evicted by dc2-OST0061; in progress operations using this service will fail.
Mar 5 10:13:57 c6 kernel: Lustre: Server dc2-OST0061_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 5 10:13:59 c6 kernel: LustreError: 11292:0:(ldlm_resource.c:521:ldlm_namespace_cleanup()) Namespace dc2-OST0061-osc-ffff8880133b3400 resource refcount nonzero (1) after lock cleanup; forcing cleanup.
Mar 5 10:13:59 c6 kernel: LustreError: 11137:0:(llite_mmap.c:210:ll_tree_unlock()) couldn't unlock -5
Mar 5 10:13:59 c6 kernel: LustreError: 11292:0:(ldlm_resource.c:526:ldlm_namespace_cleanup()) Resource: ffff8807abed5500 (23279772/0/0/0) (rc: 1)
Mar 5 10:13:59 c6 kernel: Lustre: dc2-OST0061-osc-ffff8880133b3400: Connection restored to service dc2-OST0061 using nid 10.10.0.10@o2ib.
Mar 5 10:26:54 c6 kernel: LustreError: 11-0: an error occurred while communicating with 10.10.0.8@o2ib. The obd_ping operation failed with -107
Mar 5 10:26:54 c6 kernel: Lustre: dc2-OST0052-osc-ffff8880133b3400: Connection to service dc2-OST0052 via nid 10.10.0.8@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 5 10:26:54 c6 kernel: LustreError: 3390:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -107 from cancel RPC: canceling anyway
Mar 5 10:26:54 c6 kernel: LustreError: 3390:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -107
Mar 5 10:26:54 c6 kernel: LustreError: 167-0: This client was evicted by dc2-OST0052; in progress operations using this service will fail.
Mar 5 10:26:54 c6 kernel: Lustre: Server dc2-OST0052_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 5 10:26:55 c6 kernel: LustreError: 3378:0:(client.c:859:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff881fcd138000 x1461691363753866/t0 o4->dc2-OST0052_UUID@10.10.0.8@o2ib:6/4 lens 448/608 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0
Mar 5 10:26:55 c6 kernel: LustreError: 3378:0:(client.c:859:ptlrpc_import_delay_req()) Skipped 2 previous similar messages
Mar 5 10:26:57 c6 kernel: LustreError: 11332:0:(ldlm_resource.c:521:ldlm_namespace_cleanup()) Namespace dc2-OST0052-osc-ffff8880133b3400 resource refcount nonzero (1) after lock cleanup; forcing cleanup.
Mar 5 10:26:57 c6 kernel: LustreError: 11137:0:(llite_mmap.c:210:ll_tree_unlock()) couldn't unlock -5
Mar 5 10:26:57 c6 kernel: LustreError: 11332:0:(ldlm_resource.c:526:ldlm_namespace_cleanup()) Resource: ffff8819af05dc80 (23058921/0/0/0) (rc: 1)
Mar 5 10:26:58 c6 kernel: Lustre: dc2-OST0052-osc-ffff8880133b3400: Connection restored to service dc2-OST0052 using nid 10.10.0.8@o2ib.
Mar 5 10:39:55 c6 kernel: Lustre: 11137:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1461691363821323 sent from dc2-OST009c-osc-ffff8880133b3400 to NID 10.10.0.15@o2ib 7s ago has timed out (7s prior to deadline).
Mar 5 10:39:55 c6 kernel: Lustre: dc2-OST009c-osc-ffff8880133b3400: Connection to service dc2-OST009c via nid 10.10.0.15@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 5 10:39:56 c6 kernel: Lustre: dc2-OST009c-osc-ffff8880133b3400: Connection restored to service dc2-OST009c using nid 10.10.0.15@o2ib.
Mar 5 10:39:56 c6 kernel: Lustre: Server dc2-OST009c_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 5 10:41:51 c6 kernel: LustreError: 11-0: an error occurred while communicating with 10.10.0.15@o2ib. The obd_ping operation failed with -107
Mar 5 10:41:51 c6 kernel: Lustre: dc2-OST009c-osc-ffff8880133b3400: Connection to service dc2-OST009c via nid 10.10.0.15@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 5 10:41:53 c6 kernel: LustreError: 167-0: This client was evicted by dc2-OST009c; in progress operations using this service will fail.
Mar 5 10:41:53 c6 kernel: Lustre: Server dc2-OST009c_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 5 10:41:54 c6 kernel: LustreError: 11415:0:(ldlm_resource.c:521:ldlm_namespace_cleanup()) Namespace dc2-OST009c-osc-ffff8880133b3400 resource refcount nonzero (1) after lock cleanup; forcing cleanup.
Mar 5 10:41:54 c6 kernel: LustreError: 11137:0:(llite_mmap.c:210:ll_tree_unlock()) couldn't unlock -5
Mar 5 10:41:54 c6 kernel: LustreError: 11415:0:(ldlm_resource.c:526:ldlm_namespace_cleanup()) Resource: ffff88116cdfac80 (23152881/0/0/0) (rc: 1)
Mar 5 10:41:55 c6 kernel: Lustre: dc2-OST009c-osc-ffff8880133b3400: Connection restored to service dc2-OST009c using nid 10.10.0.15@o2ib.
Mar 5 10:54:48 c6 kernel: Lustre: 11137:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1461691363867291 sent from dc2-OST006f-osc-ffff8880133b3400 to NID 10.10.0.11@o2ib 8s ago has timed out (8s prior to deadline).
Mar 5 10:54:48 c6 kernel: Lustre: dc2-OST006f-osc-ffff8880133b3400: Connection to service dc2-OST006f via nid 10.10.0.11@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 5 10:54:49 c6 kernel: Lustre: dc2-OST006f-osc-ffff8880133b3400: Connection restored to service dc2-OST006f using nid 10.10.0.11@o2ib.
Mar 5 10:54:49 c6 kernel: Lustre: Server dc2-OST006f_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 5 10:56:51 c6 kernel: LustreError: 11-0: an error occurred while communicating with 10.10.0.11@o2ib. The obd_ping operation failed with -107
Mar 5 10:56:51 c6 kernel: Lustre: dc2-OST006f-osc-ffff8880133b3400: Connection to service dc2-OST006f via nid 10.10.0.11@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 5 10:56:53 c6 kernel: LustreError: 167-0: This client was evicted by dc2-OST006f; in progress operations using this service will fail.
Mar 5 10:56:53 c6 kernel: Lustre: Server dc2-OST006f_UUID version (2.1.6.0) is much newer than client version (1.8.9)
Mar 5 10:56:54 c6 kernel: LustreError: 11463:0:(ldlm_resource.c:521:ldlm_namespace_cleanup()) Namespace dc2-OST006f-osc-ffff8880133b3400 resource refcount nonzero (1) after lock cleanup; forcing cleanup.
Mar 5 10:56:54 c6 kernel: LustreError: 11137:0:(llite_mmap.c:210:ll_tree_unlock()) couldn't unlock -5
Mar 5 10:56:54 c6 kernel: LustreError: 11463:0:(ldlm_resource.c:526:ldlm_namespace_cleanup()) Resource: ffff8817d095be40 (20791927/0/0/0) (rc: 1)
Mar 5 10:56:55 c6 kernel: Lustre: dc2-OST006f-osc-ffff8880133b3400: Connection restored to service dc2-OST006f using nid 10.10.0.11@o2ib.
Mar 5 11:01:51 c6 kernel: LustreError: 11-0: an error occurred while communicating with 10.10.0.8@o2ib. The obd_ping operation failed with -107
Mar 5 11:01:51 c6 kernel: Lustre: dc2-OST004e-osc-ffff8880133b3400: Connection to service dc2-OST004e via nid 10.10.0.8@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Mar 5 11:01:53 c6 kernel: LustreError: 167-0: This client was evicted by dc2-OST004e; in progress operations using this service will fail.
Mar 5 11:01:53 c6 kernel: Lustre: Server dc2-OST004e_UUID version (2.1.6.0) is much newer than client version (1.8.9) Mar 5 11:01:54 c6 kernel: LustreError: 3378:0:(client.c:859:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff881fcc54e800 x1461691363884537/t0 o4->dc2-OST004e_UUID@10.10.0.8@o2ib:6/4 lens 448/608 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 Mar 5 11:01:55 c6 kernel: LustreError: 3378:0:(client.c:859:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff881fcc54cc00 x1461691363884538/t0 o4->dc2-OST004e_UUID@10.10.0.8@o2ib:6/4 lens 448/608 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 Mar 5 11:01:55 c6 kernel: LustreError: 3378:0:(client.c:859:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff881fcc54c800 x1461691363884539/t0 o4->dc2-OST004e_UUID@10.10.0.8@o2ib:6/4 lens 448/608 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 Mar 5 11:01:56 c6 kernel: Lustre: dc2-OST004e-osc-ffff8880133b3400: Connection restored to service dc2-OST004e using nid 10.10.0.8@o2ib. Mar 5 11:19:47 c6 kernel: Lustre: 11137:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1461691363942189 sent from dc2-OST005c-osc-ffff8880133b3400 to NID 10.10.0.9@o2ib 7s ago has timed out (7s prior to deadline). Mar 5 11:19:48 c6 kernel: Lustre: dc2-OST005c-osc-ffff8880133b3400: Connection to service dc2-OST005c via nid 10.10.0.9@o2ib was lost; in progress operations using this service will wait for recovery to complete. Mar 5 11:19:48 c6 kernel: Lustre: dc2-OST005c-osc-ffff8880133b3400: Connection restored to service dc2-OST005c using nid 10.10.0.9@o2ib. Mar 5 11:19:48 c6 kernel: Lustre: Server dc2-OST005c_UUID version (2.1.6.0) is much newer than client version (1.8.9) Mar 5 11:21:52 c6 kernel: LustreError: 11-0: an error occurred while communicating with 10.10.0.9@o2ib. 
The obd_ping operation failed with -107 Mar 5 11:21:52 c6 kernel: Lustre: dc2-OST005c-osc-ffff8880133b3400: Connection to service dc2-OST005c via nid 10.10.0.9@o2ib was lost; in progress operations using this service will wait for recovery to complete. Mar 5 11:21:53 c6 kernel: LustreError: 167-0: This client was evicted by dc2-OST005c; in progress operations using this service will fail. Mar 5 11:21:54 c6 kernel: Lustre: Server dc2-OST005c_UUID version (2.1.6.0) is much newer than client version (1.8.9) Mar 5 11:21:55 c6 kernel: LustreError: 11557:0:(ldlm_resource.c:521:ldlm_namespace_cleanup()) Namespace dc2-OST005c-osc-ffff8880133b3400 resource refcount nonzero (1) after lock cleanup; forcing cleanup. Mar 5 11:21:55 c6 kernel: LustreError: 11137:0:(llite_mmap.c:210:ll_tree_unlock()) couldn't unlock -5 Mar 5 11:21:55 c6 kernel: LustreError: 11557:0:(ldlm_resource.c:526:ldlm_namespace_cleanup()) Resource: ffff8814ae8b4680 (23607308/0/0/0) (rc: 1) Mar 5 11:21:56 c6 kernel: Lustre: dc2-OST005c-osc-ffff8880133b3400: Connection restored to service dc2-OST005c using nid 10.10.0.9@o2ib. Mar 5 11:47:29 c6 kernel: Lustre: 11137:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1461691364026276 sent from dc2-OST008e-osc-ffff8880133b3400 to NID 10.10.0.14@o2ib 7s ago has timed out (7s prior to deadline). Mar 5 11:47:29 c6 kernel: Lustre: dc2-OST008e-osc-ffff8880133b3400: Connection to service dc2-OST008e via nid 10.10.0.14@o2ib was lost; in progress operations using this service will wait for recovery to complete. Mar 5 11:47:30 c6 kernel: Lustre: dc2-OST008e-osc-ffff8880133b3400: Connection restored to service dc2-OST008e using nid 10.10.0.14@o2ib. Mar 5 11:47:30 c6 kernel: Lustre: Server dc2-OST008e_UUID version (2.1.6.0) is much newer than client version (1.8.9) Mar 5 11:49:21 c6 kernel: LustreError: 11-0: an error occurred while communicating with 10.10.0.14@o2ib. 
The obd_ping operation failed with -107 Mar 5 11:49:21 c6 kernel: Lustre: dc2-OST008e-osc-ffff8880133b3400: Connection to service dc2-OST008e via nid 10.10.0.14@o2ib was lost; in progress operations using this service will wait for recovery to complete. Mar 5 11:49:22 c6 kernel: LustreError: 167-0: This client was evicted by dc2-OST008e; in progress operations using this service will fail. Mar 5 11:49:22 c6 kernel: Lustre: Server dc2-OST008e_UUID version (2.1.6.0) is much newer than client version (1.8.9) Mar 5 11:49:24 c6 kernel: LustreError: 11642:0:(ldlm_resource.c:521:ldlm_namespace_cleanup()) Namespace dc2-OST008e-osc-ffff8880133b3400 resource refcount nonzero (1) after lock cleanup; forcing cleanup. Mar 5 11:49:24 c6 kernel: LustreError: 11137:0:(llite_mmap.c:210:ll_tree_unlock()) couldn't unlock -5 Mar 5 11:49:24 c6 kernel: LustreError: 11642:0:(ldlm_resource.c:526:ldlm_namespace_cleanup()) Resource: ffff880d1e40fc80 (21850653/0/0/0) (rc: 1) Mar 5 11:49:25 c6 kernel: Lustre: dc2-OST008e-osc-ffff8880133b3400: Connection restored to service dc2-OST008e using nid 10.10.0.14@o2ib. Mar 5 12:07:59 c6 kernel: Lustre: 11356:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1461691364077893 sent from dc2-OST0066-osc-ffff8880133b3400 to NID 10.10.0.10@o2ib 9s ago has timed out (9s prior to deadline). Mar 5 12:07:59 c6 kernel: Lustre: dc2-OST0066-osc-ffff8880133b3400: Connection to service dc2-OST0066 via nid 10.10.0.10@o2ib was lost; in progress operations using this service will wait for recovery to complete. Mar 5 12:08:01 c6 kernel: LustreError: 11356:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -11 from cancel RPC: canceling anyway Mar 5 12:08:01 c6 kernel: LustreError: 11356:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -11 Mar 5 12:08:02 c6 kernel: Lustre: dc2-OST0066-osc-ffff8880133b3400: Connection restored to service dc2-OST0066 using nid 10.10.0.10@o2ib. 
Mar 5 12:08:02 c6 kernel: Lustre: Server dc2-OST0066_UUID version (2.1.6.0) is much newer than client version (1.8.9) Mar 5 12:17:41 c6 kernel: LustreError: 11-0: an error occurred while communicating with 10.10.0.8@o2ib. The obd_ping operation failed with -107 Mar 5 12:17:41 c6 kernel: Lustre: dc2-OST004b-osc-ffff8880133b3400: Connection to service dc2-OST004b via nid 10.10.0.8@o2ib was lost; in progress operations using this service will wait for recovery to complete. Mar 5 12:17:43 c6 kernel: LustreError: 167-0: This client was evicted by dc2-OST004b; in progress operations using this service will fail. Mar 5 12:17:43 c6 kernel: Lustre: Server dc2-OST004b_UUID version (2.1.6.0) is much newer than client version (1.8.9) Mar 5 12:17:44 c6 kernel: LustreError: 3378:0:(client.c:859:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff881fcaf8b000 x1461691364110191/t0 o4->dc2-OST004b_UUID@10.10.0.8@o2ib:6/4 lens 448/608 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 Mar 5 12:17:44 c6 kernel: Lustre: dc2-OST004b-osc-ffff8880133b3400: Connection restored to service dc2-OST004b using nid 10.10.0.8@o2ib. Mar 5 12:20:11 c6 kernel: Lustre: 11137:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1461691364117357 sent from dc2-OST0052-osc-ffff8880133b3400 to NID 10.10.0.8@o2ib 19s ago has timed out (19s prior to deadline). Mar 5 12:20:11 c6 kernel: Lustre: dc2-OST0052-osc-ffff8880133b3400: Connection to service dc2-OST0052 via nid 10.10.0.8@o2ib was lost; in progress operations using this service will wait for recovery to complete. Mar 5 12:20:12 c6 kernel: Lustre: dc2-OST0052-osc-ffff8880133b3400: Connection restored to service dc2-OST0052 using nid 10.10.0.8@o2ib. Mar 5 12:20:12 c6 kernel: Lustre: Server dc2-OST0052_UUID version (2.1.6.0) is much newer than client version (1.8.9) Mar 5 12:22:16 c6 kernel: LustreError: 11-0: an error occurred while communicating with 10.10.0.8@o2ib. 
The obd_ping operation failed with -107 Mar 5 12:22:16 c6 kernel: Lustre: dc2-OST0052-osc-ffff8880133b3400: Connection to service dc2-OST0052 via nid 10.10.0.8@o2ib was lost; in progress operations using this service will wait for recovery to complete. Mar 5 12:22:18 c6 kernel: LustreError: 167-0: This client was evicted by dc2-OST0052; in progress operations using this service will fail. Mar 5 12:22:18 c6 kernel: Lustre: Server dc2-OST0052_UUID version (2.1.6.0) is much newer than client version (1.8.9) Mar 5 12:22:20 c6 kernel: LustreError: 11792:0:(ldlm_resource.c:521:ldlm_namespace_cleanup()) Namespace dc2-OST0052-osc-ffff8880133b3400 resource refcount nonzero (1) after lock cleanup; forcing cleanup. Mar 5 12:22:20 c6 kernel: LustreError: 11137:0:(llite_mmap.c:210:ll_tree_unlock()) couldn't unlock -5 Mar 5 12:22:20 c6 kernel: LustreError: 11792:0:(ldlm_resource.c:526:ldlm_namespace_cleanup()) Resource: ffff881c5bc2e0c0 (23060169/0/0/0) (rc: 1) Mar 5 12:22:20 c6 kernel: Lustre: dc2-OST0052-osc-ffff8880133b3400: Connection restored to service dc2-OST0052 using nid 10.10.0.8@o2ib. Mar 5 12:47:15 c6 kernel: Lustre: 11356:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1461691364202791 sent from dc2-OST000b-osc-ffff8880133b3400 to NID 10.10.0.2@o2ib 7s ago has timed out (7s prior to deadline). Mar 5 12:47:15 c6 kernel: Lustre: dc2-OST000b-osc-ffff8880133b3400: Connection to service dc2-OST000b via nid 10.10.0.2@o2ib was lost; in progress operations using this service will wait for recovery to complete. Mar 5 12:47:17 c6 kernel: LustreError: 11356:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -11 from cancel RPC: canceling anyway Mar 5 12:47:17 c6 kernel: Lustre: dc2-OST000b-osc-ffff8880133b3400: Connection restored to service dc2-OST000b using nid 10.10.0.2@o2ib. 
Mar 5 12:47:17 c6 kernel: Lustre: Server dc2-OST000b_UUID version (2.1.6.0) is much newer than client version (1.8.9) Mar 5 12:47:17 c6 kernel: LustreError: 11356:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -11 Mar 5 12:57:19 c6 kernel: LustreError: 11-0: an error occurred while communicating with 10.10.0.8@o2ib. The ost_write operation failed with -107 Mar 5 12:57:19 c6 kernel: Lustre: dc2-OST004e-osc-ffff8880133b3400: Connection to service dc2-OST004e via nid 10.10.0.8@o2ib was lost; in progress operations using this service will wait for recovery to complete. Mar 5 12:57:20 c6 kernel: LustreError: 167-0: This client was evicted by dc2-OST004e; in progress operations using this service will fail. Mar 5 12:57:20 c6 kernel: Lustre: Server dc2-OST004e_UUID version (2.1.6.0) is much newer than client version (1.8.9) Mar 5 12:57:21 c6 kernel: LustreError: 3378:0:(client.c:859:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff881fcc54d800 x1461691364237124/t0 o4->dc2-OST004e_UUID@10.10.0.8@o2ib:6/4 lens 448/608 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 Mar 5 12:57:23 c6 kernel: LustreError: 11903:0:(ldlm_resource.c:521:ldlm_namespace_cleanup()) Namespace dc2-OST004e-osc-ffff8880133b3400 resource refcount nonzero (1) after lock cleanup; forcing cleanup. Mar 5 12:57:23 c6 kernel: LustreError: 11137:0:(llite_mmap.c:210:ll_tree_unlock()) couldn't unlock -5 Mar 5 12:57:23 c6 kernel: LustreError: 11903:0:(ldlm_resource.c:526:ldlm_namespace_cleanup()) Resource: ffff88098801bb00 (20398513/0/0/0) (rc: 1) Mar 5 12:57:24 c6 kernel: Lustre: dc2-OST004e-osc-ffff8880133b3400: Connection restored to service dc2-OST004e using nid 10.10.0.8@o2ib. Mar 5 13:00:25 c6 kernel: Lustre: 3389:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1461691364245020 sent from dc2-OST0070-osc-ffff8880133b3400 to NID 10.10.0.11@o2ib 9s ago has timed out (9s prior to deadline). 
Mar 5 13:00:25 c6 kernel: Lustre: dc2-OST0070-osc-ffff8880133b3400: Connection to service dc2-OST0070 via nid 10.10.0.11@o2ib was lost; in progress operations using this service will wait for recovery to complete. Mar 5 13:00:26 c6 kernel: LustreError: 3389:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -11 from cancel RPC: canceling anyway Mar 5 13:00:27 c6 kernel: Lustre: dc2-OST0070-osc-ffff8880133b3400: Connection restored to service dc2-OST0070 using nid 10.10.0.11@o2ib. Mar 5 13:00:27 c6 kernel: Lustre: Server dc2-OST0070_UUID version (2.1.6.0) is much newer than client version (1.8.9) Mar 5 13:00:27 c6 kernel: LustreError: 3389:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -11 Mar 5 13:05:30 c6 kernel: Lustre: 3378:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1461691364259142 sent from dc2-OST006e-osc-ffff8880133b3400 to NID 10.10.0.11@o2ib 23s ago has timed out (19s prior to deadline). Mar 5 13:05:30 c6 kernel: Lustre: dc2-OST006e-osc-ffff8880133b3400: Connection to service dc2-OST006e via nid 10.10.0.11@o2ib was lost; in progress operations using this service will wait for recovery to complete. Mar 5 13:05:31 c6 kernel: Lustre: dc2-OST006e-osc-ffff8880133b3400: Connection restored to service dc2-OST006e using nid 10.10.0.11@o2ib. Mar 5 13:05:31 c6 kernel: Lustre: Server dc2-OST006e_UUID version (2.1.6.0) is much newer than client version (1.8.9) Mar 5 13:08:56 c6 kernel: LustreError: 11-0: an error occurred while communicating with 10.10.0.8@o2ib. The obd_ping operation failed with -107 Mar 5 13:08:56 c6 kernel: Lustre: dc2-OST0053-osc-ffff8880133b3400: Connection to service dc2-OST0053 via nid 10.10.0.8@o2ib was lost; in progress operations using this service will wait for recovery to complete. Mar 5 13:08:58 c6 kernel: LustreError: 167-0: This client was evicted by dc2-OST0053; in progress operations using this service will fail. 
Mar 5 13:08:58 c6 kernel: Lustre: Server dc2-OST0053_UUID version (2.1.6.0) is much newer than client version (1.8.9) Mar 5 13:08:59 c6 kernel: LustreError: 3378:0:(client.c:859:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff881fcbbccc00 x1461691364271647/t0 o4->dc2-OST0053_UUID@10.10.0.8@o2ib:6/4 lens 448/608 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 Mar 5 13:09:00 c6 kernel: Lustre: dc2-OST0053-osc-ffff8880133b3400: Connection restored to service dc2-OST0053 using nid 10.10.0.8@o2ib. Mar 5 13:12:02 c6 kernel: Lustre: 11137:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1461691364282248 sent from dc2-OST004c-osc-ffff8880133b3400 to NID 10.10.0.8@o2ib 10s ago has timed out (10s prior to deadline). Mar 5 13:12:02 c6 kernel: Lustre: dc2-OST004c-osc-ffff8880133b3400: Connection to service dc2-OST004c via nid 10.10.0.8@o2ib was lost; in progress operations using this service will wait for recovery to complete. Mar 5 13:12:03 c6 kernel: LustreError: 167-0: This client was evicted by dc2-OST004c; in progress operations using this service will fail. Mar 5 13:12:03 c6 kernel: Lustre: Server dc2-OST004c_UUID version (2.1.6.0) is much newer than client version (1.8.9) Mar 5 13:12:04 c6 kernel: LustreError: 11137:0:(llite_mmap.c:210:ll_tree_unlock()) couldn't unlock -5 Mar 5 13:12:05 c6 kernel: LustreError: 3378:0:(client.c:859:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff881fce363400 x1461691364282467/t0 o4->dc2-OST004c_UUID@10.10.0.8@o2ib:6/4 lens 448/608 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 Mar 5 13:12:05 c6 kernel: LustreError: 3378:0:(client.c:859:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff881fce364400 x1461691364282468/t0 o4->dc2-OST004c_UUID@10.10.0.8@o2ib:6/4 lens 448/608 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 Mar 5 13:12:06 c6 kernel: Lustre: dc2-OST004c-osc-ffff8880133b3400: Connection restored to service dc2-OST004c using nid 10.10.0.8@o2ib. 
Mar 5 13:14:20 c6 kernel: Lustre: 11137:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1461691364289261 sent from dc2-OST0051-osc-ffff8880133b3400 to NID 10.10.0.8@o2ib 10s ago has timed out (10s prior to deadline). Mar 5 13:14:20 c6 kernel: Lustre: dc2-OST0051-osc-ffff8880133b3400: Connection to service dc2-OST0051 via nid 10.10.0.8@o2ib was lost; in progress operations using this service will wait for recovery to complete. Mar 5 13:14:22 c6 kernel: Lustre: dc2-OST0051-osc-ffff8880133b3400: Connection restored to service dc2-OST0051 using nid 10.10.0.8@o2ib. Mar 5 13:14:22 c6 kernel: Lustre: Server dc2-OST0051_UUID version (2.1.6.0) is much newer than client version (1.8.9) Mar 5 13:16:26 c6 kernel: LustreError: 11-0: an error occurred while communicating with 10.10.0.8@o2ib. The obd_ping operation failed with -107 Mar 5 13:16:26 c6 kernel: Lustre: dc2-OST0051-osc-ffff8880133b3400: Connection to service dc2-OST0051 via nid 10.10.0.8@o2ib was lost; in progress operations using this service will wait for recovery to complete. Mar 5 13:16:27 c6 kernel: LustreError: 167-0: This client was evicted by dc2-OST0051; in progress operations using this service will fail. Mar 5 13:16:27 c6 kernel: Lustre: Server dc2-OST0051_UUID version (2.1.6.0) is much newer than client version (1.8.9) Mar 5 13:16:29 c6 kernel: LustreError: 11978:0:(ldlm_resource.c:521:ldlm_namespace_cleanup()) Namespace dc2-OST0051-osc-ffff8880133b3400 resource refcount nonzero (1) after lock cleanup; forcing cleanup. Mar 5 13:16:29 c6 kernel: LustreError: 11137:0:(llite_mmap.c:210:ll_tree_unlock()) couldn't unlock -5 Mar 5 13:16:29 c6 kernel: LustreError: 11978:0:(ldlm_resource.c:526:ldlm_namespace_cleanup()) Resource: ffff880e2318e840 (23603679/0/0/0) (rc: 1) Mar 5 13:16:30 c6 kernel: Lustre: dc2-OST0051-osc-ffff8880133b3400: Connection restored to service dc2-OST0051 using nid 10.10.0.8@o2ib. 
Mar 5 13:18:24 c6 kernel: Lustre: dc2-OST002c-osc-ffff8880133b3400: Connection to service dc2-OST002c via nid 10.10.0.5@o2ib was lost; in progress operations using this service will wait for recovery to complete. Mar 5 13:18:24 c6 kernel: LustreError: 11356:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -107 from cancel RPC: canceling anyway Mar 5 13:18:29 c6 kernel: LustreError: 167-0: This client was evicted by dc2-OST002c; in progress operations using this service will fail. Mar 5 13:18:29 c6 kernel: Lustre: Server dc2-OST002c_UUID version (2.1.6.0) is much newer than client version (1.8.9) Mar 5 13:18:29 c6 kernel: LustreError: 3378:0:(client.c:859:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff881fcc1a9000 x1461691364303307/t0 o4->dc2-OST002c_UUID@10.10.0.5@o2ib:6/4 lens 448/608 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 Mar 5 13:18:29 c6 kernel: LustreError: 11984:0:(ldlm_resource.c:521:ldlm_namespace_cleanup()) Namespace dc2-OST002c-osc-ffff8880133b3400 resource refcount nonzero (2) after lock cleanup; forcing cleanup. Mar 5 13:18:29 c6 kernel: LustreError: 11984:0:(ldlm_resource.c:526:ldlm_namespace_cleanup()) Resource: ffff881589b2f240 (22024075/0/0/0) (rc: 2) Mar 5 13:18:29 c6 kernel: Lustre: dc2-OST002c-osc-ffff8880133b3400: Connection restored to service dc2-OST002c using nid 10.10.0.5@o2ib. Mar 5 13:18:29 c6 kernel: LustreError: 11137:0:(lov_request.c:211:lov_update_enqueue_set()) enqueue objid 0x1843a subobj 0x1500f8b on OST idx 44: rc -5 Mar 5 13:18:29 c6 kernel: LustreError: 11356:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -107 Mar 5 13:39:26 c6 kernel: LustreError: 11-0: an error occurred while communicating with 10.10.0.11@o2ib. The ldlm_enqueue operation failed with -107 Mar 5 13:39:26 c6 kernel: Lustre: dc2-OST0073-osc-ffff8880133b3400: Connection to service dc2-OST0073 via nid 10.10.0.11@o2ib was lost; in progress operations using this service will wait for recovery to complete. 
Mar 5 13:39:28 c6 kernel: LustreError: 11265:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -11 from cancel RPC: canceling anyway Mar 5 13:39:28 c6 kernel: LustreError: 11265:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -11 Mar 5 13:39:35 c6 kernel: LustreError: 167-0: This client was evicted by dc2-OST0073; in progress operations using this service will fail. Mar 5 13:39:37 c6 kernel: Lustre: Server dc2-OST0073_UUID version (2.1.6.0) is much newer than client version (1.8.9) Mar 5 13:39:37 c6 kernel: LustreError: 13543:0:(file.c:1001:ll_glimpse_size()) obd_enqueue returned rc -4, returning -EIO Mar 5 13:39:37 c6 kernel: LustreError: 3378:0:(client.c:859:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff881fd195bc00 x1461691364654913/t0 o4->dc2-OST0073_UUID@10.10.0.11@o2ib:6/4 lens 448/608 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 Mar 5 13:39:37 c6 kernel: LustreError: 3378:0:(client.c:859:ptlrpc_import_delay_req()) Skipped 1 previous similar message Mar 5 13:39:38 c6 kernel: LustreError: 13616:0:(ldlm_resource.c:521:ldlm_namespace_cleanup()) Namespace dc2-OST0073-osc-ffff8880133b3400 resource refcount nonzero (1) after lock cleanup; forcing cleanup. Mar 5 13:39:38 c6 kernel: LustreError: 13462:0:(llite_mmap.c:210:ll_tree_unlock()) couldn't unlock -5 Mar 5 13:39:38 c6 kernel: LustreError: 13616:0:(ldlm_resource.c:526:ldlm_namespace_cleanup()) Resource: ffff8815c0741b00 (21885599/0/0/0) (rc: 1) Mar 5 13:39:39 c6 kernel: Lustre: dc2-OST0073-osc-ffff8880133b3400: Connection restored to service dc2-OST0073 using nid 10.10.0.11@o2ib. Mar 5 13:44:23 c6 kernel: LustreError: 11-0: an error occurred while communicating with 10.10.0.8@o2ib. The ldlm_enqueue operation failed with -107 Mar 5 13:44:24 c6 kernel: Lustre: dc2-OST0052-osc-ffff8880133b3400: Connection to service dc2-OST0052 via nid 10.10.0.8@o2ib was lost; in progress operations using this service will wait for recovery to complete. 
Mar 5 13:44:31 c6 kernel: LustreError: 167-0: This client was evicted by dc2-OST0052; in progress operations using this service will fail. Mar 5 13:44:31 c6 kernel: Lustre: Server dc2-OST0052_UUID version (2.1.6.0) is much newer than client version (1.8.9) Mar 5 13:44:31 c6 kernel: LustreError: 15981:0:(file.c:1001:ll_glimpse_size()) obd_enqueue returned rc -4, returning -EIO Mar 5 13:44:31 c6 kernel: LustreError: 15948:0:(file.c:1001:ll_glimpse_size()) obd_enqueue returned rc -4, returning -EIO Mar 5 13:44:31 c6 kernel: LustreError: 3378:0:(osc_request.c:2357:brw_interpret()) ASSERTION(!(aa->aa_oa->o_valid & OBD_MD_FLHANDLE)) failed Mar 5 13:44:31 c6 kernel: LustreError: 3378:0:(osc_request.c:2357:brw_interpret()) LBUG