HOSTS -------------------------------------------------------------------------
cpu-e-1056
-------------------------------------------------------------------------------
-- Logs begin at Tue 2019-08-20 19:50:00 BST, end at Wed 2019-08-21 11:52:03 BST. --
Aug 21 01:22:42 cpu-e-1056 kernel: Lustre: Mounted fs1-client
Aug 21 01:55:30 cpu-e-1056 kernel: Adding 15999996k swap on /dev/sda2. Priority:-2 extents:1 across:15999996k SSFS
HOSTS -------------------------------------------------------------------------
cpu-e-1054
-------------------------------------------------------------------------------
-- Logs begin at Tue 2019-08-20 19:50:04 BST, end at Wed 2019-08-21 11:52:03 BST. --
Aug 21 01:22:42 cpu-e-1054 kernel: Lustre: Mounted fs1-client
Aug 21 01:55:29 cpu-e-1054 kernel: Adding 15999996k swap on /dev/sda2. Priority:-2 extents:1 across:15999996k SSFS
HOSTS -------------------------------------------------------------------------
cpu-e-1059
-------------------------------------------------------------------------------
-- Logs begin at Tue 2019-08-20 19:50:03 BST, end at Wed 2019-08-21 11:52:03 BST.
-- Aug 21 01:22:42 cpu-e-1059 kernel: Lustre: Mounted fs1-client Aug 21 01:29:23 cpu-e-1059 kernel: Lustre: 77583:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1566347361/real 1566347363] req@ffff904554aa0d80 x1642422187213168/t0(0) o101->fs1-OST00da-osc-ffff9043a428f000@10.47.18.19@o2ib1:28/4 lens 328/400 e 0 to 1 dl 1566347368 ref 2 fl Rpc:ReX/0/ffffffff rc 0/-1 Aug 21 01:29:23 cpu-e-1059 kernel: Lustre: fs1-OST00da-osc-ffff9043a428f000: Connection to fs1-OST00da (at 10.47.18.19@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:29:23 cpu-e-1059 kernel: Lustre: fs1-OST00da-osc-ffff9043a428f000: Connection restored to 10.47.18.19@o2ib1 (at 10.47.18.19@o2ib1) Aug 21 01:29:23 cpu-e-1059 kernel: LustreError: 66144:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff904548845000 Aug 21 01:29:23 cpu-e-1059 kernel: LustreError: 66144:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff904548845000 Aug 21 01:29:23 cpu-e-1059 kernel: LustreError: 66144:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff90453ac3f600 Aug 21 01:29:23 cpu-e-1059 kernel: LustreError: 66144:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff90453ac3f600 Aug 21 01:29:23 cpu-e-1059 kernel: LustreError: 66144:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff905cc3574200 Aug 21 01:29:23 cpu-e-1059 kernel: LustreError: 66144:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff905cc3574200 Aug 21 01:29:24 cpu-e-1059 kernel: LustreError: 66144:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff90453f7f4400 Aug 21 01:29:24 cpu-e-1059 kernel: Lustre: 66641:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1566347363/real 1566347364] req@ffff9035b33c0d80 
x1642422187217552/t0(0) o103->fs1-OST00be-osc-ffff9043a428f000@10.47.18.16@o2ib1:17/18 lens 328/224 e 0 to 1 dl 1566347375 ref 1 fl Rpc:eX/0/ffffffff rc 0/-1 Aug 21 01:29:24 cpu-e-1059 kernel: Lustre: 66641:0:(client.c:2134:ptlrpc_expire_one_request()) Skipped 15 previous similar messages Aug 21 01:29:24 cpu-e-1059 kernel: Lustre: fs1-OST00be-osc-ffff9043a428f000: Connection to fs1-OST00be (at 10.47.18.16@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:29:24 cpu-e-1059 kernel: Lustre: Skipped 15 previous similar messages Aug 21 01:29:24 cpu-e-1059 kernel: Lustre: fs1-OST00be-osc-ffff9043a428f000: Connection restored to 10.47.18.16@o2ib1 (at 10.47.18.16@o2ib1) Aug 21 01:29:24 cpu-e-1059 kernel: Lustre: Skipped 14 previous similar messages Aug 21 01:29:28 cpu-e-1059 kernel: Lustre: 77593:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1566347361/real 1566347361] req@ffff905d59416c00 x1642422187213040/t0(0) o101->fs1-OST00e8-osc-ffff9043a428f000@10.47.18.20@o2ib1:28/4 lens 328/400 e 0 to 1 dl 1566347368 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Aug 21 01:29:28 cpu-e-1059 kernel: Lustre: fs1-OST00e8-osc-ffff9043a428f000: Connection to fs1-OST00e8 (at 10.47.18.20@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:29:28 cpu-e-1059 kernel: Lustre: fs1-OST00e8-osc-ffff9043a428f000: Connection restored to 10.47.18.20@o2ib1 (at 10.47.18.20@o2ib1) Aug 21 01:29:54 cpu-e-1059 kernel: Lustre: fs1-OST0117-osc-ffff9043a428f000: Connection restored to 10.47.18.24@o2ib1 (at 10.47.18.24@o2ib1) Aug 21 01:29:54 cpu-e-1059 kernel: Lustre: Skipped 2 previous similar messages Aug 21 01:33:37 cpu-e-1059 kernel: Lustre: 66636:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1566347361/real 1566347361] req@ffff90455537a880 x1642422187213408/t0(0) 
o3->fs1-OST0087-osc-ffff9043a428f000@10.47.18.12@o2ib1:6/4 lens 488/440 e 2 to 1 dl 1566347455 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Aug 21 01:33:37 cpu-e-1059 kernel: Lustre: 66636:0:(client.c:2134:ptlrpc_expire_one_request()) Skipped 2 previous similar messages Aug 21 01:33:37 cpu-e-1059 kernel: Lustre: fs1-OST0087-osc-ffff9043a428f000: Connection to fs1-OST0087 (at 10.47.18.12@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:33:37 cpu-e-1059 kernel: Lustre: Skipped 2 previous similar messages Aug 21 01:33:37 cpu-e-1059 kernel: Lustre: fs1-OST0087-osc-ffff9043a428f000: Connection restored to 10.47.18.12@o2ib1 (at 10.47.18.12@o2ib1) Aug 21 01:33:47 cpu-e-1059 kernel: Lustre: 66635:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1566347364/real 1566347364] req@ffff904557931f80 x1642422187220016/t0(0) o3->fs1-OST00b5-osc-ffff9043a428f000@10.47.18.16@o2ib1:6/4 lens 488/440 e 3 to 1 dl 1566347446 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Aug 21 01:33:47 cpu-e-1059 kernel: Lustre: 66635:0:(client.c:2134:ptlrpc_expire_one_request()) Skipped 1 previous similar message Aug 21 01:33:47 cpu-e-1059 kernel: Lustre: fs1-OST00b5-osc-ffff9043a428f000: Connection to fs1-OST00b5 (at 10.47.18.16@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:33:47 cpu-e-1059 kernel: Lustre: Skipped 1 previous similar message Aug 21 01:34:22 cpu-e-1059 kernel: Lustre: 66657:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1566347363/real 1566347363] req@ffff905d57852880 x1642422187213536/t0(0) o3->fs1-OST0110-osc-ffff9043a428f000@10.47.18.23@o2ib1:6/4 lens 488/440 e 2 to 1 dl 1566347457 ref 2 fl Rpc:X/2/ffffffff rc 0/-1 Aug 21 01:34:22 cpu-e-1059 kernel: Lustre: 66657:0:(client.c:2134:ptlrpc_expire_one_request()) Skipped 1 previous similar message Aug 21 01:34:22 cpu-e-1059 kernel: Lustre: 
fs1-OST0110-osc-ffff9043a428f000: Connection to fs1-OST0110 (at 10.47.18.23@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:34:22 cpu-e-1059 kernel: Lustre: Skipped 1 previous similar message Aug 21 01:34:22 cpu-e-1059 kernel: Lustre: fs1-OST0110-osc-ffff9043a428f000: Connection restored to 10.47.18.23@o2ib1 (at 10.47.18.23@o2ib1) Aug 21 01:34:22 cpu-e-1059 kernel: Lustre: Skipped 3 previous similar messages Aug 21 01:36:18 cpu-e-1059 kernel: Lustre: 66634:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1566347362/real 1566347362] req@ffff90455b448900 x1642422187214496/t0(0) o103->fs1-OST0010-osc-ffff9043a428f000@10.47.18.2@o2ib1:17/18 lens 328/224 e 0 to 1 dl 1566347374 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 Aug 21 01:36:18 cpu-e-1059 kernel: Lustre: 66634:0:(client.c:2134:ptlrpc_expire_one_request()) Skipped 2 previous similar messages Aug 21 01:36:18 cpu-e-1059 kernel: Lustre: fs1-OST0010-osc-ffff9043a428f000: Connection to fs1-OST0010 (at 10.47.18.2@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:36:18 cpu-e-1059 kernel: Lustre: Skipped 2 previous similar messages Aug 21 01:36:18 cpu-e-1059 kernel: Lustre: fs1-OST0069-osc-ffff9043a428f000: Connection restored to 10.47.18.9@o2ib1 (at 10.47.18.9@o2ib1) Aug 21 01:36:18 cpu-e-1059 kernel: Lustre: Skipped 2 previous similar messages Aug 21 01:38:04 cpu-e-1059 kernel: Lustre: 66644:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1566347364/real 1566347364] req@ffff90454a2e8900 x1642422187221152/t0(0) o3->fs1-OST0048-osc-ffff9043a428f000@10.47.18.7@o2ib1:6/4 lens 488/440 e 3 to 1 dl 1566347446 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Aug 21 01:38:04 cpu-e-1059 kernel: Lustre: 66644:0:(client.c:2134:ptlrpc_expire_one_request()) Skipped 3 previous similar messages Aug 21 01:38:04 cpu-e-1059 kernel: Lustre: 
fs1-OST0048-osc-ffff9043a428f000: Connection to fs1-OST0048 (at 10.47.18.7@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:38:04 cpu-e-1059 kernel: Lustre: Skipped 3 previous similar messages Aug 21 01:38:04 cpu-e-1059 kernel: Lustre: fs1-OST0048-osc-ffff9043a428f000: Connection restored to 10.47.18.7@o2ib1 (at 10.47.18.7@o2ib1) Aug 21 01:38:04 cpu-e-1059 kernel: Lustre: Skipped 3 previous similar messages Aug 21 01:41:07 cpu-e-1059 kernel: Lustre: 66666:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1566347363/real 1566347363] req@ffff905beab68480 x1642422187214000/t0(0) o103->fs1-OST0071-osc-ffff9043a428f000@10.47.18.10@o2ib1:17/18 lens 328/224 e 0 to 1 dl 1566347375 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 Aug 21 01:41:07 cpu-e-1059 kernel: Lustre: 66666:0:(client.c:2134:ptlrpc_expire_one_request()) Skipped 9 previous similar messages Aug 21 01:41:07 cpu-e-1059 kernel: Lustre: fs1-OST0071-osc-ffff9043a428f000: Connection to fs1-OST0071 (at 10.47.18.10@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:41:07 cpu-e-1059 kernel: Lustre: Skipped 9 previous similar messages Aug 21 01:41:07 cpu-e-1059 kernel: Lustre: fs1-OST0071-osc-ffff9043a428f000: Connection restored to 10.47.18.10@o2ib1 (at 10.47.18.10@o2ib1) Aug 21 01:41:07 cpu-e-1059 kernel: Lustre: Skipped 9 previous similar messages Aug 21 01:55:30 cpu-e-1059 kernel: Adding 15999996k swap on /dev/sda2. Priority:-2 extents:1 across:15999996k SSFS HOSTS ------------------------------------------------------------------------- cpu-e-1055 ------------------------------------------------------------------------------- -- Logs begin at Tue 2019-08-20 19:50:34 BST, end at Wed 2019-08-21 11:52:03 BST. -- Aug 21 01:22:42 cpu-e-1055 kernel: Lustre: Mounted fs1-client Aug 21 01:55:30 cpu-e-1055 kernel: Adding 15999996k swap on /dev/sda2. 
Priority:-2 extents:1 across:15999996k SSFS
HOSTS -------------------------------------------------------------------------
cpu-e-1057
-------------------------------------------------------------------------------
-- Logs begin at Tue 2019-08-20 19:50:20 BST, end at Wed 2019-08-21 11:52:03 BST. --
Aug 21 01:22:42 cpu-e-1057 kernel: Lustre: Mounted fs1-client
Aug 21 01:22:53 cpu-e-1057 kernel: perf: interrupt took too long (3943 > 3921), lowering kernel.perf_event_max_sample_rate to 50000
Aug 21 01:55:30 cpu-e-1057 kernel: Adding 15999996k swap on /dev/sda2. Priority:-2 extents:1 across:15999996k SSFS
HOSTS -------------------------------------------------------------------------
cpu-e-1061
-------------------------------------------------------------------------------
-- Logs begin at Tue 2019-08-20 19:49:58 BST, end at Wed 2019-08-21 11:52:03 BST. --
Aug 21 01:22:42 cpu-e-1061 kernel: Lustre: Mounted fs1-client
Aug 21 01:55:30 cpu-e-1061 kernel: Adding 15999996k swap on /dev/sda2. Priority:-2 extents:1 across:15999996k SSFS
HOSTS -------------------------------------------------------------------------
cpu-e-1060
-------------------------------------------------------------------------------
-- Logs begin at Tue 2019-08-20 19:49:46 BST, end at Wed 2019-08-21 11:52:03 BST.
-- Aug 21 01:22:42 cpu-e-1060 kernel: Lustre: Mounted fs1-client Aug 21 01:29:23 cpu-e-1060 kernel: LustreError: 66211:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff88503ceade00 Aug 21 01:29:23 cpu-e-1060 kernel: Lustre: 66712:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1566347363/real 1566347363] req@ffff8838d6231680 x1642422187205744/t0(0) o400->fs1-OST0058-osc-ffff8838c0412000@10.47.18.8@o2ib1:28/4 lens 224/224 e 0 to 1 dl 1566347370 ref 1 fl Rpc:eXN/0/ffffffff rc 0/-1 Aug 21 01:29:23 cpu-e-1060 kernel: Lustre: fs1-OST0058-osc-ffff8838c0412000: Connection to fs1-OST0058 (at 10.47.18.8@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:29:23 cpu-e-1060 kernel: Lustre: fs1-OST0058-osc-ffff8838c0412000: Connection restored to 10.47.18.8@o2ib1 (at 10.47.18.8@o2ib1) Aug 21 01:29:23 cpu-e-1060 kernel: LustreError: 66211:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff8838673b2600 Aug 21 01:29:23 cpu-e-1060 kernel: LustreError: 66211:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff88503e5c5400 Aug 21 01:29:23 cpu-e-1060 kernel: LustreError: 66211:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff88503e5c5400 Aug 21 01:29:23 cpu-e-1060 kernel: LustreError: 66211:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff88504d60ca00 Aug 21 01:29:24 cpu-e-1060 kernel: Lustre: 66729:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1566347362/real 1566347364] req@ffff8837b4562880 x1642422187204496/t0(0) o3->fs1-OST011c-osc-ffff8838c0412000@10.47.18.24@o2ib1:6/4 lens 488/440 e 0 to 1 dl 1566347374 ref 2 fl Rpc:eX/0/ffffffff rc 0/-1 Aug 21 01:29:24 cpu-e-1060 kernel: Lustre: 66729:0:(client.c:2134:ptlrpc_expire_one_request()) Skipped 13 previous similar messages Aug 21 01:29:24 
cpu-e-1060 kernel: Lustre: fs1-OST011c-osc-ffff8838c0412000: Connection to fs1-OST011c (at 10.47.18.24@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:29:24 cpu-e-1060 kernel: Lustre: Skipped 13 previous similar messages Aug 21 01:29:24 cpu-e-1060 kernel: Lustre: fs1-OST011c-osc-ffff8838c0412000: Connection restored to 10.47.18.24@o2ib1 (at 10.47.18.24@o2ib1) Aug 21 01:29:24 cpu-e-1060 kernel: Lustre: Skipped 13 previous similar messages Aug 21 01:29:28 cpu-e-1060 kernel: Lustre: 77665:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1566347361/real 1566347361] req@ffff8838556f9b00 x1642422187203248/t0(0) o101->fs1-OST010a-osc-ffff8838c0412000@10.47.18.23@o2ib1:28/4 lens 328/400 e 0 to 1 dl 1566347368 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Aug 21 01:29:28 cpu-e-1060 kernel: Lustre: 77665:0:(client.c:2134:ptlrpc_expire_one_request()) Skipped 2 previous similar messages Aug 21 01:29:28 cpu-e-1060 kernel: Lustre: fs1-OST010a-osc-ffff8838c0412000: Connection to fs1-OST010a (at 10.47.18.23@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:29:28 cpu-e-1060 kernel: Lustre: Skipped 2 previous similar messages Aug 21 01:29:28 cpu-e-1060 kernel: Lustre: fs1-OST010a-osc-ffff8838c0412000: Connection restored to 10.47.18.23@o2ib1 (at 10.47.18.23@o2ib1) Aug 21 01:29:28 cpu-e-1060 kernel: Lustre: Skipped 2 previous similar messages Aug 21 01:33:39 cpu-e-1060 kernel: Lustre: 66714:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1566347363/real 1566347363] req@ffff8838c3300d80 x1642422187207776/t0(0) o3->fs1-OST0084-osc-ffff8838c0412000@10.47.18.12@o2ib1:6/4 lens 488/440 e 3 to 1 dl 1566347450 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Aug 21 01:33:39 cpu-e-1060 kernel: Lustre: 66714:0:(client.c:2134:ptlrpc_expire_one_request()) Skipped 1 previous similar message Aug 21 01:33:39 
cpu-e-1060 kernel: Lustre: fs1-OST0084-osc-ffff8838c0412000: Connection to fs1-OST0084 (at 10.47.18.12@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:33:39 cpu-e-1060 kernel: Lustre: Skipped 1 previous similar message Aug 21 01:33:39 cpu-e-1060 kernel: Lustre: fs1-OST0084-osc-ffff8838c0412000: Connection restored to 10.47.18.12@o2ib1 (at 10.47.18.12@o2ib1) Aug 21 01:33:39 cpu-e-1060 kernel: Lustre: Skipped 1 previous similar message Aug 21 01:34:05 cpu-e-1060 kernel: Lustre: 66711:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1566347363/real 1566347363] req@ffff88387931f980 x1642422187206240/t0(0) o400->fs1-OST0102-osc-ffff8838c0412000@10.47.18.22@o2ib1:28/4 lens 224/224 e 0 to 1 dl 1566347370 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 Aug 21 01:34:05 cpu-e-1060 kernel: Lustre: fs1-OST0102-osc-ffff8838c0412000: Connection to fs1-OST0102 (at 10.47.18.22@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:34:05 cpu-e-1060 kernel: Lustre: fs1-OST0102-osc-ffff8838c0412000: Connection restored to 10.47.18.22@o2ib1 (at 10.47.18.22@o2ib1) Aug 21 01:34:17 cpu-e-1060 kernel: Lustre: 66722:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1566347362/real 1566347362] req@ffff8850d9b93180 x1642422187203536/t0(0) o3->fs1-OST006a-osc-ffff8838c0412000@10.47.18.9@o2ib1:6/4 lens 488/440 e 2 to 1 dl 1566347456 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Aug 21 01:34:17 cpu-e-1060 kernel: Lustre: 66722:0:(client.c:2134:ptlrpc_expire_one_request()) Skipped 3 previous similar messages Aug 21 01:34:17 cpu-e-1060 kernel: Lustre: fs1-OST006a-osc-ffff8838c0412000: Connection to fs1-OST006a (at 10.47.18.9@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:34:17 cpu-e-1060 kernel: Lustre: Skipped 3 previous similar messages Aug 21 01:34:17 cpu-e-1060 
kernel: Lustre: fs1-OST006a-osc-ffff8838c0412000: Connection restored to 10.47.18.9@o2ib1 (at 10.47.18.9@o2ib1) Aug 21 01:34:17 cpu-e-1060 kernel: Lustre: Skipped 3 previous similar messages Aug 21 01:36:23 cpu-e-1060 kernel: Lustre: 66709:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1566347361/real 1566347361] req@ffff8837b1a43a80 x1642422187202864/t0(0) o3->fs1-OST00db-osc-ffff8838c0412000@10.47.18.19@o2ib1:6/4 lens 488/440 e 3 to 1 dl 1566347448 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Aug 21 01:36:23 cpu-e-1060 kernel: Lustre: 66709:0:(client.c:2134:ptlrpc_expire_one_request()) Skipped 3 previous similar messages Aug 21 01:36:23 cpu-e-1060 kernel: Lustre: fs1-OST00db-osc-ffff8838c0412000: Connection to fs1-OST00db (at 10.47.18.19@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:36:23 cpu-e-1060 kernel: Lustre: Skipped 3 previous similar messages Aug 21 01:36:23 cpu-e-1060 kernel: Lustre: fs1-OST00db-osc-ffff8838c0412000: Connection restored to 10.47.18.19@o2ib1 (at 10.47.18.19@o2ib1) Aug 21 01:36:23 cpu-e-1060 kernel: Lustre: Skipped 3 previous similar messages Aug 21 01:38:14 cpu-e-1060 kernel: Lustre: 66716:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1566347364/real 1566347364] req@ffff88386224e780 x1642422187209808/t0(0) o3->fs1-OST00b3-osc-ffff8838c0412000@10.47.18.15@o2ib1:6/4 lens 488/440 e 2 to 1 dl 1566347458 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Aug 21 01:38:14 cpu-e-1060 kernel: Lustre: 66716:0:(client.c:2134:ptlrpc_expire_one_request()) Skipped 7 previous similar messages Aug 21 01:38:14 cpu-e-1060 kernel: Lustre: fs1-OST00b3-osc-ffff8838c0412000: Connection to fs1-OST00b3 (at 10.47.18.15@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:38:14 cpu-e-1060 kernel: Lustre: Skipped 7 previous similar messages Aug 21 01:38:14 cpu-e-1060 kernel: Lustre: 
fs1-OST00b3-osc-ffff8838c0412000: Connection restored to 10.47.18.15@o2ib1 (at 10.47.18.15@o2ib1) Aug 21 01:38:14 cpu-e-1060 kernel: Lustre: Skipped 7 previous similar messages Aug 21 01:39:35 cpu-e-1060 kernel: Lustre: 66706:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1566347363/real 1566347363] req@ffff8837b2a6ad00 x1642422187208016/t0(0) o3->fs1-OST011b-osc-ffff8838c0412000@10.47.18.24@o2ib1:6/4 lens 488/440 e 2 to 1 dl 1566347458 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Aug 21 01:39:35 cpu-e-1060 kernel: Lustre: fs1-OST011b-osc-ffff8838c0412000: Connection to fs1-OST011b (at 10.47.18.24@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:39:35 cpu-e-1060 kernel: Lustre: fs1-OST011b-osc-ffff8838c0412000: Connection restored to 10.47.18.24@o2ib1 (at 10.47.18.24@o2ib1) Aug 21 01:42:06 cpu-e-1060 kernel: Lustre: 66703:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1566347362/real 1566347362] req@ffff8837b1a40d80 x1642422187203744/t0(0) o3->fs1-OST0003-osc-ffff8838c0412000@10.47.18.1@o2ib1:6/4 lens 488/440 e 3 to 1 dl 1566347449 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Aug 21 01:42:06 cpu-e-1060 kernel: Lustre: 66703:0:(client.c:2134:ptlrpc_expire_one_request()) Skipped 4 previous similar messages Aug 21 01:42:06 cpu-e-1060 kernel: Lustre: fs1-OST0003-osc-ffff8838c0412000: Connection to fs1-OST0003 (at 10.47.18.1@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:42:06 cpu-e-1060 kernel: Lustre: Skipped 4 previous similar messages Aug 21 01:42:06 cpu-e-1060 kernel: Lustre: fs1-OST0003-osc-ffff8838c0412000: Connection restored to 10.47.18.1@o2ib1 (at 10.47.18.1@o2ib1) Aug 21 01:42:06 cpu-e-1060 kernel: Lustre: Skipped 4 previous similar messages Aug 21 01:43:28 cpu-e-1060 kernel: perf: interrupt took too long (3951 > 3943), lowering kernel.perf_event_max_sample_rate to 
50000
Aug 21 01:55:30 cpu-e-1060 kernel: Adding 15999996k swap on /dev/sda2. Priority:-2 extents:1 across:15999996k SSFS
HOSTS -------------------------------------------------------------------------
cpu-e-1058
-------------------------------------------------------------------------------
-- Logs begin at Tue 2019-08-20 19:50:13 BST, end at Wed 2019-08-21 11:52:03 BST. --
Aug 21 01:22:42 cpu-e-1058 kernel: Lustre: Mounted fs1-client
Aug 21 01:55:30 cpu-e-1058 kernel: Adding 15999996k swap on /dev/sda2. Priority:-2 extents:1 across:15999996k SSFS
HOSTS -------------------------------------------------------------------------
cpu-e-837
-------------------------------------------------------------------------------
-- Logs begin at Tue 2019-08-20 19:49:07 BST, end at Wed 2019-08-21 11:52:03 BST. --
Aug 21 01:22:42 cpu-e-837 kernel: Lustre: Mounted fs1-client
Aug 21 01:29:23 cpu-e-837 kernel: Lustre: 67314:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1566347363/real 1566347363] req@ffff93ff3a698d80 x1642422187204944/t0(0) o400->fs1-OST00c8-osc-ffff94183cf09800@10.47.18.17@o2ib1:28/4 lens 224/224 e 0 to 1 dl 1566347370 ref 1 fl Rpc:eXN/0/ffffffff rc 0/-1
Aug 21 01:29:23 cpu-e-837 kernel: Lustre: fs1-OST00c8-osc-ffff94183cf09800: Connection to fs1-OST00c8 (at 10.47.18.17@o2ib1) was lost; in progress operations using this service will wait for recovery to complete
Aug 21 01:29:23 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff93fffc7fa800
Aug 21 01:29:23 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417d4d01600
Aug 21 01:29:23 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417d4d01600
Aug 21 01:29:23 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc
ffff9417d4d01600 Aug 21 01:29:23 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417d4d01600 Aug 21 01:29:23 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9416e3b8bc00 Aug 21 01:29:23 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417cbcc8600 Aug 21 01:29:23 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417d752a000 Aug 21 01:29:23 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9416e3b8bc00 Aug 21 01:29:23 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9416e3bc8e00 Aug 21 01:29:23 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff940040fb7c00 Aug 21 01:29:23 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff940040fb7c00 Aug 21 01:29:23 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff93ff6f3f8e00 Aug 21 01:29:23 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff93ffe6bd5800 Aug 21 01:29:23 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff93fff4322400 Aug 21 01:29:23 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff93fff4322400 Aug 21 01:29:23 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417d7529800 Aug 21 01:29:23 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc 
ffff9417d7529800 Aug 21 01:29:24 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9416e3ba8c00 Aug 21 01:29:24 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417d7529800 Aug 21 01:29:24 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417d7529800 Aug 21 01:29:24 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417d7529800 Aug 21 01:29:24 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9416e3b8bc00 Aug 21 01:29:24 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9416e3bc8e00 Aug 21 01:29:24 cpu-e-837 kernel: Lustre: 67318:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1566347363/real 1566347364] req@ffff93ff3a69e780 x1642422187205264/t0(0) o400->fs1-OST0113-osc-ffff94183cf09800@10.47.18.23@o2ib1:28/4 lens 224/224 e 0 to 1 dl 1566347370 ref 1 fl Rpc:eXN/0/ffffffff rc 0/-1 Aug 21 01:29:24 cpu-e-837 kernel: Lustre: 67318:0:(client.c:2134:ptlrpc_expire_one_request()) Skipped 15 previous similar messages Aug 21 01:29:24 cpu-e-837 kernel: Lustre: fs1-OST0113-osc-ffff94183cf09800: Connection to fs1-OST0113 (at 10.47.18.23@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:29:24 cpu-e-837 kernel: Lustre: Skipped 15 previous similar messages Aug 21 01:29:24 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417cbd1ca00 Aug 21 01:29:24 cpu-e-837 kernel: Lustre: fs1-OST005d-osc-ffff94183cf09800: Connection restored to 10.47.18.8@o2ib1 (at 10.47.18.8@o2ib1) Aug 21 01:29:24 cpu-e-837 kernel: LustreError: 
66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417cbd1ca00 Aug 21 01:29:24 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417cbd1ca00 Aug 21 01:29:24 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417cbd1ca00 Aug 21 01:29:24 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417cbd1ca00 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417d4d01600 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff93fed02b6c00 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66818:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc ffff9417d4cece00 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66817:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc ffff9417d4cece00 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66823:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc ffff9417cbd28c00 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66821:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc ffff9417cbd28c00 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66822:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc ffff9417cbd28c00 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 166-1: MGC10.47.18.1@o2ib1: Connection to MGS (at 10.47.18.1@o2ib1) was lost; in progress operations using this service will fail Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66823:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc ffff9417cbd28c00 Aug 21 01:29:25 cpu-e-837 kernel: Lustre: fs1-OST00c5-osc-ffff94183cf09800: Connection restored to 10.47.18.17@o2ib1 (at 10.47.18.17@o2ib1) Aug 21 01:29:25 cpu-e-837 kernel: 
Lustre: Skipped 11 previous similar messages Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66816:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc ffff9417d4d01600 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66818:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc ffff9417d4d01600 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66819:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc ffff9417d4d01600 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66817:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc ffff9417d4d01600 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66816:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc ffff9417d4d01600 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66816:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc ffff9417d4d01600 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66819:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc ffff9417d4d01600 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66817:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc ffff9417d4d01600 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66820:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc ffff93ff6f3fe800 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66822:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc ffff93ff6f3fe800 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66821:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc ffff93ff6f3fe800 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66820:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc ffff93ff6f3fe800 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66820:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc ffff9417cbd28c00 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66821:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc 
ffff9417cbd28c00 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66822:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc ffff9417cbd28c00 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66823:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc ffff9417cbd28c00 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66821:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc ffff93ff6f3fe800 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66822:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc ffff93ff6f3fe800 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66823:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc ffff93ff6f3fe800 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66820:0:(events.c:200:client_bulk_callback()) event type 2, status -5, desc ffff93ff6f3fe800 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417cd8b4000 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417cd8b4000 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff93fff9b89600 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff93ff6bf31e00 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff93ff6bf31e00 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff93fff4322400 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417cd8b4000 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417cd8b4000 Aug 21 
01:29:25 cpu-e-837 kernel: Lustre: 67319:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1566347361/real 1566347361] req@ffff94005b9bd100 x1642422187202560/t0(0) o3->fs1-OST00c7-osc-ffff94183cf09800@10.47.18.17@o2ib1:6/4 lens 488/440 e 0 to 1 dl 1566347405 ref 2 fl Rpc:eX/0/ffffffff rc 0/-1 Aug 21 01:29:25 cpu-e-837 kernel: Lustre: 67319:0:(client.c:2134:ptlrpc_expire_one_request()) Skipped 102 previous similar messages Aug 21 01:29:25 cpu-e-837 kernel: Lustre: fs1-OST00c7-osc-ffff94183cf09800: Connection to fs1-OST00c7 (at 10.47.18.17@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:29:25 cpu-e-837 kernel: Lustre: Skipped 92 previous similar messages Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff93ff6bf31e00 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff93fed02b6c00 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417d752a000 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff93ffe6bd5800 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff93fff9b89600 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417cd8b4000 Aug 21 01:29:25 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417cd8b4000 Aug 21 01:29:26 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff93ffe6bd5800 Aug 21 01:29:26 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) 
event type 2, status -103, desc ffff9417d752a000 Aug 21 01:29:26 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417cd8b4000 Aug 21 01:29:26 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417cd8b4000 Aug 21 01:29:26 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9416e3b8bc00 Aug 21 01:29:26 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff940040fb7c00 Aug 21 01:29:26 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff940040fb7c00 Aug 21 01:29:26 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417d752a000 Aug 21 01:29:26 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9416e3ba8c00 Aug 21 01:29:26 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417d7529800 Aug 21 01:29:26 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417d7529800 Aug 21 01:29:26 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff93ff76e59400 Aug 21 01:29:26 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff93ff76e59400 Aug 21 01:29:26 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff93ff76e59400 Aug 21 01:29:26 cpu-e-837 kernel: Lustre: fs1-OST00e3-osc-ffff94183cf09800: Connection restored to 10.47.18.19@o2ib1 (at 10.47.18.19@o2ib1) Aug 21 01:29:26 cpu-e-837 kernel: Lustre: Skipped 41 previous similar messages Aug 21 01:29:26 cpu-e-837 
kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417cbd1ca00 Aug 21 01:29:26 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417cbd1ca00 Aug 21 01:29:26 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417d7529800 Aug 21 01:29:26 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417d7529800 Aug 21 01:29:26 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff93ff6bf31e00 Aug 21 01:29:26 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff93fff9b89600 Aug 21 01:29:27 cpu-e-837 kernel: LNetError: 66815:0:(o2iblnd_cb.c:3335:kiblnd_check_txs_locked()) Timed out tx: active_txs, 0 seconds Aug 21 01:29:27 cpu-e-837 kernel: LNetError: 66815:0:(o2iblnd_cb.c:3410:kiblnd_check_conns()) Timed out RDMA with 10.47.18.5@o2ib1 (2): c: 14, oc: 0, rc: 16 Aug 21 01:29:27 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff93fff4322400 Aug 21 01:29:27 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff93fff4322400 Aug 21 01:29:27 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9416e3ba8c00 Aug 21 01:29:27 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9416e3ba8c00 Aug 21 01:29:27 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417cbd1ca00 Aug 21 01:29:27 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417cbd1ca00 Aug 21 01:29:27 cpu-e-837 
kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417cbd1ca00 Aug 21 01:29:27 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff9417cbd1ca00 Aug 21 01:29:27 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff93ff6bf31e00 Aug 21 01:29:27 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff93ff6bf31e00 Aug 21 01:29:27 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff940040fb7c00 Aug 21 01:29:27 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff940040fb7c00 Aug 21 01:29:28 cpu-e-837 kernel: LNetError: 66815:0:(o2iblnd_cb.c:3335:kiblnd_check_txs_locked()) Timed out tx: active_txs, 1 seconds Aug 21 01:29:28 cpu-e-837 kernel: LNetError: 66815:0:(o2iblnd_cb.c:3335:kiblnd_check_txs_locked()) Skipped 2 previous similar messages Aug 21 01:29:28 cpu-e-837 kernel: LNetError: 66815:0:(o2iblnd_cb.c:3410:kiblnd_check_conns()) Timed out RDMA with 10.47.18.34@o2ib1 (4): c: 13, oc: 0, rc: 16 Aug 21 01:29:28 cpu-e-837 kernel: LNetError: 66815:0:(o2iblnd_cb.c:3410:kiblnd_check_conns()) Skipped 2 previous similar messages Aug 21 01:29:28 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff940040fb7c00 Aug 21 01:29:28 cpu-e-837 kernel: LustreError: 66815:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc ffff940040fb7c00 Aug 21 01:29:28 cpu-e-837 kernel: Lustre: 67316:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1566347361/real 1566347361] req@ffff9418589dc800 x1642422187202496/t0(0) o3->fs1-OST0074-osc-ffff94183cf09800@10.47.18.10@o2ib1:6/4 lens 488/440 e 0 to 1 dl 1566347405 ref 2 fl 
Rpc:eXS/0/ffffffff rc -11/-1 Aug 21 01:29:28 cpu-e-837 kernel: Lustre: 67316:0:(client.c:2134:ptlrpc_expire_one_request()) Skipped 9 previous similar messages Aug 21 01:29:28 cpu-e-837 kernel: Lustre: fs1-OST006f-osc-ffff94183cf09800: Connection to fs1-OST006f (at 10.47.18.10@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:29:28 cpu-e-837 kernel: Lustre: Skipped 4 previous similar messages Aug 21 01:29:28 cpu-e-837 kernel: Lustre: fs1-OST006f-osc-ffff94183cf09800: Connection restored to 10.47.18.10@o2ib1 (at 10.47.18.10@o2ib1) Aug 21 01:29:28 cpu-e-837 kernel: Lustre: Skipped 1 previous similar message Aug 21 01:29:32 cpu-e-837 kernel: Lustre: 78382:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1566347365/real 1566347365] req@ffff941855251b00 x1642422187206768/t0(0) o101->fs1-OST00f4-osc-ffff94183cf09800@10.47.18.21@o2ib1:28/4 lens 328/400 e 0 to 1 dl 1566347372 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Aug 21 01:29:32 cpu-e-837 kernel: Lustre: 78382:0:(client.c:2134:ptlrpc_expire_one_request()) Skipped 4 previous similar messages Aug 21 01:29:33 cpu-e-837 kernel: Lustre: fs1-OST0000-osc-ffff94183cf09800: Connection to fs1-OST0000 (at 10.47.18.1@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:29:33 cpu-e-837 kernel: Lustre: Skipped 4 previous similar messages Aug 21 01:29:33 cpu-e-837 kernel: Lustre: fs1-OST0000-osc-ffff94183cf09800: Connection restored to 10.47.18.1@o2ib1 (at 10.47.18.1@o2ib1) Aug 21 01:29:33 cpu-e-837 kernel: Lustre: Skipped 4 previous similar messages Aug 21 01:30:02 cpu-e-837 kernel: Lustre: fs1-MDT000d-mdc-ffff94183cf09800: Connection restored to 10.47.18.14@o2ib1 (at 10.47.18.14@o2ib1) Aug 21 01:30:02 cpu-e-837 kernel: Lustre: Skipped 3 previous similar messages Aug 21 01:30:09 cpu-e-837 kernel: Lustre: 78364:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has timed out for 
slow reply: [sent 1566347365/real 1566347365] req@ffff941853681680 x1642422187208064/t0(0) o101->fs1-OST00cd-osc-ffff94183cf09800@10.47.18.18@o2ib1:28/4 lens 328/400 e 0 to 1 dl 1566347409 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Aug 21 01:30:09 cpu-e-837 kernel: Lustre: 78364:0:(client.c:2134:ptlrpc_expire_one_request()) Skipped 4 previous similar messages Aug 21 01:30:09 cpu-e-837 kernel: Lustre: fs1-OST00cd-osc-ffff94183cf09800: Connection to fs1-OST00cd (at 10.47.18.18@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:30:09 cpu-e-837 kernel: Lustre: Skipped 3 previous similar messages Aug 21 01:35:21 cpu-e-837 kernel: Lustre: 67307:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1566347365/real 1566347365] req@ffff93ff1834da00 x1642422187208464/t0(0) o3->fs1-OST00db-osc-ffff94183cf09800@10.47.18.19@o2ib1:6/4 lens 488/440 e 2 to 1 dl 1566347459 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 Aug 21 01:35:21 cpu-e-837 kernel: Lustre: fs1-OST00db-osc-ffff94183cf09800: Connection to fs1-OST00db (at 10.47.18.19@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:35:21 cpu-e-837 kernel: Lustre: fs1-OST0073-osc-ffff94183cf09800: Connection restored to 10.47.18.10@o2ib1 (at 10.47.18.10@o2ib1) Aug 21 01:35:21 cpu-e-837 kernel: Lustre: Skipped 60 previous similar messages Aug 21 01:36:42 cpu-e-837 kernel: Lustre: 67308:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1566347363/real 1566347364] req@ffff93ff3a66f080 x1642422187204848/t0(0) o400->fs1-OST00b7-osc-ffff94183cf09800@10.47.18.16@o2ib1:28/4 lens 224/224 e 0 to 1 dl 1566347370 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 Aug 21 01:36:42 cpu-e-837 kernel: Lustre: 67308:0:(client.c:2134:ptlrpc_expire_one_request()) Skipped 13 previous similar messages Aug 21 01:36:42 cpu-e-837 kernel: Lustre: fs1-OST00b7-osc-ffff94183cf09800: Connection to 
fs1-OST00b7 (at 10.47.18.16@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:36:42 cpu-e-837 kernel: Lustre: Skipped 13 previous similar messages Aug 21 01:36:42 cpu-e-837 kernel: Lustre: fs1-OST00b7-osc-ffff94183cf09800: Connection restored to 10.47.18.16@o2ib1 (at 10.47.18.16@o2ib1) Aug 21 01:36:42 cpu-e-837 kernel: Lustre: Skipped 13 previous similar messages Aug 21 01:38:44 cpu-e-837 kernel: perf: interrupt took too long (3149 > 3135), lowering kernel.perf_event_max_sample_rate to 63000 Aug 21 01:39:04 cpu-e-837 kernel: Lustre: 67317:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1566347363/real 1566347363] req@ffff94003e77bf00 x1642422187204224/t0(0) o400->fs1-OST0067-osc-ffff94183cf09800@10.47.18.9@o2ib1:28/4 lens 224/224 e 0 to 1 dl 1566347370 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 Aug 21 01:39:04 cpu-e-837 kernel: Lustre: 67317:0:(client.c:2134:ptlrpc_expire_one_request()) Skipped 4 previous similar messages Aug 21 01:39:04 cpu-e-837 kernel: Lustre: fs1-OST0067-osc-ffff94183cf09800: Connection to fs1-OST0067 (at 10.47.18.9@o2ib1) was lost; in progress operations using this service will wait for recovery to complete Aug 21 01:39:04 cpu-e-837 kernel: Lustre: Skipped 4 previous similar messages Aug 21 01:39:04 cpu-e-837 kernel: Lustre: fs1-MDT0006-mdc-ffff94183cf09800: Connection restored to 10.47.18.7@o2ib1 (at 10.47.18.7@o2ib1) Aug 21 01:39:04 cpu-e-837 kernel: Lustre: Skipped 4 previous similar messages Aug 21 01:54:56 cpu-e-837 kernel: Adding 15999996k swap on /dev/sda2. Priority:-2 extents:1 across:15999996k SSFS HOSTS ------------------------------------------------------------------------- cpu-e-836 ------------------------------------------------------------------------------- -- Logs begin at Tue 2019-08-20 19:48:44 BST, end at Wed 2019-08-21 11:52:03 BST. 
-- Aug 21 01:22:42 cpu-e-836 kernel: Lustre: Mounted fs1-client Aug 21 01:54:56 cpu-e-836 kernel: Adding 15999996k swap on /dev/sda2. Priority:-2 extents:1 across:15999996k SSFS