[ 2212.173927] Lustre: DEBUG MARKER: ==== Checking the clients loads BEFORE failover -- failure NOT OK
[ 2212.998458] Lustre: DEBUG MARKER: /usr/sbin/lctl mark Done checking client loads. Failing type1=clients item1=onyx-34vm5,onyx-34vm6 ...
[ 2213.336608] Lustre: DEBUG MARKER: Done checking client loads. Failing type1=clients item1=onyx-34vm5,onyx-34vm6 ...
[ 2253.919179] LNet: Service thread pid 4319 was inactive for 40.03s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[ 2253.926365] Pid: 4319, comm: ll_ost_io00_067
[ 2253.928076]
Call Trace:
[ 2253.931101] [<ffffffff8160a409>] schedule+0x29/0x70
[ 2253.933951] [<ffffffff816082b5>] schedule_timeout+0x175/0x2d0
[ 2253.936095] [<ffffffffa081e3aa>] ? ptlrpc_start_bulk_transfer+0x16a/0x710 [ptlrpc]
[ 2253.938049] [<ffffffff8107ee80>] ? process_timeout+0x0/0x10
[ 2253.940092] [<ffffffffa07e2cae>] target_bulk_io+0x4de/0xb00 [ptlrpc]
[ 2253.941956] [<ffffffff810a9650>] ? default_wake_function+0x0/0x20
[ 2253.944070] [<ffffffffa088f941>] tgt_brw_write+0x10b1/0x1650 [ptlrpc]
[ 2253.945900] [<ffffffff812dfbab>] ? string.isra.6+0x3b/0xf0
[ 2253.947956] [<ffffffffa07e01f0>] ? target_bulk_timeout+0x0/0xb0 [ptlrpc]
[ 2253.949879] [<ffffffffa088b29b>] tgt_request_handle+0x88b/0x1100 [ptlrpc]
[ 2253.952132] [<ffffffffa0832fbb>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[ 2253.954241] [<ffffffffa0830078>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[ 2253.956362] [<ffffffffa0836900>] ptlrpc_main+0xc00/0x1f60 [ptlrpc]
[ 2253.958180] [<ffffffff810ad8b6>] ? __dequeue_entity+0x26/0x40
[ 2253.960260] [<ffffffffa0835d00>] ? ptlrpc_main+0x0/0x1f60 [ptlrpc]
[ 2253.962104] [<ffffffff8109739f>] kthread+0xcf/0xe0
[ 2253.965627] [<ffffffff810972d0>] ? kthread+0x0/0xe0
[ 2253.967444] [<ffffffff81614f7c>] ret_from_fork+0x7c/0xb0
[ 2253.969266] [<ffffffff810972d0>] ? kthread+0x0/0xe0
[ 2253.972575] LustreError: dumping log to /tmp/lustre-log.1436386922.4319
[ 2254.275273] Pid: 4320, comm: ll_ost_io00_068
[ 2254.279470]
Call Trace:
[ 2254.285875] [<ffffffff8160a409>] schedule+0x29/0x70
[ 2254.287722] [<ffffffff816082b5>] schedule_timeout+0x175/0x2d0
[ 2254.289620] [<ffffffffa081e3aa>] ? ptlrpc_start_bulk_transfer+0x16a/0x710 [ptlrpc]
[ 2254.291530] [<ffffffff8107ee80>] ? process_timeout+0x0/0x10
[ 2254.293318] [<ffffffffa07e2cae>] target_bulk_io+0x4de/0xb00 [ptlrpc]
[ 2254.295041] [<ffffffff810a9650>] ? default_wake_function+0x0/0x20
[ 2254.296794] [<ffffffffa088f941>] tgt_brw_write+0x10b1/0x1650 [ptlrpc]
[ 2254.298482] [<ffffffff812dfbab>] ? string.isra.6+0x3b/0xf0
[ 2254.300158] [<ffffffffa07e01f0>] ? target_bulk_timeout+0x0/0xb0 [ptlrpc]
[ 2254.301808] [<ffffffffa088b29b>] tgt_request_handle+0x88b/0x1100 [ptlrpc]
[ 2254.303486] [<ffffffffa0832fbb>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[ 2254.305306] [<ffffffffa0830078>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[ 2254.306949] [<ffffffffa0836900>] ptlrpc_main+0xc00/0x1f60 [ptlrpc]
[ 2254.308571] [<ffffffff810ad8b6>] ? __dequeue_entity+0x26/0x40
[ 2254.310115] [<ffffffff810125f6>] ? __switch_to+0x136/0x4a0
[ 2254.311663] [<ffffffffa0835d00>] ? ptlrpc_main+0x0/0x1f60 [ptlrpc]
[ 2254.313179] [<ffffffff8109739f>] kthread+0xcf/0xe0
[ 2254.314588] [<ffffffff810972d0>] ? kthread+0x0/0xe0
[ 2254.315973] [<ffffffff81614f7c>] ret_from_fork+0x7c/0xb0
[ 2254.317357] [<ffffffff810972d0>] ? kthread+0x0/0xe0
[ 2254.319694] Pid: 4330, comm: ll_ost_io00_071
[ 2254.320938]
Call Trace:
[ 2254.323020] [<ffffffff8160a409>] schedule+0x29/0x70
[ 2254.324327] [<ffffffff816082b5>] schedule_timeout+0x175/0x2d0
[ 2254.325717] [<ffffffffa081e3aa>] ? ptlrpc_start_bulk_transfer+0x16a/0x710 [ptlrpc]
[ 2254.327192] [<ffffffff8107ee80>] ? process_timeout+0x0/0x10
[ 2254.328564] [<ffffffffa07e2cae>] target_bulk_io+0x4de/0xb00 [ptlrpc]
[ 2254.329942] [<ffffffff810a9650>] ? default_wake_function+0x0/0x20
[ 2254.331377] [<ffffffffa088f941>] tgt_brw_write+0x10b1/0x1650 [ptlrpc]
[ 2254.332768] [<ffffffff812dfbab>] ? string.isra.6+0x3b/0xf0
[ 2254.334173] [<ffffffffa07e01f0>] ? target_bulk_timeout+0x0/0xb0 [ptlrpc]
[ 2254.335613] [<ffffffffa088b29b>] tgt_request_handle+0x88b/0x1100 [ptlrpc]
[ 2254.337071] [<ffffffffa0832fbb>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[ 2254.338680] [<ffffffffa0830078>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[ 2254.340149] [<ffffffffa0836900>] ptlrpc_main+0xc00/0x1f60 [ptlrpc]
[ 2254.341516] [<ffffffff810ad8b6>] ? __dequeue_entity+0x26/0x40
[ 2254.342882] [<ffffffff810125f6>] ? __switch_to+0x136/0x4a0
[ 2254.344241] [<ffffffffa0835d00>] ? ptlrpc_main+0x0/0x1f60 [ptlrpc]
[ 2254.345619] [<ffffffff8109739f>] kthread+0xcf/0xe0
[ 2254.346921] [<ffffffff810972d0>] ? kthread+0x0/0xe0
[ 2254.348223] [<ffffffff81614f7c>] ret_from_fork+0x7c/0xb0
[ 2254.349523] [<ffffffff810972d0>] ? kthread+0x0/0xe0
[ 2254.351850] Pid: 4257, comm: ll_ost_io00_039
[ 2254.353101]
Call Trace:
[ 2254.355139] [<ffffffff8160a409>] schedule+0x29/0x70
[ 2254.356389] [<ffffffff816082b5>] schedule_timeout+0x175/0x2d0
[ 2254.357738] [<ffffffffa081e3aa>] ? ptlrpc_start_bulk_transfer+0x16a/0x710 [ptlrpc]
[ 2254.359179] [<ffffffff8107ee80>] ? process_timeout+0x0/0x10
[ 2254.360681] [<ffffffffa07e2cae>] target_bulk_io+0x4de/0xb00 [ptlrpc]
[ 2254.362039] [<ffffffff810a9650>] ? default_wake_function+0x0/0x20
[ 2254.363468] [<ffffffffa088f941>] tgt_brw_write+0x10b1/0x1650 [ptlrpc]
[ 2254.364833] [<ffffffff812dfbab>] ? string.isra.6+0x3b/0xf0
[ 2254.366180] [<ffffffffa07e01f0>] ? target_bulk_timeout+0x0/0xb0 [ptlrpc]
[ 2254.367586] [<ffffffffa088b29b>] tgt_request_handle+0x88b/0x1100 [ptlrpc]
[ 2254.369024] [<ffffffffa0832fbb>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[ 2254.370488] [<ffffffffa0830078>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[ 2254.371916] [<ffffffffa0836900>] ptlrpc_main+0xc00/0x1f60 [ptlrpc]
[ 2254.373266] [<ffffffff810ad8b6>] ? __dequeue_entity+0x26/0x40
[ 2254.374639] [<ffffffffa0835d00>] ? ptlrpc_main+0x0/0x1f60 [ptlrpc]
[ 2254.375995] [<ffffffff8109739f>] kthread+0xcf/0xe0
[ 2254.377272] [<ffffffff810972d0>] ? kthread+0x0/0xe0
[ 2254.378521] [<ffffffff81614f7c>] ret_from_fork+0x7c/0xb0
[ 2254.379843] [<ffffffff810972d0>] ? kthread+0x0/0xe0
[ 2254.382140] LNet: Service thread pid 2840 was inactive for 40.17s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[ 2255.071164] LNet: Service thread pid 4267 was inactive for 40.03s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[ 2255.076671] LNet: Skipped 7 previous similar messages
[ 2255.078304] LustreError: dumping log to /tmp/lustre-log.1436386923.4267
[ 2264.610805] Lustre: lustre-OST0001: haven't heard from client 2d4be017-9a1c-7408-76c8-bc0239710d98 (at 10.2.4.133@tcp) in 49 seconds. I think it's dead, and I am evicting it. exp ffff88004084b000, cur 1436386933 expire 1436386903 last 1436386884
[ 2264.617306] Lustre: Skipped 6 previous similar messages
[ 2266.890151] LustreError: 4269:0:(ldlm_lib.c:3077:target_bulk_io()) @@@ Eviction on bulk WRITE req@ffff880051723000 x1506160196432764/t0(0) o4->2d4be017-9a1c-7408-76c8-bc0239710d98@10.2.4.133@tcp:199/0 lens 608/448 e 2 to 0 dl 1436386944 ref 1 fl Interpret:/0/0 rc 0/0
[ 2266.905189] Lustre: lustre-OST0003: Bulk IO write error with 2d4be017-9a1c-7408-76c8-bc0239710d98 (at 10.2.4.133@tcp), client will retry: rc -107