[ 59.983080] IPv6: ADDRCONF(NETDEV_CHANGE): ib0: link becomes ready
[ 577.564976] LNet: HW NUMA nodes: 2, HW CPU cores: 32, npartitions: 2
[ 577.588287] alg: No test for adler32 (adler32-zlib)
[ 577.604774] alg: No test for crc32 (crc32-table)
[ 578.440401] Lustre: Lustre: Build Version: 2.10.55
[ 578.626899] LNet: Added LNI 192.168.1.107@o2ib [8/256/0/180]
[ 578.644034] LNetError: 126:0:(o2iblnd_cb.c:2305:kiblnd_passive_connect()) Can't accept conn from 192.168.1.119@o2ib on NA (ib0:1:192.168.1.107): bad dst nid 192.168.1.107@o2ib
[ 579.250293] LustreError: 137-5: soaked-OST000b_UUID: not available for connect from 192.168.1.128@o2ib (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 579.404451] Lustre: soaked-OST0005: Not available for connect from 192.168.1.118@o2ib (not set up)
[ 579.468193] Lustre: soaked-OST0005: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900
[ 579.509366] LustreError: 11-0: soaked-MDT0001-lwp-OST0005: operation mds_connect to node 192.168.1.109@o2ib failed: rc = -114
[ 579.517897] Lustre: soaked-OST0005: Will be in recovery for at least 2:30, or until 32 clients reconnect
[ 579.518181] Lustre: soaked-OST0005: Connection restored to (at 192.168.1.126@o2ib)
[ 580.759587] LustreError: 137-5: soaked-OST0017_UUID: not available for connect from 192.168.1.133@o2ib (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 580.818149] LustreError: Skipped 53 previous similar messages
[ 580.844275] Lustre: soaked-OST0005: Connection restored to (at 192.168.1.121@o2ib)
[ 580.869576] Lustre: Skipped 4 previous similar messages
[ 581.910615] LustreError: 137-5: soaked-OST0011_UUID: not available for connect from 192.168.1.120@o2ib (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 581.911026] Lustre: soaked-OST0005: Connection restored to (at 192.168.1.120@o2ib)
[ 581.911029] Lustre: Skipped 1 previous similar message
[ 582.011699] LustreError: Skipped 6 previous similar messages
[ 584.510520] Lustre: soaked-OST0005: Connection restored to soaked-MDT0001-mdtlov_UUID (at 192.168.1.109@o2ib)
[ 584.543241] Lustre: Skipped 1 previous similar message
[ 584.613345] LNet: 7784:0:(o2iblnd_cb.c:3198:kiblnd_check_conns()) Timed out tx for 192.168.1.114@o2ib: 4295251 seconds
[ 585.156427] LustreError: 137-5: soaked-OST0017_UUID: not available for connect from 192.168.1.136@o2ib (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 585.214952] LustreError: Skipped 4 previous similar messages
[ 585.613324] LNet: 7784:0:(o2iblnd_cb.c:3198:kiblnd_check_conns()) Timed out tx for 192.168.1.115@o2ib: 4295252 seconds
[ 588.629665] Lustre: soaked-OST0005: Connection restored to (at 192.168.1.123@o2ib)
[ 588.654988] Lustre: Skipped 15 previous similar messages
[ 588.718309] Lustre: soaked-OST000b: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900
[ 588.763453] LustreError: 11-0: soaked-MDT0001-lwp-OST000b: operation mds_connect to node 192.168.1.109@o2ib failed: rc = -114
[ 588.800753] LustreError: Skipped 3 previous similar messages
[ 589.290380] Lustre: soaked-OST000b: Will be in recovery for at least 2:30, or until 32 clients reconnect
[ 590.882947] LustreError: 137-5: soaked-OST0017_UUID: not available for connect from 192.168.1.109@o2ib (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 596.295018] Lustre: soaked-OST0011: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900
[ 596.343902] LustreError: 11-0: soaked-MDT0001-lwp-OST0011: operation mds_connect to node 192.168.1.109@o2ib failed: rc = -114
[ 596.349189] Lustre: soaked-OST0011: Will be in recovery for at least 2:30, or until 32 clients reconnect
[ 596.412527] LustreError: Skipped 3 previous similar messages
[ 596.989322] Lustre: soaked-OST0011: Connection restored to soaked-MDT0000-mdtlov_UUID (at 192.168.1.108@o2ib)
[ 597.025080] Lustre: Skipped 27 previous similar messages
[ 601.257640] LustreError: 137-5: soaked-OST0017_UUID: not available for connect from 192.168.1.110@o2ib (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 601.322237] LustreError: Skipped 1 previous similar message
[ 604.746939] LustreError: 11-0: soaked-MDT0001-lwp-OST0017: operation mds_connect to node 192.168.1.109@o2ib failed: rc = -114
[ 604.751172] Lustre: soaked-OST0017: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900
[ 604.755435] Lustre: soaked-OST0017: Will be in recovery for at least 2:30, or until 32 clients reconnect
[ 604.859781] LustreError: Skipped 3 previous similar messages
[ 613.202223] Lustre: soaked-OST0017: Connection restored to (at 192.168.1.120@o2ib)
[ 613.230721] Lustre: Skipped 60 previous similar messages
[ 625.052799] Lustre: soaked-OST0011: Recovery over after 0:28, of 32 clients 32 recovered and 0 were evicted.
[ 625.178799] Lustre: soaked-OST0011: deleting orphan objects from 0x740000402:1954773 to 0x740000402:1956385
[ 625.186027] Lustre: soaked-OST0011: deleting orphan objects from 0x740000401:2946444 to 0x740000401:2946577
[ 625.247687] Lustre: soaked-OST0011: deleting orphan objects from 0x740000400:1324860 to 0x740000400:1326312
[ 629.859616] LustreError: 11-0: soaked-MDT0003-lwp-OST0005: operation mds_connect to node 192.168.1.111@o2ib failed: rc = -114
[ 629.900030] LustreError: Skipped 3 previous similar messages
[ 632.686916] Lustre: soaked-OST0017: Recovery over after 0:28, of 32 clients 32 recovered and 0 were evicted.
[ 632.812885] Lustre: soaked-OST0017: deleting orphan objects from 0x8c0000402:1330224 to 0x8c0000402:1330433
[ 632.845820] Lustre: soaked-OST0017: deleting orphan objects from 0x8c0000401:1961276 to 0x8c0000401:1962209
[ 632.892404] Lustre: soaked-OST0017: deleting orphan objects from 0x8c0000400:2942161 to 0x8c0000400:2942193
[ 633.363106] LustreError: 7850:0:(lustre_dlm.h:1372:ldlm_res_lvbo_update()) delayed lvb init failed (rc -2)
[ 633.370215] LustreError: 29224:0:(ofd_io.c:616:ofd_preprw_write()) soaked-OST0005: BRW to missing obj 0x440000402:1949013
[ 633.434168] LustreError: 7850:0:(lustre_dlm.h:1372:ldlm_res_lvbo_update()) Skipped 1 previous similar message
[ 633.468475] format at lustre_dlm.h:1098:ldlm_lvbo_fill doesn't end in newline
[ 633.493466] LustreError: 7850:0:(lustre_dlm.h:1098:ldlm_lvbo_fill()) lock ffff880797e70800: delayed lvb init failed (rc -2)
[ 635.387737] Lustre: soaked-OST000b: Recovery over after 0:46, of 32 clients 32 recovered and 0 were evicted.
[ 635.423428] Lustre: Skipped 1 previous similar message
[ 635.794185] LustreError: 7850:0:(lustre_dlm.h:1372:ldlm_res_lvbo_update()) delayed lvb init failed (rc -2)
[ 635.830221] LustreError: 7850:0:(lustre_dlm.h:1372:ldlm_res_lvbo_update()) Skipped 10 previous similar messages
[ 654.748696] LustreError: 11-0: soaked-MDT0000-lwp-OST0011: operation mds_connect to node 192.168.1.108@o2ib failed: rc = -114
[ 654.789007] LustreError: Skipped 1 previous similar message
[ 679.789097] LustreError: 11-0: soaked-MDT0003-lwp-OST0005: operation mds_connect to node 192.168.1.111@o2ib failed: rc = -114
[ 679.829488] LustreError: Skipped 5 previous similar messages
[ 729.808913] LustreError: 11-0: soaked-MDT0003-lwp-OST0005: operation mds_connect to node 192.168.1.111@o2ib failed: rc = -114
[ 729.849328] LustreError: Skipped 7 previous similar messages
[ 749.610650] LNet: 7784:0:(o2iblnd_cb.c:3198:kiblnd_check_conns()) Timed out tx for 192.168.1.114@o2ib: 0 seconds
[ 804.746279] LustreError: 11-0: soaked-MDT0000-lwp-OST0017: operation mds_connect to node 192.168.1.108@o2ib failed: rc = -114
[ 804.786597] LustreError: Skipped 9 previous similar messages
[ 833.890248] LNet: Service thread pid 35547 was inactive for 200.52s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[ 833.952191] Pid: 35547, comm: ll_ost_io01_022
[ 833.969466] Call Trace:
[ 833.987632] [<ffffffff816a9589>] schedule+0x29/0x70
[ 834.006382] [<ffffffffc06b34d5>] cv_wait_common+0x125/0x150 [spl]
[ 834.028931] [<ffffffff810b1920>] ? autoremove_wake_function+0x0/0x40
[ 834.052232] [<ffffffffc06b3515>] __cv_wait+0x15/0x20 [spl]
[ 834.072541] [<ffffffffc1f7181f>] txg_wait_synced+0xef/0x140 [zfs]
[ 834.094665] [<ffffffffc0b562c8>] osd_trans_stop+0x468/0x590 [osd_zfs]
[ 834.117804] [<ffffffffc117629f>] ofd_trans_stop+0x1f/0x60 [ofd]
[ 834.139098] [<ffffffffc117c4c4>] ofd_commitrw_write+0x7e4/0x1c50 [ofd]
[ 834.162354] [<ffffffffc11805e9>] ofd_commitrw+0x4b9/0xac0 [ofd]
[ 834.183547] [<ffffffffc0f74677>] obd_commitrw+0x2ed/0x330 [ptlrpc]
[ 834.205542] [<ffffffffc0f47c41>] tgt_brw_write+0xff1/0x17c0 [ptlrpc]
[ 834.227963] [<ffffffff811de501>] ? __slab_free+0x81/0x2f0
[ 834.247183] [<ffffffff8118aeb4>] ? free_one_page+0x2e4/0x310
[ 834.267230] [<ffffffffc0e99f80>] ? target_bulk_timeout+0x0/0xb0 [ptlrpc]
[ 834.290662] [<ffffffffc0f43c55>] tgt_request_handle+0x925/0x1370 [ptlrpc]
[ 834.314361] [<ffffffffc0eeca3e>] ptlrpc_server_handle_request+0x24e/0xab0 [ptlrpc]
[ 834.340655] [<ffffffffc0ee9908>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[ 834.364065] [<ffffffff811dca2e>] ? kmalloc_order_trace+0x2e/0xa0
[ 834.385198] [<ffffffffc0ef01e2>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[ 834.406858] [<ffffffff81029557>] ? __switch_to+0xd7/0x510
[ 834.425923] [<ffffffff816a904d>] ? __schedule+0x39d/0x8b0
[ 834.444999] [<ffffffffc0eef750>] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[ 834.466633] [<ffffffff810b099f>] kthread+0xcf/0xe0
[ 834.483693] [<ffffffff810b08d0>] ? kthread+0x0/0xe0
[ 834.501018] [<ffffffff816b4fd8>] ret_from_fork+0x58/0x90
[ 834.519757] [<ffffffff810b08d0>] ? kthread+0x0/0xe0
[ 834.542850] LustreError: dumping log to /tmp/lustre-log.1510193654.35547
[ 834.802579] Pid: 32862, comm: ll_ost_io00_019
[ 834.819060]
Call Trace:
[ 834.835684] [<ffffffff816a9589>] schedule+0x29/0x70
[ 834.853784] [<ffffffffc06b34d5>] cv_wait_common+0x125/0x150 [spl]
[ 834.875745] [<ffffffff810b1920>] ? autoremove_wake_function+0x0/0x40
[ 834.898451] [<ffffffffc06b3515>] __cv_wait+0x15/0x20 [spl]
[ 834.918158] [<ffffffffc1f26bfb>] dmu_tx_wait+0x20b/0x3c0 [zfs]
[ 834.938932] [<ffffffffc1f26e41>] dmu_tx_assign+0x91/0x490 [zfs]
[ 834.959865] [<ffffffffc0b52f37>] osd_trans_start+0xa7/0x3a0 [osd_zfs]
[ 834.982424] [<ffffffffc117620b>] ofd_trans_start+0x6b/0xe0 [ofd]
[ 835.003465] [<ffffffffc117c62b>] ofd_commitrw_write+0x94b/0x1c50 [ofd]
[ 835.026153] [<ffffffffc11805e9>] ofd_commitrw+0x4b9/0xac0 [ofd]
[ 835.046815] [<ffffffffc0f74677>] obd_commitrw+0x2ed/0x330 [ptlrpc]
[ 835.068248] [<ffffffffc0f47c41>] tgt_brw_write+0xff1/0x17c0 [ptlrpc]
[ 835.090232] [<ffffffff810cd7a4>] ? update_curr+0x104/0x190
[ 835.110138] [<ffffffff810c8f28>] ? __enqueue_entity+0x78/0x80
[ 835.130802] [<ffffffff810cf91c>] ? enqueue_entity+0x26c/0xb60
[ 835.151423] [<ffffffffc0e99f80>] ? target_bulk_timeout+0x0/0xb0 [ptlrpc]
[ 835.175126] [<ffffffffc0f43c55>] tgt_request_handle+0x925/0x1370 [ptlrpc]
[ 835.199009] [<ffffffffc0eeca3e>] ptlrpc_server_handle_request+0x24e/0xab0 [ptlrpc]
[ 835.225389] [<ffffffffc0ee9908>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[ 835.248817] [<ffffffff810c4832>] ? default_wake_function+0x12/0x20
[ 835.270461] [<ffffffff810ba598>] ? __wake_up_common+0x58/0x90
[ 835.290685] [<ffffffffc0ef01e2>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[ 835.312227] [<ffffffff81029557>] ? __switch_to+0xd7/0x510
[ 835.331138] [<ffffffff816a9000>] ? __schedule+0x350/0x8b0
[ 835.350071] [<ffffffffc0eef750>] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[ 835.371526] [<ffffffff810b099f>] kthread+0xcf/0xe0
[ 835.388344] [<ffffffff810b08d0>] ? kthread+0x0/0xe0
[ 835.405400] [<ffffffff816b4fd8>] ret_from_fork+0x58/0x90
[ 835.423854] [<ffffffff810b08d0>] ? kthread+0x0/0xe0
[ 835.446443] Pid: 29220, comm: ll_ost_io01_004