/var/log/messages-20180401.gz:Mar 31 12:40:57 warble2 systemd: Starting Cleanup of Temporary Directories...
/var/log/messages-20180401.gz:Mar 31 12:40:57 warble2 systemd: Started Cleanup of Temporary Directories.
/var/log/messages-20180402.gz:Apr 1 08:30:43 warble1 systemd: Starting Cleanup of Temporary Directories...
/var/log/messages-20180402.gz:Apr 1 08:30:43 warble1 systemd: Started Cleanup of Temporary Directories.
/var/log/messages-20180402.gz:Apr 1 11:40:58 warble2 systemd: Starting Cleanup of Temporary Directories...
/var/log/messages-20180402.gz:Apr 1 11:40:58 warble2 systemd: Started Cleanup of Temporary Directories.
/var/log/messages-20180403.gz:Apr 2 08:30:44 warble1 systemd: Starting Cleanup of Temporary Directories...
/var/log/messages-20180403.gz:Apr 2 08:30:44 warble1 systemd: Started Cleanup of Temporary Directories.
/var/log/messages-20180403.gz:Apr 2 11:40:58 warble2 systemd: Starting Cleanup of Temporary Directories...
/var/log/messages-20180403.gz:Apr 2 11:40:58 warble2 systemd: Started Cleanup of Temporary Directories.
/var/log/messages-20180404.gz:Apr 3 08:30:44 warble1 systemd: Starting Cleanup of Temporary Directories...
/var/log/messages-20180404.gz:Apr 3 08:30:44 warble1 systemd: Started Cleanup of Temporary Directories.
/var/log/messages-20180404.gz:Apr 3 11:40:59 warble2 systemd: Starting Cleanup of Temporary Directories...
/var/log/messages-20180404.gz:Apr 3 11:40:59 warble2 systemd: Started Cleanup of Temporary Directories.
/var/log/messages-20180405.gz:Apr 4 08:30:45 warble1 systemd: Starting Cleanup of Temporary Directories...
/var/log/messages-20180405.gz:Apr 4 08:30:45 warble1 systemd: Started Cleanup of Temporary Directories.
/var/log/messages-20180405.gz:Apr 4 11:40:59 warble2 systemd: Starting Cleanup of Temporary Directories...
/var/log/messages-20180405.gz:Apr 4 11:40:59 warble2 systemd: Started Cleanup of Temporary Directories.
/var/log/messages-20180406.gz:Apr 5 08:30:46 warble1 systemd: Starting Cleanup of Temporary Directories...
/var/log/messages-20180406.gz:Apr 5 08:30:46 warble1 systemd: Started Cleanup of Temporary Directories.
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: LNet: Service thread pid 202469 was inactive for 200.19s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: Pid: 202469, comm: mdt_rdpg01_009
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: #012Call Trace:
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: [] top_trans_stop+0x46b/0x970 [ptlrpc]
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: [] mdd_trans_stop+0x24/0x40 [mdd]
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: [] mdd_attr_set+0x59a/0xb50 [mdd]
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: [] mdt_mfd_close+0x1a3/0x610 [mdt]
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc]
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: [] ? __wake_up_common+0x58/0x90
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: [] kthread+0xcf/0xe0
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: [] ret_from_fork+0x58/0x90
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel:
/var/log/messages-20180406.gz:Apr 5 11:16:46 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1522891006.202469
/var/log/messages-20180406.gz:Apr 5 11:23:20 warble1 kernel: Lustre: 166298:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88b929130f00 x1594448788515072/t0(0) o35->dab1eea9-d5c1-3600-5d68-c575fa8994a1@192.168.44.207@o2ib44:535/0 lens 512/696 e 24 to 0 dl 1522891405 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180406.gz:Apr 5 11:23:26 warble1 kernel: Lustre: dagg-MDT0002: Client dab1eea9-d5c1-3600-5d68-c575fa8994a1 (at 192.168.44.207@o2ib44) reconnecting
/var/log/messages-20180406.gz:Apr 5 11:23:26 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 960835cf-0272-8aa9-7468-7a100bafba4a (at 192.168.44.207@o2ib44)
/var/log/messages-20180406.gz:Apr 5 11:41:00 warble2 systemd: Starting Cleanup of Temporary Directories...
/var/log/messages-20180406.gz:Apr 5 11:41:00 warble2 systemd: Started Cleanup of Temporary Directories.
/var/log/messages-20180406.gz:Apr 5 18:59:01 warble1 rsyslogd: imjournal: journal reloaded... [v8.24.0 try http://www.rsyslog.com/e/0 ]
/var/log/messages-20180406.gz:Apr 5 21:24:07 warble1 kernel: LNet: 121:0:(o2iblnd_cb.c:2341:kiblnd_passive_connect()) Can't accept conn from 192.168.44.16@o2ib44 (version 12): max_frags 256 too large (32 wanted)
/var/log/messages-20180406.gz:Apr 5 21:24:07 warble1 kernel: Lustre: MGS: Connection restored to ede62a87-76ca-610a-2387-990c318116d8 (at 192.168.44.16@o2ib44)
/var/log/messages-20180406.gz:Apr 5 21:24:07 warble2 kernel: LNet: 395:0:(o2iblnd_cb.c:2341:kiblnd_passive_connect()) Can't accept conn from 192.168.44.16@o2ib44 (version 12): max_frags 256 too large (32 wanted)
/var/log/messages-20180406.gz:Apr 5 21:24:07 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to 99d7c337-1ed6-5788-f492-227d20485b51 (at 192.168.44.16@o2ib44)
/var/log/messages-20180406.gz:Apr 5 21:29:26 warble2 kernel: LNet: 395:0:(o2iblnd_cb.c:2341:kiblnd_passive_connect()) Can't accept conn from 192.168.44.16@o2ib44 (version 12): max_frags 256 too large (32 wanted)
/var/log/messages-20180406.gz:Apr 5 21:29:26 warble2 kernel: LNet: 395:0:(o2iblnd_cb.c:2341:kiblnd_passive_connect()) Skipped 3 previous similar messages
--
/var/log/messages-20180407.gz:Apr 6 19:17:45 warble1 kernel: Lustre: images-MDT0000: Connection restored to b9cd63b3-1509-d4bd-4136-eb0b862e8a40 (at 192.168.44.102@o2ib44)
/var/log/messages-20180407.gz:Apr 6 19:17:45 warble1 kernel: Lustre: Skipped 3 previous similar messages
/var/log/messages-20180407.gz:Apr 6 19:17:45 warble2 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180407.gz:Apr 6 19:21:13 warble1 kernel: Lustre: dagg-MDT0001: haven't heard from client d4c6598a-6dac-7f09-2348-bc327561919f (at 192.168.44.102@o2ib44) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88bd037a4c00, cur 1523006473 expire 1523006323 last 1523006246
/var/log/messages-20180407.gz:Apr 6 19:21:13 warble1 kernel: Lustre: Skipped 3 previous similar messages
/var/log/messages-20180407.gz:Apr 6 19:21:22 warble1 kernel: Lustre: MGS: haven't heard from client e2d03ff6-c9c4-3b78-3849-9bbd2ad7c058 (at 192.168.44.102@o2ib44) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88bd1441a400, cur 1523006482 expire 1523006332 last 1523006255
/var/log/messages-20180407.gz:Apr 6 19:21:22 warble1 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180407.gz:Apr 6 19:21:32 warble1 kernel: Lustre: images-MDT0000: haven't heard from client 6c699b97-5988-0955-0315-8c1bf3e44e10 (at 192.168.44.102@o2ib44) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88bcb0e5d800, cur 1523006492 expire 1523006342 last 1523006265
/var/log/messages-20180407.gz:Apr 6 19:21:32 warble2 kernel: Lustre: apps-MDT0000: haven't heard from client ff31550d-2ce4-d4f1-6674-acb985f588ea (at 192.168.44.102@o2ib44) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88bcb3f43800, cur 1523006492 expire 1523006342 last 1523006265
/var/log/messages-20180407.gz:Apr 6 19:21:32 warble2 kernel: Lustre: Skipped 2 previous similar messages
/var/log/messages-20180407.gz:Apr 6 19:21:57 warble2 kernel: Lustre: home-MDT0000: haven't heard from client 014aaaae-465a-b155-6c05-882c3d69b4c6 (at 192.168.44.102@o2ib44) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88bcaf180000, cur 1523006517 expire 1523006367 last 1523006290
/var/log/messages-20180407.gz:Apr 6 19:23:42 warble1 kernel: Lustre: images-MDT0000: Connection restored to b9cd63b3-1509-d4bd-4136-eb0b862e8a40 (at 192.168.44.102@o2ib44)
/var/log/messages-20180407.gz:Apr 6 19:23:42 warble1 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180407.gz:Apr 6 19:23:42 warble2 kernel: Lustre: apps-MDT0000: Connection restored to 13e7a798-9238-6f2c-cbfa-354fc1b709f1 (at 192.168.44.102@o2ib44)
/var/log/messages-20180407.gz:Apr 6 19:23:42 warble2 kernel: LustreError: 137-5: dagg-MDT0001_UUID: not available for connect from 192.168.44.102@o2ib44 (no target). If you are running an HA pair check that the target is mounted on the other server.
/var/log/messages-20180407.gz:Apr 6 19:23:42 warble2 kernel: LustreError: Skipped 1 previous similar message
/var/log/messages-20180407.gz:Apr 6 19:23:42 warble2 kernel: Lustre: Skipped 2 previous similar messages
/var/log/messages-20180407.gz:Apr 6 19:27:29 warble1 kernel: Lustre: MGS: haven't heard from client e2d03ff6-c9c4-3b78-3849-9bbd2ad7c058 (at 192.168.44.102@o2ib44) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885dad9bf000, cur 1523006849 expire 1523006699 last 1523006622
/var/log/messages-20180407.gz:Apr 6 19:27:29 warble2 kernel: Lustre: apps-MDT0000: haven't heard from client ff31550d-2ce4-d4f1-6674-acb985f588ea (at 192.168.44.102@o2ib44) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88ba40800c00, cur 1523006849 expire 1523006699 last 1523006622
/var/log/messages-20180407.gz:Apr 6 19:27:29 warble2 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: LNet: Service thread pid 23423 was inactive for 200.23s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: Pid: 23423, comm: mdt_rdpg01_004
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: #012Call Trace:
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] top_trans_stop+0x46b/0x970 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] mdd_trans_stop+0x24/0x40 [mdd]
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] mdd_attr_set+0x59a/0xb50 [mdd]
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] mdt_mfd_close+0x1a3/0x610 [mdt]
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] ? __wake_up_common+0x58/0x90
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] kthread+0xcf/0xe0
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] ret_from_fork+0x58/0x90
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel:
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523020612.23423
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: Pid: 351830, comm: mdt_rdpg01_037
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: #012Call Trace:
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] top_trans_stop+0x46b/0x970 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] mdd_trans_stop+0x24/0x40 [mdd]
/var/log/messages-20180407.gz:Apr 6 23:16:52 warble1 kernel: [] mdd_attr_set+0x59a/0xb50 [mdd]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] mdt_mfd_close+0x1a3/0x610 [mdt]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ? __wake_up_common+0x58/0x90
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] kthread+0xcf/0xe0
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ret_from_fork+0x58/0x90
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel:
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: LNet: Service thread pid 249429 was inactive for 200.40s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: LNet: Skipped 1 previous similar message
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: Pid: 249429, comm: mdt_rdpg01_054
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: #012Call Trace:
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] top_trans_stop+0x46b/0x970 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] mdd_trans_stop+0x24/0x40 [mdd]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] mdd_attr_set+0x59a/0xb50 [mdd]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] mdt_mfd_close+0x1a3/0x610 [mdt]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ? __wake_up_common+0x58/0x90
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] kthread+0xcf/0xe0
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ret_from_fork+0x58/0x90
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel:
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523020613.249429
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: Pid: 343528, comm: mdt_rdpg01_033
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: #012Call Trace:
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] top_trans_stop+0x46b/0x970 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] mdd_trans_stop+0x24/0x40 [mdd]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] mdd_attr_set+0x59a/0xb50 [mdd]
--
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ? __wake_up_common+0x58/0x90
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] kthread+0xcf/0xe0
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ret_from_fork+0x58/0x90
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 6 23:16:53 warble1 kernel:
/var/log/messages-20180407.gz:Apr 6 23:23:27 warble1 kernel: Lustre: 455633:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bb5abaec00 x1594455637621024/t0(0) o35->1649aabf-3fc7-579d-53cc-12bd7e762c0c@192.168.44.199@o2ib44:282/0 lens 512/696 e 24 to 0 dl 1523021012 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180407.gz:Apr 6 23:23:28 warble1 kernel: Lustre: 455633:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bbd7cc2400 x1594455637623216/t0(0) o35->1649aabf-3fc7-579d-53cc-12bd7e762c0c@192.168.44.199@o2ib44:283/0 lens 512/696 e 24 to 0 dl 1523021013 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180407.gz:Apr 6 23:23:28 warble1 kernel: Lustre: 455633:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 2 previous similar messages
/var/log/messages-20180407.gz:Apr 6 23:23:33 warble1 kernel: Lustre: dagg-MDT0002: Client 1649aabf-3fc7-579d-53cc-12bd7e762c0c (at 192.168.44.199@o2ib44) reconnecting
/var/log/messages-20180407.gz:Apr 6 23:23:33 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44)
/var/log/messages-20180407.gz:Apr 6 23:33:34 warble1 kernel: Lustre: dagg-MDT0002: Client 1649aabf-3fc7-579d-53cc-12bd7e762c0c (at 192.168.44.199@o2ib44) reconnecting
/var/log/messages-20180407.gz:Apr 6 23:33:34 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44)
/var/log/messages-20180407.gz:Apr 6 23:39:04 warble1 kernel: Lustre: 455633:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff88bc00ca5d00 x1594448790165536/t0(0) o35->ba0dc1ba-a5de-598a-6412-3605eeca8e99@192.168.44.206@o2ib44:464/0 lens 512/696 e 0 to 0 dl 1523021949 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180407.gz:Apr 6 23:39:10 warble1 kernel: Lustre: dagg-MDT0002: Client ba0dc1ba-a5de-598a-6412-3605eeca8e99 (at 192.168.44.206@o2ib44) reconnecting
/var/log/messages-20180407.gz:Apr 6 23:39:10 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to c4ab4068-3bae-dfff-0a35-f956f2cb25a4 (at 192.168.44.206@o2ib44)
/var/log/messages-20180407.gz:Apr 6 23:43:35 warble1 kernel: Lustre: dagg-MDT0002: Client 1649aabf-3fc7-579d-53cc-12bd7e762c0c (at 192.168.44.199@o2ib44) reconnecting
/var/log/messages-20180407.gz:Apr 6 23:43:35 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44)
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: LNet: Service thread pid 144148 was inactive for 1203.24s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: LNet: Skipped 1 previous similar message
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: Pid: 144148, comm: mdt_rdpg01_023
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: #012Call Trace:
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: [] top_trans_stop+0x46b/0x970 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: [] mdd_trans_stop+0x24/0x40 [mdd]
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: [] mdd_attr_set+0x59a/0xb50 [mdd]
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: [] mdt_mfd_close+0x1a3/0x610 [mdt]
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: [] ? __wake_up_common+0x58/0x90
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: [] kthread+0xcf/0xe0
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: [] ret_from_fork+0x58/0x90
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel:
/var/log/messages-20180407.gz:Apr 6 23:46:37 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523022397.144148
/var/log/messages-20180407.gz:Apr 6 23:51:46 warble1 kernel: Lustre: dagg-MDT0002: Client ba0dc1ba-a5de-598a-6412-3605eeca8e99 (at 192.168.44.206@o2ib44) reconnecting
/var/log/messages-20180407.gz:Apr 6 23:51:46 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to c4ab4068-3bae-dfff-0a35-f956f2cb25a4 (at 192.168.44.206@o2ib44)
/var/log/messages-20180407.gz:Apr 6 23:53:36 warble1 kernel: Lustre: dagg-MDT0002: Client 1649aabf-3fc7-579d-53cc-12bd7e762c0c (at 192.168.44.199@o2ib44) reconnecting
/var/log/messages-20180407.gz:Apr 6 23:53:36 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44)
/var/log/messages-20180407.gz:Apr 7 00:03:37 warble1 kernel: Lustre: dagg-MDT0002: Client 1649aabf-3fc7-579d-53cc-12bd7e762c0c (at 192.168.44.199@o2ib44) reconnecting
/var/log/messages-20180407.gz:Apr 7 00:03:37 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44)
/var/log/messages-20180407.gz:Apr 7 00:09:37 warble1 kernel: LNet: Service thread pid 145064 was inactive for 200.66s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180407.gz:Apr 7 00:09:37 warble1 kernel: Pid: 145064, comm: mdt00_117
/var/log/messages-20180407.gz:Apr 7 00:09:37 warble1 kernel: #012Call Trace:
/var/log/messages-20180407.gz:Apr 7 00:09:37 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180407.gz:Apr 7 00:09:37 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180407.gz:Apr 7 00:09:37 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 00:09:37 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180407.gz:Apr 7 00:09:37 warble1 kernel: [] top_trans_stop+0x46b/0x970 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 00:09:37 warble1 kernel: [] ? lod_declare_destroy+0x3d0/0x5c0 [lod]
/var/log/messages-20180407.gz:Apr 7 00:09:37 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180407.gz:Apr 7 00:09:37 warble1 kernel: [] ? mdd_declare_finish_unlink+0x8e/0x1f0 [mdd]
/var/log/messages-20180407.gz:Apr 7 00:09:37 warble1 kernel: [] mdd_trans_stop+0x24/0x40 [mdd]
/var/log/messages-20180407.gz:Apr 7 00:09:37 warble1 kernel: [] mdd_rename+0x4d1/0x14a0 [mdd]
/var/log/messages-20180407.gz:Apr 7 00:09:37 warble1 kernel: [] mdt_reint_rename_internal.isra.36+0x1664/0x20c0 [mdt]
/var/log/messages-20180407.gz:Apr 7 00:09:37 warble1 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x19b/0x860 [mdt]
/var/log/messages-20180407.gz:Apr 7 00:09:37 warble1 kernel: [] ? strlcpy+0x42/0x60
/var/log/messages-20180407.gz:Apr 7 00:09:37 warble1 kernel: [] mdt_reint_rename+0x13/0x20 [mdt]
/var/log/messages-20180407.gz:Apr 7 00:09:37 warble1 kernel: [] mdt_reint_rec+0x80/0x210 [mdt]
/var/log/messages-20180407.gz:Apr 7 00:09:37 warble1 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180407.gz:Apr 7 00:09:37 warble1 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages-20180407.gz:Apr 7 00:09:38 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 00:09:38 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 00:09:38 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 00:09:38 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180407.gz:Apr 7 00:09:38 warble1 kernel: [] ? __wake_up_common+0x58/0x90
/var/log/messages-20180407.gz:Apr 7 00:09:38 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 00:09:38 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 00:09:38 warble1 kernel: [] kthread+0xcf/0xe0
/var/log/messages-20180407.gz:Apr 7 00:09:38 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 7 00:09:38 warble1 kernel: [] ret_from_fork+0x58/0x90
/var/log/messages-20180407.gz:Apr 7 00:09:38 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 7 00:09:38 warble1 kernel:
/var/log/messages-20180407.gz:Apr 7 00:09:38 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523023778.145064
/var/log/messages-20180407.gz:Apr 7 00:13:38 warble1 kernel: Lustre: dagg-MDT0002: Client 1649aabf-3fc7-579d-53cc-12bd7e762c0c (at 192.168.44.199@o2ib44) reconnecting
/var/log/messages-20180407.gz:Apr 7 00:13:38 warble1 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180407.gz:Apr 7 00:13:38 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44)
/var/log/messages-20180407.gz:Apr 7 00:13:38 warble1 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180407.gz:Apr 7 00:16:13 warble1 kernel: Lustre: 143582:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885952a28600 x1595015331327440/t0(0) o36->648c8bb7-edb8-ecdf-617f-773ccb2226ea@192.168.44.101@o2ib44:427/0 lens 792/3128 e 24 to 0 dl 1523024177 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180407.gz:Apr 7 00:16:18 warble1 kernel: Lustre: dagg-MDT0002: Client 648c8bb7-edb8-ecdf-617f-773ccb2226ea (at 192.168.44.101@o2ib44) reconnecting
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: LNet: Service thread pid 143624 was inactive for 200.09s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: Pid: 143624, comm: mdt_rdpg01_018
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: #012Call Trace:
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: [] top_trans_stop+0x46b/0x970 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: [] mdd_trans_stop+0x24/0x40 [mdd]
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: [] mdd_attr_set+0x59a/0xb50 [mdd]
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: [] mdt_mfd_close+0x1a3/0x610 [mdt]
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: [] ? __wake_up_common+0x58/0x90
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: [] kthread+0xcf/0xe0
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: [] ret_from_fork+0x58/0x90
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel:
/var/log/messages-20180407.gz:Apr 7 00:21:58 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523024518.143624
/var/log/messages-20180407.gz:Apr 7 00:23:39 warble1 kernel: Lustre: dagg-MDT0002: Client 1649aabf-3fc7-579d-53cc-12bd7e762c0c (at 192.168.44.199@o2ib44) reconnecting
/var/log/messages-20180407.gz:Apr 7 00:23:39 warble1 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180407.gz:Apr 7 00:23:39 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44)
/var/log/messages-20180407.gz:Apr 7 00:23:39 warble1 kernel: Lustre: Skipped 2 previous similar messages
/var/log/messages-20180407.gz:Apr 7 00:28:33 warble1 kernel: Lustre: 143627:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bdd26df050 x1594455650268736/t0(0) o35->1649aabf-3fc7-579d-53cc-12bd7e762c0c@192.168.44.199@o2ib44:413/0 lens 512/696 e 24 to 0 dl 1523024918 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180407.gz:Apr 7 00:33:40 warble1 kernel: Lustre: dagg-MDT0002: Client 1649aabf-3fc7-579d-53cc-12bd7e762c0c (at 192.168.44.199@o2ib44) reconnecting
/var/log/messages-20180407.gz:Apr 7 00:33:40 warble1 kernel: Lustre: Skipped 2 previous similar messages
/var/log/messages-20180407.gz:Apr 7 00:33:40 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44)
/var/log/messages-20180407.gz:Apr 7 00:33:40 warble1 kernel: Lustre: Skipped 2 previous
similar messages /var/log/messages-20180407.gz:Apr 7 00:43:41 warble1 kernel: Lustre: dagg-MDT0002: Client 1649aabf-3fc7-579d-53cc-12bd7e762c0c (at 192.168.44.199@o2ib44) reconnecting /var/log/messages-20180407.gz:Apr 7 00:43:41 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180407.gz:Apr 7 00:43:41 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44) /var/log/messages-20180407.gz:Apr 7 00:43:41 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180407.gz:Apr 7 00:53:42 warble1 kernel: Lustre: dagg-MDT0002: Client 1649aabf-3fc7-579d-53cc-12bd7e762c0c (at 192.168.44.199@o2ib44) reconnecting /var/log/messages-20180407.gz:Apr 7 00:53:42 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44) /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: LNet: Service thread pid 23452 was inactive for 200.66s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: Pid: 23452, comm: mdt00_014 /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: #012Call Trace: /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] schedule_timeout+0x174/0x2c0 /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] ? process_timeout+0x0/0x10 /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] ? cfs_block_sigsinv+0x71/0xa0 [libcfs] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] ? 
ldlm_expired_completion_wait+0x0/0x240 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] ldlm_cli_enqueue_local+0x230/0x860 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] mdt_object_local_lock+0x452/0xaf0 [mdt] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] ? lu_object_find_at+0x211/0x290 [obdclass] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] mdt_object_lock_internal+0x5e/0x300 [mdt] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] mdt_object_lock+0x14/0x20 [mdt] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] mdt_reint_open+0xc71/0x31a0 [mdt] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] ? upcall_cache_get_entry+0x20e/0x8f0 [obdclass] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] ? ucred_set_jobid+0x53/0x70 [mdt] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] mdt_reint_rec+0x80/0x210 [mdt] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] mdt_intent_reint+0x162/0x430 [mdt] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] mdt_intent_policy+0x43e/0xc70 [mdt] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] ? 
ldlm_resource_get+0x9f/0xa30 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] ldlm_lock_enqueue+0x387/0x970 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] ldlm_handle_enqueue0+0x9c3/0x1680 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] ? __wake_up_common+0x58/0x90 /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] kthread+0xcf/0xe0 /var/log/messages-20180407.gz:Apr 7 01:02:42 warble1 kernel: [] ? 
finish_task_switch+0x57/0x160 -- /var/log/messages-20180407.gz:Apr 7 01:03:43 warble1 kernel: Lustre: dagg-MDT0002: Client 1649aabf-3fc7-579d-53cc-12bd7e762c0c (at 192.168.44.199@o2ib44) reconnecting /var/log/messages-20180407.gz:Apr 7 01:03:43 warble1 kernel: Lustre: Skipped 1 previous similar message /var/log/messages-20180407.gz:Apr 7 01:03:43 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44) /var/log/messages-20180407.gz:Apr 7 01:03:43 warble1 kernel: Lustre: Skipped 1 previous similar message /var/log/messages-20180407.gz:Apr 7 01:04:21 warble1 kernel: LustreError: 23452:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1523026761, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0001_UUID lock: ffff885e294c7400/0xe8af923330d89d63 lrc: 3/0,1 mode: --/CW res: [0x28001c991:0xea2c:0x0].0x0 bits 0x2 rrc: 4 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 23452 timeout: 0 lvb_type: 0 /var/log/messages-20180407.gz:Apr 7 01:04:21 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523027061.23452 /var/log/messages-20180407.gz:Apr 7 01:08:36 warble1 kernel: LustreError: 137-5: home-MDT0000_UUID: not available for connect from 192.168.44.102@o2ib44 (no target). If you are running an HA pair check that the target is mounted on the other server. 
/var/log/messages-20180407.gz:Apr 7 01:08:36 warble1 kernel: LustreError: Skipped 1 previous similar message /var/log/messages-20180407.gz:Apr 7 01:08:36 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to 13e7a798-9238-6f2c-cbfa-354fc1b709f1 (at 192.168.44.102@o2ib44) /var/log/messages-20180407.gz:Apr 7 01:09:02 warble2 kernel: Lustre: apps-MDT0000: Connection restored to 13e7a798-9238-6f2c-cbfa-354fc1b709f1 (at 192.168.44.102@o2ib44) /var/log/messages-20180407.gz:Apr 7 01:09:02 warble2 kernel: Lustre: home-MDT0000: Connection restored to 13e7a798-9238-6f2c-cbfa-354fc1b709f1 (at 192.168.44.102@o2ib44) /var/log/messages-20180407.gz:Apr 7 01:09:17 warble1 kernel: Lustre: 143611:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885a82868300 x1595015334613312/t0(0) o101->648c8bb7-edb8-ecdf-617f-773ccb2226ea@192.168.44.101@o2ib44:591/0 lens 936/3512 e 24 to 0 dl 1523027361 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180407.gz:Apr 7 01:13:44 warble1 kernel: Lustre: dagg-MDT0002: Client 1649aabf-3fc7-579d-53cc-12bd7e762c0c (at 192.168.44.199@o2ib44) reconnecting /var/log/messages-20180407.gz:Apr 7 01:13:44 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180407.gz:Apr 7 01:13:44 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44) /var/log/messages-20180407.gz:Apr 7 01:13:44 warble1 kernel: Lustre: Skipped 6 previous similar messages /var/log/messages-20180407.gz:Apr 7 01:23:45 warble1 kernel: Lustre: dagg-MDT0002: Client 1649aabf-3fc7-579d-53cc-12bd7e762c0c (at 192.168.44.199@o2ib44) reconnecting /var/log/messages-20180407.gz:Apr 7 01:23:45 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180407.gz:Apr 7 01:23:45 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44) 
/var/log/messages-20180407.gz:Apr 7 01:23:45 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: LNet: Service thread pid 23975 was inactive for 200.32s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: Pid: 23975, comm: mdt_rdpg01_007 /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: #012Call Trace: /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] top_trans_stop+0x46b/0x970 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] mdd_trans_stop+0x24/0x40 [mdd] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] mdd_attr_set+0x59a/0xb50 [mdd] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ? 
lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] mdt_mfd_close+0x1a3/0x610 [mdt] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ? __wake_up_common+0x58/0x90 /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] kthread+0xcf/0xe0 /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ret_from_fork+0x58/0x90 /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523028426.23975 /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: Pid: 166297, comm: mdt_rdpg01_028 /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: #012Call Trace: /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ? 
recalc_sigpending+0x1b/0x50 /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] top_trans_stop+0x46b/0x970 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] mdd_trans_stop+0x24/0x40 [mdd] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] mdd_attr_set+0x59a/0xb50 [mdd] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] -- /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] mdd_trans_stop+0x24/0x40 [mdd] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] mdd_attr_set+0x59a/0xb50 [mdd] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] mdt_mfd_close+0x1a3/0x610 [mdt] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ? 
ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ? __wake_up_common+0x58/0x90 /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] kthread+0xcf/0xe0 /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ret_from_fork+0x58/0x90 /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180407.gz:Apr 7 01:27:06 warble1 kernel: /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: LNet: Service thread pid 344781 was inactive for 200.42s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: LNet: Skipped 2 previous similar messages /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: Pid: 344781, comm: mdt_rdpg01_034 /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: #012Call Trace: /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: [] top_trans_stop+0x46b/0x970 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: [] ? 
top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: [] mdd_trans_stop+0x24/0x40 [mdd] /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: [] mdd_attr_set+0x59a/0xb50 [mdd] /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: [] mdt_mfd_close+0x1a3/0x610 [mdt] /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: [] ? __wake_up_common+0x58/0x90 /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: [] kthread+0xcf/0xe0 /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: [] ret_from_fork+0x58/0x90 /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: /var/log/messages-20180407.gz:Apr 7 01:27:08 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523028428.344781 /var/log/messages-20180407.gz:Apr 7 01:33:40 warble1 kernel: Lustre: 173524:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bb7d980000 x1594455653465456/t0(0) o35->1649aabf-3fc7-579d-53cc-12bd7e762c0c@192.168.44.199@o2ib44:545/0 lens 512/696 e 24 to 0 dl 1523028825 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180407.gz:Apr 7 01:33:42 warble1 kernel: Lustre: 173524:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bc331aa700 x1594455653466544/t0(0) o35->1649aabf-3fc7-579d-53cc-12bd7e762c0c@192.168.44.199@o2ib44:547/0 lens 512/696 e 23 to 0 dl 1523028827 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180407.gz:Apr 7 01:33:42 warble1 kernel: Lustre: 173524:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 2 previous similar messages /var/log/messages-20180407.gz:Apr 7 01:33:46 warble1 kernel: Lustre: dagg-MDT0002: Client 1649aabf-3fc7-579d-53cc-12bd7e762c0c (at 192.168.44.199@o2ib44) reconnecting /var/log/messages-20180407.gz:Apr 7 01:33:46 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180407.gz:Apr 7 01:33:46 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44) /var/log/messages-20180407.gz:Apr 7 01:33:46 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180407.gz:Apr 7 01:43:47 warble1 kernel: Lustre: dagg-MDT0002: Client 1649aabf-3fc7-579d-53cc-12bd7e762c0c (at 192.168.44.199@o2ib44) reconnecting /var/log/messages-20180407.gz:Apr 7 01:43:47 warble1 kernel: Lustre: Skipped 1 previous similar message /var/log/messages-20180407.gz:Apr 7 01:43:47 warble1 kernel: Lustre: dagg-MDT0002: 
Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44) /var/log/messages-20180407.gz:Apr 7 01:43:47 warble1 kernel: Lustre: Skipped 1 previous similar message /var/log/messages-20180407.gz:Apr 7 01:53:48 warble1 kernel: Lustre: dagg-MDT0002: Client 1649aabf-3fc7-579d-53cc-12bd7e762c0c (at 192.168.44.199@o2ib44) reconnecting /var/log/messages-20180407.gz:Apr 7 01:53:48 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180407.gz:Apr 7 01:53:48 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44) /var/log/messages-20180407.gz:Apr 7 01:53:48 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: LNet: Service thread pid 249359 was inactive for 200.66s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: Pid: 249359, comm: mdt_rdpg01_049 /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: #012Call Trace: /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: [] top_trans_stop+0x46b/0x970 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: [] ? 
top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: [] mdd_trans_stop+0x24/0x40 [mdd] /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: [] mdd_attr_set+0x59a/0xb50 [mdd] /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: [] mdt_mfd_close+0x1a3/0x610 [mdt] /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: [] ? __wake_up_common+0x58/0x90 /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: [] kthread+0xcf/0xe0 /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: [] ret_from_fork+0x58/0x90 /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: /var/log/messages-20180407.gz:Apr 7 01:57:09 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523030229.249359 /var/log/messages-20180407.gz:Apr 7 02:03:43 warble1 kernel: Lustre: 173524:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bb136ff200 x1594455653530256/t0(0) o35->1649aabf-3fc7-579d-53cc-12bd7e762c0c@192.168.44.199@o2ib44:83/0 lens 512/696 e 24 to 0 dl 1523030628 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180407.gz:Apr 7 02:03:49 warble1 kernel: Lustre: dagg-MDT0002: Client 1649aabf-3fc7-579d-53cc-12bd7e762c0c (at 192.168.44.199@o2ib44) reconnecting /var/log/messages-20180407.gz:Apr 7 02:03:49 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180407.gz:Apr 7 02:03:49 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44) /var/log/messages-20180407.gz:Apr 7 02:03:49 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: LNet: Service thread pid 364565 was inactive for 200.42s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: Pid: 364565, comm: mdt00_063 /var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: #012Call Trace: /var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ? number.isra.2+0x323/0x360 /var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] schedule_timeout+0x174/0x2c0 /var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ? process_timeout+0x0/0x10 /var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ? 
cfs_block_sigsinv+0x71/0xa0 [libcfs]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ldlm_cli_enqueue_local+0x230/0x860 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] mdt_object_local_lock+0x452/0xaf0 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ? lu_object_find_at+0x95/0x290 [obdclass]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] mdt_object_lock_internal+0x5e/0x300 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] mdt_object_lock+0x14/0x20 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] mdt_reint_open+0xc71/0x31a0 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ? upcall_cache_get_entry+0x20e/0x8f0 [obdclass]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ? ucred_set_jobid+0x53/0x70 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] mdt_reint_rec+0x80/0x210 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] mdt_intent_reint+0x162/0x430 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] mdt_intent_policy+0x43e/0xc70 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ? ldlm_resource_get+0x9f/0xa30 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ldlm_lock_enqueue+0x387/0x970 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ldlm_handle_enqueue0+0x9c3/0x1680 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ? __wake_up_common+0x58/0x90
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] kthread+0xcf/0xe0
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ret_from_fork+0x58/0x90
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel:
/var/log/messages-20180407.gz:Apr 7 02:03:55 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523030635.364565
/var/log/messages-20180407.gz:Apr 7 02:05:34 warble1 kernel: LustreError: 364565:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1523030434, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0001_UUID lock: ffff88bd0bc38a00/0xe8af92333179276d lrc: 3/1,0 mode: --/CR res: [0x28001c991:0xea2c:0x0].0x0 bits 0x2 rrc: 6 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 364565 timeout: 0 lvb_type: 0
/var/log/messages-20180407.gz:Apr 7 02:10:29 warble1 kernel: Lustre: 364509:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88575a362700 x1595015335575824/t0(0) o101->648c8bb7-edb8-ecdf-617f-773ccb2226ea@192.168.44.101@o2ib44:489/0 lens 896/3512 e 24 to 0 dl 1523031034 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180407.gz:Apr 7 02:13:50 warble1 kernel: Lustre: dagg-MDT0002: Client 1649aabf-3fc7-579d-53cc-12bd7e762c0c (at 192.168.44.199@o2ib44) reconnecting
/var/log/messages-20180407.gz:Apr 7 02:13:50 warble1 kernel: Lustre: Skipped 2 previous similar messages
/var/log/messages-20180407.gz:Apr 7 02:13:50 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44)
/var/log/messages-20180407.gz:Apr 7 02:13:50 warble1 kernel: Lustre: Skipped 2 previous similar messages
/var/log/messages-20180407.gz:Apr 7 02:19:46 warble1 kernel: LNet: Service thread pid 143643 was inactive for 200.12s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180407.gz:Apr 7 02:19:46 warble1 kernel: Pid: 143643, comm: mdt01_080
/var/log/messages-20180407.gz:Apr 7 02:19:46 warble1 kernel: #012Call Trace:
/var/log/messages-20180407.gz:Apr 7 02:19:46 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180407.gz:Apr 7 02:19:46 warble1 kernel: [] schedule_timeout+0x174/0x2c0
/var/log/messages-20180407.gz:Apr 7 02:19:46 warble1 kernel: [] ? process_timeout+0x0/0x10
/var/log/messages-20180407.gz:Apr 7 02:19:46 warble1 kernel: [] ? cfs_block_sigsinv+0x71/0xa0 [libcfs]
/var/log/messages-20180407.gz:Apr 7 02:19:46 warble1 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:19:46 warble1 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:19:46 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180407.gz:Apr 7 02:19:46 warble1 kernel: [] ldlm_cli_enqueue_local+0x230/0x860 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:19:46 warble1 kernel: [] ? nvlist_lookup_common.part.71+0xa2/0xb0 [znvpair]
/var/log/messages-20180407.gz:Apr 7 02:19:46 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:19:46 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:19:46 warble1 kernel: [] mdt_object_local_lock+0x452/0xaf0 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:19:46 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] ? lu_object_find_at+0x95/0x290 [obdclass]
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] mdt_object_lock_internal+0x5e/0x300 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] mdt_object_lock+0x14/0x20 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] mdt_reint_open+0xc71/0x31a0 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] ? upcall_cache_get_entry+0x20e/0x8f0 [obdclass]
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] ? ucred_set_jobid+0x53/0x70 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] mdt_reint_rec+0x80/0x210 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] mdt_intent_reint+0x162/0x430 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] mdt_intent_policy+0x43e/0xc70 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] ? ldlm_resource_get+0x9f/0xa30 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] ldlm_lock_enqueue+0x387/0x970 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] ldlm_handle_enqueue0+0x9c3/0x1680 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] ? __wake_up_common+0x58/0x90
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] kthread+0xcf/0xe0
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] ret_from_fork+0x58/0x90
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel:
/var/log/messages-20180407.gz:Apr 7 02:19:47 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523031587.143643
/var/log/messages-20180407.gz:Apr 7 02:21:26 warble1 kernel: LustreError: 143643:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1523031386, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0001_UUID lock: ffff88bcb04a8c00/0xe8af923331897b15 lrc: 3/1,0 mode: --/CR res: [0x28001c991:0xea2c:0x0].0x0 bits 0x2 rrc: 8 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 143643 timeout: 0 lvb_type: 0
/var/log/messages-20180407.gz:Apr 7 02:23:51 warble1 kernel: Lustre: dagg-MDT0002: Client 1649aabf-3fc7-579d-53cc-12bd7e762c0c (at 192.168.44.199@o2ib44) reconnecting
/var/log/messages-20180407.gz:Apr 7 02:23:51 warble1 kernel: Lustre: Skipped 2 previous similar messages
/var/log/messages-20180407.gz:Apr 7 02:23:51 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44)
/var/log/messages-20180407.gz:Apr 7 02:23:51 warble1 kernel: Lustre: Skipped 2 previous similar messages
/var/log/messages-20180407.gz:Apr 7 02:26:21 warble1 kernel: Lustre: 145034:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bc5a5e5d00 x1594448842425152/t0(0) o101->cbc5fea3-98aa-c5ec-05a7-b5de8ce27b26@192.168.44.104@o2ib44:686/0 lens 896/3512 e 24 to 0 dl 1523031986 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180407.gz:Apr 7 02:33:52 warble1 kernel: Lustre: dagg-MDT0002: Client 1649aabf-3fc7-579d-53cc-12bd7e762c0c (at 192.168.44.199@o2ib44) reconnecting
/var/log/messages-20180407.gz:Apr 7 02:33:52 warble1 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180407.gz:Apr 7 02:33:52 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44)
/var/log/messages-20180407.gz:Apr 7 02:33:52 warble1 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180407.gz:Apr 7 02:43:53 warble1 kernel: Lustre: dagg-MDT0002: Client 1649aabf-3fc7-579d-53cc-12bd7e762c0c (at 192.168.44.199@o2ib44) reconnecting
/var/log/messages-20180407.gz:Apr 7 02:43:53 warble1 kernel: Lustre: Skipped 2 previous similar messages
/var/log/messages-20180407.gz:Apr 7 02:43:53 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44)
/var/log/messages-20180407.gz:Apr 7 02:43:53 warble1 kernel: Lustre: Skipped 2 previous similar messages
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: LNet: Service thread pid 211934 was inactive for 200.32s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: Pid: 211934, comm: mdt00_033
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: #012Call Trace:
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] schedule_timeout+0x174/0x2c0
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ? process_timeout+0x0/0x10
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ? cfs_block_sigsinv+0x71/0xa0 [libcfs]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ldlm_cli_enqueue_local+0x230/0x860 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] mdt_object_local_lock+0x452/0xaf0 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ? lu_object_find_at+0x95/0x290 [obdclass]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] mdt_object_lock_internal+0x5e/0x300 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] mdt_object_lock+0x14/0x20 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] mdt_reint_open+0xc71/0x31a0 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ? upcall_cache_get_entry+0x20e/0x8f0 [obdclass]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ? ucred_set_jobid+0x53/0x70 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] mdt_reint_rec+0x80/0x210 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] mdt_intent_reint+0x162/0x430 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] mdt_intent_policy+0x43e/0xc70 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ? ldlm_resource_get+0x9f/0xa30 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ldlm_lock_enqueue+0x387/0x970 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ldlm_handle_enqueue0+0x9c3/0x1680 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ? __wake_up_common+0x58/0x90
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] kthread+0xcf/0xe0
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ret_from_fork+0x58/0x90
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel:
/var/log/messages-20180407.gz:Apr 7 02:50:53 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523033453.211934
/var/log/messages-20180407.gz:Apr 7 02:52:33 warble1 kernel: LustreError: 211934:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1523033253, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0001_UUID lock: ffff885dd52a2000/0xe8af923331b8c701 lrc: 3/1,0 mode: --/CR res: [0x28001c991:0xea2c:0x0].0x0 bits 0x2 rrc: 12 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 211934 timeout: 0 lvb_type: 0
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: LNet: Service thread pid 145034 was inactive for 200.65s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: Pid: 145034, comm: mdt01_089
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: #012Call Trace:
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] schedule_timeout+0x174/0x2c0
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] ? process_timeout+0x0/0x10
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] ? cfs_block_sigsinv+0x71/0xa0 [libcfs]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] ldlm_cli_enqueue_local+0x230/0x860 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] mdt_object_local_lock+0x452/0xaf0 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] ? lu_object_find_at+0x95/0x290 [obdclass]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] mdt_object_lock_internal+0x5e/0x300 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] mdt_object_lock+0x14/0x20 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] mdt_reint_open+0xc71/0x31a0 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] ? upcall_cache_get_entry+0x20e/0x8f0 [obdclass]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] ? ucred_set_jobid+0x53/0x70 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] mdt_reint_rec+0x80/0x210 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] mdt_intent_reint+0x162/0x430 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] mdt_intent_policy+0x43e/0xc70 [mdt]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] ? ldlm_resource_get+0x9f/0xa30 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] ldlm_lock_enqueue+0x387/0x970 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] ldlm_handle_enqueue0+0x9c3/0x1680 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] ? __wake_up_common+0x58/0x90
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] kthread+0xcf/0xe0
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 7 02:52:45 warble1 kernel: [] ret_from_fork+0x58/0x90
--
/var/log/messages-20180407.gz:Apr 7 02:54:24 warble1 kernel: LustreError: 145034:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1523033364, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0001_UUID lock: ffff88bca26fb200/0xe8af923331bc05a0 lrc: 3/1,0 mode: --/CR res: [0x28001c991:0xea2c:0x0].0x0 bits 0x2 rrc: 12 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 145034 timeout: 0 lvb_type: 0
/var/log/messages-20180407.gz:Apr 7 02:57:28 warble1 kernel: Lustre: 364509:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885e936b6050 x1595015336008416/t0(0) o101->648c8bb7-edb8-ecdf-617f-773ccb2226ea@192.168.44.101@o2ib44:288/0 lens 896/3512 e 24 to 0 dl 1523033853 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180407.gz:Apr 7 02:59:19 warble1 kernel: Lustre: 364490:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88b99aafa700 x1594448842584896/t0(0) o101->cbc5fea3-98aa-c5ec-05a7-b5de8ce27b26@192.168.44.104@o2ib44:399/0 lens 896/3512 e 24 to 0 dl 1523033964 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180407.gz:Apr 7 03:02:34 warble1 kernel: LustreError: 143589:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1523033854, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0001_UUID lock: ffff8854d946fe00/0xe8af923331c830ed lrc: 3/1,0 mode: --/CR res: [0x28001c991:0xea2c:0x0].0x0 bits 0x2 rrc: 14 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 143589 timeout: 0 lvb_type: 0
/var/log/messages-20180407.gz:Apr 7 03:03:34 warble1 kernel: LustreError: 137-5: home-MDT0000_UUID: not available for connect from 192.168.44.199@o2ib44 (no target). If you are running an HA pair check that the target is mounted on the other server.
/var/log/messages-20180407.gz:Apr 7 03:03:34 warble1 kernel: LustreError: Skipped 1 previous similar message
/var/log/messages-20180407.gz:Apr 7 03:03:34 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to 4ceb2312-ea73-5592-c1b4-1db4c5b9fe10 (at 192.168.44.199@o2ib44)
/var/log/messages-20180407.gz:Apr 7 03:03:59 warble2 kernel: Lustre: apps-MDT0000: Connection restored to 4ceb2312-ea73-5592-c1b4-1db4c5b9fe10 (at 192.168.44.199@o2ib44)
/var/log/messages-20180407.gz:Apr 7 03:03:59 warble2 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180407.gz:Apr 7 03:03:59 warble1 kernel: Lustre: images-MDT0000: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44)
/var/log/messages-20180407.gz:Apr 7 03:03:59 warble1 kernel: Lustre: Skipped 7 previous similar messages
/var/log/messages-20180407.gz:Apr 7 03:04:31 warble1 kernel: Lustre: dagg-MDT0002: haven't heard from client 1649aabf-3fc7-579d-53cc-12bd7e762c0c (at 192.168.44.199@o2ib44) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88bd20d33400, cur 1523034271 expire 1523034121 last 1523034044
/var/log/messages-20180407.gz:Apr 7 03:04:31 warble1 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180407.gz:Apr 7 03:04:31 warble2 kernel: Lustre: dagg-MDT0000: haven't heard from client 1649aabf-3fc7-579d-53cc-12bd7e762c0c (at 192.168.44.199@o2ib44) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88bca8210400, cur 1523034271 expire 1523034121 last 1523034044
/var/log/messages-20180407.gz:Apr 7 03:04:31 warble2 kernel: Lustre: Skipped 2 previous similar messages
/var/log/messages-20180407.gz:Apr 7 03:04:47 warble1 kernel: Lustre: MGS: haven't heard from client 9bbb567c-0440-2485-d555-336ca80eacd9 (at 192.168.44.199@o2ib44) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885e8c1c3c00, cur 1523034287 expire 1523034137 last 1523034060
/var/log/messages-20180407.gz:Apr 7 03:04:47 warble1 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180407.gz:Apr 7 03:10:04 warble1 kernel: Lustre: 143597:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff885932de4e00 x1595015336143824/t0(0) o101->648c8bb7-edb8-ecdf-617f-773ccb2226ea@192.168.44.101@o2ib44:289/0 lens 896/3512 e 0 to 0 dl 1523034609 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180407.gz:Apr 7 03:10:10 warble1 kernel: Lustre: dagg-MDT0001: Client 648c8bb7-edb8-ecdf-617f-773ccb2226ea (at 192.168.44.101@o2ib44) reconnecting
/var/log/messages-20180407.gz:Apr 7 03:10:10 warble1 kernel: Lustre: Skipped 3 previous similar messages
/var/log/messages-20180407.gz:Apr 7 03:11:20 warble1 kernel: LNet: Service thread pid 362191 was inactive for 200.29s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180407.gz:Apr 7 03:11:20 warble1 kernel: Pid: 362191, comm: mdt_rdpg01_038
/var/log/messages-20180407.gz:Apr 7 03:11:20 warble1 kernel: #012Call Trace:
/var/log/messages-20180407.gz:Apr 7 03:11:20 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180407.gz:Apr 7 03:11:20 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180407.gz:Apr 7 03:11:20 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:20 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180407.gz:Apr 7 03:11:20 warble1 kernel: [] top_trans_stop+0x46b/0x970 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:20 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180407.gz:Apr 7 03:11:20 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:20 warble1 kernel: [] mdd_trans_stop+0x24/0x40 [mdd]
/var/log/messages-20180407.gz:Apr 7 03:11:20 warble1 kernel: [] mdd_attr_set+0x59a/0xb50 [mdd]
/var/log/messages-20180407.gz:Apr 7 03:11:20 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] mdt_mfd_close+0x1a3/0x610 [mdt]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? __wake_up_common+0x58/0x90
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] kthread+0xcf/0xe0
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ret_from_fork+0x58/0x90
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel:
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523034681.362191
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: LNet: Service thread pid 249361 was inactive for 200.69s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: Pid: 249361, comm: mdt_rdpg01_051
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: #012Call Trace:
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] top_trans_stop+0x46b/0x970 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] mdd_trans_stop+0x24/0x40 [mdd]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] mdd_attr_set+0x59a/0xb50 [mdd]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] mdt_mfd_close+0x1a3/0x610 [mdt]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? __wake_up_common+0x58/0x90
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] kthread+0xcf/0xe0
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ret_from_fork+0x58/0x90
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel:
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: Pid: 364526, comm: mdt_rdpg01_016
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: #012Call Trace:
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] top_trans_stop+0x46b/0x970 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] mdd_trans_stop+0x24/0x40 [mdd]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] mdd_attr_set+0x59a/0xb50 [mdd]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] mdt_mfd_close+0x1a3/0x610 [mdt]
--
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] mdd_trans_stop+0x24/0x40 [mdd]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] mdd_attr_set+0x59a/0xb50 [mdd]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] mdt_mfd_close+0x1a3/0x610 [mdt]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? __wake_up_common+0x58/0x90
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] kthread+0xcf/0xe0
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ret_from_fork+0x58/0x90
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 7 03:11:21 warble1 kernel:
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: LNet: Service thread pid 143627 was inactive for 200.60s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: LNet: Skipped 2 previous similar messages
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: Pid: 143627, comm: mdt_rdpg01_019
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: #012Call Trace:
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: [] top_trans_stop+0x46b/0x970 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: [] mdd_trans_stop+0x24/0x40 [mdd]
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: [] mdd_attr_set+0x59a/0xb50 [mdd]
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: [] mdt_mfd_close+0x1a3/0x610 [mdt]
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: [] ? __wake_up_common+0x58/0x90
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: [] kthread+0xcf/0xe0
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: [] ret_from_fork+0x58/0x90
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel:
/var/log/messages-20180407.gz:Apr 7 03:11:22 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523034682.143627
/var/log/messages-20180407.gz:Apr 7 03:12:23 warble1 kernel: LNet: Service thread pid 22443 was inactive for 200.18s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
/var/log/messages-20180407.gz:Apr 7 03:12:23 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523034743.22443
/var/log/messages-20180407.gz:Apr 7 03:15:10 warble1 kernel: LustreError: 364505:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1523034610, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0001_UUID lock: ffff88575a66d200/0xe8af9233327e61ed lrc: 3/1,0 mode: --/CR res: [0x28001c991:0xea2c:0x0].0x0 bits 0x2 rrc: 16 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 364505 timeout: 0 lvb_type: 0
/var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: LNet: Service thread pid 143589 was inactive for 1203.86s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: Pid: 143589, comm: mdt00_090
/var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: #012Call Trace:
/var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] ldlm_completion_ast+0x63d/0x920 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] ldlm_cli_enqueue_local+0x230/0x860 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] mdt_object_local_lock+0x452/0xaf0 [mdt]
/var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] ?
mdt_blocking_ast+0x0/0x2e0 [mdt] /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] ? lu_object_find_at+0x95/0x290 [obdclass] /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] mdt_object_lock_internal+0x5e/0x300 [mdt] /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] mdt_object_lock+0x14/0x20 [mdt] /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] mdt_reint_open+0xc71/0x31a0 [mdt] /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] ? upcall_cache_get_entry+0x20e/0x8f0 [obdclass] /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] ? ucred_set_jobid+0x53/0x70 [mdt] /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] mdt_reint_rec+0x80/0x210 [mdt] /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt] /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] mdt_intent_reint+0x162/0x430 [mdt] /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] mdt_intent_policy+0x43e/0xc70 [mdt] /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] ? ldlm_resource_get+0x9f/0xa30 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] ldlm_lock_enqueue+0x387/0x970 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] ldlm_handle_enqueue0+0x9c3/0x1680 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] ? 
ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] ? __wake_up_common+0x58/0x90 /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] kthread+0xcf/0xe0 /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] ret_from_fork+0x58/0x90 /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: /var/log/messages-20180407.gz:Apr 7 03:17:38 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523035058.143589 /var/log/messages-20180407.gz:Apr 7 03:17:55 warble1 kernel: Lustre: 249265:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bbe7e1c200 x1597017121369360/t0(0) o35->dc32fca6-f266-16ac-e214-b77ef785a309@192.168.44.199@o2ib44:5/0 lens 512/696 e 24 to 0 dl 1523035080 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180407.gz:Apr 7 03:17:56 warble1 kernel: Lustre: 166300:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bdb8402400 x1597017121380768/t0(0) o35->dc32fca6-f266-16ac-e214-b77ef785a309@192.168.44.199@o2ib44:6/0 lens 512/696 e 23 to 0 dl 1523035081 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180407.gz:Apr 7 03:17:56 warble1 kernel: Lustre: 166300:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 3 previous similar messages /var/log/messages-20180407.gz:Apr 7 03:18:01 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 
192.168.44.199@o2ib44) /var/log/messages-20180407.gz:Apr 7 03:18:01 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180407.gz:Apr 7 03:18:58 warble1 kernel: Lustre: 15389:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff8858f8b25050 x1594462407856208/t0(0) o35->02c65d3d-616e-a18f-3474-eecff1778104@192.168.44.14@o2ib44:68/0 lens 512/696 e 24 to 0 dl 1523035143 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180407.gz:Apr 7 03:22:41 warble1 kernel: Lustre: 88312:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff885ac333a400 x1595015336511344/t0(0) o101->648c8bb7-edb8-ecdf-617f-773ccb2226ea@192.168.44.101@o2ib44:290/0 lens 896/3512 e 0 to 0 dl 1523035365 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180407.gz:Apr 7 03:22:46 warble1 kernel: Lustre: dagg-MDT0001: Client 648c8bb7-edb8-ecdf-617f-773ccb2226ea (at 192.168.44.101@o2ib44) reconnecting /var/log/messages-20180407.gz:Apr 7 03:22:46 warble1 kernel: Lustre: Skipped 3 previous similar messages /var/log/messages-20180407.gz:Apr 7 03:28:02 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44) /var/log/messages-20180407.gz:Apr 7 03:28:02 warble1 kernel: Lustre: Skipped 3 previous similar messages /var/log/messages-20180407.gz:Apr 7 03:30:11 warble1 kernel: LNet: Service thread pid 364505 was inactive for 1201.45s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180407.gz:Apr 7 03:30:11 warble1 kernel: Pid: 364505, comm: mdt00_043 /var/log/messages-20180407.gz:Apr 7 03:30:11 warble1 kernel: #012Call Trace: /var/log/messages-20180407.gz:Apr 7 03:30:11 warble1 kernel: [] ? 
ldlm_expired_completion_wait+0x0/0x240 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:30:11 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180407.gz:Apr 7 03:30:11 warble1 kernel: [] ldlm_completion_ast+0x63d/0x920 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:30:11 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180407.gz:Apr 7 03:30:11 warble1 kernel: [] ldlm_cli_enqueue_local+0x230/0x860 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:30:11 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:30:11 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt] /var/log/messages-20180407.gz:Apr 7 03:30:11 warble1 kernel: [] mdt_object_local_lock+0x452/0xaf0 [mdt] /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt] /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] ? lu_object_find_at+0x95/0x290 [obdclass] /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] mdt_object_lock_internal+0x5e/0x300 [mdt] /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] mdt_object_lock+0x14/0x20 [mdt] /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] mdt_reint_open+0xc71/0x31a0 [mdt] /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] ? upcall_cache_get_entry+0x20e/0x8f0 [obdclass] /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] ? 
ucred_set_jobid+0x53/0x70 [mdt] /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] mdt_reint_rec+0x80/0x210 [mdt] /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt] /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] mdt_intent_reint+0x162/0x430 [mdt] /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] mdt_intent_policy+0x43e/0xc70 [mdt] /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] ? ldlm_resource_get+0x9f/0xa30 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] ldlm_lock_enqueue+0x387/0x970 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] ldlm_handle_enqueue0+0x9c3/0x1680 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] ? __wake_up_common+0x58/0x90 /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] kthread+0xcf/0xe0 /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] ret_from_fork+0x58/0x90 /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: /var/log/messages-20180407.gz:Apr 7 03:30:12 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523035812.364505 /var/log/messages-20180407.gz:Apr 7 03:35:22 warble1 kernel: Lustre: dagg-MDT0001: Client 648c8bb7-edb8-ecdf-617f-773ccb2226ea (at 192.168.44.101@o2ib44) reconnecting /var/log/messages-20180407.gz:Apr 7 03:35:22 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180407.gz:Apr 7 03:38:03 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44) /var/log/messages-20180407.gz:Apr 7 03:38:03 warble1 kernel: Lustre: Skipped 1 previous similar message /var/log/messages-20180407.gz:Apr 7 03:47:58 warble1 kernel: Lustre: dagg-MDT0001: Client 648c8bb7-edb8-ecdf-617f-773ccb2226ea (at 192.168.44.101@o2ib44) reconnecting /var/log/messages-20180407.gz:Apr 7 03:47:58 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180407.gz:Apr 7 03:48:04 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44) /var/log/messages-20180407.gz:Apr 7 03:48:04 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: LNet: Service thread pid 23977 was inactive for 200.54s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: Pid: 23977, comm: mdt_rdpg00_008 /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: #012Call Trace: /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: [] ? 
recalc_sigpending+0x1b/0x50 /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: [] top_trans_stop+0x46b/0x970 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: [] mdd_trans_stop+0x24/0x40 [mdd] /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: [] mdd_attr_set+0x59a/0xb50 [mdd] /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: [] mdt_mfd_close+0x1a3/0x610 [mdt] /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: [] ? __wake_up_common+0x58/0x90 /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: [] ? 
ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: [] kthread+0xcf/0xe0 /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: [] ret_from_fork+0x58/0x90 /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: /var/log/messages-20180407.gz:Apr 7 03:48:34 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523036914.23977 /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: LNet: Service thread pid 127653 was inactive for 200.54s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: Pid: 127653, comm: mdt_rdpg00_049 /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: #012Call Trace: /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: [] top_trans_stop+0x46b/0x970 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: [] mdd_trans_stop+0x24/0x40 [mdd] /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: [] mdd_attr_set+0x59a/0xb50 [mdd] /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: [] ? 
lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: [] mdt_mfd_close+0x1a3/0x610 [mdt] /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: [] ? __wake_up_common+0x58/0x90 /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: [] kthread+0xcf/0xe0 /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: [] ret_from_fork+0x58/0x90 /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: /var/log/messages-20180407.gz:Apr 7 03:49:39 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523036979.127653 /var/log/messages-20180408.gz:Apr 7 03:55:09 warble1 kernel: Lustre: 143632:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885e93a50450 x1594462407981744/t0(0) o35->02c65d3d-616e-a18f-3474-eecff1778104@192.168.44.14@o2ib44:729/0 lens 512/696 e 24 to 0 dl 1523037314 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180408.gz:Apr 7 03:56:13 warble1 kernel: Lustre: 143632:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bd7a4de000 x1594462408051392/t0(0) o35->02c65d3d-616e-a18f-3474-eecff1778104@192.168.44.14@o2ib44:38/0 lens 512/696 e 8 to 0 dl 1523037378 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180408.gz:Apr 7 03:58:05 warble1 kernel: Lustre: dagg-MDT0002: Client dc32fca6-f266-16ac-e214-b77ef785a309 (at 192.168.44.199@o2ib44) reconnecting /var/log/messages-20180408.gz:Apr 7 03:58:05 warble1 kernel: Lustre: Skipped 3 previous similar messages /var/log/messages-20180408.gz:Apr 7 03:58:05 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44) /var/log/messages-20180408.gz:Apr 7 03:58:05 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180408.gz:Apr 7 04:08:06 warble1 kernel: Lustre: dagg-MDT0002: Client dc32fca6-f266-16ac-e214-b77ef785a309 (at 192.168.44.199@o2ib44) reconnecting /var/log/messages-20180408.gz:Apr 7 04:08:06 warble1 kernel: Lustre: Skipped 3 previous similar messages /var/log/messages-20180408.gz:Apr 7 04:08:06 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44) /var/log/messages-20180408.gz:Apr 7 04:08:06 warble1 kernel: Lustre: Skipped 
3 previous similar messages /var/log/messages-20180408.gz:Apr 7 04:18:07 warble1 kernel: Lustre: dagg-MDT0002: Client dc32fca6-f266-16ac-e214-b77ef785a309 (at 192.168.44.199@o2ib44) reconnecting /var/log/messages-20180408.gz:Apr 7 04:18:07 warble1 kernel: Lustre: Skipped 3 previous similar messages -- /var/log/messages-20180408.gz:Apr 7 15:19:16 warble1 kernel: Lustre: dagg-MDT0002: Client dc32fca6-f266-16ac-e214-b77ef785a309 (at 192.168.44.199@o2ib44) reconnecting /var/log/messages-20180408.gz:Apr 7 15:19:16 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180408.gz:Apr 7 15:19:16 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44) /var/log/messages-20180408.gz:Apr 7 15:19:16 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180408.gz:Apr 7 15:29:17 warble1 kernel: Lustre: dagg-MDT0002: Client dc32fca6-f266-16ac-e214-b77ef785a309 (at 192.168.44.199@o2ib44) reconnecting /var/log/messages-20180408.gz:Apr 7 15:29:17 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180408.gz:Apr 7 15:29:17 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44) /var/log/messages-20180408.gz:Apr 7 15:29:17 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180408.gz:Apr 7 15:39:18 warble1 kernel: Lustre: dagg-MDT0002: Client dc32fca6-f266-16ac-e214-b77ef785a309 (at 192.168.44.199@o2ib44) reconnecting /var/log/messages-20180408.gz:Apr 7 15:39:18 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180408.gz:Apr 7 15:39:18 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44) /var/log/messages-20180408.gz:Apr 7 15:39:18 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180408.gz:Apr 7 15:49:19 warble1 
kernel: Lustre: dagg-MDT0002: Client dc32fca6-f266-16ac-e214-b77ef785a309 (at 192.168.44.199@o2ib44) reconnecting /var/log/messages-20180408.gz:Apr 7 15:49:19 warble1 kernel: Lustre: Skipped 1 previous similar message /var/log/messages-20180408.gz:Apr 7 15:49:19 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44) /var/log/messages-20180408.gz:Apr 7 15:49:19 warble1 kernel: Lustre: Skipped 1 previous similar message /var/log/messages-20180408.gz:Apr 7 15:59:20 warble1 kernel: Lustre: dagg-MDT0002: Client dc32fca6-f266-16ac-e214-b77ef785a309 (at 192.168.44.199@o2ib44) reconnecting /var/log/messages-20180408.gz:Apr 7 15:59:20 warble1 kernel: Lustre: Skipped 1 previous similar message /var/log/messages-20180408.gz:Apr 7 15:59:20 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44) /var/log/messages-20180408.gz:Apr 7 15:59:20 warble1 kernel: Lustre: Skipped 1 previous similar message /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: LNet: Service thread pid 364509 was inactive for 200.47s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: Pid: 364509, comm: mdt00_046 /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: #012Call Trace: /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] schedule_timeout+0x174/0x2c0 /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] ? process_timeout+0x0/0x10 /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] ? 
ldlm_expired_completion_wait+0x0/0x240 [ptlrpc] /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc] /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] ? ldlm_resource_add_lock+0x6a/0x1b0 [ptlrpc] /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] ldlm_cli_enqueue_fini+0x938/0xdb0 [ptlrpc] /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass] /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] ldlm_cli_enqueue+0x6c2/0x810 [ptlrpc] /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] ? mdt_remote_blocking_ast+0x0/0x590 [mdt] /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] osp_md_object_lock+0x172/0x2e0 [osp] /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] lod_object_lock+0xf0/0x950 [lod] /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] ? htable_lookup+0x9d/0x170 [obdclass] /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] mdd_object_lock+0x3b/0xd0 [mdd] /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] mdt_remote_object_lock+0x1e2/0x710 [mdt] /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x36a/0x860 [mdt] /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] ? 
strlcpy+0x42/0x60 /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] mdt_reint_rename+0x13/0x20 [mdt] /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] mdt_reint_rec+0x80/0x210 [mdt] /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt] /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] mdt_reint+0x67/0x140 [mdt] /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc] /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] ? __wake_up_common+0x58/0x90 /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] kthread+0xcf/0xe0 /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] ret_from_fork+0x58/0x90 /var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: [] ? 
kthread+0x0/0xe0
/var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel:
/var/log/messages-20180408.gz:Apr 7 16:08:15 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523081295.364509
/var/log/messages-20180408.gz:Apr 7 16:09:21 warble1 kernel: Lustre: dagg-MDT0002: Client dc32fca6-f266-16ac-e214-b77ef785a309 (at 192.168.44.199@o2ib44) reconnecting
/var/log/messages-20180408.gz:Apr 7 16:09:21 warble1 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180408.gz:Apr 7 16:09:21 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 95c61da7-ad27-b822-2f72-0782ea5197ce (at 192.168.44.199@o2ib44)
/var/log/messages-20180408.gz:Apr 7 16:09:21 warble1 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180408.gz:Apr 7 16:09:54 warble2 kernel: Lustre: dagg-MDT0000: Received new LWP connection from 192.168.44.21@o2ib44, removing former export from same NID
/var/log/messages-20180408.gz:Apr 7 16:09:54 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to 192.168.44.21@o2ib44 (at 192.168.44.21@o2ib44)
/var/log/messages-20180408.gz:Apr 7 16:09:55 warble1 kernel: Lustre: dagg-MDT0000-osp-MDT0001: Connection to dagg-MDT0000 (at 192.168.44.22@o2ib44) was lost; in progress operations using this service will wait for recovery to complete
/var/log/messages-20180408.gz:Apr 7 16:09:55 warble1 kernel: LustreError: 364509:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1523081094, 300s ago), entering recovery for dagg-MDT0000_UUID@192.168.44.22@o2ib44 ns: dagg-MDT0000-osp-MDT0001 lock: ffff885e17acba00/0xe8af92334ef209b6 lrc: 4/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 2 type: IBT flags: 0x1000001000000 nid: local remote: 0xaea67cc1322b9dc0 expref: -99 pid: 364509 timeout: 0 lvb_type: 0
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: LNet: Service thread pid 102678 was inactive for 200.46s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: Pid: 102678, comm: mdt01_111
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: #012Call Trace:
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass]
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] schedule_timeout+0x174/0x2c0
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] ? process_timeout+0x0/0x10
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] ? cfs_block_sigsinv+0x71/0xa0 [libcfs]
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc]
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] ldlm_cli_enqueue_local+0x230/0x860 [ptlrpc]
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x67c/0x860 [mdt]
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] mdt_reint_rename+0x13/0x20 [mdt]
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] mdt_reint_rec+0x80/0x210 [mdt]
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc]
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] ? __wake_up_common+0x58/0x90
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] kthread+0xcf/0xe0
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] ret_from_fork+0x58/0x90
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel:
/var/log/messages-20180408.gz:Apr 7 16:10:19 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1523081419.102678
/var/log/messages-20180408.gz:Apr 7 16:11:58 warble2 kernel: LustreError: 102678:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1523081218, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff88bae3edba00/0xaea67cc1322ba465 lrc: 3/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 4 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 102678 timeout: 0 lvb_type: 0
/var/log/messages-20180408.gz:Apr 7 16:11:58 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1523081518.102678
/var/log/messages-20180408.gz:Apr 7 16:14:49 warble1 kernel: Lustre: 150602:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885c96c7b000 x1594462426718320/t0(0) o36->02c65d3d-616e-a18f-3474-eecff1778104@192.168.44.14@o2ib44:564/0 lens 768/3128 e 24 to 0 dl 1523081694 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180408.gz:Apr 7 16:16:53 warble2 kernel: Lustre: 102661:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bc0642bf00 x1594465554616000/t0(0) o36->bd95096f-2790-2a0e-d286-cdfd62034abc@192.168.44.13@o2ib44:688/0 lens 752/3128 e 24 to 0 dl 1523081818 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180408.gz:Apr 7 16:16:59 warble2 kernel: Lustre: dagg-MDT0000: Client bd95096f-2790-2a0e-d286-cdfd62034abc (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180408.gz:Apr 7 16:16:59 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to fcef106e-d6cd-4e67-23c7-bc318374c9d5 (at 192.168.44.13@o2ib44)
/var/log/messages-20180408.gz:Apr 7 16:19:22 warble1 kernel: Lustre: dagg-MDT0002: Client dc32fca6-f266-16ac-e214-b77ef785a309 (at 192.168.44.199@o2ib44) reconnecting
--
/var/log/messages-20180410.gz:Apr 10 01:49:27 warble1 kernel: Lustre: images-MDT0000: Imperative Recovery not enabled, recovery window 300-900
/var/log/messages-20180410.gz:Apr 10 01:49:27 warble1 Lustre(warble1-images-MDT0)[8185]: INFO: warble1-images-MDT0-pool/MDT0 mounted successfully
/var/log/messages-20180410.gz:Apr 10 01:49:27 warble1 stonith-ng[15435]: notice: On loss of CCM Quorum: Ignore
/var/log/messages-20180410.gz:Apr 10 01:49:27 warble2 stonith-ng[30870]: notice: On loss of CCM Quorum: Ignore
/var/log/messages-20180410.gz:Apr 10 01:49:27 warble1 stonith-ng[15435]: notice: On loss of CCM Quorum: Ignore
/var/log/messages-20180410.gz:Apr 10 01:49:27 warble2 stonith-ng[30870]: notice: On loss of CCM Quorum: Ignore
/var/log/messages-20180410.gz:Apr 10 01:49:27 warble1 kernel: Lustre: images-MDT0000: Will be in recovery for at least 5:00, or until 123 clients reconnect
/var/log/messages-20180410.gz:Apr 10 01:49:29 warble1 stonith-ng[15435]: notice: On loss of CCM Quorum: Ignore
/var/log/messages-20180410.gz:Apr 10 01:49:29 warble2 stonith-ng[30870]: notice: On loss of CCM Quorum: Ignore
/var/log/messages-20180410.gz:Apr 10 01:49:32 warble1 kernel: Lustre: 23312:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1523288967/real 1523288967] req@ffff88bde868a400 x1597281413784960/t0(0) o8->images-OST0000-osc-MDT0000@192.168.44.51@o2ib44:28/4 lens 520/544 e 0 to 1 dl 1523288972 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
/var/log/messages-20180410.gz:Apr 10 01:49:52 warble1 kernel: LustreError: 167-0: dagg-MDT0000-lwp-MDT0001: This client was evicted by dagg-MDT0000; in progress operations using this service will fail.
/var/log/messages-20180410.gz:Apr 10 01:49:52 warble2 kernel: Lustre: dagg-MDT0000: Recovery over after 0:31, of 125 clients 125 recovered and 0 were evicted.
/var/log/messages-20180410.gz:Apr 10 01:50:11 warble1 kernel: Lustre: images-MDT0000: Connection restored to (at 192.168.44.107@o2ib44)
/var/log/messages-20180410.gz:Apr 10 01:50:11 warble1 kernel: Lustre: Skipped 218 previous similar messages
/var/log/messages-20180410.gz:Apr 10 01:50:11 warble1 kernel: Lustre: images-MDT0000: Recovery over after 0:44, of 123 clients 123 recovered and 0 were evicted.
/var/log/messages-20180410.gz:Apr 10 01:50:11 warble1 kernel: Lustre: Skipped 2 previous similar messages
/var/log/messages-20180410.gz:Apr 10 02:01:54 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.199@o2ib44)
/var/log/messages-20180410.gz:Apr 10 02:01:54 warble1 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180410.gz:Apr 10 02:01:54 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to 28d167c4-3465-c916-3ef7-d28d070b2432 (at 192.168.44.199@o2ib44)
/var/log/messages-20180410.gz:Apr 10 02:01:54 warble2 kernel: Lustre: Skipped 144 previous similar messages
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: LNet: Service thread pid 23401 was inactive for 200.46s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: Pid: 23401, comm: mdt_rdpg00_000
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: #012Call Trace:
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: [] top_trans_stop+0x46b/0x970 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: [] mdd_trans_stop+0x24/0x40 [mdd]
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: [] mdd_attr_set+0x59a/0xb50 [mdd]
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: [] mdt_mfd_close+0x1a3/0x610 [mdt]
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: [] ? __wake_up_common+0x58/0x90
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: [] kthread+0xcf/0xe0
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: [] ret_from_fork+0x58/0x90
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel:
/var/log/messages-20180410.gz:Apr 10 02:07:16 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523290036.23401
/var/log/messages-20180410.gz:Apr 10 02:13:51 warble1 kernel: Lustre: 24747:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885df45f8c00 x1595015449719920/t0(0) o35->648c8bb7-edb8-ecdf-617f-773ccb2226ea@192.168.44.101@o2ib44:171/0 lens 512/696 e 24 to 0 dl 1523290436 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180410.gz:Apr 10 02:13:57 warble1 kernel: Lustre: dagg-MDT0002: Client 648c8bb7-edb8-ecdf-617f-773ccb2226ea (at 192.168.44.101@o2ib44) reconnecting
/var/log/messages-20180410.gz:Apr 10 02:13:57 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.101@o2ib44)
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: LNet: Service thread pid 26661 was inactive for 200.48s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: Pid: 26661, comm: mdt_rdpg01_004
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: #012Call Trace:
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: [] top_trans_stop+0x46b/0x970 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: [] mdd_trans_stop+0x24/0x40 [mdd]
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: [] mdd_attr_set+0x59a/0xb50 [mdd]
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: [] mdt_mfd_close+0x1a3/0x610 [mdt]
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: [] ? __wake_up_common+0x58/0x90
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: [] kthread+0xcf/0xe0
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: [] ret_from_fork+0x58/0x90
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180410.gz:Apr 10 02:18:52 warble1 kernel:
/var/log/messages-20180410.gz:Apr 10 02:18:53 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523290732.26661
/var/log/messages-20180410.gz:Apr 10 02:22:53 warble1 kernel: LNet: Service thread pid 29145 was inactive for 312.86s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180410.gz:Apr 10 02:22:53 warble1 kernel: Pid: 29145, comm: mdt_rdpg01_009
/var/log/messages-20180410.gz:Apr 10 02:22:53 warble1 kernel: #012Call Trace:
/var/log/messages-20180410.gz:Apr 10 02:22:53 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180410.gz:Apr 10 02:22:53 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180410.gz:Apr 10 02:22:54 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:22:54 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180410.gz:Apr 10 02:22:54 warble1 kernel: [] top_trans_stop+0x46b/0x970 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:22:54 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180410.gz:Apr 10 02:22:54 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:22:54 warble1 kernel: [] mdd_trans_stop+0x24/0x40 [mdd]
/var/log/messages-20180410.gz:Apr 10 02:22:54 warble1 kernel: [] mdd_attr_set+0x59a/0xb50 [mdd]
/var/log/messages-20180410.gz:Apr 10 02:22:54 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:22:54 warble1 kernel: [] mdt_mfd_close+0x1a3/0x610 [mdt]
/var/log/messages-20180410.gz:Apr 10 02:22:54 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180410.gz:Apr 10 02:22:54 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180410.gz:Apr 10 02:22:54 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:22:54 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:22:54 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:22:54 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180410.gz:Apr 10 02:22:54 warble1 kernel: [] ? __wake_up_common+0x58/0x90
/var/log/messages-20180410.gz:Apr 10 02:22:54 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:22:54 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:22:54 warble1 kernel: [] kthread+0xcf/0xe0
/var/log/messages-20180410.gz:Apr 10 02:22:54 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180410.gz:Apr 10 02:22:54 warble1 kernel: [] ret_from_fork+0x58/0x90
/var/log/messages-20180410.gz:Apr 10 02:22:54 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180410.gz:Apr 10 02:22:54 warble1 kernel:
/var/log/messages-20180410.gz:Apr 10 02:22:54 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523290974.29145
/var/log/messages-20180410.gz:Apr 10 02:23:58 warble1 kernel: Lustre: dagg-MDT0002: Client 648c8bb7-edb8-ecdf-617f-773ccb2226ea (at 192.168.44.101@o2ib44) reconnecting
/var/log/messages-20180410.gz:Apr 10 02:23:58 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.101@o2ib44)
/var/log/messages-20180410.gz:Apr 10 02:25:27 warble1 kernel: Lustre: 28502:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bde51b4500 x1594466097554704/t0(0) o35->bd95096f-2790-2a0e-d286-cdfd62034abc@192.168.44.13@o2ib44:112/0 lens 512/696 e 24 to 0 dl 1523291132 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180410.gz:Apr 10 02:25:33 warble1 kernel: Lustre: dagg-MDT0002: Client bd95096f-2790-2a0e-d286-cdfd62034abc (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180410.gz:Apr 10 02:25:33 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.13@o2ib44)
/var/log/messages-20180410.gz:Apr 10 02:27:37 warble1 kernel: Lustre: 28502:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (4/4), not sending early reply#012 req@ffff88bde5965a00 x1594466097686896/t0(0) o35->bd95096f-2790-2a0e-d286-cdfd62034abc@192.168.44.13@o2ib44:241/0 lens 512/696 e 4 to 0 dl 1523291261 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180410.gz:Apr 10 02:30:21 warble1 kernel: LustreError: 11-0: dagg-OST0000-osc-MDT0001: operation ost_statfs to node 192.168.44.32@o2ib44 failed: rc = -107
/var/log/messages-20180410.gz:Apr 10 02:30:21 warble1 kernel: Lustre: dagg-OST0000-osc-MDT0001: Connection to dagg-OST0000 (at 192.168.44.32@o2ib44) was lost; in progress operations using this service will wait for recovery to complete
/var/log/messages-20180410.gz:Apr 10 02:30:21 warble1 kernel: Lustre: Skipped 2 previous similar messages
/var/log/messages-20180410.gz:Apr 10 02:30:21 warble2 kernel: LustreError: 11-0: dagg-OST0001-osc-MDT0000: operation ost_statfs to node 192.168.44.32@o2ib44 failed: rc = -107
/var/log/messages-20180410.gz:Apr 10 02:30:21 warble2 kernel: Lustre: dagg-OST0000-osc-MDT0000: Connection to dagg-OST0000 (at 192.168.44.32@o2ib44) was lost; in progress operations using this service will wait for recovery to complete
/var/log/messages-20180410.gz:Apr 10 02:30:21 warble2 kernel: LustreError: Skipped 1 previous similar message
--
/var/log/messages-20180410.gz:Apr 10 02:31:08 warble2 kernel: LustreError: Skipped 1 previous similar message
/var/log/messages-20180410.gz:Apr 10 02:31:09 warble1 kernel: Lustre: MGS: Connection restored to 192.168.44.51@o2ib44 (at 192.168.44.51@o2ib44)
/var/log/messages-20180410.gz:Apr 10 02:31:09 warble1 kernel: Lustre: Skipped 3 previous similar messages
/var/log/messages-20180410.gz:Apr 10 02:31:09 warble1 kernel: LustreError: 137-5: home-MDT0000_UUID: not available for connect from 192.168.44.51@o2ib44 (no target). If you are running an HA pair check that the target is mounted on the other server.
/var/log/messages-20180410.gz:Apr 10 02:31:09 warble1 kernel: LustreError: Skipped 56 previous similar messages
/var/log/messages-20180410.gz:Apr 10 02:31:10 warble1 kernel: LustreError: 11-0: images-OST0000-osc-MDT0000: operation ost_statfs to node 192.168.44.52@o2ib44 failed: rc = -107
/var/log/messages-20180410.gz:Apr 10 02:31:10 warble1 kernel: LustreError: Skipped 3 previous similar messages
/var/log/messages-20180410.gz:Apr 10 02:31:10 warble1 kernel: Lustre: images-OST0000-osc-MDT0000: Connection to images-OST0000 (at 192.168.44.52@o2ib44) was lost; in progress operations using this service will wait for recovery to complete
/var/log/messages-20180410.gz:Apr 10 02:31:10 warble1 kernel: Lustre: Skipped 3 previous similar messages
/var/log/messages-20180410.gz:Apr 10 02:31:20 warble2 kernel: Lustre: apps-OST0000-osc-MDT0000: Connection restored to 192.168.44.51@o2ib44 (at 192.168.44.51@o2ib44)
/var/log/messages-20180410.gz:Apr 10 02:31:20 warble2 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180410.gz:Apr 10 02:31:35 warble2 kernel: Lustre: home-MDT0000: Connection restored to 192.168.44.51@o2ib44 (at 192.168.44.51@o2ib44)
/var/log/messages-20180410.gz:Apr 10 02:31:35 warble2 kernel: Lustre: Skipped 2 previous similar messages
/var/log/messages-20180410.gz:Apr 10 02:33:59 warble1 kernel: Lustre: dagg-MDT0002: Client 648c8bb7-edb8-ecdf-617f-773ccb2226ea (at 192.168.44.101@o2ib44) reconnecting
/var/log/messages-20180410.gz:Apr 10 02:33:59 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.101@o2ib44)
/var/log/messages-20180410.gz:Apr 10 02:33:59 warble1 kernel: Lustre: Skipped 2 previous similar messages
/var/log/messages-20180410.gz:Apr 10 02:35:34 warble1 kernel: Lustre: dagg-MDT0002: Client bd95096f-2790-2a0e-d286-cdfd62034abc (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180410.gz:Apr 10 02:35:34 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.13@o2ib44)
/var/log/messages-20180410.gz:Apr 10 02:44:00 warble1 kernel: Lustre: dagg-MDT0002: Client 648c8bb7-edb8-ecdf-617f-773ccb2226ea (at 192.168.44.101@o2ib44) reconnecting
/var/log/messages-20180410.gz:Apr 10 02:44:00 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.101@o2ib44)
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: LNet: Service thread pid 135207 was inactive for 200.56s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: Pid: 135207, comm: mdt00_033
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: #012Call Trace:
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: [] top_trans_stop+0x46b/0x970 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: [] mdd_trans_stop+0x24/0x40 [mdd]
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: [] mdd_rename+0x4d1/0x14a0 [mdd]
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: [] mdt_reint_rename_internal.isra.36+0x1664/0x20c0 [mdt]
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x19b/0x860 [mdt]
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: [] ? strlcpy+0x42/0x60
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: [] mdt_reint_rename+0x13/0x20 [mdt]
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: [] mdt_reint_rec+0x80/0x210 [mdt]
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: [] ? __wake_up_common+0x58/0x90
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:45:32 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 02:45:33 warble1 kernel: [] kthread+0xcf/0xe0
/var/log/messages-20180410.gz:Apr 10 02:45:33 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180410.gz:Apr 10 02:45:33 warble1 kernel: [] ret_from_fork+0x58/0x90
/var/log/messages-20180410.gz:Apr 10 02:45:33 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180410.gz:Apr 10 02:45:33 warble1 kernel:
/var/log/messages-20180410.gz:Apr 10 02:45:33 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523292333.135207
/var/log/messages-20180410.gz:Apr 10 02:52:07 warble1 kernel: Lustre: 134863:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885df13dc800 x1594700045083424/t0(0) o36->ef1edffb-bdb1-02ff-aa7c-d4975a7fb7d0@192.168.44.200@o2ib44:202/0 lens 808/3128 e 24 to 0 dl 1523292732 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180410.gz:Apr 10 02:52:13 warble1 kernel: Lustre: dagg-MDT0002: Client ef1edffb-bdb1-02ff-aa7c-d4975a7fb7d0 (at 192.168.44.200@o2ib44) reconnecting
/var/log/messages-20180410.gz:Apr 10 02:52:13 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.200@o2ib44)
/var/log/messages-20180410.gz:Apr 10 02:54:01 warble1 kernel: Lustre: dagg-MDT0002: Client 648c8bb7-edb8-ecdf-617f-773ccb2226ea (at 192.168.44.101@o2ib44) reconnecting
/var/log/messages-20180410.gz:Apr 10 02:54:23 warble1 stonith-ng[15435]: notice: On loss of CCM Quorum: Ignore
/var/log/messages-20180410.gz:Apr 10 02:54:23 warble2 stonith-ng[30870]: notice: On loss of CCM Quorum: Ignore
/var/log/messages-20180410.gz:Apr 10 02:54:23 warble1 stonith-ng[15435]: notice: On loss of CCM Quorum: Ignore
/var/log/messages-20180410.gz:Apr 10 02:54:23 warble2 stonith-ng[30870]: notice: On loss of CCM Quorum: Ignore
/var/log/messages-20180410.gz:Apr 10 02:54:23 warble1 stonith-ng[15435]: notice: On loss of CCM Quorum: Ignore
--
/var/log/messages-20180410.gz:Apr 10 02:56:11 warble1 kernel: LustreError: Skipped 55 previous similar messages
/var/log/messages-20180410.gz:Apr 10 02:56:35 warble2 kernel: LustreError: 137-5: dagg-MDT0000_UUID: not available for connect from 192.168.44.192@o2ib44 (no target). If you are running an HA pair check that the target is mounted on the other server.
/var/log/messages-20180410.gz:Apr 10 02:56:35 warble2 kernel: LustreError: Skipped 164 previous similar messages
/var/log/messages-20180410.gz:Apr 10 02:56:47 warble1 kernel: Lustre: dagg-MDT0000: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900
/var/log/messages-20180410.gz:Apr 10 02:56:49 warble1 kernel: Lustre: dagg-MDT0000: Will be in recovery for at least 2:30, or until 124 clients reconnect
/var/log/messages-20180410.gz:Apr 10 02:57:12 warble1 kernel: LustreError: 167-0: dagg-MDT0000-lwp-MDT0002: This client was evicted by dagg-MDT0000; in progress operations using this service will fail.
/var/log/messages-20180410.gz:Apr 10 02:57:12 warble1 kernel: LustreError: Skipped 1 previous similar message
/var/log/messages-20180410.gz:Apr 10 02:57:37 warble1 kernel: Lustre: dagg-MDT0000: Recovery over after 0:48, of 124 clients 124 recovered and 0 were evicted.
/var/log/messages-20180410.gz:Apr 10 03:02:14 warble1 kernel: Lustre: dagg-MDT0002: Client ef1edffb-bdb1-02ff-aa7c-d4975a7fb7d0 (at 192.168.44.200@o2ib44) reconnecting
/var/log/messages-20180410.gz:Apr 10 03:02:14 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.200@o2ib44)
/var/log/messages-20180410.gz:Apr 10 03:02:14 warble1 kernel: Lustre: Skipped 150 previous similar messages
/var/log/messages-20180410.gz:Apr 10 03:05:43 warble1 kernel: Lustre: 84557:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff885e03177b00 x1594462458774880/t0(0) o36->02c65d3d-616e-a18f-3474-eecff1778104@192.168.44.14@o2ib44:263/0 lens 624/568 e 0 to 0 dl 1523293548 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180410.gz:Apr 10 03:05:49 warble1 kernel: Lustre: dagg-MDT0002: Client 02c65d3d-616e-a18f-3474-eecff1778104 (at 192.168.44.14@o2ib44) reconnecting
/var/log/messages-20180410.gz:Apr 10 03:05:49 warble1 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180410.gz:Apr 10 03:12:04 warble1 kernel: LustreError: 135208:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1523293624, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff88bde274a200/0x25e3f37caf912605 lrc: 3/0,1 mode: --/CW res: [0x200011570:0x5:0x0].0x0 bits 0x2 rrc: 6 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 135208 timeout: 0 lvb_type: 0
/var/log/messages-20180410.gz:Apr 10 03:12:04 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523293924.135208
/var/log/messages-20180410.gz:Apr 10 03:12:15 warble1 kernel: Lustre: dagg-MDT0002: Client ef1edffb-bdb1-02ff-aa7c-d4975a7fb7d0 (at 192.168.44.200@o2ib44) reconnecting
/var/log/messages-20180410.gz:Apr 10 03:12:15 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.200@o2ib44)
/var/log/messages-20180410.gz:Apr 10 03:12:15 warble1 kernel: Lustre: Skipped 2 previous similar messages
/var/log/messages-20180410.gz:Apr 10 03:12:49 warble1 kernel: LustreError: 135213:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1523293669, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff885deb8fba00/0x25e3f37caf9307ca lrc: 3/1,0 mode: --/PR res: [0x200011570:0x5:0x0].0x0 bits 0x13 rrc: 8 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 135213 timeout: 0 lvb_type: 0
/var/log/messages-20180410.gz:Apr 10 03:13:17 warble1 kernel: LNet: Service thread pid 84553 was inactive for 1204.18s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180410.gz:Apr 10 03:13:17 warble1 kernel: Pid: 84553, comm: mdt00_011
/var/log/messages-20180410.gz:Apr 10 03:13:17 warble1 kernel: #012Call Trace:
/var/log/messages-20180410.gz:Apr 10 03:13:17 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180410.gz:Apr 10 03:13:17 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180410.gz:Apr 10 03:13:17 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 03:13:17 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180410.gz:Apr 10 03:13:17 warble1 kernel: [] top_trans_stop+0x46b/0x970 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 03:13:17 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180410.gz:Apr 10 03:13:17 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 03:13:17 warble1 kernel: [] mdd_trans_stop+0x24/0x40 [mdd]
/var/log/messages-20180410.gz:Apr 10 03:13:17 warble1 kernel: [] mdd_create+0x817/0x1320 [mdd]
/var/log/messages-20180410.gz:Apr 10 03:13:17 warble1 kernel: [] mdt_create+0x846/0xbb0 [mdt]
/var/log/messages-20180410.gz:Apr 10 03:13:17 warble1 kernel: [] ? ldlm_resource_putref+0x2a5/0x510 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 03:13:17 warble1 kernel: [] ? lprocfs_stats_lock+0x24/0xd0 [obdclass]
/var/log/messages-20180410.gz:Apr 10 03:13:17 warble1 kernel: [] mdt_reint_create+0x16b/0x350 [mdt]
/var/log/messages-20180410.gz:Apr 10 03:13:17 warble1 kernel: [] mdt_reint_rec+0x80/0x210 [mdt]
/var/log/messages-20180410.gz:Apr 10 03:13:17 warble1 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180410.gz:Apr 10 03:13:17 warble1 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages-20180410.gz:Apr 10 03:13:17 warble1 kernel: [] tgt_request_handle+0x925/0x1370 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 03:13:17 warble1 kernel: [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 03:13:17 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 03:13:17 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180410.gz:Apr 10 03:13:18 warble1 kernel: [] ? __wake_up_common+0x58/0x90
/var/log/messages-20180410.gz:Apr 10 03:13:18 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 03:13:18 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180410.gz:Apr 10 03:13:18 warble1 kernel: [] kthread+0xcf/0xe0
/var/log/messages-20180410.gz:Apr 10 03:13:18 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180410.gz:Apr 10 03:13:18 warble1 kernel: [] ret_from_fork+0x58/0x90
/var/log/messages-20180410.gz:Apr 10 03:13:18 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180410.gz:Apr 10 03:13:18 warble1 kernel:
/var/log/messages-20180410.gz:Apr 10 03:13:18 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1523293998.84553
/var/log/messages-20180410.gz:Apr 10 03:13:18 warble1 kernel: Lustre: Failing over dagg-MDT0000
/var/log/messages-20180410.gz:Apr 10 03:13:18 warble1 kernel: LustreError: 135208:0:(lod_qos.c:208:lod_statfs_and_check()) dagg-MDT0000-mdtlov: statfs: rc = -108
/var/log/messages-20180410.gz:Apr 10 03:13:18 warble1 kernel: LustreError: 135213:0:(ldlm_lockd.c:1415:ldlm_handle_enqueue0()) ### lock on destroyed export ffff885de9da6800 ns: mdt-dagg-MDT0000_UUID lock: ffff885deb8fba00/0x25e3f37caf9307ca lrc: 3/0,0 mode: PR/PR res: [0x200011570:0x5:0x0].0x0 bits 0x13 rrc: 4 type: IBT flags: 0x50200000000000 nid: 192.168.44.14@o2ib44 remote: 0xcc4108bbaf402552 expref: 2 pid: 135213 timeout: 0 lvb_type: 0
/var/log/messages-20180410.gz:Apr 10 03:13:18 warble1 kernel: Lustre: dagg-MDT0000: Not available for connect from 192.168.44.122@o2ib44 (stopping)
/var/log/messages-20180410.gz:Apr 10 03:13:18 warble1 kernel: Lustre: Skipped 3 previous similar messages
/var/log/messages-20180410.gz:Apr 10 03:13:18 warble1 kernel: LustreError: 181886:0:(lod_dev.c:1672:lod_device_free()) ASSERTION( atomic_read(&lu->ld_ref) == 0 ) failed: lu is ffff885eb2e80000
/var/log/messages-20180410.gz:Apr 10 03:13:18 warble1 kernel: LustreError: 181886:0:(lod_dev.c:1672:lod_device_free()) LBUG /var/log/messages-20180410.gz:Apr 10 03:13:18 warble1 kernel: Pid: 181886, comm: umount /var/log/messages-20180410.gz:Apr 10 03:13:18 warble1 kernel: #012Call Trace: -- /var/log/messages-20180516.gz:May 15 06:41:27 warble1 kernel: Lustre: dagg-MDT0001: Connection restored to (at 10.8.49.155@tcp201) /var/log/messages-20180516.gz:May 15 06:41:27 warble1 kernel: Lustre: Skipped 1 previous similar message /var/log/messages-20180516.gz:May 15 06:41:27 warble1 kernel: Lustre: Skipped 4 previous similar messages /var/log/messages-20180516.gz:May 15 06:42:54 warble2 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.155@tcp201, removing former export from same NID /var/log/messages-20180516.gz:May 15 06:42:54 warble2 kernel: LustreError: 137-5: dagg-MDT0000_UUID: not available for connect from 10.8.49.155@tcp201 (no target). If you are running an HA pair check that the target is mounted on the other server. /var/log/messages-20180516.gz:May 15 06:42:54 warble2 kernel: Lustre: MGS: Connection restored to 7f932477-6f96-1e3a-e894-50a1018f3ee3 (at 10.8.49.155@tcp201) /var/log/messages-20180516.gz:May 15 06:43:30 warble1 kernel: Lustre: dagg-MDT0000: Client 22c84389-af1f-9970-0e9b-70c3a4861afd (at 10.8.49.155@tcp201) reconnecting /var/log/messages-20180516.gz:May 15 06:43:30 warble1 kernel: Lustre: dagg-MDT0001: Connection restored to (at 10.8.49.155@tcp201) /var/log/messages-20180516.gz:May 15 06:43:30 warble1 kernel: Lustre: Skipped 1 previous similar message /var/log/messages-20180516.gz:May 15 06:43:30 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180516.gz:May 15 06:47:00 warble2 kernel: Lustre: MGS: haven't heard from client afbbacdf-bb35-0df3-3052-cb3d661b024f (at 10.8.49.155@tcp201) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff885e1a22ac00, cur 1526330820 expire 1526330670 last 1526330593 /var/log/messages-20180516.gz:May 15 06:47:17 warble1 kernel: Lustre: dagg-MDT0002: haven't heard from client 22c84389-af1f-9970-0e9b-70c3a4861afd (at 10.8.49.155@tcp201) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88b6e2660400, cur 1526330837 expire 1526330687 last 1526330610 /var/log/messages-20180516.gz:May 15 06:47:17 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180516.gz:May 15 07:48:57 warble1 kernel: Lustre: dagg-MDT0001: Connection restored to (at 10.8.49.155@tcp201) /var/log/messages-20180516.gz:May 15 07:48:57 warble1 kernel: Lustre: Skipped 3 previous similar messages /var/log/messages-20180516.gz:May 15 07:48:57 warble2 kernel: LustreError: 137-5: dagg-MDT0000_UUID: not available for connect from 10.8.49.155@tcp201 (no target). If you are running an HA pair check that the target is mounted on the other server. /var/log/messages-20180516.gz:May 15 07:48:57 warble2 kernel: LustreError: Skipped 2 previous similar messages /var/log/messages-20180516.gz:May 15 07:49:47 warble1 kernel: Lustre: dagg-MDT0000: Connection restored to (at 10.8.49.155@tcp201) /var/log/messages-20180516.gz:May 15 07:50:12 warble2 kernel: Lustre: MGS: Connection restored to 7f932477-6f96-1e3a-e894-50a1018f3ee3 (at 10.8.49.155@tcp201) /var/log/messages-20180516.gz:May 15 11:06:44 warble1 kernel: LustreError: 83224:0:(llog_cat.c:795:llog_cat_cancel_records()) dagg-MDT0001-osp-MDT0000: fail to cancel 1 of 1 llog-records: rc = -116 /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: LNet: Service thread pid 89675 was inactive for 200.47s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: Pid: 89675, comm: mdt_rdpg01_005 /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: #012Call Trace: /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd] /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: [] ? 
__wake_up_common+0x5b/0x90 /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: /var/log/messages-20180516.gz:May 15 11:10:36 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526346636.89675 /var/log/messages-20180516.gz:May 15 11:17:10 warble1 kernel: Lustre: 297721:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88b6cf4e3c00 x1598178963482656/t0(0) o35->eea0e93b-08a6-28be-b9f1-682a3447515f@192.168.44.13@o2ib44:530/0 lens 512/696 e 24 to 0 dl 1526347035 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180516.gz:May 15 11:17:16 warble1 kernel: Lustre: dagg-MDT0001: Client eea0e93b-08a6-28be-b9f1-682a3447515f (at 192.168.44.13@o2ib44) reconnecting /var/log/messages-20180516.gz:May 15 11:17:16 warble1 kernel: Lustre: dagg-MDT0001: Connection restored to (at 192.168.44.13@o2ib44) /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: LNet: Service thread pid 87640 was inactive for 513.86s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: Pid: 87640, comm: mdt_rdpg01_003 /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: #012Call Trace: /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: [] ? 
recalc_sigpending+0x1b/0x50 /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd] /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: [] ? 
ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: /var/log/messages-20180516.gz:May 15 11:19:59 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526347199.87640 /var/log/messages-20180516.gz:May 15 11:21:20 warble1 kernel: Lustre: 87231:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88b6d67a9200 x1598178964303328/t0(0) o35->eea0e93b-08a6-28be-b9f1-682a3447515f@192.168.44.13@o2ib44:25/0 lens 512/696 e 2 to 0 dl 1526347285 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180516.gz:May 15 11:22:52 warble1 kernel: Lustre: 80824:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88b6c8137500 x1598178964582800/t0(0) o35->eea0e93b-08a6-28be-b9f1-682a3447515f@192.168.44.13@o2ib44:117/0 lens 512/696 e 1 to 0 dl 1526347377 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: LNet: Service thread pid 388766 was inactive for 200.72s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: Pid: 388766, comm: mdt01_042 /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: #012Call Trace: /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: [] ? 
osp_update_request_destroy+0x42e/0x4b0 [osp] /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: [] ? lod_obj_for_each_stripe+0x60/0x230 [lod] /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: [] ? lod_declare_destroy+0x43b/0x5c0 [lod] /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: [] ? mdd_env_info+0x21/0x60 [mdd] /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: [] mdd_unlink+0x3f6/0xaa0 [mdd] /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: [] mdt_reint_unlink+0xc28/0x11d0 [mdt] /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: [] mdt_reint_rec+0x83/0x210 [mdt] /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt] /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: [] mdt_reint+0x67/0x140 [mdt] /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: [] ? 
default_wake_function+0x0/0x20 /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: /var/log/messages-20180516.gz:May 15 11:23:32 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526347412.388766 /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: LNet: Service thread pid 389203 was inactive for 200.65s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: Pid: 389203, comm: mdt_rdpg00_042 /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: #012Call Trace: /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: [] ? 
top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd] /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: /var/log/messages-20180516.gz:May 15 11:23:42 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526347422.389203 /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: LNet: Service thread pid 389654 was inactive for 714.73s. 
The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: Pid: 389654, comm: mdt_rdpg01_045 /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: #012Call Trace: /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd] /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: [] ? 
default_wake_function+0x12/0x20 /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: /var/log/messages-20180516.gz:May 15 11:24:52 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526347492.389654 /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: LNet: Service thread pid 389180 was inactive for 200.38s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: Pid: 389180, comm: mdt00_071 /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: #012Call Trace: /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass] /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] schedule_timeout+0x174/0x2c0 /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] ? process_timeout+0x0/0x10 /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] ? 
default_wake_function+0x0/0x20 /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] ldlm_cli_enqueue_local+0x233/0x860 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] ? null_alloc_rs+0x15d/0x330 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt] /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] mdt_object_local_lock+0x512/0xaf0 [mdt] /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt] /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] ? lustre_msg_buf+0x0/0x60 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] mdt_object_lock_internal+0x5e/0x300 [mdt] /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] mdt_getattr_name_lock+0x8a4/0x1910 [mdt] /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] ? lustre_msg_get_flags+0x2c/0xa0 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] mdt_intent_getattr+0x2b0/0x480 [mdt] /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] mdt_intent_policy+0x441/0xc70 [mdt] /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] ? 
lustre_swab_ldlm_request+0x0/0x30 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: /var/log/messages-20180516.gz:May 15 11:25:22 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526347522.389180 /var/log/messages-20180516.gz:May 15 11:27:02 warble1 kernel: LustreError: 389180:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1526347322, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0002_UUID lock: ffff88539fb3e000/0x39ca858fb1ba4836 lrc: 3/1,0 mode: --/PR res: [0x680027502:0x33ce:0x0].0x0 bits 0x13 rrc: 12 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 389180 timeout: 0 lvb_type: 0 /var/log/messages-20180516.gz:May 15 11:27:02 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526347622.389180 /var/log/messages-20180516.gz:May 15 11:27:17 warble1 kernel: Lustre: dagg-MDT0001: Client eea0e93b-08a6-28be-b9f1-682a3447515f (at 192.168.44.13@o2ib44) reconnecting /var/log/messages-20180516.gz:May 15 11:27:17 warble1 kernel: Lustre: dagg-MDT0001: Connection restored to (at 192.168.44.13@o2ib44) /var/log/messages-20180516.gz:May 15 11:28:42 warble1 kernel: LustreError: 389410:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1526347422, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0002_UUID lock: ffff88b6dd045000/0x39ca858fb1fb9ad4 lrc: 3/1,0 mode: --/PR res: [0x680027502:0x33ce:0x0].0x0 bits 0x13 rrc: 12 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 389410 timeout: 0 lvb_type: 0 /var/log/messages-20180516.gz:May 15 11:29:56 warble1 kernel: LustreError: 9927:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1526347496, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0002_UUID lock: ffff88b6b456fe00/0x39ca858fb22c45d8 lrc: 3/1,0 mode: --/PR res: 
[0x680027502:0x33ce:0x0].0x0 bits 0x13 rrc: 12 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 9927 timeout: 0 lvb_type: 0 /var/log/messages-20180516.gz:May 15 11:30:07 warble1 kernel: Lustre: 86634:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (4/4), not sending early reply#012 req@ffff88b6b81f0600 x1598178964969632/t0(0) o36->eea0e93b-08a6-28be-b9f1-682a3447515f@192.168.44.13@o2ib44:551/0 lens 616/3128 e 24 to 0 dl 1526347811 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180516.gz:May 15 11:30:17 warble1 kernel: Lustre: 390008:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff8853c9288300 x1598186339211712/t0(0) o35->96a0ae1d-7776-11f3-bbd2-f21f3bba6a4b@192.168.44.14@o2ib44:562/0 lens 512/696 e 24 to 0 dl 1526347822 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180516.gz:May 15 11:30:23 warble1 kernel: Lustre: dagg-MDT0001: Client 96a0ae1d-7776-11f3-bbd2-f21f3bba6a4b (at 192.168.44.14@o2ib44) reconnecting /var/log/messages-20180516.gz:May 15 11:30:23 warble1 kernel: Lustre: dagg-MDT0001: Connection restored to (at 192.168.44.14@o2ib44) /var/log/messages-20180516.gz:May 15 11:30:50 warble1 kernel: LustreError: 390003:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1526347550, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0002_UUID lock: ffff88b6b1dd8a00/0x39ca858fb24f9b28 lrc: 3/1,0 mode: --/PR res: [0x680027502:0x33ce:0x0].0x0 bits 0x13 rrc: 12 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 390003 timeout: 0 lvb_type: 0 /var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: LNet: Service thread pid 389410 was inactive for 463.18s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes:
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: Pid: 389410, comm: mdt01_076
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: #012Call Trace:
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] ldlm_completion_ast+0x63d/0x920 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] ldlm_cli_enqueue_local+0x233/0x860 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] ? null_alloc_rs+0x15d/0x330 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] mdt_object_local_lock+0x512/0xaf0 [mdt]
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] ? lustre_msg_buf+0x0/0x60 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] mdt_object_lock_internal+0x5e/0x300 [mdt]
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] mdt_getattr_name_lock+0x8a4/0x1910 [mdt]
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] ? lustre_msg_get_flags+0x2c/0xa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] mdt_intent_getattr+0x2b0/0x480 [mdt]
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] mdt_intent_policy+0x441/0xc70 [mdt]
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel:
/var/log/messages-20180516.gz:May 15 11:31:25 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526347885.389410
/var/log/messages-20180516.gz:May 15 11:31:57 warble1 kernel: Lustre: 389179:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885dd527f800 x1598186339681280/t0(0) o101->96a0ae1d-7776-11f3-bbd2-f21f3bba6a4b@192.168.44.14@o2ib44:662/0 lens 696/3384 e 24 to 0 dl 1526347922 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180516.gz:May 15 11:31:58 warble1 kernel: LustreError: 388775:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1526347618, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0002_UUID lock: ffff88b6f7ebb200/0x39ca858fb27c37af lrc: 3/1,0 mode: --/PR res: [0x680027502:0x33ce:0x0].0x0 bits 0x13 rrc: 12 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 388775 timeout: 0 lvb_type: 0
/var/log/messages-20180516.gz:May 15 11:32:03 warble1 kernel: Lustre: dagg-MDT0002: Client 96a0ae1d-7776-11f3-bbd2-f21f3bba6a4b (at 192.168.44.14@o2ib44) reconnecting
/var/log/messages-20180516.gz:May 15 11:32:03 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.14@o2ib44)
/var/log/messages-20180516.gz:May 15 11:33:37 warble1 kernel: Lustre: 86027:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88b6de5c6300 x1598178965663728/t0(0) o101->eea0e93b-08a6-28be-b9f1-682a3447515f@192.168.44.13@o2ib44:7/0 lens 696/3384 e 2 to 0 dl 1526348022 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180516.gz:May 15 11:33:43 warble1 kernel: Lustre: dagg-MDT0002: Client eea0e93b-08a6-28be-b9f1-682a3447515f (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180516.gz:May 15 11:33:43 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.13@o2ib44)
/var/log/messages-20180516.gz:May 15 11:34:51 warble1 kernel: Lustre: 389404:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88b6d2281200 x1598178966094624/t0(0) o101->eea0e93b-08a6-28be-b9f1-682a3447515f@192.168.44.13@o2ib44:81/0 lens 696/3384 e 1 to 0 dl 1526348096 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180516.gz:May 15 11:34:51 warble1 kernel: Lustre: 389404:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message
/var/log/messages-20180516.gz:May 15 11:35:10 warble1 kernel: LNet: Service thread pid 9927 was inactive for 614.77s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180516.gz:May 15 11:35:10 warble1 kernel: Pid: 9927, comm: mdt01_021
/var/log/messages-20180516.gz:May 15 11:35:10 warble1 kernel: #012Call Trace:
/var/log/messages-20180516.gz:May 15 11:35:10 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180516.gz:May 15 11:35:10 warble1 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:35:10 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180516.gz:May 15 11:35:10 warble1 kernel: [] ldlm_completion_ast+0x63d/0x920 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:35:10 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180516.gz:May 15 11:35:10 warble1 kernel: [] ldlm_cli_enqueue_local+0x233/0x860 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:35:10 warble1 kernel: [] ? null_alloc_rs+0x15d/0x330 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:35:10 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: [] mdt_object_local_lock+0x512/0xaf0 [mdt]
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: [] ? lustre_msg_buf+0x0/0x60 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: [] mdt_object_lock_internal+0x5e/0x300 [mdt]
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: [] mdt_getattr_name_lock+0x8a4/0x1910 [mdt]
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: [] ? lustre_msg_get_flags+0x2c/0xa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: [] mdt_intent_getattr+0x2b0/0x480 [mdt]
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: [] mdt_intent_policy+0x441/0xc70 [mdt]
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: [] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: [] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel:
/var/log/messages-20180516.gz:May 15 11:35:11 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526348111.9927
/var/log/messages-20180516.gz:May 15 11:36:54 warble1 kernel: Lustre: 388919:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (4/4), not sending early reply#012 req@ffff88b6ece55100 x1598178967162656/t0(0) o101->eea0e93b-08a6-28be-b9f1-682a3447515f@192.168.44.13@o2ib44:203/0 lens 704/3384 e 1 to 0 dl 1526348218 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180516.gz:May 15 11:36:54 warble1 kernel: Lustre: 388919:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message
/var/log/messages-20180516.gz:May 15 11:37:03 warble1 kernel: LustreError: 388391:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1526347923, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0002_UUID lock: ffff8853b86dbe00/0x39ca858fb342155a lrc: 3/1,0 mode: --/PR res: [0x680027502:0x33ce:0x0].0x0 bits 0x13 rrc: 18 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 388391 timeout: 0 lvb_type: 0
/var/log/messages-20180516.gz:May 15 11:37:19 warble1 kernel: Lustre: dagg-MDT0001: Client eea0e93b-08a6-28be-b9f1-682a3447515f (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180516.gz:May 15 11:37:19 warble1 kernel: Lustre: dagg-MDT0001: Connection restored to (at 192.168.44.13@o2ib44)
/var/log/messages-20180516.gz:May 15 11:37:30 warble1 kernel: LustreError: 388912:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1526347950, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0002_UUID lock: ffff88b6d48b5400/0x39ca858fb3539f40 lrc: 3/1,0 mode: --/PR res: [0x680027502:0x33ce:0x0].0x0 bits 0x13 rrc: 18 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 388912 timeout: 0 lvb_type: 0
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: LNet: Service thread pid 390003 was inactive for 716.22s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: Pid: 390003, comm: mdt01_110
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: #012Call Trace:
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] ldlm_completion_ast+0x63d/0x920 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] ldlm_cli_enqueue_local+0x233/0x860 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] mdt_object_local_lock+0x512/0xaf0 [mdt]
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] ? lustre_msg_buf+0x0/0x60 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] mdt_object_lock_internal+0x5e/0x300 [mdt]
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] mdt_getattr_name_lock+0x8a4/0x1910 [mdt]
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] ? lustre_msg_get_flags+0x2c/0xa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] mdt_intent_getattr+0x2b0/0x480 [mdt]
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] mdt_intent_policy+0x441/0xc70 [mdt]
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel:
/var/log/messages-20180516.gz:May 15 11:37:46 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526348266.390003
/var/log/messages-20180516.gz:May 15 11:39:13 warble1 kernel: LustreError: 389898:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1526348053, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0002_UUID lock: ffff88b6ec743400/0x39ca858fb396dd97 lrc: 3/1,0 mode: --/PR res: [0x680027502:0x33ce:0x0].0x0 bits 0x13 rrc: 20 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 389898 timeout: 0 lvb_type: 0
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: LNet: Service thread pid 148618 was inactive for 1202.03s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: Pid: 148618, comm: mdt_rdpg01_070
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: #012Call Trace:
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel:
/var/log/messages-20180516.gz:May 15 11:41:19 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526348479.148618
/var/log/messages-20180516.gz:May 15 11:41:23 warble1 kernel: Pid: 388775, comm: mdt01_044
/var/log/messages-20180516.gz:May 15 11:41:23 warble1 kernel: #012Call Trace:
/var/log/messages-20180516.gz:May 15 11:41:23 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180516.gz:May 15 11:41:23 warble1 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:41:23 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180516.gz:May 15 11:41:23 warble1 kernel: [] ldlm_completion_ast+0x63d/0x920 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:41:23 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180516.gz:May 15 11:41:23 warble1 kernel: [] ldlm_cli_enqueue_local+0x233/0x860 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:41:23 warble1 kernel: [] ? null_alloc_rs+0x15d/0x330 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:41:23 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:41:23 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages-20180516.gz:May 15 11:41:23 warble1 kernel: [] mdt_object_local_lock+0x512/0xaf0 [mdt]
--
/var/log/messages-20180516.gz:May 15 11:41:23 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:41:23 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:41:23 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:41:23 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180516.gz:May 15 11:41:23 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180516.gz:May 15 11:41:23 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:41:23 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:41:23 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180516.gz:May 15 11:41:23 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 11:41:23 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180516.gz:May 15 11:41:23 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 11:41:23 warble1 kernel:
/var/log/messages-20180516.gz:May 15 11:41:23 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526348483.388775
/var/log/messages-20180516.gz:May 15 11:42:44 warble1 kernel: Lustre: 390004:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-128), not sending early reply#012 req@ffff8853a6b5b000 x1598186341106384/t0(0) o35->96a0ae1d-7776-11f3-bbd2-f21f3bba6a4b@192.168.44.14@o2ib44:553/0 lens 512/696 e 0 to 0 dl 1526348568 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180516.gz:May 15 11:42:44 warble1 kernel: Lustre: 390004:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message
/var/log/messages-20180516.gz:May 15 11:42:49 warble1 kernel: Lustre: dagg-MDT0001: Client 96a0ae1d-7776-11f3-bbd2-f21f3bba6a4b (at 192.168.44.14@o2ib44) reconnecting
/var/log/messages-20180516.gz:May 15 11:42:49 warble1 kernel: Lustre: dagg-MDT0001: Connection restored to (at 192.168.44.14@o2ib44)
/var/log/messages-20180516.gz:May 15 11:43:38 warble1 kernel: LustreError: 389192:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1526348318, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0002_UUID lock: ffff88b7351b9e00/0x39ca858fb4442427 lrc: 3/1,0 mode: --/PR res: [0x680027502:0x33ce:0x0].0x0 bits 0x13 rrc: 20 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 389192 timeout: 0 lvb_type: 0
/var/log/messages-20180516.gz:May 15 11:43:44 warble1 kernel: Lustre: dagg-MDT0002: Client eea0e93b-08a6-28be-b9f1-682a3447515f (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180516.gz:May 15 11:43:44 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.13@o2ib44)
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: LNet: Service thread pid 80824 was inactive for 1201.26s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: LNet: Skipped 1 previous similar message
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: Pid: 80824, comm: mdt_rdpg01_000
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: #012Call Trace:
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel:
/var/log/messages-20180516.gz:May 15 11:46:18 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526348778.80824
/var/log/messages-20180516.gz:May 15 11:47:20 warble1 kernel: Lustre: dagg-MDT0001: Client eea0e93b-08a6-28be-b9f1-682a3447515f (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180516.gz:May 15 11:47:20 warble1 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180516.gz:May 15 11:47:20 warble1 kernel: Lustre: dagg-MDT0001: Connection restored to (at 192.168.44.13@o2ib44)
/var/log/messages-20180516.gz:May 15 11:47:20 warble1 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180516.gz:May 15 11:47:34 warble1 kernel: Lustre: 389773:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff8853c7461200 x1598186341601008/t0(0) o36->96a0ae1d-7776-11f3-bbd2-f21f3bba6a4b@192.168.44.14@o2ib44:89/0 lens 608/3392 e 0 to 0 dl 1526348859 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180516.gz:May 15 11:47:34 warble1 kernel: Lustre: 389773:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 3 previous similar messages
/var/log/messages-20180516.gz:May 15 11:50:36 warble1 kernel: Pid: 388926, comm: mdt_rdpg00_032
/var/log/messages-20180516.gz:May 15 11:50:36 warble1 kernel: #012Call Trace:
/var/log/messages-20180516.gz:May 15 11:50:36 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180516.gz:May 15 11:50:36 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180516.gz:May 15 11:50:36 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
--
/var/log/messages-20180516.gz:May 15 11:54:17 warble1 kernel: [] ? lustre_msg_get_flags+0x2c/0xa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:54:17 warble1 kernel: [] mdt_intent_getattr+0x2b0/0x480 [mdt]
/var/log/messages-20180516.gz:May 15 11:54:17 warble1 kernel: [] mdt_intent_policy+0x441/0xc70 [mdt]
/var/log/messages-20180516.gz:May 15 11:54:17 warble1 kernel: [] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:54:17 warble1 kernel: [] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:54:17 warble1 kernel: [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:54:17 warble1 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:54:17 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:54:17 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:54:17 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:54:17 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180516.gz:May 15 11:54:17 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180516.gz:May 15 11:54:18 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:54:18 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:54:18 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180516.gz:May 15 11:54:18 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 11:54:18 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180516.gz:May 15 11:54:18 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 11:54:18 warble1 kernel:
/var/log/messages-20180516.gz:May 15 11:54:18 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526349258.389898
/var/log/messages-20180516.gz:May 15 11:55:06 warble1 kernel: LNet: Service thread pid 389215 was inactive for 1202.36s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180516.gz:May 15 11:55:06 warble1 kernel: LNet: Skipped 4 previous similar messages
/var/log/messages-20180516.gz:May 15 11:55:06 warble1 kernel: Pid: 389215, comm: mdt00_080
/var/log/messages-20180516.gz:May 15 11:55:06 warble1 kernel: #012Call Trace:
/var/log/messages-20180516.gz:May 15 11:55:06 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180516.gz:May 15 11:55:06 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180516.gz:May 15 11:55:06 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:55:06 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180516.gz:May 15 11:55:07 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:55:07 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180516.gz:May 15 11:55:07 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:55:07 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180516.gz:May 15 11:55:07 warble1 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180516.gz:May 15 11:55:07 warble1 kernel: [] mdt_reint_setattr+0xba5/0x1060 [mdt]
/var/log/messages-20180516.gz:May 15 11:55:07 warble1 kernel: [] mdt_reint_rec+0x83/0x210 [mdt]
/var/log/messages-20180516.gz:May 15 11:55:07 warble1 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180516.gz:May 15 11:55:07 warble1 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages-20180516.gz:May 15 11:55:07 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:55:07 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:55:07 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:55:07 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180516.gz:May 15 11:55:07 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180516.gz:May 15 11:55:07 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:55:07 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:55:07 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180516.gz:May 15 11:55:07 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 11:55:07 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180516.gz:May 15 11:55:07 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 11:55:07 warble1 kernel:
/var/log/messages-20180516.gz:May 15 11:55:07 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526349307.389215
/var/log/messages-20180516.gz:May 15 11:55:52 warble1 kernel: Pid: 306936, comm: mdt_rdpg00_018
/var/log/messages-20180516.gz:May 15 11:55:52 warble1 kernel: #012Call Trace:
/var/log/messages-20180516.gz:May 15 11:55:52 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180516.gz:May 15 11:55:52 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180516.gz:May 15 11:55:52 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:55:52 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180516.gz:May 15 11:55:52 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:55:52 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180516.gz:May 15 11:55:52 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:55:52 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180516.gz:May 15 11:55:52 warble1 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
--
/var/log/messages-20180516.gz:May 15 11:59:00 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180516.gz:May 15 11:59:00 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:59:00 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:59:00 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:59:00 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180516.gz:May 15 11:59:00 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180516.gz:May 15 11:59:00 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:59:00 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 11:59:00 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180516.gz:May 15 11:59:00 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 11:59:00 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180516.gz:May 15 11:59:00 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 11:59:00 warble1 kernel:
/var/log/messages-20180516.gz:May 15 11:59:00 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526349540.389915
/var/log/messages-20180516.gz:May 15 11:59:06 warble1 kernel: Lustre: 389906:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff88b6aa7d5a00 x1598178970957440/t0(0) o101->eea0e93b-08a6-28be-b9f1-682a3447515f@192.168.44.13@o2ib44:26/0 lens 1664/3384 e 0 to 0 dl 1526349551 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180516.gz:May 15 11:59:06 warble1 kernel: Lustre: 389906:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 6 previous similar messages
/var/log/messages-20180516.gz:May 15 12:03:46 warble1 kernel: Lustre: dagg-MDT0002: Client eea0e93b-08a6-28be-b9f1-682a3447515f (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180516.gz:May 15 12:03:46 warble1 kernel: Lustre: Skipped 3 previous similar messages
/var/log/messages-20180516.gz:May 15 12:03:46 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.13@o2ib44)
/var/log/messages-20180516.gz:May 15 12:03:46 warble1 kernel: Lustre: Skipped 3 previous similar messages
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: LNet: Service thread pid 9919 was inactive for 1202.41s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: LNet: Skipped 6 previous similar messages
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: Pid: 9919, comm: mdt01_016
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: #012Call Trace:
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] ldlm_completion_ast+0x63d/0x920 [ptlrpc]
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] ldlm_cli_enqueue_local+0x233/0x860 [ptlrpc]
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] ? null_alloc_rs+0x15d/0x330 [ptlrpc]
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] mdt_object_local_lock+0x512/0xaf0 [mdt]
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] ? lustre_msg_buf+0x0/0x60 [ptlrpc]
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] mdt_object_lock_internal+0x5e/0x300 [mdt]
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] mdt_getattr_name_lock+0x8a4/0x1910 [mdt]
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] ? lustre_msg_get_flags+0x2c/0xa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] mdt_intent_getattr+0x2b0/0x480 [mdt]
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] mdt_intent_policy+0x441/0xc70 [mdt]
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc]
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc]
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc]
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: [] ?
kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: /var/log/messages-20180516.gz:May 15 12:06:39 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526349999.9919 /var/log/messages-20180516.gz:May 15 12:13:47 warble1 kernel: Lustre: dagg-MDT0002: Client eea0e93b-08a6-28be-b9f1-682a3447515f (at 192.168.44.13@o2ib44) reconnecting /var/log/messages-20180516.gz:May 15 12:13:47 warble1 kernel: Lustre: Skipped 3 previous similar messages /var/log/messages-20180516.gz:May 15 12:13:47 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.13@o2ib44) /var/log/messages-20180516.gz:May 15 12:13:47 warble1 kernel: Lustre: Skipped 3 previous similar messages /var/log/messages-20180516.gz:May 15 12:23:33 warble1 kernel: LNet: Service thread pid 388296 was inactive for 200.28s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180516.gz:May 15 12:23:33 warble1 kernel: Pid: 388296, comm: mdt_rdpg00_029 /var/log/messages-20180516.gz:May 15 12:23:33 warble1 kernel: #012Call Trace: /var/log/messages-20180516.gz:May 15 12:23:33 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180516.gz:May 15 12:23:33 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180516.gz:May 15 12:23:33 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:23:33 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180516.gz:May 15 12:23:34 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:23:34 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180516.gz:May 15 12:23:34 warble1 kernel: [] ? 
top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:23:34 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180516.gz:May 15 12:23:34 warble1 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd] /var/log/messages-20180516.gz:May 15 12:23:34 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:23:34 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180516.gz:May 15 12:23:34 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180516.gz:May 15 12:23:34 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180516.gz:May 15 12:23:34 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:23:34 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:23:34 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:23:34 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180516.gz:May 15 12:23:34 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180516.gz:May 15 12:23:34 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:23:34 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:23:34 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180516.gz:May 15 12:23:34 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 12:23:34 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180516.gz:May 15 12:23:34 warble1 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 12:23:34 warble1 kernel: /var/log/messages-20180516.gz:May 15 12:23:34 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526351014.388296 /var/log/messages-20180516.gz:May 15 12:23:48 warble1 kernel: Lustre: dagg-MDT0002: Client eea0e93b-08a6-28be-b9f1-682a3447515f (at 192.168.44.13@o2ib44) reconnecting /var/log/messages-20180516.gz:May 15 12:23:48 warble1 kernel: Lustre: Skipped 3 previous similar messages /var/log/messages-20180516.gz:May 15 12:23:48 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.13@o2ib44) /var/log/messages-20180516.gz:May 15 12:23:48 warble1 kernel: Lustre: Skipped 3 previous similar messages /var/log/messages-20180516.gz:May 15 12:30:08 warble1 kernel: Lustre: 390018:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff8853a52fdd00 x1598186345843424/t0(0) o35->96a0ae1d-7776-11f3-bbd2-f21f3bba6a4b@192.168.44.14@o2ib44:378/0 lens 512/696 e 24 to 0 dl 1526351413 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180516.gz:May 15 12:32:58 warble1 kernel: Pid: 388297, comm: mdt_rdpg00_030 /var/log/messages-20180516.gz:May 15 12:32:58 warble1 kernel: #012Call Trace: /var/log/messages-20180516.gz:May 15 12:32:58 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180516.gz:May 15 12:32:58 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180516.gz:May 15 12:32:58 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:32:58 warble1 kernel: [] ? 
default_wake_function+0x0/0x20 /var/log/messages-20180516.gz:May 15 12:32:58 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] -- /var/log/messages-20180516.gz:May 15 12:32:58 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180516.gz:May 15 12:32:58 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180516.gz:May 15 12:32:58 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180516.gz:May 15 12:32:58 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:32:58 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:32:58 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:32:58 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180516.gz:May 15 12:32:58 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180516.gz:May 15 12:32:58 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:32:58 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:32:58 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180516.gz:May 15 12:32:58 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 12:32:58 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180516.gz:May 15 12:32:58 warble1 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 12:32:58 warble1 kernel: /var/log/messages-20180516.gz:May 15 12:32:58 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526351578.388297 /var/log/messages-20180516.gz:May 15 12:33:49 warble1 kernel: Lustre: dagg-MDT0002: Client eea0e93b-08a6-28be-b9f1-682a3447515f (at 192.168.44.13@o2ib44) reconnecting /var/log/messages-20180516.gz:May 15 12:33:49 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180516.gz:May 15 12:33:49 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.13@o2ib44) /var/log/messages-20180516.gz:May 15 12:33:49 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: LNet: Service thread pid 390012 was inactive for 564.07s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: LNet: Skipped 1 previous similar message /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: Pid: 390012, comm: mdt_rdpg00_070 /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: #012Call Trace: /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: [] ? 
top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd] /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: /var/log/messages-20180516.gz:May 15 12:34:01 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526351641.390012 /var/log/messages-20180516.gz:May 15 12:34:19 warble1 kernel: Lustre: 390007:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885dec93d400 x1598186345897648/t0(0) o35->96a0ae1d-7776-11f3-bbd2-f21f3bba6a4b@192.168.44.14@o2ib44:629/0 lens 512/696 e 2 to 0 dl 1526351664 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180516.gz:May 15 12:43:20 warble1 kernel: Lustre: 388295:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff88b6b81f3f00 x1598186346281376/t0(0) o35->96a0ae1d-7776-11f3-bbd2-f21f3bba6a4b@192.168.44.14@o2ib44:415/0 lens 512/696 e 0 to 0 dl 1526352205 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180516.gz:May 15 12:43:20 warble1 kernel: Lustre: 388295:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message /var/log/messages-20180516.gz:May 15 12:43:50 warble1 kernel: Lustre: dagg-MDT0002: Client eea0e93b-08a6-28be-b9f1-682a3447515f (at 192.168.44.13@o2ib44) reconnecting /var/log/messages-20180516.gz:May 15 12:43:50 warble1 kernel: Lustre: Skipped 3 previous similar messages /var/log/messages-20180516.gz:May 15 12:43:50 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.13@o2ib44) /var/log/messages-20180516.gz:May 15 12:43:50 warble1 kernel: Lustre: Skipped 3 previous similar messages /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: LNet: Service thread pid 85266 was inactive for 200.36s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: Pid: 85266, comm: mdt01_006 /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: #012Call Trace: /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] schedule_timeout+0x174/0x2c0 /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] ? process_timeout+0x0/0x10 /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] ldlm_cli_enqueue_local+0x233/0x860 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt] /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] mdt_object_local_lock+0x512/0xaf0 [mdt] /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt] /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] ? lustre_msg_buf+0x0/0x60 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] mdt_object_lock_internal+0x5e/0x300 [mdt] /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] mdt_getattr_name_lock+0x8a4/0x1910 [mdt] /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] ? 
lustre_msg_get_flags+0x2c/0xa0 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] mdt_intent_getattr+0x2b0/0x480 [mdt] /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] mdt_intent_policy+0x441/0xc70 [mdt] /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: /var/log/messages-20180516.gz:May 15 12:47:11 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526352431.85266 /var/log/messages-20180516.gz:May 15 12:48:50 warble1 kernel: LustreError: 85266:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1526352230, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0002_UUID lock: ffff88b6c5272400/0x39ca858fbe411734 lrc: 3/1,0 mode: --/PR res: [0x680027502:0x33ce:0x0].0x0 bits 0x13 rrc: 24 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 85266 timeout: 0 lvb_type: 0 /var/log/messages-20180516.gz:May 15 12:50:53 warble1 kernel: Pid: 390018, comm: mdt_rdpg00_074 /var/log/messages-20180516.gz:May 15 12:50:53 warble1 kernel: #012Call Trace: -- /var/log/messages-20180516.gz:May 15 12:50:53 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180516.gz:May 15 12:50:53 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180516.gz:May 15 12:50:53 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:50:53 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:50:53 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:50:53 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180516.gz:May 15 12:50:53 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180516.gz:May 15 12:50:53 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:50:53 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 12:50:53 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180516.gz:May 15 12:50:53 warble1 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 12:50:53 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180516.gz:May 15 12:50:53 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 12:50:53 warble1 kernel: /var/log/messages-20180516.gz:May 15 12:50:53 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526352653.390018 /var/log/messages-20180516.gz:May 15 12:52:46 warble1 kernel: Lustre: 89619:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff88537d6c6c00 x1598186346489760/t0(0) o35->96a0ae1d-7776-11f3-bbd2-f21f3bba6a4b@192.168.44.14@o2ib44:225/0 lens 512/696 e 0 to 0 dl 1526352770 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180516.gz:May 15 12:53:51 warble1 kernel: Lustre: dagg-MDT0002: Client eea0e93b-08a6-28be-b9f1-682a3447515f (at 192.168.44.13@o2ib44) reconnecting /var/log/messages-20180516.gz:May 15 12:53:51 warble1 kernel: Lustre: Skipped 3 previous similar messages /var/log/messages-20180516.gz:May 15 12:53:51 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.13@o2ib44) /var/log/messages-20180516.gz:May 15 12:53:51 warble1 kernel: Lustre: Skipped 3 previous similar messages /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: LNet: Service thread pid 390007 was inactive for 1202.84s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: LNet: Skipped 1 previous similar message /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: Pid: 390007, comm: mdt_rdpg00_066 /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: #012Call Trace: /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: [] ? 
recalc_sigpending+0x1b/0x50 /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd] /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: [] ? 
ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: /var/log/messages-20180516.gz:May 15 13:00:18 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526353218.390007 /var/log/messages-20180516.gz:May 15 13:03:52 warble1 kernel: Lustre: dagg-MDT0002: Client eea0e93b-08a6-28be-b9f1-682a3447515f (at 192.168.44.13@o2ib44) reconnecting /var/log/messages-20180516.gz:May 15 13:03:52 warble1 kernel: Lustre: Skipped 4 previous similar messages /var/log/messages-20180516.gz:May 15 13:03:52 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.13@o2ib44) /var/log/messages-20180516.gz:May 15 13:03:52 warble1 kernel: Lustre: Skipped 4 previous similar messages /var/log/messages-20180516.gz:May 15 13:04:12 warble1 kernel: Pid: 389204, comm: mdt_rdpg00_043 /var/log/messages-20180516.gz:May 15 13:04:12 warble1 kernel: #012Call Trace: /var/log/messages-20180516.gz:May 15 13:04:12 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180516.gz:May 15 13:04:12 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180516.gz:May 15 13:04:12 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180516.gz:May 15 13:04:12 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180516.gz:May 15 13:04:12 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] -- /var/log/messages-20180516.gz:May 15 13:10:17 warble1 kernel: [] ? 
top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180516.gz:May 15 13:10:17 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180516.gz:May 15 13:10:17 warble1 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd] /var/log/messages-20180516.gz:May 15 13:10:17 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180516.gz:May 15 13:10:17 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180516.gz:May 15 13:10:17 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180516.gz:May 15 13:10:17 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180516.gz:May 15 13:10:17 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180516.gz:May 15 13:10:17 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180516.gz:May 15 13:10:17 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180516.gz:May 15 13:10:17 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180516.gz:May 15 13:10:17 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180516.gz:May 15 13:10:17 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 13:10:17 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180516.gz:May 15 13:10:17 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180516.gz:May 15 13:10:17 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 13:10:17 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180516.gz:May 15 13:10:17 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180516.gz:May 15 13:10:17 warble1 kernel: /var/log/messages-20180516.gz:May 15 13:10:17 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526353817.389222 /var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: LNet: Service thread pid 390389 was inactive for 200.70s. 
The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: LNet: Skipped 3 previous similar messages
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: Pid: 390389, comm: mdt_rdpg01_065
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: #012Call Trace:
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel:
/var/log/messages-20180516.gz:May 15 13:11:02 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526353862.390389
/var/log/messages-20180516.gz:May 15 13:13:53 warble1 kernel: Lustre: dagg-MDT0002: Client eea0e93b-08a6-28be-b9f1-682a3447515f (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180516.gz:May 15 13:13:53 warble1 kernel: Lustre: Skipped 5 previous similar messages
/var/log/messages-20180516.gz:May 15 13:13:53 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.13@o2ib44)
/var/log/messages-20180516.gz:May 15 13:13:53 warble1 kernel: Lustre: Skipped 5 previous similar messages
/var/log/messages-20180516.gz:May 15 13:16:49 warble1 kernel: Pid: 389209, comm: mdt_rdpg00_048
/var/log/messages-20180516.gz:May 15 13:16:49 warble1 kernel: #012Call Trace:
/var/log/messages-20180516.gz:May 15 13:16:49 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180516.gz:May 15 13:16:49 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180516.gz:May 15 13:16:49 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:16:49 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180516.gz:May 15 13:16:49 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
--
/var/log/messages-20180516.gz:May 15 13:16:49 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180516.gz:May 15 13:16:49 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:16:49 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:16:49 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:16:49 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180516.gz:May 15 13:16:49 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180516.gz:May 15 13:16:49 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:16:50 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:16:50 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180516.gz:May 15 13:16:50 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 13:16:50 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180516.gz:May 15 13:16:50 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 13:16:50 warble1 kernel:
/var/log/messages-20180516.gz:May 15 13:16:50 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526354210.389209
/var/log/messages-20180516.gz:May 15 13:19:52 warble1 kernel: Lustre: 182247:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff8853ae2a7800 x1598186347001856/t0(0) o35->96a0ae1d-7776-11f3-bbd2-f21f3bba6a4b@192.168.44.14@o2ib44:342/0 lens 512/696 e 0 to 0 dl 1526354397 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180516.gz:May 15 13:19:52 warble1 kernel: Lustre: 182247:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 3 previous similar messages
/var/log/messages-20180516.gz:May 15 13:23:54 warble1 kernel: Lustre: dagg-MDT0002: Client eea0e93b-08a6-28be-b9f1-682a3447515f (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180516.gz:May 15 13:23:54 warble1 kernel: Lustre: Skipped 8 previous similar messages
/var/log/messages-20180516.gz:May 15 13:23:54 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.13@o2ib44)
/var/log/messages-20180516.gz:May 15 13:23:54 warble1 kernel: Lustre: Skipped 8 previous similar messages
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: LNet: Service thread pid 90001 was inactive for 1203.34s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: LNet: Skipped 1 previous similar message
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: Pid: 90001, comm: mdt_rdpg00_011
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: #012Call Trace:
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel:
/var/log/messages-20180516.gz:May 15 13:25:46 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526354746.90001
/var/log/messages-20180516.gz:May 15 13:27:24 warble1 kernel: Pid: 389406, comm: mdt_rdpg00_051
/var/log/messages-20180516.gz:May 15 13:27:24 warble1 kernel: #012Call Trace:
/var/log/messages-20180516.gz:May 15 13:27:24 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180516.gz:May 15 13:27:24 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180516.gz:May 15 13:27:24 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:27:24 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180516.gz:May 15 13:27:24 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:27:24 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180516.gz:May 15 13:27:24 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180516.gz:May 15 13:27:24 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180516.gz:May 15 13:27:24 warble1 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
--
/var/log/messages-20180516.gz:May 15 13:31:01 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180516.gz:May 15 13:31:01 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 13:31:01 warble1 kernel:
/var/log/messages-20180516.gz:May 15 13:31:02 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526355061.390019
/var/log/messages-20180516.gz:May 15 13:33:56 warble1 kernel: Lustre: dagg-MDT0002: Client eea0e93b-08a6-28be-b9f1-682a3447515f (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180516.gz:May 15 13:33:56 warble1 kernel: Lustre: Skipped 4 previous similar messages
/var/log/messages-20180516.gz:May 15 13:33:56 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.13@o2ib44)
/var/log/messages-20180516.gz:May 15 13:33:56 warble1 kernel: Lustre: Skipped 4 previous similar messages
/var/log/messages-20180516.gz:May 15 13:43:57 warble1 kernel: Lustre: dagg-MDT0002: Client eea0e93b-08a6-28be-b9f1-682a3447515f (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180516.gz:May 15 13:43:57 warble1 kernel: Lustre: Skipped 5 previous similar messages
/var/log/messages-20180516.gz:May 15 13:43:57 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.13@o2ib44)
/var/log/messages-20180516.gz:May 15 13:43:57 warble1 kernel: Lustre: Skipped 5 previous similar messages
/var/log/messages-20180516.gz:May 15 13:53:58 warble1 kernel: Lustre: dagg-MDT0002: Client eea0e93b-08a6-28be-b9f1-682a3447515f (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180516.gz:May 15 13:53:58 warble1 kernel: Lustre: Skipped 4 previous similar messages
/var/log/messages-20180516.gz:May 15 13:53:58 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.13@o2ib44)
/var/log/messages-20180516.gz:May 15 13:53:58 warble1 kernel: Lustre: Skipped 4 previous similar messages
/var/log/messages-20180516.gz:May 15 14:03:59 warble1 kernel: Lustre: dagg-MDT0002: Client eea0e93b-08a6-28be-b9f1-682a3447515f (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180516.gz:May 15 14:03:59 warble1 kernel: Lustre: Skipped 4 previous similar messages
/var/log/messages-20180516.gz:May 15 14:03:59 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.13@o2ib44)
/var/log/messages-20180516.gz:May 15 14:03:59 warble1 kernel: Lustre: Skipped 4 previous similar messages
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: LNet: Service thread pid 388298 was inactive for 200.58s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: LNet: Skipped 2 previous similar messages
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: Pid: 388298, comm: mdt_rdpg00_031
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: #012Call Trace:
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel:
/var/log/messages-20180516.gz:May 15 14:08:26 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526357306.388298
/var/log/messages-20180516.gz:May 15 14:14:00 warble1 kernel: Lustre: dagg-MDT0002: Client eea0e93b-08a6-28be-b9f1-682a3447515f (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180516.gz:May 15 14:14:00 warble1 kernel: Lustre: Skipped 3 previous similar messages
/var/log/messages-20180516.gz:May 15 14:14:00 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.13@o2ib44)
/var/log/messages-20180516.gz:May 15 14:14:00 warble1 kernel: Lustre: Skipped 3 previous similar messages
/var/log/messages-20180516.gz:May 15 14:15:00 warble1 kernel: Lustre: 388293:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bdc913c800 x1598565458065440/t0(0) o35->c011e66b-3f26-636e-265c-666424088eb5@192.168.44.153@o2ib44:630/0 lens 512/696 e 24 to 0 dl 1526357705 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180516.gz:May 15 14:15:00 warble1 kernel: Lustre: 388293:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message
/var/log/messages-20180516.gz:May 15 14:24:01 warble1 kernel: Lustre: dagg-MDT0002: Client eea0e93b-08a6-28be-b9f1-682a3447515f (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180516.gz:May 15 14:24:01 warble1 kernel: Lustre: Skipped 4 previous similar messages
/var/log/messages-20180516.gz:May 15 14:24:01 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 192.168.44.13@o2ib44)
/var/log/messages-20180516.gz:May 15 14:24:01 warble1 kernel: Lustre: Skipped 4 previous similar messages
/var/log/messages-20180516.gz:May 15 14:28:30 warble2 kernel: Lustre: MGS: haven't heard from client 5f1350ab-d283-9b39-2371-adf1a74cffcc (at 192.168.44.132@o2ib44) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885e1726b400, cur 1526358510 expire 1526358360 last 1526358283
--
/var/log/messages-20180516.gz:May 15 14:57:24 warble2 kernel: LustreError: 137-5: dagg-MDT0001_UUID: not available for connect from 10.8.49.155@tcp201 (no target). If you are running an HA pair check that the target is mounted on the other server.
/var/log/messages-20180516.gz:May 15 14:57:24 warble2 kernel: LustreError: 137-5: dagg-MDT0002_UUID: not available for connect from 10.8.49.155@tcp201 (no target). If you are running an HA pair check that the target is mounted on the other server.
/var/log/messages-20180516.gz:May 15 14:57:24 warble2 kernel: LustreError: Skipped 5 previous similar messages
/var/log/messages-20180516.gz:May 15 15:02:24 warble2 kernel: LustreError: 137-5: dagg-MDT0000_UUID: not available for connect from 10.8.49.155@tcp201 (no target). If you are running an HA pair check that the target is mounted on the other server.
/var/log/messages-20180516.gz:May 15 15:02:24 warble2 kernel: LustreError: Skipped 7 previous similar messages
/var/log/messages-20180516.gz:May 15 15:03:38 warble2 kernel: Lustre: MGS: Connection restored to d01f1a11-cd19-ca51-e111-645e6312446f (at 192.168.44.13@o2ib44)
/var/log/messages-20180516.gz:May 15 15:03:38 warble1 kernel: LustreError: 137-5: images-MDT0000_UUID: not available for connect from 192.168.44.13@o2ib44 (no target). If you are running an HA pair check that the target is mounted on the other server.
/var/log/messages-20180516.gz:May 15 15:04:03 warble2 kernel: Lustre: images-MDT0000: Connection restored to d01f1a11-cd19-ca51-e111-645e6312446f (at 192.168.44.13@o2ib44)
/var/log/messages-20180516.gz:May 15 15:04:06 warble1 kernel: LustreError: 137-5: home-MDT0000_UUID: not available for connect from 192.168.44.13@o2ib44 (no target). If you are running an HA pair check that the target is mounted on the other server.
/var/log/messages-20180516.gz:May 15 15:04:06 warble1 kernel: LustreError: Skipped 1 previous similar message
/var/log/messages-20180516.gz:May 15 15:04:16 warble2 kernel: Lustre: MGS: haven't heard from client 2971a319-c13e-555d-845f-f5ec5c45a3c4 (at 192.168.44.13@o2ib44) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885e4abe9000, cur 1526360656 expire 1526360506 last 1526360429
/var/log/messages-20180516.gz:May 15 15:04:31 warble1 kernel: Lustre: dagg-MDT0000: Connection restored to (at 192.168.44.13@o2ib44)
/var/log/messages-20180516.gz:May 15 15:04:31 warble1 kernel: Lustre: Skipped 25 previous similar messages
/var/log/messages-20180516.gz:May 15 15:04:31 warble2 kernel: Lustre: apps-MDT0000: Connection restored to d01f1a11-cd19-ca51-e111-645e6312446f (at 192.168.44.13@o2ib44)
/var/log/messages-20180516.gz:May 15 15:04:31 warble2 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180516.gz:May 15 15:04:35 warble2 kernel: Lustre: images-MDT0000: haven't heard from client a9a289df-0ed7-393f-cc64-e70a81cc1e2e (at 192.168.44.13@o2ib44) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88aa86f32c00, cur 1526360675 expire 1526360525 last 1526360448
/var/log/messages-20180516.gz:May 15 15:05:19 warble1 kernel: Lustre: dagg-MDT0001: Client 22c84389-af1f-9970-0e9b-70c3a4861afd (at 10.8.49.155@tcp201) reconnecting
/var/log/messages-20180516.gz:May 15 15:05:19 warble1 kernel: LNetError: 86027:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.155@tcp201
/var/log/messages-20180516.gz:May 15 15:05:19 warble1 kernel: LNetError: 86027:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 19 previous similar messages
/var/log/messages-20180516.gz:May 15 15:05:19 warble1 kernel: Lustre: Skipped 19 previous similar messages
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: LNet: Service thread pid 390005 was inactive for 200.37s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: Pid: 390005, comm: mdt_rdpg01_058
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: #012Call Trace:
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel:
/var/log/messages-20180516.gz:May 15 15:09:49 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1526360989.390005
/var/log/messages-20180516.gz:May 15 15:12:49 warble2 kernel: LustreError: 137-5: dagg-MDT0001_UUID: not available for connect from 10.8.49.155@tcp201 (no target). If you are running an HA pair check that the target is mounted on the other server.
/var/log/messages-20180516.gz:May 15 15:12:49 warble2 kernel: LustreError: Skipped 25 previous similar messages
/var/log/messages-20180516.gz:May 15 15:14:54 warble1 kernel: Lustre: dagg-MDT0000: Connection restored to (at 10.8.49.155@tcp201)
/var/log/messages-20180516.gz:May 15 15:14:54 warble1 kernel: Lustre: Skipped 28 previous similar messages
/var/log/messages-20180516.gz:May 15 15:16:23 warble1 kernel: Lustre: 389913:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88b6de5c1800 x1600505085979072/t0(0) o35->70a6dfba-b342-0475-868f-78883dbeeb06@192.168.44.13@o2ib44:538/0 lens 512/696 e 24 to 0 dl 1526361388 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180516.gz:May 15 15:16:29 warble1 kernel: Lustre: dagg-MDT0001: Client 70a6dfba-b342-0475-868f-78883dbeeb06 (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180516.gz:May 15 15:16:29 warble1 kernel: Lustre: Skipped 27 previous similar messages
/var/log/messages-20180516.gz:May 15 15:16:34 warble1 kernel: LNetError: 389677:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.155@tcp201
/var/log/messages-20180516.gz:May 15 15:16:34 warble1 kernel: LNetError: 389677:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 27 previous similar messages
/var/log/messages-20180516.gz:May 15 15:17:12 warble1 kernel: Lustre: Failing over dagg-MDT0000
/var/log/messages-20180516.gz:May 15 15:17:12 warble1 kernel: LustreError: 11-0: dagg-MDT0000-osp-MDT0001: operation out_update to node 0@lo failed: rc = -19
/var/log/messages-20180516.gz:May 15 15:17:12 warble1 kernel: LustreError: Skipped 1 previous similar message
--
/var/log/messages-20180605.gz:Jun 4 17:05:47 transom1 bobMon: mem gmond data from warble1 is incomplete in a confusing way - restart its gmond?
/var/log/messages-20180605.gz:Jun 4 17:05:47 transom1 bobMon: mem gmond data from warble2 is incomplete in a confusing way - restart its gmond?
/var/log/messages-20180605.gz:Jun 4 17:05:47 transom1 bobMon: mem gmond data from warble1 is incomplete in a confusing way - restart its gmond?
/var/log/messages-20180605.gz:Jun 4 17:05:47 transom1 bobMon: mem gmond data from warble2 is incomplete in a confusing way - restart its gmond?
/var/log/messages-20180605.gz:Jun 4 17:05:47 transom1 bobMon: mem gmond data from warble1 is incomplete in a confusing way - restart its gmond?
/var/log/messages-20180605.gz:Jun 4 17:05:47 transom1 bobMon: mem gmond data from warble2 is incomplete in a confusing way - restart its gmond?
/var/log/messages-20180605.gz:Jun 4 17:05:47 transom1 bobMon: mem gmond data from warble1 is incomplete in a confusing way - restart its gmond?
/var/log/messages-20180605.gz:Jun 4 17:05:47 transom1 bobMon: mem gmond data from warble2 is incomplete in a confusing way - restart its gmond?
/var/log/messages-20180605.gz:Jun 4 17:05:47 transom1 bobMon: mem gmond data from warble1 is incomplete in a confusing way - restart its gmond?
/var/log/messages-20180605.gz:Jun 4 17:05:47 transom1 bobMon: mem gmond data from warble2 is incomplete in a confusing way - restart its gmond?
/var/log/messages-20180605.gz:Jun 4 17:05:47 transom1 bobMon: mem gmond data from warble1 is incomplete in a confusing way - restart its gmond?
/var/log/messages-20180605.gz:Jun 4 17:05:47 transom1 bobMon: mem gmond data from warble2 is incomplete in a confusing way - restart its gmond?
/var/log/messages-20180605.gz:Jun 4 17:05:47 transom1 bobMon: mem gmond data from warble1 is incomplete in a confusing way - restart its gmond?
/var/log/messages-20180605.gz:Jun 4 17:05:47 transom1 bobMon: mem gmond data from warble2 is incomplete in a confusing way - restart its gmond?
/var/log/messages-20180605.gz:Jun 4 17:05:47 transom1 bobMon: finished job 258273 mem gmond data from warble1 is incomplete in a confusing way - restart its gmond?
/var/log/messages-20180605.gz:Jun 4 17:05:47 transom1 bobMon: mem gmond data from warble2 is incomplete in a confusing way - restart its gmond?
/var/log/messages-20180605.gz:Jun 4 17:05:47 transom1 bobMon: mem gmond data from warble1 is incomplete in a confusing way - restart its gmond?
/var/log/messages-20180605.gz:Jun 4 17:05:47 transom1 bobMon: mem gmond data from warble2 is incomplete in a confusing way - restart its gmond?
/var/log/messages-20180605.gz:Jun 4 17:05:47 transom1 bobMon: mem gmond data from warble1 is incomplete in a confusing way - restart its gmond?
/var/log/messages-20180605.gz:Jun 4 18:27:00 warble2 kernel: LustreError: 19934:0:(llog_cat.c:795:llog_cat_cancel_records()) dagg-MDT0002-osp-MDT0000: fail to cancel 0 of 1 llog-records: rc = -116
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: LNet: Service thread pid 344716 was inactive for 200.53s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: Pid: 344716, comm: mdt_rdpg01_047
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528101011.344716
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: LNet: Service thread pid 224334 was inactive for 200.29s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: Pid: 224334, comm: mdt_rdpg00_037
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180605.gz:Jun 4 18:30:11 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180605.gz:Jun 4 18:30:12 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:12 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:30:12 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:30:12 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:30:12 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:12 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:12 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:12 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 18:30:12 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 18:30:12 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:12 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:12 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 18:30:12 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 18:30:12 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 18:30:12 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 18:30:12 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 18:30:12 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528101012.224334
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: LNet: Service thread pid 230929 was inactive for 200.28s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: Pid: 230929, comm: mdt_rdpg01_011
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 18:30:19 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528101019.230929
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: LNet: Service thread pid 25256 was inactive for 212.40s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: Pid: 25256, comm: mdt_rdpg00_014
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 18:31:45 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528101105.25256
/var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: LNet: Service thread pid 304542 was inactive for 212.43s.
The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: Pid: 304542, comm: mdt_rdpg00_053 /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: #012Call Trace: /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: [] schedule+0x29/0x70 /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd] /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: [] ? 
default_wake_function+0x12/0x20 /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528101111.304542 /var/log/messages-20180605.gz:Jun 4 18:31:51 warble2 kernel: LNet: Service thread pid 105096 was inactive for 212.70s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. /var/log/messages-20180605.gz:Jun 4 18:33:08 warble2 kernel: LNet: Service thread pid 45906 was inactive for 262.51s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. /var/log/messages-20180605.gz:Jun 4 18:33:08 warble2 kernel: LNet: Skipped 1 previous similar message /var/log/messages-20180605.gz:Jun 4 18:33:08 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528101188.45906 /var/log/messages-20180605.gz:Jun 4 18:34:12 warble2 kernel: LNet: Service thread pid 19529 was inactive for 312.37s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. /var/log/messages-20180605.gz:Jun 4 18:34:12 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528101252.19529 /var/log/messages-20180605.gz:Jun 4 18:34:19 warble2 kernel: LNet: Service thread pid 167460 was inactive for 312.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
/var/log/messages-20180605.gz:Jun 4 18:34:19 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528101259.167460
/var/log/messages-20180605.gz:Jun 4 18:34:53 warble2 kernel: LNet: Service thread pid 338895 was inactive for 336.76s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
/var/log/messages-20180605.gz:Jun 4 18:34:53 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528101293.338895
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: LNet: Service thread pid 344186 was inactive for 362.48s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: Pid: 344186, comm: mdt_rdpg01_030
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 18:35:24 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528101324.344186
/var/log/messages-20180605.gz:Jun 4 18:35:29 warble2 kernel: Pid: 332339, comm: mdt_rdpg01_025
/var/log/messages-20180605.gz:Jun 4 18:35:29 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 18:35:29 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 4 18:35:29 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 18:35:29 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:35:29 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 18:35:29 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:35:29 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180605.gz:Jun 4 18:35:29 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:35:29 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180605.gz:Jun 4 18:35:29 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180605.gz:Jun 4 18:35:29 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
--
/var/log/messages-20180605.gz:Jun 4 18:35:29 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:35:29 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:35:29 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:35:29 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:35:29 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 18:35:30 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 18:35:30 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:35:30 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:35:30 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 18:35:30 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 18:35:30 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 18:35:30 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 18:35:30 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 18:35:30 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528101330.332339
/var/log/messages-20180605.gz:Jun 4 18:36:45 warble2 kernel: Lustre: 130426:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88b69a57ef00 x1601995970831536/t0(0) o35->e5a97d67-ed84-7335-830f-861443a6ce18@192.168.44.13@o2ib44:285/0 lens 512/696 e 24 to 0 dl 1528101410 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180605.gz:Jun 4 18:36:46 warble2 kernel: Lustre: 423432:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885b8f382700 x1601994364774448/t0(0) o35->d5c7ff2e-8c29-354a-1a03-772bf8dd100f@192.168.44.14@o2ib44:286/0 lens 512/696 e 24 to 0 dl 1528101411 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180605.gz:Jun 4 18:36:52 warble2 kernel: Lustre: dagg-MDT0000: Client e5a97d67-ed84-7335-830f-861443a6ce18 (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180605.gz:Jun 4 18:36:52 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to 70a6dfba-b342-0475-868f-78883dbeeb06 (at 192.168.44.13@o2ib44)
/var/log/messages-20180605.gz:Jun 4 18:36:52 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to d455300a-7f35-d998-af38-ca4bbf71bc7c (at 192.168.44.14@o2ib44)
/var/log/messages-20180605.gz:Jun 4 18:36:53 warble2 kernel: Lustre: 34555:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88b6429ca100 x1601995970842256/t0(0) o35->e5a97d67-ed84-7335-830f-861443a6ce18@192.168.44.13@o2ib44:293/0 lens 512/696 e 23 to 0 dl 1528101418 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: LNet: Service thread pid 130428 was inactive for 436.88s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: LNet: Skipped 1 previous similar message
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: Pid: 130428, comm: mdt_rdpg01_072
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 18:37:26 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528101446.130428
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: LNet: Service thread pid 341541 was inactive for 462.72s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: Pid: 341541, comm: mdt_rdpg01_046
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 18:38:06 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528101486.341541
/var/log/messages-20180605.gz:Jun 4 18:38:08 warble2 kernel: Lustre: 304544:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885b8f380300 x1601994365143360/t0(0) o35->d5c7ff2e-8c29-354a-1a03-772bf8dd100f@192.168.44.14@o2ib44:368/0 lens 512/696 e 5 to 0 dl 1528101493 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180605.gz:Jun 4 18:38:13 warble2 kernel: Lustre: 330104:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885df1cfc850 x1601994365144240/t0(0) o35->d5c7ff2e-8c29-354a-1a03-772bf8dd100f@192.168.44.14@o2ib44:373/0 lens 512/696 e 5 to 0 dl 1528101498 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180605.gz:Jun 4 18:38:41 warble2 kernel: Lustre: 330104:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885b59de5a00 x1601994365509360/t0(0) o35->d5c7ff2e-8c29-354a-1a03-772bf8dd100f@192.168.44.14@o2ib44:401/0 lens 512/696 e 4 to 0 dl 1528101526 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180605.gz:Jun 4 18:38:41 warble2 kernel: Lustre: 330104:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 2 previous similar messages
/var/log/messages-20180605.gz:Jun 4 18:39:02 warble2 kernel: Lustre: 167767:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885b8f383600 x1601994365964576/t0(0) o35->d5c7ff2e-8c29-354a-1a03-772bf8dd100f@192.168.44.14@o2ib44:422/0 lens 512/696 e 3 to 0 dl 1528101547 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180605.gz:Jun 4 18:39:02 warble2 kernel: Lustre: 167767:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message
/var/log/messages-20180605.gz:Jun 4 18:39:17 warble2 kernel: Lustre: dagg-MDT0000: Client fd3c9f49-8e3a-791d-866b-14bb5dafa2e2 (at 192.168.44.191@o2ib44) reconnecting
/var/log/messages-20180605.gz:Jun 4 18:39:17 warble2 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180605.gz:Jun 4 18:39:17 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to 3c8c99b3-7287-ae2a-4733-2c8d2fbfae1f (at 192.168.44.191@o2ib44)
/var/log/messages-20180605.gz:Jun 4 18:40:04 warble2 kernel: Lustre: 130426:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88b6a1173c00 x1601995971155440/t0(0) o35->e5a97d67-ed84-7335-830f-861443a6ce18@192.168.44.13@o2ib44:484/0 lens 512/696 e 2 to 0 dl 1528101609 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180605.gz:Jun 4 18:40:04 warble2 kernel: Lustre: 130426:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 3 previous similar messages
/var/log/messages-20180605.gz:Jun 4 18:46:53 warble2 kernel: Lustre: dagg-MDT0000: Client e5a97d67-ed84-7335-830f-861443a6ce18 (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180605.gz:Jun 4 18:46:53 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to 70a6dfba-b342-0475-868f-78883dbeeb06 (at 192.168.44.13@o2ib44)
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: LNet: Service thread pid 40492 was inactive for 200.40s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: Pid: 40492, comm: mdt00_017
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] schedule_timeout+0x174/0x2c0
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] ? process_timeout+0x0/0x10
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] ? ldlm_resource_add_lock+0x6a/0x1b0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] ldlm_cli_enqueue_fini+0x93b/0xdc0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] ldlm_cli_enqueue+0x6c2/0x810 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] ? mdt_remote_blocking_ast+0x0/0x590 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] osp_md_object_lock+0x172/0x2e0 [osp]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: LNet: Service thread pid 27208 was inactive for 200.70s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: Pid: 27208, comm: mdt00_065
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] lod_object_lock+0xf3/0x950 [lod]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] ? htable_lookup+0xa9/0x180 [obdclass]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] mdd_object_lock+0x3e/0xe0 [mdd]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] mdt_remote_object_lock+0x1e5/0x710 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x36a/0x860 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] ? strlcpy+0x42/0x60
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] mdt_reint_rename+0x13/0x20 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] mdt_reint_rec+0x83/0x210 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel:
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528102028.40492
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] mdd_rename+0x4d1/0x14a0 [mdd]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] mdt_reint_rename_internal.isra.36+0x166a/0x20c0 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x19b/0x860 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] mdt_reint_rename+0x13/0x20 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] mdt_reint_rec+0x83/0x210 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 18:47:08 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528102028.27208
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: LNet: Service thread pid 30330 was inactive for 200.15s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: Pid: 30330, comm: mdt00_008
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] schedule_timeout+0x174/0x2c0
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] ? process_timeout+0x0/0x10
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] ? ldlm_resource_add_lock+0x6a/0x1b0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] ldlm_cli_enqueue_fini+0x93b/0xdc0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass]
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] ldlm_cli_enqueue+0x6c2/0x810 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] ? mdt_remote_blocking_ast+0x0/0x590 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] osp_md_object_lock+0x172/0x2e0 [osp]
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] lod_object_lock+0xf3/0x950 [lod]
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] ? htable_lookup+0xa9/0x180 [obdclass]
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] mdd_object_lock+0x3e/0xe0 [mdd]
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] mdt_remote_object_lock+0x1e5/0x710 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x36a/0x860 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] ? strlcpy+0x42/0x60
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] mdt_reint_rename+0x13/0x20 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] mdt_reint_rec+0x83/0x210 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel:
/var/log/messages-20180605.gz:Jun 4 18:47:17 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528102037.30330
/var/log/messages-20180605.gz:Jun 4 18:48:47 warble1 kernel: Lustre: dagg-MDT0000-osp-MDT0002: Connection to dagg-MDT0000 (at 192.168.44.22@o2ib44) was lost; in progress operations using this service will wait for recovery to complete
/var/log/messages-20180605.gz:Jun 4 18:48:47 warble1 kernel: Lustre: Skipped 2 previous similar messages
/var/log/messages-20180605.gz:Jun 4 18:48:47 warble1 kernel: LustreError: 40492:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528101827, 300s ago), entering recovery for dagg-MDT0000_UUID@192.168.44.22@o2ib44 ns: dagg-MDT0000-osp-MDT0002 lock: ffff885e7bd22e00/0x4d56b892023b8260 lrc: 4/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 3 type: IBT flags: 0x1000001000000 nid: local remote: 0x68dbe0b12db8c674 expref: -99 pid: 40492 timeout: 0 lvb_type: 0
/var/log/messages-20180605.gz:Jun 4 18:48:47 warble2 kernel: Lustre: dagg-MDT0000: Received new LWP connection from 192.168.44.21@o2ib44, removing former export from same NID
/var/log/messages-20180605.gz:Jun 4 18:48:47 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to 192.168.44.21@o2ib44 (at 192.168.44.21@o2ib44)
/var/log/messages-20180605.gz:Jun 4 18:48:47 warble2 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180605.gz:Jun 4 18:48:47 warble1 kernel: Lustre: dagg-MDT0000-osp-MDT0002: Connection restored to 192.168.44.22@o2ib44 (at 192.168.44.22@o2ib44)
/var/log/messages-20180605.gz:Jun 4 18:48:57 warble1 kernel: LustreError: 30330:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528101837, 300s ago), entering recovery for dagg-MDT0000_UUID@192.168.44.22@o2ib44 ns: dagg-MDT0000-osp-MDT0002 lock: ffff885da1e7f600/0x4d56b8920242572c lrc: 4/0,1 mode: --/EX res:
[0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 3 type: IBT flags: 0x1000001000000 nid: local remote: 0x68dbe0b12dbfa764 expref: -99 pid: 30330 timeout: 0 lvb_type: 0 /var/log/messages-20180605.gz:Jun 4 18:49:22 warble2 kernel: Lustre: 330104:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff885b62853300 x1601994368789008/t0(0) o35->d5c7ff2e-8c29-354a-1a03-772bf8dd100f@192.168.44.14@o2ib44:287/0 lens 512/696 e 0 to 0 dl 1528102167 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180605.gz:Jun 4 18:49:22 warble2 kernel: Lustre: 330104:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: LNet: Service thread pid 53272 was inactive for 200.51s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: Pid: 53272, comm: mdt01_087 /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: #012Call Trace: /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] schedule+0x29/0x70 /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] ? 
top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] mdd_create+0x817/0x1320 [mdd] /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] mdt_create+0x849/0xbb0 [mdt] /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] ? ldlm_resource_putref+0x2ae/0x520 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] ? lprocfs_stats_lock+0x24/0xd0 [obdclass] /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] mdt_reint_create+0x16b/0x350 [mdt] /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] mdt_reint_rec+0x83/0x210 [mdt] /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt] /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] mdt_reint+0x67/0x140 [mdt] /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: /var/log/messages-20180605.gz:Jun 4 18:50:13 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528102213.53272 /var/log/messages-20180605.gz:Jun 4 18:53:42 warble2 kernel: Lustre: 27045:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885b61f42d00 x1601955952822576/t0(0) o36->1a074ea6-b336-018a-5129-9f8e23e8eb32@192.168.44.121@o2ib44:547/0 lens 776/3128 e 24 to 0 dl 1528102427 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180605.gz:Jun 4 18:53:42 warble1 kernel: Lustre: 40635:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885e9ed4c200 x1601955977863808/t0(0) o36->5197992c-d116-13d7-6f95-a4fd7c626787@192.168.44.122@o2ib44:547/0 lens 776/3128 e 24 to 0 dl 1528102427 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180605.gz:Jun 4 18:53:48 warble2 kernel: Lustre: dagg-MDT0000: Client 1a074ea6-b336-018a-5129-9f8e23e8eb32 (at 192.168.44.121@o2ib44) reconnecting /var/log/messages-20180605.gz:Jun 4 18:53:48 warble2 kernel: Lustre: Skipped 1 previous similar message /var/log/messages-20180605.gz:Jun 4 18:53:48 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to 2f9d4c44-1faa-24ee-636b-f7939762e4dd (at 192.168.44.121@o2ib44) /var/log/messages-20180605.gz:Jun 4 18:53:48 warble1 kernel: Lustre: dagg-MDT0002: Client 5197992c-d116-13d7-6f95-a4fd7c626787 (at 192.168.44.122@o2ib44) reconnecting /var/log/messages-20180605.gz:Jun 4 18:53:48 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 11e12418-6971-d874-941e-6f261d9ba260 (at 192.168.44.122@o2ib44) /var/log/messages-20180605.gz:Jun 4 18:53:52 warble1 kernel: Lustre: 26792:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88565f63e000 x1601912630233680/t0(0) 
o36->4daddac5-b65d-0372-67b2-20b3b6329c08@192.168.44.174@o2ib44:557/0 lens 776/3128 e 23 to 0 dl 1528102437 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180605.gz:Jun 4 18:53:58 warble1 kernel: Lustre: dagg-MDT0002: Client 4daddac5-b65d-0372-67b2-20b3b6329c08 (at 192.168.44.174@o2ib44) reconnecting /var/log/messages-20180605.gz:Jun 4 18:53:58 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to 9fd50f81-6d8d-dba0-3a04-ea67c799a365 (at 192.168.44.174@o2ib44) /var/log/messages-20180605.gz:Jun 4 18:56:52 warble2 kernel: LNet: Service thread pid 341947 was inactive for 1200.12s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180605.gz:Jun 4 18:56:52 warble2 kernel: Pid: 341947, comm: mdt_rdpg00_058 /var/log/messages-20180605.gz:Jun 4 18:56:52 warble2 kernel: #012Call Trace: /var/log/messages-20180605.gz:Jun 4 18:56:52 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180605.gz:Jun 4 18:56:52 warble2 kernel: [] schedule+0x29/0x70 /var/log/messages-20180605.gz:Jun 4 18:56:52 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:56:52 warble2 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180605.gz:Jun 4 18:56:52 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:56:53 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180605.gz:Jun 4 18:56:53 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:56:53 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180605.gz:Jun 4 18:56:53 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd] /var/log/messages-20180605.gz:Jun 4 18:56:53 warble2 kernel: [] ? 
lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:56:53 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180605.gz:Jun 4 18:56:53 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180605.gz:Jun 4 18:56:53 warble2 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180605.gz:Jun 4 18:56:53 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:56:53 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:56:53 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:56:53 warble2 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180605.gz:Jun 4 18:56:53 warble2 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180605.gz:Jun 4 18:56:53 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:56:53 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 18:56:53 warble2 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180605.gz:Jun 4 18:56:53 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 18:56:53 warble2 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180605.gz:Jun 4 18:56:53 warble2 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 18:56:53 warble2 kernel: /var/log/messages-20180605.gz:Jun 4 18:56:53 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528102613.341947 /var/log/messages-20180605.gz:Jun 4 18:56:54 warble2 kernel: Lustre: dagg-MDT0000: Client e5a97d67-ed84-7335-830f-861443a6ce18 (at 192.168.44.13@o2ib44) reconnecting /var/log/messages-20180605.gz:Jun 4 18:56:54 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to 70a6dfba-b342-0475-868f-78883dbeeb06 (at 192.168.44.13@o2ib44) /var/log/messages-20180605.gz:Jun 4 18:57:33 warble2 kernel: Lustre: dagg-MDT0000: Client c7c406ea-6ebe-a8cb-c838-f6840e69ac20 (at 192.168.44.187@o2ib44) reconnecting /var/log/messages-20180605.gz:Jun 4 18:57:33 warble2 kernel: Lustre: Skipped 1 previous similar message /var/log/messages-20180605.gz:Jun 4 18:57:33 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to 72dd4154-bfd5-e460-0ae0-36e7cfa3294c (at 192.168.44.187@o2ib44) /var/log/messages-20180605.gz:Jun 4 18:57:33 warble2 kernel: Lustre: Skipped 1 previous similar message /var/log/messages-20180605.gz:Jun 4 18:59:23 warble2 kernel: Lustre: 423433:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff885b64d42d00 x1601994368982768/t0(0) o35->d5c7ff2e-8c29-354a-1a03-772bf8dd100f@192.168.44.14@o2ib44:133/0 lens 512/696 e 0 to 0 dl 1528102768 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180605.gz:Jun 4 18:59:23 warble2 kernel: Lustre: 423433:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 2 previous similar messages /var/log/messages-20180605.gz:Jun 4 19:00:14 warble2 kernel: Pid: 335097, comm: mdt_rdpg01_045 /var/log/messages-20180605.gz:Jun 4 19:00:14 warble2 kernel: #012Call Trace: /var/log/messages-20180605.gz:Jun 4 19:00:14 warble2 kernel: [] ? 
recalc_sigpending+0x1b/0x50 /var/log/messages-20180605.gz:Jun 4 19:00:14 warble2 kernel: [] schedule+0x29/0x70 -- /var/log/messages-20180605.gz:Jun 4 19:06:33 warble1 kernel: Lustre: server umount dagg-MDT0002 complete /var/log/messages-20180605.gz:Jun 4 19:06:33 warble1 Lustre(warble1-dagg-MDT2)[293502]: INFO: warble1-dagg-MDT2-pool/MDT2 unmounted successfully /var/log/messages-20180605.gz:Jun 4 19:06:33 warble2 stonith-ng[14418]: notice: On loss of CCM Quorum: Ignore /var/log/messages-20180605.gz:Jun 4 19:06:33 warble1 stonith-ng[14945]: notice: On loss of CCM Quorum: Ignore /var/log/messages-20180605.gz:Jun 4 19:06:33 warble2 stonith-ng[14418]: notice: On loss of CCM Quorum: Ignore /var/log/messages-20180605.gz:Jun 4 19:06:33 warble1 stonith-ng[14945]: notice: On loss of CCM Quorum: Ignore /var/log/messages-20180605.gz:Jun 4 19:06:47 warble2 kernel: LustreError: 137-5: images-MDT0000_UUID: not available for connect from 192.168.44.157@o2ib44 (no target). If you are running an HA pair check that the target is mounted on the other server. 
/var/log/messages-20180605.gz:Jun 4 19:06:47 warble2 kernel: LustreError: Skipped 2 previous similar messages /var/log/messages-20180605.gz:Jun 4 19:06:48 warble1 stonith-ng[14945]: notice: On loss of CCM Quorum: Ignore /var/log/messages-20180605.gz:Jun 4 19:06:48 warble2 stonith-ng[14418]: notice: On loss of CCM Quorum: Ignore /var/log/messages-20180605.gz:Jun 4 19:06:49 warble2 kernel: Lustre: 18379:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1528103202/real 1528103202] req@ffff88b6876a1800 x1600515570296912/t0(0) o400->dagg-MDT0001-osp-MDT0000@192.168.44.21@o2ib44:24/4 lens 224/224 e 0 to 1 dl 1528103209 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 /var/log/messages-20180605.gz:Jun 4 19:06:49 warble2 kernel: Lustre: dagg-MDT0001-osp-MDT0000: Connection to dagg-MDT0001 (at 192.168.44.21@o2ib44) was lost; in progress operations using this service will wait for recovery to complete /var/log/messages-20180605.gz:Jun 4 19:06:49 warble2 kernel: LustreError: 166-1: MGC192.168.44.21@o2ib44: Connection to MGS (at 192.168.44.21@o2ib44) was lost; in progress operations using this service will fail /var/log/messages-20180605.gz:Jun 4 19:06:50 warble2 lrmd[14419]: notice: warble1-zfs-dagg-MDT1-pool_start_0:130097:stderr [ cannot open 'warble1-dagg-MDT1-pool': no such pool ] /var/log/messages-20180605.gz:Jun 4 19:06:50 warble2 stonith-ng[14418]: notice: On loss of CCM Quorum: Ignore /var/log/messages-20180605.gz:Jun 4 19:06:50 warble1 stonith-ng[14945]: notice: On loss of CCM Quorum: Ignore /var/log/messages-20180605.gz:Jun 4 19:06:50 warble2 Lustre(warble1-dagg-MDT1)[131083]: INFO: Starting to mount warble1-dagg-MDT1-pool/MDT1 /var/log/messages-20180605.gz:Jun 4 19:06:50 warble2 stonith-ng[14418]: notice: On loss of CCM Quorum: Ignore /var/log/messages-20180605.gz:Jun 4 19:06:50 warble1 stonith-ng[14945]: notice: On loss of CCM Quorum: Ignore /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: Lustre: 
18370:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1528103209/real 1528103209] req@ffff88b6c0019b00 x1600515570297520/t0(0) o250->MGC192.168.44.21@o2ib44@192.168.44.21@o2ib44:26/25 lens 520/544 e 0 to 1 dl 1528103215 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: LNet: Service thread pid 330107 was inactive for 1201.24s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: LNet: Skipped 2 previous similar messages /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: Pid: 330107, comm: mdt_rdpg00_029 /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: #012Call Trace: /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] schedule+0x29/0x70 /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd] /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] ? 
lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528103215.330107 /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: Lustre: 18370:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 2 previous similar messages /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: Pid: 232742, comm: mdt_rdpg01_012 /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: #012Call Trace: /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] ? 
recalc_sigpending+0x1b/0x50 /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] schedule+0x29/0x70 /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 19:06:55 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] -- /var/log/messages-20180605.gz:Jun 4 19:10:31 warble2 kernel: Lustre: 136544:0:(ldlm_lib.c:1773:extend_recovery_timer()) Skipped 2 previous similar messages /var/log/messages-20180605.gz:Jun 4 19:11:31 warble2 kernel: Lustre: 136544:0:(ldlm_lib.c:1773:extend_recovery_timer()) dagg-MDT0001: extended recovery timer reaching hard limit: 900, extend: 1 /var/log/messages-20180605.gz:Jun 4 19:11:31 warble2 kernel: Lustre: 136544:0:(ldlm_lib.c:1773:extend_recovery_timer()) Skipped 2 previous similar messages /var/log/messages-20180605.gz:Jun 4 19:11:55 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to 70a6dfba-b342-0475-868f-78883dbeeb06 (at 192.168.44.13@o2ib44) /var/log/messages-20180605.gz:Jun 4 19:11:55 warble2 kernel: Lustre: Skipped 556 previous similar messages /var/log/messages-20180605.gz:Jun 4 19:12:31 warble2 kernel: Lustre: 136544:0:(ldlm_lib.c:1773:extend_recovery_timer()) dagg-MDT0001: extended recovery timer reaching hard limit: 900, extend: 1 /var/log/messages-20180605.gz:Jun 4 19:12:31 warble2 kernel: Lustre: 136544:0:(ldlm_lib.c:1773:extend_recovery_timer()) Skipped 2 previous similar messages /var/log/messages-20180605.gz:Jun 4 19:13:31 warble2 kernel: Lustre: 136544:0:(ldlm_lib.c:1773:extend_recovery_timer()) dagg-MDT0001: extended recovery 
timer reaching hard limit: 900, extend: 1 /var/log/messages-20180605.gz:Jun 4 19:13:31 warble2 kernel: Lustre: 136544:0:(ldlm_lib.c:1773:extend_recovery_timer()) Skipped 2 previous similar messages /var/log/messages-20180605.gz:Jun 4 19:13:50 warble2 kernel: Lustre: dagg-MDT0000: Client 1a074ea6-b336-018a-5129-9f8e23e8eb32 (at 192.168.44.121@o2ib44) reconnecting /var/log/messages-20180605.gz:Jun 4 19:13:50 warble2 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180605.gz:Jun 4 19:14:31 warble2 kernel: Lustre: 136544:0:(ldlm_lib.c:1773:extend_recovery_timer()) dagg-MDT0001: extended recovery timer reaching hard limit: 900, extend: 1 /var/log/messages-20180605.gz:Jun 4 19:14:31 warble2 kernel: Lustre: 136544:0:(ldlm_lib.c:1773:extend_recovery_timer()) Skipped 2 previous similar messages /var/log/messages-20180605.gz:Jun 4 19:15:31 warble2 kernel: Lustre: 136544:0:(ldlm_lib.c:1773:extend_recovery_timer()) dagg-MDT0001: extended recovery timer reaching hard limit: 900, extend: 1 /var/log/messages-20180605.gz:Jun 4 19:15:31 warble2 kernel: Lustre: 136544:0:(ldlm_lib.c:1773:extend_recovery_timer()) Skipped 2 previous similar messages /var/log/messages-20180605.gz:Jun 4 19:16:28 warble2 kernel: Lustre: dagg-MDT0002: recovery is timed out, evict stale exports /var/log/messages-20180605.gz:Jun 4 19:16:28 warble2 kernel: Lustre: dagg-MDT0002: disconnecting 1 stale clients /var/log/messages-20180605.gz:Jun 4 19:16:28 warble2 kernel: Lustre: dagg-MDT0002: Recovery over after 8:57, of 129 clients 128 recovered and 1 was evicted. 
/var/log/messages-20180605.gz:Jun 4 19:16:31 warble2 kernel: Lustre: 136544:0:(ldlm_lib.c:1773:extend_recovery_timer()) dagg-MDT0001: extended recovery timer reaching hard limit: 900, extend: 1 /var/log/messages-20180605.gz:Jun 4 19:16:31 warble2 kernel: Lustre: 136544:0:(ldlm_lib.c:1773:extend_recovery_timer()) Skipped 2 previous similar messages /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: LNet: Service thread pid 330110 was inactive for 1202.32s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: LNet: Skipped 3 previous similar messages /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: Pid: 330110, comm: mdt_rdpg00_030 /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: #012Call Trace: /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: [] schedule+0x29/0x70 /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd] /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: [] ? 
lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: [] ? keys_fini+0xb1/0x1d0 [obdclass] /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: [] ? ptlrpc_at_remove_timed+0x4e/0xb0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: [] ? ptlrpc_server_drop_request+0xa7/0x6c0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: /var/log/messages-20180605.gz:Jun 4 19:16:57 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528103817.330110 /var/log/messages-20180605.gz:Jun 4 19:18:31 warble2 kernel: Lustre: 136544:0:(ldlm_lib.c:1773:extend_recovery_timer()) dagg-MDT0001: extended recovery timer reaching hard limit: 900, extend: 1 /var/log/messages-20180605.gz:Jun 4 19:18:31 warble2 kernel: Lustre: 136544:0:(ldlm_lib.c:1773:extend_recovery_timer()) Skipped 5 previous similar messages /var/log/messages-20180605.gz:Jun 4 19:19:04 warble2 kernel: INFO: task mdt00_052:27185 blocked for more than 120 seconds. /var/log/messages-20180605.gz:Jun 4 19:19:04 warble2 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. /var/log/messages-20180605.gz:Jun 4 19:19:04 warble2 kernel: mdt00_052 D ffff88bd7d1cdee0 0 27185 2 0x00000000 /var/log/messages-20180605.gz:Jun 4 19:19:04 warble2 kernel: Call Trace: /var/log/messages-20180605.gz:Jun 4 19:19:04 warble2 kernel: [] schedule+0x29/0x70 /var/log/messages-20180605.gz:Jun 4 19:19:04 warble2 kernel: [] rwsem_down_read_failed+0x10d/0x1a0 /var/log/messages-20180605.gz:Jun 4 19:19:04 warble2 kernel: [] call_rwsem_down_read_failed+0x18/0x30 /var/log/messages-20180605.gz:Jun 4 19:19:04 warble2 kernel: [] ? 
mdd_readlink+0x2a0/0x2a0 [mdd] -- /var/log/messages-20180605.gz:Jun 4 20:52:46 warble1 kernel: hfi1 0000:d8:00.0: hfi1_0: Switching to NO_DMA_RTAIL /var/log/messages-20180605.gz:Jun 4 20:52:48 warble1 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): ib0: link becomes ready /var/log/messages-20180605.gz:Jun 4 20:52:51 warble1 ntpd[14477]: Listen normally on 11 ib0 fe80::211:7501:171:de62 UDP 123 /var/log/messages-20180605.gz:Jun 4 20:52:57 warble2 kernel: LustreError: 69220:0:(tgt_handler.c:509:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff88be016e4500 x1602337637470480/t0(0) o601->dagg-MDT0000-lwp-MDT0001_UUID@0@lo:222/0 lens 336/0 e 0 to 0 dl 1528109652 ref 1 fl Interpret:/0/ffffffff rc 0/-1 /var/log/messages-20180605.gz:Jun 4 20:52:57 warble2 kernel: LustreError: 11-0: dagg-MDT0000-lwp-MDT0001: operation quota_acquire to node 0@lo failed: rc = -11 /var/log/messages-20180605.gz:Jun 4 20:52:57 warble2 kernel: LustreError: 69220:0:(tgt_handler.c:509:tgt_filter_recovery_request()) Skipped 1 previous similar message /var/log/messages-20180605.gz:Jun 4 20:53:57 warble2 kernel: LustreError: 69218:0:(tgt_handler.c:509:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff88bdfdd98900 x1602337637476800/t0(0) o601->dagg-MDT0000-lwp-MDT0001_UUID@0@lo:375/0 lens 336/0 e 0 to 0 dl 1528109805 ref 1 fl Interpret:/0/ffffffff rc 0/-1 /var/log/messages-20180605.gz:Jun 4 20:53:57 warble2 kernel: LustreError: 11-0: dagg-MDT0000-lwp-MDT0001: operation quota_acquire to node 0@lo failed: rc = -11 /var/log/messages-20180605.gz:Jun 4 20:53:57 warble2 kernel: LustreError: Skipped 1 previous similar message /var/log/messages-20180605.gz:Jun 4 20:53:57 warble2 kernel: LustreError: 69218:0:(tgt_handler.c:509:tgt_filter_recovery_request()) Skipped 1 previous similar message /var/log/messages-20180605.gz:Jun 4 20:54:01 warble2 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.159@tcp201, removing former export from same NID 
/var/log/messages-20180605.gz:Jun 4 20:54:01 warble2 kernel: Lustre: Skipped 4 previous similar messages
/var/log/messages-20180605.gz:Jun 4 20:54:26 warble2 kernel: Lustre: dagg-MDT0001: Client 90b226ee-19b9-fc21-a879-dea9f207d8b9 (at 10.8.49.159@tcp201) reconnecting
/var/log/messages-20180605.gz:Jun 4 20:54:27 warble2 kernel: Lustre: dagg-MDT0001: Client 07a7a787-ad21-2b1a-5302-c80ca70de269 (at 10.8.49.158@tcp201) reconnecting
/var/log/messages-20180605.gz:Jun 4 20:54:40 warble2 kernel: Lustre: dagg-MDT0001: Client 2f97c2f1-1474-b879-139d-cd5539f51fc8 (at 10.8.49.176@tcp201) reconnecting
/var/log/messages-20180605.gz:Jun 4 20:54:40 warble2 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180605.gz:Jun 4 20:54:57 warble2 kernel: LustreError: 69220:0:(tgt_handler.c:509:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff88be04713000 x1602337637482560/t0(0) o601->dagg-MDT0000-lwp-MDT0001_UUID@0@lo:498/0 lens 336/0 e 0 to 0 dl 1528109928 ref 1 fl Interpret:/0/ffffffff rc 0/-1
/var/log/messages-20180605.gz:Jun 4 20:54:57 warble2 kernel: LustreError: 11-0: dagg-MDT0000-lwp-MDT0001: operation quota_acquire to node 0@lo failed: rc = -11
/var/log/messages-20180605.gz:Jun 4 20:54:57 warble2 kernel: LustreError: Skipped 1 previous similar message
/var/log/messages-20180605.gz:Jun 4 20:54:57 warble2 kernel: LustreError: 69220:0:(tgt_handler.c:509:tgt_filter_recovery_request()) Skipped 1 previous similar message
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: LNet: Service thread pid 68907 was inactive for 200.63s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: Pid: 68907, comm: mdt00_034
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] schedule_timeout+0x174/0x2c0
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] ? process_timeout+0x0/0x10
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] ? ptlrpc_interrupted_set+0x0/0x110 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] ptlrpc_set_wait+0x4c0/0x920 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] ptlrpc_queue_wait+0x7d/0x220 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] osp_remote_sync+0xd3/0x200 [osp]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] osp_attr_get+0x45d/0x6f0 [osp]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] osp_object_init+0x155/0x2c0 [osp]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] lu_object_alloc+0xe5/0x320 [obdclass]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] lu_object_find_at+0x180/0x2b0 [obdclass]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] lu_object_find+0x16/0x20 [obdclass]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] mdt_object_find+0x4b/0x170 [mdt]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] mdt_reint_open+0x1d43/0x31a0 [mdt]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] ? upcall_cache_get_entry+0x211/0x8f0 [obdclass]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] ? mdt_ucred+0x15/0x20 [mdt]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] ? mdt_root_squash+0x21/0x430 [mdt]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] ? ucred_set_jobid+0x53/0x70 [mdt]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] mdt_reint_rec+0x83/0x210 [mdt]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] mdt_intent_reint+0x162/0x430 [mdt]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] mdt_intent_policy+0x441/0xc70 [mdt]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] ? ldlm_resource_get+0x618/0xa60 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 20:55:15 warble2 kernel: [] ret_from_fork+0x5d/0xb0
--
/var/log/messages-20180605.gz:Jun 4 20:55:16 warble2 kernel: [] mdt_reint_rec+0x83/0x210 [mdt]
/var/log/messages-20180605.gz:Jun 4 20:55:16 warble2 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180605.gz:Jun 4 20:55:16 warble2 kernel: [] mdt_intent_reint+0x162/0x430 [mdt]
/var/log/messages-20180605.gz:Jun 4 20:55:16 warble2 kernel: [] mdt_intent_policy+0x441/0xc70 [mdt]
/var/log/messages-20180605.gz:Jun 4 20:55:16 warble2 kernel: [] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 20:55:16 warble2 kernel: [] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 20:55:16 warble2 kernel: [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 20:55:16 warble2 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 20:55:16 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 20:55:16 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 20:55:16 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 20:55:16 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 20:55:16 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 20:55:16 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 20:55:16 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 20:55:16 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 20:55:16 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 20:55:16 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 20:55:16 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 20:55:16 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 20:55:16 warble2 kernel: LNet: Service thread pid 69122 was inactive for 200.91s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
/var/log/messages-20180605.gz:Jun 4 20:55:19 warble2 kernel: Lustre: MGS: haven't heard from client 49460677-b398-6155-fdc9-79a11c11ce3a (at 10.8.49.159@tcp201) in 228 seconds. I think it's dead, and I am evicting it. exp ffff885dcdabd000, cur 1528109719 expire 1528109569 last 1528109491
/var/log/messages-20180605.gz:Jun 4 20:55:19 warble2 kernel: Lustre: Skipped 4 previous similar messages
/var/log/messages-20180605.gz:Jun 4 20:55:44 warble2 kernel: Lustre: dagg-MDT0001: haven't heard from client 90b226ee-19b9-fc21-a879-dea9f207d8b9 (at 10.8.49.159@tcp201) in 228 seconds. I think it's dead, and I am evicting it. exp ffff88bdff89fc00, cur 1528109744 expire 1528109594 last 1528109516
/var/log/messages-20180605.gz:Jun 4 20:55:44 warble2 kernel: Lustre: Skipped 4 previous similar messages
/var/log/messages-20180605.gz:Jun 4 20:55:57 warble2 kernel: LustreError: 69218:0:(tgt_handler.c:509:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff88be12c0b450 x1602337637488176/t0(0) o601->dagg-MDT0000-lwp-MDT0001_UUID@0@lo:620/0 lens 336/0 e 0 to 0 dl 1528110050 ref 1 fl Interpret:/0/ffffffff rc 0/-1
/var/log/messages-20180605.gz:Jun 4 20:55:57 warble2 kernel: LustreError: 11-0: dagg-MDT0000-lwp-MDT0001: operation quota_acquire to node 0@lo failed: rc = -11
/var/log/messages-20180605.gz:Jun 4 20:55:57 warble2 kernel: LustreError: Skipped 1 previous similar message
/var/log/messages-20180605.gz:Jun 4 20:55:57 warble2 kernel: LustreError: 69218:0:(tgt_handler.c:509:tgt_filter_recovery_request()) Skipped 1 previous similar message
/var/log/messages-20180605.gz:Jun 4 20:56:35 warble2 kernel: Lustre: dagg-MDT0001: Client 19829e49-7e7c-a374-4aae-93f5d07c53f8 (at 10.8.49.155@tcp201) reconnecting
/var/log/messages-20180605.gz:Jun 4 20:56:45 warble2 kernel: Lustre: dagg-MDT0002: Recovery already passed deadline 0:58. If you do not want to wait more, please abort the recovery by force.
/var/log/messages-20180605.gz:Jun 4 20:56:45 warble2 kernel: Lustre: Skipped 18 previous similar messages
/var/log/messages-20180605.gz:Jun 4 20:56:57 warble2 kernel: LustreError: 69220:0:(tgt_handler.c:509:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff88be12c09c50 x1602337637494576/t0(0) o601->dagg-MDT0000-lwp-MDT0001_UUID@0@lo:19/0 lens 336/0 e 0 to 0 dl 1528110204 ref 1 fl Interpret:/0/ffffffff rc 0/-1
/var/log/messages-20180605.gz:Jun 4 20:56:57 warble2 kernel: LustreError: 69220:0:(tgt_handler.c:509:tgt_filter_recovery_request()) Skipped 1 previous similar message
/var/log/messages-20180605.gz:Jun 4 20:57:28 warble2 kernel: Lustre: dagg-MDT0000: recovery is timed out, evict stale exports
/var/log/messages-20180605.gz:Jun 4 20:57:28 warble2 kernel: Lustre: dagg-MDT0000: disconnecting 5 stale clients
/var/log/messages-20180605.gz:Jun 4 20:57:28 warble2 kernel: Lustre: 45754:0:(ldlm_lib.c:1773:extend_recovery_timer()) dagg-MDT0000: extended recovery timer reaching hard limit: 900, extend: 1
/var/log/messages-20180605.gz:Jun 4 20:57:28 warble2 kernel: Lustre: 45754:0:(ldlm_lib.c:2544:target_recovery_thread()) too long recovery - read logs
/var/log/messages-20180605.gz:Jun 4 20:57:28 warble2 kernel: LNet: Service thread pid 68898 completed after 333.78s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
/var/log/messages-20180605.gz:Jun 4 20:57:28 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528109848.45754
/var/log/messages-20180605.gz:Jun 4 20:57:28 warble2 kernel: Lustre: dagg-MDT0000: Recovery over after 14:38, of 129 clients 124 recovered and 5 were evicted.
/var/log/messages-20180605.gz:Jun 4 20:57:44 warble2 kernel: Lustre: dagg-MDT0002: recovery is timed out, evict stale exports
/var/log/messages-20180605.gz:Jun 4 20:57:44 warble2 kernel: Lustre: dagg-MDT0002: disconnecting 5 stale clients
/var/log/messages-20180605.gz:Jun 4 20:57:44 warble2 kernel: Lustre: 51060:0:(ldlm_lib.c:1773:extend_recovery_timer()) dagg-MDT0002: extended recovery timer reaching hard limit: 900, extend: 1
/var/log/messages-20180605.gz:Jun 4 20:57:44 warble2 kernel: Lustre: 51060:0:(ldlm_lib.c:1773:extend_recovery_timer()) Skipped 5 previous similar messages
/var/log/messages-20180605.gz:Jun 4 20:57:44 warble2 kernel: Lustre: 51060:0:(ldlm_lib.c:2544:target_recovery_thread()) too long recovery - read logs
/var/log/messages-20180605.gz:Jun 4 20:57:44 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528109864.51060
/var/log/messages-20180605.gz:Jun 4 20:57:44 warble2 kernel: Lustre: dagg-MDT0002: Recovery over after 14:38, of 129 clients 124 recovered and 5 were evicted.
/var/log/messages-20180605.gz:Jun 4 20:57:52 warble2 kernel: Lustre: dagg-MDT0001: haven't heard from client 19829e49-7e7c-a374-4aae-93f5d07c53f8 (at 10.8.49.155@tcp201) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88bdf86cfc00, cur 1528109872 expire 1528109722 last 1528109645
/var/log/messages-20180605.gz:Jun 4 20:57:52 warble2 kernel: Lustre: Skipped 3 previous similar messages
/var/log/messages-20180605.gz:Jun 4 20:59:01 warble2 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.159@tcp201, removing former export from same NID
/var/log/messages-20180605.gz:Jun 4 20:59:01 warble2 kernel: Lustre: Skipped 4 previous similar messages
/var/log/messages-20180605.gz:Jun 4 20:59:26 warble2 kernel: Lustre: dagg-MDT0001: Client 90b226ee-19b9-fc21-a879-dea9f207d8b9 (at 10.8.49.159@tcp201) reconnecting
/var/log/messages-20180605.gz:Jun 4 20:59:40 warble2 kernel: Lustre: dagg-MDT0001: Client 2f97c2f1-1474-b879-139d-cd5539f51fc8 (at 10.8.49.176@tcp201) reconnecting
/var/log/messages-20180605.gz:Jun 4 20:59:40 warble2 kernel: Lustre: Skipped 2 previous similar messages
/var/log/messages-20180605.gz:Jun 4 21:00:18 warble2 kernel: Lustre: dagg-MDT0000: Client 7a807933-486b-2969-fc5f-eddf31a41573 (at 10.8.49.221@tcp201) reconnecting
/var/log/messages-20180605.gz:Jun 4 21:00:18 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to (at 10.8.49.221@tcp201)
/var/log/messages-20180605.gz:Jun 4 21:00:18 warble2 kernel: Lustre: Skipped 84 previous similar messages
/var/log/messages-20180605.gz:Jun 4 21:00:18 warble2 kernel: LNetError: 21858:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.221@tcp201
/var/log/messages-20180605.gz:Jun 4 21:00:18 warble2 kernel: LNetError: 21858:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 78 previous similar messages
/var/log/messages-20180605.gz:Jun 4 21:00:19 warble2 kernel: Lustre: MGS: haven't heard from client 49460677-b398-6155-fdc9-79a11c11ce3a (at 10.8.49.159@tcp201) in 228 seconds. I think it's dead, and I am evicting it. exp ffff885dc2f7c800, cur 1528110019 expire 1528109869 last 1528109791
/var/log/messages-20180605.gz:Jun 4 21:02:06 warble2 kernel: Lustre: dagg-MDT0002: Client 19829e49-7e7c-a374-4aae-93f5d07c53f8 (at 10.8.49.155@tcp201) reconnecting
/var/log/messages-20180605.gz:Jun 4 21:02:06 warble2 kernel: Lustre: dagg-MDT0001: Client 19829e49-7e7c-a374-4aae-93f5d07c53f8 (at 10.8.49.155@tcp201) reconnecting
/var/log/messages-20180605.gz:Jun 4 21:02:06 warble2 kernel: Lustre: Skipped 6 previous similar messages
/var/log/messages-20180605.gz:Jun 4 21:02:18 warble2 kernel: LustreError: 45276:0:(llog_cat.c:795:llog_cat_cancel_records()) dagg-MDT0002-osp-MDT0000: fail to cancel 0 of 1 llog-records: rc = -116
/var/log/messages-20180605.gz:Jun 4 21:02:52 warble2 kernel: Lustre: dagg-MDT0001: haven't heard from client 19829e49-7e7c-a374-4aae-93f5d07c53f8 (at 10.8.49.155@tcp201) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88bde4af3800, cur 1528110172 expire 1528110022 last 1528109945
/var/log/messages-20180605.gz:Jun 4 21:02:52 warble2 kernel: Lustre: Skipped 16 previous similar messages
/var/log/messages-20180605.gz:Jun 4 21:04:15 warble2 kernel: Lustre: dagg-MDT0002: Client 2f97c2f1-1474-b879-139d-cd5539f51fc8 (at 10.8.49.176@tcp201) reconnecting
/var/log/messages-20180605.gz:Jun 4 21:04:15 warble2 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: LNet: Service thread pid 69211 was inactive for 200.10s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: LNet: Skipped 4 previous similar messages
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: Pid: 69211, comm: mdt_rdpg01_008
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 21:05:23 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528110323.69211
/var/log/messages-20180605.gz:Jun 4 21:06:52 warble1 systemd: Starting Cleanup of Temporary Directories...
/var/log/messages-20180605.gz:Jun 4 21:06:52 warble1 systemd: Started Cleanup of Temporary Directories.
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: LNet: Service thread pid 17771 was inactive for 212.65s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: Pid: 17771, comm: mdt_rdpg01_000
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 21:07:00 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528110420.17771
/var/log/messages-20180605.gz:Jun 4 21:07:52 warble2 kernel: Lustre: dagg-MDT0001: haven't heard from client 19829e49-7e7c-a374-4aae-93f5d07c53f8 (at 10.8.49.155@tcp201) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88bde1a89800, cur 1528110472 expire 1528110322 last 1528110245
/var/log/messages-20180605.gz:Jun 4 21:07:52 warble2 kernel: Lustre: Skipped 19 previous similar messages
/var/log/messages-20180605.gz:Jun 4 21:09:01 warble2 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.159@tcp201, removing former export from same NID
/var/log/messages-20180605.gz:Jun 4 21:09:01 warble2 kernel: Lustre: Skipped 9 previous similar messages
/var/log/messages-20180605.gz:Jun 4 21:09:08 warble1 ntpd[14477]: 0.0.0.0 0612 02 freq_set kernel 65.997 PPM
/var/log/messages-20180605.gz:Jun 4 21:09:08 warble1 ntpd[14477]: 0.0.0.0 0615 05 clock_sync
/var/log/messages-20180605.gz:Jun 4 21:09:15 warble2 kernel: Lustre: dagg-MDT0002: Client 2f97c2f1-1474-b879-139d-cd5539f51fc8 (at 10.8.49.176@tcp201) reconnecting
/var/log/messages-20180605.gz:Jun 4 21:09:15 warble2 kernel: Lustre: Skipped 14 previous similar messages
/var/log/messages-20180605.gz:Jun 4 21:10:20 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to (at 10.8.49.155@tcp201)
/var/log/messages-20180605.gz:Jun 4 21:10:20 warble2 kernel: Lustre: Skipped 83 previous similar messages
/var/log/messages-20180605.gz:Jun 4 21:10:20 warble2 kernel: LNetError: 68819:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.155@tcp201
/var/log/messages-20180605.gz:Jun 4 21:10:20 warble2 kernel: LNetError: 68819:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 84 previous similar messages
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: LNet: Service thread pid 22031 was inactive for 200.45s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: Pid: 22031, comm: mdt_rdpg00_004
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 21:11:55 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528110715.22031
/var/log/messages-20180605.gz:Jun 4 21:11:58 warble2 kernel: Lustre: 69220:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bdd5a21200 x1601995999614048/t0(0) o35->e5a97d67-ed84-7335-830f-861443a6ce18@192.168.44.13@o2ib44:538/0 lens 512/696 e 24 to 0 dl 1528110723 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180605.gz:Jun 4 21:12:08 warble2 kernel: LNet: Service thread pid 68937 was inactive for 200.03s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 21:12:08 warble2 kernel: Pid: 68937, comm: mdt00_042
/var/log/messages-20180605.gz:Jun 4 21:12:08 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 21:12:08 warble2 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
/var/log/messages-20180605.gz:Jun 4 21:12:08 warble2 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
/var/log/messages-20180605.gz:Jun 4 21:12:08 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 21:12:08 warble2 kernel: [] schedule_timeout+0x174/0x2c0
/var/log/messages-20180605.gz:Jun 4 21:12:08 warble2 kernel: [] ? process_timeout+0x0/0x10
/var/log/messages-20180605.gz:Jun 4 21:12:08 warble2 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:08 warble2 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:08 warble2 kernel: [] ? ldlm_resource_add_lock+0x6a/0x1b0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:08 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 21:12:08 warble2 kernel: [] ldlm_cli_enqueue_fini+0x93b/0xdc0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:08 warble2 kernel: [] ? kfree+0x106/0x140
/var/log/messages-20180605.gz:Jun 4 21:12:08 warble2 kernel: [] ldlm_cli_enqueue+0x6c2/0x810 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:08 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:08 warble2 kernel: [] ? mdt_remote_blocking_ast+0x0/0x590 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:12:08 warble2 kernel: [] osp_md_object_lock+0x172/0x2e0 [osp]
/var/log/messages-20180605.gz:Jun 4 21:12:08 warble2 kernel: [] lod_object_lock+0xf3/0x950 [lod]
/var/log/messages-20180605.gz:Jun 4 21:12:08 warble2 kernel: [] ? htable_lookup+0xa9/0x180 [obdclass]
/var/log/messages-20180605.gz:Jun 4 21:12:08 warble2 kernel: [] mdd_object_lock+0x3e/0xe0 [mdd]
/var/log/messages-20180605.gz:Jun 4 21:12:08 warble2 kernel: [] mdt_remote_object_lock+0x1e5/0x710 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:12:08 warble2 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x36a/0x860 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:12:09 warble2 kernel: [] ? strlcpy+0x42/0x60
/var/log/messages-20180605.gz:Jun 4 21:12:09 warble2 kernel: [] mdt_reint_rename+0x13/0x20 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:12:09 warble2 kernel: [] mdt_reint_rec+0x83/0x210 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:12:09 warble2 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:12:09 warble2 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:12:09 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:09 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:09 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:09 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 21:12:09 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 21:12:09 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:09 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:09 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 21:12:09 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 21:12:09 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 21:12:09 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 21:12:09 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 21:12:09 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528110729.68937
--
/var/log/messages-20180605.gz:Jun 4 21:12:13 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180605.gz:Jun 4 21:12:13 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180605.gz:Jun 4 21:12:13 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:13 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:12:13 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:12:13 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:12:13 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:13 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:13 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:13 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 21:12:13 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 21:12:13 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:13 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:13 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 21:12:13 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 21:12:13 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 21:12:13 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 21:12:13 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 21:12:13 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528110733.22038
/var/log/messages-20180605.gz:Jun 4 21:12:19 warble2 kernel: Lustre: 68794:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bde2d79200 x1601996001007200/t0(0) o36->e5a97d67-ed84-7335-830f-861443a6ce18@192.168.44.13@o2ib44:559/0 lens 784/3128 e 8 to 0 dl 1528110744 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: LNet: Service thread pid 17769 was inactive for 200.57s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: LNet: Skipped 1 previous similar message
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: Pid: 17769, comm: mdt_rdpg00_000
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 21:12:37 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528110757.17769
/var/log/messages-20180605.gz:Jun 4 21:13:22 warble2 kernel: Lustre: 69215:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bdd61c6c00 x1601996001374352/t0(0) o35->e5a97d67-ed84-7335-830f-861443a6ce18@192.168.44.13@o2ib44:622/0 lens 512/696 e 5 to 0 dl 1528110807 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: LNet: Service thread pid 17793 was inactive for 668.76s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: Pid: 17793, comm: mdt01_004
/var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: [] ?
top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: [] mdd_rename+0x4d1/0x14a0 [mdd] /var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: [] mdt_reint_rename_internal.isra.36+0x166a/0x20c0 [mdt] /var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x19b/0x860 [mdt] /var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: [] mdt_reint_rename+0x13/0x20 [mdt] /var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: [] mdt_reint_rec+0x83/0x210 [mdt] /var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt] /var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: [] mdt_reint+0x67/0x140 [mdt] /var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:13:32 warble2 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180605.gz:Jun 4 21:13:33 warble2 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 21:13:33 warble2 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180605.gz:Jun 4 21:13:33 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 21:13:33 warble2 kernel: /var/log/messages-20180605.gz:Jun 4 21:13:33 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528110813.17793 /var/log/messages-20180605.gz:Jun 4 21:13:42 warble2 kernel: LNet: Service thread pid 22034 was inactive for 214.38s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. /var/log/messages-20180605.gz:Jun 4 21:13:42 warble2 kernel: LNet: Skipped 76 previous similar messages /var/log/messages-20180605.gz:Jun 4 21:13:42 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528110822.22034 /var/log/messages-20180605.gz:Jun 4 21:13:48 warble2 kernel: LNet: Service thread pid 81582 was inactive for 214.31s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. /var/log/messages-20180605.gz:Jun 4 21:13:48 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528110828.81582 /var/log/messages-20180605.gz:Jun 4 21:13:48 warble2 kernel: Lustre: dagg-MDT0000-osp-MDT0002: Connection to dagg-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete /var/log/messages-20180605.gz:Jun 4 21:13:48 warble2 kernel: LustreError: 68937:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528110528, 300s ago), entering recovery for dagg-MDT0000_UUID@192.168.44.22@o2ib44 ns: dagg-MDT0000-osp-MDT0002 lock: ffff885d93879200/0xd96f9b498213b767 lrc: 4/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 2 type: IBT flags: 0x1000001000000 nid: local remote: 0xd96f9b498213b77c expref: -99 pid: 68937 timeout: 0 lvb_type: 0 /var/log/messages-20180605.gz:Jun 4 21:16:59 warble2 kernel: Lustre: dagg-MDT0002: haven't heard from client 07a7a787-ad21-2b1a-5302-c80ca70de269 (at 
10.8.49.158@tcp201) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88bde1636c00, cur 1528111019 expire 1528110869 last 1528110792 /var/log/messages-20180605.gz:Jun 4 21:16:59 warble2 kernel: Lustre: Skipped 37 previous similar messages /var/log/messages-20180605.gz:Jun 4 21:18:30 warble2 kernel: Lustre: 17770:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885dcd3e9200 x1601994376440672/t0(0) o35->d5c7ff2e-8c29-354a-1a03-772bf8dd100f@192.168.44.14@o2ib44:175/0 lens 512/696 e 24 to 0 dl 1528111115 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180605.gz:Jun 4 21:18:36 warble2 kernel: Lustre: dagg-MDT0000: Client d5c7ff2e-8c29-354a-1a03-772bf8dd100f (at 192.168.44.14@o2ib44) reconnecting /var/log/messages-20180605.gz:Jun 4 21:18:36 warble2 kernel: Lustre: Skipped 30 previous similar messages /var/log/messages-20180605.gz:Jun 4 21:18:43 warble2 kernel: Lustre: 17764:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885d9e7ec500 x1601994376442288/t0(0) o36->d5c7ff2e-8c29-354a-1a03-772bf8dd100f@192.168.44.14@o2ib44:188/0 lens 776/3128 e 24 to 0 dl 1528111128 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180605.gz:Jun 4 21:19:02 warble2 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.158@tcp201, removing former export from same NID /var/log/messages-20180605.gz:Jun 4 21:19:02 warble2 kernel: Lustre: Skipped 11 previous similar messages /var/log/messages-20180605.gz:Jun 4 21:19:11 warble2 kernel: Lustre: 22032:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885dcc897800 x1601994376447040/t0(0) o35->d5c7ff2e-8c29-354a-1a03-772bf8dd100f@192.168.44.14@o2ib44:216/0 lens 512/696 e 11 to 0 dl 1528111156 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180605.gz:Jun 4 21:19:11 warble2 kernel: Lustre: 
22032:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message /var/log/messages-20180605.gz:Jun 4 21:20:03 warble2 kernel: Lustre: 22032:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (4/4), not sending early reply#012 req@ffff885dcd45ce00 x1601994376461088/t0(0) o35->d5c7ff2e-8c29-354a-1a03-772bf8dd100f@192.168.44.14@o2ib44:267/0 lens 512/696 e 5 to 0 dl 1528111207 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180605.gz:Jun 4 21:20:37 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to (at 10.8.49.159@tcp201) /var/log/messages-20180605.gz:Jun 4 21:20:37 warble2 kernel: LNetError: 68794:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.221@tcp201 /var/log/messages-20180605.gz:Jun 4 21:20:37 warble2 kernel: LNetError: 68794:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 78 previous similar messages /var/log/messages-20180605.gz:Jun 4 21:20:37 warble2 kernel: Lustre: Skipped 84 previous similar messages /var/log/messages-20180605.gz:Jun 4 21:23:33 warble2 kernel: Lustre: 19695:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885da4c1e000 x1601994376839376/t0(0) o35->d5c7ff2e-8c29-354a-1a03-772bf8dd100f@192.168.44.14@o2ib44:478/0 lens 512/696 e 1 to 0 dl 1528111418 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180605.gz:Jun 4 21:23:33 warble2 kernel: Lustre: 19695:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: LNet: Service thread pid 22036 was inactive for 664.37s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: Pid: 22036, comm: mdt_rdpg00_009 /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: #012Call Trace: /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: [] ? 
recalc_sigpending+0x1b/0x50 /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: [] schedule+0x29/0x70 /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd] /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: [] ? 
ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: /var/log/messages-20180605.gz:Jun 4 21:24:42 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528111482.22036 /var/log/messages-20180605.gz:Jun 4 21:24:47 warble2 kernel: Lustre: 22037:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (4/4), not sending early reply#012 req@ffff885d902e1800 x1601994377453648/t0(0) o35->d5c7ff2e-8c29-354a-1a03-772bf8dd100f@192.168.44.14@o2ib44:551/0 lens 512/696 e 1 to 0 dl 1528111491 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180605.gz:Jun 4 21:26:59 warble2 kernel: Lustre: dagg-MDT0002: haven't heard from client 07a7a787-ad21-2b1a-5302-c80ca70de269 (at 10.8.49.158@tcp201) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88bddbb4a000, cur 1528111619 expire 1528111469 last 1528111392 /var/log/messages-20180605.gz:Jun 4 21:26:59 warble2 kernel: Lustre: Skipped 39 previous similar messages /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: LNet: Service thread pid 22033 was inactive for 766.90s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: Pid: 22033, comm: mdt_rdpg00_006 /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: #012Call Trace: /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: [] ? 
recalc_sigpending+0x1b/0x50 /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: [] schedule+0x29/0x70 /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd] /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: [] ? 
ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: /var/log/messages-20180605.gz:Jun 4 21:27:38 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528111658.22033 /var/log/messages-20180605.gz:Jun 4 21:27:46 warble2 kernel: Pid: 22053, comm: mdt_rdpg01_004 /var/log/messages-20180605.gz:Jun 4 21:27:46 warble2 kernel: #012Call Trace: /var/log/messages-20180605.gz:Jun 4 21:27:46 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180605.gz:Jun 4 21:27:46 warble2 kernel: [] schedule+0x29/0x70 /var/log/messages-20180605.gz:Jun 4 21:27:46 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:27:46 warble2 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180605.gz:Jun 4 21:27:46 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:27:46 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180605.gz:Jun 4 21:27:46 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:27:46 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180605.gz:Jun 4 21:27:46 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd] /var/log/messages-20180605.gz:Jun 4 21:27:46 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] -- /var/log/messages-20180605.gz:Jun 4 21:27:46 warble2 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180605.gz:Jun 4 21:27:46 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:27:46 warble2 kernel: [] ? 
ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:27:46 warble2 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180605.gz:Jun 4 21:27:46 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 21:27:46 warble2 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180605.gz:Jun 4 21:27:46 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 21:27:46 warble2 kernel: /var/log/messages-20180605.gz:Jun 4 21:27:46 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528111666.22053 /var/log/messages-20180605.gz:Jun 4 21:28:37 warble2 kernel: Lustre: dagg-MDT0000: Client d5c7ff2e-8c29-354a-1a03-772bf8dd100f (at 192.168.44.14@o2ib44) reconnecting /var/log/messages-20180605.gz:Jun 4 21:28:37 warble2 kernel: Lustre: Skipped 32 previous similar messages /var/log/messages-20180605.gz:Jun 4 21:29:03 warble2 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.221@tcp201, removing former export from same NID /var/log/messages-20180605.gz:Jun 4 21:29:03 warble2 kernel: Lustre: Skipped 10 previous similar messages /var/log/messages-20180605.gz:Jun 4 21:30:42 warble2 kernel: Lustre: dagg-MDT0002: Connection restored to (at 10.8.49.158@tcp201) /var/log/messages-20180605.gz:Jun 4 21:30:42 warble2 kernel: Lustre: Skipped 82 previous similar messages /var/log/messages-20180605.gz:Jun 4 21:30:42 warble2 kernel: LNetError: 68963:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.158@tcp201 /var/log/messages-20180605.gz:Jun 4 21:30:42 warble2 kernel: LNetError: 68963:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 80 previous similar messages /var/log/messages-20180605.gz:Jun 4 21:34:20 warble2 kernel: Lustre: 69219:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bdd07aaa00 x1601996002645216/t0(0) o35->e5a97d67-ed84-7335-830f-861443a6ce18@192.168.44.13@o2ib44:370/0 lens 512/696 e 24 to 0 dl 1528112065 ref 2 fl Interpret:/0/0 rc 
0/0 /var/log/messages-20180605.gz:Jun 4 21:37:52 warble2 kernel: Lustre: dagg-MDT0001: haven't heard from client 19829e49-7e7c-a374-4aae-93f5d07c53f8 (at 10.8.49.155@tcp201) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88bdf2c89800, cur 1528112272 expire 1528112122 last 1528112045 /var/log/messages-20180605.gz:Jun 4 21:37:52 warble2 kernel: Lustre: Skipped 41 previous similar messages /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: LNet: Service thread pid 19632 was inactive for 563.60s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: LNet: Skipped 1 previous similar message /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: Pid: 19632, comm: mdt_rdpg01_002 /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: #012Call Trace: /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: [] schedule+0x29/0x70 /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd] /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: [] ? 
lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: /var/log/messages-20180605.gz:Jun 4 21:38:13 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528112293.19632 /var/log/messages-20180605.gz:Jun 4 21:38:38 warble2 kernel: Lustre: dagg-MDT0000: Client d5c7ff2e-8c29-354a-1a03-772bf8dd100f (at 192.168.44.14@o2ib44) reconnecting /var/log/messages-20180605.gz:Jun 4 21:38:38 warble2 kernel: Lustre: Skipped 32 previous similar messages /var/log/messages-20180605.gz:Jun 4 21:38:44 warble2 kernel: Lustre: 52381:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bdd12a6f00 x1601996003264256/t0(0) o35->e5a97d67-ed84-7335-830f-861443a6ce18@192.168.44.13@o2ib44:634/0 lens 512/696 e 3 to 0 dl 1528112329 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180605.gz:Jun 4 21:38:44 warble2 kernel: Lustre: 52381:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message /var/log/messages-20180605.gz:Jun 4 21:39:03 warble2 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.221@tcp201, removing former export from same NID /var/log/messages-20180605.gz:Jun 4 21:39:03 warble2 kernel: Lustre: Skipped 9 previous similar messages /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: LNet: Service thread pid 19680 was inactive for 200.67s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: Pid: 19680, comm: mdt_rdpg00_002 /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: #012Call Trace: /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: [] ? 
recalc_sigpending+0x1b/0x50 /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: [] schedule+0x29/0x70 /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd] /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: [] ? 
ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: /var/log/messages-20180605.gz:Jun 4 21:39:07 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528112347.19680 /var/log/messages-20180605.gz:Jun 4 21:40:42 warble2 kernel: Lustre: dagg-MDT0002: Connection restored to (at 10.8.49.158@tcp201) /var/log/messages-20180605.gz:Jun 4 21:40:42 warble2 kernel: Lustre: Skipped 83 previous similar messages /var/log/messages-20180605.gz:Jun 4 21:40:42 warble2 kernel: LNetError: 68857:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.158@tcp201 /var/log/messages-20180605.gz:Jun 4 21:40:42 warble2 kernel: LNetError: 68857:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 80 previous similar messages /var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: LNet: Service thread pid 69215 was inactive for 1203.19s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: Pid: 69215, comm: mdt_rdpg01_012 /var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: #012Call Trace: /var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: [] schedule+0x29/0x70 /var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: [] ? 
default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 21:42:07 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528112527.69215
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: LNet: Service thread pid 19695 was inactive for 362.31s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: Pid: 19695, comm: mdt_rdpg00_003
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 21:44:40 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528112680.19695
/var/log/messages-20180605.gz:Jun 4 21:47:58 warble2 kernel: Lustre: dagg-MDT0001: haven't heard from client 19829e49-7e7c-a374-4aae-93f5d07c53f8 (at 10.8.49.155@tcp201) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88bdeaeef000, cur 1528112878 expire 1528112728 last 1528112651
/var/log/messages-20180605.gz:Jun 4 21:47:58 warble2 kernel: Lustre: Skipped 39 previous similar messages
/var/log/messages-20180605.gz:Jun 4 21:48:39 warble2 kernel: Lustre: dagg-MDT0000: Client d5c7ff2e-8c29-354a-1a03-772bf8dd100f (at 192.168.44.14@o2ib44) reconnecting
/var/log/messages-20180605.gz:Jun 4 21:48:39 warble2 kernel: Lustre: Skipped 33 previous similar messages
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: LNet: Service thread pid 17770 was inactive for 1202.94s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: Pid: 17770, comm: mdt_rdpg00_001
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 21:48:40 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528112920.17770
/var/log/messages-20180605.gz:Jun 4 21:49:03 warble2 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.221@tcp201, removing former export from same NID
/var/log/messages-20180605.gz:Jun 4 21:49:03 warble2 kernel: Lustre: Skipped 9 previous similar messages
/var/log/messages-20180605.gz:Jun 4 21:50:43 warble2 kernel: Lustre: dagg-MDT0002: Connection restored to (at 10.8.49.221@tcp201)
/var/log/messages-20180605.gz:Jun 4 21:50:43 warble2 kernel: Lustre: Skipped 82 previous similar messages
/var/log/messages-20180605.gz:Jun 4 21:50:43 warble2 kernel: LNetError: 19571:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.221@tcp201
/var/log/messages-20180605.gz:Jun 4 21:50:43 warble2 kernel: LNetError: 19571:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 78 previous similar messages
/var/log/messages-20180605.gz:Jun 4 21:50:56 warble2 kernel: LNet: Service thread pid 21917 was inactive for 200.27s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 21:50:56 warble2 kernel: Pid: 21917, comm: mdt_rdpg01_003
/var/log/messages-20180605.gz:Jun 4 21:50:56 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 21:50:56 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 4 21:50:56 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 21:50:56 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:50:56 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 21:50:56 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:50:56 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180605.gz:Jun 4 21:50:56 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:50:56 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180605.gz:Jun 4 21:50:56 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180605.gz:Jun 4 21:50:56 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:50:57 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:50:57 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:50:57 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180605.gz:Jun 4 21:50:57 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:50:57 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:50:57 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:50:57 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 21:50:57 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 21:50:57 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:50:57 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 21:50:57 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 21:50:57 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 21:50:57 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 21:50:57 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 21:50:57 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 21:50:57 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528113057.21917
/var/log/messages-20180605.gz:Jun 4 21:51:08 warble2 kernel: Lustre: 175707:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff885d7b872d00 x1601994378293504/t0(0) o35->d5c7ff2e-8c29-354a-1a03-772bf8dd100f@192.168.44.14@o2ib44:623/0 lens 512/696 e 0 to 0 dl 1528113073 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180605.gz:Jun 4 21:51:08 warble2 kernel: Lustre: 175707:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 2 previous similar messages
/var/log/messages-20180605.gz:Jun 4 21:58:17 warble2 kernel: Lustre: dagg-MDT0002: haven't heard from client 19829e49-7e7c-a374-4aae-93f5d07c53f8 (at 10.8.49.155@tcp201) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88bdd740fc00, cur 1528113497 expire 1528113347 last 1528113270
/var/log/messages-20180605.gz:Jun 4 21:58:17 warble2 kernel: Lustre: Skipped 40 previous similar messages
/var/log/messages-20180605.gz:Jun 4 21:58:40 warble2 kernel: Lustre: dagg-MDT0000: Client d5c7ff2e-8c29-354a-1a03-772bf8dd100f (at 192.168.44.14@o2ib44) reconnecting
/var/log/messages-20180605.gz:Jun 4 21:58:40 warble2 kernel: Lustre: Skipped 33 previous similar messages
/var/log/messages-20180605.gz:Jun 4 21:59:03 warble2 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.221@tcp201, removing former export from same NID
/var/log/messages-20180605.gz:Jun 4 21:59:03 warble2 kernel: Lustre: Skipped 9 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: LNet: Service thread pid 69213 was inactive for 1200.37s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: Pid: 69213, comm: mdt_rdpg01_010
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 22:00:41 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528113641.69213
/var/log/messages-20180605.gz:Jun 4 22:00:43 warble2 kernel: Lustre: dagg-MDT0002: Connection restored to (at 10.8.49.221@tcp201)
/var/log/messages-20180605.gz:Jun 4 22:00:43 warble2 kernel: Lustre: Skipped 85 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:00:43 warble2 kernel: LNetError: 68951:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.221@tcp201
/var/log/messages-20180605.gz:Jun 4 22:00:43 warble2 kernel: LNetError: 68951:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 81 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:00:49 warble2 kernel: Pid: 69217, comm: mdt_rdpg01_014
/var/log/messages-20180605.gz:Jun 4 22:00:49 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 22:00:49 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 4 22:00:49 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 22:00:49 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:00:49 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 22:00:49 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:00:49 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
--
/var/log/messages-20180605.gz:Jun 4 22:01:28 warble2 kernel: [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:01:28 warble2 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x67c/0x860 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:01:28 warble2 kernel: [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:01:28 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:01:28 warble2 kernel: [] mdt_reint_rename+0x13/0x20 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:01:28 warble2 kernel: [] mdt_reint_rec+0x83/0x210 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:01:28 warble2 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:01:28 warble2 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:01:28 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:01:28 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:01:28 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:01:28 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 22:01:28 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:01:28 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:01:28 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 22:01:28 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 22:01:28 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 22:01:28 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 22:01:28 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 22:01:28 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528113688.19550
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: LNet: Service thread pid 80198 was inactive for 200.08s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: LNet: Skipped 2 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: Pid: 80198, comm: mdt00_115
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] schedule_timeout+0x174/0x2c0
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] ? process_timeout+0x0/0x10
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] ? ldlm_resource_add_lock+0x6a/0x1b0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] ldlm_cli_enqueue_fini+0x93b/0xdc0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass]
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] ldlm_cli_enqueue+0x6c2/0x810 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] ? mdt_remote_blocking_ast+0x0/0x590 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] osp_md_object_lock+0x172/0x2e0 [osp]
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] lod_object_lock+0xf3/0x950 [lod]
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] ? htable_lookup+0xa9/0x180 [obdclass]
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] mdd_object_lock+0x3e/0xe0 [mdd]
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] mdt_remote_object_lock+0x1e5/0x710 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x36a/0x860 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] ? strlcpy+0x42/0x60
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] mdt_reint_rename+0x13/0x20 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] mdt_reint_rec+0x83/0x210 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 22:01:47 warble2 kernel:
--
/var/log/messages-20180605.gz:Jun 4 22:01:48 warble2 kernel: [] mdd_object_lock+0x3e/0xe0 [mdd]
/var/log/messages-20180605.gz:Jun 4 22:01:48 warble2 kernel: [] mdt_remote_object_lock+0x1e5/0x710 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:01:48 warble2 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x36a/0x860 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:01:48 warble2 kernel: [] ? strlcpy+0x42/0x60
/var/log/messages-20180605.gz:Jun 4 22:01:48 warble2 kernel: [] mdt_reint_rename+0x13/0x20 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:01:48 warble2 kernel: [] mdt_reint_rec+0x83/0x210 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:01:48 warble2 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:01:48 warble2 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:01:48 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:01:48 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:01:48 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:01:48 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 22:01:48 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 22:01:48 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:01:48 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:01:48 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 22:01:48 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 22:01:48 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 22:01:48 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 22:01:48 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 22:01:48 warble2 kernel: LNet: Service thread pid 68984 was inactive for 200.62s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
/var/log/messages-20180605.gz:Jun 4 22:01:58 warble2 kernel: LNet: Service thread pid 9211 was inactive for 1203.29s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
/var/log/messages-20180605.gz:Jun 4 22:01:58 warble2 kernel: LNet: Skipped 1 previous similar message
/var/log/messages-20180605.gz:Jun 4 22:01:58 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528113718.9211
/var/log/messages-20180605.gz:Jun 4 22:03:08 warble2 kernel: LustreError: 19550:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528113488, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff885de0b46c00/0xd96f9b49930b4fee lrc: 3/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 8 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 19550 timeout: 0 lvb_type: 0
/var/log/messages-20180605.gz:Jun 4 22:03:08 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528113788.19550
/var/log/messages-20180605.gz:Jun 4 22:03:27 warble2 kernel: LustreError: 21858:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528113507, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff88bdced08a00/0xd96f9b4993268a0b lrc: 3/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 8 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 21858 timeout: 0 lvb_type: 0
/var/log/messages-20180605.gz:Jun 4 22:03:27 warble2 kernel: Lustre: dagg-MDT0000-osp-MDT0002: Connection to dagg-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
/var/log/messages-20180605.gz:Jun 4 22:03:27 warble2 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180605.gz:Jun 4 22:03:27 warble2 kernel: LustreError: 80198:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528113507, 300s ago), entering recovery for dagg-MDT0000_UUID@192.168.44.22@o2ib44 ns: dagg-MDT0000-osp-MDT0001 lock: ffff885dc42fda00/0xd96f9b4993268aac lrc: 4/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 3 type: IBT flags: 0x1000001000000 nid: local remote: 0xd96f9b4993268ab3 expref: -99 pid: 80198 timeout: 0 lvb_type: 0
/var/log/messages-20180605.gz:Jun 4 22:03:27 warble2 kernel: LustreError: 80198:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) Skipped 2 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:08:03 warble2 kernel: Lustre: 80078:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885da79add00 x1601912630655616/t0(0) o36->4daddac5-b65d-0372-67b2-20b3b6329c08@192.168.44.174@o2ib44:128/0 lens 776/3128 e 24 to 0 dl 1528114088 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180605.gz:Jun 4 22:08:03 warble2 kernel: Lustre: 80078:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 4 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:08:17 warble2 kernel: Lustre: dagg-MDT0002: haven't heard from client 19829e49-7e7c-a374-4aae-93f5d07c53f8 (at 10.8.49.155@tcp201) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88bde1b61000, cur 1528114097 expire 1528113947 last 1528113870
/var/log/messages-20180605.gz:Jun 4 22:08:17 warble2 kernel: Lustre: Skipped 39 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:08:41 warble2 kernel: Lustre: dagg-MDT0000: Client d5c7ff2e-8c29-354a-1a03-772bf8dd100f (at 192.168.44.14@o2ib44) reconnecting
/var/log/messages-20180605.gz:Jun 4 22:08:41 warble2 kernel: Lustre: Skipped 38 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:09:03 warble2 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.221@tcp201, removing former export from same NID
/var/log/messages-20180605.gz:Jun 4 22:09:03 warble2 kernel: Lustre: Skipped 11 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:10:43 warble2 kernel: Lustre: dagg-MDT0002: Connection restored to (at 10.8.49.221@tcp201)
/var/log/messages-20180605.gz:Jun 4 22:10:43 warble2 kernel: Lustre: Skipped 91 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:10:43 warble2 kernel: LNetError: 68981:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.221@tcp201
/var/log/messages-20180605.gz:Jun 4 22:10:43 warble2 kernel: LNetError: 68981:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 78 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:18:42 warble2 kernel: Lustre: dagg-MDT0000: Client d5c7ff2e-8c29-354a-1a03-772bf8dd100f (at 192.168.44.14@o2ib44) reconnecting
/var/log/messages-20180605.gz:Jun 4 22:18:42 warble2 kernel: Lustre: Skipped 38 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:19:05 warble2 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.155@tcp201, removing former export from same NID
/var/log/messages-20180605.gz:Jun 4 22:19:05 warble2 kernel: Lustre: Skipped 10 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:20:18 warble2 kernel: Lustre: MGS: haven't heard from client 49460677-b398-6155-fdc9-79a11c11ce3a (at 10.8.49.159@tcp201) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885db6ff6000, cur 1528114818 expire 1528114668 last 1528114591
/var/log/messages-20180605.gz:Jun 4 22:20:18 warble2 kernel: Lustre: Skipped 41 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:20:43 warble2 kernel: Lustre: dagg-MDT0002: Connection restored to (at 10.8.49.221@tcp201)
/var/log/messages-20180605.gz:Jun 4 22:20:43 warble2 kernel: Lustre: Skipped 89 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:20:43 warble2 kernel: LNetError: 19571:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.221@tcp201
/var/log/messages-20180605.gz:Jun 4 22:20:43 warble2 kernel: LNetError: 19571:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 80 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: LNet: Service thread pid 241427 was inactive for 1201.29s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: LNet: Skipped 1 previous similar message
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: Pid: 241427, comm: mdt_rdpg01_022
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 22:24:26 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528115066.241427
/var/log/messages-20180605.gz:Jun 4 22:28:43 warble2 kernel: Lustre: dagg-MDT0000: Client d5c7ff2e-8c29-354a-1a03-772bf8dd100f (at 192.168.44.14@o2ib44) reconnecting
/var/log/messages-20180605.gz:Jun 4 22:28:43 warble2 kernel: Lustre: Skipped 38 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:29:05 warble2 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.155@tcp201, removing former export from same NID
/var/log/messages-20180605.gz:Jun 4 22:29:05 warble2 kernel: Lustre: Skipped 9 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:30:18 warble2 kernel: Lustre: MGS: haven't heard from client 49460677-b398-6155-fdc9-79a11c11ce3a (at 10.8.49.159@tcp201) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885da308bc00, cur 1528115418 expire 1528115268 last 1528115191
/var/log/messages-20180605.gz:Jun 4 22:30:18 warble2 kernel: Lustre: Skipped 39 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:30:43 warble2 kernel: Lustre: dagg-MDT0002: Connection restored to (at 10.8.49.221@tcp201)
/var/log/messages-20180605.gz:Jun 4 22:30:43 warble2 kernel: Lustre: Skipped 88 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:30:43 warble2 kernel: LNetError: 17767:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.221@tcp201
/var/log/messages-20180605.gz:Jun 4 22:30:43 warble2 kernel: LNetError: 17767:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 79 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:34:41 warble2 kernel: Lustre: 163847:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff88bdcec76600 x1601996007853248/t0(0) o35->e5a97d67-ed84-7335-830f-861443a6ce18@192.168.44.13@o2ib44:216/0 lens 512/696 e 0 to 0 dl 1528115686 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180605.gz:Jun 4 22:34:41 warble2 kernel: Lustre: 163847:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 5 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:38:44 warble2 kernel: Lustre: dagg-MDT0000: Client d5c7ff2e-8c29-354a-1a03-772bf8dd100f (at 192.168.44.14@o2ib44) reconnecting
/var/log/messages-20180605.gz:Jun 4 22:38:44 warble2 kernel: Lustre: Skipped 38 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:39:05 warble2 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.155@tcp201, removing former export from same NID
/var/log/messages-20180605.gz:Jun 4 22:39:05 warble2 kernel: Lustre: Skipped 8 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:40:18 warble2 kernel: Lustre: MGS: haven't heard from client 49460677-b398-6155-fdc9-79a11c11ce3a (at 10.8.49.159@tcp201) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885daaeb4000, cur 1528116018 expire 1528115868 last 1528115791
/var/log/messages-20180605.gz:Jun 4 22:40:18 warble2 kernel: Lustre: Skipped 39 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:40:43 warble2 kernel: Lustre: dagg-MDT0002: Connection restored to (at 10.8.49.221@tcp201)
/var/log/messages-20180605.gz:Jun 4 22:40:43 warble2 kernel: Lustre: Skipped 87 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:40:43 warble2 kernel: LNetError: 68850:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.221@tcp201
/var/log/messages-20180605.gz:Jun 4 22:40:43 warble2 kernel: LNetError: 68850:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 79 previous similar messages
/var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: LNet: Service thread pid 162231 was inactive for 1200.07s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: Pid: 162231, comm: mdt_rdpg01_019
/var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: [] ?
__wake_up_common+0x5b/0x90 /var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: /var/log/messages-20180605.gz:Jun 4 22:42:11 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528116131.162231 /var/log/messages-20180605.gz:Jun 4 22:48:45 warble2 kernel: Lustre: dagg-MDT0000: Client d5c7ff2e-8c29-354a-1a03-772bf8dd100f (at 192.168.44.14@o2ib44) reconnecting /var/log/messages-20180605.gz:Jun 4 22:48:45 warble2 kernel: Lustre: Skipped 38 previous similar messages /var/log/messages-20180605.gz:Jun 4 22:49:05 warble2 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.155@tcp201, removing former export from same NID /var/log/messages-20180605.gz:Jun 4 22:49:05 warble2 kernel: Lustre: Skipped 10 previous similar messages /var/log/messages-20180605.gz:Jun 4 22:50:32 warble2 kernel: Lustre: dagg-MDT0002: haven't heard from client 2f97c2f1-1474-b879-139d-cd5539f51fc8 (at 10.8.49.176@tcp201) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff88bddcdca400, cur 1528116632 expire 1528116482 last 1528116405 /var/log/messages-20180605.gz:Jun 4 22:50:32 warble2 kernel: Lustre: Skipped 44 previous similar messages /var/log/messages-20180605.gz:Jun 4 22:50:43 warble2 kernel: Lustre: dagg-MDT0002: Connection restored to (at 10.8.49.221@tcp201) /var/log/messages-20180605.gz:Jun 4 22:50:43 warble2 kernel: Lustre: Skipped 88 previous similar messages /var/log/messages-20180605.gz:Jun 4 22:50:43 warble2 kernel: LNetError: 68794:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.221@tcp201 /var/log/messages-20180605.gz:Jun 4 22:50:43 warble2 kernel: LNetError: 68794:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 79 previous similar messages /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: LNet: Service thread pid 69127 was inactive for 200.68s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: Pid: 69127, comm: mdt00_091 /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: #012Call Trace: /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] schedule+0x29/0x70 /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] schedule_timeout+0x174/0x2c0 /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] ? process_timeout+0x0/0x10 /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] ? 
ldlm_resource_add_lock+0x6a/0x1b0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] ldlm_cli_enqueue_fini+0x93b/0xdc0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass] /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] ldlm_cli_enqueue+0x6c2/0x810 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] ? mdt_remote_blocking_ast+0x0/0x590 [mdt] /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] osp_md_object_lock+0x172/0x2e0 [osp] /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] lod_object_lock+0xf3/0x950 [lod] /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] ? htable_lookup+0xa9/0x180 [obdclass] /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] mdd_object_lock+0x3e/0xe0 [mdd] /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] mdt_remote_object_lock+0x1e5/0x710 [mdt] /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x36a/0x860 [mdt] /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] ? 
strlcpy+0x42/0x60 /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] mdt_reint_rename+0x13/0x20 [mdt] /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] mdt_reint_rec+0x83/0x210 [mdt] /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt] /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] mdt_reint+0x67/0x140 [mdt] /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: /var/log/messages-20180605.gz:Jun 4 22:51:34 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528116694.69127 /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: LNet: Service thread pid 68988 was inactive for 200.17s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: Pid: 68988, comm: mdt00_056 /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: #012Call Trace: /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] schedule+0x29/0x70 /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] schedule_timeout+0x174/0x2c0 /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] ? process_timeout+0x0/0x10 /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] ? ldlm_resource_add_lock+0x6a/0x1b0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] ldlm_cli_enqueue_fini+0x93b/0xdc0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass] /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] ldlm_cli_enqueue+0x6c2/0x810 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] ? mdt_remote_blocking_ast+0x0/0x590 [mdt] /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] osp_md_object_lock+0x172/0x2e0 [osp] /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] lod_object_lock+0xf3/0x950 [lod] /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] ? 
htable_lookup+0xa9/0x180 [obdclass] /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] mdd_object_lock+0x3e/0xe0 [mdd] /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] mdt_remote_object_lock+0x1e5/0x710 [mdt] /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x36a/0x860 [mdt] /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] ? strlcpy+0x42/0x60 /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] mdt_reint_rename+0x13/0x20 [mdt] /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] mdt_reint_rec+0x83/0x210 [mdt] /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt] /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] mdt_reint+0x67/0x140 [mdt] /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: /var/log/messages-20180605.gz:Jun 4 22:51:53 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528116713.68988 -- /var/log/messages-20180605.gz:Jun 4 23:03:33 warble2 kernel: Lustre: dagg-MDT0000-osp-MDT0002: Connection to dagg-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete /var/log/messages-20180605.gz:Jun 4 23:03:33 warble2 kernel: LustreError: 68866:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) Skipped 2 previous similar messages /var/log/messages-20180605.gz:Jun 4 23:03:34 warble2 kernel: LustreError: 69008:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528117113, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff885de776f000/0xd96f9b49a6f7930e lrc: 3/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 18 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 69008 timeout: 0 lvb_type: 0 /var/log/messages-20180605.gz:Jun 4 23:08:47 warble2 kernel: Lustre: dagg-MDT0000: Client d5c7ff2e-8c29-354a-1a03-772bf8dd100f (at 192.168.44.14@o2ib44) reconnecting /var/log/messages-20180605.gz:Jun 4 23:08:47 warble2 kernel: Lustre: Skipped 33 previous similar messages /var/log/messages-20180605.gz:Jun 4 23:10:43 warble2 kernel: Lustre: dagg-MDT0001: haven't heard from client 90b226ee-19b9-fc21-a879-dea9f207d8b9 (at 10.8.49.159@tcp201) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff885d853d5400, cur 1528117843 expire 1528117693 last 1528117616 /var/log/messages-20180605.gz:Jun 4 23:10:43 warble2 kernel: Lustre: Skipped 39 previous similar messages /var/log/messages-20180605.gz:Jun 4 23:10:44 warble2 kernel: Lustre: 69122:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff885da39f8300 x1601912630759856/t0(0) o36->4daddac5-b65d-0372-67b2-20b3b6329c08@192.168.44.174@o2ib44:114/0 lens 776/3128 e 0 to 0 dl 1528117849 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180605.gz:Jun 4 23:10:44 warble2 kernel: Lustre: 69122:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 3 previous similar messages /var/log/messages-20180605.gz:Jun 4 23:11:03 warble2 kernel: Lustre: 80081:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff885d8465b900 x1601955953391408/t0(0) o36->1a074ea6-b336-018a-5129-9f8e23e8eb32@192.168.44.121@o2ib44:133/0 lens 776/3128 e 0 to 0 dl 1528117868 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180605.gz:Jun 4 23:11:32 warble2 kernel: Lustre: MGS: Connection restored to (at 10.8.49.158@tcp201) /var/log/messages-20180605.gz:Jun 4 23:11:32 warble2 kernel: Lustre: Skipped 93 previous similar messages /var/log/messages-20180605.gz:Jun 4 23:11:32 warble2 kernel: LNetError: 17632:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.158@tcp201 /var/log/messages-20180605.gz:Jun 4 23:11:32 warble2 kernel: LNetError: 17632:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 80 previous similar messages /var/log/messages-20180605.gz:Jun 4 23:14:01 warble2 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.159@tcp201, removing former export from same NID /var/log/messages-20180605.gz:Jun 4 23:14:01 warble2 kernel: Lustre: Skipped 12 previous similar messages /var/log/messages-20180605.gz:Jun 4 23:15:50 warble2 kernel: Lustre: dagg-MDT0000-osp-MDT0002: Connection to 
dagg-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete /var/log/messages-20180605.gz:Jun 4 23:15:50 warble2 kernel: LustreError: 69122:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528117850, 300s ago), entering recovery for dagg-MDT0000_UUID@192.168.44.22@o2ib44 ns: dagg-MDT0000-osp-MDT0002 lock: ffff885da2941400/0xd96f9b49aae395ee lrc: 4/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 6 type: IBT flags: 0x1000001000000 nid: local remote: 0xd96f9b49aae395f5 expref: -99 pid: 69122 timeout: 0 lvb_type: 0 /var/log/messages-20180605.gz:Jun 4 23:16:09 warble2 kernel: LustreError: 80081:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528117869, 300s ago), entering recovery for dagg-MDT0000_UUID@192.168.44.22@o2ib44 ns: dagg-MDT0000-osp-MDT0002 lock: ffff885db4ffe800/0xd96f9b49aafcbf14 lrc: 4/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 6 type: IBT flags: 0x1000001000000 nid: local remote: 0xd96f9b49aafcbf1b expref: -99 pid: 80081 timeout: 0 lvb_type: 0 /var/log/messages-20180605.gz:Jun 4 23:16:09 warble2 kernel: LustreError: 68943:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528117869, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff88bdd4486a00/0xd96f9b49aafcc7f0 lrc: 3/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 21 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 68943 timeout: 0 lvb_type: 0 /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: LNet: Service thread pid 79940 was inactive for 1203.89s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: LNet: Skipped 3 previous similar messages /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: Pid: 79940, comm: mdt00_097 /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: #012Call Trace: /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] schedule+0x29/0x70 /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] ldlm_completion_ast+0x63d/0x920 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] ? ldlm_resource_add_lock+0x6a/0x1b0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] ldlm_cli_enqueue_fini+0x93b/0xdc0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass] /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] ldlm_cli_enqueue+0x6c2/0x810 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] ? mdt_remote_blocking_ast+0x0/0x590 [mdt] /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] osp_md_object_lock+0x172/0x2e0 [osp] /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] lod_object_lock+0xf3/0x950 [lod] /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] ? 
htable_lookup+0xa9/0x180 [obdclass] /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] mdd_object_lock+0x3e/0xe0 [mdd] /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] mdt_remote_object_lock+0x1e5/0x710 [mdt] /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x36a/0x860 [mdt] /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] ? strlcpy+0x42/0x60 /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] mdt_reint_rename+0x13/0x20 [mdt] /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] mdt_reint_rec+0x83/0x210 [mdt] /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt] /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] mdt_reint+0x67/0x140 [mdt] /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: /var/log/messages-20180605.gz:Jun 4 23:18:18 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528118298.79940 /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: LNet: Service thread pid 68866 was inactive for 1201.37s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: Pid: 68866, comm: mdt00_029 /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: #012Call Trace: /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] schedule+0x29/0x70 /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] ldlm_completion_ast+0x63d/0x920 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] ? ldlm_resource_add_lock+0x6a/0x1b0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] ldlm_cli_enqueue_fini+0x93b/0xdc0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass] /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] ldlm_cli_enqueue+0x6c2/0x810 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] ? 
mdt_remote_blocking_ast+0x0/0x590 [mdt] /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] osp_md_object_lock+0x172/0x2e0 [osp] /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] lod_object_lock+0xf3/0x950 [lod] /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] ? htable_lookup+0xa9/0x180 [obdclass] /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] mdd_object_lock+0x3e/0xe0 [mdd] /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] mdt_remote_object_lock+0x1e5/0x710 [mdt] /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x36a/0x860 [mdt] /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] ? strlcpy+0x42/0x60 /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] mdt_reint_rename+0x13/0x20 [mdt] /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] mdt_reint_rec+0x83/0x210 [mdt] /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt] /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] mdt_reint+0x67/0x140 [mdt] /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528118314.68866 /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: Pid: 69008, comm: mdt00_068 /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: #012Call Trace: /var/log/messages-20180605.gz:Jun 4 23:18:34 warble2 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc] -- /var/log/messages-20180605.gz:Jun 4 23:18:35 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180605.gz:Jun 4 23:18:35 warble2 kernel: /var/log/messages-20180605.gz:Jun 4 23:18:48 warble2 kernel: Lustre: dagg-MDT0000: Client d5c7ff2e-8c29-354a-1a03-772bf8dd100f (at 192.168.44.14@o2ib44) reconnecting /var/log/messages-20180605.gz:Jun 4 23:18:48 warble2 kernel: Lustre: Skipped 38 previous similar messages /var/log/messages-20180605.gz:Jun 4 23:20:43 warble2 kernel: Lustre: dagg-MDT0001: haven't heard from client 90b226ee-19b9-fc21-a879-dea9f207d8b9 (at 10.8.49.159@tcp201) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff88bde5e03c00, cur 1528118443 expire 1528118293 last 1528118216 /var/log/messages-20180605.gz:Jun 4 23:20:43 warble2 kernel: Lustre: Skipped 39 previous similar messages /var/log/messages-20180605.gz:Jun 4 23:21:33 warble2 kernel: Lustre: MGS: Connection restored to (at 10.8.49.221@tcp201) /var/log/messages-20180605.gz:Jun 4 23:21:33 warble2 kernel: Lustre: Skipped 86 previous similar messages /var/log/messages-20180605.gz:Jun 4 23:21:33 warble2 kernel: LNetError: 20575:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.221@tcp201 /var/log/messages-20180605.gz:Jun 4 23:21:33 warble2 kernel: LNetError: 20575:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 80 previous similar messages /var/log/messages-20180605.gz:Jun 4 23:23:20 warble2 kernel: Lustre: 17796:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff885d7a6e7800 x1601912630780528/t0(0) o36->4daddac5-b65d-0372-67b2-20b3b6329c08@192.168.44.174@o2ib44:115/0 lens 760/3128 e 0 to 0 dl 1528118605 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180605.gz:Jun 4 23:23:20 warble2 kernel: Lustre: 17796:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 3 previous similar messages /var/log/messages-20180605.gz:Jun 4 23:23:39 warble2 kernel: Lustre: 17796:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff885db78b1b00 x1601905923047376/t0(0) o36->ff7190b4-1a69-a284-8d45-437edbd4b34b@192.168.44.119@o2ib44:134/0 lens 760/3128 e 0 to 0 dl 1528118624 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180605.gz:Jun 4 23:24:02 warble2 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.158@tcp201, removing former export from same NID /var/log/messages-20180605.gz:Jun 4 23:24:02 warble2 kernel: Lustre: Skipped 11 previous similar messages /var/log/messages-20180605.gz:Jun 4 23:28:26 warble2 kernel: LustreError: 
80013:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528118606, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff885d82aba600/0xd96f9b49aef3e79a lrc: 3/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 22 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 80013 timeout: 0 lvb_type: 0
/var/log/messages-20180605.gz:Jun 4 23:28:49 warble2 kernel: Lustre: dagg-MDT0000: Client d5c7ff2e-8c29-354a-1a03-772bf8dd100f (at 192.168.44.14@o2ib44) reconnecting
/var/log/messages-20180605.gz:Jun 4 23:28:49 warble2 kernel: Lustre: Skipped 36 previous similar messages
/var/log/messages-20180605.gz:Jun 4 23:30:43 warble2 kernel: Lustre: dagg-MDT0001: haven't heard from client 90b226ee-19b9-fc21-a879-dea9f207d8b9 (at 10.8.49.159@tcp201) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885da9ee5400, cur 1528119043 expire 1528118893 last 1528118816
/var/log/messages-20180605.gz:Jun 4 23:30:43 warble2 kernel: Lustre: Skipped 39 previous similar messages
/var/log/messages-20180605.gz:Jun 4 23:30:51 warble2 kernel: LNet: Service thread pid 69122 was inactive for 1201.48s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 23:30:51 warble2 kernel: LNet: Skipped 3 previous similar messages
/var/log/messages-20180605.gz:Jun 4 23:30:51 warble2 kernel: Pid: 69122, comm: mdt00_086
/var/log/messages-20180605.gz:Jun 4 23:30:51 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 23:30:51 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 4 23:30:51 warble2 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:30:51 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 23:30:51 warble2 kernel: [] ldlm_completion_ast+0x63d/0x920 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:30:51 warble2 kernel: [] ? ldlm_resource_add_lock+0x6a/0x1b0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:30:51 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 23:30:51 warble2 kernel: [] ldlm_cli_enqueue_fini+0x93b/0xdc0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass]
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] ldlm_cli_enqueue+0x6c2/0x810 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] ? mdt_remote_blocking_ast+0x0/0x590 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] osp_md_object_lock+0x172/0x2e0 [osp]
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] lod_object_lock+0xf3/0x950 [lod]
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] ? htable_lookup+0xa9/0x180 [obdclass]
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] mdd_object_lock+0x3e/0xe0 [mdd]
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] mdt_remote_object_lock+0x1e5/0x710 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x36a/0x860 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] ? strlcpy+0x42/0x60
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] mdt_reint_rename+0x13/0x20 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] mdt_reint_rec+0x83/0x210 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 23:30:52 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528119052.69122
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: LNet: Service thread pid 80081 was inactive for 1203.08s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: Pid: 80081, comm: mdt00_104
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] ldlm_completion_ast+0x63d/0x920 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] ? ldlm_resource_add_lock+0x6a/0x1b0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] ldlm_cli_enqueue_fini+0x93b/0xdc0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass]
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] ldlm_cli_enqueue+0x6c2/0x810 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] ? mdt_remote_blocking_ast+0x0/0x590 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] osp_md_object_lock+0x172/0x2e0 [osp]
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] lod_object_lock+0xf3/0x950 [lod]
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] ? htable_lookup+0xa9/0x180 [obdclass]
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] mdd_object_lock+0x3e/0xe0 [mdd]
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] mdt_remote_object_lock+0x1e5/0x710 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x36a/0x860 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] ? strlcpy+0x42/0x60
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] mdt_reint_rename+0x13/0x20 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] mdt_reint_rec+0x83/0x210 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528119072.80081
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: Pid: 68943, comm: mdt01_065
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
--
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 23:31:12 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 23:31:35 warble2 kernel: Lustre: MGS: Connection restored to (at 10.8.49.155@tcp201)
/var/log/messages-20180605.gz:Jun 4 23:31:35 warble2 kernel: LNetError: 68830:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.155@tcp201
/var/log/messages-20180605.gz:Jun 4 23:31:35 warble2 kernel: LNetError: 68830:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 80 previous similar messages
/var/log/messages-20180605.gz:Jun 4 23:31:35 warble2 kernel: Lustre: Skipped 88 previous similar messages
/var/log/messages-20180605.gz:Jun 4 23:34:02 warble2 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.158@tcp201, removing former export from same NID
/var/log/messages-20180605.gz:Jun 4 23:34:02 warble2 kernel: Lustre: Skipped 9 previous similar messages
/var/log/messages-20180605.gz:Jun 4 23:35:56 warble2 kernel: Lustre: 79958:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff885d815ab000 x1601912630795600/t0(0) o36->4daddac5-b65d-0372-67b2-20b3b6329c08@192.168.44.174@o2ib44:116/0 lens 760/3128 e 0 to 0 dl 1528119361 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180605.gz:Jun 4 23:35:56 warble2 kernel: Lustre: 79958:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message
/var/log/messages-20180605.gz:Jun 4 23:38:50 warble2 kernel: Lustre: dagg-MDT0000: Client d5c7ff2e-8c29-354a-1a03-772bf8dd100f (at 192.168.44.14@o2ib44) reconnecting
/var/log/messages-20180605.gz:Jun 4 23:38:50 warble2 kernel: Lustre: Skipped 34 previous similar messages
/var/log/messages-20180605.gz:Jun 4 23:40:45 warble2 kernel: Lustre: dagg-MDT0001: haven't heard from client 90b226ee-19b9-fc21-a879-dea9f207d8b9 (at 10.8.49.159@tcp201) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88bdea2ef000, cur 1528119645 expire 1528119495 last 1528119418
/var/log/messages-20180605.gz:Jun 4 23:40:45 warble2 kernel: Lustre: Skipped 39 previous similar messages
/var/log/messages-20180605.gz:Jun 4 23:41:02 warble2 kernel: Lustre: dagg-MDT0000-osp-MDT0002: Connection to dagg-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
/var/log/messages-20180605.gz:Jun 4 23:41:02 warble2 kernel: LustreError: 69007:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528119362, 300s ago), entering recovery for dagg-MDT0000_UUID@192.168.44.22@o2ib44 ns: dagg-MDT0000-osp-MDT0002 lock: ffff885d9f29b800/0xd96f9b49b2d7497c lrc: 4/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 7 type: IBT flags: 0x1000001000000 nid: local remote: 0xd96f9b49b2d74983 expref: -99 pid: 69007 timeout: 0 lvb_type: 0
/var/log/messages-20180605.gz:Jun 4 23:41:35 warble2 kernel: Lustre: MGS: Connection restored to (at 10.8.49.155@tcp201)
/var/log/messages-20180605.gz:Jun 4 23:41:35 warble2 kernel: LNetError: 68956:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.155@tcp201
/var/log/messages-20180605.gz:Jun 4 23:41:35 warble2 kernel: LNetError: 68956:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 79 previous similar messages
/var/log/messages-20180605.gz:Jun 4 23:41:35 warble2 kernel: Lustre: Skipped 86 previous similar messages
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: LNet: Service thread pid 80013 was inactive for 1203.18s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: LNet: Skipped 1 previous similar message
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: Pid: 80013, comm: mdt00_100
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: [] ldlm_completion_ast+0x63d/0x920 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: [] ldlm_cli_enqueue_local+0x233/0x860 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x67c/0x860 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: [] mdt_reint_rename+0x13/0x20 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: [] mdt_reint_rec+0x83/0x210 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 23:43:29 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528119809.80013
/var/log/messages-20180605.gz:Jun 4 23:44:02 warble2 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.158@tcp201, removing former export from same NID
/var/log/messages-20180605.gz:Jun 4 23:44:02 warble2 kernel: Lustre: Skipped 10 previous similar messages
/var/log/messages-20180605.gz:Jun 4 23:48:32 warble2 kernel: Lustre: 80161:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff885dbb449e00 x1601912630811824/t0(0) o36->4daddac5-b65d-0372-67b2-20b3b6329c08@192.168.44.174@o2ib44:117/0 lens 760/3128 e 0 to 0 dl 1528120117 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180605.gz:Jun 4 23:48:51 warble2 kernel: Lustre: dagg-MDT0000: Client d5c7ff2e-8c29-354a-1a03-772bf8dd100f (at 192.168.44.14@o2ib44) reconnecting
/var/log/messages-20180605.gz:Jun 4 23:48:51 warble2 kernel: Lustre: Skipped 34 previous similar messages
/var/log/messages-20180605.gz:Jun 4 23:51:22 warble2 kernel: Lustre: dagg-MDT0000: haven't heard from client 2f97c2f1-1474-b879-139d-cd5539f51fc8 (at 10.8.49.176@tcp201) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88bdd17cb000, cur 1528120282 expire 1528120132 last 1528120055
/var/log/messages-20180605.gz:Jun 4 23:51:22 warble2 kernel: Lustre: Skipped 43 previous similar messages
/var/log/messages-20180605.gz:Jun 4 23:51:45 warble2 kernel: Lustre: MGS: Connection restored to (at 10.8.49.176@tcp201)
/var/log/messages-20180605.gz:Jun 4 23:51:45 warble2 kernel: LNetError: 19577:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.176@tcp201
/var/log/messages-20180605.gz:Jun 4 23:51:45 warble2 kernel: LNetError: 19577:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 81 previous similar messages
/var/log/messages-20180605.gz:Jun 4 23:51:45 warble2 kernel: Lustre: Skipped 86 previous similar messages
/var/log/messages-20180605.gz:Jun 4 23:54:02 warble2 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.158@tcp201, removing former export from same NID
/var/log/messages-20180605.gz:Jun 4 23:54:02 warble2 kernel: Lustre: Skipped 9 previous similar messages
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: LNet: Service thread pid 69007 was inactive for 1200.78s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: Pid: 69007, comm: mdt00_067
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] ldlm_completion_ast+0x63d/0x920 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] ? ldlm_resource_add_lock+0x6a/0x1b0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] ldlm_cli_enqueue_fini+0x93b/0xdc0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass]
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] ldlm_cli_enqueue+0x6c2/0x810 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] ? mdt_remote_blocking_ast+0x0/0x590 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] osp_md_object_lock+0x172/0x2e0 [osp]
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] lod_object_lock+0xf3/0x950 [lod]
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] ? htable_lookup+0xa9/0x180 [obdclass]
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] mdd_object_lock+0x3e/0xe0 [mdd]
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] mdt_remote_object_lock+0x1e5/0x710 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x36a/0x860 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] ? strlcpy+0x42/0x60
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] mdt_reint_rename+0x13/0x20 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] mdt_reint_rec+0x83/0x210 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel:
/var/log/messages-20180605.gz:Jun 4 23:56:03 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528120563.69007
/var/log/messages-20180605.gz:Jun 4 23:58:52 warble2 kernel: Lustre: dagg-MDT0000: Client d5c7ff2e-8c29-354a-1a03-772bf8dd100f (at 192.168.44.14@o2ib44) reconnecting
/var/log/messages-20180605.gz:Jun 4 23:58:52 warble2 kernel: Lustre: Skipped 33 previous similar messages
/var/log/messages-20180605.gz:Jun 5 00:01:45 warble2 kernel: Lustre: MGS: Connection restored to (at 10.8.49.176@tcp201)
--
/var/log/messages-20180605.gz:Jun 5 02:34:53 warble2 kernel: LNet: 16028:0:(o2iblnd_cb.c:2341:kiblnd_passive_connect()) Skipped 3 previous similar messages
/var/log/messages-20180605.gz:Jun 5 02:37:50 warble2 kernel: LNet: 16028:0:(o2iblnd_cb.c:2341:kiblnd_passive_connect()) Can't accept conn from 192.168.44.15@o2ib44 (version 12): max_frags 256 too large (32 wanted)
/var/log/messages-20180605.gz:Jun 5 02:37:50 warble2 kernel: LNet: 16028:0:(o2iblnd_cb.c:2341:kiblnd_passive_connect()) Skipped 3 previous similar messages
/var/log/messages-20180605.gz:Jun 5 02:39:09 warble2 kernel: Lustre: dagg-MDT0000: Client d5c7ff2e-8c29-354a-1a03-772bf8dd100f (at 192.168.44.14@o2ib44) reconnecting
/var/log/messages-20180605.gz:Jun 5 02:39:09 warble2 kernel: Lustre: Skipped 50 previous similar messages
/var/log/messages-20180605.gz:Jun 5 02:40:34 warble2 kernel: LNet: 16028:0:(o2iblnd_cb.c:2341:kiblnd_passive_connect()) Can't accept conn from 192.168.44.15@o2ib44 (version 12): max_frags 256 too large (32 wanted)
/var/log/messages-20180605.gz:Jun 5 02:40:34 warble2 kernel: LNet: 16028:0:(o2iblnd_cb.c:2341:kiblnd_passive_connect()) Skipped 3 previous similar messages
/var/log/messages-20180605.gz:Jun 5 02:42:03 warble2 kernel: LNet: 16028:0:(o2iblnd_cb.c:2341:kiblnd_passive_connect()) Can't accept conn from 192.168.44.15@o2ib44 (version 12): max_frags 256 too large (32 wanted)
/var/log/messages-20180605.gz:Jun 5 02:42:03 warble2 kernel: LNet: 16028:0:(o2iblnd_cb.c:2341:kiblnd_passive_connect()) Skipped 3 previous similar messages
/var/log/messages-20180605.gz:Jun 5 02:42:38 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to (at 192.168.44.13@o2ib44)
/var/log/messages-20180605.gz:Jun 5 02:42:38 warble2 kernel: Lustre: Skipped 142 previous similar messages
/var/log/messages-20180605.gz:Jun 5 02:43:44 warble2 kernel: LNet: 16028:0:(o2iblnd_cb.c:2341:kiblnd_passive_connect()) Can't accept conn from 192.168.44.15@o2ib44 (version 12): max_frags 256 too large (32 wanted)
/var/log/messages-20180605.gz:Jun 5 02:43:44 warble2 kernel: LNet: 16028:0:(o2iblnd_cb.c:2341:kiblnd_passive_connect()) Skipped 3 previous similar messages
/var/log/messages-20180605.gz:Jun 5 02:44:11 warble2 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.221@tcp201, removing former export from same NID
/var/log/messages-20180605.gz:Jun 5 02:44:11 warble2 kernel: Lustre: Skipped 66 previous similar messages
/var/log/messages-20180605.gz:Jun 5 02:46:01 warble2 kernel: LustreError: 17633:0:(ldlm_lib.c:3236:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff88be33669c50 x1601933770954736/t0(0) o256->5132f085-03cf-d894-16fc-f637cc1b8d52@10.8.49.176@tcp201:216/0 lens 304/240 e 2 to 0 dl 1528130786 ref 1 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180605.gz:Jun 5 02:46:28 warble2 kernel: LNet: 16028:0:(o2iblnd_cb.c:2341:kiblnd_passive_connect()) Can't accept conn from 192.168.44.15@o2ib44 (version 12): max_frags 256 too large (32 wanted)
/var/log/messages-20180605.gz:Jun 5 02:46:28 warble2 kernel: LNet: 16028:0:(o2iblnd_cb.c:2341:kiblnd_passive_connect()) Skipped 3 previous similar messages
/var/log/messages-20180605.gz:Jun 5 02:46:53 warble2 kernel: Lustre: dagg-MDT0001: haven't heard from client 19829e49-7e7c-a374-4aae-93f5d07c53f8 (at 10.8.49.155@tcp201) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88bde9c18400, cur 1528130813 expire 1528130663 last 1528130586
/var/log/messages-20180605.gz:Jun 5 02:46:53 warble2 kernel: Lustre: Skipped 19 previous similar messages
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: LNet: Service thread pid 162230 was inactive for 200.09s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: Pid: 162230, comm: mdt_rdpg01_018
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel:
/var/log/messages-20180605.gz:Jun 5 02:47:15 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528130835.162230
/var/log/messages-20180605.gz:Jun 5 02:49:10 warble2 kernel: Lustre: dagg-MDT0000: Client d5c7ff2e-8c29-354a-1a03-772bf8dd100f (at 192.168.44.14@o2ib44) reconnecting
/var/log/messages-20180605.gz:Jun 5 02:49:10 warble2 kernel: Lustre: Skipped 92 previous similar messages
/var/log/messages-20180605.gz:Jun 5 02:49:37 warble2 kernel: LNet: 16028:0:(o2iblnd_cb.c:2341:kiblnd_passive_connect()) Can't accept conn from 192.168.44.15@o2ib44 (version 12): max_frags 256 too large (32 wanted)
/var/log/messages-20180605.gz:Jun 5 02:49:37 warble2 kernel: LNet: 16028:0:(o2iblnd_cb.c:2341:kiblnd_passive_connect()) Skipped 3 previous similar messages
/var/log/messages-20180605.gz:Jun 5 02:52:21 warble2 kernel: LNet: 16028:0:(o2iblnd_cb.c:2341:kiblnd_passive_connect()) Can't accept conn from 192.168.44.15@o2ib44 (version 12): max_frags 256 too large (32 wanted)
/var/log/messages-20180605.gz:Jun 5 02:52:21 warble2 kernel: LNet: 16028:0:(o2iblnd_cb.c:2341:kiblnd_passive_connect()) Skipped 3 previous similar messages
/var/log/messages-20180605.gz:Jun 5 02:52:39 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to (at 192.168.44.13@o2ib44)
/var/log/messages-20180605.gz:Jun 5 02:52:39 warble2 kernel: Lustre: Skipped 131 previous similar messages
/var/log/messages-20180605.gz:Jun 5 02:53:49 warble2 kernel: Lustre: 241428:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bdd12a0c00 x1601905493362912/t0(0) o35->9e8ae17b-cc98-7869-2d40-26a839b2b3a8@192.168.44.105@o2ib44:664/0 lens 512/696 e 24 to 0 dl 1528131234 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180605.gz:Jun 5 02:55:31 warble2 kernel: LNet: 16028:0:(o2iblnd_cb.c:2341:kiblnd_passive_connect()) Can't accept conn from 192.168.44.15@o2ib44 (version 12): max_frags 256 too large (32 wanted)
/var/log/messages-20180605.gz:Jun 5 02:55:31 warble2 kernel: LNet: 16028:0:(o2iblnd_cb.c:2341:kiblnd_passive_connect()) Skipped 3 previous similar messages
/var/log/messages-20180605.gz:Jun 5 02:55:59 warble2 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.221@tcp201, removing former export from same NID
/var/log/messages-20180605.gz:Jun 5 02:55:59 warble2 kernel: Lustre: Skipped 24 previous similar messages
/var/log/messages-20180605.gz:Jun 5 02:56:20 warble2 kernel: LNetError: 21268:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.221@tcp201
/var/log/messages-20180605.gz:Jun 5 02:56:20 warble2 kernel: LNetError: 21268:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 67 previous similar messages
/var/log/messages-20180605.gz:Jun 5 02:57:35 warble2 kernel: LNetError: 68813:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.155@tcp201
/var/log/messages-20180605.gz:Jun 5 02:57:35 warble2 kernel: LNetError: 68813:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 33 previous similar messages
/var/log/messages-20180605.gz:Jun 5 02:59:11 warble2 kernel: Lustre: dagg-MDT0000: Client d5c7ff2e-8c29-354a-1a03-772bf8dd100f (at 192.168.44.14@o2ib44) reconnecting
/var/log/messages-20180605.gz:Jun 5 02:59:11 warble2 kernel: Lustre: Skipped 112 previous similar messages
/var/log/messages-20180605.gz:Jun 5 02:59:18 warble1 kernel: LNet: 639:0:(o2iblnd_cb.c:2341:kiblnd_passive_connect()) Can't accept conn from 192.168.44.15@o2ib44 (version 12): max_frags 256 too large (32 wanted)
/var/log/messages-20180605.gz:Jun 5 02:59:18 warble1 kernel: LNet: 639:0:(o2iblnd_cb.c:2341:kiblnd_passive_connect()) Skipped 3 previous similar messages
/var/log/messages-20180605.gz:Jun 5 03:00:05 warble2 kernel: LNetError: 21268:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.159@tcp201
/var/log/messages-20180605.gz:Jun 5 03:00:05 warble2 kernel: LNetError: 21268:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 47 previous similar messages
/var/log/messages-20180605.gz:Jun 5 03:02:40 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to (at 192.168.44.13@o2ib44)
/var/log/messages-20180605.gz:Jun 5 03:02:40 warble2 kernel: Lustre: Skipped 136 previous similar messages
/var/log/messages-20180605.gz:Jun 5 03:05:06 warble2 kernel: LNetError: 20575:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.176@tcp201
/var/log/messages-20180605.gz:Jun 5 03:05:06 warble2 kernel: LNetError: 20575:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 59 previous similar messages
/var/log/messages-20180605.gz:Jun 5 03:06:26 warble2 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.221@tcp201, removing former export from same NID
/var/log/messages-20180605.gz:Jun 5 03:06:26 warble2 kernel: Lustre: Skipped 31 previous similar messages
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: LNet: Service thread pid 161439 was inactive for 200.11s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: Pid: 161439, comm: mdt_rdpg01_025
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: #012Call Trace:
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel:
/var/log/messages-20180605.gz:Jun 5 03:08:19 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528132099.161439
/var/log/messages-20180605.gz:Jun 5 03:09:12 warble2 kernel: Lustre: dagg-MDT0000: Client d5c7ff2e-8c29-354a-1a03-772bf8dd100f (at 192.168.44.14@o2ib44) reconnecting
/var/log/messages-20180605.gz:Jun 5 03:09:12 warble2 kernel: Lustre: Skipped 100 previous similar messages
/var/log/messages-20180605.gz:Jun 5 03:12:41 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to (at 192.168.44.13@o2ib44)
/var/log/messages-20180605.gz:Jun 5 03:12:41 warble2 kernel: Lustre: Skipped 99 previous similar messages
/var/log/messages-20180605.gz:Jun 5 03:14:52 transom1 fm0_sm[109211]: WARN [PmEngine]: PM: PmPrintExceededPort: Routing of 134 Exceeded Threshold of 100. opa-l-6 Guid 0x00117501020c38c8 LID 0x6 Port 3 Neighbor: warble2 hfi1_0 Guid 0x001175010171d0f9 LID 0x78 Port 1
/var/log/messages-20180605.gz:Jun 5 03:14:54 warble2 kernel: Lustre: 69216:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88be63274050 x1601905493416256/t0(0) o35->9e8ae17b-cc98-7869-2d40-26a839b2b3a8@192.168.44.105@o2ib44:419/0 lens 512/696 e 24 to 0 dl 1528132499 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180605.gz:Jun 5 03:15:08 transom1 opaQueryGmetric: /root/ib/opaQueryGmetric.py: Error: host warble1 not in post
/var/log/messages-20180605.gz:Jun 5 03:15:08 transom1 opaQueryGmetric: /root/ib/opaQueryGmetric.py: Error: host warble2 not in post
/var/log/messages-20180605.gz:Jun 5 03:15:25 transom1 opaQueryGmetric: /root/ib/opaQueryGmetric.py: Error: host warble1 not in post
/var/log/messages-20180605.gz:Jun 5 03:15:25 transom1 opaQueryGmetric: /root/ib/opaQueryGmetric.py: Error: host warble2 not in post
/var/log/messages-20180605.gz:Jun 5 03:15:36 warble2 kernel: LNetError: 68928:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.221@tcp201
/var/log/messages-20180605.gz:Jun 5 03:15:36 warble2 kernel: LNetError: 68928:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 87 previous similar messages
--
/var/log/messages-20180606.gz:Jun 5 03:48:53 warble1 kernel: Lustre: Skipped 3 previous similar messages
/var/log/messages-20180606.gz:Jun 5 03:49:21 warble1 kernel: Lustre: dagg-MDT0000: Client 7a807933-486b-2969-fc5f-eddf31a41573 (at 10.8.49.221@tcp201) reconnecting
/var/log/messages-20180606.gz:Jun 5 03:49:21 warble1 kernel: Lustre: Skipped 2 previous similar messages
/var/log/messages-20180606.gz:Jun 5 03:49:42 warble1 kernel: Lustre: dagg-MDT0002: haven't heard from client 19829e49-7e7c-a374-4aae-93f5d07c53f8 (at 10.8.49.155@tcp201) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88bdc775b800, cur 1528134582 expire 1528134432 last 1528134355
/var/log/messages-20180606.gz:Jun 5 03:49:42 warble1 kernel: Lustre: Skipped 5 previous similar messages
/var/log/messages-20180606.gz:Jun 5 03:50:37 warble1 kernel: Lustre: dagg-MDT0000: haven't heard from client 7a807933-486b-2969-fc5f-eddf31a41573 (at 10.8.49.221@tcp201) in 227 seconds. I think it's dead, and I am evicting it.
exp ffff88bd9e99fc00, cur 1528134637 expire 1528134487 last 1528134410 /var/log/messages-20180606.gz:Jun 5 03:50:37 warble1 kernel: Lustre: Skipped 3 previous similar messages /var/log/messages-20180606.gz:Jun 5 03:51:11 warble1 kernel: LustreError: 54811:0:(llog_cat.c:795:llog_cat_cancel_records()) dagg-MDT0001-osp-MDT0002: fail to cancel 0 of 1 llog-records: rc = -116 /var/log/messages-20180606.gz:Jun 5 03:51:20 warble1 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.155@tcp201, removing former export from same NID /var/log/messages-20180606.gz:Jun 5 03:51:20 warble1 kernel: Lustre: Skipped 1 previous similar message /var/log/messages-20180606.gz:Jun 5 03:51:41 warble1 kernel: bnx2x 0000:19:00.3 em4: NIC Link is Down /var/log/messages-20180606.gz:Jun 5 03:51:44 warble1 kernel: bnx2x 0000:19:00.3 em4: NIC Link is Up, 1000 Mbps full duplex, Flow control: ON - receive & transmit /var/log/messages-20180606.gz:Jun 5 03:52:10 warble1 kernel: Lustre: dagg-MDT0000: Client 19829e49-7e7c-a374-4aae-93f5d07c53f8 (at 10.8.49.155@tcp201) reconnecting /var/log/messages-20180606.gz:Jun 5 03:52:10 warble1 kernel: Lustre: Skipped 4 previous similar messages /var/log/messages-20180606.gz:Jun 5 03:52:37 warble1 kernel: Lustre: MGS: haven't heard from client 9153ba9b-33c5-1fd8-fd28-ab871ff45382 (at 10.8.49.155@tcp201) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff885d86268400, cur 1528134757 expire 1528134607 last 1528134530 /var/log/messages-20180606.gz:Jun 5 03:52:37 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180606.gz:Jun 5 03:53:25 warble1 kernel: Lustre: dagg-MDT0002: Client 19829e49-7e7c-a374-4aae-93f5d07c53f8 (at 10.8.49.155@tcp201) reconnecting /var/log/messages-20180606.gz:Jun 5 03:53:25 warble1 kernel: Lustre: Skipped 5 previous similar messages /var/log/messages-20180606.gz:Jun 5 03:53:53 warble1 kernel: Lustre: dagg-MDT0001: haven't heard from client 2f97c2f1-1474-b879-139d-cd5539f51fc8 (at 10.8.49.176@tcp201) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88bdc8ecf800, cur 1528134833 expire 1528134683 last 1528134606 /var/log/messages-20180606.gz:Jun 5 03:53:53 warble1 kernel: Lustre: Skipped 6 previous similar messages /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: LNet: Service thread pid 44421 was inactive for 200.06s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: Pid: 44421, comm: mdt_rdpg01_001 /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: #012Call Trace: /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: [] ? 
top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd] /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: /var/log/messages-20180606.gz:Jun 5 03:54:26 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528134866.44421 /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: LNet: Service thread pid 44415 was inactive for 200.70s. 
The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: Pid: 44415, comm: mdt01_000 /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: #012Call Trace: /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd] /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: [] mdt_reint_setattr+0xba5/0x1060 [mdt] /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: [] mdt_reint_rec+0x83/0x210 [mdt] /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt] /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: [] mdt_reint+0x67/0x140 [mdt] /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: [] ? 
default_wake_function+0x12/0x20 /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: /var/log/messages-20180606.gz:Jun 5 03:54:30 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528134870.44415 /var/log/messages-20180606.gz:Jun 5 03:55:50 transom1 rsyncd[41547]: connect from warble2 (192.168.44.22) /var/log/messages-20180606.gz:Jun 5 03:55:50 transom1 rsyncd[41547]: rsync on images/centos-7.4-minimal-server-07//etc/rsync.options.farnarkle from boot@warble2 (192.168.44.22) /var/log/messages-20180606.gz:Jun 5 03:55:50 transom1 rsyncd[41548]: connect from warble2 (192.168.44.22) /var/log/messages-20180606.gz:Jun 5 03:55:50 transom1 rsyncd[41548]: rsync on images/centos-7.4-minimal-server-07// from boot@warble2 (192.168.44.22) /var/log/messages-20180606.gz:Jun 5 03:55:55 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 10.8.49.155@tcp201) /var/log/messages-20180606.gz:Jun 5 03:55:55 warble1 kernel: Lustre: Skipped 67 previous similar messages /var/log/messages-20180606.gz:Jun 5 03:56:04 warble1 kernel: Lustre: dagg-MDT0002: haven't heard from client 07a7a787-ad21-2b1a-5302-c80ca70de269 (at 10.8.49.158@tcp201) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff88bca33db400, cur 1528134964 expire 1528134814 last 1528134737 /var/log/messages-20180606.gz:Jun 5 03:56:04 warble1 kernel: Lustre: Skipped 11 previous similar messages /var/log/messages-20180606.gz:Jun 5 03:56:18 transom1 rsyncd[41609]: connect from warble2 (192.168.44.22) /var/log/messages-20180606.gz:Jun 5 03:56:18 transom1 rsyncd[41609]: rsync on images/centos-7.4-minimal-server-07//lib/modules/3.10.0-693.21.1.el7.x86_64/ from boot@warble2 (192.168.44.22) /var/log/messages-20180606.gz:Jun 5 03:56:20 warble1 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.155@tcp201, removing former export from same NID /var/log/messages-20180606.gz:Jun 5 03:56:20 warble1 kernel: Lustre: Skipped 4 previous similar messages -- /var/log/messages-20180606.gz:Jun 5 04:00:37 warble1 kernel: Lustre: dagg-MDT0000: haven't heard from client 7a807933-486b-2969-fc5f-eddf31a41573 (at 10.8.49.221@tcp201) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88bcacb4ec00, cur 1528135237 expire 1528135087 last 1528135010 /var/log/messages-20180606.gz:Jun 5 04:00:37 warble1 kernel: Lustre: Skipped 17 previous similar messages /var/log/messages-20180606.gz:Jun 5 04:01:01 warble1 kernel: Lustre: 47797:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bcac663900 x1601996037330688/t0(0) o35->e5a97d67-ed84-7335-830f-861443a6ce18@192.168.44.13@o2ib44:166/0 lens 512/696 e 23 to 0 dl 1528135266 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180606.gz:Jun 5 04:01:04 warble1 kernel: Lustre: 48071:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bca5738600 x1601996037625840/t0(0) o36->e5a97d67-ed84-7335-830f-861443a6ce18@192.168.44.13@o2ib44:169/0 lens 608/3392 e 23 to 0 dl 1528135269 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180606.gz:Jun 5 04:01:20 warble1 kernel: Lustre: MGS: Received new LWP connection from 
10.8.49.155@tcp201, removing former export from same NID /var/log/messages-20180606.gz:Jun 5 04:01:20 warble1 kernel: Lustre: Skipped 4 previous similar messages /var/log/messages-20180606.gz:Jun 5 04:02:10 warble1 kernel: Lustre: dagg-MDT0000: Client 19829e49-7e7c-a374-4aae-93f5d07c53f8 (at 10.8.49.155@tcp201) reconnecting /var/log/messages-20180606.gz:Jun 5 04:02:10 warble1 kernel: Lustre: Skipped 15 previous similar messages /var/log/messages-20180606.gz:Jun 5 04:06:02 warble1 kernel: Lustre: dagg-MDT0001: Connection restored to (at 10.8.49.158@tcp201) /var/log/messages-20180606.gz:Jun 5 04:06:02 warble1 kernel: Lustre: Skipped 81 previous similar messages /var/log/messages-20180606.gz:Jun 5 04:07:41 warble1 kernel: LNetError: 45085:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.221@tcp201 /var/log/messages-20180606.gz:Jun 5 04:07:41 warble1 kernel: LNetError: 45085:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 81 previous similar messages /var/log/messages-20180606.gz:Jun 5 04:09:42 warble1 kernel: Lustre: dagg-MDT0002: haven't heard from client 19829e49-7e7c-a374-4aae-93f5d07c53f8 (at 10.8.49.155@tcp201) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88b94f3b8400, cur 1528135782 expire 1528135632 last 1528135555 /var/log/messages-20180606.gz:Jun 5 04:09:42 warble1 kernel: Lustre: Skipped 35 previous similar messages /var/log/messages-20180606.gz:Jun 5 04:10:23 warble2 systemd: Starting Cleanup of Temporary Directories... /var/log/messages-20180606.gz:Jun 5 04:10:23 warble2 systemd: Started Cleanup of Temporary Directories. 
/var/log/messages-20180606.gz:Jun 5 04:11:08 warble1 kernel: Lustre: dagg-MDT0002: Client e5a97d67-ed84-7335-830f-861443a6ce18 (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180606.gz:Jun 5 04:11:08 warble1 kernel: Lustre: Skipped 28 previous similar messages
/var/log/messages-20180606.gz:Jun 5 04:11:20 warble1 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.155@tcp201, removing former export from same NID
/var/log/messages-20180606.gz:Jun 5 04:11:20 warble1 kernel: Lustre: Skipped 9 previous similar messages
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: LNet: Service thread pid 59266 was inactive for 200.06s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: Pid: 59266, comm: mdt_rdpg00_016
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: #012Call Trace:
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel:
/var/log/messages-20180606.gz:Jun 5 04:11:39 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528135899.59266
/var/log/messages-20180606.gz:Jun 5 04:12:24 warble2 ntpd[13905]: 0.0.0.0 0612 02 freq_set kernel 85.592 PPM
/var/log/messages-20180606.gz:Jun 5 04:12:24 warble2 ntpd[13905]: 0.0.0.0 0615 05 clock_sync
/var/log/messages-20180606.gz:Jun 5 04:16:20 warble1 kernel: Lustre: MGS: Connection restored to (at 10.8.49.155@tcp201)
/var/log/messages-20180606.gz:Jun 5 04:16:20 warble1 kernel: Lustre: Skipped 81 previous similar messages
/var/log/messages-20180606.gz:Jun 5 04:17:07 warble1 kernel: Lustre: 47797:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88b8bc367500 x1601996037972272/t0(0) o35->e5a97d67-ed84-7335-830f-861443a6ce18@192.168.44.13@o2ib44:377/0 lens 512/696 e 1 to 0 dl 1528136232 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180606.gz:Jun 5 04:18:00 warble1 kernel: LNetError: 44338:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.159@tcp201
/var/log/messages-20180606.gz:Jun 5 04:18:00 warble1 kernel: LNetError: 44338:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 81 previous similar messages
/var/log/messages-20180606.gz:Jun 5 04:18:14 warble1 kernel: Lustre: 59139:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885de5125d00 x1601906749510800/t0(0) o35->c7c406ea-6ebe-a8cb-c838-f6840e69ac20@192.168.44.187@o2ib44:444/0 lens 512/696 e 24 to 0 dl 1528136299 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180606.gz:Jun 5 04:20:37 warble1 kernel: Lustre: dagg-MDT0000: haven't heard from client 7a807933-486b-2969-fc5f-eddf31a41573 (at 10.8.49.221@tcp201) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88b5bacb4000, cur 1528136437 expire 1528136287 last 1528136210
/var/log/messages-20180606.gz:Jun 5 04:20:37 warble1 kernel: Lustre: Skipped 43 previous similar messages
/var/log/messages-20180606.gz:Jun 5 04:21:09 warble1 kernel: Lustre: dagg-MDT0002: Client e5a97d67-ed84-7335-830f-861443a6ce18 (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180606.gz:Jun 5 04:21:09 warble1 kernel: Lustre: Skipped 31 previous similar messages
/var/log/messages-20180606.gz:Jun 5 04:21:21 warble1 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.176@tcp201, removing former export from same NID
/var/log/messages-20180606.gz:Jun 5 04:21:21 warble1 kernel: Lustre: Skipped 10 previous similar messages
/var/log/messages-20180606.gz:Jun 5 04:26:21 warble1 kernel: Lustre: MGS: Connection restored to 5132f085-03cf-d894-16fc-f637cc1b8d52 (at 10.8.49.176@tcp201)
/var/log/messages-20180606.gz:Jun 5 04:26:21 warble1 kernel: Lustre: Skipped 82 previous similar messages
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: LNet: Service thread pid 59492 was inactive for 1200.64s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: Pid: 59492, comm: mdt_rdpg01_021
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: #012Call Trace:
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel:
/var/log/messages-20180606.gz:Jun 5 04:27:13 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528136833.59492
/var/log/messages-20180606.gz:Jun 5 04:28:25 warble1 kernel: LNetError: 48071:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.155@tcp201
/var/log/messages-20180606.gz:Jun 5 04:28:25 warble1 kernel: LNetError: 48071:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 80 previous similar messages
/var/log/messages-20180606.gz:Jun 5 04:30:37 warble1 kernel: Lustre: dagg-MDT0000: haven't heard from client 7a807933-486b-2969-fc5f-eddf31a41573 (at 10.8.49.221@tcp201) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88b622df8800, cur 1528137037 expire 1528136887 last 1528136810
/var/log/messages-20180606.gz:Jun 5 04:30:37 warble1 kernel: Lustre: Skipped 39 previous similar messages
/var/log/messages-20180606.gz:Jun 5 04:31:10 warble1 kernel: Lustre: dagg-MDT0002: Client e5a97d67-ed84-7335-830f-861443a6ce18 (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180606.gz:Jun 5 04:31:10 warble1 kernel: Lustre: Skipped 31 previous similar messages
/var/log/messages-20180606.gz:Jun 5 04:31:52 warble1 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.158@tcp201, removing former export from same NID
/var/log/messages-20180606.gz:Jun 5 04:31:52 warble1 kernel: Lustre: Skipped 10 previous similar messages
/var/log/messages-20180606.gz:Jun 5 04:32:04 warble1 kernel: Lustre: 58887:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bc0c381e00 x1601996038319728/t0(0) o35->e5a97d67-ed84-7335-830f-861443a6ce18@192.168.44.13@o2ib44:519/0 lens 512/696 e 1 to 0 dl 1528137129 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180606.gz:Jun 5 04:36:51 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to (at 10.8.49.221@tcp201)
/var/log/messages-20180606.gz:Jun 5 04:36:51 warble1 kernel: Lustre: dagg-MDT0000: Connection restored to (at 10.8.49.221@tcp201)
/var/log/messages-20180606.gz:Jun 5 04:36:51 warble1 kernel: Lustre: Skipped 82 previous similar messages
/var/log/messages-20180606.gz:Jun 5 04:38:32 warble1 kernel: LNetError: 48063:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.158@tcp201
/var/log/messages-20180606.gz:Jun 5 04:38:32 warble1 kernel: LNetError: 48063:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 80 previous similar messages
/var/log/messages-20180606.gz:Jun 5 04:40:37 warble1 kernel: Lustre: dagg-MDT0000: haven't heard from client 7a807933-486b-2969-fc5f-eddf31a41573 (at 10.8.49.221@tcp201) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88ba19037800, cur 1528137637 expire 1528137487 last 1528137410
/var/log/messages-20180606.gz:Jun 5 04:40:37 warble1 kernel: Lustre: Skipped 39 previous similar messages
/var/log/messages-20180606.gz:Jun 5 04:41:11 warble1 kernel: Lustre: dagg-MDT0002: Client e5a97d67-ed84-7335-830f-861443a6ce18 (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180606.gz:Jun 5 04:41:11 warble1 kernel: Lustre: Skipped 31 previous similar messages
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: LNet: Service thread pid 47797 was inactive for 1201.57s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: Pid: 47797, comm: mdt_rdpg01_004
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: #012Call Trace:
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd]
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel:
/var/log/messages-20180606.gz:Jun 5 04:42:10 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528137730.47797
/var/log/messages-20180606.gz:Jun 5 04:42:41 warble1 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.221@tcp201, removing former export from same NID
/var/log/messages-20180606.gz:Jun 5 04:42:41 warble1 kernel: Lustre: Skipped 10 previous similar messages
/var/log/messages-20180606.gz:Jun 5 04:46:52 warble1 kernel: Lustre: MGS: Connection restored to (at 10.8.49.158@tcp201)
/var/log/messages-20180606.gz:Jun 5 04:46:52 warble1 kernel: Lustre: Skipped 81 previous similar messages
/var/log/messages-20180606.gz:Jun 5 04:48:50 warble1 kernel: LNetError: 45113:0:(lib-move.c:1557:lnet_select_pathway()) no route to 10.8.49.155@tcp201
/var/log/messages-20180606.gz:Jun 5 04:48:50 warble1 kernel: LNetError: 45113:0:(lib-move.c:1557:lnet_select_pathway()) Skipped 80 previous similar messages
/var/log/messages-20180606.gz:Jun 5 04:50:57 warble1 kernel: Lustre: dagg-MDT0001: haven't heard from client 19829e49-7e7c-a374-4aae-93f5d07c53f8 (at 10.8.49.155@tcp201) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88b79ad36c00, cur 1528138257 expire 1528138107 last 1528138030
/var/log/messages-20180606.gz:Jun 5 04:50:57 warble1 kernel: Lustre: Skipped 40 previous similar messages
/var/log/messages-20180606.gz:Jun 5 04:51:12 warble1 kernel: Lustre: dagg-MDT0002: Client e5a97d67-ed84-7335-830f-861443a6ce18 (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180606.gz:Jun 5 04:51:12 warble1 kernel: Lustre: Skipped 31 previous similar messages
/var/log/messages-20180606.gz:Jun 5 04:52:17 warble1 kernel: Lustre: 155369:0:(genops.c:1734:obd_export_evict_by_uuid()) dagg-MDT0000: evicting 19829e49-7e7c-a374-4aae-93f5d07c53f8 at adminstrative request
/var/log/messages-20180606.gz:Jun 5 04:52:47 warble1 kernel: Lustre: 155369:0:(genops.c:1734:obd_export_evict_by_uuid()) dagg-MDT0000: evicting 07a7a787-ad21-2b1a-5302-c80ca70de269 at adminstrative request
--
/var/log/messages-20180606.gz:Jun 5 05:50:26 warble2 ntpd[14473]: ntpd exiting on signal 15
/var/log/messages-20180606.gz:Jun 5 05:50:26 warble2 systemd: Stopping PCS GUI and remote configuration interface...
/var/log/messages-20180606.gz:Jun 5 05:50:26 warble2 systemd: Stopping Machine Check Exception Logging Daemon...
/var/log/messages-20180606.gz:Jun 5 05:50:26 warble2 systemd: Stopping Network Time Service...
/var/log/messages-20180606.gz:Jun 5 05:50:26 warble2 systemd: Stopping Login Service...
/var/log/messages-20180606.gz:Jun 5 05:50:26 warble2 systemd: Stopping SYSV: pslogger process logger...
/var/log/messages-20180606.gz:Jun 5 05:50:32 transom1 fm0_sm[109211]: transom1; MSG:NOTICE|SM:transom1:port 1|COND:#4 Disappearance from fabric|NODE:warble2 hfi1_0:port 1:0x001175010171d0f9|LINKEDTO:opa-l-6:port 3:0x00117501020c38c8|DETAIL:Node type: hfi /var/log/messages-20180606.gz:Jun 5 05:50:42 transom1 opaQueryGmetric: /root/ib/opaQueryGmetric.py: Error: host warble2 not in post /var/log/messages-20180606.gz:Jun 5 05:50:48 warble1 kernel: bnx2x 0000:19:00.3 em4: NIC Link is Down /var/log/messages-20180606.gz:Jun 5 05:50:51 warble1 kernel: bnx2x 0000:19:00.3 em4: NIC Link is Up, 1000 Mbps full duplex, Flow control: ON - receive & transmit /var/log/messages-20180606.gz:Jun 5 05:50:59 transom1 opaQueryGmetric: /root/ib/opaQueryGmetric.py: Error: host warble2 not in post /var/log/messages-20180606.gz:Jun 5 05:51:17 transom1 opaQueryGmetric: /root/ib/opaQueryGmetric.py: Error: host warble2 not in post /var/log/messages-20180606.gz:Jun 5 05:51:18 warble1 kernel: Lustre: dagg-MDT0002: Client e5a97d67-ed84-7335-830f-861443a6ce18 (at 192.168.44.13@o2ib44) reconnecting /var/log/messages-20180606.gz:Jun 5 05:51:18 warble1 kernel: Lustre: Skipped 31 previous similar messages /var/log/messages-20180606.gz:Jun 5 05:51:34 transom1 opaQueryGmetric: /root/ib/opaQueryGmetric.py: Error: host warble2 not in post /var/log/messages-20180606.gz:Jun 5 05:51:52 transom1 opaQueryGmetric: /root/ib/opaQueryGmetric.py: Error: host warble2 not in post /var/log/messages-20180606.gz:Jun 5 05:52:09 transom1 opaQueryGmetric: /root/ib/opaQueryGmetric.py: Error: host warble2 not in post /var/log/messages-20180606.gz:Jun 5 05:52:52 10.7.2.95 StorageArray: warble-md3420;104;Informational;Needs attention condition resolved /var/log/messages-20180606.gz:Jun 5 05:53:07 warble1 kernel: Lustre: dagg-MDT0002: haven't heard from client 7a807933-486b-2969-fc5f-eddf31a41573 (at 10.8.49.221@tcp201) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff88bdc6a55400, cur 1528141987 expire 1528141837 last 1528141760 /var/log/messages-20180606.gz:Jun 5 05:53:07 warble1 kernel: Lustre: Skipped 41 previous similar messages /var/log/messages-20180606.gz:Jun 5 05:54:32 warble1 kernel: LNet: Service thread pid 48536 was inactive for 200.40s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180606.gz:Jun 5 05:54:32 warble1 kernel: Pid: 48536, comm: mdt_rdpg01_011 /var/log/messages-20180606.gz:Jun 5 05:54:32 warble1 kernel: #012Call Trace: /var/log/messages-20180606.gz:Jun 5 05:54:32 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180606.gz:Jun 5 05:54:32 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180606.gz:Jun 5 05:54:32 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:54:32 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180606.gz:Jun 5 05:54:32 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:54:32 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180606.gz:Jun 5 05:54:33 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:54:33 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180606.gz:Jun 5 05:54:33 warble1 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd] /var/log/messages-20180606.gz:Jun 5 05:54:33 warble1 kernel: [] ? 
lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:54:33 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180606.gz:Jun 5 05:54:33 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180606.gz:Jun 5 05:54:33 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180606.gz:Jun 5 05:54:33 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:54:33 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:54:33 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:54:33 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180606.gz:Jun 5 05:54:33 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180606.gz:Jun 5 05:54:33 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:54:33 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:54:33 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180606.gz:Jun 5 05:54:33 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 05:54:33 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180606.gz:Jun 5 05:54:33 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 05:54:33 warble1 kernel: /var/log/messages-20180606.gz:Jun 5 05:54:33 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528142073.48536 /var/log/messages-20180606.gz:Jun 5 05:58:00 warble1 kernel: Lustre: dagg-MDT0001: Connection restored to 90b226ee-19b9-fc21-a879-dea9f207d8b9 (at 10.8.49.159@tcp201) /var/log/messages-20180606.gz:Jun 5 05:58:00 warble1 kernel: Lustre: Skipped 83 previous similar messages /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: LNet: Service thread pid 47349 was inactive for 200.38s. 
The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: Pid: 47349, comm: mdt00_012 /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: #012Call Trace: /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd] /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: [] ? mdt_version_save+0x67/0x120 [mdt] /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: [] mdt_reint_setattr+0xba5/0x1060 [mdt] /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: [] mdt_reint_rec+0x83/0x210 [mdt] /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt] /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: [] mdt_reint+0x67/0x140 [mdt] /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: [] ? 
ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: /var/log/messages-20180606.gz:Jun 5 05:58:05 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528142285.47349 /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: LNet: Service thread pid 47376 was inactive for 200.58s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: Pid: 47376, comm: mdt00_018 /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: #012Call Trace: /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass] /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] schedule_timeout+0x174/0x2c0 /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] ? process_timeout+0x0/0x10 /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] ? 
ldlm_expired_completion_wait+0x0/0x240 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] ldlm_cli_enqueue_local+0x233/0x860 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] ? zap_lookup_norm_by_dnode+0x97/0xc0 [zfs] /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt] /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] mdt_object_local_lock+0x512/0xaf0 [mdt] /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt] /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] mdt_object_lock_internal+0x5e/0x300 [mdt] /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] ? lu_object_find+0x16/0x20 [obdclass] /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] mdt_getattr_name_lock+0x8a4/0x1910 [mdt] /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] mdt_intent_getattr+0x2b0/0x480 [mdt] /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] mdt_intent_policy+0x441/0xc70 [mdt] /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] ? 
lustre_swab_ldlm_request+0x0/0x30 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: /var/log/messages-20180606.gz:Jun 5 05:58:44 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528142324.47376 /var/log/messages-20180606.gz:Jun 5 05:58:50 warble1 kernel: Lustre: MGS: Received new LWP connection from 10.8.49.155@tcp201, removing former export from same NID /var/log/messages-20180606.gz:Jun 5 05:58:50 warble1 kernel: Lustre: Skipped 10 previous similar messages -- /var/log/messages-20180606.gz:Jun 5 06:25:57 warble1 kernel: LustreError: 0-0: Forced cleanup waiting for dagg-MDT0000-osp-MDT0002 namespace with 2 resources in use, (rc=-110) /var/log/messages-20180606.gz:Jun 5 06:26:04 warble1 kernel: Lustre: server umount MGS complete /var/log/messages-20180606.gz:Jun 5 06:26:15 warble1 kernel: Lustre: 35194:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1528143968/real 1528143968] req@ffff88b206697800 x1602361437481856/t0(0) o400->MGC192.168.44.21@o2ib44@0@lo:26/25 lens 224/224 e 0 to 1 dl 1528143975 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 /var/log/messages-20180606.gz:Jun 5 06:26:15 warble1 kernel: Lustre: 35194:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 1 previous similar message /var/log/messages-20180606.gz:Jun 5 06:26:15 warble1 kernel: LustreError: 166-1: MGC192.168.44.21@o2ib44: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail /var/log/messages-20180606.gz:Jun 5 06:26:21 warble1 kernel: Lustre: 35178:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1528143975/real 1528143975] req@ffff88b206696300 x1602361437482336/t0(0) o250->MGC192.168.44.21@o2ib44@0@lo:26/25 lens 520/544 e 0 to 1 dl 1528143981 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 /var/log/messages-20180606.gz:Jun 5 06:26:22 warble1 kernel: Lustre: dagg-MDT0002: Not available for connect from 192.168.44.175@o2ib44 (stopping) /var/log/messages-20180606.gz:Jun 5 
06:26:22 warble1 kernel: Lustre: Skipped 59 previous similar messages /var/log/messages-20180606.gz:Jun 5 06:26:22 warble1 kernel: LustreError: 0-0: Forced cleanup waiting for dagg-MDT0000-osp-MDT0002 namespace with 2 resources in use, (rc=-110) /var/log/messages-20180606.gz:Jun 5 06:26:23 warble1 kernel: LustreError: 137-5: images-MDT0000_UUID: not available for connect from 192.168.44.112@o2ib44 (no target). If you are running an HA pair check that the target is mounted on the other server. /var/log/messages-20180606.gz:Jun 5 06:26:23 warble1 kernel: LustreError: Skipped 90 previous similar messages /var/log/messages-20180606.gz:Jun 5 06:26:47 warble1 kernel: LustreError: 0-0: Forced cleanup waiting for dagg-MDT0000-osp-MDT0002 namespace with 2 resources in use, (rc=-110) /var/log/messages-20180606.gz:Jun 5 06:26:51 warble1 kernel: Lustre: 35178:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1528144000/real 1528144000] req@ffff88b1e7fbe600 x1602361437483568/t0(0) o250->MGC192.168.44.21@o2ib44@0@lo:26/25 lens 520/544 e 0 to 1 dl 1528144011 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 /var/log/messages-20180606.gz:Jun 5 06:26:51 warble1 kernel: Lustre: 35178:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 1 previous similar message /var/log/messages-20180606.gz:Jun 5 06:26:55 warble1 kernel: Lustre: dagg-MDT0002: Not available for connect from 192.168.44.217@o2ib44 (stopping) /var/log/messages-20180606.gz:Jun 5 06:26:55 warble1 kernel: Lustre: Skipped 128 previous similar messages /var/log/messages-20180606.gz:Jun 5 06:27:06 warble1 kernel: LustreError: 327038:0:(client.c:1166:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ffff88b206696600 x1602361437485040/t0(0) o101->dagg-MDT0000-lwp-MDT0002@0@lo:23/10 lens 456/496 e 0 to 0 dl 0 ref 2 fl Rpc:/0/ffffffff rc 0/-1 /var/log/messages-20180606.gz:Jun 5 06:27:06 warble1 kernel: LustreError: 327038:0:(client.c:1166:ptlrpc_import_delay_req()) Skipped 1 previous similar 
message /var/log/messages-20180606.gz:Jun 5 06:27:06 warble1 kernel: LustreError: 327038:0:(qsd_reint.c:56:qsd_reint_completion()) dagg-MDT0002: failed to enqueue global quota lock, glb fid:[0x200000006:0x1010000:0x0], rc:-5 /var/log/messages-20180606.gz:Jun 5 06:27:12 warble1 kernel: LustreError: 0-0: Forced cleanup waiting for dagg-MDT0000-osp-MDT0002 namespace with 2 resources in use, (rc=-110) /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: LNet: Service thread pid 58449 was inactive for 1200.91s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: Pid: 58449, comm: mdt_rdpg01_017 /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: #012Call Trace: /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: [] top_trans_wait_result+0xa6/0x151 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: [] top_trans_stop+0x46e/0x970 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: [] ? top_trans_start+0x28e/0x950 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: [] mdd_attr_set+0x59a/0xb60 [mdd] /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: [] ? 
lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: /var/log/messages-20180606.gz:Jun 5 06:27:14 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528144034.58449 /var/log/messages-20180606.gz:Jun 5 06:27:16 warble1 kernel: Lustre: 35178:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1528144025/real 1528144025] req@ffff88b206695400 x1602361437484816/t0(0) o38->dagg-MDT0001-osp-MDT0002@192.168.44.22@o2ib44:24/4 lens 520/544 e 0 to 1 dl 1528144036 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 /var/log/messages-20180606.gz:Jun 5 06:27:21 warble1 kernel: Lustre: 35178:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1528144025/real 1528144025] req@ffff88b206694e00 x1602361437484800/t0(0) o250->MGC192.168.44.21@o2ib44@0@lo:26/25 lens 520/544 e 0 to 1 dl 1528144041 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 /var/log/messages-20180606.gz:Jun 5 06:27:27 warble1 kernel: LustreError: 137-5: images-MDT0000_UUID: not available for connect from 192.168.44.211@o2ib44 (no target). If you are running an HA pair check that the target is mounted on the other server. 
/var/log/messages-20180606.gz:Jun 5 06:27:27 warble1 kernel: LustreError: Skipped 843 previous similar messages /var/log/messages-20180606.gz:Jun 5 06:27:37 warble1 kernel: LustreError: 0-0: Forced cleanup waiting for dagg-MDT0000-osp-MDT0002 namespace with 2 resources in use, (rc=-110) /var/log/messages-20180606.gz:Jun 5 06:27:51 warble1 kernel: Lustre: 35178:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1528144050/real 1528144050] req@ffff88b1f3847b00 x1602361437486128/t0(0) o250->MGC192.168.44.21@o2ib44@0@lo:26/25 lens 520/544 e 0 to 1 dl 1528144071 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 /var/log/messages-20180606.gz:Jun 5 06:28:02 warble1 kernel: LustreError: 0-0: Forced cleanup waiting for dagg-MDT0000-osp-MDT0002 namespace with 2 resources in use, (rc=-110) /var/log/messages-20180606.gz:Jun 5 06:28:09 warble1 kernel: Lustre: dagg-MDT0002: Not available for connect from 192.168.44.175@o2ib44 (stopping) /var/log/messages-20180606.gz:Jun 5 06:28:09 warble1 kernel: Lustre: Skipped 150 previous similar messages /var/log/messages-20180606.gz:Jun 5 06:28:11 warble1 kernel: Lustre: 35178:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1528144075/real 1528144075] req@ffff88b1f3847b00 x1602361437487456/t0(0) o38->dagg-MDT0001-osp-MDT0002@192.168.44.22@o2ib44:24/4 lens 520/544 e 0 to 1 dl 1528144091 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 /var/log/messages-20180606.gz:Jun 5 06:28:27 warble1 kernel: LustreError: 0-0: Forced cleanup waiting for dagg-MDT0000-osp-MDT0002 namespace with 2 resources in use, (rc=-110) /var/log/messages-20180606.gz:Jun 5 06:29:06 warble1 kernel: Lustre: 35178:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1528144125/real 1528144125] req@ffff88b8bc360f00 x1602361437490064/t0(0) o38->dagg-MDT0001-osp-MDT0002@192.168.44.22@o2ib44:24/4 lens 520/544 e 0 to 1 dl 1528144146 ref 1 fl Rpc:XN/0/ffffffff rc 
0/-1 -- /var/log/messages-20180606.gz:Jun 5 06:40:45 warble1 ntpd[10108]: Listen normally on 7 lo ::1 UDP 123 /var/log/messages-20180606.gz:Jun 5 06:40:45 warble1 ntpd[10108]: Listen normally on 8 em4 fe80::1a66:daff:febc:88e3 UDP 123 /var/log/messages-20180606.gz:Jun 5 06:40:45 warble1 ntpd[10108]: Listen normally on 9 em3 fe80::1a66:daff:febc:88e1 UDP 123 /var/log/messages-20180606.gz:Jun 5 06:40:45 warble1 ntpd[10108]: Listen normally on 10 bond0 fe80::1a66:daff:febc:88dd UDP 123 /var/log/messages-20180606.gz:Jun 5 06:40:45 warble1 ntpd[10108]: Listen normally on 11 ib0 fe80::211:7501:171:de62 UDP 123 /var/log/messages-20180606.gz:Jun 5 06:40:45 warble1 ntpd[10108]: Listening on routing socket on fd #28 for interface updates /var/log/messages-20180606.gz:Jun 5 06:40:45 warble1 ntpd[10108]: 0.0.0.0 c016 06 restart /var/log/messages-20180606.gz:Jun 5 06:40:45 warble1 ntpd[10108]: 0.0.0.0 c012 02 freq_set kernel 0.000 PPM /var/log/messages-20180606.gz:Jun 5 06:40:45 warble1 ntpd[10108]: 0.0.0.0 c011 01 freq_not_set /var/log/messages-20180606.gz:Jun 5 06:40:45 warble1 systemd: Started Update UTMP about System Runlevel Changes. /var/log/messages-20180606.gz:Jun 5 06:40:45 warble1 systemd: Started Ganglia Monitoring Daemon. /var/log/messages-20180606.gz:Jun 5 06:40:45 warble1 systemd: Startup finished in 6.458s (kernel) + 27.308s (initrd) + 44.738s (userspace) = 1min 18.504s. 
/var/log/messages-20180606.gz:Jun 5 06:40:45 warble1 kernel: 8021q: 802.1Q VLAN Support v1.8 /var/log/messages-20180606.gz:Jun 5 06:40:45 warble1 kernel: 8021q: adding VLAN 0 to HW filter on device em1 /var/log/messages-20180606.gz:Jun 5 06:40:45 warble1 kernel: 8021q: adding VLAN 0 to HW filter on device em2 /var/log/messages-20180606.gz:Jun 5 06:40:45 warble1 kernel: 8021q: adding VLAN 0 to HW filter on device em3 /var/log/messages-20180606.gz:Jun 5 06:40:45 warble1 kernel: 8021q: adding VLAN 0 to HW filter on device em4 /var/log/messages-20180606.gz:Jun 5 06:40:45 warble1 kernel: 8021q: adding VLAN 0 to HW filter on device bond0 /var/log/messages-20180606.gz:Jun 5 06:40:50 warble1 ntpd[10108]: 0.0.0.0 c514 04 freq_mode /var/log/messages-20180606.gz:Jun 5 06:42:23 warble2 kernel: LustreError: 23521:0:(llog_cat.c:795:llog_cat_cancel_records()) dagg-MDT0002-osp-MDT0000: fail to cancel 0 of 1 llog-records: rc = -116 /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: LNet: Service thread pid 31219 was inactive for 200.35s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: Pid: 31219, comm: mdt_rdpg01_012 /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: #012Call Trace: /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] schedule+0x29/0x70 /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] ? 
top_trans_start+0x27e/0x940 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd] /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528145138.31219 /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: Pid: 19953, comm: mdt00_008 /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: #012Call Trace: /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] schedule+0x29/0x70 /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] mdd_rename+0x4d1/0x14a0 [mdd] /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] mdt_reint_rename_internal.isra.36+0x166a/0x20c0 [mdt] /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x19b/0x860 [mdt] /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] ? 
ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] mdt_reint_rename+0x13/0x20 [mdt]
/var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] mdt_reint_rec+0x83/0x210 [mdt]
/var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:45:38 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel:
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: LNet: Service thread pid 21618 was inactive for 200.53s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: LNet: Skipped 1 previous similar message
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: Pid: 21618, comm: mdt01_021
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: #012Call Trace:
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass]
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] schedule_timeout+0x174/0x2c0
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ? process_timeout+0x0/0x10
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ? cfs_block_sigsinv+0x71/0xa0 [libcfs]
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ldlm_cli_enqueue_local+0x233/0x860 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x67c/0x860 [mdt]
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] mdt_reint_rename+0x13/0x20 [mdt]
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] mdt_reint_rec+0x83/0x210 [mdt]
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel:
/var/log/messages-20180606.gz:Jun 5 06:45:39 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528145139.21618
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: LNet: Service thread pid 29743 was inactive for 200.41s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: Pid: 29743, comm: mdt_rdpg01_003
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: #012Call Trace:
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd]
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel:
/var/log/messages-20180606.gz:Jun 5 06:46:26 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528145186.29743
/var/log/messages-20180606.gz:Jun 5 06:47:18 warble2 kernel: LustreError: 21618:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528144938, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff88bd4781e800/0xe5a911331421704e lrc: 3/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 3 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 21618 timeout: 0 lvb_type: 0
/var/log/messages-20180606.gz:Jun 5 06:47:18 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528145238.21618
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: LNet: Service thread pid 27197 was inactive for 200.49s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: Pid: 27197, comm: mdt_rdpg00_014
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: #012Call Trace:
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd]
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel:
/var/log/messages-20180606.gz:Jun 5 06:48:48 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528145328.27197
/var/log/messages-20180606.gz:Jun 5 06:52:13 warble2 kernel: Lustre: 30715:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bd894b6300 x1601996054268928/t0(0) o35->e5a97d67-ed84-7335-830f-861443a6ce18@192.168.44.13@o2ib44:623/0 lens 512/696 e 24 to 0 dl 1528145538 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180606.gz:Jun 5 06:52:13 warble2 kernel: Lustre: 21620:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bd894b2a00 x1601996054269088/t0(0) o36->e5a97d67-ed84-7335-830f-861443a6ce18@192.168.44.13@o2ib44:623/0 lens 784/3128 e 24 to 0 dl 1528145538 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180606.gz:Jun 5 06:52:19 warble2 kernel: Lustre: dagg-MDT0000: Client d5c7ff2e-8c29-354a-1a03-772bf8dd100f (at 192.168.44.14@o2ib44) reconnecting
/var/log/messages-20180606.gz:Jun 5 06:52:19 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to cca9b578-d68d-8227-225a-f0bcec6e5bc1 (at 192.168.44.14@o2ib44)
/var/log/messages-20180606.gz:Jun 5 06:52:19 warble2 kernel: Lustre: Skipped 5 previous similar messages
/var/log/messages-20180606.gz:Jun 5 06:53:00 warble2 kernel: Lustre: 30715:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bd91c1cc50 x1601996054274224/t0(0) o35->e5a97d67-ed84-7335-830f-861443a6ce18@192.168.44.13@o2ib44:670/0 lens 512/696 e 11 to 0 dl 1528145585 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180606.gz:Jun 5 06:53:00 warble2 kernel: Lustre: 30715:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message
/var/log/messages-20180606.gz:Jun 5 06:54:27 warble1 systemd: Starting Cleanup of Temporary Directories...
/var/log/messages-20180606.gz:Jun 5 06:54:27 warble1 systemd: Started Cleanup of Temporary Directories.
/var/log/messages-20180606.gz:Jun 5 06:55:24 warble2 ntpd[13847]: 0.0.0.0 c612 02 freq_set kernel 85.163 PPM
/var/log/messages-20180606.gz:Jun 5 06:55:24 warble2 ntpd[13847]: 0.0.0.0 c61c 0c clock_step +0.442171 s
/var/log/messages-20180606.gz:Jun 5 06:55:24 warble2 systemd: Time has been changed
--
/var/log/messages-20180606.gz:Jun 5 07:53:36 warble1 multipathd: vol-home-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:144 1]
/var/log/messages-20180606.gz:Jun 5 07:53:36 warble1 multipathd: vol-images-system-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:176 1]
/var/log/messages-20180606.gz:Jun 5 07:53:36 warble1 multipathd: vol-apps-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:208 1]
/var/log/messages-20180606.gz:Jun 5 07:53:36 warble1 kernel: LustreError: 11-0: dagg-MDT0001-osp-MDT0002: operation mds_connect to node 0@lo failed: rc = -114
/var/log/messages-20180606.gz:Jun 5 07:53:37 warble1 kernel: Lustre: 14600:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1528149211/real 0] req@ffff885dfe621200 x1602380417208720/t0(0) o38->dagg-MDT0000-lwp-MDT0001@192.168.44.22@o2ib44:12/10 lens 520/544 e 0 to 1 dl 1528149216 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
/var/log/messages-20180606.gz:Jun 5 07:53:38 warble1 kernel: Lustre: dagg-MDT0000: Connection restored to 9d355025-6242-1611-8231-d90e20157e8a (at 192.168.44.103@o2ib44)
/var/log/messages-20180606.gz:Jun 5 07:53:38 warble1 kernel: Lustre: Skipped 192 previous similar messages
/var/log/messages-20180606.gz:Jun 5 07:53:41 warble1 kernel: Lustre: 14600:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1528149216/real 0] req@ffff885dfe627800 x1602380417210304/t0(0) o38->dagg-MDT0000-lwp-MDT0002@192.168.44.22@o2ib44:12/10 lens 520/544 e 0 to 1 dl 1528149221 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
/var/log/messages-20180606.gz:Jun 5 07:53:41 warble1 kernel: Lustre: 14600:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 1 previous similar message
/var/log/messages-20180606.gz:Jun 5 07:54:06 warble1 kernel: LNet: 14588:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Timed out tx for 192.168.44.22@o2ib44: 4299147 seconds
/var/log/messages-20180606.gz:Jun 5 07:54:06 warble1 kernel: Lustre: 14600:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1528149241/real 1528149246] req@ffff885df5af8300 x1602380417211888/t0(0) o38->dagg-MDT0001-osp-MDT0002@192.168.44.22@o2ib44:24/4 lens 520/544 e 0 to 1 dl 1528149247 ref 1 fl Rpc:eXN/0/ffffffff rc 0/-1
/var/log/messages-20180606.gz:Jun 5 07:54:06 warble1 kernel: Lustre: 14600:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 1 previous similar message
/var/log/messages-20180606.gz:Jun 5 07:54:10 warble1 kernel: Lustre: dagg-MDT0000: Connection restored to 192.168.44.34@o2ib44 (at 192.168.44.34@o2ib44)
/var/log/messages-20180606.gz:Jun 5 07:54:10 warble1 kernel: Lustre: Skipped 411 previous similar messages
/var/log/messages-20180606.gz:Jun 5 07:54:16 warble1 multipathd: vol-mdt2-a: load table [0 1320599552 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:112 1]
/var/log/messages-20180606.gz:Jun 5 07:54:19 warble1 kernel: Lustre: images-MDT0000: Recovery over after 1:06, of 123 clients 123 recovered and 0 were evicted.
/var/log/messages-20180606.gz:Jun 5 07:54:26 warble1 kernel: LustreError: 22891:0:(llog_cat.c:269:llog_cat_id2handle()) dagg-MDT0002-osp-MDT0000: error opening log id [0x1:0x8002b381:0x6]:0: rc = -2
/var/log/messages-20180606.gz:Jun 5 07:54:26 warble1 kernel: LustreError: 22891:0:(llog_cat.c:824:llog_cat_process_cb()) dagg-MDT0002-osp-MDT0000: cannot find handle for llog [0x1:0x8002b381:0x6]: rc = -2
/var/log/messages-20180606.gz:Jun 5 07:54:26 warble1 kernel: Lustre: dagg-MDT0002: Recovery over after 0:50, of 123 clients 123 recovered and 0 were evicted.
/var/log/messages-20180606.gz:Jun 5 07:54:26 warble1 kernel: Lustre: Skipped 2 previous similar messages
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: LNet: Service thread pid 30821 was inactive for 200.37s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: Pid: 30821, comm: mdt00_007
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: #012Call Trace:
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] schedule_timeout+0x174/0x2c0
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ? process_timeout+0x0/0x10
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ? ldlm_resource_add_lock+0x6a/0x1b0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ldlm_cli_enqueue_fini+0x93b/0xdc0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ? kfree+0x106/0x140
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ldlm_cli_enqueue+0x6c2/0x810 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ? mdt_remote_blocking_ast+0x0/0x590 [mdt]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] osp_md_object_lock+0x172/0x2e0 [osp]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] lod_object_lock+0xf3/0x950 [lod]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ? htable_lookup+0xa9/0x180 [obdclass]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] mdd_object_lock+0x3e/0xe0 [mdd]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] mdt_remote_object_lock+0x1e5/0x710 [mdt]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x36a/0x860 [mdt]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ? strlcpy+0x42/0x60
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] mdt_reint_rename+0x13/0x20 [mdt]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] mdt_reint_rec+0x83/0x210 [mdt]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel:
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528149467.30821
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: Pid: 20010, comm: mdt01_001
--
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] mdd_rename+0x4d1/0x14a0 [mdd]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] mdt_reint_rename_internal.isra.36+0x166a/0x20c0 [mdt]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x19b/0x860 [mdt]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ? strlcpy+0x42/0x60
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] mdt_reint_rename+0x13/0x20 [mdt]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] mdt_reint_rec+0x83/0x210 [mdt]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 07:57:47 warble1 kernel:
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: LNet: Service thread pid 29654 was inactive for 200.46s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: LNet: Skipped 1 previous similar message
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: Pid: 29654, comm: mdt_rdpg01_006
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: #012Call Trace:
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd]
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel:
/var/log/messages-20180606.gz:Jun 5 07:57:48 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528149468.29654
/var/log/messages-20180606.gz:Jun 5 07:58:02 10.7.2.95 StorageArray: warble-md3420;4011;Warning;Virtual disk not on preferred path due to failover
/var/log/messages-20180606.gz:Jun 5 07:59:27 warble1 kernel: Lustre: dagg-MDT0000-osp-MDT0002: Connection to dagg-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
/var/log/messages-20180606.gz:Jun 5 07:59:27 warble1 kernel: LustreError: 30821:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528149267, 300s ago), entering recovery for dagg-MDT0000_UUID@192.168.44.21@o2ib44 ns: dagg-MDT0000-osp-MDT0002 lock: ffff885df2d3ee00/0xb21c78a363f30e03 lrc: 4/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 3 type: IBT flags: 0x1000001000000 nid: local remote: 0xb21c78a363f30e0a expref: -99 pid: 30821 timeout: 0 lvb_type: 0
/var/log/messages-20180606.gz:Jun 5 07:59:27 warble1 kernel: Lustre: dagg-MDT0000: Received new LWP connection from 0@lo, removing former export from same NID
/var/log/messages-20180606.gz:Jun 5 07:59:27 warble1 kernel: Lustre: dagg-MDT0000: Connection restored to 192.168.44.21@o2ib44 (at 0@lo)
/var/log/messages-20180606.gz:Jun 5 07:59:27 warble1 kernel: Lustre: Skipped 233 previous similar messages
/var/log/messages-20180606.gz:Jun 5 08:00:02 warble1 kernel: LNet: 7070:0:(o2iblnd_cb.c:2405:kiblnd_passive_connect()) Conn stale 192.168.44.13@o2ib44 version 12/12 incarnation 1528149601950653/1528149601950653
/var/log/messages-20180606.gz:Jun 5 08:00:02 warble1 kernel: LNet: 14588:0:(o2iblnd_cb.c:1350:kiblnd_reconnect_peer()) Abort reconnection of 192.168.44.13@o2ib44: connected
/var/log/messages-20180606.gz:Jun 5 08:00:35 warble1 kernel: Lustre: images-MDT0000: haven't heard from client acda3e68-0b90-3c01-43a4-d12e2223c2d6 (at 192.168.44.105@o2ib44) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885e0721f400, cur 1528149635 expire 1528149485 last 1528149408
/var/log/messages-20180606.gz:Jun 5 08:01:19 warble1 kernel: Lustre: dagg-MDT0002: haven't heard from client e5a97d67-ed84-7335-830f-861443a6ce18 (at 192.168.44.13@o2ib44) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88bdf3d3a800, cur 1528149679 expire 1528149529 last 1528149452
/var/log/messages-20180606.gz:Jun 5 08:01:19 warble1 kernel: Lustre: Skipped 3 previous similar messages
/var/log/messages-20180606.gz:Jun 5 08:02:09 warble1 kernel: Lustre: MGS: haven't heard from client cca9b578-d68d-8227-225a-f0bcec6e5bc1 (at 192.168.44.14@o2ib44) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88bdef6a6c00, cur 1528149729 expire 1528149579 last 1528149502
/var/log/messages-20180606.gz:Jun 5 08:02:09 warble1 kernel: Lustre: Skipped 5 previous similar messages
/var/log/messages-20180606.gz:Jun 5 08:02:15 warble1 kernel: Lustre: images-MDT0000: haven't heard from client a1880c59-f9ea-ebde-b415-b7d2dcd12bd6 (at 192.168.44.14@o2ib44) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885e07218400, cur 1528149735 expire 1528149585 last 1528149508
/var/log/messages-20180606.gz:Jun 5 08:05:58 warble1 kernel: Lustre: MGS: haven't heard from client 3b9b68a4-6cb4-bb0b-cf77-9e5eac43fc85 (at 192.168.44.105@o2ib44) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885df6fe0800, cur 1528149958 expire 1528149808 last 1528149731
/var/log/messages-20180606.gz:Jun 5 08:06:41 warble1 kernel: Lustre: MGS: Connection restored to acda3e68-0b90-3c01-43a4-d12e2223c2d6 (at 192.168.44.105@o2ib44)
/var/log/messages-20180606.gz:Jun 5 08:06:41 warble1 kernel: Lustre: Skipped 5 previous similar messages
/var/log/messages-20180606.gz:Jun 5 08:11:38 warble1 kernel: Lustre: MGS: Connection restored to 968eec94-fcdc-5794-f6e0-c91a986a6aa8 (at 192.168.44.13@o2ib44)
/var/log/messages-20180606.gz:Jun 5 08:11:38 warble1 kernel: Lustre: Skipped 6 previous similar messages
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: LNet: Service thread pid 29651 was inactive for 200.71s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: Pid: 29651, comm: mdt_rdpg01_003
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: #012Call Trace:
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd]
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel:
/var/log/messages-20180606.gz:Jun 5 08:19:18 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528150758.29651
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: LNet: Service thread pid 45074 was inactive for 200.72s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: Pid: 45074, comm: mdt_rdpg01_016
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: #012Call Trace:
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel:
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528150776.45074
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: Pid: 29655, comm: mdt_rdpg01_007
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: #012Call Trace:
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:36 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180606.gz:Jun 5 08:19:37 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180606.gz:Jun 5 08:19:37 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:37 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 08:19:37 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180606.gz:Jun 5 08:19:37 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 08:19:37 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180606.gz:Jun 5 08:19:37 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 08:19:37 warble1 kernel:
/var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: LNet: Service thread pid 29652 was inactive for 200.77s. The thread might be hung, or it might only be slow and will resume later.
Dumping the stack trace for debugging purposes: /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: LNet: Skipped 1 previous similar message /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: Pid: 29652, comm: mdt_rdpg01_004 /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: #012Call Trace: /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd] /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] ? 
default_wake_function+0x12/0x20 /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528150780.29652 /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: Pid: 45505, comm: mdt_rdpg01_018 /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: #012Call Trace: /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180606.gz:Jun 5 08:19:40 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 08:19:41 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180606.gz:Jun 5 08:19:41 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 08:19:41 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180606.gz:Jun 5 08:19:41 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 08:19:41 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180606.gz:Jun 5 08:19:41 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd] -- /var/log/messages-20180606.gz:Jun 5 08:19:41 warble1 kernel: [] ? 
__wake_up_common+0x5b/0x90 /var/log/messages-20180606.gz:Jun 5 08:19:41 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 08:19:41 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 08:19:41 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180606.gz:Jun 5 08:19:41 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 08:19:41 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180606.gz:Jun 5 08:19:41 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 08:19:41 warble1 kernel: /var/log/messages-20180606.gz:Jun 5 08:25:52 warble1 kernel: Lustre: 29653:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bddcc73900 x1601957225095056/t0(0) o35->aa3ec151-d4f5-0084-927c-a97dea5f3bfc@192.168.44.131@o2ib44:202/0 lens 512/696 e 24 to 0 dl 1528151157 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180606.gz:Jun 5 08:25:58 warble1 kernel: Lustre: dagg-MDT0002: Client aa3ec151-d4f5-0084-927c-a97dea5f3bfc (at 192.168.44.131@o2ib44) reconnecting /var/log/messages-20180606.gz:Jun 5 08:25:58 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to e0137da9-e6eb-5014-d1aa-97b9b2bdee9c (at 192.168.44.131@o2ib44) /var/log/messages-20180606.gz:Jun 5 08:25:58 warble1 kernel: Lustre: Skipped 13 previous similar messages /var/log/messages-20180606.gz:Jun 5 08:26:10 warble1 kernel: Lustre: 45073:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bddf7d9b00 x1601957226556832/t0(0) o35->aa3ec151-d4f5-0084-927c-a97dea5f3bfc@192.168.44.131@o2ib44:220/0 lens 512/696 e 22 to 0 dl 1528151175 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180606.gz:Jun 5 08:26:14 warble1 kernel: Lustre: 45073:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 
req@ffff88bdd7cc9b00 x1601956360850608/t0(0) o35->a7d952ca-39a6-9731-3962-c5f9a501d014@192.168.44.172@o2ib44:224/0 lens 512/696 e 22 to 0 dl 1528151179 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180606.gz:Jun 5 08:26:14 warble1 kernel: Lustre: 45073:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message /var/log/messages-20180606.gz:Jun 5 08:26:22 warble1 kernel: Lustre: dagg-MDT0002: Client a7d952ca-39a6-9731-3962-c5f9a501d014 (at 192.168.44.172@o2ib44) reconnecting /var/log/messages-20180606.gz:Jun 5 08:35:59 warble1 kernel: Lustre: dagg-MDT0002: Client aa3ec151-d4f5-0084-927c-a97dea5f3bfc (at 192.168.44.131@o2ib44) reconnecting /var/log/messages-20180606.gz:Jun 5 08:35:59 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to e0137da9-e6eb-5014-d1aa-97b9b2bdee9c (at 192.168.44.131@o2ib44) /var/log/messages-20180606.gz:Jun 5 08:35:59 warble1 kernel: Lustre: Skipped 1 previous similar message /var/log/messages-20180606.gz:Jun 5 08:36:23 warble1 kernel: Lustre: dagg-MDT0002: Client a7d952ca-39a6-9731-3962-c5f9a501d014 (at 192.168.44.172@o2ib44) reconnecting /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: LNet: Service thread pid 45073 was inactive for 200.64s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: LNet: Skipped 1 previous similar message /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: Pid: 45073, comm: mdt_rdpg01_015 /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: #012Call Trace: /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: [] ? 
default_wake_function+0x0/0x20 /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd] /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: /var/log/messages-20180606.gz:Jun 5 08:45:46 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528152346.45073 /var/log/messages-20180606.gz:Jun 5 08:46:00 warble1 kernel: Lustre: dagg-MDT0002: Client aa3ec151-d4f5-0084-927c-a97dea5f3bfc (at 192.168.44.131@o2ib44) reconnecting /var/log/messages-20180606.gz:Jun 5 08:46:00 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to e0137da9-e6eb-5014-d1aa-97b9b2bdee9c (at 192.168.44.131@o2ib44) /var/log/messages-20180606.gz:Jun 5 08:46:00 warble1 kernel: Lustre: Skipped 1 previous similar message /var/log/messages-20180606.gz:Jun 5 08:46:24 warble1 kernel: Lustre: dagg-MDT0002: Client a7d952ca-39a6-9731-3962-c5f9a501d014 (at 192.168.44.172@o2ib44) reconnecting /var/log/messages-20180606.gz:Jun 5 08:52:20 warble1 kernel: Lustre: 32813:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bdd1073300 x1602381728793680/t0(0) o35->25bba27b-216b-93df-faef-82aebfc8555e@192.168.44.13@o2ib44:280/0 lens 512/696 e 24 to 0 dl 1528152745 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180606.gz:Jun 5 08:52:20 warble1 kernel: Lustre: 32813:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message /var/log/messages-20180606.gz:Jun 5 08:52:27 warble1 kernel: Lustre: dagg-MDT0002: Client 25bba27b-216b-93df-faef-82aebfc8555e (at 192.168.44.13@o2ib44) reconnecting /var/log/messages-20180606.gz:Jun 5 08:56:01 warble1 kernel: Lustre: dagg-MDT0002: Client aa3ec151-d4f5-0084-927c-a97dea5f3bfc (at 192.168.44.131@o2ib44) reconnecting /var/log/messages-20180606.gz:Jun 5 08:56:01 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to e0137da9-e6eb-5014-d1aa-97b9b2bdee9c (at 192.168.44.131@o2ib44) /var/log/messages-20180606.gz:Jun 5 08:56:01 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180606.gz:Jun 5 09:00:26 warble1 
kernel: Lustre: 417711:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bdffbe3850 x1602381729372288/t0(0) o35->25bba27b-216b-93df-faef-82aebfc8555e@192.168.44.13@o2ib44:11/0 lens 512/696 e 2 to 0 dl 1528153231 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180606.gz:Jun 5 09:02:27 warble1 kernel: Lustre: dagg-MDT0002: Client 25bba27b-216b-93df-faef-82aebfc8555e (at 192.168.44.13@o2ib44) reconnecting /var/log/messages-20180606.gz:Jun 5 09:02:27 warble1 kernel: Lustre: Skipped 1 previous similar message /var/log/messages-20180606.gz:Jun 5 09:03:11 warble1 kernel: Lustre: 32077:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-96), not sending early reply#012 req@ffff88bdff2d5450 x1602381730158864/t0(0) o35->25bba27b-216b-93df-faef-82aebfc8555e@192.168.44.13@o2ib44:176/0 lens 512/696 e 0 to 0 dl 1528153396 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180606.gz:Jun 5 09:06:02 warble1 kernel: Lustre: dagg-MDT0002: Client aa3ec151-d4f5-0084-927c-a97dea5f3bfc (at 192.168.44.131@o2ib44) reconnecting /var/log/messages-20180606.gz:Jun 5 09:06:02 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to e0137da9-e6eb-5014-d1aa-97b9b2bdee9c (at 192.168.44.131@o2ib44) /var/log/messages-20180606.gz:Jun 5 09:06:02 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180606.gz:Jun 5 09:07:27 warble1 kernel: LNet: Service thread pid 29653 was inactive for 1016.08s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180606.gz:Jun 5 09:07:27 warble1 kernel: Pid: 29653, comm: mdt_rdpg01_005 /var/log/messages-20180606.gz:Jun 5 09:07:27 warble1 kernel: #012Call Trace: /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: [] ? 
recalc_sigpending+0x1b/0x50 /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd] /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: [] ? 
ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: /var/log/messages-20180606.gz:Jun 5 09:07:28 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528153648.29653 /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: LNet: Service thread pid 32075 was inactive for 1116.30s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: Pid: 32075, comm: mdt_rdpg01_009 /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: #012Call Trace: /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd] /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: [] ? 
lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 09:10:11 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180606.gz:Jun 5 09:10:12 warble1 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 09:10:12 warble1 kernel: /var/log/messages-20180606.gz:Jun 5 09:10:12 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528153812.32075 /var/log/messages-20180606.gz:Jun 5 09:12:28 warble1 kernel: Lustre: dagg-MDT0002: Client 25bba27b-216b-93df-faef-82aebfc8555e (at 192.168.44.13@o2ib44) reconnecting /var/log/messages-20180606.gz:Jun 5 09:12:28 warble1 kernel: Lustre: Skipped 1 previous similar message /var/log/messages-20180606.gz:Jun 5 09:15:07 warble1 kernel: Lustre: 32077:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff88bdc9096c00 x1602381731772784/t0(0) o35->25bba27b-216b-93df-faef-82aebfc8555e@192.168.44.13@o2ib44:137/0 lens 512/696 e 0 to 0 dl 1528154112 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180606.gz:Jun 5 09:16:03 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to e0137da9-e6eb-5014-d1aa-97b9b2bdee9c (at 192.168.44.131@o2ib44) /var/log/messages-20180606.gz:Jun 5 09:16:03 warble1 kernel: Lustre: Skipped 2 previous similar messages /var/log/messages-20180606.gz:Jun 5 09:16:43 warble1 kernel: Lustre: 417710:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff88bddd2cb900 x1602381731966496/t0(0) o35->25bba27b-216b-93df-faef-82aebfc8555e@192.168.44.13@o2ib44:233/0 lens 512/696 e 0 to 0 dl 1528154208 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: LNet: Service thread pid 30482 was inactive for 200.46s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: Pid: 30482, comm: mdt_rdpg00_004 /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: #012Call Trace: /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: [] ? 
recalc_sigpending+0x1b/0x50 /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd] /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: [] ? 
ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: /var/log/messages-20180606.gz:Jun 5 09:17:34 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528154254.30482 /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: LNet: Service thread pid 32247 was inactive for 200.49s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: Pid: 32247, comm: mdt_rdpg00_009 /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: #012Call Trace: /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd] /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: [] ? 
lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: [] ? 
kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel:
/var/log/messages-20180606.gz:Jun 5 09:18:10 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528154290.32247
/var/log/messages-20180606.gz:Jun 5 09:19:08 warble1 kernel: Lustre: 32814:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff885de2a37b00 x1602381732129888/t0(0) o35->25bba27b-216b-93df-faef-82aebfc8555e@192.168.44.13@o2ib44:378/0 lens 512/696 e 0 to 0 dl 1528154353 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: LNet: Service thread pid 32248 was inactive for 212.42s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: Pid: 32248, comm: mdt_rdpg00_010
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: #012Call Trace:
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd]
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel:
/var/log/messages-20180606.gz:Jun 5 09:19:24 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528154364.32248
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: LNet: Service thread pid 20007 was inactive for 200.63s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: Pid: 20007, comm: mdt00_001
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: #012Call Trace:
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] schedule_timeout+0x174/0x2c0
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] ? process_timeout+0x0/0x10
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] ? ldlm_resource_add_lock+0x6a/0x1b0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] ldlm_cli_enqueue_fini+0x93b/0xdc0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass]
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] ldlm_cli_enqueue+0x6c2/0x810 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] ? mdt_remote_blocking_ast+0x0/0x590 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] osp_md_object_lock+0x172/0x2e0 [osp]
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] lod_object_lock+0xf3/0x950 [lod]
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] ? htable_lookup+0xa9/0x180 [obdclass]
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] mdd_object_lock+0x3e/0xe0 [mdd]
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] mdt_remote_object_lock+0x1e5/0x710 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x36a/0x860 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] ? strlcpy+0x42/0x60
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] mdt_reint_rename+0x13/0x20 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] mdt_reint_rec+0x83/0x210 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel:
/var/log/messages-20180606.gz:Jun 5 09:20:34 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528154434.20007
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: LNet: Service thread pid 33684 was inactive for 200.33s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: Pid: 33684, comm: mdt01_035
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: #012Call Trace:
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] schedule_timeout+0x174/0x2c0
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] ? process_timeout+0x0/0x10
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] ? ldlm_resource_add_lock+0x6a/0x1b0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] ldlm_cli_enqueue_fini+0x93b/0xdc0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass]
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] ldlm_cli_enqueue+0x6c2/0x810 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] ? mdt_remote_blocking_ast+0x0/0x590 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] osp_md_object_lock+0x172/0x2e0 [osp]
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] lod_object_lock+0xf3/0x950 [lod]
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] ? htable_lookup+0xa9/0x180 [obdclass]
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] mdd_object_lock+0x3e/0xe0 [mdd]
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] mdt_remote_object_lock+0x1e5/0x710 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x36a/0x860 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] ? strlcpy+0x42/0x60
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] mdt_reint_rename+0x13/0x20 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] mdt_reint_rec+0x83/0x210 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel:
/var/log/messages-20180606.gz:Jun 5 09:22:06 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528154526.33684
/var/log/messages-20180606.gz:Jun 5 09:22:13 warble1 kernel: Lustre: dagg-MDT0000-osp-MDT0002: Connection to dagg-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
/var/log/messages-20180606.gz:Jun 5 09:22:13 warble1 kernel: LustreError: 20007:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528154233, 300s ago), entering recovery for dagg-MDT0000_UUID@192.168.44.21@o2ib44 ns: dagg-MDT0000-osp-MDT0002 lock: ffff88bde838de00/0xb21c78a3651b9591 lrc: 4/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 4 type: IBT flags: 0x1000001000000 nid: local remote: 0xb21c78a3651b9598 expref: -99 pid: 20007 timeout: 0 lvb_type: 0
/var/log/messages-20180606.gz:Jun 5 09:22:13 warble1 kernel: Lustre: dagg-MDT0000: Received new LWP connection from 0@lo, removing former export from same NID
/var/log/messages-20180606.gz:Jun 5 09:22:29 warble1 kernel: Lustre: dagg-MDT0002: Client 25bba27b-216b-93df-faef-82aebfc8555e (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180606.gz:Jun 5 09:22:29 warble1 kernel: Lustre: Skipped 2 previous similar messages
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: LNet: Service thread pid 25602 was inactive for 1203.55s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: Pid: 25602, comm: mdt_rdpg01_002
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: #012Call Trace:
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd]
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel:
/var/log/messages-20180606.gz:Jun 5 09:22:41 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528154561.25602
/var/log/messages-20180606.gz:Jun 5 09:23:46 warble1 kernel: Lustre: dagg-MDT0000-osp-MDT0001: Connection to dagg-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
/var/log/messages-20180606.gz:Jun 5 09:23:46 warble1 kernel: LustreError: 33684:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528154326, 300s ago), entering recovery for dagg-MDT0000_UUID@192.168.44.21@o2ib44 ns: dagg-MDT0000-osp-MDT0001 lock: ffff88bddf35ac00/0xb21c78a365231590 lrc: 4/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 2 type: IBT flags: 0x1000001000000 nid: local remote: 0xb21c78a365231597 expref: -99 pid: 33684 timeout: 0 lvb_type: 0
/var/log/messages-20180606.gz:Jun 5 09:23:46 warble1 kernel: Lustre: dagg-MDT0000: Received new LWP connection from 0@lo, removing former export from same NID
/var/log/messages-20180606.gz:Jun 5 09:24:09 warble1 kernel: Lustre: 32249:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885debee9500 x1602381795561696/t0(0) o35->fe3dd6be-b5f0-1569-fa08-c73c4ec56144@192.168.44.14@o2ib44:679/0 lens 512/696 e 24 to 0 dl 1528154654 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: LNet: Service thread pid 417711 was inactive for 1201.76s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: Pid: 417711, comm: mdt_rdpg01_023
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: #012Call Trace:
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd]
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel:
/var/log/messages-20180606.gz:Jun 5 09:24:15 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528154655.417711
/var/log/messages-20180606.gz:Jun 5 09:24:44 warble1 kernel: Lustre: 45488:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885e0ebdd850 x1602381795610064/t0(0) o35->fe3dd6be-b5f0-1569-fa08-c73c4ec56144@192.168.44.14@o2ib44:714/0 lens 512/696 e 12 to 0 dl 1528154689 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180606.gz:Jun 5 09:25:47 warble1 kernel: Lustre: 42981:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885df72b0600 x1602381795655680/t0(0) o35->fe3dd6be-b5f0-1569-fa08-c73c4ec56144@192.168.44.14@o2ib44:22/0 lens 512/696 e 5 to 0 dl 1528154752 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180606.gz:Jun 5 09:26:04 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to e0137da9-e6eb-5014-d1aa-97b9b2bdee9c (at 192.168.44.131@o2ib44)
/var/log/messages-20180606.gz:Jun 5 09:26:04 warble1 kernel: Lustre: Skipped 7 previous similar messages
/var/log/messages-20180606.gz:Jun 5 09:26:38 warble1 kernel: LNet: Service thread pid 45499 was inactive for 1200.41s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180606.gz:Jun 5 09:26:38 warble1 kernel: Pid: 45499, comm: mdt_rdpg01_017
/var/log/messages-20180606.gz:Jun 5 09:26:38 warble1 kernel: #012Call Trace:
/var/log/messages-20180606.gz:Jun 5 09:26:38 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180606.gz:Jun 5 09:26:38 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180606.gz:Jun 5 09:26:38 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:26:38 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180606.gz:Jun 5 09:26:39 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:26:39 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180606.gz:Jun 5 09:26:39 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:26:39 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180606.gz:Jun 5 09:26:39 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd]
/var/log/messages-20180606.gz:Jun 5 09:26:39 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:26:39 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:26:39 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:26:39 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:26:39 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:26:39 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:26:39 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:26:39 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180606.gz:Jun 5 09:26:39 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180606.gz:Jun 5 09:26:39 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:26:39 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:26:39 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180606.gz:Jun 5 09:26:39 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 09:26:39 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180606.gz:Jun 5 09:26:39 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 09:26:39 warble1 kernel:
/var/log/messages-20180606.gz:Jun 5 09:26:39 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528154799.45499
/var/log/messages-20180606.gz:Jun 5 09:27:08 warble1 kernel: Lustre: 33220:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885de04b8f00 x1602381795684704/t0(0) o36->fe3dd6be-b5f0-1569-fa08-c73c4ec56144@192.168.44.14@o2ib44:103/0 lens 776/3128 e 24 to 0 dl 1528154833 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180606.gz:Jun 5 09:28:41 warble1 kernel: Lustre: 33633:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bdeaede000 x1602381734166016/t0(0) o36->25bba27b-216b-93df-faef-82aebfc8555e@192.168.44.13@o2ib44:196/0 lens 760/3128 e 24 to 0 dl 1528154926 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180606.gz:Jun 5 09:32:18 warble1 kernel: Lustre: 32076:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff88bdc9095100 x1602381734172448/t0(0) o35->25bba27b-216b-93df-faef-82aebfc8555e@192.168.44.13@o2ib44:413/0 lens 512/696 e 0 to 0 dl 1528155143 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180606.gz:Jun 5 09:32:30 warble1 kernel: Lustre: dagg-MDT0002: Client 25bba27b-216b-93df-faef-82aebfc8555e (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180606.gz:Jun 5 09:32:30 warble1 kernel: Lustre: Skipped 4 previous similar messages
/var/log/messages-20180606.gz:Jun 5 09:36:05 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to e0137da9-e6eb-5014-d1aa-97b9b2bdee9c (at 192.168.44.131@o2ib44)
/var/log/messages-20180606.gz:Jun 5 09:36:05 warble1 kernel: Lustre: Skipped 4 previous similar messages
/var/log/messages-20180606.gz:Jun 5 09:36:38 warble1 kernel: Lustre: 20013:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff885ddbf84e00 x1602381797179616/t0(0) o35->fe3dd6be-b5f0-1569-fa08-c73c4ec56144@192.168.44.14@o2ib44:673/0 lens 512/696 e 0 to 0 dl 1528155403 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180606.gz:Jun 5 09:36:38 warble1 kernel: Lustre: 20013:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: LNet: Service thread pid 44961 was inactive for 1017.62s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: Pid: 44961, comm: mdt_rdpg00_018
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: #012Call Trace:
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd]
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel:
/var/log/messages-20180606.gz:Jun 5 09:39:12 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528155552.44961
/var/log/messages-20180606.gz:Jun 5 09:39:49 warble1 kernel: Pid: 32812, comm: mdt_rdpg01_012
/var/log/messages-20180606.gz:Jun 5 09:39:49 warble1 kernel: #012Call Trace:
/var/log/messages-20180606.gz:Jun 5 09:39:49 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180606.gz:Jun 5 09:39:49 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180606.gz:Jun 5 09:39:49 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:39:49 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180606.gz:Jun 5 09:39:49 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:39:49 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180606.gz:Jun 5 09:39:49 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:39:49 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180606.gz:Jun 5 09:39:49 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd]
/var/log/messages-20180606.gz:Jun 5 09:39:49 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
--
/var/log/messages-20180606.gz:Jun 5 09:45:21 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180606.gz:Jun 5 09:45:21 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180606.gz:Jun 5 09:45:21 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:45:21 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 09:45:21 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180606.gz:Jun 5 09:45:21 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 09:45:21 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180606.gz:Jun 5 09:45:21 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 09:45:21 warble1 kernel:
/var/log/messages-20180606.gz:Jun 5 09:45:21 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528155921.43082
/var/log/messages-20180606.gz:Jun 5 09:46:06 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to e0137da9-e6eb-5014-d1aa-97b9b2bdee9c (at 192.168.44.131@o2ib44)
/var/log/messages-20180606.gz:Jun 5 09:46:06 warble1 kernel: Lustre: Skipped 4 previous similar messages
/var/log/messages-20180606.gz:Jun 5 09:52:32 warble1 kernel: Lustre: dagg-MDT0002: Client 25bba27b-216b-93df-faef-82aebfc8555e (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180606.gz:Jun 5 09:52:32 warble1 kernel: Lustre: Skipped 4 previous similar messages
/var/log/messages-20180606.gz:Jun 5 09:56:07 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to e0137da9-e6eb-5014-d1aa-97b9b2bdee9c (at 192.168.44.131@o2ib44)
/var/log/messages-20180606.gz:Jun 5 09:56:07 warble1 kernel: Lustre: Skipped 4 previous similar messages
/var/log/messages-20180606.gz:Jun 5 09:56:47 warble1 kernel: Lustre: 42981:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff885df314ce00 x1602381800833008/t0(0) o35->fe3dd6be-b5f0-1569-fa08-c73c4ec56144@192.168.44.14@o2ib44:372/0 lens 512/696 e 0 to 0 dl 1528156612 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages-20180606.gz:Jun 5 09:56:47 warble1 kernel: Lustre: 42981:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message
/var/log/messages-20180606.gz:Jun 5 10:02:33 warble1 kernel: Lustre: dagg-MDT0002: Client 25bba27b-216b-93df-faef-82aebfc8555e (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180606.gz:Jun 5 10:02:33 warble1 kernel: Lustre: Skipped 4 previous similar messages
/var/log/messages-20180606.gz:Jun 5 10:04:19 warble1 kernel: LNet: Service thread pid 20013 was inactive for 1202.45s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180606.gz:Jun 5 10:04:19 warble1 kernel: LNet: Skipped 3 previous similar messages
/var/log/messages-20180606.gz:Jun 5 10:04:19 warble1 kernel: Pid: 20013, comm: mdt_rdpg00_001
/var/log/messages-20180606.gz:Jun 5 10:04:19 warble1 kernel: #012Call Trace:
/var/log/messages-20180606.gz:Jun 5 10:04:19 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180606.gz:Jun 5 10:04:19 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180606.gz:Jun 5 10:04:19 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 10:04:19 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd]
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel:
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528157060.20013
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: Pid: 42980, comm: mdt_rdpg00_013
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: #012Call Trace:
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd]
--
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 10:04:20 warble1 kernel:
/var/log/messages-20180606.gz:Jun 5 10:06:08 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to e0137da9-e6eb-5014-d1aa-97b9b2bdee9c (at 192.168.44.131@o2ib44)
/var/log/messages-20180606.gz:Jun 5 10:06:08 warble1 kernel: Lustre: Skipped 4 previous similar messages
/var/log/messages-20180606.gz:Jun 5 10:12:34 warble1 kernel: Lustre: dagg-MDT0002: Client 25bba27b-216b-93df-faef-82aebfc8555e (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180606.gz:Jun 5 10:12:34 warble1 kernel: Lustre: Skipped 4 previous similar messages
/var/log/messages-20180606.gz:Jun 5 10:16:09 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to e0137da9-e6eb-5014-d1aa-97b9b2bdee9c (at 192.168.44.131@o2ib44)
/var/log/messages-20180606.gz:Jun 5 10:16:09 warble1 kernel: Lustre: Skipped 4 previous similar messages
/var/log/messages-20180606.gz:Jun 5 10:22:35 warble1 kernel: Lustre: dagg-MDT0002: Client 25bba27b-216b-93df-faef-82aebfc8555e (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages-20180606.gz:Jun 5 10:22:35 warble1 kernel:
Lustre: Skipped 4 previous similar messages /var/log/messages-20180606.gz:Jun 5 10:26:10 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to e0137da9-e6eb-5014-d1aa-97b9b2bdee9c (at 192.168.44.131@o2ib44) /var/log/messages-20180606.gz:Jun 5 10:26:10 warble1 kernel: Lustre: Skipped 4 previous similar messages /var/log/messages-20180606.gz:Jun 5 10:32:36 warble1 kernel: Lustre: dagg-MDT0002: Client 25bba27b-216b-93df-faef-82aebfc8555e (at 192.168.44.13@o2ib44) reconnecting /var/log/messages-20180606.gz:Jun 5 10:32:36 warble1 kernel: Lustre: Skipped 4 previous similar messages /var/log/messages-20180606.gz:Jun 5 10:36:11 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to e0137da9-e6eb-5014-d1aa-97b9b2bdee9c (at 192.168.44.131@o2ib44) /var/log/messages-20180606.gz:Jun 5 10:36:11 warble1 kernel: Lustre: Skipped 4 previous similar messages /var/log/messages-20180606.gz:Jun 5 10:42:37 warble1 kernel: Lustre: dagg-MDT0002: Client 25bba27b-216b-93df-faef-82aebfc8555e (at 192.168.44.13@o2ib44) reconnecting /var/log/messages-20180606.gz:Jun 5 10:42:37 warble1 kernel: Lustre: Skipped 4 previous similar messages /var/log/messages-20180606.gz:Jun 5 10:46:12 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to e0137da9-e6eb-5014-d1aa-97b9b2bdee9c (at 192.168.44.131@o2ib44) /var/log/messages-20180606.gz:Jun 5 10:46:12 warble1 kernel: Lustre: Skipped 4 previous similar messages /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: LNet: Service thread pid 413302 was inactive for 200.30s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: LNet: Skipped 1 previous similar message /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: Pid: 413302, comm: mdt_rdpg01_019 /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: #012Call Trace: /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: [] ? 
recalc_sigpending+0x1b/0x50 /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd] /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: [] ? 
ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: /var/log/messages-20180606.gz:Jun 5 10:46:19 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528159579.413302 /var/log/messages-20180606.gz:Jun 5 10:52:38 warble1 kernel: Lustre: dagg-MDT0002: Client 25bba27b-216b-93df-faef-82aebfc8555e (at 192.168.44.13@o2ib44) reconnecting /var/log/messages-20180606.gz:Jun 5 10:52:38 warble1 kernel: Lustre: Skipped 4 previous similar messages /var/log/messages-20180606.gz:Jun 5 10:52:54 warble1 kernel: Lustre: 365029:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bdc1a0a050 x1602056497575520/t0(0) o35->ebbe0768-dc2d-c855-116e-620b3793d3cc@192.168.44.175@o2ib44:719/0 lens 512/696 e 24 to 0 dl 1528159979 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180606.gz:Jun 5 10:52:54 warble1 kernel: Lustre: 365029:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: LNet: Service thread pid 42981 was inactive for 200.11s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: Pid: 42981, comm: mdt_rdpg00_014 /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: #012Call Trace: /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: [] ? 
recalc_sigpending+0x1b/0x50 /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd] /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: [] ? 
ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: /var/log/messages-20180606.gz:Jun 5 10:53:47 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528160027.42981 /var/log/messages-20180606.gz:Jun 5 10:56:14 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to e0137da9-e6eb-5014-d1aa-97b9b2bdee9c (at 192.168.44.131@o2ib44) /var/log/messages-20180606.gz:Jun 5 10:56:14 warble1 kernel: Lustre: Skipped 5 previous similar messages /var/log/messages-20180606.gz:Jun 5 11:00:22 warble1 kernel: Lustre: 32249:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885e0d679050 x1602056599328096/t0(0) o35->2940fc32-93c7-1d6a-c6d1-0ac4c910d441@192.168.44.166@o2ib44:412/0 lens 512/696 e 24 to 0 dl 1528160427 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180606.gz:Jun 5 11:02:39 warble1 kernel: Lustre: dagg-MDT0002: Client 25bba27b-216b-93df-faef-82aebfc8555e (at 192.168.44.13@o2ib44) reconnecting /var/log/messages-20180606.gz:Jun 5 11:02:39 warble1 kernel: Lustre: Skipped 6 previous similar messages /var/log/messages-20180606.gz:Jun 5 11:03:30 warble1 kernel: Lustre: 365029:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-96), not sending early reply#012 req@ffff88bdd1076900 x1602056497721152/t0(0) o35->ebbe0768-dc2d-c855-116e-620b3793d3cc@192.168.44.175@o2ib44:600/0 lens 512/696 e 0 to 0 dl 1528160615 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180606.gz:Jun 5 11:06:15 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to e0137da9-e6eb-5014-d1aa-97b9b2bdee9c (at 
192.168.44.131@o2ib44) /var/log/messages-20180606.gz:Jun 5 11:06:15 warble1 kernel: Lustre: Skipped 6 previous similar messages /var/log/messages-20180606.gz:Jun 5 11:10:28 warble1 kernel: LNet: Service thread pid 32076 was inactive for 1114.27s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180606.gz:Jun 5 11:10:28 warble1 kernel: Pid: 32076, comm: mdt_rdpg01_010 /var/log/messages-20180606.gz:Jun 5 11:10:28 warble1 kernel: #012Call Trace: /var/log/messages-20180606.gz:Jun 5 11:10:28 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180606.gz:Jun 5 11:10:29 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180606.gz:Jun 5 11:10:29 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:10:29 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180606.gz:Jun 5 11:10:29 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:10:29 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180606.gz:Jun 5 11:10:29 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:10:29 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180606.gz:Jun 5 11:10:29 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd] /var/log/messages-20180606.gz:Jun 5 11:10:29 warble1 kernel: [] ? 
lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:10:29 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180606.gz:Jun 5 11:10:29 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180606.gz:Jun 5 11:10:29 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180606.gz:Jun 5 11:10:29 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:10:29 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:10:29 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:10:29 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180606.gz:Jun 5 11:10:29 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180606.gz:Jun 5 11:10:29 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:10:29 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:10:29 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180606.gz:Jun 5 11:10:29 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 11:10:29 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180606.gz:Jun 5 11:10:29 warble1 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 11:10:29 warble1 kernel: /var/log/messages-20180606.gz:Jun 5 11:10:29 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528161029.32076 /var/log/messages-20180606.gz:Jun 5 11:12:40 warble1 kernel: Lustre: dagg-MDT0002: Client 25bba27b-216b-93df-faef-82aebfc8555e (at 192.168.44.13@o2ib44) reconnecting /var/log/messages-20180606.gz:Jun 5 11:12:40 warble1 kernel: Lustre: Skipped 5 previous similar messages /var/log/messages-20180606.gz:Jun 5 11:16:16 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to e0137da9-e6eb-5014-d1aa-97b9b2bdee9c (at 192.168.44.131@o2ib44) /var/log/messages-20180606.gz:Jun 5 11:16:16 warble1 kernel: Lustre: Skipped 4 previous similar messages /var/log/messages-20180606.gz:Jun 5 11:22:41 warble1 kernel: Lustre: dagg-MDT0002: Client 25bba27b-216b-93df-faef-82aebfc8555e (at 192.168.44.13@o2ib44) reconnecting /var/log/messages-20180606.gz:Jun 5 11:22:41 warble1 kernel: Lustre: Skipped 4 previous similar messages /var/log/messages-20180606.gz:Jun 5 11:26:17 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to e0137da9-e6eb-5014-d1aa-97b9b2bdee9c (at 192.168.44.131@o2ib44) /var/log/messages-20180606.gz:Jun 5 11:26:17 warble1 kernel: Lustre: Skipped 4 previous similar messages /var/log/messages-20180606.gz:Jun 5 11:29:19 warble1 kernel: bnx2x 0000:19:00.3 em4: NIC Link is Down /var/log/messages-20180606.gz:Jun 5 11:29:22 warble1.oob Severity: Warning, Category: System Health, MessageID: NIC100, Message: The NIC Integrated 1 Port 4 network link is down. 
/var/log/messages-20180606.gz:Jun 5 11:29:22 warble1 kernel: bnx2x 0000:19:00.3 em4: NIC Link is Up, 1000 Mbps full duplex, Flow control: ON - receive & transmit /var/log/messages-20180606.gz:Jun 5 11:32:42 warble1 kernel: Lustre: dagg-MDT0002: Client 25bba27b-216b-93df-faef-82aebfc8555e (at 192.168.44.13@o2ib44) reconnecting -- /var/log/messages-20180606.gz:Jun 5 11:33:52 warble2 kernel: hfi1 0000:d8:00.0: hfi1_0: MTU change on vl 3 from 10240 to 0 /var/log/messages-20180606.gz:Jun 5 11:33:52 warble2 kernel: hfi1 0000:d8:00.0: hfi1_0: MTU change on vl 4 from 10240 to 0 /var/log/messages-20180606.gz:Jun 5 11:33:52 warble2 kernel: hfi1 0000:d8:00.0: hfi1_0: MTU change on vl 5 from 10240 to 0 /var/log/messages-20180606.gz:Jun 5 11:33:52 warble2 kernel: hfi1 0000:d8:00.0: hfi1_0: MTU change on vl 6 from 10240 to 0 /var/log/messages-20180606.gz:Jun 5 11:33:52 warble2 kernel: hfi1 0000:d8:00.0: hfi1_0: MTU change on vl 7 from 10240 to 0 /var/log/messages-20180606.gz:Jun 5 11:33:52 warble2 kernel: hfi1 0000:d8:00.0: hfi1_0: set_link_state: current INIT, new ARMED /var/log/messages-20180606.gz:Jun 5 11:33:52 warble2 kernel: hfi1 0000:d8:00.0: hfi1_0: logical state changed to PORT_ARMED (0x3) /var/log/messages-20180606.gz:Jun 5 11:33:52 warble2 kernel: hfi1 0000:d8:00.0: hfi1_0: send_idle_message: sending idle message 0x103 /var/log/messages-20180606.gz:Jun 5 11:33:52 warble2 kernel: hfi1 0000:d8:00.0: hfi1_0: read_idle_message: read idle message 0x103 /var/log/messages-20180606.gz:Jun 5 11:33:52 warble2 kernel: hfi1 0000:d8:00.0: hfi1_0: handle_sma_message: SMA message 0x1 /var/log/messages-20180606.gz:Jun 5 11:33:52 warble2 kernel: hfi1 0000:d8:00.0: hfi1_0: set_link_state: current ARMED, new ACTIVE /var/log/messages-20180606.gz:Jun 5 11:33:52 warble2 kernel: hfi1 0000:d8:00.0: hfi1_0: logical state changed to PORT_ACTIVE (0x4) /var/log/messages-20180606.gz:Jun 5 11:33:52 warble2 kernel: hfi1 0000:d8:00.0: hfi1_0: send_idle_message: sending idle message 0x203 
/var/log/messages-20180606.gz:Jun 5 11:33:52 warble2 kernel: hfi1 0000:d8:00.0: hfi1_0: read_idle_message: read idle message 0x203 /var/log/messages-20180606.gz:Jun 5 11:33:52 warble2 kernel: hfi1 0000:d8:00.0: hfi1_0: handle_sma_message: SMA message 0x2 /var/log/messages-20180606.gz:Jun 5 11:33:54 warble2 kernel: hfi1 0000:d8:00.0: hfi1_0: Switching to NO_DMA_RTAIL /var/log/messages-20180606.gz:Jun 5 11:33:54 warble2 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): ib0: link becomes ready /var/log/messages-20180606.gz:Jun 5 11:33:57 warble2 ntpd[13705]: Listen normally on 11 ib0 fe80::211:7501:171:d0f9 UDP 123 /var/log/messages-20180606.gz:Jun 5 11:36:18 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to e0137da9-e6eb-5014-d1aa-97b9b2bdee9c (at 192.168.44.131@o2ib44) /var/log/messages-20180606.gz:Jun 5 11:36:18 warble1 kernel: Lustre: Skipped 4 previous similar messages /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: LNet: Service thread pid 358893 was inactive for 200.06s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: Pid: 358893, comm: mdt_rdpg00_024 /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: #012Call Trace: /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: [] ? 
top_trans_start+0x27e/0x940 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd] /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: /var/log/messages-20180606.gz:Jun 5 11:37:58 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528162678.358893 /var/log/messages-20180606.gz:Jun 5 11:38:40 warble1 multipathd: vol-mgt-a: load table [0 819200 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:16 1] /var/log/messages-20180606.gz:Jun 5 11:38:40 warble1 multipathd: vol-mdt0-a: load table [0 1319780352 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:48 1] /var/log/messages-20180606.gz:Jun 5 11:38:40 warble1 multipathd: vol-mdt1-a: load table [0 1320599552 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:80 1] /var/log/messages-20180606.gz:Jun 5 11:38:40 warble1 multipathd: vol-mdt2-a: load table [0 1320599552 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:112 1] /var/log/messages-20180606.gz:Jun 5 11:38:40 warble1 multipathd: vol-home-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:144 1] /var/log/messages-20180606.gz:Jun 5 11:38:40 warble1 multipathd: vol-images-system-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:176 1] /var/log/messages-20180606.gz:Jun 5 11:38:40 warble1 multipathd: vol-apps-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:208 1] /var/log/messages-20180606.gz:Jun 5 11:38:44 warble2 multipathd: vol-mgt-a: load table [0 819200 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 8:240 1 round-robin 0 1 1 8:16 1] /var/log/messages-20180606.gz:Jun 5 11:38:44 warble2 multipathd: vol-mdt0-a: load table [0 1319780352 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:16 1 round-robin 0 1 1 8:48 1] /var/log/messages-20180606.gz:Jun 5 11:38:44 warble2 
multipathd: vol-mdt1-a: load table [0 1320599552 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:48 1 round-robin 0 1 1 8:80 1] /var/log/messages-20180606.gz:Jun 5 11:38:44 warble2 kernel: sd 1:0:1:10: rdac: array warble-md3420, ctlr 0, queueing MODE_SELECT command /var/log/messages-20180606.gz:Jun 5 11:38:44 warble2 kernel: sd 1:0:1:10: rdac: array warble-md3420, ctlr 0, MODE_SELECT returned with sense 06/94/01 -- /var/log/messages-20180606.gz:Jun 5 11:44:12 warble1 kernel: scsi 1:0:2:31: SSP: enclosure level(0x0000), connector name( 1 ) /var/log/messages-20180606.gz:Jun 5 11:44:12 warble1 kernel: scsi 1:0:2:31: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1) /var/log/messages-20180606.gz:Jun 5 11:44:12 warble1 kernel: scsi 1:0:2:31: Attached scsi generic sg18 type 0 /var/log/messages-20180606.gz:Jun 5 11:44:20 warble1 multipathd: vol-mgt-a: load table [0 819200 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:16 1] /var/log/messages-20180606.gz:Jun 5 11:44:20 warble1 multipathd: vol-mdt0-a: load table [0 1319780352 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:48 1] /var/log/messages-20180606.gz:Jun 5 11:44:20 warble1 multipathd: vol-mdt1-a: load table [0 1320599552 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:80 1] /var/log/messages-20180606.gz:Jun 5 11:44:20 warble1 multipathd: vol-mdt2-a: load table [0 1320599552 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:112 1] /var/log/messages-20180606.gz:Jun 5 11:44:20 warble1 multipathd: vol-home-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:144 1] /var/log/messages-20180606.gz:Jun 5 11:44:20 warble1 multipathd: vol-images-system-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:176 1] /var/log/messages-20180606.gz:Jun 5 
11:44:20 warble1 multipathd: vol-apps-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:208 1] /var/log/messages-20180606.gz:Jun 5 11:44:24 warble2 multipathd: vol-mgt-a: load table [0 819200 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 8:240 1 round-robin 0 1 1 8:16 1] /var/log/messages-20180606.gz:Jun 5 11:44:24 warble2 multipathd: vol-mdt0-a: load table [0 1319780352 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:16 1 round-robin 0 1 1 8:48 1] /var/log/messages-20180606.gz:Jun 5 11:44:24 warble2 multipathd: vol-mdt1-a: load table [0 1320599552 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:48 1 round-robin 0 1 1 8:80 1] /var/log/messages-20180606.gz:Jun 5 11:44:24 warble2 multipathd: vol-mdt2-a: load table [0 1320599552 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:80 1 round-robin 0 1 1 8:112 1] /var/log/messages-20180606.gz:Jun 5 11:44:24 warble2 multipathd: vol-home-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:112 1 round-robin 0 1 1 8:144 1] /var/log/messages-20180606.gz:Jun 5 11:44:24 warble2 multipathd: vol-images-system-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:144 1 round-robin 0 1 1 8:176 1] /var/log/messages-20180606.gz:Jun 5 11:44:24 warble2 multipathd: vol-apps-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:176 1 round-robin 0 1 1 8:208 1] /var/log/messages-20180606.gz:Jun 5 11:44:33 warble1 kernel: Lustre: 30483:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885debd89200 x1602056599850000/t0(0) o35->2940fc32-93c7-1d6a-c6d1-0ac4c910d441@192.168.44.166@o2ib44:43/0 lens 512/696 e 24 to 0 dl 1528163078 ref 2 fl 
Interpret:/0/0 rc 0/0 /var/log/messages-20180606.gz:Jun 5 11:46:19 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to e0137da9-e6eb-5014-d1aa-97b9b2bdee9c (at 192.168.44.131@o2ib44) /var/log/messages-20180606.gz:Jun 5 11:46:19 warble1 kernel: Lustre: Skipped 5 previous similar messages /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: LNet: Service thread pid 33705 was inactive for 200.73s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: Pid: 33705, comm: mdt01_043 /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: #012Call Trace: /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass] /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] schedule_timeout+0x174/0x2c0 /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] ? process_timeout+0x0/0x10 /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] ? cfs_block_sigsinv+0x71/0xa0 [libcfs] /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] ldlm_cli_enqueue_local+0x233/0x860 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x67c/0x860 [mdt] /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] ? 
ldlm_blocking_ast+0x0/0x170 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] mdt_reint_rename+0x13/0x20 [mdt] /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] mdt_reint_rec+0x83/0x210 [mdt] /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt] /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] mdt_reint+0x67/0x140 [mdt] /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: /var/log/messages-20180606.gz:Jun 5 11:46:28 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528163188.33705 /var/log/messages-20180606.gz:Jun 5 11:47:22 warble1 kernel: scsi 1:0:2:0: device_block, handle(0x000a) /var/log/messages-20180606.gz:Jun 5 11:47:22 warble1 kernel: scsi 1:0:2:31: device_block, handle(0x000a) /var/log/messages-20180606.gz:Jun 5 11:47:24 warble1 kernel: scsi 1:0:2:0: device_unblock and setting to running, handle(0x000a) /var/log/messages-20180606.gz:Jun 5 11:47:24 warble1 kernel: scsi 1:0:2:31: device_unblock and setting to running, handle(0x000a) /var/log/messages-20180606.gz:Jun 5 11:47:24 warble1 kernel: scsi 1:0:2:0: rdac: Detached /var/log/messages-20180606.gz:Jun 5 11:47:24 warble1 kernel: mpt3sas_cm0: removing handle(0x000a), sas_addr(0x500a0984b65ea614) /var/log/messages-20180606.gz:Jun 5 11:47:24 warble1 kernel: mpt3sas_cm0: removing : enclosure logical id(0x51866da0bde20800), slot(4) -- /var/log/messages-20180606.gz:Jun 5 11:51:45 warble1 kernel: mpt3sas_cm0: removing handle(0x000a), sas_addr(0x500a0984b65ea614) /var/log/messages-20180606.gz:Jun 5 11:51:45 warble1 kernel: mpt3sas_cm0: removing : enclosure logical id(0x51866da0bde20800), slot(4) /var/log/messages-20180606.gz:Jun 5 11:51:45 warble1 kernel: mpt3sas_cm0: removing enclosure level(0x0000), connector name( 1 ) /var/log/messages-20180606.gz:Jun 5 11:51:54 warble2 ntpd[13705]: 0.0.0.0 0612 02 freq_set kernel 67.193 PPM /var/log/messages-20180606.gz:Jun 5 11:51:54 warble2 ntpd[13705]: 0.0.0.0 0615 05 clock_sync /var/log/messages-20180606.gz:Jun 5 11:51:56 warble1 kernel: scsi 1:0:4:0: Direct-Access DELL MD34xx 0825 PQ: 1 ANSI: 5 /var/log/messages-20180606.gz:Jun 5 11:51:56 warble1 kernel: scsi 1:0:4:0: SSP: handle(0x000a), sas_addr(0x500a0984b65ea614), phy(4), device_name(0x500a0984b65ea614) /var/log/messages-20180606.gz:Jun 5 11:51:56 warble1 kernel: scsi 1:0:4:0: SSP: 
enclosure_logical_id(0x51866da0bde20800), slot(4) /var/log/messages-20180606.gz:Jun 5 11:51:56 warble1 kernel: scsi 1:0:4:0: SSP: enclosure level(0x0000), connector name( 1 ) /var/log/messages-20180606.gz:Jun 5 11:51:56 warble1 kernel: scsi 1:0:4:0: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1) /var/log/messages-20180606.gz:Jun 5 11:51:56 warble1 kernel: scsi 1:0:4:0: rdac: LUN 0 (IOSHIP) (unowned) /var/log/messages-20180606.gz:Jun 5 11:51:56 warble1 kernel: scsi 1:0:4:0: Attached scsi generic sg17 type 0 /var/log/messages-20180606.gz:Jun 5 11:51:56 warble1 kernel: scsi 1:0:4:31: Direct-Access DELL Universal Xport 0825 PQ: 0 ANSI: 5 /var/log/messages-20180606.gz:Jun 5 11:51:56 warble1 kernel: scsi 1:0:4:31: SSP: handle(0x000a), sas_addr(0x500a0984b65ea614), phy(4), device_name(0x500a0984b65ea614) /var/log/messages-20180606.gz:Jun 5 11:51:56 warble1 kernel: scsi 1:0:4:31: SSP: enclosure_logical_id(0x51866da0bde20800), slot(4) /var/log/messages-20180606.gz:Jun 5 11:51:56 warble1 kernel: scsi 1:0:4:31: SSP: enclosure level(0x0000), connector name( 1 ) /var/log/messages-20180606.gz:Jun 5 11:51:56 warble1 kernel: scsi 1:0:4:31: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1) /var/log/messages-20180606.gz:Jun 5 11:51:56 warble1 kernel: scsi 1:0:4:31: Attached scsi generic sg18 type 0 /var/log/messages-20180606.gz:Jun 5 11:52:44 warble1 kernel: Lustre: dagg-MDT0002: Client 25bba27b-216b-93df-faef-82aebfc8555e (at 192.168.44.13@o2ib44) reconnecting /var/log/messages-20180606.gz:Jun 5 11:52:44 warble1 kernel: Lustre: Skipped 5 previous similar messages /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: LNet: Service thread pid 360672 was inactive for 200.76s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: Pid: 360672, comm: mdt_rdpg01_025 /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: #012Call Trace: /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd] /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: [] ? 
__wake_up_common+0x5b/0x90 /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: /var/log/messages-20180606.gz:Jun 5 11:53:01 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528163581.360672 /var/log/messages-20180606.gz:Jun 5 11:53:02 warble1 kernel: Lustre: 33665:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bde1e7e300 x1602056498235872/t0(0) o36->ebbe0768-dc2d-c855-116e-620b3793d3cc@192.168.44.175@o2ib44:552/0 lens 760/3128 e 24 to 0 dl 1528163587 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: LNet: Service thread pid 30824 was inactive for 767.38s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: Pid: 30824, comm: mdt_rdpg00_007 /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: #012Call Trace: /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: [] ? 
default_wake_function+0x0/0x20 /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd] /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: /var/log/messages-20180606.gz:Jun 5 11:53:25 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528163605.30824 /var/log/messages-20180606.gz:Jun 5 11:54:40 warble1 multipathd: vol-mgt-a: load table [0 819200 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:16 1] /var/log/messages-20180606.gz:Jun 5 11:54:40 warble1 multipathd: vol-mdt0-a: load table [0 1319780352 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:48 1] /var/log/messages-20180606.gz:Jun 5 11:54:40 warble1 multipathd: vol-mdt1-a: load table [0 1320599552 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:80 1] /var/log/messages-20180606.gz:Jun 5 11:54:40 warble1 multipathd: vol-mdt2-a: load table [0 1320599552 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:112 1] /var/log/messages-20180606.gz:Jun 5 11:54:40 warble1 multipathd: vol-home-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:144 1] /var/log/messages-20180606.gz:Jun 5 11:54:40 warble1 multipathd: vol-images-system-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:176 1] /var/log/messages-20180606.gz:Jun 5 11:54:40 warble1 multipathd: vol-apps-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:208 1] /var/log/messages-20180606.gz:Jun 5 11:54:45 warble2 multipathd: vol-mgt-a: load table [0 819200 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 8:240 1 round-robin 0 1 1 8:16 1] /var/log/messages-20180606.gz:Jun 5 11:54:45 warble2 multipathd: vol-mdt0-a: load table [0 1319780352 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:16 1 round-robin 0 1 1 8:48 1] /var/log/messages-20180606.gz:Jun 5 11:54:45 warble2 multipathd: 
vol-mdt1-a: load table [0 1320599552 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:48 1 round-robin 0 1 1 8:80 1] /var/log/messages-20180606.gz:Jun 5 11:54:45 warble2 kernel: sd 1:0:1:10: rdac: array warble-md3420, ctlr 0, queueing MODE_SELECT command /var/log/messages-20180606.gz:Jun 5 11:54:45 warble2 kernel: sd 1:0:1:10: rdac: array warble-md3420, ctlr 0, MODE_SELECT returned with sense 06/94/01 -- /var/log/messages-20180606.gz:Jun 5 11:54:45 warble2 multipathd: vol-apps-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:176 1 round-robin 0 1 1 8:208 1] /var/log/messages-20180606.gz:Jun 5 11:54:45 warble2 kernel: sd 1:0:1:10: rdac: array warble-md3420, ctlr 0, MODE_SELECT completed /var/log/messages-20180606.gz:Jun 5 11:54:45 warble2 kernel: sd 1:0:1:20: rdac: array warble-md3420, ctlr 0, queueing MODE_SELECT command /var/log/messages-20180606.gz:Jun 5 11:54:45 warble2 kernel: sd 1:0:1:20: rdac: array warble-md3420, ctlr 0, MODE_SELECT returned with sense 06/94/01 /var/log/messages-20180606.gz:Jun 5 11:54:45 warble2 kernel: sd 1:0:1:20: rdac: array warble-md3420, ctlr 0, retrying MODE_SELECT command /var/log/messages-20180606.gz:Jun 5 11:54:45 warble2 kernel: sd 1:0:1:20: rdac: array warble-md3420, ctlr 0, MODE_SELECT completed /var/log/messages-20180606.gz:Jun 5 11:55:00 warble1 multipathd: vol-mgt-a: load table [0 819200 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:16 1] /var/log/messages-20180606.gz:Jun 5 11:55:00 warble1 multipathd: vol-mdt0-a: load table [0 1319780352 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:48 1] /var/log/messages-20180606.gz:Jun 5 11:55:00 warble1 multipathd: vol-mdt1-a: load table [0 1320599552 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:80 1] /var/log/messages-20180606.gz:Jun 5 11:55:00 warble1 multipathd: vol-mdt2-a: load table [0 1320599552 
multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:112 1] /var/log/messages-20180606.gz:Jun 5 11:55:00 warble1 multipathd: vol-home-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:144 1] /var/log/messages-20180606.gz:Jun 5 11:55:00 warble1 multipathd: vol-images-system-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:176 1] /var/log/messages-20180606.gz:Jun 5 11:55:00 warble1 multipathd: vol-apps-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:208 1] /var/log/messages-20180606.gz:Jun 5 11:55:05 warble2 multipathd: vol-mgt-a: load table [0 819200 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 8:240 1 round-robin 0 1 1 8:16 1] /var/log/messages-20180606.gz:Jun 5 11:55:05 warble2 multipathd: vol-mdt0-a: load table [0 1319780352 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:16 1 round-robin 0 1 1 8:48 1] /var/log/messages-20180606.gz:Jun 5 11:55:05 warble2 multipathd: vol-mdt1-a: load table [0 1320599552 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:48 1 round-robin 0 1 1 8:80 1] /var/log/messages-20180606.gz:Jun 5 11:55:05 warble2 multipathd: vol-mdt2-a: load table [0 1320599552 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:80 1 round-robin 0 1 1 8:112 1] /var/log/messages-20180606.gz:Jun 5 11:55:05 warble2 multipathd: vol-home-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:112 1 round-robin 0 1 1 8:144 1] /var/log/messages-20180606.gz:Jun 5 11:55:05 warble2 multipathd: vol-images-system-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:144 1 round-robin 0 1 1 8:176 1] /var/log/messages-20180606.gz:Jun 5 11:55:05 warble2 
multipathd: vol-apps-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:176 1 round-robin 0 1 1 8:208 1] /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: LNet: Service thread pid 32806 was inactive for 867.86s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: Pid: 32806, comm: mdt_rdpg00_012 /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: #012Call Trace: /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd] /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: [] ? 
lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: /var/log/messages-20180606.gz:Jun 5 11:56:09 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528163769.32806 /var/log/messages-20180606.gz:Jun 5 11:56:20 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to e0137da9-e6eb-5014-d1aa-97b9b2bdee9c (at 192.168.44.131@o2ib44) /var/log/messages-20180606.gz:Jun 5 11:56:20 warble1 kernel: Lustre: Skipped 6 previous similar messages /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: LNet: Service thread pid 33750 was inactive for 200.71s. 
The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: Pid: 33750, comm: mdt00_036 /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: #012Call Trace: /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass] /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] schedule_timeout+0x174/0x2c0 /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] ? process_timeout+0x0/0x10 /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] ? cfs_block_sigsinv+0x71/0xa0 [libcfs] /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] ldlm_cli_enqueue_local+0x233/0x860 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x67c/0x860 [mdt] /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] ? 
ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] mdt_reint_rename+0x13/0x20 [mdt] /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] mdt_reint_rec+0x83/0x210 [mdt] /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt] /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] mdt_reint+0x67/0x140 [mdt] /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: /var/log/messages-20180606.gz:Jun 5 11:58:24 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528163904.33750 /var/log/messages-20180606.gz:Jun 5 11:59:36 warble1 kernel: Lustre: 360796:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88bd77f1ec00 x1601912952572144/t0(0) o35->178ba697-5ce4-1bf0-4b26-b1c82128581b@192.168.44.173@o2ib44:191/0 lens 512/696 e 24 to 0 dl 1528163981 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages-20180606.gz:Jun 5 12:00:00 warble1 multipathd: vol-mgt-a: load table [0 819200 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:16 1] /var/log/messages-20180606.gz:Jun 5 12:00:00 warble1 multipathd: vol-mdt0-a: load table [0 1319780352 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:48 1] /var/log/messages-20180606.gz:Jun 5 12:00:00 warble1 multipathd: vol-mdt1-a: load table [0 1320599552 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:80 1] /var/log/messages-20180606.gz:Jun 5 12:00:00 warble1 multipathd: vol-mdt2-a: load table [0 1320599552 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:112 1] /var/log/messages-20180606.gz:Jun 5 12:00:00 warble1 multipathd: vol-home-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:144 1] /var/log/messages-20180606.gz:Jun 5 12:00:00 warble1 multipathd: vol-images-system-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:176 1] -- /var/log/messages-20180606.gz:Jun 5 12:16:07 warble2 kernel: sd 1:0:1:20: rdac: array warble-md3420, ctlr 0, queueing MODE_SELECT command /var/log/messages-20180606.gz:Jun 5 12:16:07 warble2 kernel: sd 1:0:1:20: rdac: array warble-md3420, ctlr 0, MODE_SELECT returned with sense 06/94/01 
/var/log/messages-20180606.gz:Jun 5 12:16:07 warble2 kernel: sd 1:0:1:20: rdac: array warble-md3420, ctlr 0, retrying MODE_SELECT command /var/log/messages-20180606.gz:Jun 5 12:16:07 warble2 kernel: sd 1:0:1:20: rdac: array warble-md3420, ctlr 0, MODE_SELECT completed /var/log/messages-20180606.gz:Jun 5 12:16:21 warble1 multipathd: vol-mgt-a: load table [0 819200 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:16 1] /var/log/messages-20180606.gz:Jun 5 12:16:21 warble1 multipathd: vol-mdt0-a: load table [0 1319780352 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:48 1] /var/log/messages-20180606.gz:Jun 5 12:16:21 warble1 multipathd: vol-mdt1-a: load table [0 1320599552 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:80 1] /var/log/messages-20180606.gz:Jun 5 12:16:21 warble1 multipathd: vol-mdt2-a: load table [0 1320599552 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:112 1] /var/log/messages-20180606.gz:Jun 5 12:16:21 warble1 multipathd: vol-home-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:144 1] /var/log/messages-20180606.gz:Jun 5 12:16:21 warble1 multipathd: vol-images-system-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:176 1] /var/log/messages-20180606.gz:Jun 5 12:16:21 warble1 multipathd: vol-apps-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:208 1] /var/log/messages-20180606.gz:Jun 5 12:16:22 warble1 kernel: Lustre: dagg-MDT0002: Connection restored to e0137da9-e6eb-5014-d1aa-97b9b2bdee9c (at 192.168.44.131@o2ib44) /var/log/messages-20180606.gz:Jun 5 12:16:22 warble1 kernel: Lustre: Skipped 4 previous similar messages /var/log/messages-20180606.gz:Jun 5 12:16:26 warble2 multipathd: vol-mgt-a: load table [0 819200 multipath 3 pg_init_retries 50 
queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 8:240 1 round-robin 0 1 1 8:16 1] /var/log/messages-20180606.gz:Jun 5 12:16:26 warble2 multipathd: vol-mdt0-a: load table [0 1319780352 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:16 1 round-robin 0 1 1 8:48 1] /var/log/messages-20180606.gz:Jun 5 12:16:26 warble2 multipathd: vol-mdt1-a: load table [0 1320599552 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:48 1 round-robin 0 1 1 8:80 1] /var/log/messages-20180606.gz:Jun 5 12:16:26 warble2 multipathd: vol-mdt2-a: load table [0 1320599552 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:80 1 round-robin 0 1 1 8:112 1] /var/log/messages-20180606.gz:Jun 5 12:16:26 warble2 multipathd: vol-home-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:112 1 round-robin 0 1 1 8:144 1] /var/log/messages-20180606.gz:Jun 5 12:16:26 warble2 multipathd: vol-images-system-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:144 1 round-robin 0 1 1 8:176 1] /var/log/messages-20180606.gz:Jun 5 12:16:26 warble2 multipathd: vol-apps-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:176 1 round-robin 0 1 1 8:208 1] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: LNet: Service thread pid 33656 was inactive for 200.29s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: Pid: 33656, comm: mdt00_033 /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: #012Call Trace: /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] ? 
ktime_get_real_seconds+0xe/0x10 [libcfs] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] schedule+0x29/0x70 /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] schedule_timeout+0x174/0x2c0 /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] ? process_timeout+0x0/0x10 /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] ? ldlm_resource_add_lock+0x6a/0x1b0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] ldlm_cli_enqueue_fini+0x93b/0xdc0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] ldlm_cli_enqueue+0x6c2/0x810 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] ? mdt_remote_blocking_ast+0x0/0x590 [mdt] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] osp_md_object_lock+0x172/0x2e0 [osp] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] lod_object_lock+0xf3/0x950 [lod] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] ? htable_lookup+0xa9/0x180 [obdclass] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] mdd_object_lock+0x3e/0xe0 [mdd] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] mdt_remote_object_lock+0x1e5/0x710 [mdt] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x36a/0x860 [mdt] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] ? 
strlcpy+0x42/0x60 /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] mdt_reint_rename+0x13/0x20 [mdt] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] mdt_reint_rec+0x83/0x210 [mdt] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] mdt_reint+0x67/0x140 [mdt] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] kthread+0xd1/0xe0 /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] ? kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: /var/log/messages-20180606.gz:Jun 5 12:17:30 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528165050.33656 -- /var/log/messages-20180606.gz:Jun 5 12:26:47 warble2 multipathd: vol-apps-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:176 1 round-robin 0 1 1 8:208 1] /var/log/messages-20180606.gz:Jun 5 12:26:47 warble2 kernel: sd 1:0:1:10: rdac: array warble-md3420, ctlr 0, MODE_SELECT completed /var/log/messages-20180606.gz:Jun 5 12:26:47 warble2 kernel: sd 1:0:1:20: rdac: array warble-md3420, ctlr 0, queueing MODE_SELECT command /var/log/messages-20180606.gz:Jun 5 12:26:47 warble2 kernel: sd 1:0:1:20: rdac: array warble-md3420, ctlr 0, MODE_SELECT returned with sense 06/94/01 /var/log/messages-20180606.gz:Jun 5 12:26:47 warble2 kernel: sd 1:0:1:20: rdac: array warble-md3420, ctlr 0, retrying MODE_SELECT command /var/log/messages-20180606.gz:Jun 5 12:26:48 warble2 kernel: sd 1:0:1:20: rdac: array warble-md3420, ctlr 0, MODE_SELECT completed /var/log/messages-20180606.gz:Jun 5 12:27:01 warble1 multipathd: vol-mgt-a: load table [0 819200 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:16 1] /var/log/messages-20180606.gz:Jun 5 12:27:01 warble1 multipathd: vol-mdt0-a: load table [0 1319780352 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:48 1] /var/log/messages-20180606.gz:Jun 5 12:27:01 warble1 multipathd: vol-mdt1-a: load table [0 1320599552 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:80 1] /var/log/messages-20180606.gz:Jun 5 12:27:01 warble1 multipathd: vol-mdt2-a: load table [0 1320599552 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:112 1] /var/log/messages-20180606.gz:Jun 5 12:27:01 warble1 multipathd: vol-home-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 
rdac 1 1 round-robin 0 1 1 8:144 1]
/var/log/messages-20180606.gz:Jun 5 12:27:01 warble1 multipathd: vol-images-system-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:176 1]
/var/log/messages-20180606.gz:Jun 5 12:27:01 warble1 multipathd: vol-apps-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:208 1]
/var/log/messages-20180606.gz:Jun 5 12:27:07 warble2 multipathd: vol-mgt-a: load table [0 819200 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 8:240 1 round-robin 0 1 1 8:16 1]
/var/log/messages-20180606.gz:Jun 5 12:27:07 warble2 multipathd: vol-mdt0-a: load table [0 1319780352 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:16 1 round-robin 0 1 1 8:48 1]
/var/log/messages-20180606.gz:Jun 5 12:27:07 warble2 multipathd: vol-mdt1-a: load table [0 1320599552 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:48 1 round-robin 0 1 1 8:80 1]
/var/log/messages-20180606.gz:Jun 5 12:27:07 warble2 multipathd: vol-mdt2-a: load table [0 1320599552 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:80 1 round-robin 0 1 1 8:112 1]
/var/log/messages-20180606.gz:Jun 5 12:27:07 warble2 multipathd: vol-home-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:112 1 round-robin 0 1 1 8:144 1]
/var/log/messages-20180606.gz:Jun 5 12:27:07 warble2 multipathd: vol-images-system-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:144 1 round-robin 0 1 1 8:176 1]
/var/log/messages-20180606.gz:Jun 5 12:27:07 warble2 multipathd: vol-apps-a: load table [0 230686720 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 round-robin 0 1 1 65:176 1 round-robin 0 1 1 8:208 1]
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: LNet: Service thread pid 356483 was inactive for 200.30s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: Pid: 356483, comm: mdt_rdpg01_024
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: #012Call Trace:
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd]
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel:
/var/log/messages-20180606.gz:Jun 5 12:28:29 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528165709.356483
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: LNet: Service thread pid 33696 was inactive for 200.20s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: Pid: 33696, comm: mdt01_039
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: #012Call Trace:
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] schedule+0x29/0x70
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] schedule_timeout+0x174/0x2c0
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] ? process_timeout+0x0/0x10
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] ? ldlm_resource_add_lock+0x6a/0x1b0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] ldlm_cli_enqueue_fini+0x93b/0xdc0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass]
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] ldlm_cli_enqueue+0x6c2/0x810 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] ? mdt_remote_blocking_ast+0x0/0x590 [mdt]
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] osp_md_object_lock+0x172/0x2e0 [osp]
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] lod_object_lock+0xf3/0x950 [lod]
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] ? htable_lookup+0xa9/0x180 [obdclass]
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] mdd_object_lock+0x3e/0xe0 [mdd]
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] mdt_remote_object_lock+0x1e5/0x710 [mdt]
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x36a/0x860 [mdt]
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] ? strlcpy+0x42/0x60
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] mdt_reint_rename+0x13/0x20 [mdt]
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] mdt_reint_rec+0x83/0x210 [mdt]
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] kthread+0xd1/0xe0
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: [] ? kthread+0x0/0xe0
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel:
/var/log/messages-20180606.gz:Jun 5 12:28:48 warble1 kernel: LustreError: dumping log to /tmp/lustre-log.1528165728.33696
--
/var/log/messages-20180611.gz:Jun 10 23:19:35 warble2 kernel: Lustre: Skipped 6 previous similar messages
/var/log/messages-20180611.gz:Jun 10 23:20:24 warble2 kernel: Lustre: MGS: haven't heard from client 271a6413-fbae-a93e-426e-4be380c2bb53 (at 192.168.44.199@o2ib44) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bb683a400, cur 1528636824 expire 1528636674 last 1528636597
/var/log/messages-20180611.gz:Jun 10 23:20:24 warble2 kernel: Lustre: Skipped 39 previous similar messages
/var/log/messages-20180611.gz:Jun 10 23:20:37 warble2 kernel: Lustre: images-MDT0000: haven't heard from client bf21a22e-d192-194a-9e24-7d84d7be7d23 (at 192.168.44.199@o2ib44) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8853b0849800, cur 1528636837 expire 1528636687 last 1528636610
/var/log/messages-20180611.gz:Jun 10 23:21:18 warble2 kernel: Lustre: dagg-MDT0002: Connection restored to d7bb16fe-2887-dcf1-0591-652afedc748e (at 192.168.44.199@o2ib44)
/var/log/messages-20180611.gz:Jun 10 23:21:18 warble2 kernel: Lustre: Skipped 5 previous similar messages
/var/log/messages-20180611.gz:Jun 11 03:45:09 warble2 kernel: Lustre: MGS: Connection restored to 422dc970-7af9-aaa2-d9c6-5a67490ea32e (at 192.168.44.124@o2ib44)
/var/log/messages-20180611.gz:Jun 11 03:45:09 warble2 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages-20180611.gz:Jun 11 03:45:34 warble2 kernel: Lustre: images-MDT0000: Connection restored to 422dc970-7af9-aaa2-d9c6-5a67490ea32e (at 192.168.44.124@o2ib44)
/var/log/messages-20180611.gz:Jun 11 03:45:53 warble2 kernel: Lustre: MGS: haven't heard from client 82802f44-c8ad-3994-a7d9-9e33ecae0bd7 (at 192.168.44.124@o2ib44) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88bcef36c400, cur 1528652753 expire 1528652603 last 1528652526
/var/log/messages-20180611.gz:Jun 11 03:46:14 warble2 kernel: Lustre: images-MDT0000: haven't heard from client dcd27d32-badd-af06-548b-22dcd178d80f (at 192.168.44.124@o2ib44) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885d69df9800, cur 1528652774 expire 1528652624 last 1528652547
/var/log/messages-20180611.gz:Jun 11 03:46:52 warble2 kernel: Lustre: dagg-MDT0002: Connection restored to 422dc970-7af9-aaa2-d9c6-5a67490ea32e (at 192.168.44.124@o2ib44)
/var/log/messages-20180611.gz:Jun 11 03:46:52 warble2 kernel: Lustre: Skipped 4 previous similar messages
/var/log/messages:Jun 11 11:03:39 warble2 kernel: LustreError: 110915:0:(llog_cat.c:795:llog_cat_cancel_records()) dagg-MDT0000-osp-MDT0002: fail to cancel 1 of 1 llog-records: rc = -116
/var/log/messages:Jun 11 11:03:40 warble2 kernel: LustreError: 110915:0:(llog_cat.c:795:llog_cat_cancel_records()) dagg-MDT0000-osp-MDT0002: fail to cancel 1 of 1 llog-records: rc = -116
/var/log/messages:Jun 11 11:03:40 warble2 kernel: LustreError: 110915:0:(llog_cat.c:795:llog_cat_cancel_records()) Skipped 5 previous similar messages
/var/log/messages:Jun 11 11:03:52 warble2 kernel: LustreError: 110915:0:(llog_cat.c:795:llog_cat_cancel_records()) dagg-MDT0000-osp-MDT0002: fail to cancel 1 of 1 llog-records: rc = -116
/var/log/messages:Jun 11 11:03:52 warble2 kernel: LustreError: 110915:0:(llog_cat.c:795:llog_cat_cancel_records()) Skipped 11 previous similar messages
/var/log/messages:Jun 11 11:04:02 warble2 kernel: LustreError: 109296:0:(llog_cat.c:795:llog_cat_cancel_records()) dagg-MDT0000-osp-MDT0001: fail to cancel 1 of 1 llog-records: rc = -116
/var/log/messages:Jun 11 11:04:02 warble2 kernel: LustreError: 109296:0:(llog_cat.c:795:llog_cat_cancel_records()) Skipped 1 previous similar message
/var/log/messages:Jun 11 11:06:59 warble2 kernel: LNet: Service thread pid 116276 was inactive for 200.63s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages:Jun 11 11:06:59 warble2 kernel: Pid: 116276, comm: mdt_rdpg00_024
/var/log/messages:Jun 11 11:06:59 warble2 kernel: #012Call Trace:
/var/log/messages:Jun 11 11:06:59 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages:Jun 11 11:06:59 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages:Jun 11 11:06:59 warble2 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc]
/var/log/messages:Jun 11 11:06:59 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages:Jun 11 11:06:59 warble2 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc]
/var/log/messages:Jun 11 11:06:59 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages:Jun 11 11:06:59 warble2 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc]
/var/log/messages:Jun 11 11:06:59 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages:Jun 11 11:06:59 warble2 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd]
/var/log/messages:Jun 11 11:06:59 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages:Jun 11 11:06:59 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages:Jun 11 11:06:59 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages:Jun 11 11:06:59 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages:Jun 11 11:06:59 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages:Jun 11 11:06:59 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages:Jun 11 11:06:59 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages:Jun 11 11:06:59 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages:Jun 11 11:06:59 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages:Jun 11 11:06:59 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 11:06:59 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 11:06:59 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages:Jun 11 11:06:59 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 11:06:59 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages:Jun 11 11:06:59 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 11:06:59 warble2 kernel:
/var/log/messages:Jun 11 11:06:59 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528679219.116276
/var/log/messages:Jun 11 11:07:00 warble2 kernel: LNet: Service thread pid 87487 was inactive for 200.49s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages:Jun 11 11:07:00 warble2 kernel: Pid: 87487, comm: mdt_rdpg00_000
/var/log/messages:Jun 11 11:07:00 warble2 kernel: #012Call Trace:
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 11:07:00 warble2 kernel:
/var/log/messages:Jun 11 11:07:00 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528679220.87487
/var/log/messages:Jun 11 11:07:00 warble2 kernel: Pid: 331635, comm: mdt_rdpg01_040
/var/log/messages:Jun 11 11:07:00 warble2 kernel: #012Call Trace:
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages:Jun 11 11:07:00 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 11:07:00 warble2 kernel:
/var/log/messages:Jun 11 11:07:01 warble2 kernel: LNet: Service thread pid 256447 was inactive for 200.45s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages:Jun 11 11:07:01 warble2 kernel: LNet: Skipped 1 previous similar message
/var/log/messages:Jun 11 11:07:01 warble2 kernel: Pid: 256447, comm: mdt_rdpg01_036
/var/log/messages:Jun 11 11:07:01 warble2 kernel: #012Call Trace:
/var/log/messages:Jun 11 11:07:01 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages:Jun 11 11:07:01 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages:Jun 11 11:07:01 warble2 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc]
/var/log/messages:Jun 11 11:07:01 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages:Jun 11 11:07:01 warble2 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc]
/var/log/messages:Jun 11 11:07:01 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages:Jun 11 11:07:01 warble2 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc]
/var/log/messages:Jun 11 11:07:01 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages:Jun 11 11:07:01 warble2 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd]
/var/log/messages:Jun 11 11:07:01 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages:Jun 11 11:07:01 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages:Jun 11 11:07:01 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages:Jun 11 11:07:01 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages:Jun 11 11:07:01 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages:Jun 11 11:07:01 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages:Jun 11 11:07:01 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages:Jun 11 11:07:01 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages:Jun 11 11:07:01 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages:Jun 11 11:07:01 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 11:07:01 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 11:07:01 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages:Jun 11 11:07:01 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 11:07:01 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages:Jun 11 11:07:01 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 11:07:01 warble2 kernel:
/var/log/messages:Jun 11 11:07:01 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528679221.256447
/var/log/messages:Jun 11 11:07:02 warble2 kernel: Pid: 116272, comm: mdt_rdpg00_020
/var/log/messages:Jun 11 11:07:02 warble2 kernel: #012Call Trace:
/var/log/messages:Jun 11 11:07:02 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages:Jun 11 11:07:02 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages:Jun 11 11:07:02 warble2 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc]
/var/log/messages:Jun 11 11:07:02 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages:Jun 11 11:07:02 warble2 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc]
/var/log/messages:Jun 11 11:07:02 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages:Jun 11 11:07:02 warble2 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc]
/var/log/messages:Jun 11 11:07:02 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages:Jun 11 11:07:02 warble2 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd]
/var/log/messages:Jun 11 11:07:02 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages:Jun 11 11:07:02 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages:Jun 11 11:07:02 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages:Jun 11 11:07:02 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages:Jun 11 11:07:02 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages:Jun 11 11:07:02 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages:Jun 11 11:07:02 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages:Jun 11 11:07:02 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages:Jun 11 11:07:02 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages:Jun 11 11:07:02 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 11:07:02 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 11:07:02 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages:Jun 11 11:07:02 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 11:07:02 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages:Jun 11 11:07:02 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 11:07:02 warble2 kernel:
/var/log/messages:Jun 11 11:07:02 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528679222.116272
/var/log/messages:Jun 11 11:07:06 warble2 kernel: LNet: Service thread pid 92763 was inactive for 200.13s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
/var/log/messages:Jun 11 11:07:06 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528679226.92763
/var/log/messages:Jun 11 11:07:07 warble2 kernel: LNet: Service thread pid 79024 was inactive for 200.27s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
/var/log/messages:Jun 11 11:07:07 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528679227.79024
/var/log/messages:Jun 11 11:07:08 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528679228.167869
/var/log/messages:Jun 11 11:07:09 warble2 kernel: LNet: Service thread pid 101568 was inactive for 200.48s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
/var/log/messages:Jun 11 11:07:09 warble2 kernel: LNet: Skipped 1 previous similar message
/var/log/messages:Jun 11 11:07:09 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528679229.101568
/var/log/messages:Jun 11 11:07:10 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528679230.115641
/var/log/messages:Jun 11 11:07:11 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528679231.122152
/var/log/messages:Jun 11 11:07:12 warble2 kernel: LNet: Service thread pid 92998 was inactive for 200.49s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
/var/log/messages:Jun 11 11:07:12 warble2 kernel: LNet: Skipped 4 previous similar messages
/var/log/messages:Jun 11 11:07:12 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528679232.92998
/var/log/messages:Jun 11 11:07:13 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528679233.331992
/var/log/messages:Jun 11 11:07:14 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528679234.7735
/var/log/messages:Jun 11 11:07:15 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528679235.38496
/var/log/messages:Jun 11 11:07:16 warble2 kernel: LNet: Service thread pid 92764 was inactive for 200.39s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
/var/log/messages:Jun 11 11:07:16 warble2 kernel: LNet: Skipped 4 previous similar messages
/var/log/messages:Jun 11 11:07:16 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528679236.92764
/var/log/messages:Jun 11 11:07:18 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528679238.146580
/var/log/messages:Jun 11 11:07:19 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528679239.125103
/var/log/messages:Jun 11 11:07:21 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528679241.92767
/var/log/messages:Jun 11 11:07:23 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528679243.138449
/var/log/messages:Jun 11 11:07:24 warble2 kernel: LNet: Service thread pid 116275 was inactive for 200.48s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
/var/log/messages:Jun 11 11:07:24 warble2 kernel: LNet: Skipped 4 previous similar messages
/var/log/messages:Jun 11 11:07:24 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528679244.116275
/var/log/messages:Jun 11 11:07:31 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528679251.331993
/var/log/messages:Jun 11 11:07:43 warble2 kernel: LNet: Service thread pid 93717 was inactive for 200.46s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
/var/log/messages:Jun 11 11:07:43 warble2 kernel: LNet: Skipped 1 previous similar message /var/log/messages:Jun 11 11:07:43 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528679263.93717 /var/log/messages:Jun 11 11:08:05 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528679285.45204 /var/log/messages:Jun 11 11:08:49 warble2 kernel: Lustre: dagg-MDT0000-osp-MDT0001: Connection to dagg-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete /var/log/messages:Jun 11 11:08:49 warble2 kernel: Lustre: Skipped 1 previous similar message /var/log/messages:Jun 11 11:08:49 warble2 kernel: LustreError: 101568:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528679029, 300s ago), entering recovery for dagg-MDT0000_UUID@192.168.44.22@o2ib44 ns: dagg-MDT0000-osp-MDT0001 lock: ffff884f2b785600/0xba8d5041143b4777 lrc: 4/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 4 type: IBT flags: 0x1000001000000 nid: local remote: 0xba8d5041143b477e expref: -99 pid: 101568 timeout: 0 lvb_type: 0 /var/log/messages:Jun 11 11:08:49 warble2 kernel: Lustre: dagg-MDT0000: Received new LWP connection from 0@lo, removing former export from same NID /var/log/messages:Jun 11 11:08:49 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to 192.168.44.22@o2ib44 (at 0@lo) /var/log/messages:Jun 11 11:08:51 warble2 kernel: LustreError: 122152:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528679031, 300s ago), entering recovery for dagg-MDT0000_UUID@192.168.44.22@o2ib44 ns: dagg-MDT0000-osp-MDT0002 lock: ffff884ec880a600/0xba8d5041143ba533 lrc: 4/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 3 type: IBT flags: 0x1000001000000 nid: local remote: 0xba8d5041143ba53a expref: -99 pid: 122152 timeout: 0 lvb_type: 0 /var/log/messages:Jun 11 11:08:51 warble2 kernel: LustreError: 
122152:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) Skipped 2 previous similar messages /var/log/messages:Jun 11 11:08:51 warble2 kernel: LustreError: 92998:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528679031, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff884ec43c4000/0xba8d5041143bb0ee lrc: 3/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 9 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 92998 timeout: 0 lvb_type: 0 /var/log/messages:Jun 11 11:08:51 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528679331.92998 /var/log/messages:Jun 11 11:08:52 warble2 kernel: LustreError: 99412:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528679032, 300s ago), entering recovery for dagg-MDT0000_UUID@192.168.44.22@o2ib44 ns: dagg-MDT0000-osp-MDT0001 lock: ffff884f54b95800/0xba8d5041143be73c lrc: 4/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 4 type: IBT flags: 0x1000001000000 nid: local remote: 0xba8d5041143be743 expref: -99 pid: 99412 timeout: 0 lvb_type: 0 /var/log/messages:Jun 11 11:09:44 warble2 kernel: LustreError: 45204:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528679084, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff88be210bbe00/0xba8d5041144e7659 lrc: 3/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 9 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 45204 timeout: 0 lvb_type: 0 /var/log/messages:Jun 11 11:13:33 warble2 kernel: Lustre: 116270:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885d8b3fa850 x1602890297769744/t0(0) o35->8a269cc5-4548-5dc4-f239-1a7f325eaf98@192.168.44.108@o2ib44:163/0 lens 512/696 e 24 to 0 dl 1528679618 ref 
2 fl Interpret:/0/0 rc 0/0 /var/log/messages:Jun 11 11:13:34 warble2 kernel: Lustre: 116270:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88be59c50000 x1602890297787600/t0(0) o35->8a269cc5-4548-5dc4-f239-1a7f325eaf98@192.168.44.108@o2ib44:164/0 lens 512/696 e 23 to 0 dl 1528679619 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages:Jun 11 11:13:37 warble2 kernel: Lustre: 413067:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff884ecf90a100 x1602890286720544/t0(0) o35->5dc383e0-2a08-7beb-5a44-0a7f991c2936@192.168.44.103@o2ib44:167/0 lens 512/696 e 23 to 0 dl 1528679622 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages:Jun 11 11:13:37 warble2 kernel: Lustre: 413067:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 2 previous similar messages /var/log/messages:Jun 11 11:13:39 warble2 kernel: Lustre: dagg-MDT0000: Client 8a269cc5-4548-5dc4-f239-1a7f325eaf98 (at 192.168.44.108@o2ib44) reconnecting /var/log/messages:Jun 11 11:13:39 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to (at 192.168.44.108@o2ib44) /var/log/messages:Jun 11 11:13:39 warble2 kernel: Lustre: Skipped 3 previous similar messages /var/log/messages:Jun 11 11:13:41 warble2 kernel: Lustre: dagg-MDT0000: Client 87fa0831-5932-3de9-286d-b35743b63640 (at 192.168.44.104@o2ib44) reconnecting /var/log/messages:Jun 11 11:13:41 warble2 kernel: Lustre: 116260:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885b9b86e450 x1602890286748400/t0(0) o35->5dc383e0-2a08-7beb-5a44-0a7f991c2936@192.168.44.103@o2ib44:171/0 lens 512/696 e 23 to 0 dl 1528679626 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages:Jun 11 11:13:43 warble2 kernel: Lustre: dagg-MDT0000: Client 5dc383e0-2a08-7beb-5a44-0a7f991c2936 (at 192.168.44.103@o2ib44) reconnecting /var/log/messages:Jun 11 11:13:45 warble2 kernel: Lustre: 
413067:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff884efae9e600 x1602890297832208/t0(0) o35->8a269cc5-4548-5dc4-f239-1a7f325eaf98@192.168.44.108@o2ib44:175/0 lens 512/696 e 22 to 0 dl 1528679630 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages:Jun 11 11:13:45 warble2 kernel: Lustre: 413067:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 5 previous similar messages
/var/log/messages:Jun 11 11:13:48 warble2 kernel: Lustre: dagg-MDT0000: Client 25bba27b-216b-93df-faef-82aebfc8555e (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages:Jun 11 11:13:53 warble2 kernel: Lustre: 138426:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff884eed401e00 x1602890297884400/t0(0) o35->8a269cc5-4548-5dc4-f239-1a7f325eaf98@192.168.44.108@o2ib44:183/0 lens 512/696 e 15 to 0 dl 1528679638 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages:Jun 11 11:13:53 warble2 kernel: Lustre: 138426:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 7 previous similar messages
/var/log/messages:Jun 11 11:14:17 warble2 kernel: Lustre: 173022:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88be2c9f9500 x1602890264813904/t0(0) o35->87fa0831-5932-3de9-286d-b35743b63640@192.168.44.104@o2ib44:207/0 lens 512/696 e 11 to 0 dl 1528679662 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages:Jun 11 11:14:17 warble2 kernel: Lustre: 173022:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 5 previous similar messages
/var/log/messages:Jun 11 11:15:06 warble2 kernel: Lustre: dagg-MDT0000-osp-MDT0001: Connection to dagg-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
/var/log/messages:Jun 11 11:15:06 warble2 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages:Jun 11 11:15:06 warble2 kernel: LustreError:
72910:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528679406, 300s ago), entering recovery for dagg-MDT0000_UUID@192.168.44.22@o2ib44 ns: dagg-MDT0000-osp-MDT0001 lock: ffff88bcd3aef000/0xba8d504114d201a0 lrc: 4/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 5 type: IBT flags: 0x1000001000000 nid: local remote: 0xba8d504114d201a7 expref: -99 pid: 72910 timeout: 0 lvb_type: 0
/var/log/messages:Jun 11 11:15:06 warble2 kernel: Lustre: dagg-MDT0000: Received new LWP connection from 0@lo, removing former export from same NID
/var/log/messages:Jun 11 11:15:06 warble2 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages:Jun 11 11:15:06 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to 192.168.44.22@o2ib44 (at 0@lo)
/var/log/messages:Jun 11 11:15:06 warble2 kernel: Lustre: Skipped 6 previous similar messages
/var/log/messages:Jun 11 11:16:51 warble2 kernel: LustreError: 74568:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528679510, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff884f03821400/0xba8d5041154d23c6 lrc: 3/1,0 mode: --/PR res: [0x20001b982:0x13c11:0x0].0x0 bits 0x13 rrc: 214 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 74568 timeout: 0 lvb_type: 0
/var/log/messages:Jun 11 11:16:51 warble2 kernel: LustreError: 74568:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) Skipped 1 previous similar message
--
/var/log/messages:Jun 11 11:17:37 warble2 kernel: LustreError: 79454:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528679557, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff88be0d76ce00/0xba8d50411565888a lrc: 3/1,0 mode: --/PR res: [0x20001b982:0x13c11:0x0].0x0 bits 0x13 rrc: 214 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99
pid: 79454 timeout: 0 lvb_type: 0
/var/log/messages:Jun 11 11:17:37 warble2 kernel: LustreError: 79454:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) Skipped 2 previous similar messages
/var/log/messages:Jun 11 11:17:54 warble2 kernel: LustreError: 170783:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528679573, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff884eeb905400/0xba8d504116160f41 lrc: 3/1,0 mode: --/PR res: [0x20001b982:0x13c11:0x0].0x0 bits 0x13 rrc: 214 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 170783 timeout: 0 lvb_type: 0
/var/log/messages:Jun 11 11:17:54 warble2 kernel: LustreError: 170783:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) Skipped 1 previous similar message
/var/log/messages:Jun 11 11:18:51 warble2 kernel: LustreError: 303835:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528679631, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff884ee3393200/0xba8d504116a957f6 lrc: 3/1,0 mode: --/PR res: [0x20001b982:0x13c11:0x0].0x0 bits 0x13 rrc: 215 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 303835 timeout: 0 lvb_type: 0
/var/log/messages:Jun 11 11:18:51 warble2 kernel: LustreError: 303835:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) Skipped 71 previous similar messages
/var/log/messages:Jun 11 11:19:37 warble2 kernel: LustreError: 99416:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528679677, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff884f0af83400/0xba8d504116c05aa0 lrc: 3/1,0 mode: --/PR res: [0x20001b982:0x13c11:0x0].0x0 bits 0x13 rrc: 215 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 99416 timeout: 0 lvb_type: 0
/var/log/messages:Jun 11 11:19:37 warble2 kernel: LustreError: 99416:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) Skipped 11 previous similar messages
/var/log/messages:Jun 11 11:19:47 warble2 kernel: Lustre: 332091:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (4/4), not sending early reply#012 req@ffff88bd37806450 x1602387771198864/t0(0) o35->25bba27b-216b-93df-faef-82aebfc8555e@192.168.44.13@o2ib44:536/0 lens 512/696 e 1 to 0 dl 1528679991 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages:Jun 11 11:19:47 warble2 kernel: Lustre: 332091:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message
/var/log/messages:Jun 11 11:20:07 warble2 kernel: Lustre: dagg-MDT0001: Client 25bba27b-216b-93df-faef-82aebfc8555e (at 192.168.44.13@o2ib44) reconnecting
/var/log/messages:Jun 11 11:20:07 warble2 kernel: Lustre: Skipped 3 previous similar messages
/var/log/messages:Jun 11 11:20:07 warble2 kernel: Lustre: dagg-MDT0001: Connection restored to 6a638310-9247-c5fa-6554-368ead9cd19b (at 192.168.44.13@o2ib44)
/var/log/messages:Jun 11 11:20:07 warble2 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages:Jun 11 11:20:50 warble2 kernel: LustreError: 115632:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528679750, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff884e9f190600/0xba8d504117633f21 lrc: 3/1,0 mode: --/PR res: [0x20001b981:0x13be5:0x0].0x0 bits 0x13 rrc: 28 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 115632 timeout: 0 lvb_type: 0
/var/log/messages:Jun 11 11:20:50 warble2 kernel: LustreError: 115632:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) Skipped 8 previous similar messages
/var/log/messages:Jun 11 11:21:29 warble2 kernel: Lustre: 332092:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-1), not sending early reply#012
req@ffff88be3f76da00 x1602387772489152/t0(0) o35->25bba27b-216b-93df-faef-82aebfc8555e@192.168.44.13@o2ib44:639/0 lens 512/696 e 0 to 0 dl 1528680094 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages:Jun 11 11:21:29 warble2 kernel: Lustre: 332092:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message
/var/log/messages:Jun 11 11:22:09 warble2 kernel: Lustre: dagg-MDT0000: Client cd64b7c0-80dd-c162-351e-19fada851182 (at 192.168.44.172@o2ib44) reconnecting
/var/log/messages:Jun 11 11:22:09 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to (at 192.168.44.172@o2ib44)
/var/log/messages:Jun 11 11:22:34 warble2 kernel: LNet: Service thread pid 115576 was inactive for 763.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages:Jun 11 11:22:34 warble2 kernel: LNet: Skipped 1 previous similar message
/var/log/messages:Jun 11 11:22:34 warble2 kernel: Pid: 115576, comm: mdt_rdpg01_014
/var/log/messages:Jun 11 11:22:34 warble2 kernel: #012Call Trace:
/var/log/messages:Jun 11 11:22:34 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages:Jun 11 11:22:34 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages:Jun 11 11:22:34 warble2 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc]
/var/log/messages:Jun 11 11:22:34 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages:Jun 11 11:22:34 warble2 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc]
/var/log/messages:Jun 11 11:22:34 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages:Jun 11 11:22:34 warble2 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc]
/var/log/messages:Jun 11 11:22:34 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages:Jun 11 11:22:34 warble2 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd]
/var/log/messages:Jun 11 11:22:34 warble2 kernel: [] ?
lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages:Jun 11 11:22:34 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages:Jun 11 11:22:34 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages:Jun 11 11:22:34 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages:Jun 11 11:22:34 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages:Jun 11 11:22:34 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages:Jun 11 11:22:34 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages:Jun 11 11:22:34 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages:Jun 11 11:22:34 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages:Jun 11 11:22:34 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 11:22:34 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 11:22:34 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages:Jun 11 11:22:34 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 11:22:34 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages:Jun 11 11:22:34 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 11:22:34 warble2 kernel:
/var/log/messages:Jun 11 11:22:34 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528680154.115576
/var/log/messages:Jun 11 11:22:58 warble2 kernel: Lustre: dagg-MDT0000: Client 367a0081-aa63-061e-f57b-0ed186deb6bf (at 192.168.44.109@o2ib44) reconnecting
/var/log/messages:Jun 11 11:22:58 warble2 kernel: Lustre: Skipped 2 previous similar messages
/var/log/messages:Jun 11 11:23:08 warble2 kernel: LNet: Service thread pid 72910 was inactive for 782.86s. The thread might be hung, or it might only be slow and will resume later.
Dumping the stack trace for debugging purposes:
/var/log/messages:Jun 11 11:23:08 warble2 kernel: Pid: 72910, comm: mdt01_083
/var/log/messages:Jun 11 11:23:08 warble2 kernel: #012Call Trace:
/var/log/messages:Jun 11 11:23:08 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages:Jun 11 11:23:08 warble2 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages:Jun 11 11:23:08 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages:Jun 11 11:23:08 warble2 kernel: [] ldlm_completion_ast+0x63d/0x920 [ptlrpc]
/var/log/messages:Jun 11 11:23:08 warble2 kernel: [] ? ldlm_resource_add_lock+0x6a/0x1b0 [ptlrpc]
/var/log/messages:Jun 11 11:23:08 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages:Jun 11 11:23:08 warble2 kernel: [] ldlm_cli_enqueue_fini+0x93b/0xdc0 [ptlrpc]
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass]
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] ldlm_cli_enqueue+0x6c2/0x810 [ptlrpc]
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] ? mdt_remote_blocking_ast+0x0/0x590 [mdt]
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] osp_md_object_lock+0x172/0x2e0 [osp]
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] lod_object_lock+0xf3/0x950 [lod]
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] ? htable_lookup+0xa9/0x180 [obdclass]
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] mdd_object_lock+0x3e/0xe0 [mdd]
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] mdt_remote_object_lock+0x1e5/0x710 [mdt]
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x36a/0x860 [mdt]
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] ?
strlcpy+0x42/0x60
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] mdt_reint_rename+0x13/0x20 [mdt]
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] mdt_reint_rec+0x83/0x210 [mdt]
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages:Jun 11 11:23:09 warble2 kernel: [] ?
kthread+0x0/0xe0
/var/log/messages:Jun 11 11:23:09 warble2 kernel:
/var/log/messages:Jun 11 11:23:09 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528680189.72910
/var/log/messages:Jun 11 11:23:34 warble2 kernel: Lustre: dagg-MDT0000: Client df25b3be-8ce8-27f2-a298-c024eac58e9b (at 192.168.44.14@o2ib44) reconnecting
/var/log/messages:Jun 11 11:23:34 warble2 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages:Jun 11 11:23:53 warble2 kernel: Lustre: 73157:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-76), not sending early reply#012 req@ffff88b3a6824200 x1602890032484512/t0(0) o101->367a0081-aa63-061e-f57b-0ed186deb6bf@192.168.44.109@o2ib44:28/0 lens 696/3384 e 0 to 0 dl 1528680238 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages:Jun 11 11:23:53 warble2 kernel: Lustre: 73157:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 14 previous similar messages
/var/log/messages:Jun 11 11:24:47 warble2 kernel: Lustre: dagg-MDT0000: Client 9a38fab3-92ab-83a9-6b7a-438d8568dee6 (at 192.168.44.193@o2ib44) reconnecting
/var/log/messages:Jun 11 11:24:47 warble2 kernel: Lustre: Skipped 6 previous similar messages
/var/log/messages:Jun 11 11:24:47 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to c3fbd872-d6b1-403c-47dd-989a5553f65f (at 192.168.44.193@o2ib44)
/var/log/messages:Jun 11 11:24:47 warble2 kernel: Lustre: Skipped 11 previous similar messages
/var/log/messages:Jun 11 11:27:13 warble2 kernel: Lustre: dagg-MDT0000: Client 08af1fb0-f487-6bab-3334-c1efe3681813 (at 192.168.44.111@o2ib44) reconnecting
/var/log/messages:Jun 11 11:27:13 warble2 kernel: Lustre: Skipped 11 previous similar messages
/var/log/messages:Jun 11 11:27:31 warble2 kernel: LNet: Service thread pid 331637 was inactive for 962.47s. The thread might be hung, or it might only be slow and will resume later.
Dumping the stack trace for debugging purposes:
/var/log/messages:Jun 11 11:27:31 warble2 kernel: Pid: 331637, comm: mdt_rdpg01_042
/var/log/messages:Jun 11 11:27:31 warble2 kernel: #012Call Trace:
/var/log/messages:Jun 11 11:27:31 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages:Jun 11 11:27:31 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages:Jun 11 11:27:31 warble2 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc]
/var/log/messages:Jun 11 11:27:31 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages:Jun 11 11:27:31 warble2 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc]
/var/log/messages:Jun 11 11:27:31 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages:Jun 11 11:27:31 warble2 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc]
/var/log/messages:Jun 11 11:27:31 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages:Jun 11 11:27:31 warble2 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd]
/var/log/messages:Jun 11 11:27:31 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages:Jun 11 11:27:31 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages:Jun 11 11:27:31 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages:Jun 11 11:27:31 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages:Jun 11 11:27:31 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages:Jun 11 11:27:31 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages:Jun 11 11:27:31 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages:Jun 11 11:27:31 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages:Jun 11 11:27:31 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages:Jun 11 11:27:31 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 11:27:31 warble2 kernel: [] ?
ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 11:27:31 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages:Jun 11 11:27:31 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 11:27:31 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages:Jun 11 11:27:31 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 11:27:31 warble2 kernel:
/var/log/messages:Jun 11 11:27:31 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528680451.331637
/var/log/messages:Jun 11 11:28:12 warble2 kernel: LNet: Service thread pid 87736 was inactive for 980.84s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages:Jun 11 11:28:12 warble2 kernel: Pid: 87736, comm: mdt01_005
/var/log/messages:Jun 11 11:28:12 warble2 kernel: #012Call Trace:
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] ldlm_completion_ast+0x63d/0x920 [ptlrpc]
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] ldlm_cli_enqueue_local+0x233/0x860 [ptlrpc]
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] ? null_alloc_rs+0x15d/0x330 [ptlrpc]
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] mdt_object_local_lock+0x512/0xaf0 [mdt]
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] ?
lustre_msg_buf+0x0/0x60 [ptlrpc]
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] mdt_object_lock_internal+0x5e/0x300 [mdt]
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] mdt_getattr_name_lock+0x8a4/0x1910 [mdt]
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] ? lustre_msg_get_flags+0x2c/0xa0 [ptlrpc]
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] mdt_intent_getattr+0x2b0/0x480 [mdt]
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] mdt_intent_policy+0x441/0xc70 [mdt]
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc]
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc]
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc]
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc]
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] ?
kthread+0x0/0xe0
/var/log/messages:Jun 11 11:28:12 warble2 kernel:
/var/log/messages:Jun 11 11:28:12 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528680492.87736
/var/log/messages:Jun 11 11:28:12 warble2 kernel: Pid: 93076, comm: mdt01_013
/var/log/messages:Jun 11 11:28:12 warble2 kernel: #012Call Trace:
/var/log/messages:Jun 11 11:28:12 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
--
/var/log/messages:Jun 11 11:28:13 warble2 kernel: [] mdt_getattr_name_lock+0x8a4/0x1910 [mdt]
/var/log/messages:Jun 11 11:28:13 warble2 kernel: [] ? lustre_msg_get_flags+0x2c/0xa0 [ptlrpc]
/var/log/messages:Jun 11 11:28:13 warble2 kernel: [] mdt_intent_getattr+0x2b0/0x480 [mdt]
/var/log/messages:Jun 11 11:28:13 warble2 kernel: [] mdt_intent_policy+0x441/0xc70 [mdt]
/var/log/messages:Jun 11 11:28:13 warble2 kernel: [] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc]
/var/log/messages:Jun 11 11:28:13 warble2 kernel: [] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc]
/var/log/messages:Jun 11 11:28:13 warble2 kernel: [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc]
/var/log/messages:Jun 11 11:28:13 warble2 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc]
/var/log/messages:Jun 11 11:28:13 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages:Jun 11 11:28:13 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages:Jun 11 11:28:13 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages:Jun 11 11:28:13 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages:Jun 11 11:28:13 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages:Jun 11 11:28:13 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 11:28:13 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 11:28:13 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages:Jun 11 11:28:13 warble2 kernel: [] ?
kthread+0x0/0xe0
/var/log/messages:Jun 11 11:28:13 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages:Jun 11 11:28:13 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 11:28:13 warble2 kernel:
/var/log/messages:Jun 11 11:28:13 warble2 kernel: LNet: Service thread pid 94133 was inactive for 981.96s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
/var/log/messages:Jun 11 11:28:13 warble2 kernel: LNet: Skipped 1 previous similar message
/var/log/messages:Jun 11 11:28:20 warble2 kernel: Lustre: 167864:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff884ec9e3bf00 x1602890031796144/t0(0) o101->126c4513-d6fb-cb1d-e183-abb4a191896f@192.168.44.156@o2ib44:295/0 lens 696/3384 e 0 to 0 dl 1528680505 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages:Jun 11 11:28:20 warble2 kernel: Lustre: 167864:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 101 previous similar messages
/var/log/messages:Jun 11 11:28:24 warble2 kernel: LNet: Service thread pid 318398 was inactive for 993.32s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
/var/log/messages:Jun 11 11:28:24 warble2 kernel: LNet: Skipped 1 previous similar message
/var/log/messages:Jun 11 11:28:24 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528680504.318398
/var/log/messages:Jun 11 11:29:21 warble2 kernel: LNet: Service thread pid 76557 was inactive for 1034.46s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
/var/log/messages:Jun 11 11:29:21 warble2 kernel: LNet: Skipped 2 previous similar messages
/var/log/messages:Jun 11 11:29:21 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528680561.76557
/var/log/messages:Jun 11 11:29:29 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528680569.320193
/var/log/messages:Jun 11 11:29:54 warble2 kernel: LNet: Service thread pid 141412 was inactive for 1063.44s.
Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
/var/log/messages:Jun 11 11:29:54 warble2 kernel: LNet: Skipped 2 previous similar messages
/var/log/messages:Jun 11 11:29:54 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528680594.141412
/var/log/messages:Jun 11 11:30:08 warble2 kernel: Lustre: dagg-MDT0001: Connection restored to 6a638310-9247-c5fa-6554-368ead9cd19b (at 192.168.44.13@o2ib44)
/var/log/messages:Jun 11 11:30:08 warble2 kernel: Lustre: Skipped 17 previous similar messages
/var/log/messages:Jun 11 11:30:39 warble2 kernel: LNet: Service thread pid 115668 was inactive for 1082.33s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
/var/log/messages:Jun 11 11:30:39 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528680639.115668
/var/log/messages:Jun 11 11:31:49 warble2 kernel: LNet: Service thread pid 269114 was inactive for 1134.11s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
/var/log/messages:Jun 11 11:31:49 warble2 kernel: LNet: Skipped 1 previous similar message
/var/log/messages:Jun 11 11:31:49 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528680709.269114
/var/log/messages:Jun 11 11:31:57 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528680717.101528
/var/log/messages:Jun 11 11:32:01 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528680721.147521
/var/log/messages:Jun 11 11:32:09 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528680729.303838
/var/log/messages:Jun 11 11:32:27 warble2 kernel: Lustre: dagg-MDT0000: Client cd64b7c0-80dd-c162-351e-19fada851182 (at 192.168.44.172@o2ib44) reconnecting
/var/log/messages:Jun 11 11:32:27 warble2 kernel: Lustre: Skipped 6 previous similar messages
/var/log/messages:Jun 11 11:33:47 warble2 kernel: LNet: Service thread pid 91540 was inactive for 1200.84s. The thread might be hung, or it might only be slow and will resume later.
Dumping the stack trace for debugging purposes:
/var/log/messages:Jun 11 11:33:47 warble2 kernel: LNet: Skipped 4 previous similar messages
/var/log/messages:Jun 11 11:33:47 warble2 kernel: Pid: 91540, comm: mdt_rdpg01_004
/var/log/messages:Jun 11 11:33:47 warble2 kernel: #012Call Trace:
/var/log/messages:Jun 11 11:33:47 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages:Jun 11 11:33:47 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages:Jun 11 11:33:47 warble2 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc]
/var/log/messages:Jun 11 11:33:47 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages:Jun 11 11:33:47 warble2 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc]
/var/log/messages:Jun 11 11:33:47 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages:Jun 11 11:33:47 warble2 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc]
/var/log/messages:Jun 11 11:33:47 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages:Jun 11 11:33:47 warble2 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd]
/var/log/messages:Jun 11 11:33:48 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
/var/log/messages:Jun 11 11:33:48 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt]
/var/log/messages:Jun 11 11:33:48 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt]
/var/log/messages:Jun 11 11:33:48 warble2 kernel: [] mdt_close+0x220/0x780 [mdt]
/var/log/messages:Jun 11 11:33:48 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages:Jun 11 11:33:48 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages:Jun 11 11:33:48 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages:Jun 11 11:33:48 warble2 kernel: [] ? _raw_spin_unlock_irqrestore+0x15/0x20
/var/log/messages:Jun 11 11:33:48 warble2 kernel: [] ? queued_spin_lock_slowpath+0xb/0xf
/var/log/messages:Jun 11 11:33:48 warble2 kernel: [] ?
_raw_spin_lock+0x20/0x30
/var/log/messages:Jun 11 11:33:48 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 11:33:48 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 11:33:48 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages:Jun 11 11:33:48 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 11:33:48 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages:Jun 11 11:33:48 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 11:33:48 warble2 kernel:
/var/log/messages:Jun 11 11:33:48 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528680828.91540
/var/log/messages:Jun 11 11:33:52 warble2 kernel: Pid: 303835, comm: mdt00_060
/var/log/messages:Jun 11 11:33:52 warble2 kernel: #012Call Trace:
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] ldlm_completion_ast+0x63d/0x920 [ptlrpc]
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] ldlm_cli_enqueue_local+0x233/0x860 [ptlrpc]
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] ? null_alloc_rs+0x15d/0x330 [ptlrpc]
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
--
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] ?
lustre_msg_get_flags+0x2c/0xa0 [ptlrpc]
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] mdt_intent_getattr+0x2b0/0x480 [mdt]
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] mdt_intent_policy+0x441/0xc70 [mdt]
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc]
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc]
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc]
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc]
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages:Jun 11 11:33:52 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 11:33:52 warble2 kernel:
/var/log/messages:Jun 11 11:33:52 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528680832.303835
/var/log/messages:Jun 11 11:33:56 warble2 kernel: LNet: Service thread pid 170779 was inactive for 1203.93s. The thread might be hung, or it might only be slow and will resume later.
Dumping the stack trace for debugging purposes: /var/log/messages:Jun 11 11:33:56 warble2 kernel: LNet: Skipped 1 previous similar message /var/log/messages:Jun 11 11:33:56 warble2 kernel: Pid: 170779, comm: mdt00_044 /var/log/messages:Jun 11 11:33:56 warble2 kernel: #012Call Trace: /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] schedule+0x29/0x70 /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ldlm_completion_ast+0x63d/0x920 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ldlm_cli_enqueue_local+0x233/0x860 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ? null_alloc_rs+0x15d/0x330 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] mdt_object_local_lock+0x512/0xaf0 [mdt] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ? lustre_msg_buf+0x0/0x60 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] mdt_object_lock_internal+0x5e/0x300 [mdt] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] mdt_getattr_name_lock+0x8a4/0x1910 [mdt] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ? 
lustre_msg_get_flags+0x2c/0xa0 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] mdt_intent_getattr+0x2b0/0x480 [mdt] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] mdt_intent_policy+0x441/0xc70 [mdt] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] kthread+0xd1/0xe0 /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages:Jun 11 11:33:56 warble2 kernel: /var/log/messages:Jun 11 11:33:56 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528680836.170779 /var/log/messages:Jun 11 11:33:56 warble2 kernel: Pid: 87731, comm: mdt01_004 /var/log/messages:Jun 11 11:33:56 warble2 kernel: #012Call Trace: -- /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] mdt_getattr_name_lock+0x8a4/0x1910 [mdt] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ? 
lustre_msg_get_flags+0x2c/0xa0 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] mdt_intent_getattr+0x2b0/0x480 [mdt] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] mdt_intent_policy+0x441/0xc70 [mdt] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] kthread+0xd1/0xe0 /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages:Jun 11 11:33:56 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages:Jun 11 11:33:56 warble2 kernel: /var/log/messages:Jun 11 11:34:00 warble2 kernel: LNet: Service thread pid 256637 was inactive for 1201.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
/var/log/messages:Jun 11 11:34:00 warble2 kernel: LNet: Skipped 80 previous similar messages /var/log/messages:Jun 11 11:34:00 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528680840.256637 /var/log/messages:Jun 11 11:34:37 warble2 kernel: LustreError: 386412:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528680577, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff885935891000/0xba8d5041190575f8 lrc: 3/1,0 mode: --/PR res: [0x20001b981:0x13be5:0x0].0x0 bits 0x13 rrc: 30 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 386412 timeout: 0 lvb_type: 0 /var/log/messages:Jun 11 11:34:37 warble2 kernel: LustreError: 386412:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) Skipped 10 previous similar messages /var/log/messages:Jun 11 11:34:41 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528680881.170781 /var/log/messages:Jun 11 11:34:53 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528680893.118113 /var/log/messages:Jun 11 11:35:22 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528680922.138435 /var/log/messages:Jun 11 11:35:54 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528680954.115632 /var/log/messages:Jun 11 11:36:11 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528680971.141411 /var/log/messages:Jun 11 11:36:52 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528681012.99422 /var/log/messages:Jun 11 11:36:56 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528681016.98993 /var/log/messages:Jun 11 11:38:09 warble2 kernel: Lustre: 114685:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff88be4e642400 x1602387776746352/t0(0) o35->25bba27b-216b-93df-faef-82aebfc8555e@192.168.44.13@o2ib44:129/0 lens 512/696 e 0 to 0 dl 1528681094 ref 2 fl Interpret:/0/0 rc 0/0 
/var/log/messages:Jun 11 11:38:09 warble2 kernel: Lustre: 114685:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 13 previous similar messages /var/log/messages:Jun 11 11:39:04 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to 9bb6882e-59d8-c9d9-ccac-dfb7f24444ca (at 192.168.44.137@o2ib44) /var/log/messages:Jun 11 11:39:04 warble2 kernel: Lustre: Skipped 22 previous similar messages /var/log/messages:Jun 11 11:40:37 warble2 kernel: LNet: Service thread pid 266841 was inactive for 1200.76s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages:Jun 11 11:40:37 warble2 kernel: LNet: Skipped 2 previous similar messages /var/log/messages:Jun 11 11:40:37 warble2 kernel: Pid: 266841, comm: mdt_rdpg00_034 /var/log/messages:Jun 11 11:40:37 warble2 kernel: #012Call Trace: /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] schedule+0x29/0x70 /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ? 
lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] kthread+0xd1/0xe0 /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages:Jun 11 11:40:37 warble2 kernel: /var/log/messages:Jun 11 11:40:37 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528681237.266841 /var/log/messages:Jun 11 11:40:37 warble2 kernel: Pid: 73926, comm: mdt_rdpg00_047 /var/log/messages:Jun 11 11:40:37 warble2 kernel: #012Call Trace: /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] schedule+0x29/0x70 /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ? 
default_wake_function+0x0/0x20 /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] kthread+0xd1/0xe0 /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages:Jun 11 11:40:37 warble2 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages:Jun 11 11:40:37 warble2 kernel: /var/log/messages:Jun 11 11:41:07 warble2 kernel: LustreError: 92992:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528680967, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff884e9d11f200/0xba8d504119412809 lrc: 3/1,0 mode: --/PR res: [0x20001b981:0x13be5:0x0].0x0 bits 0x13 rrc: 34 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 92992 timeout: 0 lvb_type: 0 /var/log/messages:Jun 11 11:42:37 warble2 kernel: LustreError: 318213:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528681057, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff884f0aef4c00/0xba8d50411960734a lrc: 3/1,0 mode: --/PR res: [0x20001b981:0x13be5:0x0].0x0 bits 0x13 rrc: 34 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 318213 timeout: 0 lvb_type: 0 /var/log/messages:Jun 11 11:42:45 warble2 kernel: Lustre: dagg-MDT0000: Client cd64b7c0-80dd-c162-351e-19fada851182 (at 192.168.44.172@o2ib44) reconnecting /var/log/messages:Jun 11 11:42:45 warble2 kernel: Lustre: Skipped 30 previous similar messages /var/log/messages:Jun 11 11:45:40 warble2 kernel: LNet: Service thread pid 115573 was inactive for 1201.01s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages:Jun 11 11:45:40 warble2 kernel: LNet: Skipped 1 previous similar message /var/log/messages:Jun 11 11:45:40 warble2 kernel: Pid: 115573, comm: mdt_rdpg01_011 /var/log/messages:Jun 11 11:45:40 warble2 kernel: #012Call Trace: /var/log/messages:Jun 11 11:45:40 warble2 kernel: [] ? 
recalc_sigpending+0x1b/0x50 /var/log/messages:Jun 11 11:45:40 warble2 kernel: [] schedule+0x29/0x70 /var/log/messages:Jun 11 11:45:40 warble2 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc] /var/log/messages:Jun 11 11:45:40 warble2 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages:Jun 11 11:45:40 warble2 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc] /var/log/messages:Jun 11 11:45:40 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod] /var/log/messages:Jun 11 11:45:40 warble2 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc] /var/log/messages:Jun 11 11:45:40 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd] /var/log/messages:Jun 11 11:45:40 warble2 kernel: [] mdd_attr_set+0x5eb/0xce0 [mdd] /var/log/messages:Jun 11 11:45:40 warble2 kernel: [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] /var/log/messages:Jun 11 11:45:40 warble2 kernel: [] mdt_mfd_close+0x1a6/0x610 [mdt] /var/log/messages:Jun 11 11:45:40 warble2 kernel: [] mdt_close_internal+0x121/0x220 [mdt] /var/log/messages:Jun 11 11:45:40 warble2 kernel: [] mdt_close+0x220/0x780 [mdt] /var/log/messages:Jun 11 11:45:40 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages:Jun 11 11:45:40 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages:Jun 11 11:45:40 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages:Jun 11 11:45:40 warble2 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages:Jun 11 11:45:40 warble2 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages:Jun 11 11:45:40 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages:Jun 11 11:45:40 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages:Jun 11 11:45:40 warble2 kernel: [] kthread+0xd1/0xe0 /var/log/messages:Jun 11 11:45:40 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages:Jun 11 11:45:40 warble2 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages:Jun 11 11:45:40 warble2 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages:Jun 11 11:45:40 warble2 kernel: /var/log/messages:Jun 11 11:45:40 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528681540.115573 /var/log/messages:Jun 11 11:48:03 warble2 systemd: Starting Cleanup of Temporary Directories... /var/log/messages:Jun 11 11:48:03 warble2 systemd: Started Cleanup of Temporary Directories. /var/log/messages:Jun 11 11:48:37 warble2 kernel: Lustre: 74569:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff884ee752c800 x1602890033685264/t0(0) o101->126c4513-d6fb-cb1d-e183-abb4a191896f@192.168.44.156@o2ib44:2/0 lens 696/3384 e 0 to 0 dl 1528681722 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages:Jun 11 11:48:37 warble2 kernel: Lustre: 74569:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message /var/log/messages:Jun 11 11:49:32 warble2 kernel: LNet: Service thread pid 252567 was inactive for 200.52s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: /var/log/messages:Jun 11 11:49:32 warble2 kernel: Pid: 252567, comm: mdt01_069 /var/log/messages:Jun 11 11:49:32 warble2 kernel: #012Call Trace: /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] schedule+0x29/0x70 /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] schedule_timeout+0x174/0x2c0 /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] ? process_timeout+0x0/0x10 /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc] /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc] /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] ? 
ldlm_resource_add_lock+0x6a/0x1b0 [ptlrpc] /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] ldlm_cli_enqueue_fini+0x93b/0xdc0 [ptlrpc] /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass] /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] ldlm_cli_enqueue+0x6c2/0x810 [ptlrpc] /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] ? mdt_remote_blocking_ast+0x0/0x590 [mdt] /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] osp_md_object_lock+0x172/0x2e0 [osp] /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] lod_object_lock+0xf3/0x950 [lod] /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] ? htable_lookup+0xa9/0x180 [obdclass] /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] mdd_object_lock+0x3e/0xe0 [mdd] /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] mdt_remote_object_lock+0x1e5/0x710 [mdt] /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x36a/0x860 [mdt] /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] ? strlcpy+0x42/0x60 /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] mdt_reint_rename+0x13/0x20 [mdt] /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] mdt_reint_rec+0x83/0x210 [mdt] /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt] /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] mdt_reint+0x67/0x140 [mdt] /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] ? 
__wake_up_common+0x5b/0x90 /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] kthread+0xd1/0xe0 /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages:Jun 11 11:49:32 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages:Jun 11 11:49:32 warble2 kernel: /var/log/messages:Jun 11 11:49:32 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528681772.252567 -- /var/log/messages:Jun 11 11:49:38 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages:Jun 11 11:49:38 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages:Jun 11 11:49:38 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages:Jun 11 11:49:38 warble2 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages:Jun 11 11:49:38 warble2 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages:Jun 11 11:49:38 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages:Jun 11 11:49:38 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages:Jun 11 11:49:38 warble2 kernel: [] kthread+0xd1/0xe0 /var/log/messages:Jun 11 11:49:38 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages:Jun 11 11:49:38 warble2 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages:Jun 11 11:49:38 warble2 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages:Jun 11 11:49:38 warble2 kernel: /var/log/messages:Jun 11 11:49:38 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528681778.386412 /var/log/messages:Jun 11 11:50:10 warble2 kernel: Lustre: dagg-MDT0001: Connection restored to 6a638310-9247-c5fa-6554-368ead9cd19b (at 192.168.44.13@o2ib44) /var/log/messages:Jun 11 11:50:10 warble2 kernel: Lustre: Skipped 30 previous similar messages /var/log/messages:Jun 11 11:51:11 warble2 kernel: Lustre: dagg-MDT0000-osp-MDT0002: Connection to dagg-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete /var/log/messages:Jun 11 11:51:11 warble2 kernel: LustreError: 252567:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528681571, 300s ago), entering recovery for dagg-MDT0000_UUID@192.168.44.22@o2ib44 ns: dagg-MDT0000-osp-MDT0002 lock: ffff88be231d6400/0xba8d504119b814ec lrc: 4/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 4 type: IBT flags: 0x1000001000000 nid: local remote: 0xba8d504119b814f3 expref: -99 pid: 252567 timeout: 0 lvb_type: 0 /var/log/messages:Jun 11 11:51:11 warble2 kernel: Lustre: dagg-MDT0000: Received new LWP connection from 0@lo, removing former export from same NID /var/log/messages:Jun 11 11:53:03 warble2 kernel: Lustre: dagg-MDT0000: Client cd64b7c0-80dd-c162-351e-19fada851182 (at 192.168.44.172@o2ib44) reconnecting /var/log/messages:Jun 11 11:53:03 warble2 kernel: Lustre: Skipped 30 previous similar messages /var/log/messages:Jun 11 11:56:11 warble2 kernel: LNet: Service thread pid 92992 was inactive for 1203.85s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: /var/log/messages:Jun 11 11:56:11 warble2 kernel: LNet: Skipped 1 previous similar message /var/log/messages:Jun 11 11:56:11 warble2 kernel: Pid: 92992, comm: mdt00_008 /var/log/messages:Jun 11 11:56:11 warble2 kernel: #012Call Trace: /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50 /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc] /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] schedule+0x29/0x70 /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] ldlm_completion_ast+0x63d/0x920 [ptlrpc] /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] ? default_wake_function+0x0/0x20 /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] ldlm_cli_enqueue_local+0x233/0x860 [ptlrpc] /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] ? null_alloc_rs+0x15d/0x330 [ptlrpc] /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt] /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] mdt_object_local_lock+0x512/0xaf0 [mdt] /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt] /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] ? lustre_msg_buf+0x0/0x60 [ptlrpc] /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] mdt_object_lock_internal+0x5e/0x300 [mdt] /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] mdt_getattr_name_lock+0x8a4/0x1910 [mdt] /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] ? 
lustre_msg_get_flags+0x2c/0xa0 [ptlrpc] /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] mdt_intent_getattr+0x2b0/0x480 [mdt] /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] mdt_intent_policy+0x441/0xc70 [mdt] /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc] /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc] /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc] /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc] /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc] /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] ? default_wake_function+0x12/0x20 /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] ? __wake_up_common+0x5b/0x90 /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] kthread+0xd1/0xe0 /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] ? kthread+0x0/0xe0 /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] ret_from_fork+0x5d/0xb0 /var/log/messages:Jun 11 11:56:11 warble2 kernel: [] ? 
kthread+0x0/0xe0 /var/log/messages:Jun 11 11:56:11 warble2 kernel: /var/log/messages:Jun 11 11:56:11 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528682171.92992 /var/log/messages:Jun 11 11:57:13 warble2 kernel: Pid: 115575, comm: mdt_rdpg01_013 /var/log/messages:Jun 11 11:57:13 warble2 kernel: #012Call Trace: -- /var/log/messages:Jun 11 12:03:21 warble2 kernel: Lustre: dagg-MDT0000: Client cd64b7c0-80dd-c162-351e-19fada851182 (at 192.168.44.172@o2ib44) reconnecting /var/log/messages:Jun 11 12:03:21 warble2 kernel: Lustre: Skipped 23 previous similar messages /var/log/messages:Jun 11 12:03:47 warble2 kernel: Lustre: 331638:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88be4e2b8c00 x1602387783395296/t0(0) o35->25bba27b-216b-93df-faef-82aebfc8555e@192.168.44.13@o2ib44:157/0 lens 512/696 e 24 to 0 dl 1528682632 ref 2 fl Interpret:/0/0 rc 0/0 /var/log/messages:Jun 11 12:03:47 warble2 kernel: Lustre: 331638:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 2 previous similar messages /var/log/messages:Jun 11 12:10:12 warble2 kernel: Lustre: dagg-MDT0001: Connection restored to 6a638310-9247-c5fa-6554-368ead9cd19b (at 192.168.44.13@o2ib44) /var/log/messages:Jun 11 12:10:12 warble2 kernel: Lustre: Skipped 30 previous similar messages /var/log/messages:Jun 11 12:13:39 warble2 kernel: Lustre: dagg-MDT0000: Client df25b3be-8ce8-27f2-a298-c024eac58e9b (at 192.168.44.14@o2ib44) reconnecting /var/log/messages:Jun 11 12:13:39 warble2 kernel: Lustre: Skipped 30 previous similar messages /var/log/messages:Jun 11 12:20:13 warble2 kernel: Lustre: dagg-MDT0001: Connection restored to 6a638310-9247-c5fa-6554-368ead9cd19b (at 192.168.44.13@o2ib44) /var/log/messages:Jun 11 12:20:13 warble2 kernel: Lustre: Skipped 30 previous similar messages /var/log/messages:Jun 11 12:23:40 warble2 kernel: Lustre: dagg-MDT0000: Client df25b3be-8ce8-27f2-a298-c024eac58e9b (at 192.168.44.14@o2ib44) reconnecting 
/var/log/messages:Jun 11 12:23:40 warble2 kernel: Lustre: Skipped 20 previous similar messages
/var/log/messages:Jun 11 12:30:13 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to (at 192.168.44.114@o2ib44)
/var/log/messages:Jun 11 12:30:13 warble2 kernel: Lustre: Skipped 25 previous similar messages
/var/log/messages:Jun 11 12:33:41 warble2 kernel: Lustre: dagg-MDT0000: Client df25b3be-8ce8-27f2-a298-c024eac58e9b (at 192.168.44.14@o2ib44) reconnecting
/var/log/messages:Jun 11 12:33:41 warble2 kernel: Lustre: Skipped 30 previous similar messages
/var/log/messages:Jun 11 12:40:15 warble2 kernel: Lustre: dagg-MDT0001: Connection restored to 6a638310-9247-c5fa-6554-368ead9cd19b (at 192.168.44.13@o2ib44)
/var/log/messages:Jun 11 12:40:15 warble2 kernel: Lustre: Skipped 27 previous similar messages
/var/log/messages:Jun 11 12:43:42 warble2 kernel: Lustre: dagg-MDT0000: Client df25b3be-8ce8-27f2-a298-c024eac58e9b (at 192.168.44.14@o2ib44) reconnecting
/var/log/messages:Jun 11 12:43:42 warble2 kernel: Lustre: Skipped 30 previous similar messages
/var/log/messages:Jun 11 12:44:36 warble2 kernel: LNet: Service thread pid 115976 was inactive for 200.71s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages:Jun 11 12:44:36 warble2 kernel: LNet: Skipped 2 previous similar messages
/var/log/messages:Jun 11 12:44:36 warble2 kernel: Pid: 115976, comm: mdt01_025
/var/log/messages:Jun 11 12:44:36 warble2 kernel: #012Call Trace:
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] ? hfi1_verbs_send_dma+0x0/0x740 [hfi1]
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] schedule_timeout+0x174/0x2c0
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] ? process_timeout+0x0/0x10
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc]
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] ldlm_cli_enqueue_local+0x233/0x860 [ptlrpc]
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] ? null_alloc_rs+0x15d/0x330 [ptlrpc]
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] mdt_object_local_lock+0x512/0xaf0 [mdt]
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] ? lustre_msg_buf+0x0/0x60 [ptlrpc]
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] mdt_object_lock_internal+0x5e/0x300 [mdt]
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] mdt_getattr_name_lock+0x8a4/0x1910 [mdt]
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] ? lustre_msg_get_flags+0x2c/0xa0 [ptlrpc]
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] mdt_intent_getattr+0x2b0/0x480 [mdt]
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] mdt_intent_policy+0x441/0xc70 [mdt]
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc]
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc]
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc]
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc]
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages:Jun 11 12:44:36 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 12:44:36 warble2 kernel:
/var/log/messages:Jun 11 12:44:36 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528685076.115976
/var/log/messages:Jun 11 12:46:15 warble2 kernel: LustreError: 115976:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528684875, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff88be53096c00/0xba8d50411b5abf4a lrc: 3/1,0 mode: --/PR res: [0x20001b981:0x13be5:0x0].0x0 bits 0x13 rrc: 36 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 115976 timeout: 0 lvb_type: 0
/var/log/messages:Jun 11 12:50:16 warble2 kernel: Lustre: dagg-MDT0001: Connection restored to 6a638310-9247-c5fa-6554-368ead9cd19b (at 192.168.44.13@o2ib44)
/var/log/messages:Jun 11 12:50:16 warble2 kernel: Lustre: Skipped 29 previous similar messages
/var/log/messages:Jun 11 12:51:10 warble2 kernel: Lustre: 157387:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88be51957b00 x1602890316413376/t0(0) o101->e9ec5d11-59ff-46d6-6ff9-12de4e5778ac@192.168.44.170@o2ib44:735/0 lens 696/3384 e 24 to 0 dl 1528685475 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages:Jun 11 12:53:43 warble2 kernel: Lustre: dagg-MDT0000: Client df25b3be-8ce8-27f2-a298-c024eac58e9b (at 192.168.44.14@o2ib44) reconnecting
/var/log/messages:Jun 11 12:53:43 warble2 kernel: Lustre: Skipped 23 previous similar messages
/var/log/messages:Jun 11 13:00:17 warble2 kernel: Lustre: dagg-MDT0001: Connection restored to 6a638310-9247-c5fa-6554-368ead9cd19b (at 192.168.44.13@o2ib44)
/var/log/messages:Jun 11 13:00:17 warble2 kernel: Lustre: Skipped 26 previous similar messages
/var/log/messages:Jun 11 13:03:44 warble2 kernel: Lustre: dagg-MDT0000: Client df25b3be-8ce8-27f2-a298-c024eac58e9b (at 192.168.44.14@o2ib44) reconnecting
/var/log/messages:Jun 11 13:03:44 warble2 kernel: Lustre: Skipped 31 previous similar messages
/var/log/messages:Jun 11 13:10:18 warble2 kernel: Lustre: dagg-MDT0001: Connection restored to 6a638310-9247-c5fa-6554-368ead9cd19b (at 192.168.44.13@o2ib44)
/var/log/messages:Jun 11 13:10:18 warble2 kernel: Lustre: Skipped 25 previous similar messages
/var/log/messages:Jun 11 13:13:45 warble2 kernel: Lustre: dagg-MDT0000: Client df25b3be-8ce8-27f2-a298-c024eac58e9b (at 192.168.44.14@o2ib44) reconnecting
/var/log/messages:Jun 11 13:13:45 warble2 kernel: Lustre: Skipped 31 previous similar messages
/var/log/messages:Jun 11 13:15:45 warble2 kernel: LNet: Service thread pid 138447 was inactive for 200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages:Jun 11 13:15:45 warble2 kernel: Pid: 138447, comm: mdt00_123
/var/log/messages:Jun 11 13:15:45 warble2 kernel: #012Call Trace:
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] ? hfi1_verbs_send_dma+0x0/0x740 [hfi1]
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] schedule_timeout+0x174/0x2c0
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] ? process_timeout+0x0/0x10
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc]
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] ldlm_cli_enqueue_local+0x233/0x860 [ptlrpc]
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] ? null_alloc_rs+0x15d/0x330 [ptlrpc]
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] mdt_object_local_lock+0x512/0xaf0 [mdt]
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] ? lustre_msg_buf+0x0/0x60 [ptlrpc]
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] mdt_object_lock_internal+0x5e/0x300 [mdt]
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] mdt_getattr_name_lock+0x8a4/0x1910 [mdt]
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] ? lustre_msg_get_flags+0x2c/0xa0 [ptlrpc]
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] mdt_intent_getattr+0x2b0/0x480 [mdt]
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] mdt_intent_policy+0x441/0xc70 [mdt]
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc]
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc]
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc]
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc]
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages:Jun 11 13:15:45 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 13:15:45 warble2 kernel:
/var/log/messages:Jun 11 13:15:45 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528686945.138447
/var/log/messages:Jun 11 13:17:25 warble2 kernel: LustreError: 138447:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528686745, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff884f52175200/0xba8d50411c4d757a lrc: 3/1,0 mode: --/PR res: [0x20001b982:0x13c11:0x0].0x0 bits 0x13 rrc: 219 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 138447 timeout: 0 lvb_type: 0
/var/log/messages:Jun 11 13:18:47 warble2 kernel: LNet: Service thread pid 45140 was inactive for 200.25s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages:Jun 11 13:18:47 warble2 kernel: Pid: 45140, comm: mdt01_115
/var/log/messages:Jun 11 13:18:47 warble2 kernel: #012Call Trace:
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] ? hfi1_verbs_send_dma+0x0/0x740 [hfi1]
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] schedule_timeout+0x174/0x2c0
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] ? process_timeout+0x0/0x10
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc]
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] ldlm_cli_enqueue_local+0x233/0x860 [ptlrpc]
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] ? number.isra.2+0x323/0x360
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] ? null_alloc_rs+0x15d/0x330 [ptlrpc]
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] mdt_object_local_lock+0x512/0xaf0 [mdt]
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] ? lustre_msg_buf+0x0/0x60 [ptlrpc]
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] mdt_object_lock_internal+0x5e/0x300 [mdt]
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] mdt_getattr_name_lock+0x8a4/0x1910 [mdt]
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] ? lustre_msg_get_flags+0x2c/0xa0 [ptlrpc]
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] mdt_intent_getattr+0x2b0/0x480 [mdt]
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] mdt_intent_policy+0x441/0xc70 [mdt]
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc]
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc]
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc]
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc]
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages:Jun 11 13:18:47 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 13:18:47 warble2 kernel:
/var/log/messages:Jun 11 13:18:47 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528687127.45140
/var/log/messages:Jun 11 13:20:19 warble2 kernel: Lustre: dagg-MDT0001: Connection restored to 6a638310-9247-c5fa-6554-368ead9cd19b (at 192.168.44.13@o2ib44)
/var/log/messages:Jun 11 13:20:19 warble2 kernel: Lustre: Skipped 24 previous similar messages
/var/log/messages:Jun 11 13:20:27 warble2 kernel: LustreError: 45140:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528686927, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff88b3a1f03e00/0xba8d50411c682a82 lrc: 3/1,0 mode: --/PR res: [0x20001b982:0x13c11:0x0].0x0 bits 0x13 rrc: 219 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 45140 timeout: 0 lvb_type: 0
/var/log/messages:Jun 11 13:21:45 warble2 kernel: LustreError: 76381:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528687005, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff885d27c8a800/0xba8d50411c79cd5f lrc: 3/1,0 mode: --/PR res: [0x20001b981:0x13be5:0x0].0x0 bits 0x13 rrc: 56 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 76381 timeout: 0 lvb_type: 0
/var/log/messages:Jun 11 13:22:20 warble2 kernel: Lustre: 101571:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff884f05189b00 x1602908580264608/t0(0) o101->f1c0f2d9-a00b-d3d6-7cb9-e5c5982a3268@192.168.44.124@o2ib44:340/0 lens 696/3384 e 24 to 0 dl 1528687345 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages:Jun 11 13:23:46 warble2 kernel: Lustre: dagg-MDT0000: Client df25b3be-8ce8-27f2-a298-c024eac58e9b (at 192.168.44.14@o2ib44) reconnecting
/var/log/messages:Jun 11 13:23:46 warble2 kernel: Lustre: Skipped 26 previous similar messages
/var/log/messages:Jun 11 13:25:22 warble2 kernel: Lustre: 94124:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff88be4e2bb300 x1602890033766112/t0(0) o101->cd64b7c0-80dd-c162-351e-19fada851182@192.168.44.172@o2ib44:522/0 lens 696/3384 e 24 to 0 dl 1528687527 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages:Jun 11 13:25:38 warble2 kernel: LustreError: 316708:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528687237, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff884efe436c00/0xba8d50411c8d819c lrc: 3/1,0 mode: --/PR res: [0x20001b981:0x13be5:0x0].0x0 bits 0x13 rrc: 64 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 316708 timeout: 0 lvb_type: 0
/var/log/messages:Jun 11 13:25:53 warble2 kernel: LustreError: 87481:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528687252, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff884f2d315800/0xba8d50411c8edc7e lrc: 3/1,0 mode: --/PR res: [0x20001b981:0x13be5:0x0].0x0 bits 0x13 rrc: 65 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 87481 timeout: 0 lvb_type: 0
/var/log/messages:Jun 11 13:25:53 warble2 kernel: LustreError: 115607:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528687252, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff884fc4947200/0xba8d50411c8edc8c lrc: 3/1,0 mode: --/PR res: [0x20001b981:0x13be5:0x0].0x0 bits 0x13 rrc: 64 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 115607 timeout: 0 lvb_type: 0
/var/log/messages:Jun 11 13:25:53 warble2 kernel: LustreError: 102253:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528687252, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff884e9d119c00/0xba8d50411c8edc85 lrc: 3/1,0 mode: --/PR res: [0x20001b981:0x13be5:0x0].0x0 bits 0x13 rrc: 67 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 102253 timeout: 0 lvb_type: 0
/var/log/messages:Jun 11 13:25:53 warble2 kernel: LustreError: 101560:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528687252, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff885e049b9800/0xba8d50411c8edc77 lrc: 3/1,0 mode: --/PR res: [0x20001b981:0x13be5:0x0].0x0 bits 0x13 rrc: 68 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 101560 timeout: 0 lvb_type: 0
/var/log/messages:Jun 11 13:25:53 warble2 kernel: LustreError: 101561:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528687252, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff884ecc6ff400/0xba8d50411c8edc69 lrc: 3/1,0 mode: --/PR res: [0x20001b981:0x13be5:0x0].0x0 bits 0x13 rrc: 67 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 101561 timeout: 0 lvb_type: 0
/var/log/messages:Jun 11 13:25:53 warble2 kernel: LustreError: 115607:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) Skipped 2 previous similar messages
/var/log/messages:Jun 11 13:25:53 warble2 kernel: LustreError: 102253:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) Skipped 2 previous similar messages
/var/log/messages:Jun 11 13:25:53 warble2 kernel: LustreError: 101560:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) Skipped 2 previous similar messages
/var/log/messages:Jun 11 13:25:53 warble2 kernel: LustreError: 101561:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) Skipped 2 previous similar messages
/var/log/messages:Jun 11 13:26:07 warble2 kernel: LNet: Service thread pid 76381 was inactive for 562.31s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages:Jun 11 13:26:07 warble2 kernel: Pid: 76381, comm: mdt00_095
/var/log/messages:Jun 11 13:26:07 warble2 kernel: #012Call Trace:
/var/log/messages:Jun 11 13:26:07 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages:Jun 11 13:26:07 warble2 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages:Jun 11 13:26:07 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages:Jun 11 13:26:07 warble2 kernel: [] ldlm_completion_ast+0x63d/0x920 [ptlrpc]
/var/log/messages:Jun 11 13:26:07 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages:Jun 11 13:26:07 warble2 kernel: [] ldlm_cli_enqueue_local+0x233/0x860 [ptlrpc]
/var/log/messages:Jun 11 13:26:07 warble2 kernel: [] ? null_alloc_rs+0x15d/0x330 [ptlrpc]
/var/log/messages:Jun 11 13:26:07 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages:Jun 11 13:26:07 warble2 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages:Jun 11 13:26:07 warble2 kernel: [] mdt_object_local_lock+0x512/0xaf0 [mdt]
/var/log/messages:Jun 11 13:26:07 warble2 kernel: [] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
/var/log/messages:Jun 11 13:26:07 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages:Jun 11 13:26:07 warble2 kernel: [] ? lustre_msg_buf+0x0/0x60 [ptlrpc]
/var/log/messages:Jun 11 13:26:07 warble2 kernel: [] mdt_object_lock_internal+0x5e/0x300 [mdt]
/var/log/messages:Jun 11 13:26:07 warble2 kernel: [] mdt_getattr_name_lock+0x8a4/0x1910 [mdt]
/var/log/messages:Jun 11 13:26:08 warble2 kernel: [] ? lustre_msg_get_flags+0x2c/0xa0 [ptlrpc]
/var/log/messages:Jun 11 13:26:08 warble2 kernel: [] mdt_intent_getattr+0x2b0/0x480 [mdt]
/var/log/messages:Jun 11 13:26:08 warble2 kernel: [] mdt_intent_policy+0x441/0xc70 [mdt]
/var/log/messages:Jun 11 13:26:08 warble2 kernel: [] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc]
/var/log/messages:Jun 11 13:26:08 warble2 kernel: [] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc]
/var/log/messages:Jun 11 13:26:08 warble2 kernel: [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc]
/var/log/messages:Jun 11 13:26:08 warble2 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc]
/var/log/messages:Jun 11 13:26:08 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages:Jun 11 13:26:08 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages:Jun 11 13:26:08 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages:Jun 11 13:26:08 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages:Jun 11 13:26:08 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 13:26:08 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 13:26:08 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages:Jun 11 13:26:08 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 13:26:08 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages:Jun 11 13:26:08 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 13:26:08 warble2 kernel:
/var/log/messages:Jun 11 13:26:08 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528687568.76381
/var/log/messages:Jun 11 13:26:40 warble2 kernel: Lustre: 94112:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff884f20145100 x1602890331470208/t0(0) o101->6d464ccc-d282-ac16-affc-83bbcfa54656@192.168.44.127@o2ib44:600/0 lens 696/3384 e 1 to 0 dl 1528687605 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages:Jun 11 13:28:38 warble2 kernel: LustreError: 73154:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528687418, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff88be2d2c6200/0xba8d50411cb7ad32 lrc: 3/1,0 mode: --/PR res: [0x20001b982:0x13c11:0x0].0x0 bits 0x13 rrc: 229 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 73154 timeout: 0 lvb_type: 0
/var/log/messages:Jun 11 13:28:38 warble2 kernel: LustreError: 73154:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) Skipped 2 previous similar messages
/var/log/messages:Jun 11 13:30:20 warble2 kernel: Lustre: dagg-MDT0001: Connection restored to 6a638310-9247-c5fa-6554-368ead9cd19b (at 192.168.44.13@o2ib44)
--
/var/log/messages:Jun 11 13:46:31 warble2 kernel: Lustre: dagg-MDT0002: Will be in recovery for at least 5:00, or until 130 clients reconnect
/var/log/messages:Jun 11 13:46:35 warble2 kernel: LustreError: 25431:0:(tgt_handler.c:509:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff885e276ec850 x1602540922035072/t0(0) o601->dagg-MDT0000-lwp-OST0008_UUID@192.168.44.35@o2ib44:280/0 lens 336/0 e 0 to 0 dl 1528689550 ref 1 fl Interpret:/0/ffffffff rc 0/-1
/var/log/messages:Jun 11 13:46:35 warble2 kernel: LustreError: 25431:0:(tgt_handler.c:509:tgt_filter_recovery_request()) Skipped 8 previous similar messages
/var/log/messages:Jun 11 13:46:36 warble2 kernel: Lustre: 21371:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1528688791/real 0] req@ffff885dfc9bfb00 x1602946191090880/t0(0) o38->dagg-MDT0001-osp-MDT0002@192.168.44.21@o2ib44:24/4 lens 520/544 e 0 to 1 dl 1528688796 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
/var/log/messages:Jun 11 13:46:36 warble2 kernel: Lustre: 21371:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
/var/log/messages:Jun 11 13:46:39 warble2 kernel: Lustre: apps-MDT0000: Recovery over after 0:50, of 122 clients 122 recovered and 0 were evicted.
/var/log/messages:Jun 11 13:46:49 warble2 kernel: LustreError: 26253:0:(tgt_handler.c:509:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff885de8d96000 x1602543186500112/t0(0) o601->dagg-MDT0000-lwp-OST000e_UUID@192.168.44.38@o2ib44:294/0 lens 336/0 e 0 to 0 dl 1528689564 ref 1 fl Interpret:/0/ffffffff rc 0/-1
/var/log/messages:Jun 11 13:46:49 warble2 kernel: LustreError: 26253:0:(tgt_handler.c:509:tgt_filter_recovery_request()) Skipped 1 previous similar message
/var/log/messages:Jun 11 13:46:53 warble2 kernel: Lustre: 27499:0:(ldlm_lib.c:1773:extend_recovery_timer()) dagg-MDT0000: extended recovery timer reaching hard limit: 900, extend: 1
/var/log/messages:Jun 11 13:47:02 warble2 kernel: Lustre: 27499:0:(ldlm_lib.c:1773:extend_recovery_timer()) dagg-MDT0000: extended recovery timer reaching hard limit: 900, extend: 1
/var/log/messages:Jun 11 13:47:02 warble2 kernel: Lustre: 27499:0:(ldlm_lib.c:1773:extend_recovery_timer()) Skipped 1 previous similar message
/var/log/messages:Jun 11 13:47:02 warble2 kernel: Lustre: dagg-MDT0000: Recovery over after 1:00, of 130 clients 130 recovered and 0 were evicted.
/var/log/messages:Jun 11 13:47:10 warble2 kernel: Lustre: dagg-MDT0001: Recovery over after 0:56, of 130 clients 130 recovered and 0 were evicted.
/var/log/messages:Jun 11 13:47:29 warble2 kernel: Lustre: dagg-MDT0002: Recovery already passed deadline 13:40. If you do not want to wait more, please abort the recovery by force.
/var/log/messages:Jun 11 13:47:29 warble2 kernel: Lustre: Skipped 4 previous similar messages
/var/log/messages:Jun 11 13:47:29 warble2 kernel: Lustre: dagg-MDT0002: Recovery over after 0:58, of 130 clients 130 recovered and 0 were evicted.
/var/log/messages:Jun 11 13:53:41 warble2 systemd: Starting Cleanup of Temporary Directories...
/var/log/messages:Jun 11 13:53:41 warble2 systemd: Started Cleanup of Temporary Directories.
/var/log/messages:Jun 11 13:54:49 warble2 ntpd[14483]: 0.0.0.0 0612 02 freq_set kernel 68.833 PPM
/var/log/messages:Jun 11 13:54:49 warble2 ntpd[14483]: 0.0.0.0 0615 05 clock_sync
/var/log/messages:Jun 11 14:17:28 warble2 kernel: LNet: Service thread pid 22072 was inactive for 200.35s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages:Jun 11 14:17:28 warble2 kernel: Pid: 22072, comm: mdt00_001
/var/log/messages:Jun 11 14:17:28 warble2 kernel: #012Call Trace:
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] ? recalc_sigpending+0x1b/0x50
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] top_trans_wait_result+0xa6/0x155 [ptlrpc]
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] top_trans_stop+0x42b/0x930 [ptlrpc]
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] lod_trans_stop+0x259/0x340 [lod]
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] ? top_trans_start+0x27e/0x940 [ptlrpc]
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] mdd_trans_stop+0x2a/0x46 [mdd]
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] mdd_rename+0x4d1/0x14a0 [mdd]
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] mdt_reint_rename_internal.isra.36+0x166a/0x20c0 [mdt]
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x19b/0x860 [mdt]
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] mdt_reint_rename+0x13/0x20 [mdt]
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] mdt_reint_rec+0x83/0x210 [mdt]
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages:Jun 11 14:17:28 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 14:17:28 warble2 kernel:
/var/log/messages:Jun 11 14:17:28 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528690648.22072
/var/log/messages:Jun 11 14:24:03 warble2 kernel: Lustre: 25447:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply#012 req@ffff885dbec0dd00 x1602890006452832/t0(0) o36->8ae7f49e-c44d-405f-5a0a-11d57f42fbd3@192.168.44.122@o2ib44:268/0 lens 816/3128 e 24 to 0 dl 1528691048 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages:Jun 11 14:24:09 warble2 kernel: Lustre: dagg-MDT0000: Client 8ae7f49e-c44d-405f-5a0a-11d57f42fbd3 (at 192.168.44.122@o2ib44) reconnecting
/var/log/messages:Jun 11 14:24:09 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to 5ef15f3b-22ed-f4ad-4682-bf29b6d2a251 (at 192.168.44.122@o2ib44)
/var/log/messages:Jun 11 14:24:09 warble2 kernel: Lustre: Skipped 473 previous similar messages
/var/log/messages:Jun 11 14:31:58 warble2 kernel: Lustre: dagg-MDT0000-osp-MDT0002: Connection to dagg-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
/var/log/messages:Jun 11 14:31:58 warble2 kernel: LustreError: 22159:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528691218, 300s ago), entering recovery for dagg-MDT0000_UUID@192.168.44.22@o2ib44 ns: dagg-MDT0000-osp-MDT0002 lock: ffff885dd1dda000/0xaba15aa86689dda7 lrc: 4/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 2 type: IBT flags: 0x1000001000000 nid: local remote: 0xaba15aa86689ddae expref: -99 pid: 22159 timeout: 0 lvb_type: 0
/var/log/messages:Jun 11 14:31:58 warble2 kernel: Lustre: dagg-MDT0000: Received new LWP connection from 0@lo, removing former export from same NID
/var/log/messages:Jun 11 14:31:58 warble2 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages:Jun 11 14:31:58 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to 192.168.44.22@o2ib44 (at 0@lo)
/var/log/messages:Jun 11 14:33:47 warble2 kernel: LNet: Service thread pid 25416 was inactive for 200.16s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
/var/log/messages:Jun 11 14:33:47 warble2 kernel: Pid: 25416, comm: mdt01_011
/var/log/messages:Jun 11 14:33:47 warble2 kernel: #012Call Trace:
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass]
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] schedule+0x29/0x70
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] schedule_timeout+0x174/0x2c0
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] ? process_timeout+0x0/0x10
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] ? cfs_block_sigsinv+0x71/0xa0 [libcfs]
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] ldlm_completion_ast+0x5b1/0x920 [ptlrpc]
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] ? default_wake_function+0x0/0x20
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] ldlm_cli_enqueue_local+0x233/0x860 [ptlrpc]
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] mdt_reint_rename_or_migrate.isra.39+0x67c/0x860 [mdt]
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] mdt_reint_rename+0x13/0x20 [mdt]
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] mdt_reint_rec+0x83/0x210 [mdt]
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] mdt_reint_internal+0x5fb/0x9c0 [mdt]
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] mdt_reint+0x67/0x140 [mdt]
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] tgt_request_handle+0x92a/0x1370 [ptlrpc]
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] ? default_wake_function+0x12/0x20
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] ? __wake_up_common+0x5b/0x90
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] kthread+0xd1/0xe0
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] ret_from_fork+0x5d/0xb0
/var/log/messages:Jun 11 14:33:47 warble2 kernel: [] ? kthread+0x0/0xe0
/var/log/messages:Jun 11 14:33:47 warble2 kernel:
/var/log/messages:Jun 11 14:33:47 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528691627.25416
/var/log/messages:Jun 11 14:34:10 warble2 kernel: Lustre: dagg-MDT0000: Client 8ae7f49e-c44d-405f-5a0a-11d57f42fbd3 (at 192.168.44.122@o2ib44) reconnecting
/var/log/messages:Jun 11 14:34:10 warble2 kernel: Lustre: dagg-MDT0000: Connection restored to 5ef15f3b-22ed-f4ad-4682-bf29b6d2a251 (at 192.168.44.122@o2ib44)
/var/log/messages:Jun 11 14:34:10 warble2 kernel: Lustre: Skipped 1 previous similar message
/var/log/messages:Jun 11 14:35:26 warble2 kernel: LustreError: 25416:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528691426, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-dagg-MDT0000_UUID lock: ffff88bdc11bb400/0xaba15aa866a248e6 lrc: 3/0,1 mode: --/EX res: [0x200000004:0x1:0x0].0x0 bits 0x2 rrc: 4 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 25416 timeout: 0 lvb_type: 0
/var/log/messages:Jun 11 14:35:26 warble2 kernel: LustreError: dumping log to /tmp/lustre-log.1528691726.25416
/var/log/messages:Jun 11 14:39:28 warble2 kernel: Lustre: 25407:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply#012 req@ffff885db861cb00 x1602566395610928/t0(0) o36->df25b3be-8ce8-27f2-a298-c024eac58e9b@192.168.44.14@o2ib44:438/0 lens 760/3128 e 0 to 0 dl 1528691973 ref 2 fl Interpret:/0/0 rc 0/0
/var/log/messages:Jun 11 14:39:34 warble2 kernel: Lustre: dagg-MDT0002: Client df25b3be-8ce8-27f2-a298-c024eac58e9b (at 192.168.44.14@o2ib44) reconnecting
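The `ldlm_expired_completion_wait()` errors above recur with the same resources but different service-thread PIDs (115976, 138447, 45140, 76381, 73154, 25416, ...). A short script can tally these records per PID when triaging a log like this. This is an analyst's sketch, not part of the log; the regex is derived from the message format seen above, and the sample lines are abbreviated copies of real records from this excerpt.

```python
import re
from collections import Counter

# Matches the lock-timeout records emitted by ldlm_expired_completion_wait()
# in the log above. Assumption: the PID and enqueue time always appear in
# this position, as they do in every record in this excerpt.
TIMEOUT_RE = re.compile(
    r"LustreError: (?P<pid>\d+):0:\(ldlm_request\.c:\d+:"
    r"ldlm_expired_completion_wait\(\)\) ### lock timed out "
    r"\(enqueued at (?P<enqueued>\d+), (?P<age>\d+)s ago\)"
)

def tally_lock_timeouts(lines):
    """Count lock-timeout records per reporting service-thread PID."""
    counts = Counter()
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            counts[m.group("pid")] += 1
    return counts

# Abbreviated sample records copied from the log above.
sample = [
    "Jun 11 13:17:25 warble2 kernel: LustreError: 138447:0:(ldlm_request.c:130:"
    "ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528686745, 300s ago); ...",
    "Jun 11 13:20:27 warble2 kernel: LustreError: 45140:0:(ldlm_request.c:130:"
    "ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1528686927, 300s ago); ...",
]
print(tally_lock_timeouts(sample))
```

Grouping the same counts by the lock resource (the `res: [...]` field) instead of the PID would show that most of the timeouts here pile up on just two resources, `[0x20001b981:0x13be5:0x0]` and `[0x20001b982:0x13c11:0x0]`.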