Apr 28 13:04:18 fir-md1-s1 kernel: Lustre: fir-MDT0002: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900 Apr 28 13:04:19 fir-md1-s1 kernel: Lustre: fir-MDD0002: changelog on Apr 28 13:04:19 fir-md1-s1 kernel: Lustre: fir-MDT0002: in recovery but waiting for the first client to connect Apr 28 13:04:19 fir-md1-s1 kernel: Lustre: fir-MDT0002: Will be in recovery for at least 2:30, or until 1326 clients reconnect Apr 28 13:04:19 fir-md1-s1 kernel: Lustre: fir-MDT0002: Connection restored to ab14b935-8cbe-53d7-5802-24431fcda42b (at 10.9.115.5@o2ib4) Apr 28 13:04:19 fir-md1-s1 kernel: Lustre: Skipped 4 previous similar messages Apr 28 13:04:19 fir-md1-s1 kernel: LustreError: 11-0: fir-MDT0002-osp-MDT0000: operation mds_connect to node 0@lo failed: rc = -114 Apr 28 13:04:19 fir-md1-s1 kernel: LustreError: Skipped 5 previous similar messages Apr 28 13:04:19 fir-md1-s1 kernel: Lustre: fir-MDT0000: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900 Apr 28 13:04:29 fir-md1-s1 kernel: LustreError: 104523:0:(tgt_handler.c:525:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff984c80f80300 x1631596515698480/t0(0) o601->fir-MDT0000-lwp-OST000a_UUID@10.0.10.101@o2ib7:5/0 lens 336/0 e 0 to 0 dl 1556481875 ref 1 fl Interpret:/0/ffffffff rc 0/-1 Apr 28 13:04:29 fir-md1-s1 kernel: LustreError: 104523:0:(tgt_handler.c:525:tgt_filter_recovery_request()) Skipped 13 previous similar messages Apr 28 13:04:40 fir-md1-s1 kernel: LustreError: 104871:0:(tgt_handler.c:525:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff982c0f784200 x1631596470408432/t0(0) o601->fir-MDT0000-lwp-OST001c_UUID@10.0.10.105@o2ib7:16/0 lens 336/0 e 0 to 0 dl 1556481886 ref 1 fl Interpret:/0/ffffffff rc 0/-1 Apr 28 13:04:40 fir-md1-s1 kernel: LustreError: 104871:0:(tgt_handler.c:525:tgt_filter_recovery_request()) Skipped 1 previous similar message Apr 28 13:04:50 fir-md1-s1 kernel: LustreError: 
104523:0:(tgt_handler.c:525:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff984c80f9f200 x1631596515698960/t0(0) o601->fir-MDT0000-lwp-OST000a_UUID@10.0.10.101@o2ib7:26/0 lens 336/0 e 0 to 0 dl 1556481896 ref 1 fl Interpret:/0/ffffffff rc 0/-1 Apr 28 13:04:50 fir-md1-s1 kernel: LustreError: 104523:0:(tgt_handler.c:525:tgt_filter_recovery_request()) Skipped 3 previous similar messages Apr 28 13:05:14 fir-md1-s1 kernel: LNetError: 98329:0:(o2iblnd_cb.c:3324:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 0 seconds Apr 28 13:05:14 fir-md1-s1 kernel: LNetError: 98329:0:(o2iblnd_cb.c:3399:kiblnd_check_conns()) Timed out RDMA with 10.0.10.52@o2ib7 (5): c: 0, oc: 0, rc: 8 Apr 28 13:05:24 fir-md1-s1 kernel: Lustre: fir-MDT0002: Connection restored to c2c2c302-6522-5fd4-4eda-95bd20e43cd4 (at 10.0.10.52@o2ib7) Apr 28 13:05:24 fir-md1-s1 kernel: Lustre: Skipped 2746 previous similar messages Apr 28 13:05:38 fir-md1-s1 kernel: LustreError: 104519:0:(tgt_handler.c:525:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff982c8d18da00 x1631596470412144/t0(0) o601->fir-MDT0000-lwp-OST001a_UUID@10.0.10.105@o2ib7:14/0 lens 336/0 e 0 to 0 dl 1556481944 ref 1 fl Interpret:/0/ffffffff rc 0/-1 Apr 28 13:05:38 fir-md1-s1 kernel: LustreError: 104519:0:(tgt_handler.c:525:tgt_filter_recovery_request()) Skipped 6 previous similar messages Apr 28 13:06:00 fir-md1-s1 kernel: Lustre: fir-MDT0000: Recovery already passed deadline 4:51. If you do not want to wait more, please abort the recovery by force. Apr 28 13:06:00 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 28 13:06:00 fir-md1-s1 kernel: Lustre: fir-MDT0000: Recovery over after 1:41, of 1328 clients 1328 recovered and 0 were evicted. Apr 28 13:06:00 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 28 13:07:10 fir-md1-s1 kernel: LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.8.13.3@o2ib6 (no target). 
If you are running an HA pair check that the target is mounted on the other server. Apr 28 13:07:10 fir-md1-s1 kernel: LustreError: Skipped 4137 previous similar messages Apr 28 13:07:25 fir-md1-s1 kernel: LustreError: 11-0: fir-MDT0003-osp-MDT0000: operation mds_statfs to node 10.0.10.52@o2ib7 failed: rc = -107 Apr 28 13:07:25 fir-md1-s1 kernel: Lustre: fir-MDT0003-osp-MDT0002: Connection to fir-MDT0003 (at 10.0.10.52@o2ib7) was lost; in progress operations using this service will wait for recovery to complete Apr 28 13:07:25 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 28 13:07:25 fir-md1-s1 kernel: LustreError: Skipped 1 previous similar message Apr 28 13:08:30 fir-md1-s1 kernel: LustreError: 167-0: fir-MDT0003-osp-MDT0002: This client was evicted by fir-MDT0003; in progress operations using this service will fail. Apr 28 13:08:30 fir-md1-s1 kernel: LustreError: Skipped 1 previous similar message Apr 28 13:08:30 fir-md1-s1 kernel: Lustre: fir-MDT0003-osp-MDT0002: Connection restored to 10.0.10.52@o2ib7 (at 10.0.10.52@o2ib7) Apr 28 13:08:30 fir-md1-s1 kernel: Lustre: Skipped 28 previous similar messages Apr 28 14:11:02 fir-md1-s1 kernel: LNetError: 98341:0:(lib-msg.c:811:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5) Apr 28 15:02:37 fir-md1-s1 kernel: perf: interrupt took too long (3130 > 3128), lowering kernel.perf_event_max_sample_rate to 63000 Apr 28 15:11:54 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 69b98a86-b591-56ad-1a8d-a21bd9c7acd4 (at 10.8.14.8@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff982c53edd400, cur 1556489514 expire 1556489364 last 1556489287 Apr 28 15:16:07 fir-md1-s1 kernel: LNetError: 98345:0:(lib-msg.c:811:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5) Apr 28 18:44:23 fir-md1-s1 kernel: Lustre: fir-MDT0002: Client 07174424-87a6-1756-764e-10e7f32ab3b2 (at 10.8.23.36@o2ib6) reconnecting Apr 28 18:44:23 fir-md1-s1 kernel: Lustre: fir-MDT0002: Connection restored to d463a02f-b8d4-b769-cf50-8b2b19f729a7 (at 10.8.23.36@o2ib6) Apr 28 18:47:28 fir-md1-s1 kernel: Lustre: MGS: Received new LWP connection from 10.8.23.36@o2ib6, removing former export from same NID Apr 28 18:47:28 fir-md1-s1 kernel: Lustre: MGS: Connection restored to d463a02f-b8d4-b769-cf50-8b2b19f729a7 (at 10.8.23.36@o2ib6) Apr 28 18:47:28 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client 07174424-87a6-1756-764e-10e7f32ab3b2 (at 10.8.23.36@o2ib6) reconnecting Apr 28 18:47:36 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client 07174424-87a6-1756-764e-10e7f32ab3b2 (at 10.8.23.36@o2ib6) reconnecting Apr 28 18:50:24 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client 07174424-87a6-1756-764e-10e7f32ab3b2 (at 10.8.23.36@o2ib6) reconnecting Apr 28 18:50:24 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to d463a02f-b8d4-b769-cf50-8b2b19f729a7 (at 10.8.23.36@o2ib6) Apr 28 18:50:24 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 28 20:07:56 fir-md1-s1 kernel: LNetError: 98340:0:(lib-msg.c:811:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5) Apr 28 21:36:40 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.13.24@o2ib6) Apr 28 21:37:39 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 13862775-a3a0-e732-572e-283e8559a178 (at 10.8.13.24@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff982cf9c84c00, cur 1556512659 expire 1556512509 last 1556512432 Apr 28 21:37:39 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 00:08:15 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client a7819449-de15-8c48-aafa-598162857925 (at 10.8.10.29@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff982caf8fa400, cur 1556521695 expire 1556521545 last 1556521468 Apr 29 00:08:15 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 00:08:17 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 12f820c4-5371-71bd-bfce-4a297d863139 (at 10.8.10.29@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff982cae2a0000, cur 1556521697 expire 1556521547 last 1556521470 Apr 29 00:10:55 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.10.29@o2ib6) Apr 29 00:10:55 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 02:06:36 fir-md1-s1 kernel: LNetError: 98345:0:(lib-msg.c:811:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5) Apr 29 03:26:16 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 5995695a-60b9-a4b2-7af4-aefc33fc37cb (at 10.8.10.29@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9829ac766400, cur 1556533576 expire 1556533426 last 1556533349 Apr 29 03:26:16 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 29 03:26:24 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client 5995695a-60b9-a4b2-7af4-aefc33fc37cb (at 10.8.10.29@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff9849ae6b3000, cur 1556533584 expire 1556533434 last 1556533357 Apr 29 03:26:24 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 29 03:34:49 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.10.29@o2ib6) Apr 29 03:34:49 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 03:34:50 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to (at 10.8.10.29@o2ib6) Apr 29 03:34:50 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 29 03:43:42 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 30655aad-f884-6822-c8a5-bc96f5e0e9eb (at 10.8.14.9@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff982c4fed1400, cur 1556534622 expire 1556534472 last 1556534395 Apr 29 05:27:43 fir-md1-s1 kernel: Lustre: fir-MDT0002: Client e18301fc-f860-0db4-bf24-6c606e0cc839 (at 10.8.8.31@o2ib6) reconnecting Apr 29 05:27:43 fir-md1-s1 kernel: Lustre: fir-MDT0002: Connection restored to (at 10.8.8.31@o2ib6) Apr 29 06:29:44 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client e18301fc-f860-0db4-bf24-6c606e0cc839 (at 10.8.8.31@o2ib6) reconnecting Apr 29 06:29:44 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to (at 10.8.8.31@o2ib6) Apr 29 08:09:27 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 31aea298-76cf-b176-60c7-b3a8604d9082 (at 10.8.14.2@o2ib6) Apr 29 08:09:27 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 08:12:08 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client d6942713-295b-7d84-12e7-27da6b04f060 (at 10.8.14.2@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff98593eb97000, cur 1556550728 expire 1556550578 last 1556550501 Apr 29 08:12:08 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 08:18:41 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 31aea298-76cf-b176-60c7-b3a8604d9082 (at 10.8.14.2@o2ib6) Apr 29 09:11:52 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.13.24@o2ib6) Apr 29 09:11:52 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 09:11:53 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to (at 10.8.13.24@o2ib6) Apr 29 09:11:53 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 29 09:34:20 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 805c9b53-b5ac-70b3-1174-0545d8e4f16e (at 10.9.0.63@o2ib4) Apr 29 09:40:14 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.14.9@o2ib6) Apr 29 09:40:14 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 09:40:15 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to (at 10.8.14.9@o2ib6) Apr 29 09:40:15 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 29 09:40:17 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to (at 10.8.14.8@o2ib6) Apr 29 09:40:17 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 09:40:36 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 42580479-9769-7e96-0685-af71f2380e4d (at 10.8.14.6@o2ib6) Apr 29 09:41:46 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 8291de35-2b9b-a367-f026-cecf1f3c56bb (at 10.8.17.3@o2ib6) Apr 29 09:41:46 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 09:42:53 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 3fa39b6b-68bb-1570-c431-26d39d28b172 (at 10.9.103.12@o2ib4) Apr 29 09:42:53 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 09:44:52 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 25c4d2d3-bd69-c0d4-1a1f-a673eaec6370 (at 10.9.102.70@o2ib4) Apr 29 09:44:52 fir-md1-s1 kernel: 
Lustre: Skipped 4 previous similar messages Apr 29 09:45:46 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client 3a7fb5c7-aa4b-da40-a3f0-6ccb458e3426 (at 10.9.102.70@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff98488db51800, cur 1556556346 expire 1556556196 last 1556556119 Apr 29 09:45:46 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 29 09:46:12 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client c2269f9e-0528-6ac7-a9df-944515bbc59e (at 10.8.10.29@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9848ebe00400, cur 1556556372 expire 1556556222 last 1556556145 Apr 29 09:46:12 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 09:49:12 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.10.29@o2ib6) Apr 29 09:49:12 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 10:06:00 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client cc33dcd4-da52-9936-53df-7263cc8c3cfd (at 10.8.10.29@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff98484279b400, cur 1556557560 expire 1556557410 last 1556557333 Apr 29 10:06:00 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 29 10:10:18 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.10.29@o2ib6) Apr 29 10:10:18 fir-md1-s1 kernel: Lustre: Skipped 5 previous similar messages Apr 29 10:29:37 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 700295d5-4c82-df5a-76d4-1fdc06f0b471 (at 10.8.10.29@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff9837df28c000, cur 1556558977 expire 1556558827 last 1556558750 Apr 29 10:29:37 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 10:32:53 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.10.29@o2ib6) Apr 29 10:32:53 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 10:48:52 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client db063b05-a22d-9c4a-a4e4-1ea56d813bf1 (at 10.8.10.29@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9837ceed4000, cur 1556560132 expire 1556559982 last 1556559905 Apr 29 10:48:52 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 10:51:59 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.10.29@o2ib6) Apr 29 10:51:59 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 10:52:00 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to (at 10.8.10.29@o2ib6) Apr 29 10:52:00 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 29 11:31:05 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 1591aae4-9a67-e2fd-4368-80f3361e0936 (at 10.8.14.2@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff98486db3ec00, cur 1556562665 expire 1556562515 last 1556562438 Apr 29 11:31:05 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 12:24:07 fir-md1-s1 kernel: Lustre: 104709:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556565840/real 1556565840] req@ffff98385d79e900 x1632086641716240/t0(0) o104->fir-MDT0002@10.8.1.7@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1556565847 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 Apr 29 12:24:14 fir-md1-s1 kernel: Lustre: 104709:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556565847/real 1556565847] req@ffff98385d79e900 x1632086641716240/t0(0) o104->fir-MDT0002@10.8.1.7@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1556565854 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 Apr 29 12:24:15 fir-md1-s1 kernel: Lustre: 105010:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff9837863aef00 x1631646858528272/t0(0) o101->8719d679-2033-f46d-d5b4-1da7ad753964@10.8.21.33@o2ib6:20/0 lens 1776/3288 e 1 to 0 dl 1556565860 ref 2 fl Interpret:/0/0 rc 0/0 Apr 29 12:24:21 fir-md1-s1 kernel: Lustre: fir-MDT0002: Client 8719d679-2033-f46d-d5b4-1da7ad753964 (at 10.8.21.33@o2ib6) reconnecting Apr 29 12:24:21 fir-md1-s1 kernel: Lustre: 104709:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556565854/real 1556565854] req@ffff98385d79e900 x1632086641716240/t0(0) o104->fir-MDT0002@10.8.1.7@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1556565861 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 Apr 29 12:24:21 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 29 12:24:21 fir-md1-s1 kernel: Lustre: fir-MDT0002: Connection restored to (at 10.8.21.33@o2ib6) Apr 29 12:24:28 fir-md1-s1 kernel: Lustre: 104709:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556565861/real 1556565861] req@ffff98385d79e900 
x1632086641716240/t0(0) o104->fir-MDT0002@10.8.1.7@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1556565868 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 Apr 29 12:24:35 fir-md1-s1 kernel: Lustre: 104709:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556565868/real 1556565868] req@ffff98385d79e900 x1632086641716240/t0(0) o104->fir-MDT0002@10.8.1.7@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1556565875 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 Apr 29 12:24:35 fir-md1-s1 kernel: LustreError: 104709:0:(ldlm_lockd.c:682:ldlm_handle_ast_error()) ### client (nid 10.8.1.7@o2ib6) failed to reply to blocking AST (req@ffff98385d79e900 x1632086641716240 status 0 rc -110), evict it ns: mdt-fir-MDT0002_UUID lock: ffff982b52fd6540/0x1a7f5501af0d81b4 lrc: 4/0,0 mode: PR/PR res: [0x2c0013076:0x14888:0x0].0x0 bits 0x13/0x0 rrc: 62 type: IBT flags: 0x60200400000020 nid: 10.8.1.7@o2ib6 remote: 0xd06b014bb1317b23 expref: 182 pid: 105055 timeout: 91169 lvb_type: 0 Apr 29 12:24:35 fir-md1-s1 kernel: LustreError: 138-a: fir-MDT0002: A client on nid 10.8.1.7@o2ib6 was evicted due to a lock blocking callback time out: rc -110 Apr 29 12:24:35 fir-md1-s1 kernel: LustreError: 98552:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 35s: evicting client at 10.8.1.7@o2ib6 ns: mdt-fir-MDT0002_UUID lock: ffff982b52fd6540/0x1a7f5501af0d81b4 lrc: 3/0,0 mode: PR/PR res: [0x2c0013076:0x14888:0x0].0x0 bits 0x13/0x0 rrc: 62 type: IBT flags: 0x60200400000020 nid: 10.8.1.7@o2ib6 remote: 0xd06b014bb1317b23 expref: 183 pid: 105055 timeout: 0 lvb_type: 0 Apr 29 12:24:49 fir-md1-s1 kernel: Lustre: 105100:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556565882/real 1556565882] req@ffff98377bb9f500 x1632086642239216/t0(0) o104->fir-MDT0002@10.8.1.14@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1556565889 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 Apr 29 12:24:49 fir-md1-s1 kernel: Lustre: 
105100:0:(client.c:2132:ptlrpc_expire_one_request()) Skipped 1 previous similar message Apr 29 12:25:00 fir-md1-s1 kernel: Lustre: 105011:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-5), not sending early reply req@ffff98378ce8da00 x1631646858533696/t0(0) o36->8719d679-2033-f46d-d5b4-1da7ad753964@10.8.21.33@o2ib6:5/0 lens 496/2888 e 0 to 0 dl 1556565905 ref 2 fl Interpret:/0/0 rc 0/0 Apr 29 12:25:06 fir-md1-s1 kernel: Lustre: fir-MDT0002: Client 8719d679-2033-f46d-d5b4-1da7ad753964 (at 10.8.21.33@o2ib6) reconnecting Apr 29 12:25:06 fir-md1-s1 kernel: Lustre: fir-MDT0002: Connection restored to (at 10.8.21.33@o2ib6) Apr 29 12:25:10 fir-md1-s1 kernel: Lustre: 105100:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556565903/real 1556565903] req@ffff98377bb9f500 x1632086642239216/t0(0) o104->fir-MDT0002@10.8.1.14@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1556565910 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 Apr 29 12:25:10 fir-md1-s1 kernel: Lustre: 105100:0:(client.c:2132:ptlrpc_expire_one_request()) Skipped 2 previous similar messages Apr 29 12:25:10 fir-md1-s1 kernel: LustreError: 105100:0:(ldlm_lockd.c:682:ldlm_handle_ast_error()) ### client (nid 10.8.1.14@o2ib6) failed to reply to blocking AST (req@ffff98377bb9f500 x1632086642239216 status 0 rc -110), evict it ns: mdt-fir-MDT0002_UUID lock: ffff982c11253a80/0x1a7f55015b554963 lrc: 4/0,0 mode: PR/PR res: [0x2c001be70:0x5fa:0x0].0x0 bits 0x5b/0x0 rrc: 4 type: IBT flags: 0x60200400000020 nid: 10.8.1.14@o2ib6 remote: 0x6727fcfde4dc6674 expref: 3948 pid: 105024 timeout: 91204 lvb_type: 0 Apr 29 12:25:10 fir-md1-s1 kernel: LustreError: 138-a: fir-MDT0002: A client on nid 10.8.1.14@o2ib6 was evicted due to a lock blocking callback time out: rc -110 Apr 29 12:25:10 fir-md1-s1 kernel: LustreError: 98552:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 35s: evicting client at 10.8.1.14@o2ib6 ns: mdt-fir-MDT0002_UUID lock: 
ffff982c11253a80/0x1a7f55015b554963 lrc: 3/0,0 mode: PR/PR res: [0x2c001be70:0x5fa:0x0].0x0 bits 0x5b/0x0 rrc: 4 type: IBT flags: 0x60200400000020 nid: 10.8.1.14@o2ib6 remote: 0x6727fcfde4dc6674 expref: 3949 pid: 105024 timeout: 0 lvb_type: 0 Apr 29 12:26:24 fir-md1-s1 kernel: Lustre: 105286:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff982831be1800 x1631540762426304/t0(0) o101->2e1837bb-385a-af64-a5d1-7a58230af8b2@10.9.0.64@o2ib4:29/0 lens 480/568 e 1 to 0 dl 1556565989 ref 2 fl Interpret:/0/0 rc 0/0 Apr 29 12:26:30 fir-md1-s1 kernel: Lustre: fir-MDT0002: Client 2e1837bb-385a-af64-a5d1-7a58230af8b2 (at 10.9.0.64@o2ib4) reconnecting Apr 29 12:26:30 fir-md1-s1 kernel: Lustre: fir-MDT0002: Connection restored to (at 10.9.0.64@o2ib4) Apr 29 12:26:39 fir-md1-s1 kernel: LustreError: 98552:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 30s: evicting client at 10.8.0.65@o2ib6 ns: mdt-fir-MDT0002_UUID lock: ffff983cd3ad3cc0/0x1a7f5501b2ea864f lrc: 3/0,0 mode: PW/PW res: [0x2c0001757:0xc13:0x0].0x0 bits 0x40/0x0 rrc: 8 type: IBT flags: 0x60200400000020 nid: 10.8.0.65@o2ib6 remote: 0x41e86f8b1aec1576 expref: 2765 pid: 105015 timeout: 91270 lvb_type: 0 Apr 29 12:26:53 fir-md1-s1 kernel: Lustre: fir-MDT0002: Connection restored to (at 10.8.0.65@o2ib6) Apr 29 12:27:35 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client bea4489b-3515-bf0a-35fc-7a54985e6bca (at 10.8.1.9@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff982c9716b800, cur 1556566055 expire 1556565905 last 1556565828 Apr 29 12:27:35 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 12:28:51 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client e71be1be-a191-6969-1824-6fbb914cebbd (at 10.8.1.5@o2ib6) in 183 seconds. I think it's dead, and I am evicting it. 
exp ffff982cae2a5c00, cur 1556566131 expire 1556565981 last 1556565948 Apr 29 12:28:51 fir-md1-s1 kernel: Lustre: Skipped 36 previous similar messages Apr 29 12:28:51 fir-md1-s1 kernel: Lustre: 105122:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556566124/real 1556566124] req@ffff9827f7b67b00 x1632086647197712/t0(0) o104->fir-MDT0002@10.8.1.1@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1556566131 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 Apr 29 12:29:10 fir-md1-s1 kernel: Lustre: 104955:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-5), not sending early reply req@ffff9827fba6b000 x1631354901323088/t0(0) o36->414f404d-8f8d-e649-adfd-ee21c11784f7@10.8.21.32@o2ib6:14/0 lens 544/2888 e 0 to 0 dl 1556566154 ref 2 fl Interpret:/0/0 rc 0/0 Apr 29 12:29:15 fir-md1-s1 kernel: Lustre: fir-MDT0002: Client 414f404d-8f8d-e649-adfd-ee21c11784f7 (at 10.8.21.32@o2ib6) reconnecting Apr 29 12:29:15 fir-md1-s1 kernel: Lustre: fir-MDT0002: Connection restored to (at 10.8.21.32@o2ib6) Apr 29 12:29:17 fir-md1-s1 kernel: Lustre: 105061:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff9858e2671800 x1631610370953824/t0(0) o101->ea216aa1-3f9e-6bba-cc60-e74ebefab95f@10.9.106.35@o2ib4:22/0 lens 584/3264 e 1 to 0 dl 1556566162 ref 2 fl Interpret:/0/0 rc 0/0 Apr 29 12:29:17 fir-md1-s1 kernel: Lustre: 105061:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 2 previous similar messages Apr 29 12:29:19 fir-md1-s1 kernel: LustreError: 105122:0:(ldlm_lockd.c:682:ldlm_handle_ast_error()) ### client (nid 10.8.1.1@o2ib6) failed to reply to blocking AST (req@ffff9827f7b67b00 x1632086647197712 status 0 rc -110), evict it ns: mdt-fir-MDT0002_UUID lock: ffff982b52eb9f80/0x1a7f55010acb96d9 lrc: 4/0,0 mode: PR/PR res: [0x2c001c0e8:0x468:0x0].0x0 bits 0x5b/0x0 rrc: 4 type: IBT flags: 0x60200400000020 nid: 10.8.1.1@o2ib6 remote: 0xf3b356862370200 expref: 9973 pid: 105055 
timeout: 91453 lvb_type: 0 Apr 29 12:29:19 fir-md1-s1 kernel: LustreError: 138-a: fir-MDT0002: A client on nid 10.8.1.1@o2ib6 was evicted due to a lock blocking callback time out: rc -110 Apr 29 12:29:19 fir-md1-s1 kernel: LustreError: 98552:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 35s: evicting client at 10.8.1.1@o2ib6 ns: mdt-fir-MDT0002_UUID lock: ffff982b52eb9f80/0x1a7f55010acb96d9 lrc: 3/0,0 mode: PR/PR res: [0x2c001c0e8:0x468:0x0].0x0 bits 0x5b/0x0 rrc: 4 type: IBT flags: 0x60200400000020 nid: 10.8.1.1@o2ib6 remote: 0xf3b356862370200 expref: 9974 pid: 105055 timeout: 0 lvb_type: 0 Apr 29 12:29:19 fir-md1-s1 kernel: LustreError: 114763:0:(client.c:1175:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ffff98377b261800 x1632086647414800/t0(0) o104->fir-MDT0002@10.8.1.1@o2ib6:15/16 lens 296/224 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1 Apr 29 12:29:19 fir-md1-s1 kernel: LustreError: 114763:0:(client.c:1175:ptlrpc_import_delay_req()) Skipped 1 previous similar message Apr 29 12:30:06 fir-md1-s1 kernel: Lustre: 114761:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556566199/real 1556566199] req@ffff984acb731500 x1632086647590176/t0(0) o106->fir-MDT0002@10.8.1.21@o2ib6:15/16 lens 296/280 e 0 to 1 dl 1556566206 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 Apr 29 12:30:06 fir-md1-s1 kernel: Lustre: 114761:0:(client.c:2132:ptlrpc_expire_one_request()) Skipped 8 previous similar messages Apr 29 12:30:24 fir-md1-s1 kernel: Lustre: 104514:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-5), not sending early reply req@ffff9847f5bb9b00 x1631507694422848/t0(0) o101->961be99d-7ebe-b3cc-0bf6-b3ffe5de5af3@10.8.10.30@o2ib6:29/0 lens 480/568 e 0 to 0 dl 1556566229 ref 2 fl Interpret:/0/0 rc 0/0 Apr 29 12:30:30 fir-md1-s1 kernel: Lustre: fir-MDT0002: Client 961be99d-7ebe-b3cc-0bf6-b3ffe5de5af3 (at 10.8.10.30@o2ib6) reconnecting Apr 29 12:30:30 fir-md1-s1 kernel: Lustre: 
Skipped 2 previous similar messages Apr 29 12:30:30 fir-md1-s1 kernel: Lustre: fir-MDT0002: Connection restored to a81bc029-a063-90fe-07be-7634d62bf6c7 (at 10.8.10.30@o2ib6) Apr 29 12:30:30 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 12:30:55 fir-md1-s1 kernel: Lustre: fir-MDT0002: Client 0a23a5c6-9a2c-c94e-20d6-5cf2ace26733 (at 10.8.10.32@o2ib6) reconnecting Apr 29 12:30:55 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 29 12:30:55 fir-md1-s1 kernel: Lustre: fir-MDT0002: Connection restored to (at 10.8.10.32@o2ib6) Apr 29 12:30:55 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 29 12:31:04 fir-md1-s1 kernel: Lustre: 105110:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-5), not sending early reply req@ffff98377b69ec00 x1631742561600256/t0(0) o101->443850a1-e00f-945f-2b6c-3f1b9a404420@10.8.10.25@o2ib6:9/0 lens 1808/3288 e 0 to 0 dl 1556566269 ref 2 fl Interpret:/0/0 rc 0/0 Apr 29 12:31:04 fir-md1-s1 kernel: Lustre: 105110:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 2 previous similar messages Apr 29 12:31:14 fir-md1-s1 kernel: LustreError: 104912:0:(ldlm_lockd.c:682:ldlm_handle_ast_error()) ### client (nid 10.8.1.8@o2ib6) failed to reply to blocking AST (req@ffff982839fb3600 x1632086647832896 status 0 rc -110), evict it ns: mdt-fir-MDT0002_UUID lock: ffff983c172c1680/0x1a7f5501aea25fb4 lrc: 4/0,0 mode: PR/PR res: [0x2c0013076:0x13b4d:0x0].0x0 bits 0x13/0x0 rrc: 10 type: IBT flags: 0x60200400000020 nid: 10.8.1.8@o2ib6 remote: 0x68edbe8038d7635f expref: 167 pid: 114809 timeout: 91567 lvb_type: 0 Apr 29 12:31:14 fir-md1-s1 kernel: LustreError: 138-a: fir-MDT0002: A client on nid 10.8.1.8@o2ib6 was evicted due to a lock blocking callback time out: rc -110 Apr 29 12:31:14 fir-md1-s1 kernel: LustreError: 98552:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 35s: evicting client at 10.8.1.8@o2ib6 ns: mdt-fir-MDT0002_UUID lock: 
ffff983c172c1680/0x1a7f5501aea25fb4 lrc: 3/0,0 mode: PR/PR res: [0x2c0013076:0x13b4d:0x0].0x0 bits 0x13/0x0 rrc: 11 type: IBT flags: 0x60200400000020 nid: 10.8.1.8@o2ib6 remote: 0x68edbe8038d7635f expref: 168 pid: 114809 timeout: 0 lvb_type: 0 Apr 29 12:31:16 fir-md1-s1 kernel: Lustre: fir-MDT0002: Client fafb3280-fd8a-565a-20cd-97f85a227ff6 (at 10.8.11.31@o2ib6) reconnecting Apr 29 12:31:16 fir-md1-s1 kernel: Lustre: Skipped 3 previous similar messages Apr 29 12:31:20 fir-md1-s1 kernel: LustreError: 114756:0:(ldlm_lockd.c:682:ldlm_handle_ast_error()) ### client (nid 10.8.1.20@o2ib6) failed to reply to blocking AST (req@ffff98377f781500 x1632086647890144 status 0 rc -110), evict it ns: mdt-fir-MDT0002_UUID lock: ffff983baa7b4a40/0x1a7f5501ae978bad lrc: 4/0,0 mode: PR/PR res: [0x2c0013076:0x130b6:0x0].0x0 bits 0x13/0x0 rrc: 10 type: IBT flags: 0x60200400000020 nid: 10.8.1.20@o2ib6 remote: 0xfc062702081880a6 expref: 166 pid: 105136 timeout: 91574 lvb_type: 0 Apr 29 12:31:20 fir-md1-s1 kernel: LustreError: 114756:0:(ldlm_lockd.c:682:ldlm_handle_ast_error()) Skipped 1 previous similar message Apr 29 12:31:20 fir-md1-s1 kernel: LustreError: 138-a: fir-MDT0002: A client on nid 10.8.1.20@o2ib6 was evicted due to a lock blocking callback time out: rc -110 Apr 29 12:31:20 fir-md1-s1 kernel: LustreError: Skipped 1 previous similar message Apr 29 12:31:32 fir-md1-s1 kernel: Lustre: fir-MDT0002: Connection restored to a81bc029-a063-90fe-07be-7634d62bf6c7 (at 10.8.10.30@o2ib6) Apr 29 12:31:32 fir-md1-s1 kernel: Lustre: Skipped 4 previous similar messages Apr 29 12:32:04 fir-md1-s1 kernel: Lustre: fir-MDT0002: Client 961be99d-7ebe-b3cc-0bf6-b3ffe5de5af3 (at 10.8.10.30@o2ib6) reconnecting Apr 29 12:32:04 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 29 12:32:11 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 536de979-bdf0-e41a-74e8-be22c0e588bb (at 10.8.1.21@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff982c3f66c800, cur 1556566331 expire 1556566181 last 1556566104
Apr 29 12:32:11 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages
Apr 29 12:32:19 fir-md1-s1 kernel: Lustre: 104885:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556566332/real 1556566332] req@ffff98490b262a00 x1632086647592000/t0(0) o106->fir-MDT0002@10.8.1.19@o2ib6:15/16 lens 296/280 e 0 to 1 dl 1556566339 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
Apr 29 12:32:19 fir-md1-s1 kernel: Lustre: 104885:0:(client.c:2132:ptlrpc_expire_one_request()) Skipped 54 previous similar messages
Apr 29 12:33:06 fir-md1-s1 kernel: Lustre: fir-MDT0002: Connection restored to a81bc029-a063-90fe-07be-7634d62bf6c7 (at 10.8.10.30@o2ib6)
Apr 29 12:33:06 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages
Apr 29 12:33:19 fir-md1-s1 kernel: LNet: Service thread pid 104885 was inactive for 200.03s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Apr 29 12:33:19 fir-md1-s1 kernel: Pid: 104885, comm: mdt02_005 3.10.0-957.1.3.el7_lustre.x86_64 #1 SMP Fri Dec 7 14:50:35 PST 2018
Apr 29 12:33:19 fir-md1-s1 kernel: Call Trace:
Apr 29 12:33:19 fir-md1-s1 kernel: [] ptlrpc_set_wait+0x500/0x8d0 [ptlrpc]
Apr 29 12:33:19 fir-md1-s1 kernel: [] ldlm_run_ast_work+0xd5/0x3a0 [ptlrpc]
Apr 29 12:33:19 fir-md1-s1 kernel: [] ldlm_glimpse_locks+0x3b/0x100 [ptlrpc]
Apr 29 12:33:19 fir-md1-s1 kernel: [] mdt_do_glimpse+0x1e9/0x4c0 [mdt]
Apr 29 12:33:19 fir-md1-s1 kernel: [] mdt_glimpse_enqueue+0x3d3/0x4f0 [mdt]
Apr 29 12:33:19 fir-md1-s1 kernel: [] mdt_intent_glimpse+0x1f/0x30 [mdt]
Apr 29 12:33:19 fir-md1-s1 kernel: [] mdt_intent_policy+0x2e8/0xd00 [mdt]
Apr 29 12:33:19 fir-md1-s1 kernel: [] ldlm_lock_enqueue+0x366/0xa60 [ptlrpc]
Apr 29 12:33:19 fir-md1-s1 kernel: [] ldlm_handle_enqueue0+0xa47/0x15a0 [ptlrpc]
Apr 29 12:33:19 fir-md1-s1 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc]
Apr 29 12:33:19 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
Apr 29 12:33:19 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
Apr 29 12:33:19 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc]
Apr 29 12:33:19 fir-md1-s1 kernel: [] kthread+0xd1/0xe0
Apr 29 12:33:19 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21
Apr 29 12:33:19 fir-md1-s1 kernel: [] 0xffffffffffffffff
Apr 29 12:33:19 fir-md1-s1 kernel: LustreError: dumping log to /tmp/lustre-log.1556566399.104885
Apr 29 12:33:27 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 2bb73289-246f-eef4-271a-7f2b0f0e738c (at 10.8.1.19@o2ib6) in 224 seconds. I think it's dead, and I am evicting it. exp ffff982caf8fa800, cur 1556566407 expire 1556566257 last 1556566183
Apr 29 12:33:27 fir-md1-s1 kernel: Lustre: Skipped 8 previous similar messages
Apr 29 12:33:27 fir-md1-s1 kernel: LNet: Service thread pid 104885 completed after 207.37s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
Apr 29 12:34:43 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 17cc7e5e-b0f0-3041-2ea5-47649f83c68a (at 10.9.105.3@o2ib4) in 155 seconds. I think it's dead, and I am evicting it. exp ffff982c9b489800, cur 1556566483 expire 1556566333 last 1556566328
Apr 29 12:34:43 fir-md1-s1 kernel: Lustre: Skipped 10 previous similar messages
Apr 29 12:35:55 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client bb7ca983-30a7-699d-dc68-5d3ba052bb48 (at 10.9.105.3@o2ib4) in 227 seconds. I think it's dead, and I am evicting it.
exp ffff982caa2df400, cur 1556566555 expire 1556566405 last 1556566328 Apr 29 12:35:55 fir-md1-s1 kernel: Lustre: Skipped 4 previous similar messages Apr 29 12:56:13 fir-md1-s1 kernel: Lustre: MGS: Connection restored to d1ed76b2-d0ab-d16d-6ddf-03577d3faee9 (at 10.8.1.16@o2ib6) Apr 29 12:57:45 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 556c7472-e6a1-dbfd-d736-4aa3a5d34a21 (at 10.8.1.15@o2ib6) Apr 29 12:57:45 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 12:58:29 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.1.13@o2ib6) Apr 29 12:58:29 fir-md1-s1 kernel: Lustre: Skipped 5 previous similar messages Apr 29 12:59:48 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client e95f2dea-bfd4-30ff-5daa-abde5eaef543 (at 10.8.13.23@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff982cb02ee000, cur 1556567988 expire 1556567838 last 1556567761 Apr 29 12:59:48 fir-md1-s1 kernel: Lustre: Skipped 3 previous similar messages Apr 29 13:00:04 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client e95f2dea-bfd4-30ff-5daa-abde5eaef543 (at 10.8.13.23@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff982c08ba3c00, cur 1556568004 expire 1556567854 last 1556567777 Apr 29 13:00:04 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 29 13:00:46 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.1.5@o2ib6) Apr 29 13:00:46 fir-md1-s1 kernel: Lustre: Skipped 26 previous similar messages Apr 29 13:03:14 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 0a548a8d-0306-ac26-1d74-3d4d1502fc27 (at 10.8.1.8@o2ib6) Apr 29 13:03:14 fir-md1-s1 kernel: Lustre: Skipped 8 previous similar messages Apr 29 13:31:53 fir-md1-s1 kernel: LNetError: 98336:0:(lib-msg.c:811:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (-125, 0) Apr 29 13:32:24 fir-md1-s1 kernel: Lustre: fir-MDT0002: Client 2e220de5-7b5b-3874-45ea-64c959a50d0b (at 10.8.0.67@o2ib6) reconnecting Apr 29 13:32:24 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 13:32:24 fir-md1-s1 kernel: Lustre: fir-MDT0002: Connection restored to 6f33b97b-66f8-e1a5-c00c-bf631f892350 (at 10.8.0.67@o2ib6) Apr 29 13:32:24 fir-md1-s1 kernel: Lustre: Skipped 29 previous similar messages Apr 29 14:05:18 fir-md1-s1 kernel: LNetError: 98331:0:(lib-msg.c:811:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (-125, 0) Apr 29 14:05:26 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client 40047af1-727c-af36-6cf4-0ce2eaf8f0e0 (at 10.8.7.28@o2ib6) reconnecting Apr 29 14:05:26 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to (at 10.8.7.28@o2ib6) Apr 29 14:30:09 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 8e6278f9-7305-b217-3b1f-cfd02c7696e0 (at 10.9.105.2@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff982c3fef6c00, cur 1556573409 expire 1556573259 last 1556573182 Apr 29 14:58:40 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 4a212031-a783-6a5f-433e-abb6c797d547 (at 10.9.105.2@o2ib4) Apr 29 14:58:41 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to 4a212031-a783-6a5f-433e-abb6c797d547 (at 10.9.105.2@o2ib4) Apr 29 16:06:04 fir-md1-s1 kernel: LNetError: 98342:0:(lib-msg.c:811:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (-125, 0) Apr 29 16:06:17 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client 9a8bc7f0-674a-721d-c255-50108001b9f0 (at 10.8.0.66@o2ib6) reconnecting Apr 29 16:06:17 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to (at 10.8.0.66@o2ib6) Apr 29 16:06:17 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 29 16:20:54 fir-md1-s1 kernel: Lustre: 104986:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff9836fea10000 x1631540787910864/t0(0) o101->2e1837bb-385a-af64-a5d1-7a58230af8b2@10.9.0.64@o2ib4:29/0 lens 480/568 e 1 to 0 dl 1556580059 ref 2 fl Interpret:/0/0 rc 0/0 Apr 29 16:20:54 fir-md1-s1 kernel: Lustre: 104986:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message Apr 29 17:02:15 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.14.9@o2ib6) Apr 29 17:02:16 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to (at 10.8.14.9@o2ib6) Apr 29 17:02:53 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.14.8@o2ib6) Apr 29 17:02:53 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 29 17:03:16 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 42580479-9769-7e96-0685-af71f2380e4d (at 10.8.14.6@o2ib6) Apr 29 17:03:16 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 17:12:31 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.9.101.52@o2ib4) Apr 29 17:12:31 fir-md1-s1 kernel: Lustre: Skipped 2 
previous similar messages Apr 29 17:12:37 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 97fdc779-cf49-9cfe-e70e-4fa32248f62a (at 10.8.1.27@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff982cfbde7000, cur 1556583157 expire 1556583007 last 1556582930 Apr 29 17:12:37 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 17:12:48 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.30.3@o2ib6) Apr 29 17:12:48 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 17:13:31 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 6578454c-74df-9da0-8364-fba5b907f5df (at 10.9.112.13@o2ib4) Apr 29 17:13:31 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 17:19:32 fir-md1-s1 kernel: LNetError: 98338:0:(lib-msg.c:811:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (-125, 0) Apr 29 17:19:32 fir-md1-s1 kernel: LNetError: 98338:0:(lib-msg.c:811:lnet_is_health_check()) Skipped 2 previous similar messages Apr 29 17:19:39 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client cec4ce3d-7421-61e4-362c-c29b7d79240a (at 10.8.27.10@o2ib6) reconnecting Apr 29 17:19:39 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to (at 10.8.27.10@o2ib6) Apr 29 17:19:39 fir-md1-s1 kernel: Lustre: Skipped 5 previous similar messages Apr 29 17:20:18 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client c5d29146-8e69-99bb-85ae-0e928604facc (at 10.8.0.68@o2ib6) reconnecting Apr 29 17:20:19 fir-md1-s1 kernel: LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.8.0.68@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server. 
Apr 29 17:20:19 fir-md1-s1 kernel: LustreError: Skipped 10 previous similar messages Apr 29 17:20:39 fir-md1-s1 kernel: LNetError: 98333:0:(lib-msg.c:811:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (-125, 0) Apr 29 17:20:44 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client c5d29146-8e69-99bb-85ae-0e928604facc (at 10.8.0.68@o2ib6) reconnecting Apr 29 17:20:44 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to c5d29146-8e69-99bb-85ae-0e928604facc (at 10.8.0.68@o2ib6) Apr 29 17:20:44 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 29 17:20:47 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client 147c0c80-0156-d078-a77e-b8af4511cc40 (at 10.8.27.6@o2ib6) reconnecting Apr 29 17:24:05 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 062a922e-d169-cbae-d1a8-a13d992a2655 (at 10.8.14.7@o2ib6) Apr 29 17:24:05 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 29 17:52:56 fir-md1-s1 kernel: Lustre: fir-MDT0002: Client 6e281d64-50b1-ba40-7a98-35184b0fb522 (at 10.8.1.17@o2ib6) reconnecting Apr 29 17:52:56 fir-md1-s1 kernel: Lustre: fir-MDT0002: Connection restored to (at 10.8.1.17@o2ib6) Apr 29 17:52:56 fir-md1-s1 kernel: Lustre: Skipped 8 previous similar messages Apr 29 18:04:37 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 31aea298-76cf-b176-60c7-b3a8604d9082 (at 10.8.14.2@o2ib6) Apr 29 18:05:27 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client f8b6c3d5-656e-fd56-84c6-e17e320a0313 (at 10.8.14.2@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9836cc70d400, cur 1556586327 expire 1556586177 last 1556586100 Apr 29 18:05:27 fir-md1-s1 kernel: Lustre: Skipped 11 previous similar messages Apr 29 20:09:16 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client b0f06bff-666c-4f17-dbec-3b9049db7d53 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff982c83bf2400, cur 1556593756 expire 1556593606 last 1556593529 Apr 29 20:09:16 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 20:09:35 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.26.4@o2ib6) Apr 29 20:09:35 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 20:09:35 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to (at 10.8.26.4@o2ib6) Apr 29 20:09:35 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 29 20:20:06 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 3867a32c-78cd-2d85-0084-612d8bc8cfc2 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff98272072c000, cur 1556594406 expire 1556594256 last 1556594179 Apr 29 20:20:06 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 20:20:25 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.26.4@o2ib6) Apr 29 20:20:26 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to (at 10.8.26.4@o2ib6) Apr 29 23:07:20 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 23fc2cc4-e55e-4102-8ff1-2721d7928fe3 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff982720b32c00, cur 1556604440 expire 1556604290 last 1556604213 Apr 29 23:07:20 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 23:07:45 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.26.4@o2ib6) Apr 29 23:07:45 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 29 23:07:46 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to (at 10.8.26.4@o2ib6) Apr 29 23:07:46 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 29 23:29:58 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client 15cf799f-e999-06c3-e497-fd977f4842e4 (at 10.8.1.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff982c8ae6c800, cur 1556605798 expire 1556605648 last 1556605571 Apr 29 23:29:58 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 23:30:10 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 38622ec9-7ac0-38c3-b90a-94070e269171 (at 10.8.1.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff982cfbde1c00, cur 1556605810 expire 1556605660 last 1556605583 Apr 29 23:30:12 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client 38622ec9-7ac0-38c3-b90a-94070e269171 (at 10.8.1.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff982c98f92800, cur 1556605812 expire 1556605662 last 1556605585 Apr 29 23:34:36 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 196f26f6-9ffe-a462-427b-ad78cfe213db (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff983634af3400, cur 1556606076 expire 1556605926 last 1556605849 Apr 29 23:35:00 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.26.4@o2ib6) Apr 29 23:47:11 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client 560f3c71-6165-647b-7e6a-457ed54e5a5a (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff983616a4fc00, cur 1556606831 expire 1556606681 last 1556606604 Apr 29 23:47:11 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 29 23:47:32 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.26.4@o2ib6) Apr 29 23:47:32 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 00:01:35 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.1.4@o2ib6) Apr 30 00:01:35 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 00:11:46 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 75340d33-b0a3-a200-a256-242a0d6f135a (at 10.8.1.29@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff982cafade400, cur 1556608306 expire 1556608156 last 1556608079 Apr 30 00:11:46 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 00:14:06 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.1.29@o2ib6) Apr 30 00:14:06 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 00:14:07 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to (at 10.8.1.29@o2ib6) Apr 30 00:14:07 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 30 00:16:27 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client ab1887af-dff2-7ba3-fffb-db8911310d92 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff983621671c00, cur 1556608587 expire 1556608437 last 1556608360 Apr 30 00:16:27 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 00:19:59 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.26.4@o2ib6) Apr 30 00:22:25 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.26.4@o2ib6) Apr 30 00:22:25 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 00:23:46 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client cbd5b2f4-fdb3-42fa-0374-7bb2f5bcf359 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9826c326f800, cur 1556609026 expire 1556608876 last 1556608799 Apr 30 00:23:46 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 00:27:04 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client d8afb670-b6f1-240e-7a05-516e1585d701 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff9826c37a1000, cur 1556609224 expire 1556609074 last 1556608997 Apr 30 00:27:04 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 00:27:05 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.26.4@o2ib6) Apr 30 00:27:05 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 00:38:52 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 56371864-de3c-4967-6366-4d383d838ee0 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff98360a670c00, cur 1556609932 expire 1556609782 last 1556609705 Apr 30 00:38:52 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 00:39:08 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.26.4@o2ib6) Apr 30 00:39:08 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 00:43:06 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 250ea2e8-78ee-c31f-8dde-7f09a56c1645 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff983608fb5800, cur 1556610186 expire 1556610036 last 1556609959 Apr 30 00:43:06 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 00:43:23 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client 250ea2e8-78ee-c31f-8dde-7f09a56c1645 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9846b4e7f800, cur 1556610203 expire 1556610053 last 1556609976 Apr 30 00:43:25 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.26.4@o2ib6) Apr 30 00:43:25 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 02:20:09 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 9620d4c8-9ab0-39de-d075-9eadcff61784 (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff982cb0ac3400, cur 1556616009 expire 1556615859 last 1556615782 Apr 30 02:20:09 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 30 02:20:10 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client 9620d4c8-9ab0-39de-d075-9eadcff61784 (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff982cfbd80c00, cur 1556616010 expire 1556615860 last 1556615783 Apr 30 02:20:10 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 30 02:24:18 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 02:24:18 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 02:28:05 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client ca232a3e-a284-7474-aab9-e112cbec4e1d (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff985c60fdd000, cur 1556616485 expire 1556616335 last 1556616258 Apr 30 02:28:08 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client e0a20e99-adbd-b0cf-ce21-1b879b0aa677 (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9835eef74c00, cur 1556616488 expire 1556616338 last 1556616261 Apr 30 02:30:11 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 02:30:11 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 02:33:55 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 02:33:55 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 02:33:58 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client 64997e83-0f9c-4308-6e2b-a64dc013f292 (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff9858a8a37400, cur 1556616838 expire 1556616688 last 1556616611 Apr 30 02:33:58 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 30 02:37:42 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client f8f8215e-95d2-56ed-c80a-260397ab2bb4 (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9858c127cc00, cur 1556617062 expire 1556616912 last 1556616835 Apr 30 02:37:42 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 02:39:49 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 02:39:49 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 02:43:32 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 02:43:32 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 02:43:36 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client 2e5c90ad-1a8a-fb05-9326-f3ecffa2681d (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff985cb1eb9c00, cur 1556617416 expire 1556617266 last 1556617189 Apr 30 02:43:36 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 02:47:19 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client c3e46c85-fbde-7ea3-ee24-782a04416210 (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff985c61ec2400, cur 1556617639 expire 1556617489 last 1556617412 Apr 30 02:47:19 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 02:48:07 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 02:48:07 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 02:53:09 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 0573c99a-464f-6462-8876-66b949e748dc (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff98283b617000, cur 1556617989 expire 1556617839 last 1556617762 Apr 30 02:53:09 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 02:54:20 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 02:54:20 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 02:58:07 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client c780fcd4-56fc-d774-538e-c9b196ed7a72 (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff985cc7244400, cur 1556618287 expire 1556618137 last 1556618060 Apr 30 02:58:07 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 03:00:13 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 03:00:13 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 03:04:00 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client 98a4986c-ba88-0ef8-7584-f0edb57b09c8 (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff985bc93e9000, cur 1556618640 expire 1556618490 last 1556618413 Apr 30 03:04:00 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 03:05:11 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 03:05:11 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 03:11:03 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 03:11:03 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 03:14:50 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client f060b2fc-aabe-5650-b922-d3e033b970ba (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff985cb2bdcc00, cur 1556619290 expire 1556619140 last 1556619063 Apr 30 03:14:50 fir-md1-s1 kernel: Lustre: Skipped 5 previous similar messages Apr 30 03:16:01 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 03:16:01 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 03:25:39 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client 956ad010-2fc5-c79d-1b94-e0610626c4ac (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9835f9f9dc00, cur 1556619939 expire 1556619789 last 1556619712 Apr 30 03:25:39 fir-md1-s1 kernel: Lustre: Skipped 5 previous similar messages Apr 30 03:26:51 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 03:26:51 fir-md1-s1 kernel: Lustre: Skipped 5 previous similar messages Apr 30 03:41:39 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 03:41:39 fir-md1-s1 kernel: Lustre: Skipped 5 previous similar messages Apr 30 03:45:26 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client cdd601f1-f36d-ac0f-9c88-0adc825deb8b (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff98280ff6fc00, cur 1556621126 expire 1556620976 last 1556620899 Apr 30 03:45:26 fir-md1-s1 kernel: Lustre: Skipped 8 previous similar messages Apr 30 03:55:11 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 03:55:11 fir-md1-s1 kernel: Lustre: Skipped 8 previous similar messages Apr 30 04:00:13 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client 74e717e8-31cf-03ba-fb03-fcc99afd0e1d (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff985be568a400, cur 1556622013 expire 1556621863 last 1556621786 Apr 30 04:00:13 fir-md1-s1 kernel: Lustre: Skipped 8 previous similar messages Apr 30 04:06:17 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 04:06:17 fir-md1-s1 kernel: Lustre: Skipped 5 previous similar messages Apr 30 04:15:05 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client c0b8fa33-4fc0-e74c-cdf7-83d5c45c261d (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9846b1225000, cur 1556622905 expire 1556622755 last 1556622678 Apr 30 04:15:05 fir-md1-s1 kernel: Lustre: Skipped 8 previous similar messages Apr 30 04:21:14 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 04:21:14 fir-md1-s1 kernel: Lustre: Skipped 8 previous similar messages Apr 30 04:29:58 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client 1e2ff914-40ca-94ef-4c9e-fda8963b5feb (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff98469f6ecc00, cur 1556623798 expire 1556623648 last 1556623571 Apr 30 04:29:58 fir-md1-s1 kernel: Lustre: Skipped 11 previous similar messages Apr 30 04:33:36 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.26.4@o2ib6) Apr 30 04:33:36 fir-md1-s1 kernel: Lustre: Skipped 11 previous similar messages Apr 30 04:44:50 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client 836207a1-369a-f345-2c62-cef3ae247fde (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff984677689000, cur 1556624690 expire 1556624540 last 1556624463 Apr 30 04:44:50 fir-md1-s1 kernel: Lustre: Skipped 14 previous similar messages Apr 30 04:46:01 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 04:46:01 fir-md1-s1 kernel: Lustre: Skipped 11 previous similar messages Apr 30 04:55:42 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client ebd42e16-aa0f-e5b3-d21b-f0fe2189f6da (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff984684285c00, cur 1556625342 expire 1556625192 last 1556625115 Apr 30 04:55:42 fir-md1-s1 kernel: Lustre: Skipped 5 previous similar messages Apr 30 05:01:43 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 05:01:43 fir-md1-s1 kernel: Lustre: Skipped 5 previous similar messages Apr 30 05:10:25 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client eeb99c39-a642-95a1-b88d-6b3e19e1f5de (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9846722d9000, cur 1556626225 expire 1556626075 last 1556625998 Apr 30 05:10:25 fir-md1-s1 kernel: Lustre: Skipped 5 previous similar messages Apr 30 05:16:32 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 05:16:32 fir-md1-s1 kernel: Lustre: Skipped 8 previous similar messages Apr 30 05:25:16 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client 4cdc8623-00e6-9c4e-7cb8-87899cd46c60 (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff98466eefc000, cur 1556627116 expire 1556626966 last 1556626889 Apr 30 05:25:16 fir-md1-s1 kernel: Lustre: Skipped 8 previous similar messages Apr 30 05:31:26 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 05:31:26 fir-md1-s1 kernel: Lustre: Skipped 8 previous similar messages Apr 30 05:42:17 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 5bde17a2-0029-8404-713f-8f1d5f1830cd (at 10.8.10.29@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9837c1a29400, cur 1556628137 expire 1556627987 last 1556627910 Apr 30 05:42:17 fir-md1-s1 kernel: Lustre: Skipped 8 previous similar messages Apr 30 05:44:45 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.10.29@o2ib6) Apr 30 05:44:45 fir-md1-s1 kernel: Lustre: Skipped 5 previous similar messages Apr 30 05:55:01 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client 0f9e5759-213d-d12a-4833-7999fca5b124 (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9846a3ba8400, cur 1556628901 expire 1556628751 last 1556628674 Apr 30 05:55:01 fir-md1-s1 kernel: Lustre: Skipped 5 previous similar messages Apr 30 05:56:10 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 05:56:10 fir-md1-s1 kernel: Lustre: Skipped 5 previous similar messages Apr 30 06:09:53 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client 5e127778-b19c-e22d-9194-7f090ba1e150 (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff9846622be800, cur 1556629793 expire 1556629643 last 1556629566 Apr 30 06:09:53 fir-md1-s1 kernel: Lustre: Skipped 11 previous similar messages Apr 30 06:11:01 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 06:11:01 fir-md1-s1 kernel: Lustre: Skipped 8 previous similar messages Apr 30 06:21:46 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 06:21:46 fir-md1-s1 kernel: Lustre: Skipped 5 previous similar messages Apr 30 06:25:33 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client 7e1409c5-d146-ab05-0f10-6f891628f04f (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9835a87a1c00, cur 1556630733 expire 1556630583 last 1556630506 Apr 30 06:25:33 fir-md1-s1 kernel: Lustre: Skipped 8 previous similar messages Apr 30 06:35:34 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 3edb7d08-0747-f23f-ba5a-7fc880f58917 (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9835b5fff800, cur 1556631334 expire 1556631184 last 1556631107 Apr 30 06:35:34 fir-md1-s1 kernel: Lustre: Skipped 9 previous similar messages Apr 30 06:36:45 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 06:36:45 fir-md1-s1 kernel: Lustre: Skipped 8 previous similar messages Apr 30 06:49:07 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.10.29@o2ib6) Apr 30 06:49:07 fir-md1-s1 kernel: Lustre: Skipped 8 previous similar messages Apr 30 06:50:25 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client 7f3a8fb5-6743-88c7-f413-b2e50b908175 (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff98271a793000, cur 1556632225 expire 1556632075 last 1556631998 Apr 30 06:50:25 fir-md1-s1 kernel: Lustre: Skipped 7 previous similar messages Apr 30 07:01:28 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 07:01:28 fir-md1-s1 kernel: Lustre: Skipped 8 previous similar messages Apr 30 07:05:15 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client a83c5ea8-7f8d-3f9a-1027-4a633c82c492 (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff984657637800, cur 1556633115 expire 1556632965 last 1556632888 Apr 30 07:05:15 fir-md1-s1 kernel: Lustre: Skipped 8 previous similar messages Apr 30 07:16:28 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 07:16:28 fir-md1-s1 kernel: Lustre: Skipped 5 previous similar messages Apr 30 07:20:15 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client 6bb44096-6097-d4fb-9e46-5daa6d81ee34 (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff98359ff27400, cur 1556634015 expire 1556633865 last 1556633788 Apr 30 07:20:15 fir-md1-s1 kernel: Lustre: Skipped 5 previous similar messages Apr 30 07:31:20 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 07:31:20 fir-md1-s1 kernel: Lustre: Skipped 8 previous similar messages Apr 30 07:35:07 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client 03417a96-815a-ed19-9454-67942dd46815 (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff984654bf9c00, cur 1556634907 expire 1556634757 last 1556634680 Apr 30 07:35:07 fir-md1-s1 kernel: Lustre: Skipped 8 previous similar messages Apr 30 07:45:09 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 1f4d151f-e4d9-806c-0686-a2bab0574845 (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff98359a6bc400, cur 1556635509 expire 1556635359 last 1556635282 Apr 30 07:45:09 fir-md1-s1 kernel: Lustre: Skipped 9 previous similar messages Apr 30 07:46:18 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 07:46:18 fir-md1-s1 kernel: Lustre: Skipped 11 previous similar messages Apr 30 07:57:20 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 2cf43c7f-3c72-ee4a-5328-2c837e619b0a (at 10.8.10.29@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9835a0284800, cur 1556636240 expire 1556636090 last 1556636013 Apr 30 07:57:20 fir-md1-s1 kernel: Lustre: Skipped 7 previous similar messages Apr 30 08:01:07 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 08:01:07 fir-md1-s1 kernel: Lustre: Skipped 8 previous similar messages Apr 30 08:06:34 fir-md1-s1 kernel: LNetError: 98340:0:(lib-msg.c:811:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5) Apr 30 08:09:54 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client 6e55d082-c154-a339-d6a4-318e9a07fc6c (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff9826796ee800, cur 1556636994 expire 1556636844 last 1556636767 Apr 30 08:09:54 fir-md1-s1 kernel: Lustre: Skipped 8 previous similar messages Apr 30 08:16:01 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.12.33@o2ib6) Apr 30 08:16:01 fir-md1-s1 kernel: Lustre: Skipped 11 previous similar messages Apr 30 08:28:13 fir-md1-s1 kernel: Lustre: Failing over fir-MDT0000 Apr 30 08:28:13 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 30 08:28:13 fir-md1-s1 kernel: Lustre: fir-MDT0002: Not available for connect from 10.8.1.22@o2ib6 (stopping) Apr 30 08:28:13 fir-md1-s1 kernel: Lustre: Skipped 193 previous similar messages Apr 30 08:28:14 fir-md1-s1 kernel: LustreError: 11-0: fir-MDT0001-osp-MDT0000: operation mds_disconnect to node 10.0.10.52@o2ib7 failed: rc = -107 Apr 30 08:28:14 fir-md1-s1 kernel: LustreError: 100903:0:(osp_dev.c:485:osp_disconnect()) fir-MDT0002-osp-MDT0000: can't disconnect: rc = -19 Apr 30 08:28:14 fir-md1-s1 kernel: LustreError: 98940:0:(ldlm_lockd.c:2322:ldlm_cancel_handler()) ldlm_cancel from 10.9.101.23@o2ib4 arrived at 1556638094 with bad export cookie 1909338203045035455 Apr 30 08:28:14 fir-md1-s1 kernel: LustreError: 98940:0:(ldlm_lock.c:2695:ldlm_lock_dump_handle()) ### ### ns: mdt-fir-MDT0000_UUID lock: ffff984c3fa72ac0/0x1a7f55072af7abc7 lrc: 3/0,0 mode: PR/PR res: [0x2000217f1:0xa94:0x0].0x0 bits 0x1b/0x0 rrc: 7 type: IBT flags: 0x40200000000000 nid: 10.9.101.23@o2ib4 remote: 0x75c679fbbd0be2f expref: 2187 pid: 105101 timeout: 159493 lvb_type: 0 Apr 30 08:28:14 fir-md1-s1 kernel: LustreError: 100903:0:(lod_dev.c:265:lod_sub_process_config()) fir-MDT0000-mdtlov: error cleaning up LOD index 2: cmd 0xcf031: rc = -19 Apr 30 08:28:14 fir-md1-s1 kernel: LustreError: 98940:0:(ldlm_lock.c:2695:ldlm_lock_dump_handle()) Skipped 5 previous similar messages Apr 30 08:28:14 fir-md1-s1 kernel: LustreError: 98429:0:(client.c:1175:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ffff98464cb20600 
x1632086911409264/t0(0) o41->fir-MDT0003-osp-MDT0000@10.0.10.52@o2ib7:24/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1 Apr 30 08:28:14 fir-md1-s1 kernel: LustreError: 98414:0:(client.c:1175:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ffff982bb64a1b00 x1632086911409248/t0(0) o41->fir-MDT0003-osp-MDT0002@10.0.10.52@o2ib7:24/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1 Apr 30 08:28:15 fir-md1-s1 kernel: LustreError: 107788:0:(ldlm_lockd.c:2322:ldlm_cancel_handler()) ldlm_cancel from 10.8.20.10@o2ib6 arrived at 1556638095 with bad export cookie 1909338203045026040 Apr 30 08:28:15 fir-md1-s1 kernel: LustreError: 98425:0:(client.c:1175:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ffff9835f0273600 x1632086911410384/t0(0) o41->fir-MDT0002-osp-MDT0000@0@lo:24/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1 Apr 30 08:28:15 fir-md1-s1 kernel: LustreError: 98425:0:(client.c:1175:ptlrpc_import_delay_req()) Skipped 3 previous similar messages Apr 30 08:28:15 fir-md1-s1 kernel: LustreError: 107967:0:(ldlm_lock.c:2695:ldlm_lock_dump_handle()) ### ### ns: mdt-fir-MDT0002_UUID lock: ffff983cb0ba8000/0x1a7f55072af7f999 lrc: 3/0,0 mode: PR/PR res: [0x2c000156c:0xa5df:0x0].0x0 bits 0x5b/0x0 rrc: 425 type: IBT flags: 0x40200000000000 nid: 10.8.18.17@o2ib6 remote: 0xc30e748dd1cc3d59 expref: 137 pid: 104912 timeout: 0 lvb_type: 0 Apr 30 08:28:16 fir-md1-s1 kernel: Lustre: fir-MDT0002: Not available for connect from 10.9.102.16@o2ib4 (stopping) Apr 30 08:28:16 fir-md1-s1 kernel: Lustre: Skipped 187 previous similar messages Apr 30 08:28:16 fir-md1-s1 kernel: LustreError: 98940:0:(ldlm_lockd.c:2322:ldlm_cancel_handler()) ldlm_cancel from 10.8.31.9@o2ib6 arrived at 1556638096 with bad export cookie 1909338203045028784 Apr 30 08:28:16 fir-md1-s1 kernel: LustreError: 98940:0:(ldlm_lockd.c:2322:ldlm_cancel_handler()) Skipped 1 previous similar message Apr 30 08:28:16 fir-md1-s1 kernel: LustreError: 
107788:0:(ldlm_lock.c:2695:ldlm_lock_dump_handle()) ### ### ns: mdt-fir-MDT0000_UUID lock: ffff982cbfe7e0c0/0x1a7f55072b0249c6 lrc: 3/0,0 mode: PR/PR res: [0x200021997:0xe52:0x0].0x0 bits 0x20/0x0 rrc: 7 type: IBT flags: 0x40200000000000 nid: 10.9.107.15@o2ib4 remote: 0xcf1976d3dd3e765f expref: 249 pid: 104520 timeout: 0 lvb_type: 0 Apr 30 08:28:16 fir-md1-s1 kernel: LustreError: 107788:0:(ldlm_lock.c:2695:ldlm_lock_dump_handle()) Skipped 3 previous similar messages Apr 30 08:28:18 fir-md1-s1 kernel: LustreError: 107967:0:(ldlm_lockd.c:2322:ldlm_cancel_handler()) ldlm_cancel from 10.9.106.6@o2ib4 arrived at 1556638098 with bad export cookie 1909338203045023121 Apr 30 08:28:18 fir-md1-s1 kernel: LustreError: 107967:0:(ldlm_lockd.c:2322:ldlm_cancel_handler()) Skipped 5 previous similar messages Apr 30 08:28:19 fir-md1-s1 kernel: LustreError: 107788:0:(ldlm_lock.c:2695:ldlm_lock_dump_handle()) ### ### ns: mdt-fir-MDT0000_UUID lock: ffff983974b43f00/0x1a7f55072a3e81b8 lrc: 3/0,0 mode: PR/PR res: [0x200016b10:0x1d:0x0].0x0 bits 0x5b/0x0 rrc: 3 type: IBT flags: 0x40200000000000 nid: 10.8.29.3@o2ib6 remote: 0x73c64d7d5eda9bd0 expref: 64926 pid: 105233 timeout: 0 lvb_type: 0 Apr 30 08:28:20 fir-md1-s1 kernel: Lustre: fir-MDT0000: Not available for connect from 10.9.101.50@o2ib4 (stopping) Apr 30 08:28:20 fir-md1-s1 kernel: Lustre: Skipped 410 previous similar messages Apr 30 08:28:22 fir-md1-s1 kernel: LustreError: 98940:0:(ldlm_lockd.c:2322:ldlm_cancel_handler()) ldlm_cancel from 10.8.26.10@o2ib6 arrived at 1556638102 with bad export cookie 1909338203045031073 Apr 30 08:28:22 fir-md1-s1 kernel: LustreError: 98940:0:(ldlm_lockd.c:2322:ldlm_cancel_handler()) Skipped 16 previous similar messages Apr 30 08:28:28 fir-md1-s1 kernel: Lustre: fir-MDT0000: Not available for connect from 10.8.26.25@o2ib6 (stopping) Apr 30 08:28:28 fir-md1-s1 kernel: Lustre: Skipped 735 previous similar messages Apr 30 08:28:31 fir-md1-s1 kernel: LustreError: 
107967:0:(ldlm_lockd.c:2322:ldlm_cancel_handler()) ldlm_cancel from 10.9.107.22@o2ib4 arrived at 1556638111 with bad export cookie 1909338203045021931 Apr 30 08:28:31 fir-md1-s1 kernel: LustreError: 107967:0:(ldlm_lockd.c:2322:ldlm_cancel_handler()) Skipped 13 previous similar messages Apr 30 08:28:34 fir-md1-s1 kernel: LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.9.105.54@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. Apr 30 08:28:40 fir-md1-s1 kernel: Lustre: server umount fir-MDT0000 complete Apr 30 08:28:43 fir-md1-s1 kernel: LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.9.101.39@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. Apr 30 08:28:43 fir-md1-s1 kernel: LustreError: Skipped 463 previous similar messages Apr 30 08:28:45 fir-md1-s1 kernel: Lustre: fir-MDT0002: Not available for connect from 10.9.104.53@o2ib4 (stopping) Apr 30 08:28:45 fir-md1-s1 kernel: Lustre: Skipped 1368 previous similar messages Apr 30 08:29:02 fir-md1-s1 kernel: LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.9.101.65@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. 
Apr 30 08:29:02 fir-md1-s1 kernel: LustreError: Skipped 2185 previous similar messages Apr 30 08:29:04 fir-md1-s1 kernel: LustreError: 100920:0:(client.c:1175:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ffff9826c7a49800 x1632086911412432/t0(0) o101->fir-MDT0000-lwp-MDT0002@0@lo:23/10 lens 456/496 e 0 to 0 dl 0 ref 2 fl Rpc:/0/ffffffff rc 0/-1 Apr 30 08:29:04 fir-md1-s1 kernel: LustreError: 100920:0:(qsd_reint.c:56:qsd_reint_completion()) fir-MDT0002: failed to enqueue global quota lock, glb fid:[0x200000006:0x10000:0x0], rc:-5 Apr 30 08:29:04 fir-md1-s1 kernel: LustreError: 100920:0:(qsd_reint.c:56:qsd_reint_completion()) Skipped 1 previous similar message Apr 30 08:29:05 fir-md1-s1 kernel: Lustre: server umount fir-MDT0002 complete Apr 30 08:29:38 fir-md1-s1 kernel: Lustre: 101025:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556638172/real 1556638172] req@ffff9857a62e7800 x1632086911413328/t0(0) o251->MGC10.0.10.51@o2ib7@0@lo:26/25 lens 224/224 e 0 to 1 dl 1556638178 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1 Apr 30 08:29:38 fir-md1-s1 kernel: Lustre: 101025:0:(client.c:2132:ptlrpc_expire_one_request()) Skipped 9 previous similar messages Apr 30 08:29:39 fir-md1-s1 kernel: Lustre: server umount MGS complete Apr 30 08:29:40 fir-md1-s1 kernel: LNetError: 7253:0:(o2iblnd_cb.c:2469:kiblnd_passive_connect()) Can't accept conn from 10.0.10.202@o2ib7 on NA (ib0:1:10.0.10.51): bad dst nid 10.0.10.51@o2ib7 Apr 30 08:29:42 fir-md1-s1 kernel: LNet: Removed LNI 10.0.10.51@o2ib7 Apr 30 08:29:44 fir-md1-s1 kernel: LNet: HW NUMA nodes: 4, HW CPU cores: 48, npartitions: 4 Apr 30 08:29:44 fir-md1-s1 kernel: alg: No test for adler32 (adler32-zlib) Apr 30 08:29:45 fir-md1-s1 kernel: Lustre: Lustre: Build Version: 2.12.0.pl9 Apr 30 08:29:45 fir-md1-s1 kernel: LNet: Using FastReg for registration Apr 30 08:29:45 fir-md1-s1 kernel: LNet: Added LNI 10.0.10.51@o2ib7 [8/256/0/180] Apr 30 08:29:45 fir-md1-s1 kernel: LNetError: 
7253:0:(o2iblnd_cb.c:2469:kiblnd_passive_connect()) Can't accept conn from 10.0.10.106@o2ib7 on NA (ib0:1:10.0.10.51): bad dst nid 10.0.10.51@o2ib7 Apr 30 08:30:47 fir-md1-s1 kernel: LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc Apr 30 08:30:47 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 1153d653-edd7-2ca8-72ae-5f213cd0d2c4 (at 0@lo) Apr 30 08:30:48 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 74828ce0-ea4f-f77f-83f0-8f7bbd9d7f10 (at 10.8.27.17@o2ib6) Apr 30 08:30:48 fir-md1-s1 kernel: Lustre: Skipped 14 previous similar messages Apr 30 08:30:49 fir-md1-s1 kernel: Lustre: MGS: Connection restored to b4679b48-b11c-9680-7d24-915c2e7a6a0e (at 10.8.24.18@o2ib6) Apr 30 08:30:49 fir-md1-s1 kernel: Lustre: Skipped 43 previous similar messages Apr 30 08:30:51 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 2cf55f95-62ef-57e6-6781-995d0b924cb6 (at 10.8.7.22@o2ib6) Apr 30 08:30:51 fir-md1-s1 kernel: Lustre: Skipped 67 previous similar messages Apr 30 08:30:57 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.9.114.15@o2ib4) Apr 30 08:30:57 fir-md1-s1 kernel: Lustre: Skipped 145 previous similar messages Apr 30 08:31:06 fir-md1-s1 kernel: Lustre: MGS: Connection restored to a19fbd52-fc1f-6afe-5025-88bbd6370298 (at 10.9.102.36@o2ib4) Apr 30 08:31:06 fir-md1-s1 kernel: Lustre: Skipped 10 previous similar messages Apr 30 08:31:22 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 0c2afddf-dbbe-bdc6-2650-d15e478dbb2e (at 10.9.102.4@o2ib4) Apr 30 08:31:22 fir-md1-s1 kernel: Lustre: Skipped 112 previous similar messages Apr 30 08:31:23 fir-md1-s1 kernel: LDISKFS-fs (dm-4): file extents enabled Apr 30 08:31:23 fir-md1-s1 kernel: LDISKFS-fs (dm-0): file extents enabled Apr 30 08:31:23 fir-md1-s1 kernel: , maximum tree depth=5 Apr 30 08:31:23 fir-md1-s1 kernel: , maximum tree depth=5 Apr 30 08:31:23 fir-md1-s1 kernel: LDISKFS-fs (dm-4): mounted filesystem with ordered 
data mode. Opts: user_xattr,errors=remount-ro,acl,no_mbcache,nodelalloc Apr 30 08:31:23 fir-md1-s1 kernel: LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,acl,no_mbcache,nodelalloc Apr 30 08:31:24 fir-md1-s1 kernel: LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.103.37@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. Apr 30 08:31:24 fir-md1-s1 kernel: LustreError: Skipped 1 previous similar message Apr 30 08:31:24 fir-md1-s1 kernel: LustreError: 11-0: fir-MDT0001-osp-MDT0000: operation mds_connect to node 10.0.10.52@o2ib7 failed: rc = -114 Apr 30 08:31:24 fir-md1-s1 kernel: Lustre: fir-MDT0000: Imperative Recovery not enabled, recovery window 300-900 Apr 30 08:31:24 fir-md1-s1 kernel: LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.8.1.3@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server. Apr 30 08:31:24 fir-md1-s1 kernel: LustreError: Skipped 34 previous similar messages Apr 30 08:31:24 fir-md1-s1 kernel: Lustre: fir-MDD0000: changelog on Apr 30 08:31:24 fir-md1-s1 kernel: Lustre: fir-MDT0000: in recovery but waiting for the first client to connect Apr 30 08:31:24 fir-md1-s1 kernel: Lustre: fir-MDT0000: Will be in recovery for at least 5:00, or until 1334 clients reconnect Apr 30 08:31:25 fir-md1-s1 kernel: LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.8.25.27@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server. 
Apr 30 08:31:25 fir-md1-s1 kernel: LustreError: Skipped 65 previous similar messages Apr 30 08:31:26 fir-md1-s1 kernel: LustreError: 101695:0:(tgt_handler.c:525:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff982bb1bc0900 x1631600671074432/t0(0) o601->fir-MDT0000-lwp-OST0022_UUID@10.0.10.105@o2ib7:2/0 lens 336/0 e 0 to 0 dl 1556638292 ref 1 fl Interpret:/0/ffffffff rc 0/-1 Apr 30 08:31:27 fir-md1-s1 kernel: LustreError: 11-0: fir-MDT0000-osp-MDT0002: operation mds_connect to node 0@lo failed: rc = -114 Apr 30 08:31:27 fir-md1-s1 kernel: LustreError: Skipped 1 previous similar message Apr 30 08:31:27 fir-md1-s1 kernel: LustreError: 101983:0:(tgt_handler.c:525:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff982ccf81c200 x1631600671088896/t0(0) o601->fir-MDT0000-lwp-OST001e_UUID@10.0.10.105@o2ib7:3/0 lens 336/0 e 0 to 0 dl 1556638293 ref 1 fl Interpret:/0/ffffffff rc 0/-1 Apr 30 08:31:27 fir-md1-s1 kernel: Lustre: fir-MDT0002: Imperative Recovery not enabled, recovery window 300-900 Apr 30 08:31:27 fir-md1-s1 kernel: LustreError: 101983:0:(tgt_handler.c:525:tgt_filter_recovery_request()) Skipped 77 previous similar messages Apr 30 08:31:27 fir-md1-s1 kernel: Lustre: fir-MDD0002: changelog on Apr 30 08:31:27 fir-md1-s1 kernel: Lustre: fir-MDT0002: in recovery but waiting for the first client to connect Apr 30 08:31:27 fir-md1-s1 kernel: LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.9.102.70@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. 
Apr 30 08:31:27 fir-md1-s1 kernel: Lustre: fir-MDT0002: Will be in recovery for at least 5:00, or until 1334 clients reconnect Apr 30 08:31:27 fir-md1-s1 kernel: LustreError: Skipped 176 previous similar messages Apr 30 08:31:28 fir-md1-s1 kernel: LustreError: 101980:0:(tgt_handler.c:525:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff982b4e798900 x1631600671093920/t0(0) o601->fir-MDT0000-lwp-OST001e_UUID@10.0.10.105@o2ib7:4/0 lens 336/0 e 0 to 0 dl 1556638294 ref 1 fl Interpret:/0/ffffffff rc 0/-1 Apr 30 08:31:28 fir-md1-s1 kernel: LustreError: 101980:0:(tgt_handler.c:525:tgt_filter_recovery_request()) Skipped 51 previous similar messages Apr 30 08:31:29 fir-md1-s1 kernel: Lustre: fir-MDT0000: Denying connection for new client e6faa00b-070f-4d22-51ac-e59042b5a00c(at 10.8.12.33@o2ib6), waiting for 1334 known clients (96 recovered, 8 in progress, and 0 evicted) already passed deadline 0:04 Apr 30 08:31:29 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 30 08:31:31 fir-md1-s1 kernel: LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.102.34@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. 
Apr 30 08:31:31 fir-md1-s1 kernel: LustreError: 101980:0:(tgt_handler.c:525:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff982c7deb0300 x1631600671094000/t0(0) o601->fir-MDT0000-lwp-OST0022_UUID@10.0.10.105@o2ib7:7/0 lens 336/0 e 0 to 0 dl 1556638297 ref 1 fl Interpret:/0/ffffffff rc 0/-1 Apr 30 08:31:31 fir-md1-s1 kernel: LustreError: 101980:0:(tgt_handler.c:525:tgt_filter_recovery_request()) Skipped 4 previous similar messages Apr 30 08:31:31 fir-md1-s1 kernel: LustreError: Skipped 477 previous similar messages Apr 30 08:31:36 fir-md1-s1 kernel: LustreError: 101980:0:(tgt_handler.c:525:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff98267623ac50 x1631600796636864/t0(0) o601->fir-MDT0000-lwp-OST0007_UUID@10.0.10.102@o2ib7:12/0 lens 336/0 e 0 to 0 dl 1556638302 ref 1 fl Interpret:/0/ffffffff rc 0/-1 Apr 30 08:31:36 fir-md1-s1 kernel: LustreError: 101980:0:(tgt_handler.c:525:tgt_filter_recovery_request()) Skipped 2 previous similar messages Apr 30 08:31:42 fir-md1-s1 kernel: LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.0.10.52@o2ib7 (no target). If you are running an HA pair check that the target is mounted on the other server. 
Apr 30 08:31:42 fir-md1-s1 kernel: LustreError: Skipped 1007 previous similar messages Apr 30 08:31:45 fir-md1-s1 kernel: LustreError: 101983:0:(tgt_handler.c:525:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff982beef62400 x1631600671095088/t0(0) o601->fir-MDT0000-lwp-OST0022_UUID@10.0.10.105@o2ib7:21/0 lens 336/0 e 0 to 0 dl 1556638311 ref 1 fl Interpret:/0/ffffffff rc 0/-1 Apr 30 08:31:45 fir-md1-s1 kernel: LustreError: 101983:0:(tgt_handler.c:525:tgt_filter_recovery_request()) Skipped 3 previous similar messages Apr 30 08:31:58 fir-md1-s1 kernel: Lustre: fir-MDT0002: Connection restored to db3bd310-8795-965d-3dea-b9fd8474f855 (at 10.0.10.101@o2ib7) Apr 30 08:31:58 fir-md1-s1 kernel: Lustre: Skipped 3645 previous similar messages Apr 30 08:32:02 fir-md1-s1 kernel: LustreError: 102201:0:(tgt_handler.c:525:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff9846563acb00 x1631600706336656/t0(0) o601->fir-MDT0000-lwp-OST0004_UUID@10.0.10.101@o2ib7:8/0 lens 336/0 e 0 to 0 dl 1556638328 ref 1 fl Interpret:/0/ffffffff rc 0/-1 Apr 30 08:32:02 fir-md1-s1 kernel: LustreError: 101699:0:(tgt_handler.c:525:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff984ca6f15d00 x1631600706336640/t0(0) o601->fir-MDT0000-lwp-OST0004_UUID@10.0.10.101@o2ib7:8/0 lens 336/0 e 0 to 0 dl 1556638328 ref 1 fl Interpret:/0/ffffffff rc 0/-1 Apr 30 08:32:02 fir-md1-s1 kernel: LustreError: 101699:0:(tgt_handler.c:525:tgt_filter_recovery_request()) Skipped 955 previous similar messages Apr 30 08:32:04 fir-md1-s1 kernel: Lustre: fir-MDT0000: Denying connection for new client e6faa00b-070f-4d22-51ac-e59042b5a00c(at 10.8.12.33@o2ib6), waiting for 1334 known clients (1239 recovered, 87 in progress, and 0 evicted) already passed deadline 0:39 Apr 30 08:32:04 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 30 08:32:17 fir-md1-s1 kernel: Lustre: fir-MDT0000: Recovery already passed deadline 0:53. 
If you do not want to wait more, please abort the recovery by force. Apr 30 08:32:17 fir-md1-s1 kernel: Lustre: fir-MDT0000: Recovery over after 0:53, of 1334 clients 1334 recovered and 0 were evicted. Apr 30 08:32:18 fir-md1-s1 kernel: Lustre: fir-MDT0002: Recovery over after 0:51, of 1334 clients 1334 recovered and 0 were evicted. Apr 30 08:36:16 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client a3a1deb5-f17f-91e3-ffe6-f046415da924 (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff985cc7bd6800, cur 1556638576 expire 1556638426 last 1556638349 Apr 30 08:36:19 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client e6faa00b-070f-4d22-51ac-e59042b5a00c (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff98378ba86800, cur 1556638579 expire 1556638429 last 1556638352 Apr 30 08:37:24 fir-md1-s1 kernel: Lustre: MGS: Connection restored to a3a1deb5-f17f-91e3-ffe6-f046415da924 (at 10.8.12.33@o2ib6) Apr 30 08:37:24 fir-md1-s1 kernel: Lustre: Skipped 73 previous similar messages Apr 30 08:41:11 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client a97fecac-5576-d099-2d6d-ba3e8ee376e1 (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff984cabf02c00, cur 1556638871 expire 1556638721 last 1556638644 Apr 30 08:41:11 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 30 08:41:14 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client d443b2d2-ef37-1815-642c-90bcf5846a13 (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff983c7a3d0000, cur 1556638874 expire 1556638724 last 1556638647 Apr 30 08:42:19 fir-md1-s1 kernel: Lustre: MGS: Connection restored to a3a1deb5-f17f-91e3-ffe6-f046415da924 (at 10.8.12.33@o2ib6) Apr 30 08:42:19 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 08:46:06 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client ed3d9ca0-da54-f70a-e918-7d51b4894506 (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9827afa23000, cur 1556639166 expire 1556639016 last 1556638939 Apr 30 08:46:06 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 30 08:47:15 fir-md1-s1 kernel: Lustre: MGS: Connection restored to a3a1deb5-f17f-91e3-ffe6-f046415da924 (at 10.8.12.33@o2ib6) Apr 30 08:47:15 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 08:51:02 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client 054d8096-224c-690f-3549-b7034c5669d6 (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff985c70a54c00, cur 1556639462 expire 1556639312 last 1556639235 Apr 30 08:51:02 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 08:52:23 fir-md1-s1 kernel: Lustre: 102380:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556639536/real 1556639536] req@ffff985a6332e000 x1632253444072512/t0(0) o104->fir-MDT0000@10.8.10.29@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1556639543 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 Apr 30 08:52:30 fir-md1-s1 kernel: Lustre: 102380:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556639543/real 1556639543] req@ffff985a6332e000 x1632253444072512/t0(0) o104->fir-MDT0000@10.8.10.29@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1556639550 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 Apr 30 08:52:31 fir-md1-s1 kernel: Lustre: 102606:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff984c85e48300 x1631625331264144/t0(0) o101->a1b347d2-59e9-5de2-6e29-00884610c229@10.8.25.22@o2ib6:6/0 lens 592/3264 e 1 to 0 dl 1556639556 ref 2 fl Interpret:/0/0 rc 0/0 Apr 30 08:52:32 fir-md1-s1 kernel: Lustre: 102420:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff982b9900c500 x1631776836655760/t0(0) o101->74b6aa20-82e6-c3c9-1fe9-141d2e6b56e2@10.8.26.30@o2ib6:6/0 lens 592/3264 e 1 to 0 dl 1556639556 ref 2 fl Interpret:/0/0 rc 0/0 Apr 30 08:52:32 fir-md1-s1 kernel: Lustre: 102420:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 2 previous similar messages Apr 30 08:52:33 fir-md1-s1 kernel: Lustre: 101920:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff982b9900e900 x1631496285003632/t0(0) o101->53803d2a-ea9e-0335-702a-3d9daed0d916@10.8.22.17@o2ib6:7/0 lens 592/3264 e 1 to 0 dl 1556639557 ref 2 fl Interpret:/0/0 rc 0/0 Apr 30 08:52:33 fir-md1-s1 kernel: Lustre: 
101920:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message Apr 30 08:52:37 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client 0b49eccd-cda4-7bac-8560-4f28415786a3 (at 10.9.0.62@o2ib4) reconnecting Apr 30 08:52:37 fir-md1-s1 kernel: Lustre: 102380:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556639550/real 1556639550] req@ffff985a6332e000 x1632253444072512/t0(0) o104->fir-MDT0000@10.8.10.29@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1556639557 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 Apr 30 08:52:38 fir-md1-s1 kernel: Lustre: 102530:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff982b96b5b600 x1631695967233984/t0(0) o101->acb1aa3b-60ab-7f7c-ec38-03838117cd24@10.8.25.12@o2ib6:13/0 lens 592/3264 e 1 to 0 dl 1556639563 ref 2 fl Interpret:/0/0 rc 0/0 Apr 30 08:52:38 fir-md1-s1 kernel: Lustre: 102530:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message Apr 30 08:52:38 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client 53803d2a-ea9e-0335-702a-3d9daed0d916 (at 10.8.22.17@o2ib6) reconnecting Apr 30 08:52:38 fir-md1-s1 kernel: Lustre: Skipped 3 previous similar messages Apr 30 08:52:42 fir-md1-s1 kernel: Lustre: 102590:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff983c3e7d4b00 x1631795301549552/t0(0) o101->7b47d238-4d96-3180-1efb-43deab0e7ece@10.8.24.19@o2ib6:17/0 lens 592/3264 e 1 to 0 dl 1556639567 ref 2 fl Interpret:/0/0 rc 0/0 Apr 30 08:52:44 fir-md1-s1 kernel: Lustre: 102380:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556639557/real 1556639557] req@ffff985a6332e000 x1632253444072512/t0(0) o104->fir-MDT0000@10.8.10.29@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1556639564 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 Apr 30 08:52:44 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client 
acb1aa3b-60ab-7f7c-ec38-03838117cd24 (at 10.8.25.12@o2ib6) reconnecting Apr 30 08:52:44 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 30 08:52:48 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client 7b47d238-4d96-3180-1efb-43deab0e7ece (at 10.8.24.19@o2ib6) reconnecting Apr 30 08:52:50 fir-md1-s1 kernel: Lustre: 102370:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff982b66e8bc50 x1631686241233568/t0(0) o101->e069e613-f413-14c2-adc9-8bb2c0565535@10.8.20.30@o2ib6:25/0 lens 592/3264 e 1 to 0 dl 1556639575 ref 2 fl Interpret:/0/0 rc 0/0 Apr 30 08:52:50 fir-md1-s1 kernel: Lustre: 102370:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message Apr 30 08:52:51 fir-md1-s1 kernel: Lustre: 102380:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556639564/real 1556639564] req@ffff985a6332e000 x1632253444072512/t0(0) o104->fir-MDT0000@10.8.10.29@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1556639571 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 Apr 30 08:52:53 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client b7ad475a-9cfb-ec6c-4413-22a589606837 (at 10.8.11.3@o2ib6) reconnecting Apr 30 08:53:05 fir-md1-s1 kernel: Lustre: 102380:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556639578/real 1556639578] req@ffff985a6332e000 x1632253444072512/t0(0) o104->fir-MDT0000@10.8.10.29@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1556639585 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 Apr 30 08:53:05 fir-md1-s1 kernel: Lustre: 102380:0:(client.c:2132:ptlrpc_expire_one_request()) Skipped 1 previous similar message Apr 30 08:53:05 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client acb1aa3b-60ab-7f7c-ec38-03838117cd24 (at 10.8.25.12@o2ib6) reconnecting Apr 30 08:53:05 fir-md1-s1 kernel: Lustre: Skipped 9 previous similar messages Apr 30 08:53:17 fir-md1-s1 kernel: Lustre: 102420:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ 
Couldn't add any time (5/-5), not sending early reply req@ffff982acaebec00 x1631730847137280/t0(0) o101->d4d733ff-8d4b-d8de-bbc6-b5ae7cc529ba@10.8.12.35@o2ib6:22/0 lens 592/3264 e 0 to 0 dl 1556639602 ref 2 fl Interpret:/0/0 rc 0/0 Apr 30 08:53:17 fir-md1-s1 kernel: Lustre: 102420:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 5 previous similar messages Apr 30 08:53:23 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client d4d733ff-8d4b-d8de-bbc6-b5ae7cc529ba (at 10.8.12.35@o2ib6) reconnecting Apr 30 08:53:23 fir-md1-s1 kernel: Lustre: Skipped 14 previous similar messages Apr 30 08:53:26 fir-md1-s1 kernel: Lustre: 102380:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556639599/real 1556639599] req@ffff985a6332e000 x1632253444072512/t0(0) o104->fir-MDT0000@10.8.10.29@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1556639606 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 Apr 30 08:53:26 fir-md1-s1 kernel: Lustre: 102380:0:(client.c:2132:ptlrpc_expire_one_request()) Skipped 2 previous similar messages Apr 30 08:53:46 fir-md1-s1 kernel: LustreError: 102394:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1556639536, 90s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff983cf8ffad00/0xce8853847d1bd099 lrc: 3/1,0 mode: --/PR res: [0x20001e803:0x12c:0x0].0x0 bits 0x13/0x0 rrc: 29 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 102394 timeout: 0 lvb_type: 0 Apr 30 08:53:46 fir-md1-s1 kernel: LustreError: dumping log to /tmp/lustre-log.1556639626.102394 Apr 30 08:53:47 fir-md1-s1 kernel: LustreError: 102571:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1556639537, 90s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff984c67232ac0/0xce8853847d1d0931 lrc: 3/1,0 mode: --/PR res: [0x20001e803:0x12c:0x0].0x0 bits 0x13/0x0 rrc: 29 type: IBT 
flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 102571 timeout: 0 lvb_type: 0 Apr 30 08:53:47 fir-md1-s1 kernel: LustreError: 102571:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 3 previous similar messages Apr 30 08:53:53 fir-md1-s1 kernel: LustreError: 102532:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1556639543, 90s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff982cd00bb840/0xce8853847d27cb6b lrc: 3/1,0 mode: --/PR res: [0x20001e803:0x12c:0x0].0x0 bits 0x13/0x0 rrc: 33 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 102532 timeout: 0 lvb_type: 0 Apr 30 08:53:53 fir-md1-s1 kernel: LustreError: 102532:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 1 previous similar message Apr 30 08:53:56 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client b7ad475a-9cfb-ec6c-4413-22a589606837 (at 10.8.11.3@o2ib6) reconnecting Apr 30 08:53:56 fir-md1-s1 kernel: Lustre: Skipped 27 previous similar messages Apr 30 08:53:56 fir-md1-s1 kernel: Lustre: 102702:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-5), not sending early reply req@ffff983cc0a06000 x1631684958101408/t0(0) o101->8455dbbf-4366-afd6-29b8-dc2a91bfd5f9@10.8.11.12@o2ib6:1/0 lens 592/3264 e 0 to 0 dl 1556639641 ref 2 fl Interpret:/0/0 rc 0/0 Apr 30 08:53:56 fir-md1-s1 kernel: Lustre: 102702:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 6 previous similar messages Apr 30 08:53:57 fir-md1-s1 kernel: LustreError: 102403:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1556639547, 90s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff983ca864d100/0xce8853847d2e6669 lrc: 3/1,0 mode: --/PR res: [0x20001e803:0x12c:0x0].0x0 bits 0x13/0x0 rrc: 34 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 102403 timeout: 
0 lvb_type: 0 Apr 30 08:54:01 fir-md1-s1 kernel: Lustre: 102380:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556639634/real 1556639634] req@ffff985a6332e000 x1632253444072512/t0(0) o104->fir-MDT0000@10.8.10.29@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1556639641 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 Apr 30 08:54:01 fir-md1-s1 kernel: Lustre: 102380:0:(client.c:2132:ptlrpc_expire_one_request()) Skipped 4 previous similar messages Apr 30 08:54:02 fir-md1-s1 kernel: LustreError: 102606:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1556639552, 90s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff984c83b29200/0xce8853847d370a85 lrc: 3/1,0 mode: --/PR res: [0x20001e803:0x12c:0x0].0x0 bits 0x13/0x0 rrc: 35 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 102606 timeout: 0 lvb_type: 0 Apr 30 08:54:17 fir-md1-s1 kernel: LustreError: 102649:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1556639567, 90s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff983c2c2cd580/0xce8853847d4fe404 lrc: 3/1,0 mode: --/PR res: [0x20001e803:0x12c:0x0].0x0 bits 0x13/0x0 rrc: 37 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 102649 timeout: 0 lvb_type: 0 Apr 30 08:54:17 fir-md1-s1 kernel: LustreError: 102649:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 3 previous similar messages Apr 30 08:54:34 fir-md1-s1 kernel: LustreError: 102488:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1556639584, 90s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff982816623600/0xce8853847d6c2565 lrc: 3/1,0 mode: --/PR res: [0x20001e803:0x12c:0x0].0x0 bits 0x13/0x0 rrc: 40 type: IBT flags: 0x40210000000000 nid: local remote: 
0x0 expref: -99 pid: 102488 timeout: 0 lvb_type: 0 Apr 30 08:54:34 fir-md1-s1 kernel: LustreError: 102488:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 5 previous similar messages Apr 30 08:54:51 fir-md1-s1 kernel: LustreError: 102380:0:(ldlm_lockd.c:682:ldlm_handle_ast_error()) ### client (nid 10.8.10.29@o2ib6) failed to reply to blocking AST (req@ffff985a6332e000 x1632253444072512 status 0 rc -110), evict it ns: mdt-fir-MDT0000_UUID lock: ffff9835c2218240/0xce885384746ac067 lrc: 4/0,0 mode: PR/PR res: [0x20001e803:0x12c:0x0].0x0 bits 0x13/0x0 rrc: 41 type: IBT flags: 0x60200400000020 nid: 10.8.10.29@o2ib6 remote: 0xe520aafcd65657d3 expref: 75 pid: 102611 timeout: 165104 lvb_type: 0 Apr 30 08:54:51 fir-md1-s1 kernel: LustreError: 138-a: fir-MDT0000: A client on nid 10.8.10.29@o2ib6 was evicted due to a lock blocking callback time out: rc -110 Apr 30 08:54:51 fir-md1-s1 kernel: LustreError: 101500:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 155s: evicting client at 10.8.10.29@o2ib6 ns: mdt-fir-MDT0000_UUID lock: ffff9835c2218240/0xce885384746ac067 lrc: 3/0,0 mode: PR/PR res: [0x20001e803:0x12c:0x0].0x0 bits 0x13/0x0 rrc: 41 type: IBT flags: 0x60200400000020 nid: 10.8.10.29@o2ib6 remote: 0xe520aafcd65657d3 expref: 76 pid: 102611 timeout: 0 lvb_type: 0 Apr 30 08:54:51 fir-md1-s1 kernel: Lustre: 102532:0:(service.c:2165:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (147:1s); client may timeout. req@ffff982b96b5b600 x1631695967233984/t0(0) o101->acb1aa3b-60ab-7f7c-ec38-03838117cd24@10.8.25.12@o2ib6:13/0 lens 592/536 e 1 to 0 dl 1556639690 ref 1 fl Complete:/0/0 rc 0/0 Apr 30 08:54:51 fir-md1-s1 kernel: Lustre: 102532:0:(service.c:2165:ptlrpc_server_handle_request()) Skipped 1 previous similar message Apr 30 08:54:52 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client c61a7e99-2daf-0d4e-4d5e-683948263561 (at 10.8.12.33@o2ib6) in 158 seconds. I think it's dead, and I am evicting it. 
exp ffff984bfeb77c00, cur 1556639692 expire 1556639542 last 1556639534 Apr 30 08:54:52 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 08:57:02 fir-md1-s1 kernel: Lustre: MGS: Connection restored to a3a1deb5-f17f-91e3-ffe6-f046415da924 (at 10.8.12.33@o2ib6) Apr 30 08:57:02 fir-md1-s1 kernel: Lustre: Skipped 133 previous similar messages Apr 30 08:58:11 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client e8c62915-6c71-4c14-7626-f3601bfc0f0f (at 10.8.1.3@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff985953787400, cur 1556639891 expire 1556639741 last 1556639664 Apr 30 08:58:11 fir-md1-s1 kernel: Lustre: Skipped 4 previous similar messages Apr 30 09:00:49 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client c3ea7cff-a2b9-6fc8-d2a8-f427f845bf09 (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff985cf2eb2c00, cur 1556640049 expire 1556639899 last 1556639822 Apr 30 09:00:49 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 09:05:47 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client d7a74b87-bc65-826f-a750-c8a80cc8482a (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff985cd48f0c00, cur 1556640347 expire 1556640197 last 1556640120 Apr 30 09:05:47 fir-md1-s1 kernel: Lustre: Skipped 11 previous similar messages Apr 30 09:27:10 fir-md1-s1 kernel: Lustre: MGS: Connection restored to a3a1deb5-f17f-91e3-ffe6-f046415da924 (at 10.8.12.33@o2ib6) Apr 30 09:27:10 fir-md1-s1 kernel: Lustre: Skipped 17 previous similar messages Apr 30 09:28:51 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client b0015eb5-6efa-a3bc-bfd9-109e877d2725 (at 10.8.11.7@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff9859f2201000, cur 1556641731 expire 1556641581 last 1556641504 Apr 30 09:28:51 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 09:29:05 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.1.3@o2ib6) Apr 30 09:29:05 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 09:30:07 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 3eee9c96-c1b6-76d2-95c6-7eafc5882ffc (at 10.8.12.33@o2ib6) in 174 seconds. I think it's dead, and I am evicting it. exp ffff983c4e38b000, cur 1556641807 expire 1556641657 last 1556641633 Apr 30 09:30:07 fir-md1-s1 kernel: Lustre: Skipped 11 previous similar messages Apr 30 09:34:40 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client 5f3e56aa-c133-8c57-1770-33b09b5d7956 (at 10.8.12.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff984c02f49400, cur 1556642080 expire 1556641930 last 1556641853 Apr 30 09:34:40 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 10:02:53 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client e80f6b46-7bcd-30a8-8491-3102d8ee0aa0 (at 10.8.25.9@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff984709677000, cur 1556643773 expire 1556643623 last 1556643546 Apr 30 10:02:53 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 10:02:59 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 296150f1-2afd-83d5-ffa7-39277d8476e6 (at 10.8.25.9@o2ib6) Apr 30 10:02:59 fir-md1-s1 kernel: Lustre: Skipped 17 previous similar messages Apr 30 11:40:42 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 6f8d180c-697b-87fe-2c39-64e5c1d542ef (at 10.8.26.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff984cf8671800, cur 1556649642 expire 1556649492 last 1556649415 Apr 30 11:40:42 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 11:41:17 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 627e76d3-c7eb-8d97-a64e-cb771e022cc0 (at 10.8.26.33@o2ib6) Apr 30 11:41:17 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 11:54:44 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client e4faccdb-f303-9bdd-51a6-ad7a646ae559 (at 10.8.26.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff982cb4a1a400, cur 1556650484 expire 1556650334 last 1556650257 Apr 30 11:54:44 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 11:54:58 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client e4faccdb-f303-9bdd-51a6-ad7a646ae559 (at 10.8.26.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff98375420c400, cur 1556650498 expire 1556650348 last 1556650271 Apr 30 11:54:58 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 30 11:55:35 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 627e76d3-c7eb-8d97-a64e-cb771e022cc0 (at 10.8.26.33@o2ib6) Apr 30 11:55:35 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 12:09:42 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 4b625669-b570-bf89-cdc9-b22aede67358 (at 10.8.10.29@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff982cd197ec00, cur 1556651382 expire 1556651232 last 1556651155 Apr 30 12:12:04 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.10.29@o2ib6) Apr 30 12:12:04 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 12:20:15 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 327c28a1-fe51-b704-2bde-95368e501f01 (at 10.8.1.29@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff985cfb382c00, cur 1556652015 expire 1556651865 last 1556651788 Apr 30 12:20:15 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 12:22:08 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.1.29@o2ib6) Apr 30 12:22:08 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 12:24:14 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client 459dc0b2-5f7f-24eb-f6a1-6e1030b48b5c (at 10.8.10.29@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9858c6edb000, cur 1556652254 expire 1556652104 last 1556652027 Apr 30 12:24:14 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 12:24:16 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 459dc0b2-5f7f-24eb-f6a1-6e1030b48b5c (at 10.8.10.29@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff983cf7852800, cur 1556652256 expire 1556652106 last 1556652029 Apr 30 12:32:36 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.10.29@o2ib6) Apr 30 12:32:36 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 12:41:02 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client acdb8e1f-3ab2-f130-36a6-60883f4fd9c7 (at 10.8.10.29@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff983cc6f8c400, cur 1556653262 expire 1556653112 last 1556653035 Apr 30 12:41:02 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 30 12:42:03 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.10.29@o2ib6) Apr 30 12:42:03 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 13:03:48 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client f4319f02-25fa-20b6-a648-2516dd1744d4 (at 10.8.10.29@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff983c4af49800, cur 1556654628 expire 1556654478 last 1556654401 Apr 30 13:03:48 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 13:06:28 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.10.29@o2ib6) Apr 30 13:06:28 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 13:12:24 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 2df8c138-8c23-ea09-17c8-c9239f9279b9 (at 10.8.10.29@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff982b28a92800, cur 1556655144 expire 1556654994 last 1556654917 Apr 30 13:12:24 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 13:14:07 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.13.24@o2ib6) Apr 30 13:14:07 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 13:14:58 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 919649d8-704c-889b-d1dd-a296af8855ee (at 10.8.13.24@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff985c307d1800, cur 1556655298 expire 1556655148 last 1556655071 Apr 30 13:14:58 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 13:37:58 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.10.29@o2ib6) Apr 30 13:37:58 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 13:51:25 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client d9eb9aeb-03ec-18ce-b78d-5769086dc54d (at 10.8.10.29@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff985c3ca43400, cur 1556657485 expire 1556657335 last 1556657258 Apr 30 13:51:25 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 13:54:03 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.10.29@o2ib6) Apr 30 13:54:03 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 14:02:54 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 4115a52d-9eff-7ac8-6fc7-05e10e61ece9 (at 10.8.10.29@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff983c3b71ec00, cur 1556658174 expire 1556658024 last 1556657947 Apr 30 14:02:54 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 14:06:47 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.10.29@o2ib6) Apr 30 14:06:47 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 14:16:53 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client cb2147ca-63ad-9be6-4549-5a3714e7a68f (at 10.8.10.29@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff983bb7820400, cur 1556659013 expire 1556658863 last 1556658786 Apr 30 14:16:53 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 14:21:21 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.10.29@o2ib6) Apr 30 14:21:21 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 14:32:43 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 49903e14-e267-be84-fd5c-bda9815f9fe4 (at 10.8.10.29@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff984be9741000, cur 1556659963 expire 1556659813 last 1556659736 Apr 30 14:32:43 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 14:36:40 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.10.29@o2ib6) Apr 30 14:36:40 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 15:12:15 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client ec01363a-b910-254e-075d-e7f3e6df1606 (at 10.8.10.29@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff98376638dc00, cur 1556662335 expire 1556662185 last 1556662108 Apr 30 15:12:15 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 15:12:30 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client ec01363a-b910-254e-075d-e7f3e6df1606 (at 10.8.10.29@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff984ca8a6c800, cur 1556662350 expire 1556662200 last 1556662123 Apr 30 15:12:30 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 30 15:17:38 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.10.29@o2ib6) Apr 30 15:17:38 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 15:17:39 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to (at 10.8.10.29@o2ib6) Apr 30 15:17:39 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 30 15:39:03 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client a078cd0f-7e7e-03be-ddc4-775ce28fae96 (at 10.8.21.21@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9859b7ac6c00, cur 1556663943 expire 1556663793 last 1556663716 Apr 30 15:39:15 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client a078cd0f-7e7e-03be-ddc4-775ce28fae96 (at 10.8.21.21@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff985cf8216800, cur 1556663955 expire 1556663805 last 1556663728 Apr 30 15:39:15 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 30 15:39:39 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 93524e03-763b-9556-15d0-9c57c97e51dd (at 10.8.21.21@o2ib6) Apr 30 15:39:39 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to 93524e03-763b-9556-15d0-9c57c97e51dd (at 10.8.21.21@o2ib6) Apr 30 15:39:39 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message Apr 30 16:19:42 fir-md1-s1 kernel: LNetError: 101314:0:(lib-msg.c:811:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (-125, 0) Apr 30 16:20:13 fir-md1-s1 kernel: Lustre: fir-MDT0002: Client 1113817e-df5c-b4f9-94ce-f33b62a1b499 (at 10.8.8.26@o2ib6) reconnecting Apr 30 16:20:13 fir-md1-s1 kernel: Lustre: Skipped 66 previous similar messages Apr 30 16:20:13 fir-md1-s1 kernel: Lustre: fir-MDT0002: Connection restored to 520279e1-9bd2-c069-cb21-1e3d5370b323 (at 10.8.8.26@o2ib6) Apr 30 16:36:39 fir-md1-s1 kernel: LNetError: 101320:0:(lib-msg.c:811:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (-125, 0) Apr 30 16:36:46 fir-md1-s1 kernel: Lustre: fir-MDT0002: Client 7cc0f019-7fa6-17b1-76f1-8ecb3c84ba82 (at 10.8.27.24@o2ib6) reconnecting Apr 30 16:36:46 fir-md1-s1 kernel: Lustre: fir-MDT0002: Connection restored to (at 10.8.27.24@o2ib6) Apr 30 16:42:51 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client 1d5157f9-7efa-61ef-c6f9-b1db29ae7243 (at 10.8.21.21@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff984a85b69800, cur 1556667771 expire 1556667621 last 1556667544 Apr 30 16:43:04 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 93524e03-763b-9556-15d0-9c57c97e51dd (at 10.8.21.21@o2ib6) Apr 30 16:46:46 fir-md1-s1 kernel: LNetError: 101310:0:(lib-msg.c:811:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (-125, 0) Apr 30 16:47:17 fir-md1-s1 kernel: Lustre: fir-MDT0002: Client 7cc0f019-7fa6-17b1-76f1-8ecb3c84ba82 (at 10.8.27.24@o2ib6) reconnecting Apr 30 16:47:17 fir-md1-s1 kernel: Lustre: fir-MDT0002: Connection restored to (at 10.8.27.24@o2ib6) Apr 30 16:47:17 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 17:05:52 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client 53385b7b-a550-b1a8-0abe-3b8ac836eb95 (at 10.8.10.20@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff985cb6f23800, cur 1556669152 expire 1556669002 last 1556668925 Apr 30 17:05:52 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages Apr 30 17:08:02 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.10.20@o2ib6) Apr 30 22:33:23 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 6fab2fb5-26e6-7b9a-b3d9-fd518701970b (at 10.8.14.8@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff98465aa59000, cur 1556688803 expire 1556688653 last 1556688576 Apr 30 22:33:23 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages May 01 08:20:20 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 6b2f9741-e509-4243-058a-e7872e15cb5c (at 10.8.1.31@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff984cf8677c00, cur 1556724020 expire 1556723870 last 1556723793 May 01 08:20:20 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages May 01 08:22:30 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.1.31@o2ib6) May 01 08:22:30 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages May 01 08:39:12 fir-md1-s1 kernel: Lustre: MGS: Connection restored to d64a3116-c9b2-082e-3250-a9e5dffa1cb1 (at 10.8.14.5@o2ib6) May 01 08:39:12 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages May 01 08:39:26 fir-md1-s1 kernel: Lustre: MGS: Connection restored to cc6b3c33-4b63-0ce5-d300-e50e75e32d79 (at 10.8.13.23@o2ib6) May 01 08:39:26 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages May 01 08:39:31 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.13.24@o2ib6) May 01 08:39:31 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages May 01 08:55:32 fir-md1-s1 kernel: Lustre: MGS: Connection restored to a3a1deb5-f17f-91e3-ffe6-f046415da924 (at 10.8.12.33@o2ib6) May 01 08:55:32 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages May 01 10:10:43 fir-md1-s1 kernel: Lustre: 102595:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556730636/real 1556730636] req@ffff9838e8673c00 x1632253887884704/t0(0) o104->fir-MDT0000@10.8.27.23@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1556730643 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 May 01 10:10:43 fir-md1-s1 kernel: Lustre: 102595:0:(client.c:2132:ptlrpc_expire_one_request()) Skipped 7 previous similar messages May 01 10:10:51 fir-md1-s1 kernel: Lustre: 102371:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff983cd98bc800 x1631565217852128/t0(0) o101->7cc0f019-7fa6-17b1-76f1-8ecb3c84ba82@10.8.27.24@o2ib6:26/0 lens 1800/3288 e 1 to 0 dl 1556730656 ref 2 fl Interpret:/0/0 rc 0/0 May 01 10:10:51 fir-md1-s1 kernel: Lustre: 
102371:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 11 previous similar messages May 01 10:10:57 fir-md1-s1 kernel: Lustre: 102595:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556730650/real 1556730650] req@ffff9838e8673c00 x1632253887884704/t0(0) o104->fir-MDT0000@10.8.27.23@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1556730657 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 May 01 10:10:57 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client 7cc0f019-7fa6-17b1-76f1-8ecb3c84ba82 (at 10.8.27.24@o2ib6) reconnecting May 01 10:10:57 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to (at 10.8.27.24@o2ib6) May 01 10:10:57 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages May 01 10:10:57 fir-md1-s1 kernel: Lustre: 102595:0:(client.c:2132:ptlrpc_expire_one_request()) Skipped 1 previous similar message May 01 10:11:11 fir-md1-s1 kernel: LustreError: 102595:0:(ldlm_lockd.c:682:ldlm_handle_ast_error()) ### client (nid 10.8.27.23@o2ib6) failed to reply to blocking AST (req@ffff9838e8673c00 x1632253887884704 status 0 rc -110), evict it ns: mdt-fir-MDT0000_UUID lock: ffff983c72eb6e40/0xce8853875de4f9ed lrc: 4/0,0 mode: PR/PR res: [0x20000560a:0xaeb:0x0].0x0 bits 0x13/0x0 rrc: 7 type: IBT flags: 0x60200400000020 nid: 10.8.27.23@o2ib6 remote: 0xb5891fec25bff6c1 expref: 157 pid: 102522 timeout: 255965 lvb_type: 0 May 01 10:11:11 fir-md1-s1 kernel: LustreError: 138-a: fir-MDT0000: A client on nid 10.8.27.23@o2ib6 was evicted due to a lock blocking callback time out: rc -110 May 01 10:11:11 fir-md1-s1 kernel: LustreError: 101500:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 35s: evicting client at 10.8.27.23@o2ib6 ns: mdt-fir-MDT0000_UUID lock: ffff983c72eb6e40/0xce8853875de4f9ed lrc: 3/0,0 mode: PR/PR res: [0x20000560a:0xaeb:0x0].0x0 bits 0x13/0x0 rrc: 7 type: IBT flags: 0x60200400000020 nid: 10.8.27.23@o2ib6 remote: 0xb5891fec25bff6c1 expref: 158 pid: 102522 timeout: 0 lvb_type: 0 May 
01 10:14:05 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client c002d779-213f-8764-b0ce-a364b557d98d (at 10.8.27.23@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff985c736d7800, cur 1556730845 expire 1556730695 last 1556730618 May 01 10:14:05 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages May 01 10:14:23 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.27.23@o2ib6) May 01 10:44:51 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 0840d825-5e1d-ab09-748d-b5fef372f47f (at 10.8.21.21@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9847076dc800, cur 1556732691 expire 1556732541 last 1556732464 May 01 10:44:51 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message May 01 10:45:25 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 93524e03-763b-9556-15d0-9c57c97e51dd (at 10.8.21.21@o2ib6) May 01 10:45:25 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages May 01 11:35:59 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client c7c1e785-7484-b308-7dc5-6b63513d6220 (at 10.8.27.23@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9849f4ee0400, cur 1556735759 expire 1556735609 last 1556735532 May 01 11:35:59 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages May 01 11:36:04 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client c7c1e785-7484-b308-7dc5-6b63513d6220 (at 10.8.27.23@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff983c6a2efc00, cur 1556735764 expire 1556735614 last 1556735537 May 01 11:36:04 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message May 01 11:36:15 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.27.23@o2ib6) May 01 11:36:15 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages May 01 11:55:58 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client 84153925-7318-d597-37bf-61264542eb58 (at 10.8.27.23@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff982aaeb11000, cur 1556736958 expire 1556736808 last 1556736731 May 01 11:56:13 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 84153925-7318-d597-37bf-61264542eb58 (at 10.8.27.23@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff98387bf7e000, cur 1556736973 expire 1556736823 last 1556736746 May 01 11:56:13 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message May 01 11:56:23 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.27.23@o2ib6) May 01 11:56:23 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages May 01 14:10:36 fir-md1-s1 kernel: LNetError: 101316:0:(lib-msg.c:811:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5) May 01 18:06:52 fir-md1-s1 kernel: Lustre: 102768:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556759205/real 1556759205] req@ffff9823e870f800 x1632254373081920/t0(0) o104->fir-MDT0000@10.8.27.23@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1556759212 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 May 01 18:06:52 fir-md1-s1 kernel: Lustre: 102768:0:(client.c:2132:ptlrpc_expire_one_request()) Skipped 2 previous similar messages May 01 18:06:59 fir-md1-s1 kernel: Lustre: 102768:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556759212/real 1556759212] req@ffff9823e870f800 x1632254373081920/t0(0) o104->fir-MDT0000@10.8.27.23@o2ib6:15/16 
lens 296/224 e 0 to 1 dl 1556759219 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 May 01 18:07:06 fir-md1-s1 kernel: Lustre: 102768:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556759219/real 1556759219] req@ffff9823e870f800 x1632254373081920/t0(0) o104->fir-MDT0000@10.8.27.23@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1556759226 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 May 01 18:07:10 fir-md1-s1 kernel: Lustre: 102546:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-5), not sending early reply req@ffff982a2fe51b00 x1631565224469472/t0(0) o101->7cc0f019-7fa6-17b1-76f1-8ecb3c84ba82@10.8.27.24@o2ib6:15/0 lens 1792/3288 e 0 to 0 dl 1556759235 ref 2 fl Interpret:/0/0 rc 0/0 May 01 18:07:16 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client 7cc0f019-7fa6-17b1-76f1-8ecb3c84ba82 (at 10.8.27.24@o2ib6) reconnecting May 01 18:07:16 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to (at 10.8.27.24@o2ib6) May 01 18:07:16 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages May 01 18:07:17 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client dfe61200-863e-32be-7d68-5233540a9762 (at 10.8.27.23@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9847f0addc00, cur 1556759237 expire 1556759087 last 1556759010 May 01 18:07:20 fir-md1-s1 kernel: Lustre: 102768:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556759233/real 1556759233] req@ffff9823e870f800 x1632254373081920/t0(0) o104->fir-MDT0000@10.8.27.23@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1556759240 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 May 01 18:07:20 fir-md1-s1 kernel: Lustre: 102768:0:(client.c:2132:ptlrpc_expire_one_request()) Skipped 1 previous similar message May 01 18:07:30 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client dfe61200-863e-32be-7d68-5233540a9762 (at 10.8.27.23@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff9837f3b95400, cur 1556759250 expire 1556759100 last 1556759023
May 01 18:07:30 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message
May 01 18:07:40 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.27.23@o2ib6)
May 01 18:11:54 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 7592b62b-cd74-82e4-03cd-75fb5e0a226b (at 10.8.14.9@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9848d1a6a400, cur 1556759514 expire 1556759364 last 1556759287
May 01 18:11:57 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client 7592b62b-cd74-82e4-03cd-75fb5e0a226b (at 10.8.14.9@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff985cabafdc00, cur 1556759517 expire 1556759367 last 1556759290
May 01 18:37:56 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.27.23@o2ib6)
May 01 18:37:56 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages
May 01 18:38:41 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client 4cd88215-e667-298c-fb54-c17c8301efbb (at 10.8.27.23@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff98289c311c00, cur 1556761121 expire 1556760971 last 1556760894
May 01 18:38:41 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message
May 01 18:38:58 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 4cd88215-e667-298c-fb54-c17c8301efbb (at 10.8.27.23@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff98363a726800, cur 1556761138 expire 1556760988 last 1556760911
May 01 18:38:58 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message
May 01 18:43:16 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.14.9@o2ib6)
May 01 18:43:16 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages
May 01 18:51:21 fir-md1-s1 kernel: Lustre: MGS: haven't heard from client f6dec251-9455-2db5-0c0d-4b1d5c39f7f8 (at 10.8.21.21@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff984ac166f800, cur 1556761881 expire 1556761731 last 1556761654
May 01 18:51:34 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client dd447c0e-bf16-d0be-6449-bf36e688df99 (at 10.8.21.21@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff984ac176b800, cur 1556761894 expire 1556761744 last 1556761667
May 01 18:51:34 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message
May 01 18:51:55 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 93524e03-763b-9556-15d0-9c57c97e51dd (at 10.8.21.21@o2ib6)
May 01 18:51:55 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages
May 01 19:04:32 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client fe35d843-20f4-288f-d8db-52dd32b58570 (at 10.8.21.21@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9847c9778c00, cur 1556762672 expire 1556762522 last 1556762445
May 01 19:04:50 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 93524e03-763b-9556-15d0-9c57c97e51dd (at 10.8.21.21@o2ib6)
May 01 19:04:50 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages
May 01 19:08:53 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client 64b566c2-ebb5-7da0-af60-514dba7cee07 (at 10.8.27.23@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff982c15ad5800, cur 1556762933 expire 1556762783 last 1556762706
May 01 19:08:53 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages
May 01 19:09:15 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 64b566c2-ebb5-7da0-af60-514dba7cee07 (at 10.8.27.23@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff983229267800, cur 1556762955 expire 1556762805 last 1556762728
May 01 19:09:15 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message
May 01 19:09:25 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.27.23@o2ib6)
May 01 19:09:25 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages
May 01 19:15:45 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client 2acff163-453f-6866-ca52-3be787a802e5 (at 10.8.27.23@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff982b3630f000, cur 1556763345 expire 1556763195 last 1556763118
May 01 19:15:52 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.27.23@o2ib6)
May 01 19:15:52 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages
May 01 19:20:47 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client ef0ca740-68f8-0d2f-af07-d739c91e59f6 (at 10.8.21.21@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff982b520d3400, cur 1556763647 expire 1556763497 last 1556763420
May 01 19:20:47 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages
May 01 19:21:10 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 93524e03-763b-9556-15d0-9c57c97e51dd (at 10.8.21.21@o2ib6)
May 01 19:21:10 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages
May 01 19:33:55 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client b169cfff-999c-3e08-edaf-bc412cfb2b0a (at 10.8.27.23@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff983938efa400, cur 1556764435 expire 1556764285 last 1556764208
May 01 19:33:55 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages
May 01 19:34:10 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client b169cfff-999c-3e08-edaf-bc412cfb2b0a (at 10.8.27.23@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff983571eb7c00, cur 1556764450 expire 1556764300 last 1556764223
May 01 19:34:10 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message
May 01 19:34:21 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.27.23@o2ib6)
May 01 19:34:21 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages
May 01 21:00:45 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client 6667e1fb-9e5d-8122-f716-8d2ca6b880cd (at 10.8.21.21@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff983222a73800, cur 1556769645 expire 1556769495 last 1556769418
May 01 21:02:38 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 93524e03-763b-9556-15d0-9c57c97e51dd (at 10.8.21.21@o2ib6)
May 01 21:02:38 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages
May 01 21:36:58 fir-md1-s1 kernel: Lustre: fir-MDT0000: haven't heard from client 5242334c-3a63-f428-27e9-84a9b8569357 (at 10.8.21.21@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff983880323c00, cur 1556771818 expire 1556771668 last 1556771591
May 01 21:36:58 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages
May 01 21:37:03 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client 5242334c-3a63-f428-27e9-84a9b8569357 (at 10.8.21.21@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff983880321000, cur 1556771823 expire 1556771673 last 1556771596
May 01 21:37:03 fir-md1-s1 kernel: Lustre: Skipped 1 previous similar message
May 01 21:37:20 fir-md1-s1 kernel: Lustre: MGS: Connection restored to 93524e03-763b-9556-15d0-9c57c97e51dd (at 10.8.21.21@o2ib6)
May 01 21:37:20 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages
May 01 22:18:16 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client 90fd09f3-1e4c-d89d-b1ef-509c9c50dd06 (at 10.8.9.8@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff985cf57e4800, cur 1556774296 expire 1556774146 last 1556774069
May 01 22:18:23 fir-md1-s1 kernel: Lustre: MGS: Connection restored to (at 10.8.9.8@o2ib6)
May 01 22:18:23 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages
May 01 23:31:38 fir-md1-s1 kernel: Lustre: fir-MDT0002: haven't heard from client e2a571ed-a09d-5b66-3666-df63bf8e2019 (at 10.8.10.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff985ce3d65c00, cur 1556778698 expire 1556778548 last 1556778471
May 01 23:31:38 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages
May 01 23:40:35 fir-md1-s1 kernel: Lustre: 102710:0:(mdd_device.c:1794:mdd_changelog_clear()) fir-MDD0002: Failure to clear the changelog for user 1: -22
May 01 23:42:05 fir-md1-s1 kernel: list passed to list_sort() too long for efficiency
May 01 23:42:19 fir-md1-s1 kernel: Lustre: 101920:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff9821b459bf00 x1632078159890016/t0(0) o101->6b6c38b2-4fd6-8d37-9524-d4553bfeb828@10.0.10.3@o2ib7:23/0 lens 600/3264 e 1 to 0 dl 1556779343 ref 2 fl Interpret:/0/0 rc 0/0
May 01 23:42:21 fir-md1-s1 kernel: Lustre: 102970:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff983a39ba1050 x1631585929788784/t0(0) o4->16749711-2a27-479b-83fc-14b2199ba6af@10.9.104.18@o2ib4:26/0 lens 8680/448 e 1 to 0 dl 1556779346 ref 2 fl Interpret:/0/0 rc 0/0
May 01 23:42:23 fir-md1-s1 kernel: Lustre: 102763:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff983180f6c800 x1631534633692496/t0(0) o101->f3bba4e8-9568-4001-6257-88537741a8c9@10.8.29.3@o2ib6:28/0 lens 1768/3288 e 1 to 0 dl 1556779348 ref 2 fl Interpret:/0/0 rc 0/0
May 01 23:42:23 fir-md1-s1 kernel: Lustre: 102763:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 3 previous similar messages
May 01 23:42:24 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client 6b6c38b2-4fd6-8d37-9524-d4553bfeb828 (at 10.0.10.3@o2ib7) reconnecting
May 01 23:42:24 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to (at 10.0.10.3@o2ib7)
May 01 23:42:24 fir-md1-s1 kernel: Lustre: Skipped 2 previous similar messages
May 01 23:42:25 fir-md1-s1 kernel: Lustre: 102778:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff9831f9a69b00 x1631604149634912/t0(0) o101->fir-MDT0000-lwp-OST002e_UUID@10.0.10.107@o2ib7:0/0 lens 456/496 e 1 to 0 dl 1556779350 ref 2 fl Interpret:/0/0 rc 0/0
May 01 23:42:25 fir-md1-s1 kernel: Lustre: 102778:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 67 previous similar messages
May 01 23:42:27 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client 16749711-2a27-479b-83fc-14b2199ba6af (at 10.9.104.18@o2ib4) reconnecting
May 01 23:42:27 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to (at 10.9.104.18@o2ib4)
May 01 23:42:29 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client d4242da5-5a9c-4508-f9da-c1e7f36347f4 (at 10.9.114.4@o2ib4) reconnecting
May 01 23:42:29 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to (at 10.9.114.4@o2ib4)
May 01 23:42:29 fir-md1-s1 kernel: Lustre: fir-MDT0000: Received new LWP connection from 10.0.10.52@o2ib7, removing former export from same NID
May 01 23:42:29 fir-md1-s1 kernel: Lustre: 102822:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff9857f819c200 x1631604139115328/t0(0) o101->fir-MDT0000-lwp-OST0016_UUID@10.0.10.103@o2ib7:4/0 lens 456/496 e 1 to 0 dl 1556779354 ref 2 fl Interpret:/0/0 rc 0/0
May 01 23:42:29 fir-md1-s1 kernel: Lustre: 102822:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 45 previous similar messages
May 01 23:42:31 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client fir-MDT0000-lwp-OST0026_UUID (at 10.0.10.107@o2ib7) reconnecting
May 01 23:42:31 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to d3dcd8ee-7913-062f-8514-9178ef53d789 (at 10.0.10.107@o2ib7)
May 01 23:42:31 fir-md1-s1 kernel: Lustre: Skipped 30 previous similar messages
May 01 23:42:31 fir-md1-s1 kernel: Lustre: Skipped 30 previous similar messages
May 01 23:42:31 fir-md1-s1 kernel: Lustre: 102963:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556779330/real 1556779330] req@ffff98239efdb900 x1632254603866608/t0(0) o601->fir-MDT0000-lwp-MDT0002@0@lo:23/10 lens 336/336 e 1 to 1 dl 1556779351 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
May 01 23:42:31 fir-md1-s1 kernel: Lustre: 102963:0:(client.c:2132:ptlrpc_expire_one_request()) Skipped 1 previous similar message
May 01 23:42:31 fir-md1-s1 kernel: Lustre: fir-MDT0000-lwp-MDT0002: Connection to fir-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
May 01 23:42:31 fir-md1-s1 kernel: Lustre: fir-MDT0000: Received new LWP connection from 0@lo, removing former export from same NID
May 01 23:42:36 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 23s! [mdt_io00_057:103101]
May 01 23:42:36 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#9 stuck for 23s!
[mdt_io01_029:102923] May 01 23:42:36 fir-md1-s1 kernel: Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) ldiskfs(OE) lustre(OE) lmv(OE) mdc(OE) osc(OE) lov(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) May 01 23:42:36 fir-md1-s1 kernel: Modules linked in: May 01 23:42:36 fir-md1-s1 kernel: osp(OE) May 01 23:42:36 fir-md1-s1 kernel: mdd(OE) May 01 23:42:36 fir-md1-s1 kernel: lod(OE) May 01 23:42:36 fir-md1-s1 kernel: mdt(OE) May 01 23:42:36 fir-md1-s1 kernel: lfsck(OE) May 01 23:42:36 fir-md1-s1 kernel: mgs(OE) May 01 23:42:36 fir-md1-s1 kernel: mgc(OE) May 01 23:42:36 fir-md1-s1 kernel: osd_ldiskfs(OE) May 01 23:42:36 fir-md1-s1 kernel: lquota(OE) May 01 23:42:36 fir-md1-s1 kernel: ldiskfs(OE) May 01 23:42:36 fir-md1-s1 kernel: lustre(OE) May 01 23:42:36 fir-md1-s1 kernel: lmv(OE) May 01 23:42:36 fir-md1-s1 kernel: mdc(OE) May 01 23:42:36 fir-md1-s1 kernel: osc(OE) May 01 23:42:36 fir-md1-s1 kernel: lov(OE) May 01 23:42:36 fir-md1-s1 kernel: fid(OE) May 01 23:42:36 fir-md1-s1 kernel: fld(OE) May 01 23:42:36 fir-md1-s1 kernel: ko2iblnd(OE) May 01 23:42:36 fir-md1-s1 kernel: ptlrpc(OE) May 01 23:42:36 fir-md1-s1 kernel: obdclass(OE) May 01 23:42:36 fir-md1-s1 kernel: lnet(OE) May 01 23:42:36 fir-md1-s1 kernel: libcfs(OE) May 01 23:42:36 fir-md1-s1 kernel: rpcsec_gss_krb5 May 01 23:42:36 fir-md1-s1 kernel: auth_rpcgss May 01 23:42:36 fir-md1-s1 kernel: nfsv4 May 01 23:42:36 fir-md1-s1 kernel: dns_resolver May 01 23:42:36 fir-md1-s1 kernel: nfs May 01 23:42:36 fir-md1-s1 kernel: lockd May 01 23:42:36 fir-md1-s1 kernel: grace May 01 23:42:36 fir-md1-s1 kernel: fscache May 01 23:42:36 fir-md1-s1 kernel: rdma_ucm(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_ucm(OE) May 01 23:42:36 fir-md1-s1 kernel: rdma_cm(OE) May 01 23:42:36 fir-md1-s1 
kernel: iw_cm(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_ipoib(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_cm(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_umad(OE) May 01 23:42:36 fir-md1-s1 kernel: mlx5_fpga_tools(OE) May 01 23:42:36 fir-md1-s1 kernel: mlx4_en(OE) May 01 23:42:36 fir-md1-s1 kernel: mlx4_ib(OE) May 01 23:42:36 fir-md1-s1 kernel: mlx4_core(OE) May 01 23:42:36 fir-md1-s1 kernel: dell_rbu May 01 23:42:36 fir-md1-s1 kernel: sunrpc May 01 23:42:36 fir-md1-s1 kernel: vfat May 01 23:42:36 fir-md1-s1 kernel: fat May 01 23:42:36 fir-md1-s1 kernel: dm_round_robin May 01 23:42:36 fir-md1-s1 kernel: amd64_edac_mod May 01 23:42:36 fir-md1-s1 kernel: edac_mce_amd May 01 23:42:36 fir-md1-s1 kernel: kvm_amd May 01 23:42:36 fir-md1-s1 kernel: kvm May 01 23:42:36 fir-md1-s1 kernel: ses May 01 23:42:36 fir-md1-s1 kernel: irqbypass May 01 23:42:36 fir-md1-s1 kernel: crc32_pclmul May 01 23:42:36 fir-md1-s1 kernel: enclosure May 01 23:42:36 fir-md1-s1 kernel: ghash_clmulni_intel May 01 23:42:36 fir-md1-s1 kernel: dcdbas May 01 23:42:36 fir-md1-s1 kernel: aesni_intel May 01 23:42:36 fir-md1-s1 kernel: lrw May 01 23:42:36 fir-md1-s1 kernel: gf128mul May 01 23:42:36 fir-md1-s1 kernel: glue_helper May 01 23:42:36 fir-md1-s1 kernel: ablk_helper May 01 23:42:36 fir-md1-s1 kernel: cryptd May 01 23:42:36 fir-md1-s1 kernel: ipmi_si May 01 23:42:36 fir-md1-s1 kernel: pcspkr May 01 23:42:36 fir-md1-s1 kernel: ipmi_devintf May 01 23:42:36 fir-md1-s1 kernel: ccp May 01 23:42:36 fir-md1-s1 kernel: i2c_piix4 May 01 23:42:36 fir-md1-s1 kernel: dm_multipath May 01 23:42:36 fir-md1-s1 kernel: sg May 01 23:42:36 fir-md1-s1 kernel: k10temp May 01 23:42:36 fir-md1-s1 kernel: ipmi_msghandler May 01 23:42:36 fir-md1-s1 kernel: dm_mod May 01 23:42:36 fir-md1-s1 kernel: acpi_power_meter May 01 23:42:36 fir-md1-s1 kernel: knem(OE) May 01 23:42:36 fir-md1-s1 kernel: ip_tables May 01 23:42:36 fir-md1-s1 kernel: ext4 May 01 23:42:36 fir-md1-s1 kernel: mbcache May 01 23:42:36 fir-md1-s1 kernel: jbd2 May 
01 23:42:36 fir-md1-s1 kernel: sd_mod May 01 23:42:36 fir-md1-s1 kernel: crc_t10dif May 01 23:42:36 fir-md1-s1 kernel: crct10dif_generic May 01 23:42:36 fir-md1-s1 kernel: mlx5_ib(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_uverbs(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_core(OE) May 01 23:42:36 fir-md1-s1 kernel: i2c_algo_bit May 01 23:42:36 fir-md1-s1 kernel: drm_kms_helper May 01 23:42:36 fir-md1-s1 kernel: mlx5_core(OE) May 01 23:42:36 fir-md1-s1 kernel: syscopyarea May 01 23:42:36 fir-md1-s1 kernel: sysfillrect May 01 23:42:36 fir-md1-s1 kernel: sysimgblt May 01 23:42:36 fir-md1-s1 kernel: fb_sys_fops May 01 23:42:36 fir-md1-s1 kernel: mlxfw(OE) May 01 23:42:36 fir-md1-s1 kernel: crct10dif_pclmul May 01 23:42:36 fir-md1-s1 kernel: ttm May 01 23:42:36 fir-md1-s1 kernel: devlink May 01 23:42:36 fir-md1-s1 kernel: ahci May 01 23:42:36 fir-md1-s1 kernel: crct10dif_common May 01 23:42:36 fir-md1-s1 kernel: libahci May 01 23:42:36 fir-md1-s1 kernel: drm May 01 23:42:36 fir-md1-s1 kernel: mlx_compat(OE) May 01 23:42:36 fir-md1-s1 kernel: tg3 May 01 23:42:36 fir-md1-s1 kernel: crc32c_intel May 01 23:42:36 fir-md1-s1 kernel: libata May 01 23:42:36 fir-md1-s1 kernel: megaraid_sas May 01 23:42:36 fir-md1-s1 kernel: drm_panel_orientation_quirks May 01 23:42:36 fir-md1-s1 kernel: ptp May 01 23:42:36 fir-md1-s1 kernel: pps_core May 01 23:42:36 fir-md1-s1 kernel: mpt3sas(OE) May 01 23:42:36 fir-md1-s1 kernel: raid_class May 01 23:42:36 fir-md1-s1 kernel: scsi_transport_sas May 01 23:42:36 fir-md1-s1 kernel: [last unloaded: libcfs] May 01 23:42:36 fir-md1-s1 kernel: May 01 23:42:36 fir-md1-s1 kernel: CPU: 9 PID: 102923 Comm: mdt_io01_029 Kdump: loaded Tainted: G OE ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:42:36 fir-md1-s1 kernel: Hardware name: Dell Inc. 
PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018
May 01 23:42:36 fir-md1-s1 kernel: task: ffff985c7cfbd140 ti: ffff985cbe4d8000 task.ti: ffff985cbe4d8000
May 01 23:42:36 fir-md1-s1 kernel: RIP: 0010:[]
May 01 23:42:36 fir-md1-s1 kernel: [] native_queued_spin_lock_slowpath+0x15e/0x200
May 01 23:42:36 fir-md1-s1 kernel: RSP: 0018:ffff985cbe4db800 EFLAGS: 00000212
May 01 23:42:36 fir-md1-s1 kernel: RAX: 0000000000000101 RBX: ffff983165105ac0 RCX: 0000000000490000
May 01 23:42:36 fir-md1-s1 kernel: RDX: 0000000000110101 RSI: 0000000000000101 RDI: ffff982c9fc8c480
May 01 23:42:36 fir-md1-s1 kernel: RBP: ffff985cbe4db800 R08: ffff983cff69b780 R09: 0000000000000000
May 01 23:42:36 fir-md1-s1 kernel: R10: ffff983cff69f140 R11: ffffde3f18cce000 R12: 0000000000000000
May 01 23:42:36 fir-md1-s1 kernel: R13: ffff985cbe4db7a0 R14: ffff983165105830 R15: 0000000000000000
May 01 23:42:36 fir-md1-s1 kernel: FS: 00007fddcbed4880(0000) GS:ffff983cff680000(0000) knlGS:0000000000000000
May 01 23:42:36 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 01 23:42:36 fir-md1-s1 kernel: CR2: 00007f50e83bb000 CR3: 00000015323e8000 CR4: 00000000003407e0
May 01 23:42:36 fir-md1-s1 kernel: Call Trace:
May 01 23:42:36 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf
May 01 23:42:36 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30
May 01 23:42:36 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs]
May 01 23:42:36 fir-md1-s1 kernel: [] ldiskfs_ext_map_blocks+0x7b5/0xf60 [ldiskfs]
May 01 23:42:36 fir-md1-s1 kernel: [] ? ktime_get+0x52/0xe0
May 01 23:42:36 fir-md1-s1 kernel: [] ? kiblnd_check_sends_locked+0xa72/0xe40 [ko2iblnd]
May 01 23:42:36 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x98/0x700 [ldiskfs]
May 01 23:42:36 fir-md1-s1 kernel: [] ? ktime_get_ts64+0x52/0xf0
May 01 23:42:36 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs]
May 01 23:42:36 fir-md1-s1 kernel: [] osd_write_prep+0x2b6/0x360 [osd_ldiskfs]
May 01 23:42:36 fir-md1-s1 kernel: [] mdt_obd_preprw+0x637/0x1060 [mdt]
May 01 23:42:36 fir-md1-s1 kernel: [] tgt_brw_write+0xc7e/0x1a90 [ptlrpc]
May 01 23:42:36 fir-md1-s1 kernel: [] ? lustre_msg_buf_v2+0x1b0/0x1b0 [ptlrpc]
May 01 23:42:36 fir-md1-s1 kernel: [] ? lustre_msg_buf+0x17/0x60 [ptlrpc]
May 01 23:42:36 fir-md1-s1 kernel: [] ? update_curr+0x14c/0x1e0
May 01 23:42:36 fir-md1-s1 kernel: [] ? account_entity_dequeue+0xae/0xd0
May 01 23:42:36 fir-md1-s1 kernel: [] ? __enqueue_entity+0x78/0x80
May 01 23:42:36 fir-md1-s1 kernel: [] ? tgt_lookup_reply+0x2d/0x190 [ptlrpc]
May 01 23:42:36 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
May 01 23:42:36 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
May 01 23:42:36 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
May 01 23:42:36 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
May 01 23:42:36 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
May 01 23:42:36 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20
May 01 23:42:36 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90
May 01 23:42:36 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc]
May 01 23:42:36 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
May 01 23:42:36 fir-md1-s1 kernel: [] kthread+0xd1/0xe0
May 01 23:42:36 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40
May 01 23:42:36 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21
May 01 23:42:36 fir-md1-s1 kernel: [] ?
insert_kthread_work+0x40/0x40 May 01 23:42:36 fir-md1-s1 kernel: Code: May 01 23:42:36 fir-md1-s1 kernel: 0f May 01 23:42:36 fir-md1-s1 kernel: 18 May 01 23:42:36 fir-md1-s1 kernel: 09 May 01 23:42:36 fir-md1-s1 kernel: 8b May 01 23:42:36 fir-md1-s1 kernel: 17 May 01 23:42:36 fir-md1-s1 kernel: 0f May 01 23:42:36 fir-md1-s1 kernel: b7 May 01 23:42:36 fir-md1-s1 kernel: c2 May 01 23:42:36 fir-md1-s1 kernel: 85 May 01 23:42:36 fir-md1-s1 kernel: c0 May 01 23:42:36 fir-md1-s1 kernel: 74 May 01 23:42:36 fir-md1-s1 kernel: 21 May 01 23:42:36 fir-md1-s1 kernel: 83 May 01 23:42:36 fir-md1-s1 kernel: f8 May 01 23:42:36 fir-md1-s1 kernel: 03 May 01 23:42:36 fir-md1-s1 kernel: 75 May 01 23:42:36 fir-md1-s1 kernel: 10 May 01 23:42:36 fir-md1-s1 kernel: eb May 01 23:42:36 fir-md1-s1 kernel: 1a May 01 23:42:36 fir-md1-s1 kernel: 66 May 01 23:42:36 fir-md1-s1 kernel: 2e May 01 23:42:36 fir-md1-s1 kernel: 0f May 01 23:42:36 fir-md1-s1 kernel: 1f May 01 23:42:36 fir-md1-s1 kernel: 84 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: 85 May 01 23:42:36 fir-md1-s1 kernel: c0 May 01 23:42:36 fir-md1-s1 kernel: 74 May 01 23:42:36 fir-md1-s1 kernel: 0c May 01 23:42:36 fir-md1-s1 kernel: f3 May 01 23:42:36 fir-md1-s1 kernel: 90 May 01 23:42:36 fir-md1-s1 kernel: 8b May 01 23:42:36 fir-md1-s1 kernel: 17 May 01 23:42:36 fir-md1-s1 kernel: 0f May 01 23:42:36 fir-md1-s1 kernel: b7 May 01 23:42:36 fir-md1-s1 kernel: c2 May 01 23:42:36 fir-md1-s1 kernel: 83 May 01 23:42:36 fir-md1-s1 kernel: f8 May 01 23:42:36 fir-md1-s1 kernel: 03 May 01 23:42:36 fir-md1-s1 kernel: <75> May 01 23:42:36 fir-md1-s1 kernel: f0 May 01 23:42:36 fir-md1-s1 kernel: be May 01 23:42:36 fir-md1-s1 kernel: 01 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 
kernel: eb May 01 23:42:36 fir-md1-s1 kernel: 15 May 01 23:42:36 fir-md1-s1 kernel: 66 May 01 23:42:36 fir-md1-s1 kernel: 0f May 01 23:42:36 fir-md1-s1 kernel: 1f May 01 23:42:36 fir-md1-s1 kernel: 84 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: 89 May 01 23:42:36 fir-md1-s1 kernel: d0 May 01 23:42:36 fir-md1-s1 kernel: f0 May 01 23:42:36 fir-md1-s1 kernel: May 01 23:42:36 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [mdt_io00_073:103263] May 01 23:42:36 fir-md1-s1 kernel: Modules linked in: May 01 23:42:36 fir-md1-s1 kernel: osp(OE) May 01 23:42:36 fir-md1-s1 kernel: mdd(OE) May 01 23:42:36 fir-md1-s1 kernel: lod(OE) May 01 23:42:36 fir-md1-s1 kernel: mdt(OE) May 01 23:42:36 fir-md1-s1 kernel: lfsck(OE) May 01 23:42:36 fir-md1-s1 kernel: mgs(OE) May 01 23:42:36 fir-md1-s1 kernel: mgc(OE) May 01 23:42:36 fir-md1-s1 kernel: osd_ldiskfs(OE) May 01 23:42:36 fir-md1-s1 kernel: lquota(OE) May 01 23:42:36 fir-md1-s1 kernel: ldiskfs(OE) May 01 23:42:36 fir-md1-s1 kernel: lustre(OE) May 01 23:42:36 fir-md1-s1 kernel: lmv(OE) May 01 23:42:36 fir-md1-s1 kernel: mdc(OE) May 01 23:42:36 fir-md1-s1 kernel: osc(OE) May 01 23:42:36 fir-md1-s1 kernel: lov(OE) May 01 23:42:36 fir-md1-s1 kernel: fid(OE) May 01 23:42:36 fir-md1-s1 kernel: fld(OE) May 01 23:42:36 fir-md1-s1 kernel: ko2iblnd(OE) May 01 23:42:36 fir-md1-s1 kernel: ptlrpc(OE) May 01 23:42:36 fir-md1-s1 kernel: obdclass(OE) May 01 23:42:36 fir-md1-s1 kernel: lnet(OE) May 01 23:42:36 fir-md1-s1 kernel: libcfs(OE) May 01 23:42:36 fir-md1-s1 kernel: rpcsec_gss_krb5 May 01 23:42:36 fir-md1-s1 kernel: auth_rpcgss May 01 23:42:36 fir-md1-s1 kernel: nfsv4 May 01 23:42:36 fir-md1-s1 kernel: dns_resolver May 01 23:42:36 fir-md1-s1 kernel: nfs May 01 23:42:36 fir-md1-s1 kernel: lockd May 01 23:42:36 fir-md1-s1 kernel: grace 
May 01 23:42:36 fir-md1-s1 kernel: fscache May 01 23:42:36 fir-md1-s1 kernel: rdma_ucm(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_ucm(OE) May 01 23:42:36 fir-md1-s1 kernel: rdma_cm(OE) May 01 23:42:36 fir-md1-s1 kernel: iw_cm(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_ipoib(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_cm(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_umad(OE) May 01 23:42:36 fir-md1-s1 kernel: mlx5_fpga_tools(OE) May 01 23:42:36 fir-md1-s1 kernel: mlx4_en(OE) May 01 23:42:36 fir-md1-s1 kernel: mlx4_ib(OE) May 01 23:42:36 fir-md1-s1 kernel: mlx4_core(OE) May 01 23:42:36 fir-md1-s1 kernel: dell_rbu May 01 23:42:36 fir-md1-s1 kernel: sunrpc May 01 23:42:36 fir-md1-s1 kernel: vfat May 01 23:42:36 fir-md1-s1 kernel: fat May 01 23:42:36 fir-md1-s1 kernel: dm_round_robin May 01 23:42:36 fir-md1-s1 kernel: amd64_edac_mod May 01 23:42:36 fir-md1-s1 kernel: edac_mce_amd May 01 23:42:36 fir-md1-s1 kernel: kvm_amd May 01 23:42:36 fir-md1-s1 kernel: kvm May 01 23:42:36 fir-md1-s1 kernel: ses May 01 23:42:36 fir-md1-s1 kernel: irqbypass May 01 23:42:36 fir-md1-s1 kernel: crc32_pclmul May 01 23:42:36 fir-md1-s1 kernel: enclosure May 01 23:42:36 fir-md1-s1 kernel: ghash_clmulni_intel May 01 23:42:36 fir-md1-s1 kernel: dcdbas May 01 23:42:36 fir-md1-s1 kernel: aesni_intel May 01 23:42:36 fir-md1-s1 kernel: lrw May 01 23:42:36 fir-md1-s1 kernel: gf128mul May 01 23:42:36 fir-md1-s1 kernel: glue_helper May 01 23:42:36 fir-md1-s1 kernel: ablk_helper May 01 23:42:36 fir-md1-s1 kernel: cryptd May 01 23:42:36 fir-md1-s1 kernel: ipmi_si May 01 23:42:36 fir-md1-s1 kernel: pcspkr May 01 23:42:36 fir-md1-s1 kernel: ipmi_devintf May 01 23:42:36 fir-md1-s1 kernel: ccp May 01 23:42:36 fir-md1-s1 kernel: i2c_piix4 May 01 23:42:36 fir-md1-s1 kernel: dm_multipath May 01 23:42:36 fir-md1-s1 kernel: sg May 01 23:42:36 fir-md1-s1 kernel: k10temp May 01 23:42:36 fir-md1-s1 kernel: ipmi_msghandler May 01 23:42:36 fir-md1-s1 kernel: dm_mod May 01 23:42:36 fir-md1-s1 kernel: acpi_power_meter May 01 
23:42:36 fir-md1-s1 kernel: knem(OE) May 01 23:42:36 fir-md1-s1 kernel: ip_tables May 01 23:42:36 fir-md1-s1 kernel: ext4 May 01 23:42:36 fir-md1-s1 kernel: mbcache May 01 23:42:36 fir-md1-s1 kernel: jbd2 May 01 23:42:36 fir-md1-s1 kernel: sd_mod May 01 23:42:36 fir-md1-s1 kernel: crc_t10dif May 01 23:42:36 fir-md1-s1 kernel: crct10dif_generic May 01 23:42:36 fir-md1-s1 kernel: mlx5_ib(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_uverbs(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_core(OE) May 01 23:42:36 fir-md1-s1 kernel: i2c_algo_bit May 01 23:42:36 fir-md1-s1 kernel: drm_kms_helper May 01 23:42:36 fir-md1-s1 kernel: mlx5_core(OE) May 01 23:42:36 fir-md1-s1 kernel: syscopyarea May 01 23:42:36 fir-md1-s1 kernel: sysfillrect May 01 23:42:36 fir-md1-s1 kernel: sysimgblt May 01 23:42:36 fir-md1-s1 kernel: fb_sys_fops May 01 23:42:36 fir-md1-s1 kernel: mlxfw(OE) May 01 23:42:36 fir-md1-s1 kernel: crct10dif_pclmul May 01 23:42:36 fir-md1-s1 kernel: ttm May 01 23:42:36 fir-md1-s1 kernel: devlink May 01 23:42:36 fir-md1-s1 kernel: ahci May 01 23:42:36 fir-md1-s1 kernel: crct10dif_common May 01 23:42:36 fir-md1-s1 kernel: libahci May 01 23:42:36 fir-md1-s1 kernel: drm May 01 23:42:36 fir-md1-s1 kernel: mlx_compat(OE) May 01 23:42:36 fir-md1-s1 kernel: tg3 May 01 23:42:36 fir-md1-s1 kernel: crc32c_intel May 01 23:42:36 fir-md1-s1 kernel: libata May 01 23:42:36 fir-md1-s1 kernel: megaraid_sas May 01 23:42:36 fir-md1-s1 kernel: drm_panel_orientation_quirks May 01 23:42:36 fir-md1-s1 kernel: ptp May 01 23:42:36 fir-md1-s1 kernel: pps_core May 01 23:42:36 fir-md1-s1 kernel: mpt3sas(OE) May 01 23:42:36 fir-md1-s1 kernel: raid_class May 01 23:42:36 fir-md1-s1 kernel: scsi_transport_sas May 01 23:42:36 fir-md1-s1 kernel: [last unloaded: libcfs] May 01 23:42:36 fir-md1-s1 kernel: May 01 23:42:36 fir-md1-s1 kernel: CPU: 12 PID: 103263 Comm: mdt_io00_073 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:42:36 fir-md1-s1 kernel: Hardware name: Dell 
Inc. PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018
May 01 23:42:36 fir-md1-s1 kernel: task: ffff984cba23e180 ti: ffff98286afac000 task.ti: ffff98286afac000
May 01 23:42:36 fir-md1-s1 kernel: RIP: 0010:[]
May 01 23:42:36 fir-md1-s1 kernel: [] native_queued_spin_lock_slowpath+0x126/0x200
May 01 23:42:36 fir-md1-s1 kernel: RSP: 0018:ffff98286afaf750 EFLAGS: 00000246
May 01 23:42:36 fir-md1-s1 kernel: RAX: 0000000000000000 RBX: ffff9831739477d8 RCX: 0000000000610000
May 01 23:42:36 fir-md1-s1 kernel: RDX: ffff984cff81b780 RSI: 0000000001110101 RDI: ffff982c9fc8c480
May 01 23:42:36 fir-md1-s1 kernel: RBP: ffff98286afaf750 R08: ffff982cfeedb780 R09: 0000000000000000
May 01 23:42:36 fir-md1-s1 kernel: R10: ffff982cfeedf140 R11: ffffde3edb488200 R12: 0000000000000000
May 01 23:42:36 fir-md1-s1 kernel: R13: ffff98286afaf6f0 R14: ffff983173947548 R15: 0000000000000000
May 01 23:42:36 fir-md1-s1 kernel: FS: 00007fde62083880(0000) GS:ffff982cfeec0000(0000) knlGS:0000000000000000
May 01 23:42:36 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 01 23:42:36 fir-md1-s1 kernel: CR2: 00007f427f58b000 CR3: 000000203caa6000 CR4: 00000000003407e0
May 01 23:42:36 fir-md1-s1 kernel: Call Trace:
May 01 23:42:36 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf
May 01 23:42:36 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30
May 01 23:42:36 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs]
May 01 23:42:36 fir-md1-s1 kernel: [] ldiskfs_ext_map_blocks+0x7b5/0xf60 [ldiskfs]
May 01 23:42:36 fir-md1-s1 kernel: [] ? zone_statistics+0x88/0xa0
May 01 23:42:36 fir-md1-s1 kernel: [] ? qsd_op_begin+0xb1/0x4b0 [lquota]
May 01 23:42:36 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x98/0x700 [ldiskfs]
May 01 23:42:36 fir-md1-s1 kernel: [] ? ldiskfs_inode_attach_jinode+0x55/0xd0 [ldiskfs]
May 01 23:42:36 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs]
May 01 23:42:36 fir-md1-s1 kernel: [] osd_write_commit+0x3a2/0x8c0 [osd_ldiskfs]
May 01 23:42:36 fir-md1-s1 kernel: [] ? __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
May 01 23:42:36 fir-md1-s1 kernel: [] mdt_commitrw_write.isra.46+0x608/0xd20 [mdt]
May 01 23:42:36 fir-md1-s1 kernel: [] mdt_obd_commitrw+0x29b/0x520 [mdt]
May 01 23:42:36 fir-md1-s1 kernel: [] obd_commitrw+0x9c/0x370 [ptlrpc]
May 01 23:42:36 fir-md1-s1 kernel: [] tgt_brw_write+0x100d/0x1a90 [ptlrpc]
May 01 23:42:36 fir-md1-s1 kernel: [] ? lustre_msg_buf_v2+0x1b0/0x1b0 [ptlrpc]
May 01 23:42:36 fir-md1-s1 kernel: [] ? update_curr+0x14c/0x1e0
May 01 23:42:36 fir-md1-s1 kernel: [] ? tgt_lookup_reply+0x2d/0x190 [ptlrpc]
May 01 23:42:36 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
May 01 23:42:36 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
May 01 23:42:36 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
May 01 23:42:36 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
May 01 23:42:36 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
May 01 23:42:36 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20
May 01 23:42:36 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90
May 01 23:42:36 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc]
May 01 23:42:36 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
May 01 23:42:36 fir-md1-s1 kernel: [] kthread+0xd1/0xe0
May 01 23:42:36 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40
May 01 23:42:36 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21
May 01 23:42:36 fir-md1-s1 kernel: [] ?
insert_kthread_work+0x40/0x40 May 01 23:42:36 fir-md1-s1 kernel: Code: May 01 23:42:36 fir-md1-s1 kernel: 0d May 01 23:42:36 fir-md1-s1 kernel: 48 May 01 23:42:36 fir-md1-s1 kernel: 98 May 01 23:42:36 fir-md1-s1 kernel: 83 May 01 23:42:36 fir-md1-s1 kernel: e2 May 01 23:42:36 fir-md1-s1 kernel: 30 May 01 23:42:36 fir-md1-s1 kernel: 48 May 01 23:42:36 fir-md1-s1 kernel: 81 May 01 23:42:36 fir-md1-s1 kernel: c2 May 01 23:42:36 fir-md1-s1 kernel: 80 May 01 23:42:36 fir-md1-s1 kernel: b7 May 01 23:42:36 fir-md1-s1 kernel: 01 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: 48 May 01 23:42:36 fir-md1-s1 kernel: 03 May 01 23:42:36 fir-md1-s1 kernel: 14 May 01 23:42:36 fir-md1-s1 kernel: c5 May 01 23:42:36 fir-md1-s1 kernel: 60 May 01 23:42:36 fir-md1-s1 kernel: b9 May 01 23:42:36 fir-md1-s1 kernel: b4 May 01 23:42:36 fir-md1-s1 kernel: b7 May 01 23:42:36 fir-md1-s1 kernel: 4c May 01 23:42:36 fir-md1-s1 kernel: 89 May 01 23:42:36 fir-md1-s1 kernel: 02 May 01 23:42:36 fir-md1-s1 kernel: 41 May 01 23:42:36 fir-md1-s1 kernel: 8b May 01 23:42:36 fir-md1-s1 kernel: 40 May 01 23:42:36 fir-md1-s1 kernel: 08 May 01 23:42:36 fir-md1-s1 kernel: 85 May 01 23:42:36 fir-md1-s1 kernel: c0 May 01 23:42:36 fir-md1-s1 kernel: 75 May 01 23:42:36 fir-md1-s1 kernel: 0f May 01 23:42:36 fir-md1-s1 kernel: 0f May 01 23:42:36 fir-md1-s1 kernel: 1f May 01 23:42:36 fir-md1-s1 kernel: 44 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: f3 May 01 23:42:36 fir-md1-s1 kernel: 90 May 01 23:42:36 fir-md1-s1 kernel: 41 May 01 23:42:36 fir-md1-s1 kernel: 8b May 01 23:42:36 fir-md1-s1 kernel: 40 May 01 23:42:36 fir-md1-s1 kernel: 08 May 01 23:42:36 fir-md1-s1 kernel: <85> May 01 23:42:36 fir-md1-s1 kernel: c0 May 01 23:42:36 fir-md1-s1 kernel: 74 May 01 23:42:36 fir-md1-s1 kernel: f6 May 01 23:42:36 fir-md1-s1 kernel: 4d May 01 23:42:36 fir-md1-s1 kernel: 8b May 01 23:42:36 fir-md1-s1 kernel: 08 May 01 23:42:36 fir-md1-s1 
kernel: 4d May 01 23:42:36 fir-md1-s1 kernel: 85 May 01 23:42:36 fir-md1-s1 kernel: c9 May 01 23:42:36 fir-md1-s1 kernel: 74 May 01 23:42:36 fir-md1-s1 kernel: 04 May 01 23:42:36 fir-md1-s1 kernel: 41 May 01 23:42:36 fir-md1-s1 kernel: 0f May 01 23:42:36 fir-md1-s1 kernel: 18 May 01 23:42:36 fir-md1-s1 kernel: 09 May 01 23:42:36 fir-md1-s1 kernel: 8b May 01 23:42:36 fir-md1-s1 kernel: 17 May 01 23:42:36 fir-md1-s1 kernel: 0f May 01 23:42:36 fir-md1-s1 kernel: b7 May 01 23:42:36 fir-md1-s1 kernel: c2 May 01 23:42:36 fir-md1-s1 kernel: May 01 23:42:36 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#16 stuck for 23s! [mdt00_018:102388] May 01 23:42:36 fir-md1-s1 kernel: Modules linked in: May 01 23:42:36 fir-md1-s1 kernel: osp(OE) May 01 23:42:36 fir-md1-s1 kernel: mdd(OE) May 01 23:42:36 fir-md1-s1 kernel: lod(OE) May 01 23:42:36 fir-md1-s1 kernel: mdt(OE) May 01 23:42:36 fir-md1-s1 kernel: lfsck(OE) May 01 23:42:36 fir-md1-s1 kernel: mgs(OE) May 01 23:42:36 fir-md1-s1 kernel: mgc(OE) May 01 23:42:36 fir-md1-s1 kernel: osd_ldiskfs(OE) May 01 23:42:36 fir-md1-s1 kernel: lquota(OE) May 01 23:42:36 fir-md1-s1 kernel: ldiskfs(OE) May 01 23:42:36 fir-md1-s1 kernel: lustre(OE) May 01 23:42:36 fir-md1-s1 kernel: lmv(OE) May 01 23:42:36 fir-md1-s1 kernel: mdc(OE) May 01 23:42:36 fir-md1-s1 kernel: osc(OE) May 01 23:42:36 fir-md1-s1 kernel: lov(OE) May 01 23:42:36 fir-md1-s1 kernel: fid(OE) May 01 23:42:36 fir-md1-s1 kernel: fld(OE) May 01 23:42:36 fir-md1-s1 kernel: ko2iblnd(OE) May 01 23:42:36 fir-md1-s1 kernel: ptlrpc(OE) May 01 23:42:36 fir-md1-s1 kernel: obdclass(OE) May 01 23:42:36 fir-md1-s1 kernel: lnet(OE) May 01 23:42:36 fir-md1-s1 kernel: libcfs(OE) May 01 23:42:36 fir-md1-s1 kernel: rpcsec_gss_krb5 May 01 23:42:36 fir-md1-s1 kernel: auth_rpcgss May 01 23:42:36 fir-md1-s1 kernel: nfsv4 May 01 23:42:36 fir-md1-s1 kernel: dns_resolver May 01 23:42:36 fir-md1-s1 kernel: nfs May 01 23:42:36 fir-md1-s1 kernel: lockd May 01 23:42:36 fir-md1-s1 kernel: grace May 
01 23:42:36 fir-md1-s1 kernel: fscache May 01 23:42:36 fir-md1-s1 kernel: rdma_ucm(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_ucm(OE) May 01 23:42:36 fir-md1-s1 kernel: rdma_cm(OE) May 01 23:42:36 fir-md1-s1 kernel: iw_cm(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_ipoib(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_cm(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_umad(OE) May 01 23:42:36 fir-md1-s1 kernel: mlx5_fpga_tools(OE) May 01 23:42:36 fir-md1-s1 kernel: mlx4_en(OE) May 01 23:42:36 fir-md1-s1 kernel: mlx4_ib(OE) May 01 23:42:36 fir-md1-s1 kernel: mlx4_core(OE) May 01 23:42:36 fir-md1-s1 kernel: dell_rbu May 01 23:42:36 fir-md1-s1 kernel: sunrpc May 01 23:42:36 fir-md1-s1 kernel: vfat May 01 23:42:36 fir-md1-s1 kernel: fat May 01 23:42:36 fir-md1-s1 kernel: dm_round_robin May 01 23:42:36 fir-md1-s1 kernel: amd64_edac_mod May 01 23:42:36 fir-md1-s1 kernel: edac_mce_amd May 01 23:42:36 fir-md1-s1 kernel: kvm_amd May 01 23:42:36 fir-md1-s1 kernel: kvm May 01 23:42:36 fir-md1-s1 kernel: ses May 01 23:42:36 fir-md1-s1 kernel: irqbypass May 01 23:42:36 fir-md1-s1 kernel: crc32_pclmul May 01 23:42:36 fir-md1-s1 kernel: enclosure May 01 23:42:36 fir-md1-s1 kernel: ghash_clmulni_intel May 01 23:42:36 fir-md1-s1 kernel: dcdbas May 01 23:42:36 fir-md1-s1 kernel: aesni_intel May 01 23:42:36 fir-md1-s1 kernel: lrw May 01 23:42:36 fir-md1-s1 kernel: gf128mul May 01 23:42:36 fir-md1-s1 kernel: glue_helper May 01 23:42:36 fir-md1-s1 kernel: ablk_helper May 01 23:42:36 fir-md1-s1 kernel: cryptd May 01 23:42:36 fir-md1-s1 kernel: ipmi_si May 01 23:42:36 fir-md1-s1 kernel: pcspkr May 01 23:42:36 fir-md1-s1 kernel: ipmi_devintf May 01 23:42:36 fir-md1-s1 kernel: ccp May 01 23:42:36 fir-md1-s1 kernel: i2c_piix4 May 01 23:42:36 fir-md1-s1 kernel: dm_multipath May 01 23:42:36 fir-md1-s1 kernel: sg May 01 23:42:36 fir-md1-s1 kernel: k10temp May 01 23:42:36 fir-md1-s1 kernel: ipmi_msghandler May 01 23:42:36 fir-md1-s1 kernel: dm_mod May 01 23:42:36 fir-md1-s1 kernel: acpi_power_meter May 01 
23:42:36 fir-md1-s1 kernel: knem(OE) May 01 23:42:36 fir-md1-s1 kernel: ip_tables May 01 23:42:36 fir-md1-s1 kernel: ext4 May 01 23:42:36 fir-md1-s1 kernel: mbcache May 01 23:42:36 fir-md1-s1 kernel: jbd2 May 01 23:42:36 fir-md1-s1 kernel: sd_mod May 01 23:42:36 fir-md1-s1 kernel: crc_t10dif May 01 23:42:36 fir-md1-s1 kernel: crct10dif_generic May 01 23:42:36 fir-md1-s1 kernel: mlx5_ib(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_uverbs(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_core(OE) May 01 23:42:36 fir-md1-s1 kernel: i2c_algo_bit May 01 23:42:36 fir-md1-s1 kernel: drm_kms_helper May 01 23:42:36 fir-md1-s1 kernel: mlx5_core(OE) May 01 23:42:36 fir-md1-s1 kernel: syscopyarea May 01 23:42:36 fir-md1-s1 kernel: sysfillrect May 01 23:42:36 fir-md1-s1 kernel: sysimgblt May 01 23:42:36 fir-md1-s1 kernel: fb_sys_fops May 01 23:42:36 fir-md1-s1 kernel: mlxfw(OE) May 01 23:42:36 fir-md1-s1 kernel: crct10dif_pclmul May 01 23:42:36 fir-md1-s1 kernel: ttm May 01 23:42:36 fir-md1-s1 kernel: devlink May 01 23:42:36 fir-md1-s1 kernel: ahci May 01 23:42:36 fir-md1-s1 kernel: crct10dif_common May 01 23:42:36 fir-md1-s1 kernel: libahci May 01 23:42:36 fir-md1-s1 kernel: drm May 01 23:42:36 fir-md1-s1 kernel: mlx_compat(OE) May 01 23:42:36 fir-md1-s1 kernel: tg3 May 01 23:42:36 fir-md1-s1 kernel: crc32c_intel May 01 23:42:36 fir-md1-s1 kernel: libata May 01 23:42:36 fir-md1-s1 kernel: megaraid_sas May 01 23:42:36 fir-md1-s1 kernel: drm_panel_orientation_quirks May 01 23:42:36 fir-md1-s1 kernel: ptp May 01 23:42:36 fir-md1-s1 kernel: pps_core May 01 23:42:36 fir-md1-s1 kernel: mpt3sas(OE) May 01 23:42:36 fir-md1-s1 kernel: raid_class May 01 23:42:36 fir-md1-s1 kernel: scsi_transport_sas May 01 23:42:36 fir-md1-s1 kernel: [last unloaded: libcfs] May 01 23:42:36 fir-md1-s1 kernel: May 01 23:42:36 fir-md1-s1 kernel: CPU: 16 PID: 102388 Comm: mdt00_018 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:42:36 fir-md1-s1 kernel: Hardware name: Dell Inc. 
PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:42:36 fir-md1-s1 kernel: task: ffff985884642080 ti: ffff984c4b64c000 task.ti: ffff984c4b64c000 May 01 23:42:36 fir-md1-s1 kernel: RIP: 0010:[] May 01 23:42:36 fir-md1-s1 kernel: [] ldiskfs_inode_touch_time_cmp+0xd/0x90 [ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: RSP: 0018:ffff984c4b64f180 EFLAGS: 00000282 May 01 23:42:36 fir-md1-s1 kernel: RAX: 8000041400080000 RBX: ffffffffb7019f22 RCX: 000000010c9dde88 May 01 23:42:36 fir-md1-s1 kernel: RDX: ffff984b013d4600 RSI: ffff9836de9d5ac8 RDI: 0000000000000000 May 01 23:42:36 fir-md1-s1 kernel: RBP: ffff984c4b64f1d0 R08: ffff984c4b64f300 R09: 00000000003ecd00 May 01 23:42:36 fir-md1-s1 kernel: R10: 0000000047bdbb01 R11: ffffde3f261ef6c0 R12: 0000000000000000 May 01 23:42:36 fir-md1-s1 kernel: R13: ffff982d3f224c80 R14: ffff983cf8b38400 R15: ffff982cfef254b8 May 01 23:42:36 fir-md1-s1 kernel: FS: 00007f32ccf2c740(0000) GS:ffff982cfef00000(0000) knlGS:0000000000000000 May 01 23:42:36 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:42:36 fir-md1-s1 kernel: CR2: 00007f32c5fef140 CR3: 00000012f7610000 CR4: 00000000003407e0 May 01 23:42:36 fir-md1-s1 kernel: Call Trace: May 01 23:42:36 fir-md1-s1 kernel: [] ? merge+0x62/0xc0 May 01 23:42:36 fir-md1-s1 kernel: [] ? ldiskfs_init_inode_table+0x410/0x410 [ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: [] list_sort+0x9b/0x250 May 01 23:42:36 fir-md1-s1 kernel: [] __ldiskfs_es_shrink+0x1ce/0x2a0 [ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: [] ldiskfs_es_shrink+0xb4/0x130 [ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: [] shrink_slab+0x175/0x340 May 01 23:42:36 fir-md1-s1 kernel: [] ? zone_watermark_ok+0x1f/0x30 May 01 23:42:36 fir-md1-s1 kernel: [] ? compaction_suitable+0xa3/0xb0 May 01 23:42:36 fir-md1-s1 kernel: [] zone_reclaim+0x1d1/0x2f0 May 01 23:42:36 fir-md1-s1 kernel: [] get_page_from_freelist+0x87b/0xa70 May 01 23:42:36 fir-md1-s1 kernel: [] ? 
__getblk+0x2d/0x300 May 01 23:42:36 fir-md1-s1 kernel: [] __alloc_pages_nodemask+0x176/0x420 May 01 23:42:36 fir-md1-s1 kernel: [] alloc_pages_current+0x98/0x110 May 01 23:42:36 fir-md1-s1 kernel: [] new_slab+0x2c5/0x390 May 01 23:42:36 fir-md1-s1 kernel: [] ___slab_alloc+0x3ac/0x4f0 May 01 23:42:36 fir-md1-s1 kernel: [] ? osp_object_alloc+0x40/0x170 [osp] May 01 23:42:36 fir-md1-s1 kernel: [] ? fld_cache_lookup+0x36/0x1a0 [fld] May 01 23:42:36 fir-md1-s1 kernel: [] ? fld_local_lookup+0x62/0x270 [fld] May 01 23:42:36 fir-md1-s1 kernel: [] ? osp_object_alloc+0x40/0x170 [osp] May 01 23:42:36 fir-md1-s1 kernel: [] __slab_alloc+0x40/0x5c May 01 23:42:36 fir-md1-s1 kernel: [] kmem_cache_alloc+0x19b/0x1f0 May 01 23:42:36 fir-md1-s1 kernel: [] ? osp_object_alloc+0x40/0x170 [osp] May 01 23:42:36 fir-md1-s1 kernel: [] osp_object_alloc+0x40/0x170 [osp] May 01 23:42:36 fir-md1-s1 kernel: [] lod_object_init+0x1e7/0x3c0 [lod] May 01 23:42:36 fir-md1-s1 kernel: [] lu_object_alloc+0xe5/0x320 [obdclass] May 01 23:42:36 fir-md1-s1 kernel: [] lu_object_find_at+0x76/0x280 [obdclass] May 01 23:42:36 fir-md1-s1 kernel: [] lu_object_find_slice+0x1f/0x90 [obdclass] May 01 23:42:36 fir-md1-s1 kernel: [] mdd_object_find+0x10/0x70 [mdd] May 01 23:42:36 fir-md1-s1 kernel: [] obf_lookup+0x2c9/0x350 [mdd] May 01 23:42:36 fir-md1-s1 kernel: [] ? req_capsule_get_size+0x31/0x70 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] mdt_getattr_name_lock+0xf7c/0x1c30 [mdt] May 01 23:42:36 fir-md1-s1 kernel: [] ? lustre_msg_buf+0x17/0x60 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? __req_capsule_get+0x15f/0x740 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? lustre_msg_get_flags+0x2c/0xa0 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] mdt_intent_getattr+0x2b5/0x480 [mdt] May 01 23:42:36 fir-md1-s1 kernel: [] mdt_intent_policy+0x2e8/0xd00 [mdt] May 01 23:42:36 fir-md1-s1 kernel: [] ? 
mdt_intent_layout+0xcc0/0xcc0 [mdt] May 01 23:42:36 fir-md1-s1 kernel: [] ldlm_lock_enqueue+0x366/0xa60 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? cfs_hash_bd_add_locked+0x63/0x80 [libcfs] May 01 23:42:36 fir-md1-s1 kernel: [] ? cfs_hash_add+0xbe/0x1a0 [libcfs] May 01 23:42:36 fir-md1-s1 kernel: [] ldlm_handle_enqueue0+0xa47/0x15a0 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:42:36 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:42:36 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:42:36 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:42:36 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:42:36 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:42:36 fir-md1-s1 kernel: [] ? 
insert_kthread_work+0x40/0x40 May 01 23:42:36 fir-md1-s1 kernel: Code: May 01 23:42:36 fir-md1-s1 kernel: ff May 01 23:42:36 fir-md1-s1 kernel: 8d May 01 23:42:36 fir-md1-s1 kernel: 4a May 01 23:42:36 fir-md1-s1 kernel: 01 May 01 23:42:36 fir-md1-s1 kernel: 89 May 01 23:42:36 fir-md1-s1 kernel: d0 May 01 23:42:36 fir-md1-s1 kernel: f0 May 01 23:42:36 fir-md1-s1 kernel: 0f May 01 23:42:36 fir-md1-s1 kernel: b1 May 01 23:42:36 fir-md1-s1 kernel: 0f May 01 23:42:36 fir-md1-s1 kernel: 39 May 01 23:42:36 fir-md1-s1 kernel: d0 May 01 23:42:36 fir-md1-s1 kernel: 0f May 01 23:42:36 fir-md1-s1 kernel: 84 May 01 23:42:36 fir-md1-s1 kernel: fb May 01 23:42:36 fir-md1-s1 kernel: fd May 01 23:42:36 fir-md1-s1 kernel: ff May 01 23:42:36 fir-md1-s1 kernel: ff May 01 23:42:36 fir-md1-s1 kernel: 89 May 01 23:42:36 fir-md1-s1 kernel: c2 May 01 23:42:36 fir-md1-s1 kernel: eb May 01 23:42:36 fir-md1-s1 kernel: e2 May 01 23:42:36 fir-md1-s1 kernel: 0f May 01 23:42:36 fir-md1-s1 kernel: 1f May 01 23:42:36 fir-md1-s1 kernel: 84 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: 66 May 01 23:42:36 fir-md1-s1 kernel: 66 May 01 23:42:36 fir-md1-s1 kernel: 66 May 01 23:42:36 fir-md1-s1 kernel: 66 May 01 23:42:36 fir-md1-s1 kernel: 90 May 01 23:42:36 fir-md1-s1 kernel: 55 May 01 23:42:36 fir-md1-s1 kernel: 48 May 01 23:42:36 fir-md1-s1 kernel: 8b May 01 23:42:36 fir-md1-s1 kernel: 86 May 01 23:42:36 fir-md1-s1 kernel: e8 May 01 23:42:36 fir-md1-s1 kernel: fc May 01 23:42:36 fir-md1-s1 kernel: ff May 01 23:42:36 fir-md1-s1 kernel: ff May 01 23:42:36 fir-md1-s1 kernel: <48> May 01 23:42:36 fir-md1-s1 kernel: 89 May 01 23:42:36 fir-md1-s1 kernel: e5 May 01 23:42:36 fir-md1-s1 kernel: 48 May 01 23:42:36 fir-md1-s1 kernel: c1 May 01 23:42:36 fir-md1-s1 kernel: e8 May 01 23:42:36 fir-md1-s1 kernel: 2b May 01 23:42:36 fir-md1-s1 
kernel: a8 May 01 23:42:36 fir-md1-s1 kernel: 01 May 01 23:42:36 fir-md1-s1 kernel: 74 May 01 23:42:36 fir-md1-s1 kernel: 15 May 01 23:42:36 fir-md1-s1 kernel: 48 May 01 23:42:36 fir-md1-s1 kernel: 8b May 01 23:42:36 fir-md1-s1 kernel: 8a May 01 23:42:36 fir-md1-s1 kernel: e8 May 01 23:42:36 fir-md1-s1 kernel: fc May 01 23:42:36 fir-md1-s1 kernel: ff May 01 23:42:36 fir-md1-s1 kernel: ff May 01 23:42:36 fir-md1-s1 kernel: b8 May 01 23:42:36 fir-md1-s1 kernel: 01 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: May 01 23:42:36 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#18 stuck for 23s! [mdt_io02_065:103134] May 01 23:42:36 fir-md1-s1 kernel: Modules linked in: May 01 23:42:36 fir-md1-s1 kernel: osp(OE) May 01 23:42:36 fir-md1-s1 kernel: mdd(OE) May 01 23:42:36 fir-md1-s1 kernel: lod(OE) May 01 23:42:36 fir-md1-s1 kernel: mdt(OE) May 01 23:42:36 fir-md1-s1 kernel: lfsck(OE) May 01 23:42:36 fir-md1-s1 kernel: mgs(OE) May 01 23:42:36 fir-md1-s1 kernel: mgc(OE) May 01 23:42:36 fir-md1-s1 kernel: osd_ldiskfs(OE) May 01 23:42:36 fir-md1-s1 kernel: lquota(OE) May 01 23:42:36 fir-md1-s1 kernel: ldiskfs(OE) May 01 23:42:36 fir-md1-s1 kernel: lustre(OE) May 01 23:42:36 fir-md1-s1 kernel: lmv(OE) May 01 23:42:36 fir-md1-s1 kernel: mdc(OE) May 01 23:42:36 fir-md1-s1 kernel: osc(OE) May 01 23:42:36 fir-md1-s1 kernel: lov(OE) May 01 23:42:36 fir-md1-s1 kernel: fid(OE) May 01 23:42:36 fir-md1-s1 kernel: fld(OE) May 01 23:42:36 fir-md1-s1 kernel: ko2iblnd(OE) May 01 23:42:36 fir-md1-s1 kernel: ptlrpc(OE) May 01 23:42:36 fir-md1-s1 kernel: obdclass(OE) May 01 23:42:36 fir-md1-s1 kernel: lnet(OE) May 01 23:42:36 fir-md1-s1 kernel: libcfs(OE) May 01 23:42:36 fir-md1-s1 kernel: rpcsec_gss_krb5 May 01 23:42:36 fir-md1-s1 kernel: auth_rpcgss May 01 23:42:36 fir-md1-s1 kernel: nfsv4 May 01 23:42:36 fir-md1-s1 kernel: dns_resolver May 01 23:42:36 fir-md1-s1 kernel: nfs May 01 23:42:36 fir-md1-s1 kernel: lockd May 01 23:42:36 fir-md1-s1 kernel: grace 
May 01 23:42:36 fir-md1-s1 kernel: fscache May 01 23:42:36 fir-md1-s1 kernel: rdma_ucm(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_ucm(OE) May 01 23:42:36 fir-md1-s1 kernel: rdma_cm(OE) May 01 23:42:36 fir-md1-s1 kernel: iw_cm(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_ipoib(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_cm(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_umad(OE) May 01 23:42:36 fir-md1-s1 kernel: mlx5_fpga_tools(OE) May 01 23:42:36 fir-md1-s1 kernel: mlx4_en(OE) May 01 23:42:36 fir-md1-s1 kernel: mlx4_ib(OE) May 01 23:42:36 fir-md1-s1 kernel: mlx4_core(OE) May 01 23:42:36 fir-md1-s1 kernel: dell_rbu May 01 23:42:36 fir-md1-s1 kernel: sunrpc May 01 23:42:36 fir-md1-s1 kernel: vfat May 01 23:42:36 fir-md1-s1 kernel: fat May 01 23:42:36 fir-md1-s1 kernel: dm_round_robin May 01 23:42:36 fir-md1-s1 kernel: amd64_edac_mod May 01 23:42:36 fir-md1-s1 kernel: edac_mce_amd May 01 23:42:36 fir-md1-s1 kernel: kvm_amd May 01 23:42:36 fir-md1-s1 kernel: kvm May 01 23:42:36 fir-md1-s1 kernel: ses May 01 23:42:36 fir-md1-s1 kernel: irqbypass May 01 23:42:36 fir-md1-s1 kernel: crc32_pclmul May 01 23:42:36 fir-md1-s1 kernel: enclosure May 01 23:42:36 fir-md1-s1 kernel: ghash_clmulni_intel May 01 23:42:36 fir-md1-s1 kernel: dcdbas May 01 23:42:36 fir-md1-s1 kernel: aesni_intel May 01 23:42:36 fir-md1-s1 kernel: lrw May 01 23:42:36 fir-md1-s1 kernel: gf128mul May 01 23:42:36 fir-md1-s1 kernel: glue_helper May 01 23:42:36 fir-md1-s1 kernel: ablk_helper May 01 23:42:36 fir-md1-s1 kernel: cryptd May 01 23:42:36 fir-md1-s1 kernel: ipmi_si May 01 23:42:36 fir-md1-s1 kernel: pcspkr May 01 23:42:36 fir-md1-s1 kernel: ipmi_devintf May 01 23:42:36 fir-md1-s1 kernel: ccp May 01 23:42:36 fir-md1-s1 kernel: i2c_piix4 May 01 23:42:36 fir-md1-s1 kernel: dm_multipath May 01 23:42:36 fir-md1-s1 kernel: sg May 01 23:42:36 fir-md1-s1 kernel: k10temp May 01 23:42:36 fir-md1-s1 kernel: ipmi_msghandler May 01 23:42:36 fir-md1-s1 kernel: dm_mod May 01 23:42:36 fir-md1-s1 kernel: acpi_power_meter May 01 
23:42:36 fir-md1-s1 kernel: knem(OE) May 01 23:42:36 fir-md1-s1 kernel: ip_tables May 01 23:42:36 fir-md1-s1 kernel: ext4 May 01 23:42:36 fir-md1-s1 kernel: mbcache May 01 23:42:36 fir-md1-s1 kernel: jbd2 May 01 23:42:36 fir-md1-s1 kernel: sd_mod May 01 23:42:36 fir-md1-s1 kernel: crc_t10dif May 01 23:42:36 fir-md1-s1 kernel: crct10dif_generic May 01 23:42:36 fir-md1-s1 kernel: mlx5_ib(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_uverbs(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_core(OE) May 01 23:42:36 fir-md1-s1 kernel: i2c_algo_bit May 01 23:42:36 fir-md1-s1 kernel: drm_kms_helper May 01 23:42:36 fir-md1-s1 kernel: mlx5_core(OE) May 01 23:42:36 fir-md1-s1 kernel: syscopyarea May 01 23:42:36 fir-md1-s1 kernel: sysfillrect May 01 23:42:36 fir-md1-s1 kernel: sysimgblt May 01 23:42:36 fir-md1-s1 kernel: fb_sys_fops May 01 23:42:36 fir-md1-s1 kernel: mlxfw(OE) May 01 23:42:36 fir-md1-s1 kernel: crct10dif_pclmul May 01 23:42:36 fir-md1-s1 kernel: ttm May 01 23:42:36 fir-md1-s1 kernel: devlink May 01 23:42:36 fir-md1-s1 kernel: ahci May 01 23:42:36 fir-md1-s1 kernel: crct10dif_common May 01 23:42:36 fir-md1-s1 kernel: libahci May 01 23:42:36 fir-md1-s1 kernel: drm May 01 23:42:36 fir-md1-s1 kernel: mlx_compat(OE) May 01 23:42:36 fir-md1-s1 kernel: tg3 May 01 23:42:36 fir-md1-s1 kernel: crc32c_intel May 01 23:42:36 fir-md1-s1 kernel: libata May 01 23:42:36 fir-md1-s1 kernel: megaraid_sas May 01 23:42:36 fir-md1-s1 kernel: drm_panel_orientation_quirks May 01 23:42:36 fir-md1-s1 kernel: ptp May 01 23:42:36 fir-md1-s1 kernel: pps_core May 01 23:42:36 fir-md1-s1 kernel: mpt3sas(OE) May 01 23:42:36 fir-md1-s1 kernel: raid_class May 01 23:42:36 fir-md1-s1 kernel: scsi_transport_sas May 01 23:42:36 fir-md1-s1 kernel: [last unloaded: libcfs] May 01 23:42:36 fir-md1-s1 kernel: May 01 23:42:36 fir-md1-s1 kernel: CPU: 18 PID: 103134 Comm: mdt_io02_065 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:42:36 fir-md1-s1 kernel: Hardware name: Dell 
Inc. PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:42:36 fir-md1-s1 kernel: task: ffff985ccda90000 ti: ffff98583efd4000 task.ti: ffff98583efd4000 May 01 23:42:36 fir-md1-s1 kernel: RIP: 0010:[] May 01 23:42:36 fir-md1-s1 kernel: [] native_queued_spin_lock_slowpath+0x122/0x200 May 01 23:42:36 fir-md1-s1 kernel: RSP: 0018:ffff98583efd7750 EFLAGS: 00000246 May 01 23:42:36 fir-md1-s1 kernel: RAX: 0000000000000000 RBX: ffff983164d60378 RCX: 0000000000910000 May 01 23:42:36 fir-md1-s1 kernel: RDX: ffff983cff69b780 RSI: 0000000000490101 RDI: ffff982c9fc8c480 May 01 23:42:36 fir-md1-s1 kernel: RBP: ffff98583efd7750 R08: ffff984cff71b780 R09: 0000000000000000 May 01 23:42:36 fir-md1-s1 kernel: R10: ffff984cff71f140 R11: ffffde3ef98bd000 R12: 0000000000000000 May 01 23:42:36 fir-md1-s1 kernel: R13: ffff98583efd76f0 R14: ffff983164d600e8 R15: 0000000000000000 May 01 23:42:36 fir-md1-s1 kernel: FS: 00007f010bbcf880(0000) GS:ffff984cff700000(0000) knlGS:0000000000000000 May 01 23:42:36 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:42:36 fir-md1-s1 kernel: CR2: 0000000001c9e8e0 CR3: 000000402db9c000 CR4: 00000000003407e0 May 01 23:42:36 fir-md1-s1 kernel: Call Trace: May 01 23:42:36 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf May 01 23:42:36 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30 May 01 23:42:36 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: [] ldiskfs_ext_map_blocks+0x7b5/0xf60 [ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: [] ? zone_statistics+0x88/0xa0 May 01 23:42:36 fir-md1-s1 kernel: [] ? qsd_op_begin+0xb1/0x4b0 [lquota] May 01 23:42:36 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x98/0x700 [ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: [] ? 
ldiskfs_inode_attach_jinode+0x55/0xd0 [ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: [] osd_write_commit+0x3a2/0x8c0 [osd_ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: [] ? __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: [] mdt_commitrw_write.isra.46+0x608/0xd20 [mdt] May 01 23:42:36 fir-md1-s1 kernel: [] mdt_obd_commitrw+0x29b/0x520 [mdt] May 01 23:42:36 fir-md1-s1 kernel: [] obd_commitrw+0x9c/0x370 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] tgt_brw_write+0x100d/0x1a90 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? lustre_msg_buf_v2+0x1b0/0x1b0 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? lustre_msg_buf+0x17/0x60 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? update_curr+0x14c/0x1e0 May 01 23:42:36 fir-md1-s1 kernel: [] ? tgt_lookup_reply+0x2d/0x190 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:42:36 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:42:36 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:42:36 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:42:36 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:42:36 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:42:36 fir-md1-s1 kernel: [] ? 
insert_kthread_work+0x40/0x40 May 01 23:42:36 fir-md1-s1 kernel: Code: May 01 23:42:36 fir-md1-s1 kernel: 13 May 01 23:42:36 fir-md1-s1 kernel: 48 May 01 23:42:36 fir-md1-s1 kernel: c1 May 01 23:42:36 fir-md1-s1 kernel: ea May 01 23:42:36 fir-md1-s1 kernel: 0d May 01 23:42:36 fir-md1-s1 kernel: 48 May 01 23:42:36 fir-md1-s1 kernel: 98 May 01 23:42:36 fir-md1-s1 kernel: 83 May 01 23:42:36 fir-md1-s1 kernel: e2 May 01 23:42:36 fir-md1-s1 kernel: 30 May 01 23:42:36 fir-md1-s1 kernel: 48 May 01 23:42:36 fir-md1-s1 kernel: 81 May 01 23:42:36 fir-md1-s1 kernel: c2 May 01 23:42:36 fir-md1-s1 kernel: 80 May 01 23:42:36 fir-md1-s1 kernel: b7 May 01 23:42:36 fir-md1-s1 kernel: 01 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: 48 May 01 23:42:36 fir-md1-s1 kernel: 03 May 01 23:42:36 fir-md1-s1 kernel: 14 May 01 23:42:36 fir-md1-s1 kernel: c5 May 01 23:42:36 fir-md1-s1 kernel: 60 May 01 23:42:36 fir-md1-s1 kernel: b9 May 01 23:42:36 fir-md1-s1 kernel: b4 May 01 23:42:36 fir-md1-s1 kernel: b7 May 01 23:42:36 fir-md1-s1 kernel: 4c May 01 23:42:36 fir-md1-s1 kernel: 89 May 01 23:42:36 fir-md1-s1 kernel: 02 May 01 23:42:36 fir-md1-s1 kernel: 41 May 01 23:42:36 fir-md1-s1 kernel: 8b May 01 23:42:36 fir-md1-s1 kernel: 40 May 01 23:42:36 fir-md1-s1 kernel: 08 May 01 23:42:36 fir-md1-s1 kernel: 85 May 01 23:42:36 fir-md1-s1 kernel: c0 May 01 23:42:36 fir-md1-s1 kernel: 75 May 01 23:42:36 fir-md1-s1 kernel: 0f May 01 23:42:36 fir-md1-s1 kernel: 0f May 01 23:42:36 fir-md1-s1 kernel: 1f May 01 23:42:36 fir-md1-s1 kernel: 44 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: f3 May 01 23:42:36 fir-md1-s1 kernel: 90 May 01 23:42:36 fir-md1-s1 kernel: <41> May 01 23:42:36 fir-md1-s1 kernel: 8b May 01 23:42:36 fir-md1-s1 kernel: 40 May 01 23:42:36 fir-md1-s1 kernel: 08 May 01 23:42:36 fir-md1-s1 kernel: 85 May 01 23:42:36 fir-md1-s1 kernel: c0 May 01 23:42:36 fir-md1-s1 kernel: 74 May 01 23:42:36 fir-md1-s1 
kernel: f6 May 01 23:42:36 fir-md1-s1 kernel: 4d May 01 23:42:36 fir-md1-s1 kernel: 8b May 01 23:42:36 fir-md1-s1 kernel: 08 May 01 23:42:36 fir-md1-s1 kernel: 4d May 01 23:42:36 fir-md1-s1 kernel: 85 May 01 23:42:36 fir-md1-s1 kernel: c9 May 01 23:42:36 fir-md1-s1 kernel: 74 May 01 23:42:36 fir-md1-s1 kernel: 04 May 01 23:42:36 fir-md1-s1 kernel: 41 May 01 23:42:36 fir-md1-s1 kernel: 0f May 01 23:42:36 fir-md1-s1 kernel: 18 May 01 23:42:36 fir-md1-s1 kernel: 09 May 01 23:42:36 fir-md1-s1 kernel: 8b May 01 23:42:36 fir-md1-s1 kernel: May 01 23:42:36 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#25 stuck for 22s! [mdt_io01_082:103083] May 01 23:42:36 fir-md1-s1 kernel: Modules linked in: May 01 23:42:36 fir-md1-s1 kernel: osp(OE) May 01 23:42:36 fir-md1-s1 kernel: mdd(OE) May 01 23:42:36 fir-md1-s1 kernel: lod(OE) May 01 23:42:36 fir-md1-s1 kernel: mdt(OE) May 01 23:42:36 fir-md1-s1 kernel: lfsck(OE) May 01 23:42:36 fir-md1-s1 kernel: mgs(OE) May 01 23:42:36 fir-md1-s1 kernel: mgc(OE) May 01 23:42:36 fir-md1-s1 kernel: osd_ldiskfs(OE) May 01 23:42:36 fir-md1-s1 kernel: lquota(OE) May 01 23:42:36 fir-md1-s1 kernel: ldiskfs(OE) May 01 23:42:36 fir-md1-s1 kernel: lustre(OE) May 01 23:42:36 fir-md1-s1 kernel: lmv(OE) May 01 23:42:36 fir-md1-s1 kernel: mdc(OE) May 01 23:42:36 fir-md1-s1 kernel: osc(OE) May 01 23:42:36 fir-md1-s1 kernel: lov(OE) May 01 23:42:36 fir-md1-s1 kernel: fid(OE) May 01 23:42:36 fir-md1-s1 kernel: fld(OE) May 01 23:42:36 fir-md1-s1 kernel: ko2iblnd(OE) May 01 23:42:36 fir-md1-s1 kernel: ptlrpc(OE) May 01 23:42:36 fir-md1-s1 kernel: obdclass(OE) May 01 23:42:36 fir-md1-s1 kernel: lnet(OE) May 01 23:42:36 fir-md1-s1 kernel: libcfs(OE) May 01 23:42:36 fir-md1-s1 kernel: rpcsec_gss_krb5 May 01 23:42:36 fir-md1-s1 kernel: auth_rpcgss May 01 23:42:36 fir-md1-s1 kernel: nfsv4 May 01 23:42:36 fir-md1-s1 kernel: dns_resolver May 01 23:42:36 fir-md1-s1 kernel: nfs May 01 23:42:36 fir-md1-s1 kernel: lockd May 01 23:42:36 fir-md1-s1 kernel: grace 
May 01 23:42:36 fir-md1-s1 kernel: fscache May 01 23:42:36 fir-md1-s1 kernel: rdma_ucm(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_ucm(OE) May 01 23:42:36 fir-md1-s1 kernel: rdma_cm(OE) May 01 23:42:36 fir-md1-s1 kernel: iw_cm(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_ipoib(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_cm(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_umad(OE) May 01 23:42:36 fir-md1-s1 kernel: mlx5_fpga_tools(OE) May 01 23:42:36 fir-md1-s1 kernel: mlx4_en(OE) May 01 23:42:36 fir-md1-s1 kernel: mlx4_ib(OE) May 01 23:42:36 fir-md1-s1 kernel: mlx4_core(OE) May 01 23:42:36 fir-md1-s1 kernel: dell_rbu May 01 23:42:36 fir-md1-s1 kernel: sunrpc May 01 23:42:36 fir-md1-s1 kernel: vfat May 01 23:42:36 fir-md1-s1 kernel: fat May 01 23:42:36 fir-md1-s1 kernel: dm_round_robin May 01 23:42:36 fir-md1-s1 kernel: amd64_edac_mod May 01 23:42:36 fir-md1-s1 kernel: edac_mce_amd May 01 23:42:36 fir-md1-s1 kernel: kvm_amd May 01 23:42:36 fir-md1-s1 kernel: kvm May 01 23:42:36 fir-md1-s1 kernel: ses May 01 23:42:36 fir-md1-s1 kernel: irqbypass May 01 23:42:36 fir-md1-s1 kernel: crc32_pclmul May 01 23:42:36 fir-md1-s1 kernel: enclosure May 01 23:42:36 fir-md1-s1 kernel: ghash_clmulni_intel May 01 23:42:36 fir-md1-s1 kernel: dcdbas May 01 23:42:36 fir-md1-s1 kernel: aesni_intel May 01 23:42:36 fir-md1-s1 kernel: lrw May 01 23:42:36 fir-md1-s1 kernel: gf128mul May 01 23:42:36 fir-md1-s1 kernel: glue_helper May 01 23:42:36 fir-md1-s1 kernel: ablk_helper May 01 23:42:36 fir-md1-s1 kernel: cryptd May 01 23:42:36 fir-md1-s1 kernel: ipmi_si May 01 23:42:36 fir-md1-s1 kernel: pcspkr May 01 23:42:36 fir-md1-s1 kernel: ipmi_devintf May 01 23:42:36 fir-md1-s1 kernel: ccp May 01 23:42:36 fir-md1-s1 kernel: i2c_piix4 May 01 23:42:36 fir-md1-s1 kernel: dm_multipath May 01 23:42:36 fir-md1-s1 kernel: sg May 01 23:42:36 fir-md1-s1 kernel: k10temp May 01 23:42:36 fir-md1-s1 kernel: ipmi_msghandler May 01 23:42:36 fir-md1-s1 kernel: dm_mod May 01 23:42:36 fir-md1-s1 kernel: acpi_power_meter May 01 
23:42:36 fir-md1-s1 kernel: knem(OE) May 01 23:42:36 fir-md1-s1 kernel: ip_tables May 01 23:42:36 fir-md1-s1 kernel: ext4 May 01 23:42:36 fir-md1-s1 kernel: mbcache May 01 23:42:36 fir-md1-s1 kernel: jbd2 May 01 23:42:36 fir-md1-s1 kernel: sd_mod May 01 23:42:36 fir-md1-s1 kernel: crc_t10dif May 01 23:42:36 fir-md1-s1 kernel: crct10dif_generic May 01 23:42:36 fir-md1-s1 kernel: mlx5_ib(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_uverbs(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_core(OE) May 01 23:42:36 fir-md1-s1 kernel: i2c_algo_bit May 01 23:42:36 fir-md1-s1 kernel: drm_kms_helper May 01 23:42:36 fir-md1-s1 kernel: mlx5_core(OE) May 01 23:42:36 fir-md1-s1 kernel: syscopyarea May 01 23:42:36 fir-md1-s1 kernel: sysfillrect May 01 23:42:36 fir-md1-s1 kernel: sysimgblt May 01 23:42:36 fir-md1-s1 kernel: fb_sys_fops May 01 23:42:36 fir-md1-s1 kernel: mlxfw(OE) May 01 23:42:36 fir-md1-s1 kernel: crct10dif_pclmul May 01 23:42:36 fir-md1-s1 kernel: ttm May 01 23:42:36 fir-md1-s1 kernel: devlink May 01 23:42:36 fir-md1-s1 kernel: ahci May 01 23:42:36 fir-md1-s1 kernel: crct10dif_common May 01 23:42:36 fir-md1-s1 kernel: libahci May 01 23:42:36 fir-md1-s1 kernel: drm May 01 23:42:36 fir-md1-s1 kernel: mlx_compat(OE) May 01 23:42:36 fir-md1-s1 kernel: tg3 May 01 23:42:36 fir-md1-s1 kernel: crc32c_intel May 01 23:42:36 fir-md1-s1 kernel: libata May 01 23:42:36 fir-md1-s1 kernel: megaraid_sas May 01 23:42:36 fir-md1-s1 kernel: drm_panel_orientation_quirks May 01 23:42:36 fir-md1-s1 kernel: ptp May 01 23:42:36 fir-md1-s1 kernel: pps_core May 01 23:42:36 fir-md1-s1 kernel: mpt3sas(OE) May 01 23:42:36 fir-md1-s1 kernel: raid_class May 01 23:42:36 fir-md1-s1 kernel: scsi_transport_sas May 01 23:42:36 fir-md1-s1 kernel: [last unloaded: libcfs] May 01 23:42:36 fir-md1-s1 kernel: May 01 23:42:36 fir-md1-s1 kernel: CPU: 25 PID: 103083 Comm: mdt_io01_082 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:42:36 fir-md1-s1 kernel: Hardware name: Dell 
Inc. PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:42:36 fir-md1-s1 kernel: task: ffff985cfe905140 ti: ffff985ccaf18000 task.ti: ffff985ccaf18000 May 01 23:42:36 fir-md1-s1 kernel: RIP: 0010:[] May 01 23:42:36 fir-md1-s1 kernel: [] native_queued_spin_lock_slowpath+0x128/0x200 May 01 23:42:36 fir-md1-s1 kernel: RSP: 0018:ffff985ccaf1b800 EFLAGS: 00000246 May 01 23:42:36 fir-md1-s1 kernel: RAX: 0000000000000000 RBX: ffff9831703ceb60 RCX: 0000000000c90000 May 01 23:42:36 fir-md1-s1 kernel: RDX: ffff984cff71b780 RSI: 0000000000910101 RDI: ffff982c9fc8c480 May 01 23:42:36 fir-md1-s1 kernel: RBP: ffff985ccaf1b800 R08: ffff983cff79b780 R09: 0000000000000000 May 01 23:42:36 fir-md1-s1 kernel: R10: ffff983cff79f140 R11: ffffde3fa5607800 R12: 0000000000000000 May 01 23:42:36 fir-md1-s1 kernel: R13: ffff985ccaf1b7a0 R14: ffff9831703ce8d0 R15: 0000000000000000 May 01 23:42:36 fir-md1-s1 kernel: FS: 00007f427f792740(0000) GS:ffff983cff780000(0000) knlGS:0000000000000000 May 01 23:42:36 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:42:36 fir-md1-s1 kernel: CR2: 00007f427f58b000 CR3: 00000012f7610000 CR4: 00000000003407e0 May 01 23:42:36 fir-md1-s1 kernel: Call Trace: May 01 23:42:36 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf May 01 23:42:36 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30 May 01 23:42:36 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: [] ldiskfs_ext_map_blocks+0x7b5/0xf60 [ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: [] ? ktime_get+0x52/0xe0 May 01 23:42:36 fir-md1-s1 kernel: [] ? kiblnd_check_sends_locked+0xa72/0xe40 [ko2iblnd] May 01 23:42:36 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x98/0x700 [ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: [] ? 
ktime_get_ts64+0x52/0xf0 May 01 23:42:36 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: [] osd_write_prep+0x2b6/0x360 [osd_ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: [] mdt_obd_preprw+0x637/0x1060 [mdt] May 01 23:42:36 fir-md1-s1 kernel: [] tgt_brw_write+0xc7e/0x1a90 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? lustre_msg_buf_v2+0x1b0/0x1b0 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? lustre_msg_buf+0x17/0x60 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? update_curr+0x14c/0x1e0 May 01 23:42:36 fir-md1-s1 kernel: [] ? account_entity_dequeue+0xae/0xd0 May 01 23:42:36 fir-md1-s1 kernel: [] ? __enqueue_entity+0x78/0x80 May 01 23:42:36 fir-md1-s1 kernel: [] ? tgt_lookup_reply+0x2d/0x190 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:42:36 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:42:36 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:42:36 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:42:36 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:42:36 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:42:36 fir-md1-s1 kernel: [] ? 
insert_kthread_work+0x40/0x40 May 01 23:42:36 fir-md1-s1 kernel: Code: May 01 23:42:36 fir-md1-s1 kernel: 98 May 01 23:42:36 fir-md1-s1 kernel: 83 May 01 23:42:36 fir-md1-s1 kernel: e2 May 01 23:42:36 fir-md1-s1 kernel: 30 May 01 23:42:36 fir-md1-s1 kernel: 48 May 01 23:42:36 fir-md1-s1 kernel: 81 May 01 23:42:36 fir-md1-s1 kernel: c2 May 01 23:42:36 fir-md1-s1 kernel: 80 May 01 23:42:36 fir-md1-s1 kernel: b7 May 01 23:42:36 fir-md1-s1 kernel: 01 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: 48 May 01 23:42:36 fir-md1-s1 kernel: 03 May 01 23:42:36 fir-md1-s1 kernel: 14 May 01 23:42:36 fir-md1-s1 kernel: c5 May 01 23:42:36 fir-md1-s1 kernel: 60 May 01 23:42:36 fir-md1-s1 kernel: b9 May 01 23:42:36 fir-md1-s1 kernel: b4 May 01 23:42:36 fir-md1-s1 kernel: b7 May 01 23:42:36 fir-md1-s1 kernel: 4c May 01 23:42:36 fir-md1-s1 kernel: 89 May 01 23:42:36 fir-md1-s1 kernel: 02 May 01 23:42:36 fir-md1-s1 kernel: 41 May 01 23:42:36 fir-md1-s1 kernel: 8b May 01 23:42:36 fir-md1-s1 kernel: 40 May 01 23:42:36 fir-md1-s1 kernel: 08 May 01 23:42:36 fir-md1-s1 kernel: 85 May 01 23:42:36 fir-md1-s1 kernel: c0 May 01 23:42:36 fir-md1-s1 kernel: 75 May 01 23:42:36 fir-md1-s1 kernel: 0f May 01 23:42:36 fir-md1-s1 kernel: 0f May 01 23:42:36 fir-md1-s1 kernel: 1f May 01 23:42:36 fir-md1-s1 kernel: 44 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: f3 May 01 23:42:36 fir-md1-s1 kernel: 90 May 01 23:42:36 fir-md1-s1 kernel: 41 May 01 23:42:36 fir-md1-s1 kernel: 8b May 01 23:42:36 fir-md1-s1 kernel: 40 May 01 23:42:36 fir-md1-s1 kernel: 08 May 01 23:42:36 fir-md1-s1 kernel: 85 May 01 23:42:36 fir-md1-s1 kernel: c0 May 01 23:42:36 fir-md1-s1 kernel: <74> May 01 23:42:36 fir-md1-s1 kernel: f6 May 01 23:42:36 fir-md1-s1 kernel: 4d May 01 23:42:36 fir-md1-s1 kernel: 8b May 01 23:42:36 fir-md1-s1 kernel: 08 May 01 23:42:36 fir-md1-s1 kernel: 4d May 01 23:42:36 fir-md1-s1 kernel: 85 May 01 23:42:36 fir-md1-s1 
kernel: c9 May 01 23:42:36 fir-md1-s1 kernel: 74 May 01 23:42:36 fir-md1-s1 kernel: 04 May 01 23:42:36 fir-md1-s1 kernel: 41 May 01 23:42:36 fir-md1-s1 kernel: 0f May 01 23:42:36 fir-md1-s1 kernel: 18 May 01 23:42:36 fir-md1-s1 kernel: 09 May 01 23:42:36 fir-md1-s1 kernel: 8b May 01 23:42:36 fir-md1-s1 kernel: 17 May 01 23:42:36 fir-md1-s1 kernel: 0f May 01 23:42:36 fir-md1-s1 kernel: b7 May 01 23:42:36 fir-md1-s1 kernel: c2 May 01 23:42:36 fir-md1-s1 kernel: 85 May 01 23:42:36 fir-md1-s1 kernel: c0 May 01 23:42:36 fir-md1-s1 kernel: May 01 23:42:36 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#34 stuck for 22s! [mdt_io02_034:102984] May 01 23:42:36 fir-md1-s1 kernel: Modules linked in: May 01 23:42:36 fir-md1-s1 kernel: osp(OE) May 01 23:42:36 fir-md1-s1 kernel: mdd(OE) May 01 23:42:36 fir-md1-s1 kernel: lod(OE) May 01 23:42:36 fir-md1-s1 kernel: mdt(OE) May 01 23:42:36 fir-md1-s1 kernel: lfsck(OE) May 01 23:42:36 fir-md1-s1 kernel: mgs(OE) May 01 23:42:36 fir-md1-s1 kernel: mgc(OE) May 01 23:42:36 fir-md1-s1 kernel: osd_ldiskfs(OE) May 01 23:42:36 fir-md1-s1 kernel: lquota(OE) May 01 23:42:36 fir-md1-s1 kernel: ldiskfs(OE) May 01 23:42:36 fir-md1-s1 kernel: lustre(OE) May 01 23:42:36 fir-md1-s1 kernel: lmv(OE) May 01 23:42:36 fir-md1-s1 kernel: mdc(OE) May 01 23:42:36 fir-md1-s1 kernel: osc(OE) May 01 23:42:36 fir-md1-s1 kernel: lov(OE) May 01 23:42:36 fir-md1-s1 kernel: fid(OE) May 01 23:42:36 fir-md1-s1 kernel: fld(OE) May 01 23:42:36 fir-md1-s1 kernel: ko2iblnd(OE) May 01 23:42:36 fir-md1-s1 kernel: ptlrpc(OE) May 01 23:42:36 fir-md1-s1 kernel: obdclass(OE) May 01 23:42:36 fir-md1-s1 kernel: lnet(OE) May 01 23:42:36 fir-md1-s1 kernel: libcfs(OE) May 01 23:42:36 fir-md1-s1 kernel: rpcsec_gss_krb5 May 01 23:42:36 fir-md1-s1 kernel: auth_rpcgss May 01 23:42:36 fir-md1-s1 kernel: nfsv4 May 01 23:42:36 fir-md1-s1 kernel: dns_resolver May 01 23:42:36 fir-md1-s1 kernel: nfs May 01 23:42:36 fir-md1-s1 kernel: lockd May 01 23:42:36 fir-md1-s1 kernel: grace 
May 01 23:42:36 fir-md1-s1 kernel: fscache May 01 23:42:36 fir-md1-s1 kernel: rdma_ucm(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_ucm(OE) May 01 23:42:36 fir-md1-s1 kernel: rdma_cm(OE) May 01 23:42:36 fir-md1-s1 kernel: iw_cm(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_ipoib(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_cm(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_umad(OE) May 01 23:42:36 fir-md1-s1 kernel: mlx5_fpga_tools(OE) May 01 23:42:36 fir-md1-s1 kernel: mlx4_en(OE) May 01 23:42:36 fir-md1-s1 kernel: mlx4_ib(OE) May 01 23:42:36 fir-md1-s1 kernel: mlx4_core(OE) May 01 23:42:36 fir-md1-s1 kernel: dell_rbu May 01 23:42:36 fir-md1-s1 kernel: sunrpc May 01 23:42:36 fir-md1-s1 kernel: vfat May 01 23:42:36 fir-md1-s1 kernel: fat May 01 23:42:36 fir-md1-s1 kernel: dm_round_robin May 01 23:42:36 fir-md1-s1 kernel: amd64_edac_mod May 01 23:42:36 fir-md1-s1 kernel: edac_mce_amd May 01 23:42:36 fir-md1-s1 kernel: kvm_amd May 01 23:42:36 fir-md1-s1 kernel: kvm May 01 23:42:36 fir-md1-s1 kernel: ses May 01 23:42:36 fir-md1-s1 kernel: irqbypass May 01 23:42:36 fir-md1-s1 kernel: crc32_pclmul May 01 23:42:36 fir-md1-s1 kernel: enclosure May 01 23:42:36 fir-md1-s1 kernel: ghash_clmulni_intel May 01 23:42:36 fir-md1-s1 kernel: dcdbas May 01 23:42:36 fir-md1-s1 kernel: aesni_intel May 01 23:42:36 fir-md1-s1 kernel: lrw May 01 23:42:36 fir-md1-s1 kernel: gf128mul May 01 23:42:36 fir-md1-s1 kernel: glue_helper May 01 23:42:36 fir-md1-s1 kernel: ablk_helper May 01 23:42:36 fir-md1-s1 kernel: cryptd May 01 23:42:36 fir-md1-s1 kernel: ipmi_si May 01 23:42:36 fir-md1-s1 kernel: pcspkr May 01 23:42:36 fir-md1-s1 kernel: ipmi_devintf May 01 23:42:36 fir-md1-s1 kernel: ccp May 01 23:42:36 fir-md1-s1 kernel: i2c_piix4 May 01 23:42:36 fir-md1-s1 kernel: dm_multipath May 01 23:42:36 fir-md1-s1 kernel: sg May 01 23:42:36 fir-md1-s1 kernel: k10temp May 01 23:42:36 fir-md1-s1 kernel: ipmi_msghandler May 01 23:42:36 fir-md1-s1 kernel: dm_mod May 01 23:42:36 fir-md1-s1 kernel: acpi_power_meter May 01 
23:42:36 fir-md1-s1 kernel: knem(OE) May 01 23:42:36 fir-md1-s1 kernel: ip_tables May 01 23:42:36 fir-md1-s1 kernel: ext4 May 01 23:42:36 fir-md1-s1 kernel: mbcache May 01 23:42:36 fir-md1-s1 kernel: jbd2 May 01 23:42:36 fir-md1-s1 kernel: sd_mod May 01 23:42:36 fir-md1-s1 kernel: crc_t10dif May 01 23:42:36 fir-md1-s1 kernel: crct10dif_generic May 01 23:42:36 fir-md1-s1 kernel: mlx5_ib(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_uverbs(OE) May 01 23:42:36 fir-md1-s1 kernel: ib_core(OE) May 01 23:42:36 fir-md1-s1 kernel: i2c_algo_bit May 01 23:42:36 fir-md1-s1 kernel: drm_kms_helper May 01 23:42:36 fir-md1-s1 kernel: mlx5_core(OE) May 01 23:42:36 fir-md1-s1 kernel: syscopyarea May 01 23:42:36 fir-md1-s1 kernel: sysfillrect May 01 23:42:36 fir-md1-s1 kernel: sysimgblt May 01 23:42:36 fir-md1-s1 kernel: fb_sys_fops May 01 23:42:36 fir-md1-s1 kernel: mlxfw(OE) May 01 23:42:36 fir-md1-s1 kernel: crct10dif_pclmul May 01 23:42:36 fir-md1-s1 kernel: ttm May 01 23:42:36 fir-md1-s1 kernel: devlink May 01 23:42:36 fir-md1-s1 kernel: ahci May 01 23:42:36 fir-md1-s1 kernel: crct10dif_common May 01 23:42:36 fir-md1-s1 kernel: libahci May 01 23:42:36 fir-md1-s1 kernel: drm May 01 23:42:36 fir-md1-s1 kernel: mlx_compat(OE) May 01 23:42:36 fir-md1-s1 kernel: tg3 May 01 23:42:36 fir-md1-s1 kernel: crc32c_intel May 01 23:42:36 fir-md1-s1 kernel: libata May 01 23:42:36 fir-md1-s1 kernel: megaraid_sas May 01 23:42:36 fir-md1-s1 kernel: drm_panel_orientation_quirks May 01 23:42:36 fir-md1-s1 kernel: ptp May 01 23:42:36 fir-md1-s1 kernel: pps_core May 01 23:42:36 fir-md1-s1 kernel: mpt3sas(OE) May 01 23:42:36 fir-md1-s1 kernel: raid_class May 01 23:42:36 fir-md1-s1 kernel: scsi_transport_sas May 01 23:42:36 fir-md1-s1 kernel: [last unloaded: libcfs] May 01 23:42:36 fir-md1-s1 kernel: May 01 23:42:36 fir-md1-s1 kernel: CPU: 34 PID: 102984 Comm: mdt_io02_034 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:42:36 fir-md1-s1 kernel: Hardware name: Dell 
Inc. PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:42:36 fir-md1-s1 kernel: task: ffff982cf9de4100 ti: ffff985ce80d4000 task.ti: ffff985ce80d4000 May 01 23:42:36 fir-md1-s1 kernel: RIP: 0010:[] May 01 23:42:36 fir-md1-s1 kernel: [] native_queued_spin_lock_slowpath+0x122/0x200 May 01 23:42:36 fir-md1-s1 kernel: RSP: 0018:ffff985ce80d7800 EFLAGS: 00000246 May 01 23:42:36 fir-md1-s1 kernel: RAX: 0000000000000000 RBX: ffff983165105698 RCX: 0000000001110000 May 01 23:42:36 fir-md1-s1 kernel: RDX: ffff983cff79b780 RSI: 0000000000c90101 RDI: ffff982c9fc8c480 May 01 23:42:36 fir-md1-s1 kernel: RBP: ffff985ce80d7800 R08: ffff984cff81b780 R09: 0000000000000000 May 01 23:42:36 fir-md1-s1 kernel: R10: ffff984cff81f140 R11: ffffde3fa7770c00 R12: 0000000000000000 May 01 23:42:36 fir-md1-s1 kernel: R13: ffff985ce80d77a0 R14: ffff983165105408 R15: 0000000000000000 May 01 23:42:36 fir-md1-s1 kernel: FS: 00007fe19c902740(0000) GS:ffff984cff800000(0000) knlGS:0000000000000000 May 01 23:42:36 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:42:36 fir-md1-s1 kernel: CR2: 00007fe19bb327c0 CR3: 00000012f7610000 CR4: 00000000003407e0 May 01 23:42:36 fir-md1-s1 kernel: Call Trace: May 01 23:42:36 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf May 01 23:42:36 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30 May 01 23:42:36 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: [] ldiskfs_ext_map_blocks+0x7b5/0xf60 [ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: [] ? ktime_get+0x52/0xe0 May 01 23:42:36 fir-md1-s1 kernel: [] ? kiblnd_check_sends_locked+0xa72/0xe40 [ko2iblnd] May 01 23:42:36 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x98/0x700 [ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: [] ? 
ktime_get_ts64+0x52/0xf0 May 01 23:42:36 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: [] osd_write_prep+0x2b6/0x360 [osd_ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: [] mdt_obd_preprw+0x637/0x1060 [mdt] May 01 23:42:36 fir-md1-s1 kernel: [] tgt_brw_write+0xc7e/0x1a90 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? tgt_free_reply_data+0x128/0x3b0 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? kfree+0x106/0x140 May 01 23:42:36 fir-md1-s1 kernel: [] ? tgt_free_reply_data+0x128/0x3b0 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? tgt_lookup_reply+0x2d/0x190 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:42:36 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:42:36 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:42:36 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:42:36 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:42:36 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:42:36 fir-md1-s1 kernel: [] ? 
insert_kthread_work+0x40/0x40 May 01 23:42:36 fir-md1-s1 kernel: Code: May 01 23:42:36 fir-md1-s1 kernel: 13 May 01 23:42:36 fir-md1-s1 kernel: 48 May 01 23:42:36 fir-md1-s1 kernel: c1 May 01 23:42:36 fir-md1-s1 kernel: ea May 01 23:42:36 fir-md1-s1 kernel: 0d May 01 23:42:36 fir-md1-s1 kernel: 48 May 01 23:42:36 fir-md1-s1 kernel: 98 May 01 23:42:36 fir-md1-s1 kernel: 83 May 01 23:42:36 fir-md1-s1 kernel: e2 May 01 23:42:36 fir-md1-s1 kernel: 30 May 01 23:42:36 fir-md1-s1 kernel: 48 May 01 23:42:36 fir-md1-s1 kernel: 81 May 01 23:42:36 fir-md1-s1 kernel: c2 May 01 23:42:36 fir-md1-s1 kernel: 80 May 01 23:42:36 fir-md1-s1 kernel: b7 May 01 23:42:36 fir-md1-s1 kernel: 01 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: 48 May 01 23:42:36 fir-md1-s1 kernel: 03 May 01 23:42:36 fir-md1-s1 kernel: 14 May 01 23:42:36 fir-md1-s1 kernel: c5 May 01 23:42:36 fir-md1-s1 kernel: 60 May 01 23:42:36 fir-md1-s1 kernel: b9 May 01 23:42:36 fir-md1-s1 kernel: b4 May 01 23:42:36 fir-md1-s1 kernel: b7 May 01 23:42:36 fir-md1-s1 kernel: 4c May 01 23:42:36 fir-md1-s1 kernel: 89 May 01 23:42:36 fir-md1-s1 kernel: 02 May 01 23:42:36 fir-md1-s1 kernel: 41 May 01 23:42:36 fir-md1-s1 kernel: 8b May 01 23:42:36 fir-md1-s1 kernel: 40 May 01 23:42:36 fir-md1-s1 kernel: 08 May 01 23:42:36 fir-md1-s1 kernel: 85 May 01 23:42:36 fir-md1-s1 kernel: c0 May 01 23:42:36 fir-md1-s1 kernel: 75 May 01 23:42:36 fir-md1-s1 kernel: 0f May 01 23:42:36 fir-md1-s1 kernel: 0f May 01 23:42:36 fir-md1-s1 kernel: 1f May 01 23:42:36 fir-md1-s1 kernel: 44 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: 00 May 01 23:42:36 fir-md1-s1 kernel: f3 May 01 23:42:36 fir-md1-s1 kernel: 90 May 01 23:42:36 fir-md1-s1 kernel: <41> May 01 23:42:36 fir-md1-s1 kernel: 8b May 01 23:42:36 fir-md1-s1 kernel: 40 May 01 23:42:36 fir-md1-s1 kernel: 08 May 01 23:42:36 fir-md1-s1 kernel: 85 May 01 23:42:36 fir-md1-s1 kernel: c0 May 01 23:42:36 fir-md1-s1 kernel: 74 May 01 23:42:36 fir-md1-s1 
kernel: f6 May 01 23:42:36 fir-md1-s1 kernel: 4d May 01 23:42:36 fir-md1-s1 kernel: 8b May 01 23:42:36 fir-md1-s1 kernel: 08 May 01 23:42:36 fir-md1-s1 kernel: 4d May 01 23:42:36 fir-md1-s1 kernel: 85 May 01 23:42:36 fir-md1-s1 kernel: c9 May 01 23:42:36 fir-md1-s1 kernel: 74 May 01 23:42:36 fir-md1-s1 kernel: 04 May 01 23:42:36 fir-md1-s1 kernel: 41 May 01 23:42:36 fir-md1-s1 kernel: 0f May 01 23:42:36 fir-md1-s1 kernel: 18 May 01 23:42:36 fir-md1-s1 kernel: 09 May 01 23:42:36 fir-md1-s1 kernel: 8b May 01 23:42:36 fir-md1-s1 kernel: May 01 23:42:36 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client 3ddfc0e1-d9a8-93ac-6e7d-3e2edb9b897f (at 10.8.0.65@o2ib6) reconnecting May 01 23:42:36 fir-md1-s1 kernel: Lustre: Skipped 9 previous similar messages May 01 23:42:36 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to (at 10.8.0.65@o2ib6) May 01 23:42:36 fir-md1-s1 kernel: Lustre: Skipped 12 previous similar messages May 01 23:42:36 fir-md1-s1 kernel: mlx5_fpga_tools(OE) May 01 23:42:36 fir-md1-s1 kernel: mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) dell_rbu sunrpc vfat fat dm_round_robin amd64_edac_mod edac_mce_amd kvm_amd kvm ses irqbypass crc32_pclmul enclosure ghash_clmulni_intel dcdbas aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ipmi_si pcspkr ipmi_devintf ccp i2c_piix4 dm_multipath sg k10temp ipmi_msghandler dm_mod acpi_power_meter knem(OE) ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) i2c_algo_bit drm_kms_helper mlx5_core(OE) syscopyarea sysfillrect sysimgblt fb_sys_fops mlxfw(OE) crct10dif_pclmul ttm devlink ahci crct10dif_common libahci drm mlx_compat(OE) tg3 crc32c_intel libata megaraid_sas drm_panel_orientation_quirks ptp pps_core mpt3sas(OE) raid_class scsi_transport_sas [last unloaded: libcfs] May 01 23:42:36 fir-md1-s1 kernel: CPU: 4 PID: 103101 Comm: mdt_io00_057 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:42:36 fir-md1-s1 kernel: Hardware name: 
Dell Inc. PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:42:36 fir-md1-s1 kernel: task: ffff985c827130c0 ti: ffff985c1a30c000 task.ti: ffff985c1a30c000 May 01 23:42:36 fir-md1-s1 kernel: RIP: 0010:[] [] native_queued_spin_lock_slowpath+0x1ce/0x200 May 01 23:42:36 fir-md1-s1 kernel: RSP: 0018:ffff985c1a30f800 EFLAGS: 00000202 May 01 23:42:36 fir-md1-s1 kernel: RAX: 0000000000000001 RBX: ffff9831703ce738 RCX: 0000000000000001 May 01 23:42:36 fir-md1-s1 kernel: RDX: 0000000000000101 RSI: 0000000000000001 RDI: ffff982c9fc8c480 May 01 23:42:36 fir-md1-s1 kernel: RBP: ffff985c1a30f800 R08: 0000000000000101 R09: ffffffffc1231d1a May 01 23:42:36 fir-md1-s1 kernel: R10: ffff982cfee5f140 R11: ffffde3ed5b1ce00 R12: 0000000000000000 May 01 23:42:36 fir-md1-s1 kernel: R13: ffff985c1a30f7a0 R14: ffff9831703ce4a8 R15: 0000000000000000 May 01 23:42:36 fir-md1-s1 kernel: FS: 00007f427f792740(0000) GS:ffff982cfee40000(0000) knlGS:0000000000000000 May 01 23:42:36 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:42:36 fir-md1-s1 kernel: CR2: 00007f427f58b000 CR3: 00000012f7610000 CR4: 00000000003407e0 May 01 23:42:36 fir-md1-s1 kernel: Call Trace: May 01 23:42:36 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf May 01 23:42:36 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30 May 01 23:42:36 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: [] ldiskfs_ext_map_blocks+0x7b5/0xf60 [ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: [] ? ktime_get+0x52/0xe0 May 01 23:42:36 fir-md1-s1 kernel: [] ? kiblnd_check_sends_locked+0xa72/0xe40 [ko2iblnd] May 01 23:42:36 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x98/0x700 [ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: [] ? 
ktime_get_ts64+0x52/0xf0 May 01 23:42:36 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: [] osd_write_prep+0x2b6/0x360 [osd_ldiskfs] May 01 23:42:36 fir-md1-s1 kernel: [] mdt_obd_preprw+0x637/0x1060 [mdt] May 01 23:42:36 fir-md1-s1 kernel: [] tgt_brw_write+0xc7e/0x1a90 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? lustre_msg_buf_v2+0x1b0/0x1b0 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? lustre_msg_buf+0x17/0x60 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? update_curr+0x14c/0x1e0 May 01 23:42:36 fir-md1-s1 kernel: [] ? account_entity_dequeue+0xae/0xd0 May 01 23:42:36 fir-md1-s1 kernel: [] ? __enqueue_entity+0x78/0x80 May 01 23:42:36 fir-md1-s1 kernel: [] ? tgt_lookup_reply+0x2d/0x190 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:42:36 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:42:36 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:42:36 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:42:36 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:42:36 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:42:36 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:42:36 fir-md1-s1 kernel: [] ? 
insert_kthread_work+0x40/0x40 May 01 23:42:36 fir-md1-s1 kernel: Code: 37 81 fe 00 01 00 00 74 f4 e9 93 fe ff ff 0f 1f 80 00 00 00 00 83 fa 01 75 11 0f 1f 00 e9 68 fe ff ff 0f 1f 00 85 c0 74 0c f3 90 <8b> 07 0f b6 c0 83 f8 03 75 f0 b8 01 00 00 00 66 89 07 5d c3 66 May 01 23:42:38 fir-md1-s1 kernel: Lustre: 102998:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff984bdbe48c50 x1631547606839792/t0(0) o4->778be52b-e90c-a9e1-8d5b-9e961e103e4e@10.9.101.5@o2ib4:13/0 lens 6328/448 e 1 to 0 dl 1556779363 ref 2 fl Interpret:/0/0 rc 0/0 May 01 23:42:38 fir-md1-s1 kernel: Lustre: 102998:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 60 previous similar messages May 01 23:42:41 fir-md1-s1 kernel: Lustre: 102963:0:(service.c:2165:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (30:1s); client may timeout. req@ffff98269b6d3850 x1631550680416096/t279448167600(0) o4->da8b20c2-1617-543d-bea7-2bc4da319abc@10.9.112.7@o2ib4:10/0 lens 488/416 e 0 to 0 dl 1556779360 ref 1 fl Complete:/0/0 rc 0/0 May 01 23:42:41 fir-md1-s1 kernel: Lustre: 101380:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556779340/real 1556779340] req@ffff98471df1fb00 x1632254603991360/t0(0) o101->fir-MDT0000-lwp-MDT0002@0@lo:23/10 lens 456/496 e 1 to 1 dl 1556779361 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 May 01 23:42:43 fir-md1-s1 kernel: Lustre: fir-MDT0000: Client fd395c74-7a26-632a-06a8-cecc6aa8caa0 (at 10.9.105.64@o2ib4) reconnecting May 01 23:42:43 fir-md1-s1 kernel: Lustre: Skipped 16 previous similar messages May 01 23:42:43 fir-md1-s1 kernel: Lustre: fir-MDT0000: Connection restored to cb9e5693-7c44-f40c-eac0-9f9482ccd7f6 (at 10.9.105.64@o2ib4) May 01 23:42:43 fir-md1-s1 kernel: Lustre: Skipped 16 previous similar messages May 01 23:42:44 fir-md1-s1 kernel: Lustre: 103078:0:(service.c:2165:ptlrpc_server_handle_request()) @@@ Request took longer than estimated 
(30:1s); client may timeout. req@ffff983816f31050 x1631541998329168/t279448167813(0) o4->d4242da5-5a9c-4508-f9da-c1e7f36347f4@10.9.114.4@o2ib4:13/0 lens 488/416 e 0 to 0 dl 1556779363 ref 1 fl Complete:/0/0 rc 0/0 May 01 23:42:50 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#26 stuck for 23s! [mdt_io02_043:103027] May 01 23:42:50 fir-md1-s1 kernel: Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) ldiskfs(OE) lustre(OE) lmv(OE) mdc(OE) osc(OE) lov(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) dell_rbu sunrpc vfat fat dm_round_robin amd64_edac_mod edac_mce_amd kvm_amd kvm ses irqbypass crc32_pclmul enclosure ghash_clmulni_intel dcdbas aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ipmi_si pcspkr ipmi_devintf ccp i2c_piix4 dm_multipath sg k10temp ipmi_msghandler dm_mod acpi_power_meter knem(OE) ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif May 01 23:42:50 fir-md1-s1 kernel: crct10dif_generic mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) i2c_algo_bit drm_kms_helper mlx5_core(OE) syscopyarea sysfillrect sysimgblt fb_sys_fops mlxfw(OE) crct10dif_pclmul ttm devlink ahci crct10dif_common libahci drm mlx_compat(OE) tg3 crc32c_intel libata megaraid_sas drm_panel_orientation_quirks ptp pps_core mpt3sas(OE) raid_class scsi_transport_sas [last unloaded: libcfs] May 01 23:42:50 fir-md1-s1 kernel: CPU: 26 PID: 103027 Comm: mdt_io02_043 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:42:50 fir-md1-s1 kernel: Hardware name: Dell Inc. 
PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:42:50 fir-md1-s1 kernel: task: ffff982c812730c0 ti: ffff983a5ba80000 task.ti: ffff983a5ba80000 May 01 23:42:50 fir-md1-s1 kernel: RIP: 0010:[] [] native_queued_spin_lock_slowpath+0x126/0x200 May 01 23:42:50 fir-md1-s1 kernel: RSP: 0018:ffff983a5ba83800 EFLAGS: 00000246 May 01 23:42:50 fir-md1-s1 kernel: RAX: 0000000000000000 RBX: ffff983165526b60 RCX: 0000000000d10000 May 01 23:42:50 fir-md1-s1 kernel: RDX: ffff982cfeedb780 RSI: 0000000000610101 RDI: ffff982c9fc8c480 May 01 23:42:50 fir-md1-s1 kernel: RBP: ffff983a5ba83800 R08: ffff984cff79b780 R09: 0000000000000000 May 01 23:42:50 fir-md1-s1 kernel: R10: ffff984cff79f140 R11: ffffde3f6b96dc00 R12: 0000000000000000 May 01 23:42:50 fir-md1-s1 kernel: R13: ffff983a5ba837a0 R14: ffff9831655268d0 R15: 0000000000000000 May 01 23:42:50 fir-md1-s1 kernel: FS: 00007fa424097780(0000) GS:ffff984cff780000(0000) knlGS:0000000000000000 May 01 23:42:50 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:42:50 fir-md1-s1 kernel: CR2: 00007fa4240a8000 CR3: 00000012f7610000 CR4: 00000000003407e0 May 01 23:42:50 fir-md1-s1 kernel: Call Trace: May 01 23:42:50 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf May 01 23:42:50 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30 May 01 23:42:50 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs] May 01 23:42:50 fir-md1-s1 kernel: [] ldiskfs_ext_map_blocks+0x7b5/0xf60 [ldiskfs] May 01 23:42:50 fir-md1-s1 kernel: [] ? ktime_get+0x52/0xe0 May 01 23:42:50 fir-md1-s1 kernel: [] ? kiblnd_check_sends_locked+0xa72/0xe40 [ko2iblnd] May 01 23:42:50 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x98/0x700 [ldiskfs] May 01 23:42:50 fir-md1-s1 kernel: [] ? 
ktime_get_ts64+0x52/0xf0 May 01 23:42:50 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs] May 01 23:42:50 fir-md1-s1 kernel: [] osd_write_prep+0x2b6/0x360 [osd_ldiskfs] May 01 23:42:50 fir-md1-s1 kernel: [] mdt_obd_preprw+0x637/0x1060 [mdt] May 01 23:42:50 fir-md1-s1 kernel: [] tgt_brw_write+0xc7e/0x1a90 [ptlrpc] May 01 23:42:50 fir-md1-s1 kernel: [] ? lustre_msg_buf_v2+0x1b0/0x1b0 [ptlrpc] May 01 23:42:50 fir-md1-s1 kernel: [] ? lustre_msg_buf+0x17/0x60 [ptlrpc] May 01 23:42:50 fir-md1-s1 kernel: [] ? update_curr+0x14c/0x1e0 May 01 23:42:50 fir-md1-s1 kernel: [] ? account_entity_dequeue+0xae/0xd0 May 01 23:42:50 fir-md1-s1 kernel: [] ? __enqueue_entity+0x78/0x80 May 01 23:42:50 fir-md1-s1 kernel: [] ? tgt_lookup_reply+0x2d/0x190 [ptlrpc] May 01 23:42:50 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:42:50 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:42:50 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:42:50 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:42:50 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:42:50 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:42:50 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:42:50 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:42:50 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:42:50 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:42:50 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:42:50 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:42:50 fir-md1-s1 kernel: [] ? 
insert_kthread_work+0x40/0x40 May 01 23:42:50 fir-md1-s1 kernel: Code: 0d 48 98 83 e2 30 48 81 c2 80 b7 01 00 48 03 14 c5 60 b9 b4 b7 4c 89 02 41 8b 40 08 85 c0 75 0f 0f 1f 44 00 00 f3 90 41 8b 40 08 <85> c0 74 f6 4d 8b 08 4d 85 c9 74 04 41 0f 18 09 8b 17 0f b7 c2 May 01 23:42:55 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [mdt_io02_001:101733] May 01 23:42:55 fir-md1-s1 kernel: Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) ldiskfs(OE) lustre(OE) lmv(OE) mdc(OE) osc(OE) lov(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) dell_rbu sunrpc vfat fat dm_round_robin amd64_edac_mod edac_mce_amd kvm_amd kvm ses irqbypass crc32_pclmul enclosure ghash_clmulni_intel dcdbas aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ipmi_si pcspkr ipmi_devintf ccp i2c_piix4 dm_multipath sg k10temp ipmi_msghandler dm_mod acpi_power_meter knem(OE) ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif May 01 23:42:55 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#15 stuck for 22s! 
[mdt_io03_043:103125] May 01 23:42:55 fir-md1-s1 kernel: crct10dif_generic mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) i2c_algo_bit drm_kms_helper mlx5_core(OE) syscopyarea sysfillrect sysimgblt fb_sys_fops mlxfw(OE) crct10dif_pclmul ttm devlink ahci crct10dif_common libahci drm mlx_compat(OE) tg3 crc32c_intel libata megaraid_sas drm_panel_orientation_quirks ptp pps_core mpt3sas(OE) raid_class scsi_transport_sas May 01 23:42:55 fir-md1-s1 kernel: Modules linked in: May 01 23:42:55 fir-md1-s1 kernel: osp(OE) May 01 23:42:55 fir-md1-s1 kernel: mdd(OE) May 01 23:42:55 fir-md1-s1 kernel: lod(OE) May 01 23:42:55 fir-md1-s1 kernel: mdt(OE) May 01 23:42:55 fir-md1-s1 kernel: lfsck(OE) May 01 23:42:55 fir-md1-s1 kernel: mgs(OE) May 01 23:42:55 fir-md1-s1 kernel: mgc(OE) May 01 23:42:55 fir-md1-s1 kernel: osd_ldiskfs(OE) May 01 23:42:55 fir-md1-s1 kernel: lquota(OE) May 01 23:42:55 fir-md1-s1 kernel: ldiskfs(OE) May 01 23:42:55 fir-md1-s1 kernel: lustre(OE) May 01 23:42:55 fir-md1-s1 kernel: lmv(OE) May 01 23:42:55 fir-md1-s1 kernel: mdc(OE) May 01 23:42:55 fir-md1-s1 kernel: osc(OE) May 01 23:42:55 fir-md1-s1 kernel: lov(OE) May 01 23:42:55 fir-md1-s1 kernel: fid(OE) May 01 23:42:55 fir-md1-s1 kernel: fld(OE) May 01 23:42:55 fir-md1-s1 kernel: ko2iblnd(OE) May 01 23:42:55 fir-md1-s1 kernel: ptlrpc(OE) May 01 23:42:55 fir-md1-s1 kernel: obdclass(OE) May 01 23:42:55 fir-md1-s1 kernel: lnet(OE) May 01 23:42:55 fir-md1-s1 kernel: libcfs(OE) May 01 23:42:55 fir-md1-s1 kernel: rpcsec_gss_krb5 May 01 23:42:55 fir-md1-s1 kernel: auth_rpcgss May 01 23:42:55 fir-md1-s1 kernel: nfsv4 May 01 23:42:55 fir-md1-s1 kernel: dns_resolver May 01 23:42:55 fir-md1-s1 kernel: nfs May 01 23:42:55 fir-md1-s1 kernel: lockd May 01 23:42:55 fir-md1-s1 kernel: grace May 01 23:42:55 fir-md1-s1 kernel: fscache May 01 23:42:55 fir-md1-s1 kernel: rdma_ucm(OE) May 01 23:42:55 fir-md1-s1 kernel: ib_ucm(OE) May 01 23:42:55 fir-md1-s1 kernel: rdma_cm(OE) May 01 23:42:55 fir-md1-s1 kernel: iw_cm(OE) May 01 23:42:55 
fir-md1-s1 kernel: ib_ipoib(OE) May 01 23:42:55 fir-md1-s1 kernel: ib_cm(OE) May 01 23:42:55 fir-md1-s1 kernel: ib_umad(OE) May 01 23:42:55 fir-md1-s1 kernel: mlx5_fpga_tools(OE) May 01 23:42:55 fir-md1-s1 kernel: mlx4_en(OE) May 01 23:42:55 fir-md1-s1 kernel: mlx4_ib(OE) May 01 23:42:55 fir-md1-s1 kernel: mlx4_core(OE) May 01 23:42:55 fir-md1-s1 kernel: dell_rbu May 01 23:42:55 fir-md1-s1 kernel: sunrpc May 01 23:42:55 fir-md1-s1 kernel: vfat May 01 23:42:55 fir-md1-s1 kernel: fat May 01 23:42:55 fir-md1-s1 kernel: dm_round_robin May 01 23:42:55 fir-md1-s1 kernel: amd64_edac_mod May 01 23:42:55 fir-md1-s1 kernel: edac_mce_amd May 01 23:42:55 fir-md1-s1 kernel: kvm_amd May 01 23:42:55 fir-md1-s1 kernel: kvm May 01 23:42:55 fir-md1-s1 kernel: ses May 01 23:42:55 fir-md1-s1 kernel: irqbypass May 01 23:42:55 fir-md1-s1 kernel: crc32_pclmul May 01 23:42:55 fir-md1-s1 kernel: enclosure May 01 23:42:55 fir-md1-s1 kernel: ghash_clmulni_intel May 01 23:42:55 fir-md1-s1 kernel: dcdbas May 01 23:42:55 fir-md1-s1 kernel: aesni_intel May 01 23:42:55 fir-md1-s1 kernel: lrw May 01 23:42:55 fir-md1-s1 kernel: gf128mul May 01 23:42:55 fir-md1-s1 kernel: glue_helper May 01 23:42:55 fir-md1-s1 kernel: ablk_helper May 01 23:42:55 fir-md1-s1 kernel: cryptd May 01 23:42:55 fir-md1-s1 kernel: ipmi_si May 01 23:42:55 fir-md1-s1 kernel: pcspkr May 01 23:42:55 fir-md1-s1 kernel: ipmi_devintf May 01 23:42:55 fir-md1-s1 kernel: ccp May 01 23:42:55 fir-md1-s1 kernel: i2c_piix4 May 01 23:42:55 fir-md1-s1 kernel: dm_multipath May 01 23:42:55 fir-md1-s1 kernel: sg May 01 23:42:55 fir-md1-s1 kernel: k10temp May 01 23:42:55 fir-md1-s1 kernel: ipmi_msghandler May 01 23:42:55 fir-md1-s1 kernel: dm_mod May 01 23:42:55 fir-md1-s1 kernel: acpi_power_meter May 01 23:42:55 fir-md1-s1 kernel: knem(OE) May 01 23:42:55 fir-md1-s1 kernel: ip_tables May 01 23:42:55 fir-md1-s1 kernel: ext4 May 01 23:42:55 fir-md1-s1 kernel: mbcache May 01 23:42:55 fir-md1-s1 kernel: jbd2 May 01 23:42:55 fir-md1-s1 kernel: 
sd_mod May 01 23:42:55 fir-md1-s1 kernel: crc_t10dif May 01 23:42:55 fir-md1-s1 kernel: crct10dif_generic May 01 23:42:55 fir-md1-s1 kernel: mlx5_ib(OE) May 01 23:42:55 fir-md1-s1 kernel: ib_uverbs(OE) May 01 23:42:55 fir-md1-s1 kernel: ib_core(OE) May 01 23:42:55 fir-md1-s1 kernel: i2c_algo_bit May 01 23:42:55 fir-md1-s1 kernel: drm_kms_helper May 01 23:42:55 fir-md1-s1 kernel: mlx5_core(OE) May 01 23:42:55 fir-md1-s1 kernel: syscopyarea May 01 23:42:55 fir-md1-s1 kernel: sysfillrect May 01 23:42:55 fir-md1-s1 kernel: sysimgblt May 01 23:42:55 fir-md1-s1 kernel: fb_sys_fops May 01 23:42:55 fir-md1-s1 kernel: mlxfw(OE) May 01 23:42:55 fir-md1-s1 kernel: crct10dif_pclmul May 01 23:42:55 fir-md1-s1 kernel: ttm May 01 23:42:55 fir-md1-s1 kernel: devlink May 01 23:42:55 fir-md1-s1 kernel: ahci May 01 23:42:55 fir-md1-s1 kernel: crct10dif_common May 01 23:42:55 fir-md1-s1 kernel: libahci May 01 23:42:55 fir-md1-s1 kernel: drm May 01 23:42:55 fir-md1-s1 kernel: mlx_compat(OE) May 01 23:42:55 fir-md1-s1 kernel: tg3 May 01 23:42:55 fir-md1-s1 kernel: crc32c_intel May 01 23:42:55 fir-md1-s1 kernel: libata May 01 23:42:55 fir-md1-s1 kernel: megaraid_sas May 01 23:42:55 fir-md1-s1 kernel: drm_panel_orientation_quirks May 01 23:42:55 fir-md1-s1 kernel: ptp May 01 23:42:55 fir-md1-s1 kernel: pps_core May 01 23:42:55 fir-md1-s1 kernel: mpt3sas(OE) May 01 23:42:55 fir-md1-s1 kernel: raid_class May 01 23:42:55 fir-md1-s1 kernel: scsi_transport_sas May 01 23:42:55 fir-md1-s1 kernel: [last unloaded: libcfs] May 01 23:42:55 fir-md1-s1 kernel: May 01 23:42:55 fir-md1-s1 kernel: CPU: 15 PID: 103125 Comm: mdt_io03_043 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:42:55 fir-md1-s1 kernel: Hardware name: Dell Inc. 
PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:42:55 fir-md1-s1 kernel: task: ffff985912f64100 ti: ffff9858407d0000 task.ti: ffff9858407d0000 May 01 23:42:55 fir-md1-s1 kernel: RIP: 0010:[] May 01 23:42:55 fir-md1-s1 kernel: [] native_queued_spin_lock_slowpath+0x122/0x200 May 01 23:42:55 fir-md1-s1 kernel: RSP: 0018:ffff9858407d38e8 EFLAGS: 00000246 May 01 23:42:55 fir-md1-s1 kernel: RAX: 0000000000000000 RBX: ffff984851f7a378 RCX: 0000000000790000 May 01 23:42:55 fir-md1-s1 kernel: RDX: ffff982cff01b780 RSI: 0000000001010101 RDI: ffff982c9fc8c480 May 01 23:42:55 fir-md1-s1 kernel: RBP: ffff9858407d38e8 R08: ffff985d3f4db780 R09: 0000000000000000 May 01 23:42:55 fir-md1-s1 kernel: R10: 0000000000000000 R11: ffff985c8b319038 R12: ffff985d3f55ac00 May 01 23:42:55 fir-md1-s1 kernel: R13: ffff985912f64168 R14: 00ff9858407d3850 R15: ffff984cf34000a0 May 01 23:42:55 fir-md1-s1 kernel: FS: 00007f63c1c68740(0000) GS:ffff985d3f4c0000(0000) knlGS:0000000000000000 May 01 23:42:55 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:42:55 fir-md1-s1 kernel: CR2: 00007ff884d9fd1c CR3: 00000012f7610000 CR4: 00000000003407e0 May 01 23:42:55 fir-md1-s1 kernel: Call Trace: May 01 23:42:55 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf May 01 23:42:55 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30 May 01 23:42:55 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs] May 01 23:42:55 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x210/0x700 [ldiskfs] May 01 23:42:55 fir-md1-s1 kernel: [] ? ktime_get_ts64+0x52/0xf0 May 01 23:42:55 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs] May 01 23:42:55 fir-md1-s1 kernel: [] osd_write_prep+0x2b6/0x360 [osd_ldiskfs] May 01 23:42:55 fir-md1-s1 kernel: [] mdt_obd_preprw+0x637/0x1060 [mdt] May 01 23:42:55 fir-md1-s1 kernel: [] tgt_brw_write+0xc7e/0x1a90 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? load_balance+0x178/0x9a0 May 01 23:42:55 fir-md1-s1 kernel: [] ? 
update_curr+0x14c/0x1e0 May 01 23:42:55 fir-md1-s1 kernel: [] ? account_entity_dequeue+0xae/0xd0 May 01 23:42:55 fir-md1-s1 kernel: [] ? __enqueue_entity+0x78/0x80 May 01 23:42:55 fir-md1-s1 kernel: [] ? tgt_lookup_reply+0x2d/0x190 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:42:55 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:42:55 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:42:55 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:42:55 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:42:55 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:42:55 fir-md1-s1 kernel: [] ? 
insert_kthread_work+0x40/0x40 May 01 23:42:55 fir-md1-s1 kernel: Code: May 01 23:42:55 fir-md1-s1 kernel: 13 May 01 23:42:55 fir-md1-s1 kernel: 48 May 01 23:42:55 fir-md1-s1 kernel: c1 May 01 23:42:55 fir-md1-s1 kernel: ea May 01 23:42:55 fir-md1-s1 kernel: 0d May 01 23:42:55 fir-md1-s1 kernel: 48 May 01 23:42:55 fir-md1-s1 kernel: 98 May 01 23:42:55 fir-md1-s1 kernel: 83 May 01 23:42:55 fir-md1-s1 kernel: e2 May 01 23:42:55 fir-md1-s1 kernel: 30 May 01 23:42:55 fir-md1-s1 kernel: 48 May 01 23:42:55 fir-md1-s1 kernel: 81 May 01 23:42:55 fir-md1-s1 kernel: c2 May 01 23:42:55 fir-md1-s1 kernel: 80 May 01 23:42:55 fir-md1-s1 kernel: b7 May 01 23:42:55 fir-md1-s1 kernel: 01 May 01 23:42:55 fir-md1-s1 kernel: 00 May 01 23:42:55 fir-md1-s1 kernel: 48 May 01 23:42:55 fir-md1-s1 kernel: 03 May 01 23:42:55 fir-md1-s1 kernel: 14 May 01 23:42:55 fir-md1-s1 kernel: c5 May 01 23:42:55 fir-md1-s1 kernel: 60 May 01 23:42:55 fir-md1-s1 kernel: b9 May 01 23:42:55 fir-md1-s1 kernel: b4 May 01 23:42:55 fir-md1-s1 kernel: b7 May 01 23:42:55 fir-md1-s1 kernel: 4c May 01 23:42:55 fir-md1-s1 kernel: 89 May 01 23:42:55 fir-md1-s1 kernel: 02 May 01 23:42:55 fir-md1-s1 kernel: 41 May 01 23:42:55 fir-md1-s1 kernel: 8b May 01 23:42:55 fir-md1-s1 kernel: 40 May 01 23:42:55 fir-md1-s1 kernel: 08 May 01 23:42:55 fir-md1-s1 kernel: 85 May 01 23:42:55 fir-md1-s1 kernel: c0 May 01 23:42:55 fir-md1-s1 kernel: 75 May 01 23:42:55 fir-md1-s1 kernel: 0f May 01 23:42:55 fir-md1-s1 kernel: 0f May 01 23:42:55 fir-md1-s1 kernel: 1f May 01 23:42:55 fir-md1-s1 kernel: 44 May 01 23:42:55 fir-md1-s1 kernel: 00 May 01 23:42:55 fir-md1-s1 kernel: 00 May 01 23:42:55 fir-md1-s1 kernel: f3 May 01 23:42:55 fir-md1-s1 kernel: 90 May 01 23:42:55 fir-md1-s1 kernel: <41> May 01 23:42:55 fir-md1-s1 kernel: 8b May 01 23:42:55 fir-md1-s1 kernel: 40 May 01 23:42:55 fir-md1-s1 kernel: 08 May 01 23:42:55 fir-md1-s1 kernel: 85 May 01 23:42:55 fir-md1-s1 kernel: c0 May 01 23:42:55 fir-md1-s1 kernel: 74 May 01 23:42:55 fir-md1-s1 
kernel: f6 May 01 23:42:55 fir-md1-s1 kernel: 4d May 01 23:42:55 fir-md1-s1 kernel: 8b May 01 23:42:55 fir-md1-s1 kernel: 08 May 01 23:42:55 fir-md1-s1 kernel: 4d May 01 23:42:55 fir-md1-s1 kernel: 85 May 01 23:42:55 fir-md1-s1 kernel: c9 May 01 23:42:55 fir-md1-s1 kernel: 74 May 01 23:42:55 fir-md1-s1 kernel: 04 May 01 23:42:55 fir-md1-s1 kernel: 41 May 01 23:42:55 fir-md1-s1 kernel: 0f May 01 23:42:55 fir-md1-s1 kernel: 18 May 01 23:42:55 fir-md1-s1 kernel: 09 May 01 23:42:55 fir-md1-s1 kernel: 8b May 01 23:42:55 fir-md1-s1 kernel: May 01 23:42:55 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#21 stuck for 22s! [mdt_io01_085:103094] May 01 23:42:55 fir-md1-s1 kernel: Modules linked in: May 01 23:42:55 fir-md1-s1 kernel: osp(OE) May 01 23:42:55 fir-md1-s1 kernel: mdd(OE) May 01 23:42:55 fir-md1-s1 kernel: lod(OE) May 01 23:42:55 fir-md1-s1 kernel: mdt(OE) May 01 23:42:55 fir-md1-s1 kernel: lfsck(OE) May 01 23:42:55 fir-md1-s1 kernel: mgs(OE) May 01 23:42:55 fir-md1-s1 kernel: mgc(OE) May 01 23:42:55 fir-md1-s1 kernel: osd_ldiskfs(OE) May 01 23:42:55 fir-md1-s1 kernel: lquota(OE) May 01 23:42:55 fir-md1-s1 kernel: ldiskfs(OE) May 01 23:42:55 fir-md1-s1 kernel: lustre(OE) May 01 23:42:55 fir-md1-s1 kernel: lmv(OE) May 01 23:42:55 fir-md1-s1 kernel: mdc(OE) May 01 23:42:55 fir-md1-s1 kernel: osc(OE) May 01 23:42:55 fir-md1-s1 kernel: lov(OE) May 01 23:42:55 fir-md1-s1 kernel: fid(OE) May 01 23:42:55 fir-md1-s1 kernel: fld(OE) May 01 23:42:55 fir-md1-s1 kernel: ko2iblnd(OE) May 01 23:42:55 fir-md1-s1 kernel: ptlrpc(OE) May 01 23:42:55 fir-md1-s1 kernel: obdclass(OE) May 01 23:42:55 fir-md1-s1 kernel: lnet(OE) May 01 23:42:55 fir-md1-s1 kernel: libcfs(OE) May 01 23:42:55 fir-md1-s1 kernel: rpcsec_gss_krb5 May 01 23:42:55 fir-md1-s1 kernel: auth_rpcgss May 01 23:42:55 fir-md1-s1 kernel: nfsv4 May 01 23:42:55 fir-md1-s1 kernel: dns_resolver May 01 23:42:55 fir-md1-s1 kernel: nfs May 01 23:42:55 fir-md1-s1 kernel: lockd May 01 23:42:55 fir-md1-s1 kernel: grace 
May 01 23:42:55 fir-md1-s1 kernel: fscache May 01 23:42:55 fir-md1-s1 kernel: rdma_ucm(OE) May 01 23:42:55 fir-md1-s1 kernel: ib_ucm(OE) May 01 23:42:55 fir-md1-s1 kernel: rdma_cm(OE) May 01 23:42:55 fir-md1-s1 kernel: iw_cm(OE) May 01 23:42:55 fir-md1-s1 kernel: ib_ipoib(OE) May 01 23:42:55 fir-md1-s1 kernel: ib_cm(OE) May 01 23:42:55 fir-md1-s1 kernel: ib_umad(OE) May 01 23:42:55 fir-md1-s1 kernel: mlx5_fpga_tools(OE) May 01 23:42:55 fir-md1-s1 kernel: mlx4_en(OE) May 01 23:42:55 fir-md1-s1 kernel: mlx4_ib(OE) May 01 23:42:55 fir-md1-s1 kernel: mlx4_core(OE) May 01 23:42:55 fir-md1-s1 kernel: dell_rbu May 01 23:42:55 fir-md1-s1 kernel: sunrpc May 01 23:42:55 fir-md1-s1 kernel: vfat May 01 23:42:55 fir-md1-s1 kernel: fat May 01 23:42:55 fir-md1-s1 kernel: dm_round_robin May 01 23:42:55 fir-md1-s1 kernel: amd64_edac_mod May 01 23:42:55 fir-md1-s1 kernel: edac_mce_amd May 01 23:42:55 fir-md1-s1 kernel: kvm_amd May 01 23:42:55 fir-md1-s1 kernel: kvm May 01 23:42:55 fir-md1-s1 kernel: ses May 01 23:42:55 fir-md1-s1 kernel: irqbypass May 01 23:42:55 fir-md1-s1 kernel: crc32_pclmul May 01 23:42:55 fir-md1-s1 kernel: enclosure May 01 23:42:55 fir-md1-s1 kernel: ghash_clmulni_intel May 01 23:42:55 fir-md1-s1 kernel: dcdbas May 01 23:42:55 fir-md1-s1 kernel: aesni_intel May 01 23:42:55 fir-md1-s1 kernel: lrw May 01 23:42:55 fir-md1-s1 kernel: gf128mul May 01 23:42:55 fir-md1-s1 kernel: glue_helper May 01 23:42:55 fir-md1-s1 kernel: ablk_helper May 01 23:42:55 fir-md1-s1 kernel: cryptd May 01 23:42:55 fir-md1-s1 kernel: ipmi_si May 01 23:42:55 fir-md1-s1 kernel: pcspkr May 01 23:42:55 fir-md1-s1 kernel: ipmi_devintf May 01 23:42:55 fir-md1-s1 kernel: ccp May 01 23:42:55 fir-md1-s1 kernel: i2c_piix4 May 01 23:42:55 fir-md1-s1 kernel: dm_multipath May 01 23:42:55 fir-md1-s1 kernel: sg May 01 23:42:55 fir-md1-s1 kernel: k10temp May 01 23:42:55 fir-md1-s1 kernel: ipmi_msghandler May 01 23:42:55 fir-md1-s1 kernel: dm_mod May 01 23:42:55 fir-md1-s1 kernel: acpi_power_meter May 01 
23:42:55 fir-md1-s1 kernel: knem(OE) May 01 23:42:55 fir-md1-s1 kernel: ip_tables May 01 23:42:55 fir-md1-s1 kernel: ext4 May 01 23:42:55 fir-md1-s1 kernel: mbcache May 01 23:42:55 fir-md1-s1 kernel: jbd2 May 01 23:42:55 fir-md1-s1 kernel: sd_mod May 01 23:42:55 fir-md1-s1 kernel: crc_t10dif May 01 23:42:55 fir-md1-s1 kernel: crct10dif_generic May 01 23:42:55 fir-md1-s1 kernel: mlx5_ib(OE) May 01 23:42:55 fir-md1-s1 kernel: ib_uverbs(OE) May 01 23:42:55 fir-md1-s1 kernel: ib_core(OE) May 01 23:42:55 fir-md1-s1 kernel: i2c_algo_bit May 01 23:42:55 fir-md1-s1 kernel: drm_kms_helper May 01 23:42:55 fir-md1-s1 kernel: mlx5_core(OE) May 01 23:42:55 fir-md1-s1 kernel: syscopyarea May 01 23:42:55 fir-md1-s1 kernel: sysfillrect May 01 23:42:55 fir-md1-s1 kernel: sysimgblt May 01 23:42:55 fir-md1-s1 kernel: fb_sys_fops May 01 23:42:55 fir-md1-s1 kernel: mlxfw(OE) May 01 23:42:55 fir-md1-s1 kernel: crct10dif_pclmul May 01 23:42:55 fir-md1-s1 kernel: ttm May 01 23:42:55 fir-md1-s1 kernel: devlink May 01 23:42:55 fir-md1-s1 kernel: ahci May 01 23:42:55 fir-md1-s1 kernel: crct10dif_common May 01 23:42:55 fir-md1-s1 kernel: libahci May 01 23:42:55 fir-md1-s1 kernel: drm May 01 23:42:55 fir-md1-s1 kernel: mlx_compat(OE) May 01 23:42:55 fir-md1-s1 kernel: tg3 May 01 23:42:55 fir-md1-s1 kernel: crc32c_intel May 01 23:42:55 fir-md1-s1 kernel: libata May 01 23:42:55 fir-md1-s1 kernel: megaraid_sas May 01 23:42:55 fir-md1-s1 kernel: drm_panel_orientation_quirks May 01 23:42:55 fir-md1-s1 kernel: ptp May 01 23:42:55 fir-md1-s1 kernel: pps_core May 01 23:42:55 fir-md1-s1 kernel: mpt3sas(OE) May 01 23:42:55 fir-md1-s1 kernel: raid_class May 01 23:42:55 fir-md1-s1 kernel: scsi_transport_sas May 01 23:42:55 fir-md1-s1 kernel: [last unloaded: libcfs] May 01 23:42:55 fir-md1-s1 kernel: May 01 23:42:55 fir-md1-s1 kernel: CPU: 21 PID: 103094 Comm: mdt_io01_085 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:42:55 fir-md1-s1 kernel: Hardware name: Dell 
Inc. PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:42:55 fir-md1-s1 kernel: task: ffff982c9ff3c100 ti: ffff98596d728000 task.ti: ffff98596d728000 May 01 23:42:55 fir-md1-s1 kernel: RIP: 0010:[] May 01 23:42:55 fir-md1-s1 kernel: [] native_queued_spin_lock_slowpath+0x126/0x200 May 01 23:42:55 fir-md1-s1 kernel: RSP: 0018:ffff98596d72b8e8 EFLAGS: 00000246 May 01 23:42:55 fir-md1-s1 kernel: RAX: 0000000000000000 RBX: ffff98267ca42b78 RCX: 0000000000a90000 May 01 23:42:55 fir-md1-s1 kernel: RDX: ffff984cff79b780 RSI: 0000000000d10101 RDI: ffff982c9fc8c480 May 01 23:42:55 fir-md1-s1 kernel: RBP: ffff98596d72b8e8 R08: ffff983cff75b780 R09: 0000000000000000 May 01 23:42:55 fir-md1-s1 kernel: R10: 0000000000000000 R11: ffff985c8b319038 R12: ffff983cff65ac00 May 01 23:42:55 fir-md1-s1 kernel: R13: ffff982c9ff3c168 R14: 00ffffffb7a08d80 R15: ffff984cf34000a0 May 01 23:42:55 fir-md1-s1 kernel: FS: 00007fad68bc1880(0000) GS:ffff983cff740000(0000) knlGS:0000000000000000 May 01 23:42:55 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:42:55 fir-md1-s1 kernel: CR2: 00007ffcc3425f98 CR3: 00000012f7610000 CR4: 00000000003407e0 May 01 23:42:55 fir-md1-s1 kernel: Call Trace: May 01 23:42:55 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf May 01 23:42:55 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30 May 01 23:42:55 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs] May 01 23:42:55 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x210/0x700 [ldiskfs] May 01 23:42:55 fir-md1-s1 kernel: [] ? ktime_get_ts64+0x52/0xf0 May 01 23:42:55 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs] May 01 23:42:55 fir-md1-s1 kernel: [] osd_write_prep+0x2b6/0x360 [osd_ldiskfs] May 01 23:42:55 fir-md1-s1 kernel: [] mdt_obd_preprw+0x637/0x1060 [mdt] May 01 23:42:55 fir-md1-s1 kernel: [] tgt_brw_write+0xc7e/0x1a90 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? 
lustre_msg_buf_v2+0x1b0/0x1b0 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? lustre_msg_buf+0x17/0x60 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? update_curr+0x14c/0x1e0 May 01 23:42:55 fir-md1-s1 kernel: [] ? account_entity_dequeue+0xae/0xd0 May 01 23:42:55 fir-md1-s1 kernel: [] ? __enqueue_entity+0x78/0x80 May 01 23:42:55 fir-md1-s1 kernel: [] ? tgt_lookup_reply+0x2d/0x190 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:42:55 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:42:55 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:42:55 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:42:55 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:42:55 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:42:55 fir-md1-s1 kernel: [] ? 
insert_kthread_work+0x40/0x40 May 01 23:42:55 fir-md1-s1 kernel: Code: May 01 23:42:55 fir-md1-s1 kernel: 0d May 01 23:42:55 fir-md1-s1 kernel: 48 May 01 23:42:55 fir-md1-s1 kernel: 98 May 01 23:42:55 fir-md1-s1 kernel: 83 May 01 23:42:55 fir-md1-s1 kernel: e2 May 01 23:42:55 fir-md1-s1 kernel: 30 May 01 23:42:55 fir-md1-s1 kernel: 48 May 01 23:42:55 fir-md1-s1 kernel: 81 May 01 23:42:55 fir-md1-s1 kernel: c2 May 01 23:42:55 fir-md1-s1 kernel: 80 May 01 23:42:55 fir-md1-s1 kernel: b7 May 01 23:42:55 fir-md1-s1 kernel: 01 May 01 23:42:55 fir-md1-s1 kernel: 00 May 01 23:42:55 fir-md1-s1 kernel: 48 May 01 23:42:55 fir-md1-s1 kernel: 03 May 01 23:42:55 fir-md1-s1 kernel: 14 May 01 23:42:55 fir-md1-s1 kernel: c5 May 01 23:42:55 fir-md1-s1 kernel: 60 May 01 23:42:55 fir-md1-s1 kernel: b9 May 01 23:42:55 fir-md1-s1 kernel: b4 May 01 23:42:55 fir-md1-s1 kernel: b7 May 01 23:42:55 fir-md1-s1 kernel: 4c May 01 23:42:55 fir-md1-s1 kernel: 89 May 01 23:42:55 fir-md1-s1 kernel: 02 May 01 23:42:55 fir-md1-s1 kernel: 41 May 01 23:42:55 fir-md1-s1 kernel: 8b May 01 23:42:55 fir-md1-s1 kernel: 40 May 01 23:42:55 fir-md1-s1 kernel: 08 May 01 23:42:55 fir-md1-s1 kernel: 85 May 01 23:42:55 fir-md1-s1 kernel: c0 May 01 23:42:55 fir-md1-s1 kernel: 75 May 01 23:42:55 fir-md1-s1 kernel: 0f May 01 23:42:55 fir-md1-s1 kernel: 0f May 01 23:42:55 fir-md1-s1 kernel: 1f May 01 23:42:55 fir-md1-s1 kernel: 44 May 01 23:42:55 fir-md1-s1 kernel: 00 May 01 23:42:55 fir-md1-s1 kernel: 00 May 01 23:42:55 fir-md1-s1 kernel: f3 May 01 23:42:55 fir-md1-s1 kernel: 90 May 01 23:42:55 fir-md1-s1 kernel: 41 May 01 23:42:55 fir-md1-s1 kernel: 8b May 01 23:42:55 fir-md1-s1 kernel: 40 May 01 23:42:55 fir-md1-s1 kernel: 08 May 01 23:42:55 fir-md1-s1 kernel: <85> May 01 23:42:55 fir-md1-s1 kernel: c0 May 01 23:42:55 fir-md1-s1 kernel: 74 May 01 23:42:55 fir-md1-s1 kernel: f6 May 01 23:42:55 fir-md1-s1 kernel: 4d May 01 23:42:55 fir-md1-s1 kernel: 8b May 01 23:42:55 fir-md1-s1 kernel: 08 May 01 23:42:55 fir-md1-s1 
kernel: 4d May 01 23:42:55 fir-md1-s1 kernel: 85 May 01 23:42:55 fir-md1-s1 kernel: c9 May 01 23:42:55 fir-md1-s1 kernel: 74 May 01 23:42:55 fir-md1-s1 kernel: 04 May 01 23:42:55 fir-md1-s1 kernel: 41 May 01 23:42:55 fir-md1-s1 kernel: 0f May 01 23:42:55 fir-md1-s1 kernel: 18 May 01 23:42:55 fir-md1-s1 kernel: 09 May 01 23:42:55 fir-md1-s1 kernel: 8b May 01 23:42:55 fir-md1-s1 kernel: 17 May 01 23:42:55 fir-md1-s1 kernel: 0f May 01 23:42:55 fir-md1-s1 kernel: b7 May 01 23:42:55 fir-md1-s1 kernel: c2 May 01 23:42:55 fir-md1-s1 kernel: May 01 23:42:55 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#32 stuck for 23s! [mdt_io00_072:103262] May 01 23:42:55 fir-md1-s1 kernel: Modules linked in: May 01 23:42:55 fir-md1-s1 kernel: osp(OE) May 01 23:42:55 fir-md1-s1 kernel: mdd(OE) May 01 23:42:55 fir-md1-s1 kernel: lod(OE) May 01 23:42:55 fir-md1-s1 kernel: mdt(OE) May 01 23:42:55 fir-md1-s1 kernel: lfsck(OE) May 01 23:42:55 fir-md1-s1 kernel: mgs(OE) May 01 23:42:55 fir-md1-s1 kernel: mgc(OE) May 01 23:42:55 fir-md1-s1 kernel: osd_ldiskfs(OE) May 01 23:42:55 fir-md1-s1 kernel: lquota(OE) May 01 23:42:55 fir-md1-s1 kernel: ldiskfs(OE) May 01 23:42:55 fir-md1-s1 kernel: lustre(OE) May 01 23:42:55 fir-md1-s1 kernel: lmv(OE) May 01 23:42:55 fir-md1-s1 kernel: mdc(OE) May 01 23:42:55 fir-md1-s1 kernel: osc(OE) May 01 23:42:55 fir-md1-s1 kernel: lov(OE) May 01 23:42:55 fir-md1-s1 kernel: fid(OE) May 01 23:42:55 fir-md1-s1 kernel: fld(OE) May 01 23:42:55 fir-md1-s1 kernel: ko2iblnd(OE) May 01 23:42:55 fir-md1-s1 kernel: ptlrpc(OE) May 01 23:42:55 fir-md1-s1 kernel: obdclass(OE) May 01 23:42:55 fir-md1-s1 kernel: lnet(OE) May 01 23:42:55 fir-md1-s1 kernel: libcfs(OE) May 01 23:42:55 fir-md1-s1 kernel: rpcsec_gss_krb5 May 01 23:42:55 fir-md1-s1 kernel: auth_rpcgss May 01 23:42:55 fir-md1-s1 kernel: nfsv4 May 01 23:42:55 fir-md1-s1 kernel: dns_resolver May 01 23:42:55 fir-md1-s1 kernel: nfs May 01 23:42:55 fir-md1-s1 kernel: lockd May 01 23:42:55 fir-md1-s1 kernel: grace 
May 01 23:42:55 fir-md1-s1 kernel: fscache May 01 23:42:55 fir-md1-s1 kernel: rdma_ucm(OE) May 01 23:42:55 fir-md1-s1 kernel: ib_ucm(OE) May 01 23:42:55 fir-md1-s1 kernel: rdma_cm(OE) May 01 23:42:55 fir-md1-s1 kernel: iw_cm(OE) May 01 23:42:55 fir-md1-s1 kernel: ib_ipoib(OE) May 01 23:42:55 fir-md1-s1 kernel: ib_cm(OE) May 01 23:42:55 fir-md1-s1 kernel: ib_umad(OE) May 01 23:42:55 fir-md1-s1 kernel: mlx5_fpga_tools(OE) May 01 23:42:55 fir-md1-s1 kernel: mlx4_en(OE) May 01 23:42:55 fir-md1-s1 kernel: mlx4_ib(OE) May 01 23:42:55 fir-md1-s1 kernel: mlx4_core(OE) May 01 23:42:55 fir-md1-s1 kernel: dell_rbu May 01 23:42:55 fir-md1-s1 kernel: sunrpc May 01 23:42:55 fir-md1-s1 kernel: vfat May 01 23:42:55 fir-md1-s1 kernel: fat May 01 23:42:55 fir-md1-s1 kernel: dm_round_robin May 01 23:42:55 fir-md1-s1 kernel: amd64_edac_mod May 01 23:42:55 fir-md1-s1 kernel: edac_mce_amd May 01 23:42:55 fir-md1-s1 kernel: kvm_amd May 01 23:42:55 fir-md1-s1 kernel: kvm May 01 23:42:55 fir-md1-s1 kernel: ses May 01 23:42:55 fir-md1-s1 kernel: irqbypass May 01 23:42:55 fir-md1-s1 kernel: crc32_pclmul May 01 23:42:55 fir-md1-s1 kernel: enclosure May 01 23:42:55 fir-md1-s1 kernel: ghash_clmulni_intel May 01 23:42:55 fir-md1-s1 kernel: dcdbas May 01 23:42:55 fir-md1-s1 kernel: aesni_intel May 01 23:42:55 fir-md1-s1 kernel: lrw May 01 23:42:55 fir-md1-s1 kernel: gf128mul May 01 23:42:55 fir-md1-s1 kernel: glue_helper May 01 23:42:55 fir-md1-s1 kernel: ablk_helper May 01 23:42:55 fir-md1-s1 kernel: cryptd May 01 23:42:55 fir-md1-s1 kernel: ipmi_si May 01 23:42:55 fir-md1-s1 kernel: pcspkr May 01 23:42:55 fir-md1-s1 kernel: ipmi_devintf May 01 23:42:55 fir-md1-s1 kernel: ccp May 01 23:42:55 fir-md1-s1 kernel: i2c_piix4 May 01 23:42:55 fir-md1-s1 kernel: dm_multipath May 01 23:42:55 fir-md1-s1 kernel: sg May 01 23:42:55 fir-md1-s1 kernel: k10temp May 01 23:42:55 fir-md1-s1 kernel: ipmi_msghandler May 01 23:42:55 fir-md1-s1 kernel: dm_mod May 01 23:42:55 fir-md1-s1 kernel: acpi_power_meter May 01 
23:42:55 fir-md1-s1 kernel: knem(OE) May 01 23:42:55 fir-md1-s1 kernel: ip_tables May 01 23:42:55 fir-md1-s1 kernel: ext4 May 01 23:42:55 fir-md1-s1 kernel: mbcache May 01 23:42:55 fir-md1-s1 kernel: jbd2 May 01 23:42:55 fir-md1-s1 kernel: sd_mod May 01 23:42:55 fir-md1-s1 kernel: crc_t10dif May 01 23:42:55 fir-md1-s1 kernel: crct10dif_generic May 01 23:42:55 fir-md1-s1 kernel: mlx5_ib(OE) May 01 23:42:55 fir-md1-s1 kernel: ib_uverbs(OE) May 01 23:42:55 fir-md1-s1 kernel: ib_core(OE) May 01 23:42:55 fir-md1-s1 kernel: i2c_algo_bit May 01 23:42:55 fir-md1-s1 kernel: drm_kms_helper May 01 23:42:55 fir-md1-s1 kernel: mlx5_core(OE) May 01 23:42:55 fir-md1-s1 kernel: syscopyarea May 01 23:42:55 fir-md1-s1 kernel: sysfillrect May 01 23:42:55 fir-md1-s1 kernel: sysimgblt May 01 23:42:55 fir-md1-s1 kernel: fb_sys_fops May 01 23:42:55 fir-md1-s1 kernel: mlxfw(OE) May 01 23:42:55 fir-md1-s1 kernel: crct10dif_pclmul May 01 23:42:55 fir-md1-s1 kernel: ttm May 01 23:42:55 fir-md1-s1 kernel: devlink May 01 23:42:55 fir-md1-s1 kernel: ahci May 01 23:42:55 fir-md1-s1 kernel: crct10dif_common May 01 23:42:55 fir-md1-s1 kernel: libahci May 01 23:42:55 fir-md1-s1 kernel: drm May 01 23:42:55 fir-md1-s1 kernel: mlx_compat(OE) May 01 23:42:55 fir-md1-s1 kernel: tg3 May 01 23:42:55 fir-md1-s1 kernel: crc32c_intel May 01 23:42:55 fir-md1-s1 kernel: libata May 01 23:42:55 fir-md1-s1 kernel: megaraid_sas May 01 23:42:55 fir-md1-s1 kernel: drm_panel_orientation_quirks May 01 23:42:55 fir-md1-s1 kernel: ptp May 01 23:42:55 fir-md1-s1 kernel: pps_core May 01 23:42:55 fir-md1-s1 kernel: mpt3sas(OE) May 01 23:42:55 fir-md1-s1 kernel: raid_class May 01 23:42:55 fir-md1-s1 kernel: scsi_transport_sas May 01 23:42:55 fir-md1-s1 kernel: [last unloaded: libcfs] May 01 23:42:55 fir-md1-s1 kernel: May 01 23:42:55 fir-md1-s1 kernel: CPU: 32 PID: 103262 Comm: mdt_io00_072 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:42:55 fir-md1-s1 kernel: Hardware name: Dell 
Inc. PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:42:55 fir-md1-s1 kernel: task: ffff984cba239040 ti: ffff982bc9ed4000 task.ti: ffff982bc9ed4000 May 01 23:42:55 fir-md1-s1 kernel: RIP: 0010:[] May 01 23:42:55 fir-md1-s1 kernel: [] native_queued_spin_lock_slowpath+0x122/0x200 May 01 23:42:55 fir-md1-s1 kernel: RSP: 0018:ffff982bc9ed78e8 EFLAGS: 00000246 May 01 23:42:55 fir-md1-s1 kernel: RAX: 0000000000000000 RBX: ffff9837d662bb78 RCX: 0000000001010000 May 01 23:42:55 fir-md1-s1 kernel: RDX: ffff983cff75b780 RSI: 0000000000a90101 RDI: ffff982c9fc8c480 May 01 23:42:55 fir-md1-s1 kernel: RBP: ffff982bc9ed78e8 R08: ffff982cff01b780 R09: 0000000000000000 May 01 23:42:55 fir-md1-s1 kernel: R10: 0000000000000000 R11: ffff985c8b319038 R12: ffff982cfef9ac00 May 01 23:42:55 fir-md1-s1 kernel: R13: ffff984cba2390a8 R14: 00ffffffb7a08d80 R15: ffff984cf34000a0 May 01 23:42:55 fir-md1-s1 kernel: FS: 00007f1fd5eaa700(0000) GS:ffff982cff000000(0000) knlGS:0000000000000000 May 01 23:42:55 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:42:55 fir-md1-s1 kernel: CR2: 00007f41ebfcd1b0 CR3: 0000001038b88000 CR4: 00000000003407e0 May 01 23:42:55 fir-md1-s1 kernel: Call Trace: May 01 23:42:55 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf May 01 23:42:55 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30 May 01 23:42:55 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs] May 01 23:42:55 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x210/0x700 [ldiskfs] May 01 23:42:55 fir-md1-s1 kernel: [] ? ktime_get_ts64+0x52/0xf0 May 01 23:42:55 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs] May 01 23:42:55 fir-md1-s1 kernel: [] osd_write_prep+0x2b6/0x360 [osd_ldiskfs] May 01 23:42:55 fir-md1-s1 kernel: [] mdt_obd_preprw+0x637/0x1060 [mdt] May 01 23:42:55 fir-md1-s1 kernel: [] tgt_brw_write+0xc7e/0x1a90 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? load_balance+0x178/0x9a0 May 01 23:42:55 fir-md1-s1 kernel: [] ? 
update_curr+0x14c/0x1e0 May 01 23:42:55 fir-md1-s1 kernel: [] ? account_entity_dequeue+0xae/0xd0 May 01 23:42:55 fir-md1-s1 kernel: [] ? __enqueue_entity+0x78/0x80 May 01 23:42:55 fir-md1-s1 kernel: [] ? tgt_lookup_reply+0x2d/0x190 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:42:55 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:42:55 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:42:55 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:42:55 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:42:55 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:42:55 fir-md1-s1 kernel: [] ? 
insert_kthread_work+0x40/0x40 May 01 23:42:55 fir-md1-s1 kernel: Code: May 01 23:42:55 fir-md1-s1 kernel: 13 May 01 23:42:55 fir-md1-s1 kernel: 48 May 01 23:42:55 fir-md1-s1 kernel: c1 May 01 23:42:55 fir-md1-s1 kernel: ea May 01 23:42:55 fir-md1-s1 kernel: 0d May 01 23:42:55 fir-md1-s1 kernel: 48 May 01 23:42:55 fir-md1-s1 kernel: 98 May 01 23:42:55 fir-md1-s1 kernel: 83 May 01 23:42:55 fir-md1-s1 kernel: e2 May 01 23:42:55 fir-md1-s1 kernel: 30 May 01 23:42:55 fir-md1-s1 kernel: 48 May 01 23:42:55 fir-md1-s1 kernel: 81 May 01 23:42:55 fir-md1-s1 kernel: c2 May 01 23:42:55 fir-md1-s1 kernel: 80 May 01 23:42:55 fir-md1-s1 kernel: b7 May 01 23:42:55 fir-md1-s1 kernel: 01 May 01 23:42:55 fir-md1-s1 kernel: 00 May 01 23:42:55 fir-md1-s1 kernel: 48 May 01 23:42:55 fir-md1-s1 kernel: 03 May 01 23:42:55 fir-md1-s1 kernel: 14 May 01 23:42:55 fir-md1-s1 kernel: c5 May 01 23:42:55 fir-md1-s1 kernel: 60 May 01 23:42:55 fir-md1-s1 kernel: b9 May 01 23:42:55 fir-md1-s1 kernel: b4 May 01 23:42:55 fir-md1-s1 kernel: b7 May 01 23:42:55 fir-md1-s1 kernel: 4c May 01 23:42:55 fir-md1-s1 kernel: 89 May 01 23:42:55 fir-md1-s1 kernel: 02 May 01 23:42:55 fir-md1-s1 kernel: 41 May 01 23:42:55 fir-md1-s1 kernel: 8b May 01 23:42:55 fir-md1-s1 kernel: 40 May 01 23:42:55 fir-md1-s1 kernel: 08 May 01 23:42:55 fir-md1-s1 kernel: 85 May 01 23:42:55 fir-md1-s1 kernel: c0 May 01 23:42:55 fir-md1-s1 kernel: 75 May 01 23:42:55 fir-md1-s1 kernel: 0f May 01 23:42:55 fir-md1-s1 kernel: 0f May 01 23:42:55 fir-md1-s1 kernel: 1f May 01 23:42:55 fir-md1-s1 kernel: 44 May 01 23:42:55 fir-md1-s1 kernel: 00 May 01 23:42:55 fir-md1-s1 kernel: 00 May 01 23:42:55 fir-md1-s1 kernel: f3 May 01 23:42:55 fir-md1-s1 kernel: 90 May 01 23:42:55 fir-md1-s1 kernel: <41> May 01 23:42:55 fir-md1-s1 kernel: 8b May 01 23:42:55 fir-md1-s1 kernel: 40 May 01 23:42:55 fir-md1-s1 kernel: 08 May 01 23:42:55 fir-md1-s1 kernel: 85 May 01 23:42:55 fir-md1-s1 kernel: c0 May 01 23:42:55 fir-md1-s1 kernel: 74 May 01 23:42:55 fir-md1-s1 
kernel: f6 May 01 23:42:55 fir-md1-s1 kernel: 4d May 01 23:42:55 fir-md1-s1 kernel: 8b May 01 23:42:55 fir-md1-s1 kernel: 08 May 01 23:42:55 fir-md1-s1 kernel: 4d May 01 23:42:55 fir-md1-s1 kernel: 85 May 01 23:42:55 fir-md1-s1 kernel: c9 May 01 23:42:55 fir-md1-s1 kernel: 74 May 01 23:42:55 fir-md1-s1 kernel: 04 May 01 23:42:55 fir-md1-s1 kernel: 41 May 01 23:42:55 fir-md1-s1 kernel: 0f May 01 23:42:55 fir-md1-s1 kernel: 18 May 01 23:42:55 fir-md1-s1 kernel: 09 May 01 23:42:55 fir-md1-s1 kernel: 8b May 01 23:42:55 fir-md1-s1 kernel: May 01 23:42:55 fir-md1-s1 kernel: Lustre: 103102:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-5), not sending early reply req@ffff9826a0209450 x1631546314486880/t0(0) o4->1f7bbeda-f291-d2ba-e680-a24cad2ce97f@10.9.104.23@o2ib4:29/0 lens 944/448 e 0 to 0 dl 1556779379 ref 2 fl Interpret:/2/0 rc 0/0 May 01 23:42:55 fir-md1-s1 kernel: Lustre: 103102:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 139 previous similar messages May 01 23:42:55 fir-md1-s1 kernel: [last unloaded: libcfs] May 01 23:42:55 fir-md1-s1 kernel: May 01 23:42:55 fir-md1-s1 kernel: CPU: 2 PID: 101733 Comm: mdt_io02_001 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:42:55 fir-md1-s1 kernel: Hardware name: Dell Inc. 
PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:42:55 fir-md1-s1 kernel: task: ffff982cf0b630c0 ti: ffff984cfa370000 task.ti: ffff984cfa370000 May 01 23:42:55 fir-md1-s1 kernel: RIP: 0010:[] [] native_queued_spin_lock_slowpath+0x122/0x200 May 01 23:42:55 fir-md1-s1 kernel: RSP: 0018:ffff984cfa3738e8 EFLAGS: 00000246 May 01 23:42:55 fir-md1-s1 kernel: RAX: 0000000000000000 RBX: ffff985c7a711b78 RCX: 0000000000110000 May 01 23:42:55 fir-md1-s1 kernel: RDX: ffff985d3f4db780 RSI: 0000000000790101 RDI: ffff982c9fc8c480 May 01 23:42:55 fir-md1-s1 kernel: RBP: ffff984cfa3738e8 R08: ffff984cff61b780 R09: 0000000000000000 May 01 23:42:55 fir-md1-s1 kernel: R10: 0000000000000000 R11: ffff985c8b319038 R12: ffff984cff61ac00 May 01 23:42:55 fir-md1-s1 kernel: R13: ffff982cf0b63128 R14: 00ffffffb7a08d80 R15: ffff984cf34000a0 May 01 23:42:55 fir-md1-s1 kernel: FS: 00007f759e3eb740(0000) GS:ffff984cff600000(0000) knlGS:0000000000000000 May 01 23:42:55 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:42:55 fir-md1-s1 kernel: CR2: 00007f759e3fa000 CR3: 00000012f7610000 CR4: 00000000003407e0 May 01 23:42:55 fir-md1-s1 kernel: Call Trace: May 01 23:42:55 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf May 01 23:42:55 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30 May 01 23:42:55 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs] May 01 23:42:55 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x210/0x700 [ldiskfs] May 01 23:42:55 fir-md1-s1 kernel: [] ? ktime_get_ts64+0x52/0xf0 May 01 23:42:55 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs] May 01 23:42:55 fir-md1-s1 kernel: [] osd_write_prep+0x2b6/0x360 [osd_ldiskfs] May 01 23:42:55 fir-md1-s1 kernel: [] mdt_obd_preprw+0x637/0x1060 [mdt] May 01 23:42:55 fir-md1-s1 kernel: [] tgt_brw_write+0xc7e/0x1a90 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? lustre_msg_buf_v2+0x1b0/0x1b0 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? 
lustre_msg_buf+0x17/0x60 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? update_curr+0x14c/0x1e0 May 01 23:42:55 fir-md1-s1 kernel: [] ? account_entity_dequeue+0xae/0xd0 May 01 23:42:55 fir-md1-s1 kernel: [] ? __enqueue_entity+0x78/0x80 May 01 23:42:55 fir-md1-s1 kernel: [] ? tgt_lookup_reply+0x2d/0x190 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:42:55 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:42:55 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:42:55 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:42:55 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:42:55 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:42:55 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:42:55 fir-md1-s1 kernel: [] ? 
insert_kthread_work+0x40/0x40 May 01 23:42:55 fir-md1-s1 kernel: Code: 13 48 c1 ea 0d 48 98 83 e2 30 48 81 c2 80 b7 01 00 48 03 14 c5 60 b9 b4 b7 4c 89 02 41 8b 40 08 85 c0 75 0f 0f 1f 44 00 00 f3 90 <41> 8b 40 08 85 c0 74 f6 4d 8b 08 4d 85 c9 74 04 41 0f 18 09 8b May 01 23:42:56 fir-md1-s1 kernel: Lustre: 101352:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556779354/real 1556779354] req@ffff98239efdcb00 x1632254604122352/t0(0) o601->fir-MDT0000-lwp-MDT0002@0@lo:23/10 lens 336/336 e 1 to 1 dl 1556779375 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 May 01 23:42:56 fir-md1-s1 kernel: Lustre: 101352:0:(client.c:2132:ptlrpc_expire_one_request()) Skipped 2 previous similar messages May 01 23:42:56 fir-md1-s1 kernel: Lustre: fir-MDT0000-lwp-MDT0002: Connection to fir-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete May 01 23:42:56 fir-md1-s1 kernel: Lustre: fir-MDT0000: Received new LWP connection from 0@lo, removing former export from same NID May 01 23:42:59 fir-md1-s1 kernel: Lustre: fir-MDT0002: Client d2ab40ab-8888-3abb-75f9-9c32b2196967 (at 10.8.26.26@o2ib6) reconnecting May 01 23:42:59 fir-md1-s1 kernel: Lustre: Skipped 67 previous similar messages May 01 23:42:59 fir-md1-s1 kernel: Lustre: fir-MDT0002: Connection restored to (at 10.8.26.26@o2ib6) May 01 23:42:59 fir-md1-s1 kernel: Lustre: Skipped 69 previous similar messages May 01 23:43:04 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 23s! [mdt_io00_057:103101] May 01 23:43:04 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#9 stuck for 23s! 
[mdt_io01_029:102923] May 01 23:43:04 fir-md1-s1 kernel: Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) ldiskfs(OE) lustre(OE) lmv(OE) mdc(OE) osc(OE) lov(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) May 01 23:43:04 fir-md1-s1 kernel: Modules linked in: May 01 23:43:04 fir-md1-s1 kernel: osp(OE) May 01 23:43:04 fir-md1-s1 kernel: mdd(OE) May 01 23:43:04 fir-md1-s1 kernel: lod(OE) May 01 23:43:04 fir-md1-s1 kernel: mdt(OE) May 01 23:43:04 fir-md1-s1 kernel: lfsck(OE) May 01 23:43:04 fir-md1-s1 kernel: mgs(OE) May 01 23:43:04 fir-md1-s1 kernel: mgc(OE) May 01 23:43:04 fir-md1-s1 kernel: osd_ldiskfs(OE) May 01 23:43:04 fir-md1-s1 kernel: lquota(OE) May 01 23:43:04 fir-md1-s1 kernel: ldiskfs(OE) May 01 23:43:04 fir-md1-s1 kernel: lustre(OE) May 01 23:43:04 fir-md1-s1 kernel: lmv(OE) May 01 23:43:04 fir-md1-s1 kernel: mdc(OE) May 01 23:43:04 fir-md1-s1 kernel: osc(OE) May 01 23:43:04 fir-md1-s1 kernel: lov(OE) May 01 23:43:04 fir-md1-s1 kernel: fid(OE) May 01 23:43:04 fir-md1-s1 kernel: fld(OE) May 01 23:43:04 fir-md1-s1 kernel: ko2iblnd(OE) May 01 23:43:04 fir-md1-s1 kernel: ptlrpc(OE) May 01 23:43:04 fir-md1-s1 kernel: obdclass(OE) May 01 23:43:04 fir-md1-s1 kernel: lnet(OE) May 01 23:43:04 fir-md1-s1 kernel: libcfs(OE) May 01 23:43:04 fir-md1-s1 kernel: rpcsec_gss_krb5 May 01 23:43:04 fir-md1-s1 kernel: auth_rpcgss May 01 23:43:04 fir-md1-s1 kernel: nfsv4 May 01 23:43:04 fir-md1-s1 kernel: dns_resolver May 01 23:43:04 fir-md1-s1 kernel: nfs May 01 23:43:04 fir-md1-s1 kernel: lockd May 01 23:43:04 fir-md1-s1 kernel: grace May 01 23:43:04 fir-md1-s1 kernel: fscache May 01 23:43:04 fir-md1-s1 kernel: rdma_ucm(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_ucm(OE) May 01 23:43:04 fir-md1-s1 kernel: rdma_cm(OE) May 01 23:43:04 fir-md1-s1 
kernel: iw_cm(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_ipoib(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_cm(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_umad(OE) May 01 23:43:04 fir-md1-s1 kernel: mlx5_fpga_tools(OE) May 01 23:43:04 fir-md1-s1 kernel: mlx4_en(OE) May 01 23:43:04 fir-md1-s1 kernel: mlx4_ib(OE) May 01 23:43:04 fir-md1-s1 kernel: mlx4_core(OE) May 01 23:43:04 fir-md1-s1 kernel: dell_rbu May 01 23:43:04 fir-md1-s1 kernel: sunrpc May 01 23:43:04 fir-md1-s1 kernel: vfat May 01 23:43:04 fir-md1-s1 kernel: fat May 01 23:43:04 fir-md1-s1 kernel: dm_round_robin May 01 23:43:04 fir-md1-s1 kernel: amd64_edac_mod May 01 23:43:04 fir-md1-s1 kernel: edac_mce_amd May 01 23:43:04 fir-md1-s1 kernel: kvm_amd May 01 23:43:04 fir-md1-s1 kernel: kvm May 01 23:43:04 fir-md1-s1 kernel: ses May 01 23:43:04 fir-md1-s1 kernel: irqbypass May 01 23:43:04 fir-md1-s1 kernel: crc32_pclmul May 01 23:43:04 fir-md1-s1 kernel: enclosure May 01 23:43:04 fir-md1-s1 kernel: ghash_clmulni_intel May 01 23:43:04 fir-md1-s1 kernel: dcdbas May 01 23:43:04 fir-md1-s1 kernel: aesni_intel May 01 23:43:04 fir-md1-s1 kernel: lrw May 01 23:43:04 fir-md1-s1 kernel: gf128mul May 01 23:43:04 fir-md1-s1 kernel: glue_helper May 01 23:43:04 fir-md1-s1 kernel: ablk_helper May 01 23:43:04 fir-md1-s1 kernel: cryptd May 01 23:43:04 fir-md1-s1 kernel: ipmi_si May 01 23:43:04 fir-md1-s1 kernel: pcspkr May 01 23:43:04 fir-md1-s1 kernel: ipmi_devintf May 01 23:43:04 fir-md1-s1 kernel: ccp May 01 23:43:04 fir-md1-s1 kernel: i2c_piix4 May 01 23:43:04 fir-md1-s1 kernel: dm_multipath May 01 23:43:04 fir-md1-s1 kernel: sg May 01 23:43:04 fir-md1-s1 kernel: k10temp May 01 23:43:04 fir-md1-s1 kernel: ipmi_msghandler May 01 23:43:04 fir-md1-s1 kernel: dm_mod May 01 23:43:04 fir-md1-s1 kernel: acpi_power_meter May 01 23:43:04 fir-md1-s1 kernel: knem(OE) May 01 23:43:04 fir-md1-s1 kernel: ip_tables May 01 23:43:04 fir-md1-s1 kernel: ext4 May 01 23:43:04 fir-md1-s1 kernel: mbcache May 01 23:43:04 fir-md1-s1 kernel: jbd2 May 
01 23:43:04 fir-md1-s1 kernel: sd_mod May 01 23:43:04 fir-md1-s1 kernel: crc_t10dif May 01 23:43:04 fir-md1-s1 kernel: crct10dif_generic May 01 23:43:04 fir-md1-s1 kernel: mlx5_ib(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_uverbs(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_core(OE) May 01 23:43:04 fir-md1-s1 kernel: i2c_algo_bit May 01 23:43:04 fir-md1-s1 kernel: drm_kms_helper May 01 23:43:04 fir-md1-s1 kernel: mlx5_core(OE) May 01 23:43:04 fir-md1-s1 kernel: syscopyarea May 01 23:43:04 fir-md1-s1 kernel: sysfillrect May 01 23:43:04 fir-md1-s1 kernel: sysimgblt May 01 23:43:04 fir-md1-s1 kernel: fb_sys_fops May 01 23:43:04 fir-md1-s1 kernel: mlxfw(OE) May 01 23:43:04 fir-md1-s1 kernel: crct10dif_pclmul May 01 23:43:04 fir-md1-s1 kernel: ttm May 01 23:43:04 fir-md1-s1 kernel: devlink May 01 23:43:04 fir-md1-s1 kernel: ahci May 01 23:43:04 fir-md1-s1 kernel: crct10dif_common May 01 23:43:04 fir-md1-s1 kernel: libahci May 01 23:43:04 fir-md1-s1 kernel: drm May 01 23:43:04 fir-md1-s1 kernel: mlx_compat(OE) May 01 23:43:04 fir-md1-s1 kernel: tg3 May 01 23:43:04 fir-md1-s1 kernel: crc32c_intel May 01 23:43:04 fir-md1-s1 kernel: libata May 01 23:43:04 fir-md1-s1 kernel: megaraid_sas May 01 23:43:04 fir-md1-s1 kernel: drm_panel_orientation_quirks May 01 23:43:04 fir-md1-s1 kernel: ptp May 01 23:43:04 fir-md1-s1 kernel: pps_core May 01 23:43:04 fir-md1-s1 kernel: mpt3sas(OE) May 01 23:43:04 fir-md1-s1 kernel: raid_class May 01 23:43:04 fir-md1-s1 kernel: scsi_transport_sas May 01 23:43:04 fir-md1-s1 kernel: [last unloaded: libcfs] May 01 23:43:04 fir-md1-s1 kernel: May 01 23:43:04 fir-md1-s1 kernel: CPU: 9 PID: 102923 Comm: mdt_io01_029 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:43:04 fir-md1-s1 kernel: Hardware name: Dell Inc. 
PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:43:04 fir-md1-s1 kernel: task: ffff985c7cfbd140 ti: ffff985cbe4d8000 task.ti: ffff985cbe4d8000 May 01 23:43:04 fir-md1-s1 kernel: RIP: 0010:[] May 01 23:43:04 fir-md1-s1 kernel: [] native_queued_spin_lock_slowpath+0x15e/0x200 May 01 23:43:04 fir-md1-s1 kernel: RSP: 0018:ffff985cbe4db800 EFLAGS: 00000212 May 01 23:43:04 fir-md1-s1 kernel: RAX: 0000000000000101 RBX: ffff983165105ac0 RCX: 0000000000490000 May 01 23:43:04 fir-md1-s1 kernel: RDX: 0000000000190101 RSI: 0000000000000101 RDI: ffff982c9fc8c480 May 01 23:43:04 fir-md1-s1 kernel: RBP: ffff985cbe4db800 R08: ffff983cff69b780 R09: 0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: R10: ffff983cff69f140 R11: ffffde3f18cce000 R12: 0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: R13: ffff985cbe4db7a0 R14: ffff983165105830 R15: 0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: FS: 00007fddcbed4880(0000) GS:ffff983cff680000(0000) knlGS:0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:43:04 fir-md1-s1 kernel: CR2: 00007f50e83bb000 CR3: 00000012f7610000 CR4: 00000000003407e0 May 01 23:43:04 fir-md1-s1 kernel: Call Trace: May 01 23:43:04 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf May 01 23:43:04 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30 May 01 23:43:04 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] ldiskfs_ext_map_blocks+0x7b5/0xf60 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] ? ktime_get+0x52/0xe0 May 01 23:43:04 fir-md1-s1 kernel: [] ? kiblnd_check_sends_locked+0xa72/0xe40 [ko2iblnd] May 01 23:43:04 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x98/0x700 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] ? 
ktime_get_ts64+0x52/0xf0 May 01 23:43:04 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] osd_write_prep+0x2b6/0x360 [osd_ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] mdt_obd_preprw+0x637/0x1060 [mdt] May 01 23:43:04 fir-md1-s1 kernel: [] tgt_brw_write+0xc7e/0x1a90 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? lustre_msg_buf_v2+0x1b0/0x1b0 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? lustre_msg_buf+0x17/0x60 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? update_curr+0x14c/0x1e0 May 01 23:43:04 fir-md1-s1 kernel: [] ? account_entity_dequeue+0xae/0xd0 May 01 23:43:04 fir-md1-s1 kernel: [] ? __enqueue_entity+0x78/0x80 May 01 23:43:04 fir-md1-s1 kernel: [] ? tgt_lookup_reply+0x2d/0x190 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:43:04 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:43:04 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:43:04 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:43:04 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:04 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:43:04 fir-md1-s1 kernel: [] ? 
insert_kthread_work+0x40/0x40 May 01 23:43:04 fir-md1-s1 kernel: Code: May 01 23:43:04 fir-md1-s1 kernel: 0f May 01 23:43:04 fir-md1-s1 kernel: 18 May 01 23:43:04 fir-md1-s1 kernel: 09 May 01 23:43:04 fir-md1-s1 kernel: 8b May 01 23:43:04 fir-md1-s1 kernel: 17 May 01 23:43:04 fir-md1-s1 kernel: 0f May 01 23:43:04 fir-md1-s1 kernel: b7 May 01 23:43:04 fir-md1-s1 kernel: c2 May 01 23:43:04 fir-md1-s1 kernel: 85 May 01 23:43:04 fir-md1-s1 kernel: c0 May 01 23:43:04 fir-md1-s1 kernel: 74 May 01 23:43:04 fir-md1-s1 kernel: 21 May 01 23:43:04 fir-md1-s1 kernel: 83 May 01 23:43:04 fir-md1-s1 kernel: f8 May 01 23:43:04 fir-md1-s1 kernel: 03 May 01 23:43:04 fir-md1-s1 kernel: 75 May 01 23:43:04 fir-md1-s1 kernel: 10 May 01 23:43:04 fir-md1-s1 kernel: eb May 01 23:43:04 fir-md1-s1 kernel: 1a May 01 23:43:04 fir-md1-s1 kernel: 66 May 01 23:43:04 fir-md1-s1 kernel: 2e May 01 23:43:04 fir-md1-s1 kernel: 0f May 01 23:43:04 fir-md1-s1 kernel: 1f May 01 23:43:04 fir-md1-s1 kernel: 84 May 01 23:43:04 fir-md1-s1 kernel: 00 May 01 23:43:04 fir-md1-s1 kernel: 00 May 01 23:43:04 fir-md1-s1 kernel: 00 May 01 23:43:04 fir-md1-s1 kernel: 00 May 01 23:43:04 fir-md1-s1 kernel: 00 May 01 23:43:04 fir-md1-s1 kernel: 85 May 01 23:43:04 fir-md1-s1 kernel: c0 May 01 23:43:04 fir-md1-s1 kernel: 74 May 01 23:43:04 fir-md1-s1 kernel: 0c May 01 23:43:04 fir-md1-s1 kernel: f3 May 01 23:43:04 fir-md1-s1 kernel: 90 May 01 23:43:04 fir-md1-s1 kernel: 8b May 01 23:43:04 fir-md1-s1 kernel: 17 May 01 23:43:04 fir-md1-s1 kernel: 0f May 01 23:43:04 fir-md1-s1 kernel: b7 May 01 23:43:04 fir-md1-s1 kernel: c2 May 01 23:43:04 fir-md1-s1 kernel: 83 May 01 23:43:04 fir-md1-s1 kernel: f8 May 01 23:43:04 fir-md1-s1 kernel: 03 May 01 23:43:04 fir-md1-s1 kernel: <75> May 01 23:43:04 fir-md1-s1 kernel: f0 May 01 23:43:04 fir-md1-s1 kernel: be May 01 23:43:04 fir-md1-s1 kernel: 01 May 01 23:43:04 fir-md1-s1 kernel: 00 May 01 23:43:04 fir-md1-s1 kernel: 00 May 01 23:43:04 fir-md1-s1 kernel: 00 May 01 23:43:04 fir-md1-s1 
kernel: eb May 01 23:43:04 fir-md1-s1 kernel: 15 May 01 23:43:04 fir-md1-s1 kernel: 66 May 01 23:43:04 fir-md1-s1 kernel: 0f May 01 23:43:04 fir-md1-s1 kernel: 1f May 01 23:43:04 fir-md1-s1 kernel: 84 May 01 23:43:04 fir-md1-s1 kernel: 00 May 01 23:43:04 fir-md1-s1 kernel: 00 May 01 23:43:04 fir-md1-s1 kernel: 00 May 01 23:43:04 fir-md1-s1 kernel: 00 May 01 23:43:04 fir-md1-s1 kernel: 00 May 01 23:43:04 fir-md1-s1 kernel: 89 May 01 23:43:04 fir-md1-s1 kernel: d0 May 01 23:43:04 fir-md1-s1 kernel: f0 May 01 23:43:04 fir-md1-s1 kernel: May 01 23:43:04 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [mdt_io00_073:103263] May 01 23:43:04 fir-md1-s1 kernel: Modules linked in: May 01 23:43:04 fir-md1-s1 kernel: osp(OE) May 01 23:43:04 fir-md1-s1 kernel: mdd(OE) May 01 23:43:04 fir-md1-s1 kernel: lod(OE) May 01 23:43:04 fir-md1-s1 kernel: mdt(OE) May 01 23:43:04 fir-md1-s1 kernel: lfsck(OE) May 01 23:43:04 fir-md1-s1 kernel: mgs(OE) May 01 23:43:04 fir-md1-s1 kernel: mgc(OE) May 01 23:43:04 fir-md1-s1 kernel: osd_ldiskfs(OE) May 01 23:43:04 fir-md1-s1 kernel: lquota(OE) May 01 23:43:04 fir-md1-s1 kernel: ldiskfs(OE) May 01 23:43:04 fir-md1-s1 kernel: lustre(OE) May 01 23:43:04 fir-md1-s1 kernel: lmv(OE) May 01 23:43:04 fir-md1-s1 kernel: mdc(OE) May 01 23:43:04 fir-md1-s1 kernel: osc(OE) May 01 23:43:04 fir-md1-s1 kernel: lov(OE) May 01 23:43:04 fir-md1-s1 kernel: fid(OE) May 01 23:43:04 fir-md1-s1 kernel: fld(OE) May 01 23:43:04 fir-md1-s1 kernel: ko2iblnd(OE) May 01 23:43:04 fir-md1-s1 kernel: ptlrpc(OE) May 01 23:43:04 fir-md1-s1 kernel: obdclass(OE) May 01 23:43:04 fir-md1-s1 kernel: lnet(OE) May 01 23:43:04 fir-md1-s1 kernel: libcfs(OE) May 01 23:43:04 fir-md1-s1 kernel: rpcsec_gss_krb5 May 01 23:43:04 fir-md1-s1 kernel: auth_rpcgss May 01 23:43:04 fir-md1-s1 kernel: nfsv4 May 01 23:43:04 fir-md1-s1 kernel: dns_resolver May 01 23:43:04 fir-md1-s1 kernel: nfs May 01 23:43:04 fir-md1-s1 kernel: lockd May 01 23:43:04 fir-md1-s1 kernel: grace 
May 01 23:43:04 fir-md1-s1 kernel: fscache May 01 23:43:04 fir-md1-s1 kernel: rdma_ucm(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_ucm(OE) May 01 23:43:04 fir-md1-s1 kernel: rdma_cm(OE) May 01 23:43:04 fir-md1-s1 kernel: iw_cm(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_ipoib(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_cm(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_umad(OE) May 01 23:43:04 fir-md1-s1 kernel: mlx5_fpga_tools(OE) May 01 23:43:04 fir-md1-s1 kernel: mlx4_en(OE) May 01 23:43:04 fir-md1-s1 kernel: mlx4_ib(OE) May 01 23:43:04 fir-md1-s1 kernel: mlx4_core(OE) May 01 23:43:04 fir-md1-s1 kernel: dell_rbu May 01 23:43:04 fir-md1-s1 kernel: sunrpc May 01 23:43:04 fir-md1-s1 kernel: vfat May 01 23:43:04 fir-md1-s1 kernel: fat May 01 23:43:04 fir-md1-s1 kernel: dm_round_robin May 01 23:43:04 fir-md1-s1 kernel: amd64_edac_mod May 01 23:43:04 fir-md1-s1 kernel: edac_mce_amd May 01 23:43:04 fir-md1-s1 kernel: kvm_amd May 01 23:43:04 fir-md1-s1 kernel: kvm May 01 23:43:04 fir-md1-s1 kernel: ses May 01 23:43:04 fir-md1-s1 kernel: irqbypass May 01 23:43:04 fir-md1-s1 kernel: crc32_pclmul May 01 23:43:04 fir-md1-s1 kernel: enclosure May 01 23:43:04 fir-md1-s1 kernel: ghash_clmulni_intel May 01 23:43:04 fir-md1-s1 kernel: dcdbas May 01 23:43:04 fir-md1-s1 kernel: aesni_intel May 01 23:43:04 fir-md1-s1 kernel: lrw May 01 23:43:04 fir-md1-s1 kernel: gf128mul May 01 23:43:04 fir-md1-s1 kernel: glue_helper May 01 23:43:04 fir-md1-s1 kernel: ablk_helper May 01 23:43:04 fir-md1-s1 kernel: cryptd May 01 23:43:04 fir-md1-s1 kernel: ipmi_si May 01 23:43:04 fir-md1-s1 kernel: pcspkr May 01 23:43:04 fir-md1-s1 kernel: ipmi_devintf May 01 23:43:04 fir-md1-s1 kernel: ccp May 01 23:43:04 fir-md1-s1 kernel: i2c_piix4 May 01 23:43:04 fir-md1-s1 kernel: dm_multipath May 01 23:43:04 fir-md1-s1 kernel: sg May 01 23:43:04 fir-md1-s1 kernel: k10temp May 01 23:43:04 fir-md1-s1 kernel: ipmi_msghandler May 01 23:43:04 fir-md1-s1 kernel: dm_mod May 01 23:43:04 fir-md1-s1 kernel: acpi_power_meter May 01 
23:43:04 fir-md1-s1 kernel: knem(OE) May 01 23:43:04 fir-md1-s1 kernel: ip_tables May 01 23:43:04 fir-md1-s1 kernel: ext4 May 01 23:43:04 fir-md1-s1 kernel: mbcache May 01 23:43:04 fir-md1-s1 kernel: jbd2 May 01 23:43:04 fir-md1-s1 kernel: sd_mod May 01 23:43:04 fir-md1-s1 kernel: crc_t10dif May 01 23:43:04 fir-md1-s1 kernel: crct10dif_generic May 01 23:43:04 fir-md1-s1 kernel: mlx5_ib(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_uverbs(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_core(OE) May 01 23:43:04 fir-md1-s1 kernel: i2c_algo_bit May 01 23:43:04 fir-md1-s1 kernel: drm_kms_helper May 01 23:43:04 fir-md1-s1 kernel: mlx5_core(OE) May 01 23:43:04 fir-md1-s1 kernel: syscopyarea May 01 23:43:04 fir-md1-s1 kernel: sysfillrect May 01 23:43:04 fir-md1-s1 kernel: sysimgblt May 01 23:43:04 fir-md1-s1 kernel: fb_sys_fops May 01 23:43:04 fir-md1-s1 kernel: mlxfw(OE) May 01 23:43:04 fir-md1-s1 kernel: crct10dif_pclmul May 01 23:43:04 fir-md1-s1 kernel: ttm May 01 23:43:04 fir-md1-s1 kernel: devlink May 01 23:43:04 fir-md1-s1 kernel: ahci May 01 23:43:04 fir-md1-s1 kernel: crct10dif_common May 01 23:43:04 fir-md1-s1 kernel: libahci May 01 23:43:04 fir-md1-s1 kernel: drm May 01 23:43:04 fir-md1-s1 kernel: mlx_compat(OE) May 01 23:43:04 fir-md1-s1 kernel: tg3 May 01 23:43:04 fir-md1-s1 kernel: crc32c_intel May 01 23:43:04 fir-md1-s1 kernel: libata May 01 23:43:04 fir-md1-s1 kernel: megaraid_sas May 01 23:43:04 fir-md1-s1 kernel: drm_panel_orientation_quirks May 01 23:43:04 fir-md1-s1 kernel: ptp May 01 23:43:04 fir-md1-s1 kernel: pps_core May 01 23:43:04 fir-md1-s1 kernel: mpt3sas(OE) May 01 23:43:04 fir-md1-s1 kernel: raid_class May 01 23:43:04 fir-md1-s1 kernel: scsi_transport_sas May 01 23:43:04 fir-md1-s1 kernel: [last unloaded: libcfs] May 01 23:43:04 fir-md1-s1 kernel: May 01 23:43:04 fir-md1-s1 kernel: CPU: 12 PID: 103263 Comm: mdt_io00_073 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:43:04 fir-md1-s1 kernel: Hardware name: Dell 
Inc. PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:43:04 fir-md1-s1 kernel: task: ffff984cba23e180 ti: ffff98286afac000 task.ti: ffff98286afac000 May 01 23:43:04 fir-md1-s1 kernel: RIP: 0010:[] May 01 23:43:04 fir-md1-s1 kernel: [] native_queued_spin_lock_slowpath+0x126/0x200 May 01 23:43:04 fir-md1-s1 kernel: RSP: 0018:ffff98286afaf750 EFLAGS: 00000246 May 01 23:43:04 fir-md1-s1 kernel: RAX: 0000000000000000 RBX: ffff9831739477d8 RCX: 0000000000610000 May 01 23:43:04 fir-md1-s1 kernel: RDX: ffff984cff81b780 RSI: 0000000001110101 RDI: ffff982c9fc8c480 May 01 23:43:04 fir-md1-s1 kernel: RBP: ffff98286afaf750 R08: ffff982cfeedb780 R09: 0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: R10: ffff982cfeedf140 R11: ffffde3edb488200 R12: 0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: R13: ffff98286afaf6f0 R14: ffff983173947548 R15: 0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: FS: 00007fde62083880(0000) GS:ffff982cfeec0000(0000) knlGS:0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:43:04 fir-md1-s1 kernel: CR2: 00007f427f58b000 CR3: 000000203caa6000 CR4: 00000000003407e0 May 01 23:43:04 fir-md1-s1 kernel: Call Trace: May 01 23:43:04 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf May 01 23:43:04 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30 May 01 23:43:04 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] ldiskfs_ext_map_blocks+0x7b5/0xf60 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] ? zone_statistics+0x88/0xa0 May 01 23:43:04 fir-md1-s1 kernel: [] ? qsd_op_begin+0xb1/0x4b0 [lquota] May 01 23:43:04 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x98/0x700 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] ? 
ldiskfs_inode_attach_jinode+0x55/0xd0 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] osd_write_commit+0x3a2/0x8c0 [osd_ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] ? __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] mdt_commitrw_write.isra.46+0x608/0xd20 [mdt] May 01 23:43:04 fir-md1-s1 kernel: [] mdt_obd_commitrw+0x29b/0x520 [mdt] May 01 23:43:04 fir-md1-s1 kernel: [] obd_commitrw+0x9c/0x370 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] tgt_brw_write+0x100d/0x1a90 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? lustre_msg_buf_v2+0x1b0/0x1b0 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? update_curr+0x14c/0x1e0 May 01 23:43:04 fir-md1-s1 kernel: [] ? tgt_lookup_reply+0x2d/0x190 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:43:04 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:43:04 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:43:04 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:43:04 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:04 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:43:04 fir-md1-s1 kernel: [] ? 
insert_kthread_work+0x40/0x40 May 01 23:43:04 fir-md1-s1 kernel: Code: May 01 23:43:04 fir-md1-s1 kernel: 0d May 01 23:43:04 fir-md1-s1 kernel: 48 May 01 23:43:04 fir-md1-s1 kernel: 98 May 01 23:43:04 fir-md1-s1 kernel: 83 May 01 23:43:04 fir-md1-s1 kernel: e2 May 01 23:43:04 fir-md1-s1 kernel: 30 May 01 23:43:04 fir-md1-s1 kernel: 48 May 01 23:43:04 fir-md1-s1 kernel: 81 May 01 23:43:04 fir-md1-s1 kernel: c2 May 01 23:43:04 fir-md1-s1 kernel: 80 May 01 23:43:04 fir-md1-s1 kernel: b7 May 01 23:43:04 fir-md1-s1 kernel: 01 May 01 23:43:04 fir-md1-s1 kernel: 00 May 01 23:43:04 fir-md1-s1 kernel: 48 May 01 23:43:04 fir-md1-s1 kernel: 03 May 01 23:43:04 fir-md1-s1 kernel: 14 May 01 23:43:04 fir-md1-s1 kernel: c5 May 01 23:43:04 fir-md1-s1 kernel: 60 May 01 23:43:04 fir-md1-s1 kernel: b9 May 01 23:43:04 fir-md1-s1 kernel: b4 May 01 23:43:04 fir-md1-s1 kernel: b7 May 01 23:43:04 fir-md1-s1 kernel: 4c May 01 23:43:04 fir-md1-s1 kernel: 89 May 01 23:43:04 fir-md1-s1 kernel: 02 May 01 23:43:04 fir-md1-s1 kernel: 41 May 01 23:43:04 fir-md1-s1 kernel: 8b May 01 23:43:04 fir-md1-s1 kernel: 40 May 01 23:43:04 fir-md1-s1 kernel: 08 May 01 23:43:04 fir-md1-s1 kernel: 85 May 01 23:43:04 fir-md1-s1 kernel: c0 May 01 23:43:04 fir-md1-s1 kernel: 75 May 01 23:43:04 fir-md1-s1 kernel: 0f May 01 23:43:04 fir-md1-s1 kernel: 0f May 01 23:43:04 fir-md1-s1 kernel: 1f May 01 23:43:04 fir-md1-s1 kernel: 44 May 01 23:43:04 fir-md1-s1 kernel: 00 May 01 23:43:04 fir-md1-s1 kernel: 00 May 01 23:43:04 fir-md1-s1 kernel: f3 May 01 23:43:04 fir-md1-s1 kernel: 90 May 01 23:43:04 fir-md1-s1 kernel: 41 May 01 23:43:04 fir-md1-s1 kernel: 8b May 01 23:43:04 fir-md1-s1 kernel: 40 May 01 23:43:04 fir-md1-s1 kernel: 08 May 01 23:43:04 fir-md1-s1 kernel: <85> May 01 23:43:04 fir-md1-s1 kernel: c0 May 01 23:43:04 fir-md1-s1 kernel: 74 May 01 23:43:04 fir-md1-s1 kernel: f6 May 01 23:43:04 fir-md1-s1 kernel: 4d May 01 23:43:04 fir-md1-s1 kernel: 8b May 01 23:43:04 fir-md1-s1 kernel: 08 May 01 23:43:04 fir-md1-s1 
kernel: 4d May 01 23:43:04 fir-md1-s1 kernel: 85 May 01 23:43:04 fir-md1-s1 kernel: c9 May 01 23:43:04 fir-md1-s1 kernel: 74 May 01 23:43:04 fir-md1-s1 kernel: 04 May 01 23:43:04 fir-md1-s1 kernel: 41 May 01 23:43:04 fir-md1-s1 kernel: 0f May 01 23:43:04 fir-md1-s1 kernel: 18 May 01 23:43:04 fir-md1-s1 kernel: 09 May 01 23:43:04 fir-md1-s1 kernel: 8b May 01 23:43:04 fir-md1-s1 kernel: 17 May 01 23:43:04 fir-md1-s1 kernel: 0f May 01 23:43:04 fir-md1-s1 kernel: b7 May 01 23:43:04 fir-md1-s1 kernel: c2 May 01 23:43:04 fir-md1-s1 kernel: May 01 23:43:04 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#16 stuck for 22s! [mdt00_018:102388] May 01 23:43:04 fir-md1-s1 kernel: Modules linked in: May 01 23:43:04 fir-md1-s1 kernel: osp(OE) May 01 23:43:04 fir-md1-s1 kernel: mdd(OE) May 01 23:43:04 fir-md1-s1 kernel: lod(OE) May 01 23:43:04 fir-md1-s1 kernel: mdt(OE) May 01 23:43:04 fir-md1-s1 kernel: lfsck(OE) May 01 23:43:04 fir-md1-s1 kernel: mgs(OE) May 01 23:43:04 fir-md1-s1 kernel: mgc(OE) May 01 23:43:04 fir-md1-s1 kernel: osd_ldiskfs(OE) May 01 23:43:04 fir-md1-s1 kernel: lquota(OE) May 01 23:43:04 fir-md1-s1 kernel: ldiskfs(OE) May 01 23:43:04 fir-md1-s1 kernel: lustre(OE) May 01 23:43:04 fir-md1-s1 kernel: lmv(OE) May 01 23:43:04 fir-md1-s1 kernel: mdc(OE) May 01 23:43:04 fir-md1-s1 kernel: osc(OE) May 01 23:43:04 fir-md1-s1 kernel: lov(OE) May 01 23:43:04 fir-md1-s1 kernel: fid(OE) May 01 23:43:04 fir-md1-s1 kernel: fld(OE) May 01 23:43:04 fir-md1-s1 kernel: ko2iblnd(OE) May 01 23:43:04 fir-md1-s1 kernel: ptlrpc(OE) May 01 23:43:04 fir-md1-s1 kernel: obdclass(OE) May 01 23:43:04 fir-md1-s1 kernel: lnet(OE) May 01 23:43:04 fir-md1-s1 kernel: libcfs(OE) May 01 23:43:04 fir-md1-s1 kernel: rpcsec_gss_krb5 May 01 23:43:04 fir-md1-s1 kernel: auth_rpcgss May 01 23:43:04 fir-md1-s1 kernel: nfsv4 May 01 23:43:04 fir-md1-s1 kernel: dns_resolver May 01 23:43:04 fir-md1-s1 kernel: nfs May 01 23:43:04 fir-md1-s1 kernel: lockd May 01 23:43:04 fir-md1-s1 kernel: grace May 
01 23:43:04 fir-md1-s1 kernel: fscache May 01 23:43:04 fir-md1-s1 kernel: rdma_ucm(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_ucm(OE) May 01 23:43:04 fir-md1-s1 kernel: rdma_cm(OE) May 01 23:43:04 fir-md1-s1 kernel: iw_cm(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_ipoib(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_cm(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_umad(OE) May 01 23:43:04 fir-md1-s1 kernel: mlx5_fpga_tools(OE) May 01 23:43:04 fir-md1-s1 kernel: mlx4_en(OE) May 01 23:43:04 fir-md1-s1 kernel: mlx4_ib(OE) May 01 23:43:04 fir-md1-s1 kernel: mlx4_core(OE) May 01 23:43:04 fir-md1-s1 kernel: dell_rbu May 01 23:43:04 fir-md1-s1 kernel: sunrpc May 01 23:43:04 fir-md1-s1 kernel: vfat May 01 23:43:04 fir-md1-s1 kernel: fat May 01 23:43:04 fir-md1-s1 kernel: dm_round_robin May 01 23:43:04 fir-md1-s1 kernel: amd64_edac_mod May 01 23:43:04 fir-md1-s1 kernel: edac_mce_amd May 01 23:43:04 fir-md1-s1 kernel: kvm_amd May 01 23:43:04 fir-md1-s1 kernel: kvm May 01 23:43:04 fir-md1-s1 kernel: ses May 01 23:43:04 fir-md1-s1 kernel: irqbypass May 01 23:43:04 fir-md1-s1 kernel: crc32_pclmul May 01 23:43:04 fir-md1-s1 kernel: enclosure May 01 23:43:04 fir-md1-s1 kernel: ghash_clmulni_intel May 01 23:43:04 fir-md1-s1 kernel: dcdbas May 01 23:43:04 fir-md1-s1 kernel: aesni_intel May 01 23:43:04 fir-md1-s1 kernel: lrw May 01 23:43:04 fir-md1-s1 kernel: gf128mul May 01 23:43:04 fir-md1-s1 kernel: glue_helper May 01 23:43:04 fir-md1-s1 kernel: ablk_helper May 01 23:43:04 fir-md1-s1 kernel: cryptd May 01 23:43:04 fir-md1-s1 kernel: ipmi_si May 01 23:43:04 fir-md1-s1 kernel: pcspkr May 01 23:43:04 fir-md1-s1 kernel: ipmi_devintf May 01 23:43:04 fir-md1-s1 kernel: ccp May 01 23:43:04 fir-md1-s1 kernel: i2c_piix4 May 01 23:43:04 fir-md1-s1 kernel: dm_multipath May 01 23:43:04 fir-md1-s1 kernel: sg May 01 23:43:04 fir-md1-s1 kernel: k10temp May 01 23:43:04 fir-md1-s1 kernel: ipmi_msghandler May 01 23:43:04 fir-md1-s1 kernel: dm_mod May 01 23:43:04 fir-md1-s1 kernel: acpi_power_meter May 01 
23:43:04 fir-md1-s1 kernel: knem(OE) May 01 23:43:04 fir-md1-s1 kernel: ip_tables May 01 23:43:04 fir-md1-s1 kernel: ext4 May 01 23:43:04 fir-md1-s1 kernel: mbcache May 01 23:43:04 fir-md1-s1 kernel: jbd2 May 01 23:43:04 fir-md1-s1 kernel: sd_mod May 01 23:43:04 fir-md1-s1 kernel: crc_t10dif May 01 23:43:04 fir-md1-s1 kernel: crct10dif_generic May 01 23:43:04 fir-md1-s1 kernel: mlx5_ib(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_uverbs(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_core(OE) May 01 23:43:04 fir-md1-s1 kernel: i2c_algo_bit May 01 23:43:04 fir-md1-s1 kernel: drm_kms_helper May 01 23:43:04 fir-md1-s1 kernel: mlx5_core(OE) May 01 23:43:04 fir-md1-s1 kernel: syscopyarea May 01 23:43:04 fir-md1-s1 kernel: sysfillrect May 01 23:43:04 fir-md1-s1 kernel: sysimgblt May 01 23:43:04 fir-md1-s1 kernel: fb_sys_fops May 01 23:43:04 fir-md1-s1 kernel: mlxfw(OE) May 01 23:43:04 fir-md1-s1 kernel: crct10dif_pclmul May 01 23:43:04 fir-md1-s1 kernel: ttm May 01 23:43:04 fir-md1-s1 kernel: devlink May 01 23:43:04 fir-md1-s1 kernel: ahci May 01 23:43:04 fir-md1-s1 kernel: crct10dif_common May 01 23:43:04 fir-md1-s1 kernel: libahci May 01 23:43:04 fir-md1-s1 kernel: drm May 01 23:43:04 fir-md1-s1 kernel: mlx_compat(OE) May 01 23:43:04 fir-md1-s1 kernel: tg3 May 01 23:43:04 fir-md1-s1 kernel: crc32c_intel May 01 23:43:04 fir-md1-s1 kernel: libata May 01 23:43:04 fir-md1-s1 kernel: megaraid_sas May 01 23:43:04 fir-md1-s1 kernel: drm_panel_orientation_quirks May 01 23:43:04 fir-md1-s1 kernel: ptp May 01 23:43:04 fir-md1-s1 kernel: pps_core May 01 23:43:04 fir-md1-s1 kernel: mpt3sas(OE) May 01 23:43:04 fir-md1-s1 kernel: raid_class May 01 23:43:04 fir-md1-s1 kernel: scsi_transport_sas May 01 23:43:04 fir-md1-s1 kernel: [last unloaded: libcfs] May 01 23:43:04 fir-md1-s1 kernel: May 01 23:43:04 fir-md1-s1 kernel: CPU: 16 PID: 102388 Comm: mdt00_018 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:43:04 fir-md1-s1 kernel: Hardware name: Dell Inc. 
PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:43:04 fir-md1-s1 kernel: task: ffff985884642080 ti: ffff984c4b64c000 task.ti: ffff984c4b64c000 May 01 23:43:04 fir-md1-s1 kernel: RIP: 0010:[] May 01 23:43:04 fir-md1-s1 kernel: [] ldiskfs_inode_touch_time_cmp+0x40/0x90 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: RSP: 0018:ffff984c4b64f180 EFLAGS: 00000246 May 01 23:43:04 fir-md1-s1 kernel: RAX: 0000000000100000 RBX: ffffffffb7019f22 RCX: 000000010ca590b2 May 01 23:43:04 fir-md1-s1 kernel: RDX: ffff9824fb188380 RSI: ffff985cd731ce50 RDI: 0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: RBP: ffff984c4b64f180 R08: ffff984c4b64f300 R09: 00000000003ecd00 May 01 23:43:04 fir-md1-s1 kernel: R10: 0000000047bdbb01 R11: ffffde3f261ef6c0 R12: 0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: R13: ffff982d3f224c80 R14: ffff983cf8b38400 R15: ffff982cfef254b8 May 01 23:43:04 fir-md1-s1 kernel: FS: 00007f32ccf2c740(0000) GS:ffff982cfef00000(0000) knlGS:0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:43:04 fir-md1-s1 kernel: CR2: 00007f32c5fef140 CR3: 00000012f7610000 CR4: 00000000003407e0 May 01 23:43:04 fir-md1-s1 kernel: Call Trace: May 01 23:43:04 fir-md1-s1 kernel: [] merge+0x62/0xc0 May 01 23:43:04 fir-md1-s1 kernel: [] ? ldiskfs_init_inode_table+0x410/0x410 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] list_sort+0x9b/0x250 May 01 23:43:04 fir-md1-s1 kernel: [] __ldiskfs_es_shrink+0x1ce/0x2a0 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] ldiskfs_es_shrink+0xb4/0x130 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] shrink_slab+0x175/0x340 May 01 23:43:04 fir-md1-s1 kernel: [] ? zone_watermark_ok+0x1f/0x30 May 01 23:43:04 fir-md1-s1 kernel: [] ? compaction_suitable+0xa3/0xb0 May 01 23:43:04 fir-md1-s1 kernel: [] zone_reclaim+0x1d1/0x2f0 May 01 23:43:04 fir-md1-s1 kernel: [] get_page_from_freelist+0x87b/0xa70 May 01 23:43:04 fir-md1-s1 kernel: [] ? 
__getblk+0x2d/0x300 May 01 23:43:04 fir-md1-s1 kernel: [] __alloc_pages_nodemask+0x176/0x420 May 01 23:43:04 fir-md1-s1 kernel: [] alloc_pages_current+0x98/0x110 May 01 23:43:04 fir-md1-s1 kernel: [] new_slab+0x2c5/0x390 May 01 23:43:04 fir-md1-s1 kernel: [] ___slab_alloc+0x3ac/0x4f0 May 01 23:43:04 fir-md1-s1 kernel: [] ? osp_object_alloc+0x40/0x170 [osp] May 01 23:43:04 fir-md1-s1 kernel: [] ? fld_cache_lookup+0x36/0x1a0 [fld] May 01 23:43:04 fir-md1-s1 kernel: [] ? fld_local_lookup+0x62/0x270 [fld] May 01 23:43:04 fir-md1-s1 kernel: [] ? osp_object_alloc+0x40/0x170 [osp] May 01 23:43:04 fir-md1-s1 kernel: [] __slab_alloc+0x40/0x5c May 01 23:43:04 fir-md1-s1 kernel: [] kmem_cache_alloc+0x19b/0x1f0 May 01 23:43:04 fir-md1-s1 kernel: [] ? osp_object_alloc+0x40/0x170 [osp] May 01 23:43:04 fir-md1-s1 kernel: [] osp_object_alloc+0x40/0x170 [osp] May 01 23:43:04 fir-md1-s1 kernel: [] lod_object_init+0x1e7/0x3c0 [lod] May 01 23:43:04 fir-md1-s1 kernel: [] lu_object_alloc+0xe5/0x320 [obdclass] May 01 23:43:04 fir-md1-s1 kernel: [] lu_object_find_at+0x76/0x280 [obdclass] May 01 23:43:04 fir-md1-s1 kernel: [] lu_object_find_slice+0x1f/0x90 [obdclass] May 01 23:43:04 fir-md1-s1 kernel: [] mdd_object_find+0x10/0x70 [mdd] May 01 23:43:04 fir-md1-s1 kernel: [] obf_lookup+0x2c9/0x350 [mdd] May 01 23:43:04 fir-md1-s1 kernel: [] ? req_capsule_get_size+0x31/0x70 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] mdt_getattr_name_lock+0xf7c/0x1c30 [mdt] May 01 23:43:04 fir-md1-s1 kernel: [] ? lustre_msg_buf+0x17/0x60 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? __req_capsule_get+0x15f/0x740 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? lustre_msg_get_flags+0x2c/0xa0 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] mdt_intent_getattr+0x2b5/0x480 [mdt] May 01 23:43:04 fir-md1-s1 kernel: [] mdt_intent_policy+0x2e8/0xd00 [mdt] May 01 23:43:04 fir-md1-s1 kernel: [] ? 
mdt_intent_layout+0xcc0/0xcc0 [mdt] May 01 23:43:04 fir-md1-s1 kernel: [] ldlm_lock_enqueue+0x366/0xa60 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? cfs_hash_bd_add_locked+0x63/0x80 [libcfs] May 01 23:43:04 fir-md1-s1 kernel: [] ? cfs_hash_add+0xbe/0x1a0 [libcfs] May 01 23:43:04 fir-md1-s1 kernel: [] ldlm_handle_enqueue0+0xa47/0x15a0 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:43:04 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:43:04 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:43:04 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:43:04 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:04 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:43:04 fir-md1-s1 kernel: [] ? 
insert_kthread_work+0x40/0x40 May 01 23:43:04 fir-md1-s1 kernel: Code: May 01 23:43:04 fir-md1-s1 kernel: 01 May 01 23:43:04 fir-md1-s1 kernel: 74 May 01 23:43:04 fir-md1-s1 kernel: 15 May 01 23:43:04 fir-md1-s1 kernel: 48 May 01 23:43:04 fir-md1-s1 kernel: 8b May 01 23:43:04 fir-md1-s1 kernel: 8a May 01 23:43:04 fir-md1-s1 kernel: e8 May 01 23:43:04 fir-md1-s1 kernel: fc May 01 23:43:04 fir-md1-s1 kernel: ff May 01 23:43:04 fir-md1-s1 kernel: ff May 01 23:43:04 fir-md1-s1 kernel: b8 May 01 23:43:04 fir-md1-s1 kernel: 01 May 01 23:43:04 fir-md1-s1 kernel: 00 May 01 23:43:04 fir-md1-s1 kernel: 00 May 01 23:43:04 fir-md1-s1 kernel: 00 May 01 23:43:04 fir-md1-s1 kernel: 48 May 01 23:43:04 fir-md1-s1 kernel: c1 May 01 23:43:04 fir-md1-s1 kernel: e9 May 01 23:43:04 fir-md1-s1 kernel: 2b May 01 23:43:04 fir-md1-s1 kernel: 83 May 01 23:43:04 fir-md1-s1 kernel: e1 May 01 23:43:04 fir-md1-s1 kernel: 01 May 01 23:43:04 fir-md1-s1 kernel: 74 May 01 23:43:04 fir-md1-s1 kernel: 29 May 01 23:43:04 fir-md1-s1 kernel: 48 May 01 23:43:04 fir-md1-s1 kernel: 8b May 01 23:43:04 fir-md1-s1 kernel: 86 May 01 23:43:04 fir-md1-s1 kernel: e8 May 01 23:43:04 fir-md1-s1 kernel: fc May 01 23:43:04 fir-md1-s1 kernel: ff May 01 23:43:04 fir-md1-s1 kernel: ff May 01 23:43:04 fir-md1-s1 kernel: 48 May 01 23:43:04 fir-md1-s1 kernel: c1 May 01 23:43:04 fir-md1-s1 kernel: e8 May 01 23:43:04 fir-md1-s1 kernel: 2b May 01 23:43:04 fir-md1-s1 kernel: a8 May 01 23:43:04 fir-md1-s1 kernel: 01 May 01 23:43:04 fir-md1-s1 kernel: 74 May 01 23:43:04 fir-md1-s1 kernel: 24 May 01 23:43:04 fir-md1-s1 kernel: 48 May 01 23:43:04 fir-md1-s1 kernel: 8b May 01 23:43:04 fir-md1-s1 kernel: 4e May 01 23:43:04 fir-md1-s1 kernel: 18 May 01 23:43:04 fir-md1-s1 kernel: <48> May 01 23:43:04 fir-md1-s1 kernel: 8b May 01 23:43:04 fir-md1-s1 kernel: 42 May 01 23:43:04 fir-md1-s1 kernel: 18 May 01 23:43:04 fir-md1-s1 kernel: 48 May 01 23:43:04 fir-md1-s1 kernel: 39 May 01 23:43:04 fir-md1-s1 kernel: c1 May 01 23:43:04 fir-md1-s1 
kernel: 74 May 01 23:43:04 fir-md1-s1 kernel: 37 May 01 23:43:04 fir-md1-s1 kernel: 48 May 01 23:43:04 fir-md1-s1 kernel: 29 May 01 23:43:04 fir-md1-s1 kernel: c8 May 01 23:43:04 fir-md1-s1 kernel: 48 May 01 23:43:04 fir-md1-s1 kernel: c1 May 01 23:43:04 fir-md1-s1 kernel: f8 May 01 23:43:04 fir-md1-s1 kernel: 3f May 01 23:43:04 fir-md1-s1 kernel: 83 May 01 23:43:04 fir-md1-s1 kernel: e0 May 01 23:43:04 fir-md1-s1 kernel: 02 May 01 23:43:04 fir-md1-s1 kernel: 83 May 01 23:43:04 fir-md1-s1 kernel: e8 May 01 23:43:04 fir-md1-s1 kernel: May 01 23:43:04 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#18 stuck for 22s! [mdt_io02_065:103134] May 01 23:43:04 fir-md1-s1 kernel: Modules linked in: May 01 23:43:04 fir-md1-s1 kernel: osp(OE) May 01 23:43:04 fir-md1-s1 kernel: mdd(OE) May 01 23:43:04 fir-md1-s1 kernel: lod(OE) May 01 23:43:04 fir-md1-s1 kernel: mdt(OE) May 01 23:43:04 fir-md1-s1 kernel: lfsck(OE) May 01 23:43:04 fir-md1-s1 kernel: mgs(OE) May 01 23:43:04 fir-md1-s1 kernel: mgc(OE) May 01 23:43:04 fir-md1-s1 kernel: osd_ldiskfs(OE) May 01 23:43:04 fir-md1-s1 kernel: lquota(OE) May 01 23:43:04 fir-md1-s1 kernel: ldiskfs(OE) May 01 23:43:04 fir-md1-s1 kernel: lustre(OE) May 01 23:43:04 fir-md1-s1 kernel: lmv(OE) May 01 23:43:04 fir-md1-s1 kernel: mdc(OE) May 01 23:43:04 fir-md1-s1 kernel: osc(OE) May 01 23:43:04 fir-md1-s1 kernel: lov(OE) May 01 23:43:04 fir-md1-s1 kernel: fid(OE) May 01 23:43:04 fir-md1-s1 kernel: fld(OE) May 01 23:43:04 fir-md1-s1 kernel: ko2iblnd(OE) May 01 23:43:04 fir-md1-s1 kernel: ptlrpc(OE) May 01 23:43:04 fir-md1-s1 kernel: obdclass(OE) May 01 23:43:04 fir-md1-s1 kernel: lnet(OE) May 01 23:43:04 fir-md1-s1 kernel: libcfs(OE) May 01 23:43:04 fir-md1-s1 kernel: rpcsec_gss_krb5 May 01 23:43:04 fir-md1-s1 kernel: auth_rpcgss May 01 23:43:04 fir-md1-s1 kernel: nfsv4 May 01 23:43:04 fir-md1-s1 kernel: dns_resolver May 01 23:43:04 fir-md1-s1 kernel: nfs May 01 23:43:04 fir-md1-s1 kernel: lockd May 01 23:43:04 fir-md1-s1 kernel: grace 
May 01 23:43:04 fir-md1-s1 kernel: fscache May 01 23:43:04 fir-md1-s1 kernel: rdma_ucm(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_ucm(OE) May 01 23:43:04 fir-md1-s1 kernel: rdma_cm(OE) May 01 23:43:04 fir-md1-s1 kernel: iw_cm(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_ipoib(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_cm(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_umad(OE) May 01 23:43:04 fir-md1-s1 kernel: mlx5_fpga_tools(OE) May 01 23:43:04 fir-md1-s1 kernel: mlx4_en(OE) May 01 23:43:04 fir-md1-s1 kernel: mlx4_ib(OE) May 01 23:43:04 fir-md1-s1 kernel: mlx4_core(OE) May 01 23:43:04 fir-md1-s1 kernel: dell_rbu May 01 23:43:04 fir-md1-s1 kernel: sunrpc May 01 23:43:04 fir-md1-s1 kernel: vfat May 01 23:43:04 fir-md1-s1 kernel: fat May 01 23:43:04 fir-md1-s1 kernel: dm_round_robin May 01 23:43:04 fir-md1-s1 kernel: amd64_edac_mod May 01 23:43:04 fir-md1-s1 kernel: edac_mce_amd May 01 23:43:04 fir-md1-s1 kernel: kvm_amd May 01 23:43:04 fir-md1-s1 kernel: kvm May 01 23:43:04 fir-md1-s1 kernel: ses May 01 23:43:04 fir-md1-s1 kernel: irqbypass May 01 23:43:04 fir-md1-s1 kernel: crc32_pclmul May 01 23:43:04 fir-md1-s1 kernel: enclosure May 01 23:43:04 fir-md1-s1 kernel: ghash_clmulni_intel May 01 23:43:04 fir-md1-s1 kernel: dcdbas May 01 23:43:04 fir-md1-s1 kernel: aesni_intel May 01 23:43:04 fir-md1-s1 kernel: lrw May 01 23:43:04 fir-md1-s1 kernel: gf128mul May 01 23:43:04 fir-md1-s1 kernel: glue_helper May 01 23:43:04 fir-md1-s1 kernel: ablk_helper May 01 23:43:04 fir-md1-s1 kernel: cryptd May 01 23:43:04 fir-md1-s1 kernel: ipmi_si May 01 23:43:04 fir-md1-s1 kernel: pcspkr May 01 23:43:04 fir-md1-s1 kernel: ipmi_devintf May 01 23:43:04 fir-md1-s1 kernel: ccp May 01 23:43:04 fir-md1-s1 kernel: i2c_piix4 May 01 23:43:04 fir-md1-s1 kernel: dm_multipath May 01 23:43:04 fir-md1-s1 kernel: sg May 01 23:43:04 fir-md1-s1 kernel: k10temp May 01 23:43:04 fir-md1-s1 kernel: ipmi_msghandler May 01 23:43:04 fir-md1-s1 kernel: dm_mod May 01 23:43:04 fir-md1-s1 kernel: acpi_power_meter May 01 
23:43:04 fir-md1-s1 kernel: knem(OE) May 01 23:43:04 fir-md1-s1 kernel: ip_tables May 01 23:43:04 fir-md1-s1 kernel: ext4 May 01 23:43:04 fir-md1-s1 kernel: mbcache May 01 23:43:04 fir-md1-s1 kernel: jbd2 May 01 23:43:04 fir-md1-s1 kernel: sd_mod May 01 23:43:04 fir-md1-s1 kernel: crc_t10dif May 01 23:43:04 fir-md1-s1 kernel: crct10dif_generic May 01 23:43:04 fir-md1-s1 kernel: mlx5_ib(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_uverbs(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_core(OE) May 01 23:43:04 fir-md1-s1 kernel: i2c_algo_bit May 01 23:43:04 fir-md1-s1 kernel: drm_kms_helper May 01 23:43:04 fir-md1-s1 kernel: mlx5_core(OE) May 01 23:43:04 fir-md1-s1 kernel: syscopyarea May 01 23:43:04 fir-md1-s1 kernel: sysfillrect May 01 23:43:04 fir-md1-s1 kernel: sysimgblt May 01 23:43:04 fir-md1-s1 kernel: fb_sys_fops May 01 23:43:04 fir-md1-s1 kernel: mlxfw(OE) May 01 23:43:04 fir-md1-s1 kernel: crct10dif_pclmul May 01 23:43:04 fir-md1-s1 kernel: ttm May 01 23:43:04 fir-md1-s1 kernel: devlink May 01 23:43:04 fir-md1-s1 kernel: ahci May 01 23:43:04 fir-md1-s1 kernel: crct10dif_common May 01 23:43:04 fir-md1-s1 kernel: libahci May 01 23:43:04 fir-md1-s1 kernel: drm May 01 23:43:04 fir-md1-s1 kernel: mlx_compat(OE) May 01 23:43:04 fir-md1-s1 kernel: tg3 May 01 23:43:04 fir-md1-s1 kernel: crc32c_intel May 01 23:43:04 fir-md1-s1 kernel: libata May 01 23:43:04 fir-md1-s1 kernel: megaraid_sas May 01 23:43:04 fir-md1-s1 kernel: drm_panel_orientation_quirks May 01 23:43:04 fir-md1-s1 kernel: ptp May 01 23:43:04 fir-md1-s1 kernel: pps_core May 01 23:43:04 fir-md1-s1 kernel: mpt3sas(OE) May 01 23:43:04 fir-md1-s1 kernel: raid_class May 01 23:43:04 fir-md1-s1 kernel: scsi_transport_sas May 01 23:43:04 fir-md1-s1 kernel: [last unloaded: libcfs] May 01 23:43:04 fir-md1-s1 kernel: May 01 23:43:04 fir-md1-s1 kernel: CPU: 18 PID: 103134 Comm: mdt_io02_065 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:43:04 fir-md1-s1 kernel: Hardware name: Dell 
Inc. PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:43:04 fir-md1-s1 kernel: task: ffff985ccda90000 ti: ffff98583efd4000 task.ti: ffff98583efd4000 May 01 23:43:04 fir-md1-s1 kernel: RIP: 0010:[] May 01 23:43:04 fir-md1-s1 kernel: [] native_queued_spin_lock_slowpath+0x122/0x200 May 01 23:43:04 fir-md1-s1 kernel: RSP: 0018:ffff98583efd7750 EFLAGS: 00000246 May 01 23:43:04 fir-md1-s1 kernel: RAX: 0000000000000000 RBX: ffff983164d60378 RCX: 0000000000910000 May 01 23:43:04 fir-md1-s1 kernel: RDX: ffff983cff69b780 RSI: 0000000000490101 RDI: ffff982c9fc8c480 May 01 23:43:04 fir-md1-s1 kernel: RBP: ffff98583efd7750 R08: ffff984cff71b780 R09: 0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: R10: ffff984cff71f140 R11: ffffde3ef98bd000 R12: 0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: R13: ffff98583efd76f0 R14: ffff983164d600e8 R15: 0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: FS: 00007f010bbcf880(0000) GS:ffff984cff700000(0000) knlGS:0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:43:04 fir-md1-s1 kernel: CR2: 0000000001c9e8e0 CR3: 000000402db9c000 CR4: 00000000003407e0 May 01 23:43:04 fir-md1-s1 kernel: Call Trace: May 01 23:43:04 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf May 01 23:43:04 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30 May 01 23:43:04 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] ldiskfs_ext_map_blocks+0x7b5/0xf60 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] ? zone_statistics+0x88/0xa0 May 01 23:43:04 fir-md1-s1 kernel: [] ? qsd_op_begin+0xb1/0x4b0 [lquota] May 01 23:43:04 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x98/0x700 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] ? 
ldiskfs_inode_attach_jinode+0x55/0xd0 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] osd_write_commit+0x3a2/0x8c0 [osd_ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] ? __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] mdt_commitrw_write.isra.46+0x608/0xd20 [mdt] May 01 23:43:04 fir-md1-s1 kernel: [] mdt_obd_commitrw+0x29b/0x520 [mdt] May 01 23:43:04 fir-md1-s1 kernel: [] obd_commitrw+0x9c/0x370 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] tgt_brw_write+0x100d/0x1a90 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? lustre_msg_buf_v2+0x1b0/0x1b0 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? lustre_msg_buf+0x17/0x60 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? update_curr+0x14c/0x1e0 May 01 23:43:04 fir-md1-s1 kernel: [] ? tgt_lookup_reply+0x2d/0x190 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:43:04 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:43:04 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:43:04 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:43:04 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:04 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:43:04 fir-md1-s1 kernel: [] ? 
insert_kthread_work+0x40/0x40 May 01 23:43:04 fir-md1-s1 kernel: Code: May 01 23:43:04 fir-md1-s1 kernel: 13 May 01 23:43:04 fir-md1-s1 kernel: 48 May 01 23:43:04 fir-md1-s1 kernel: c1 May 01 23:43:04 fir-md1-s1 kernel: ea May 01 23:43:04 fir-md1-s1 kernel: 0d May 01 23:43:04 fir-md1-s1 kernel: 48 May 01 23:43:04 fir-md1-s1 kernel: 98 May 01 23:43:04 fir-md1-s1 kernel: 83 May 01 23:43:04 fir-md1-s1 kernel: e2 May 01 23:43:04 fir-md1-s1 kernel: 30 May 01 23:43:04 fir-md1-s1 kernel: 48 May 01 23:43:04 fir-md1-s1 kernel: 81 May 01 23:43:04 fir-md1-s1 kernel: c2 May 01 23:43:04 fir-md1-s1 kernel: 80 May 01 23:43:04 fir-md1-s1 kernel: b7 May 01 23:43:04 fir-md1-s1 kernel: 01 May 01 23:43:04 fir-md1-s1 kernel: 00 May 01 23:43:04 fir-md1-s1 kernel: 48 May 01 23:43:04 fir-md1-s1 kernel: 03 May 01 23:43:04 fir-md1-s1 kernel: 14 May 01 23:43:04 fir-md1-s1 kernel: c5 May 01 23:43:04 fir-md1-s1 kernel: 60 May 01 23:43:04 fir-md1-s1 kernel: b9 May 01 23:43:04 fir-md1-s1 kernel: b4 May 01 23:43:04 fir-md1-s1 kernel: b7 May 01 23:43:04 fir-md1-s1 kernel: 4c May 01 23:43:04 fir-md1-s1 kernel: 89 May 01 23:43:04 fir-md1-s1 kernel: 02 May 01 23:43:04 fir-md1-s1 kernel: 41 May 01 23:43:04 fir-md1-s1 kernel: 8b May 01 23:43:04 fir-md1-s1 kernel: 40 May 01 23:43:04 fir-md1-s1 kernel: 08 May 01 23:43:04 fir-md1-s1 kernel: 85 May 01 23:43:04 fir-md1-s1 kernel: c0 May 01 23:43:04 fir-md1-s1 kernel: 75 May 01 23:43:04 fir-md1-s1 kernel: 0f May 01 23:43:04 fir-md1-s1 kernel: 0f May 01 23:43:04 fir-md1-s1 kernel: 1f May 01 23:43:04 fir-md1-s1 kernel: 44 May 01 23:43:04 fir-md1-s1 kernel: 00 May 01 23:43:04 fir-md1-s1 kernel: 00 May 01 23:43:04 fir-md1-s1 kernel: f3 May 01 23:43:04 fir-md1-s1 kernel: 90 May 01 23:43:04 fir-md1-s1 kernel: <41> May 01 23:43:04 fir-md1-s1 kernel: 8b May 01 23:43:04 fir-md1-s1 kernel: 40 May 01 23:43:04 fir-md1-s1 kernel: 08 May 01 23:43:04 fir-md1-s1 kernel: 85 May 01 23:43:04 fir-md1-s1 kernel: c0 May 01 23:43:04 fir-md1-s1 kernel: 74 May 01 23:43:04 fir-md1-s1 
kernel: f6 May 01 23:43:04 fir-md1-s1 kernel: 4d May 01 23:43:04 fir-md1-s1 kernel: 8b May 01 23:43:04 fir-md1-s1 kernel: 08 May 01 23:43:04 fir-md1-s1 kernel: 4d May 01 23:43:04 fir-md1-s1 kernel: 85 May 01 23:43:04 fir-md1-s1 kernel: c9 May 01 23:43:04 fir-md1-s1 kernel: 74 May 01 23:43:04 fir-md1-s1 kernel: 04 May 01 23:43:04 fir-md1-s1 kernel: 41 May 01 23:43:04 fir-md1-s1 kernel: 0f May 01 23:43:04 fir-md1-s1 kernel: 18 May 01 23:43:04 fir-md1-s1 kernel: 09 May 01 23:43:04 fir-md1-s1 kernel: 8b May 01 23:43:04 fir-md1-s1 kernel: May 01 23:43:04 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#25 stuck for 22s! [mdt_io01_082:103083] May 01 23:43:04 fir-md1-s1 kernel: Modules linked in: May 01 23:43:04 fir-md1-s1 kernel: osp(OE) May 01 23:43:04 fir-md1-s1 kernel: mdd(OE) May 01 23:43:04 fir-md1-s1 kernel: lod(OE) May 01 23:43:04 fir-md1-s1 kernel: mdt(OE) May 01 23:43:04 fir-md1-s1 kernel: lfsck(OE) May 01 23:43:04 fir-md1-s1 kernel: mgs(OE) May 01 23:43:04 fir-md1-s1 kernel: mgc(OE) May 01 23:43:04 fir-md1-s1 kernel: osd_ldiskfs(OE) May 01 23:43:04 fir-md1-s1 kernel: lquota(OE) May 01 23:43:04 fir-md1-s1 kernel: ldiskfs(OE) May 01 23:43:04 fir-md1-s1 kernel: lustre(OE) May 01 23:43:04 fir-md1-s1 kernel: lmv(OE) May 01 23:43:04 fir-md1-s1 kernel: mdc(OE) May 01 23:43:04 fir-md1-s1 kernel: osc(OE) May 01 23:43:04 fir-md1-s1 kernel: lov(OE) May 01 23:43:04 fir-md1-s1 kernel: fid(OE) May 01 23:43:04 fir-md1-s1 kernel: fld(OE) May 01 23:43:04 fir-md1-s1 kernel: ko2iblnd(OE) May 01 23:43:04 fir-md1-s1 kernel: ptlrpc(OE) May 01 23:43:04 fir-md1-s1 kernel: obdclass(OE) May 01 23:43:04 fir-md1-s1 kernel: lnet(OE) May 01 23:43:04 fir-md1-s1 kernel: libcfs(OE) May 01 23:43:04 fir-md1-s1 kernel: rpcsec_gss_krb5 May 01 23:43:04 fir-md1-s1 kernel: auth_rpcgss May 01 23:43:04 fir-md1-s1 kernel: nfsv4 May 01 23:43:04 fir-md1-s1 kernel: dns_resolver May 01 23:43:04 fir-md1-s1 kernel: nfs May 01 23:43:04 fir-md1-s1 kernel: lockd May 01 23:43:04 fir-md1-s1 kernel: grace 
May 01 23:43:04 fir-md1-s1 kernel: fscache May 01 23:43:04 fir-md1-s1 kernel: rdma_ucm(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_ucm(OE) May 01 23:43:04 fir-md1-s1 kernel: rdma_cm(OE) May 01 23:43:04 fir-md1-s1 kernel: iw_cm(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_ipoib(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_cm(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_umad(OE) May 01 23:43:04 fir-md1-s1 kernel: mlx5_fpga_tools(OE) May 01 23:43:04 fir-md1-s1 kernel: mlx4_en(OE) May 01 23:43:04 fir-md1-s1 kernel: mlx4_ib(OE) May 01 23:43:04 fir-md1-s1 kernel: mlx4_core(OE) May 01 23:43:04 fir-md1-s1 kernel: dell_rbu May 01 23:43:04 fir-md1-s1 kernel: sunrpc May 01 23:43:04 fir-md1-s1 kernel: vfat May 01 23:43:04 fir-md1-s1 kernel: fat May 01 23:43:04 fir-md1-s1 kernel: dm_round_robin May 01 23:43:04 fir-md1-s1 kernel: amd64_edac_mod May 01 23:43:04 fir-md1-s1 kernel: edac_mce_amd May 01 23:43:04 fir-md1-s1 kernel: kvm_amd May 01 23:43:04 fir-md1-s1 kernel: kvm May 01 23:43:04 fir-md1-s1 kernel: ses May 01 23:43:04 fir-md1-s1 kernel: irqbypass May 01 23:43:04 fir-md1-s1 kernel: crc32_pclmul May 01 23:43:04 fir-md1-s1 kernel: enclosure May 01 23:43:04 fir-md1-s1 kernel: ghash_clmulni_intel May 01 23:43:04 fir-md1-s1 kernel: dcdbas May 01 23:43:04 fir-md1-s1 kernel: aesni_intel May 01 23:43:04 fir-md1-s1 kernel: lrw May 01 23:43:04 fir-md1-s1 kernel: gf128mul May 01 23:43:04 fir-md1-s1 kernel: glue_helper May 01 23:43:04 fir-md1-s1 kernel: ablk_helper May 01 23:43:04 fir-md1-s1 kernel: cryptd May 01 23:43:04 fir-md1-s1 kernel: ipmi_si May 01 23:43:04 fir-md1-s1 kernel: pcspkr May 01 23:43:04 fir-md1-s1 kernel: ipmi_devintf May 01 23:43:04 fir-md1-s1 kernel: ccp May 01 23:43:04 fir-md1-s1 kernel: i2c_piix4 May 01 23:43:04 fir-md1-s1 kernel: dm_multipath May 01 23:43:04 fir-md1-s1 kernel: sg May 01 23:43:04 fir-md1-s1 kernel: k10temp May 01 23:43:04 fir-md1-s1 kernel: ipmi_msghandler May 01 23:43:04 fir-md1-s1 kernel: dm_mod May 01 23:43:04 fir-md1-s1 kernel: acpi_power_meter May 01 
23:43:04 fir-md1-s1 kernel: knem(OE) May 01 23:43:04 fir-md1-s1 kernel: ip_tables May 01 23:43:04 fir-md1-s1 kernel: ext4 May 01 23:43:04 fir-md1-s1 kernel: mbcache May 01 23:43:04 fir-md1-s1 kernel: jbd2 May 01 23:43:04 fir-md1-s1 kernel: sd_mod May 01 23:43:04 fir-md1-s1 kernel: crc_t10dif May 01 23:43:04 fir-md1-s1 kernel: crct10dif_generic May 01 23:43:04 fir-md1-s1 kernel: mlx5_ib(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_uverbs(OE) May 01 23:43:04 fir-md1-s1 kernel: ib_core(OE) May 01 23:43:04 fir-md1-s1 kernel: i2c_algo_bit May 01 23:43:04 fir-md1-s1 kernel: drm_kms_helper May 01 23:43:04 fir-md1-s1 kernel: mlx5_core(OE) May 01 23:43:04 fir-md1-s1 kernel: syscopyarea May 01 23:43:04 fir-md1-s1 kernel: sysfillrect May 01 23:43:04 fir-md1-s1 kernel: sysimgblt May 01 23:43:04 fir-md1-s1 kernel: fb_sys_fops May 01 23:43:04 fir-md1-s1 kernel: mlxfw(OE) May 01 23:43:04 fir-md1-s1 kernel: crct10dif_pclmul May 01 23:43:04 fir-md1-s1 kernel: ttm May 01 23:43:04 fir-md1-s1 kernel: devlink May 01 23:43:04 fir-md1-s1 kernel: ahci May 01 23:43:04 fir-md1-s1 kernel: crct10dif_common May 01 23:43:04 fir-md1-s1 kernel: libahci May 01 23:43:04 fir-md1-s1 kernel: drm May 01 23:43:04 fir-md1-s1 kernel: mlx_compat(OE) May 01 23:43:04 fir-md1-s1 kernel: tg3 May 01 23:43:04 fir-md1-s1 kernel: crc32c_intel May 01 23:43:04 fir-md1-s1 kernel: libata May 01 23:43:04 fir-md1-s1 kernel: megaraid_sas May 01 23:43:04 fir-md1-s1 kernel: drm_panel_orientation_quirks May 01 23:43:04 fir-md1-s1 kernel: ptp May 01 23:43:04 fir-md1-s1 kernel: pps_core May 01 23:43:04 fir-md1-s1 kernel: mpt3sas(OE) May 01 23:43:04 fir-md1-s1 kernel: raid_class May 01 23:43:04 fir-md1-s1 kernel: scsi_transport_sas May 01 23:43:04 fir-md1-s1 kernel: [last unloaded: libcfs] May 01 23:43:04 fir-md1-s1 kernel: May 01 23:43:04 fir-md1-s1 kernel: CPU: 25 PID: 103083 Comm: mdt_io01_082 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:43:04 fir-md1-s1 kernel: Hardware name: Dell 
Inc. PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:43:04 fir-md1-s1 kernel: task: ffff985cfe905140 ti: ffff985ccaf18000 task.ti: ffff985ccaf18000 May 01 23:43:04 fir-md1-s1 kernel: RIP: 0010:[] May 01 23:43:04 fir-md1-s1 kernel: [] native_queued_spin_lock_slowpath+0x126/0x200 May 01 23:43:04 fir-md1-s1 kernel: RSP: 0018:ffff985ccaf1b800 EFLAGS: 00000246 May 01 23:43:04 fir-md1-s1 kernel: RAX: 0000000000000000 RBX: ffff9831703ceb60 RCX: 0000000000c90000 May 01 23:43:04 fir-md1-s1 kernel: RDX: ffff984cff71b780 RSI: 0000000000910101 RDI: ffff982c9fc8c480 May 01 23:43:04 fir-md1-s1 kernel: RBP: ffff985ccaf1b800 R08: ffff983cff79b780 R09: 0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: R10: ffff983cff79f140 R11: ffffde3fa5607800 R12: 0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: R13: ffff985ccaf1b7a0 R14: ffff9831703ce8d0 R15: 0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: FS: 00007f427f792740(0000) GS:ffff983cff780000(0000) knlGS:0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:43:04 fir-md1-s1 kernel: CR2: 00007f427f58b000 CR3: 00000012f7610000 CR4: 00000000003407e0 May 01 23:43:04 fir-md1-s1 kernel: Call Trace: May 01 23:43:04 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf May 01 23:43:04 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30 May 01 23:43:04 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] ldiskfs_ext_map_blocks+0x7b5/0xf60 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] ? ktime_get+0x52/0xe0 May 01 23:43:04 fir-md1-s1 kernel: [] ? kiblnd_check_sends_locked+0xa72/0xe40 [ko2iblnd] May 01 23:43:04 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x98/0x700 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] ? 
ktime_get_ts64+0x52/0xf0 May 01 23:43:04 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] osd_write_prep+0x2b6/0x360 [osd_ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] mdt_obd_preprw+0x637/0x1060 [mdt] May 01 23:43:04 fir-md1-s1 kernel: [] tgt_brw_write+0xc7e/0x1a90 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? lustre_msg_buf_v2+0x1b0/0x1b0 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? lustre_msg_buf+0x17/0x60 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? update_curr+0x14c/0x1e0 May 01 23:43:04 fir-md1-s1 kernel: [] ? account_entity_dequeue+0xae/0xd0 May 01 23:43:04 fir-md1-s1 kernel: [] ? __enqueue_entity+0x78/0x80 May 01 23:43:04 fir-md1-s1 kernel: [] ? tgt_lookup_reply+0x2d/0x190 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:43:04 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:43:04 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:43:04 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:43:04 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:04 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:43:04 fir-md1-s1 kernel: [] ? 
insert_kthread_work+0x40/0x40 May 01 23:43:04 fir-md1-s1 kernel: Code: May 01 23:43:04 fir-md1-s1 kernel: 0d May 01 23:43:04 fir-md1-s1 kernel: 48 May 01 23:43:04 fir-md1-s1 kernel: 98 May 01 23:43:04 fir-md1-s1 kernel: 83 May 01 23:43:04 fir-md1-s1 kernel: e2 May 01 23:43:04 fir-md1-s1 kernel: 30 May 01 23:43:04 fir-md1-s1 kernel: 48 May 01 23:43:04 fir-md1-s1 kernel: 81 May 01 23:43:04 fir-md1-s1 kernel: c2 May 01 23:43:04 fir-md1-s1 kernel: 80 May 01 23:43:04 fir-md1-s1 kernel: b7 May 01 23:43:04 fir-md1-s1 kernel: 01 May 01 23:43:04 fir-md1-s1 kernel: 00 May 01 23:43:04 fir-md1-s1 kernel: 48 May 01 23:43:04 fir-md1-s1 kernel: 03 May 01 23:43:04 fir-md1-s1 kernel: 14 May 01 23:43:04 fir-md1-s1 kernel: c5 May 01 23:43:04 fir-md1-s1 kernel: 60 May 01 23:43:04 fir-md1-s1 kernel: b9 May 01 23:43:04 fir-md1-s1 kernel: b4 May 01 23:43:04 fir-md1-s1 kernel: b7 May 01 23:43:04 fir-md1-s1 kernel: 4c May 01 23:43:04 fir-md1-s1 kernel: 89 May 01 23:43:04 fir-md1-s1 kernel: 02 May 01 23:43:04 fir-md1-s1 kernel: 41 May 01 23:43:04 fir-md1-s1 kernel: 8b May 01 23:43:04 fir-md1-s1 kernel: 40 May 01 23:43:04 fir-md1-s1 kernel: 08 May 01 23:43:04 fir-md1-s1 kernel: 85 May 01 23:43:04 fir-md1-s1 kernel: c0 May 01 23:43:04 fir-md1-s1 kernel: 75 May 01 23:43:04 fir-md1-s1 kernel: 0f May 01 23:43:04 fir-md1-s1 kernel: 0f May 01 23:43:04 fir-md1-s1 kernel: 1f May 01 23:43:04 fir-md1-s1 kernel: 44 May 01 23:43:04 fir-md1-s1 kernel: 00 May 01 23:43:04 fir-md1-s1 kernel: 00 May 01 23:43:04 fir-md1-s1 kernel: f3 May 01 23:43:04 fir-md1-s1 kernel: 90 May 01 23:43:04 fir-md1-s1 kernel: 41 May 01 23:43:04 fir-md1-s1 kernel: 8b May 01 23:43:04 fir-md1-s1 kernel: 40 May 01 23:43:04 fir-md1-s1 kernel: 08 May 01 23:43:04 fir-md1-s1 kernel: <85> May 01 23:43:04 fir-md1-s1 kernel: c0 May 01 23:43:04 fir-md1-s1 kernel: 74 May 01 23:43:04 fir-md1-s1 kernel: f6 May 01 23:43:04 fir-md1-s1 kernel: 4d May 01 23:43:04 fir-md1-s1 kernel: 8b May 01 23:43:04 fir-md1-s1 kernel: 08 May 01 23:43:04 fir-md1-s1 
kernel: 4d May 01 23:43:04 fir-md1-s1 kernel: 85 May 01 23:43:04 fir-md1-s1 kernel: c9 May 01 23:43:04 fir-md1-s1 kernel: 74 May 01 23:43:04 fir-md1-s1 kernel: 04 May 01 23:43:04 fir-md1-s1 kernel: 41 May 01 23:43:04 fir-md1-s1 kernel: 0f May 01 23:43:04 fir-md1-s1 kernel: 18 May 01 23:43:04 fir-md1-s1 kernel: 09 May 01 23:43:04 fir-md1-s1 kernel: 8b May 01 23:43:04 fir-md1-s1 kernel: 17 May 01 23:43:04 fir-md1-s1 kernel: 0f May 01 23:43:04 fir-md1-s1 kernel: b7 May 01 23:43:04 fir-md1-s1 kernel: c2 May 01 23:43:04 fir-md1-s1 kernel: May 01 23:43:04 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#34 stuck for 22s! [mdt_io02_034:102984] May 01 23:43:04 fir-md1-s1 kernel: Modules linked in: May 01 23:43:04 fir-md1-s1 kernel: osp(OE) May 01 23:43:04 fir-md1-s1 kernel: mdd(OE) May 01 23:43:04 fir-md1-s1 kernel: lod(OE) May 01 23:43:04 fir-md1-s1 kernel: mdt(OE) May 01 23:43:04 fir-md1-s1 kernel: lfsck(OE) May 01 23:43:04 fir-md1-s1 kernel: mgs(OE) May 01 23:43:04 fir-md1-s1 kernel: mgc(OE) May 01 23:43:04 fir-md1-s1 kernel: osd_ldiskfs(OE) May 01 23:43:04 fir-md1-s1 kernel: lquota(OE) May 01 23:43:04 fir-md1-s1 kernel: ldiskfs(OE) May 01 23:43:04 fir-md1-s1 kernel: lustre(OE) May 01 23:43:04 fir-md1-s1 kernel: lmv(OE) May 01 23:43:04 fir-md1-s1 kernel: mdc(OE) May 01 23:43:04 fir-md1-s1 kernel: osc(OE) May 01 23:43:04 fir-md1-s1 kernel: lov(OE) May 01 23:43:04 fir-md1-s1 kernel: fid(OE) May 01 23:43:04 fir-md1-s1 kernel: fld(OE) May 01 23:43:04 fir-md1-s1 kernel: ko2iblnd(OE) May 01 23:43:04 fir-md1-s1 kernel: ptlrpc(OE) May 01 23:43:04 fir-md1-s1 kernel: obdclass(OE) May 01 23:43:04 fir-md1-s1 kernel: lnet(OE) May 01 23:43:04 fir-md1-s1 kernel: libcfs(OE) May 01 23:43:04 fir-md1-s1 kernel: rpcsec_gss_krb5 May 01 23:43:04 fir-md1-s1 kernel: auth_rpcgss May 01 23:43:04 fir-md1-s1 kernel: nfsv4 May 01 23:43:04 fir-md1-s1 kernel: dns_resolver May 01 23:43:04 fir-md1-s1 kernel: nfs May 01 23:43:04 fir-md1-s1 kernel: lockd May 01 23:43:04 fir-md1-s1 kernel: grace 
May 01 23:43:04 fir-md1-s1 kernel: fscache rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) dell_rbu sunrpc vfat fat dm_round_robin amd64_edac_mod edac_mce_amd kvm_amd kvm ses irqbypass crc32_pclmul enclosure ghash_clmulni_intel dcdbas aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ipmi_si pcspkr ipmi_devintf ccp i2c_piix4 dm_multipath sg k10temp ipmi_msghandler dm_mod acpi_power_meter knem(OE) ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) i2c_algo_bit drm_kms_helper mlx5_core(OE) syscopyarea sysfillrect sysimgblt fb_sys_fops mlxfw(OE) crct10dif_pclmul ttm devlink ahci crct10dif_common libahci drm mlx_compat(OE) tg3 crc32c_intel libata megaraid_sas drm_panel_orientation_quirks ptp pps_core mpt3sas(OE) raid_class scsi_transport_sas [last unloaded: libcfs] May 01 23:43:04 fir-md1-s1 kernel: CPU: 34 PID: 102984 Comm: mdt_io02_034 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:43:04 fir-md1-s1 kernel: Hardware name: Dell
Inc. PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:43:04 fir-md1-s1 kernel: task: ffff982cf9de4100 ti: ffff985ce80d4000 task.ti: ffff985ce80d4000 May 01 23:43:04 fir-md1-s1 kernel: RIP: 0010:[] [] native_queued_spin_lock_slowpath+0x122/0x200 May 01 23:43:04 fir-md1-s1 kernel: RSP: 0018:ffff985ce80d7800 EFLAGS: 00000246 May 01 23:43:04 fir-md1-s1 kernel: RAX: 0000000000000000 RBX: ffff983165105698 RCX: 0000000001110000 May 01 23:43:04 fir-md1-s1 kernel: RDX: ffff983cff79b780 RSI: 0000000000c90101 RDI: ffff982c9fc8c480 May 01 23:43:04 fir-md1-s1 kernel: RBP: ffff985ce80d7800 R08: ffff984cff81b780 R09: 0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: R10: ffff984cff81f140 R11: ffffde3fa7770c00 R12: 0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: R13: ffff985ce80d77a0 R14: ffff983165105408 R15: 0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: FS: 00007fe19c902740(0000) GS:ffff984cff800000(0000) knlGS:0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:43:04 fir-md1-s1 kernel: CR2: 00007fe19bb327c0 CR3: 00000012f7610000 CR4: 00000000003407e0 May 01 23:43:04 fir-md1-s1 kernel: Call Trace: May 01 23:43:04 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf May 01 23:43:04 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30 May 01 23:43:04 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] ldiskfs_ext_map_blocks+0x7b5/0xf60 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] ? ktime_get+0x52/0xe0 May 01 23:43:04 fir-md1-s1 kernel: [] ? kiblnd_check_sends_locked+0xa72/0xe40 [ko2iblnd] May 01 23:43:04 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x98/0x700 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] ?
ktime_get_ts64+0x52/0xf0 May 01 23:43:04 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] osd_write_prep+0x2b6/0x360 [osd_ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] mdt_obd_preprw+0x637/0x1060 [mdt] May 01 23:43:04 fir-md1-s1 kernel: [] tgt_brw_write+0xc7e/0x1a90 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? tgt_free_reply_data+0x128/0x3b0 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? kfree+0x106/0x140 May 01 23:43:04 fir-md1-s1 kernel: [] ? tgt_free_reply_data+0x128/0x3b0 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? tgt_lookup_reply+0x2d/0x190 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:43:04 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:43:04 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:43:04 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:43:04 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:04 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:43:04 fir-md1-s1 kernel: [] ? 
insert_kthread_work+0x40/0x40 May 01 23:43:04 fir-md1-s1 kernel: Code: 13 48 c1 ea 0d 48 98 83 e2 30 48 81 c2 80 b7 01 00 48 03 14 c5 60 b9 b4 b7 4c 89 02 41 8b 40 08 85 c0 75 0f 0f 1f 44 00 00 f3 90 <41> 8b 40 08 85 c0 74 f6 4d 8b 08 4d 85 c9 74 04 41 0f 18 09 8b May 01 23:43:04 fir-md1-s1 kernel: Lustre: fir-MDT0000: Received new LWP connection from 10.0.10.52@o2ib7, removing former export from same NID May 01 23:43:04 fir-md1-s1 kernel: mlx5_fpga_tools(OE) mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) dell_rbu sunrpc vfat fat dm_round_robin amd64_edac_mod edac_mce_amd kvm_amd kvm ses irqbypass crc32_pclmul enclosure ghash_clmulni_intel dcdbas aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ipmi_si pcspkr ipmi_devintf ccp i2c_piix4 dm_multipath sg k10temp ipmi_msghandler dm_mod acpi_power_meter knem(OE) ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) i2c_algo_bit drm_kms_helper mlx5_core(OE) syscopyarea sysfillrect sysimgblt fb_sys_fops mlxfw(OE) crct10dif_pclmul ttm devlink ahci crct10dif_common libahci drm mlx_compat(OE) tg3 crc32c_intel libata megaraid_sas drm_panel_orientation_quirks ptp pps_core mpt3sas(OE) raid_class scsi_transport_sas [last unloaded: libcfs] May 01 23:43:04 fir-md1-s1 kernel: CPU: 4 PID: 103101 Comm: mdt_io00_057 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:43:04 fir-md1-s1 kernel: Hardware name: Dell Inc.
PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:43:04 fir-md1-s1 kernel: task: ffff985c827130c0 ti: ffff985c1a30c000 task.ti: ffff985c1a30c000 May 01 23:43:04 fir-md1-s1 kernel: RIP: 0010:[] [] native_queued_spin_lock_slowpath+0x1d6/0x200 May 01 23:43:04 fir-md1-s1 kernel: RSP: 0018:ffff985c1a30f800 EFLAGS: 00000293 May 01 23:43:04 fir-md1-s1 kernel: RAX: 0000000000000001 RBX: ffff9831703ce738 RCX: 0000000000000001 May 01 23:43:04 fir-md1-s1 kernel: RDX: 0000000000000101 RSI: 0000000000000001 RDI: ffff982c9fc8c480 May 01 23:43:04 fir-md1-s1 kernel: RBP: ffff985c1a30f800 R08: 0000000000000101 R09: ffffffffc1231d1a May 01 23:43:04 fir-md1-s1 kernel: R10: ffff982cfee5f140 R11: ffffde3ed5b1ce00 R12: 0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: R13: ffff985c1a30f7a0 R14: ffff9831703ce4a8 R15: 0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: FS: 00007f427f792740(0000) GS:ffff982cfee40000(0000) knlGS:0000000000000000 May 01 23:43:04 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:43:04 fir-md1-s1 kernel: CR2: 00007f427f58b000 CR3: 00000012f7610000 CR4: 00000000003407e0 May 01 23:43:04 fir-md1-s1 kernel: Call Trace: May 01 23:43:04 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf May 01 23:43:04 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30 May 01 23:43:04 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] ldiskfs_ext_map_blocks+0x7b5/0xf60 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] ? ktime_get+0x52/0xe0 May 01 23:43:04 fir-md1-s1 kernel: [] ? kiblnd_check_sends_locked+0xa72/0xe40 [ko2iblnd] May 01 23:43:04 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x98/0x700 [ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] ? 
ktime_get_ts64+0x52/0xf0 May 01 23:43:04 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] osd_write_prep+0x2b6/0x360 [osd_ldiskfs] May 01 23:43:04 fir-md1-s1 kernel: [] mdt_obd_preprw+0x637/0x1060 [mdt] May 01 23:43:04 fir-md1-s1 kernel: [] tgt_brw_write+0xc7e/0x1a90 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? lustre_msg_buf_v2+0x1b0/0x1b0 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? lustre_msg_buf+0x17/0x60 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? update_curr+0x14c/0x1e0 May 01 23:43:04 fir-md1-s1 kernel: [] ? account_entity_dequeue+0xae/0xd0 May 01 23:43:04 fir-md1-s1 kernel: [] ? __enqueue_entity+0x78/0x80 May 01 23:43:04 fir-md1-s1 kernel: [] ? tgt_lookup_reply+0x2d/0x190 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:43:04 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:43:04 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:43:04 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:43:04 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:43:04 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:04 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:43:04 fir-md1-s1 kernel: [] ? 
insert_kthread_work+0x40/0x40 May 01 23:43:04 fir-md1-s1 kernel: Code: f4 e9 93 fe ff ff 0f 1f 80 00 00 00 00 83 fa 01 75 11 0f 1f 00 e9 68 fe ff ff 0f 1f 00 85 c0 74 0c f3 90 8b 07 0f b6 c0 83 f8 03 <75> f0 b8 01 00 00 00 66 89 07 5d c3 66 0f 1f 44 00 00 f3 90 4d May 01 23:43:05 fir-md1-s1 kernel: Lustre: 101395:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556779364/real 1556779364] req@ffff98555c27e600 x1632254604128256/t0(0) o601->fir-MDT0000-lwp-MDT0002@0@lo:23/10 lens 336/336 e 1 to 1 dl 1556779385 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 May 01 23:43:05 fir-md1-s1 kernel: Lustre: 101395:0:(client.c:2132:ptlrpc_expire_one_request()) Skipped 2 previous similar messages May 01 23:43:07 fir-md1-s1 kernel: INFO: rcu_sched self-detected stall on CPU May 01 23:43:07 fir-md1-s1 kernel: INFO: rcu_sched detected stalls on CPUs/tasks: { 16} (detected by 9, t=60002 jiffies, g=60007355, c=60007354, q=340009) May 01 23:43:07 fir-md1-s1 kernel: Task dump for CPU 16: May 01 23:43:07 fir-md1-s1 kernel: mdt00_018 R running task 0 102388 2 0x00000088 May 01 23:43:07 fir-md1-s1 kernel: Call Trace: May 01 23:43:07 fir-md1-s1 kernel: [] ? ldiskfs_es_shrink+0xb4/0x130 [ldiskfs] May 01 23:43:07 fir-md1-s1 kernel: [] ? shrink_slab+0x175/0x340 May 01 23:43:07 fir-md1-s1 kernel: [] ? zone_watermark_ok+0x1f/0x30 May 01 23:43:07 fir-md1-s1 kernel: [] ? compaction_suitable+0xa3/0xb0 May 01 23:43:07 fir-md1-s1 kernel: [] ? zone_reclaim+0x1d1/0x2f0 May 01 23:43:07 fir-md1-s1 kernel: [] ? get_page_from_freelist+0x87b/0xa70 May 01 23:43:07 fir-md1-s1 kernel: [] ? __getblk+0x2d/0x300 May 01 23:43:07 fir-md1-s1 kernel: [] ? __alloc_pages_nodemask+0x176/0x420 May 01 23:43:07 fir-md1-s1 kernel: [] ?
alloc_pages_current+0x98/0x110 May 01 23:43:07 fir-md1-s1 kernel: [] ? new_slab+0x2c5/0x390 May 01 23:43:07 fir-md1-s1 kernel: [] ? ___slab_alloc+0x3ac/0x4f0 May 01 23:43:07 fir-md1-s1 kernel: [] ? osp_object_alloc+0x40/0x170 [osp] May 01 23:43:07 fir-md1-s1 kernel: [] ? fld_cache_lookup+0x36/0x1a0 [fld] May 01 23:43:07 fir-md1-s1 kernel: [] ? fld_local_lookup+0x62/0x270 [fld] May 01 23:43:07 fir-md1-s1 kernel: [] ? osp_object_alloc+0x40/0x170 [osp] May 01 23:43:07 fir-md1-s1 kernel: [] ? __slab_alloc+0x40/0x5c May 01 23:43:07 fir-md1-s1 kernel: [] ? kmem_cache_alloc+0x19b/0x1f0 May 01 23:43:07 fir-md1-s1 kernel: [] ? osp_object_alloc+0x40/0x170 [osp] May 01 23:43:07 fir-md1-s1 kernel: [] ? osp_object_alloc+0x40/0x170 [osp] May 01 23:43:07 fir-md1-s1 kernel: [] ? lod_object_init+0x1e7/0x3c0 [lod] May 01 23:43:07 fir-md1-s1 kernel: [] ? lu_object_alloc+0xe5/0x320 [obdclass] May 01 23:43:07 fir-md1-s1 kernel: [] ? lu_object_find_at+0x76/0x280 [obdclass] May 01 23:43:07 fir-md1-s1 kernel: [] ? lu_object_find_slice+0x1f/0x90 [obdclass] May 01 23:43:07 fir-md1-s1 kernel: [] ? mdd_object_find+0x10/0x70 [mdd] May 01 23:43:07 fir-md1-s1 kernel: [] ? obf_lookup+0x2c9/0x350 [mdd] May 01 23:43:07 fir-md1-s1 kernel: [] ? req_capsule_get_size+0x31/0x70 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] ? mdt_getattr_name_lock+0xf7c/0x1c30 [mdt] May 01 23:43:07 fir-md1-s1 kernel: [] ? lustre_msg_buf+0x17/0x60 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] ? __req_capsule_get+0x15f/0x740 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] ? lustre_msg_get_flags+0x2c/0xa0 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] ? mdt_intent_getattr+0x2b5/0x480 [mdt] May 01 23:43:07 fir-md1-s1 kernel: [] ? mdt_intent_policy+0x2e8/0xd00 [mdt] May 01 23:43:07 fir-md1-s1 kernel: [] ? mdt_intent_layout+0xcc0/0xcc0 [mdt] May 01 23:43:07 fir-md1-s1 kernel: [] ? ldlm_lock_enqueue+0x366/0xa60 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] ? 
cfs_hash_bd_add_locked+0x63/0x80 [libcfs] May 01 23:43:07 fir-md1-s1 kernel: [] ? cfs_hash_add+0xbe/0x1a0 [libcfs] May 01 23:43:07 fir-md1-s1 kernel: [] ? ldlm_handle_enqueue0+0xa47/0x15a0 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] ? tgt_enqueue+0x62/0x210 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] ? tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:43:07 fir-md1-s1 kernel: [] ? ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:43:07 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:43:07 fir-md1-s1 kernel: [] ? ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] ? kthread+0xd1/0xe0 May 01 23:43:07 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:07 fir-md1-s1 kernel: [] ? ret_from_fork_nospec_begin+0xe/0x21 May 01 23:43:07 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:07 fir-md1-s1 kernel: { May 01 23:43:07 fir-md1-s1 kernel: 16} (t=60374 jiffies g=60007355 c=60007354 q=340300) May 01 23:43:07 fir-md1-s1 kernel: Task dump for CPU 16: May 01 23:43:07 fir-md1-s1 kernel: mdt00_018 R running task 0 102388 2 0x00000088 May 01 23:43:07 fir-md1-s1 kernel: Call Trace: May 01 23:43:07 fir-md1-s1 kernel: [] sched_show_task+0xa8/0x110 May 01 23:43:07 fir-md1-s1 kernel: [] dump_cpu_task+0x39/0x70 May 01 23:43:07 fir-md1-s1 kernel: [] rcu_dump_cpu_stacks+0x90/0xd0 May 01 23:43:07 fir-md1-s1 kernel: [] rcu_check_callbacks+0x442/0x730 May 01 23:43:07 fir-md1-s1 kernel: [] ? 
tick_sched_do_timer+0x50/0x50 May 01 23:43:07 fir-md1-s1 kernel: [] update_process_times+0x46/0x80 May 01 23:43:07 fir-md1-s1 kernel: [] tick_sched_handle+0x30/0x70 May 01 23:43:07 fir-md1-s1 kernel: [] tick_sched_timer+0x39/0x80 May 01 23:43:07 fir-md1-s1 kernel: [] __hrtimer_run_queues+0xf3/0x270 May 01 23:43:07 fir-md1-s1 kernel: [] hrtimer_interrupt+0xaf/0x1d0 May 01 23:43:07 fir-md1-s1 kernel: [] ? ldiskfs_init_inode_table+0x410/0x410 [ldiskfs] May 01 23:43:07 fir-md1-s1 kernel: [] local_apic_timer_interrupt+0x3b/0x60 May 01 23:43:07 fir-md1-s1 kernel: [] smp_apic_timer_interrupt+0x43/0x60 May 01 23:43:07 fir-md1-s1 kernel: [] apic_timer_interrupt+0x162/0x170 May 01 23:43:07 fir-md1-s1 kernel: [] ? unfreeze_partials.isra.44+0xd2/0x130 May 01 23:43:07 fir-md1-s1 kernel: [] ? ldiskfs_inode_touch_time_cmp+0x14/0x90 [ldiskfs] May 01 23:43:07 fir-md1-s1 kernel: [] merge+0x62/0xc0 May 01 23:43:07 fir-md1-s1 kernel: [] ? ldiskfs_init_inode_table+0x410/0x410 [ldiskfs] May 01 23:43:07 fir-md1-s1 kernel: [] list_sort+0x9b/0x250 May 01 23:43:07 fir-md1-s1 kernel: [] __ldiskfs_es_shrink+0x1ce/0x2a0 [ldiskfs] May 01 23:43:07 fir-md1-s1 kernel: [] ldiskfs_es_shrink+0xb4/0x130 [ldiskfs] May 01 23:43:07 fir-md1-s1 kernel: [] shrink_slab+0x175/0x340 May 01 23:43:07 fir-md1-s1 kernel: [] ? zone_watermark_ok+0x1f/0x30 May 01 23:43:07 fir-md1-s1 kernel: [] ? compaction_suitable+0xa3/0xb0 May 01 23:43:07 fir-md1-s1 kernel: [] zone_reclaim+0x1d1/0x2f0 May 01 23:43:07 fir-md1-s1 kernel: [] get_page_from_freelist+0x87b/0xa70 May 01 23:43:07 fir-md1-s1 kernel: [] ? __getblk+0x2d/0x300 May 01 23:43:07 fir-md1-s1 kernel: [] __alloc_pages_nodemask+0x176/0x420 May 01 23:43:07 fir-md1-s1 kernel: [] alloc_pages_current+0x98/0x110 May 01 23:43:07 fir-md1-s1 kernel: [] new_slab+0x2c5/0x390 May 01 23:43:07 fir-md1-s1 kernel: [] ___slab_alloc+0x3ac/0x4f0 May 01 23:43:07 fir-md1-s1 kernel: [] ? osp_object_alloc+0x40/0x170 [osp] May 01 23:43:07 fir-md1-s1 kernel: [] ? 
fld_cache_lookup+0x36/0x1a0 [fld] May 01 23:43:07 fir-md1-s1 kernel: [] ? fld_local_lookup+0x62/0x270 [fld] May 01 23:43:07 fir-md1-s1 kernel: [] ? osp_object_alloc+0x40/0x170 [osp] May 01 23:43:07 fir-md1-s1 kernel: [] __slab_alloc+0x40/0x5c May 01 23:43:07 fir-md1-s1 kernel: [] kmem_cache_alloc+0x19b/0x1f0 May 01 23:43:07 fir-md1-s1 kernel: [] ? osp_object_alloc+0x40/0x170 [osp] May 01 23:43:07 fir-md1-s1 kernel: [] osp_object_alloc+0x40/0x170 [osp] May 01 23:43:07 fir-md1-s1 kernel: [] lod_object_init+0x1e7/0x3c0 [lod] May 01 23:43:07 fir-md1-s1 kernel: [] lu_object_alloc+0xe5/0x320 [obdclass] May 01 23:43:07 fir-md1-s1 kernel: [] lu_object_find_at+0x76/0x280 [obdclass] May 01 23:43:07 fir-md1-s1 kernel: [] lu_object_find_slice+0x1f/0x90 [obdclass] May 01 23:43:07 fir-md1-s1 kernel: [] mdd_object_find+0x10/0x70 [mdd] May 01 23:43:07 fir-md1-s1 kernel: [] obf_lookup+0x2c9/0x350 [mdd] May 01 23:43:07 fir-md1-s1 kernel: [] ? req_capsule_get_size+0x31/0x70 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] mdt_getattr_name_lock+0xf7c/0x1c30 [mdt] May 01 23:43:07 fir-md1-s1 kernel: [] ? lustre_msg_buf+0x17/0x60 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] ? __req_capsule_get+0x15f/0x740 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] ? lustre_msg_get_flags+0x2c/0xa0 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] mdt_intent_getattr+0x2b5/0x480 [mdt] May 01 23:43:07 fir-md1-s1 kernel: [] mdt_intent_policy+0x2e8/0xd00 [mdt] May 01 23:43:07 fir-md1-s1 kernel: [] ? mdt_intent_layout+0xcc0/0xcc0 [mdt] May 01 23:43:07 fir-md1-s1 kernel: [] ldlm_lock_enqueue+0x366/0xa60 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] ? cfs_hash_bd_add_locked+0x63/0x80 [libcfs] May 01 23:43:07 fir-md1-s1 kernel: [] ? cfs_hash_add+0xbe/0x1a0 [libcfs] May 01 23:43:07 fir-md1-s1 kernel: [] ldlm_handle_enqueue0+0xa47/0x15a0 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] ? 
lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] tgt_enqueue+0x62/0x210 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:43:07 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:43:07 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:43:07 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:43:07 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:43:07 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:07 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:43:07 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:10 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#44 stuck for 23s! 
[mdt_io00_041:103008] May 01 23:43:10 fir-md1-s1 kernel: Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) ldiskfs(OE) lustre(OE) lmv(OE) mdc(OE) osc(OE) lov(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) dell_rbu sunrpc vfat fat dm_round_robin amd64_edac_mod edac_mce_amd kvm_amd kvm ses irqbypass crc32_pclmul enclosure ghash_clmulni_intel dcdbas aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ipmi_si pcspkr ipmi_devintf ccp i2c_piix4 dm_multipath sg k10temp ipmi_msghandler dm_mod acpi_power_meter knem(OE) ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif May 01 23:43:10 fir-md1-s1 kernel: crct10dif_generic mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) i2c_algo_bit drm_kms_helper mlx5_core(OE) syscopyarea sysfillrect sysimgblt fb_sys_fops mlxfw(OE) crct10dif_pclmul ttm devlink ahci crct10dif_common libahci drm mlx_compat(OE) tg3 crc32c_intel libata megaraid_sas drm_panel_orientation_quirks ptp pps_core mpt3sas(OE) raid_class scsi_transport_sas [last unloaded: libcfs] May 01 23:43:10 fir-md1-s1 kernel: CPU: 44 PID: 103008 Comm: mdt_io00_041 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:43:10 fir-md1-s1 kernel: Hardware name: Dell Inc. 
PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:43:10 fir-md1-s1 kernel: task: ffff984b564bd140 ti: ffff984cf4430000 task.ti: ffff984cf4430000 May 01 23:43:10 fir-md1-s1 kernel: RIP: 0010:[] [] native_queued_spin_lock_slowpath+0x128/0x200 May 01 23:43:10 fir-md1-s1 kernel: RSP: 0018:ffff984cf44338e8 EFLAGS: 00000246 May 01 23:43:10 fir-md1-s1 kernel: RAX: 0000000000000000 RBX: ffff985c7a711b78 RCX: 0000000001610000 May 01 23:43:10 fir-md1-s1 kernel: RDX: ffff984cff61b780 RSI: 0000000000110101 RDI: ffff982c9fc8c480 May 01 23:43:10 fir-md1-s1 kernel: RBP: ffff984cf44338e8 R08: ffff982cff0db780 R09: 0000000000000000 May 01 23:43:10 fir-md1-s1 kernel: R10: 0000000000000000 R11: ffff985c8b319038 R12: ffff982cff0dac00 May 01 23:43:10 fir-md1-s1 kernel: R13: ffff984b564bd1a8 R14: 00ffffffb7a08d80 R15: ffff984cf34000a0 May 01 23:43:10 fir-md1-s1 kernel: FS: 00007fad68bc1880(0000) GS:ffff982cff0c0000(0000) knlGS:0000000000000000 May 01 23:43:10 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:43:10 fir-md1-s1 kernel: CR2: 00007fad62c1b090 CR3: 00000012f7610000 CR4: 00000000003407e0 May 01 23:43:10 fir-md1-s1 kernel: Call Trace: May 01 23:43:10 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf May 01 23:43:10 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30 May 01 23:43:10 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs] May 01 23:43:10 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x210/0x700 [ldiskfs] May 01 23:43:10 fir-md1-s1 kernel: [] ? ktime_get_ts64+0x52/0xf0 May 01 23:43:10 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs] May 01 23:43:10 fir-md1-s1 kernel: [] osd_write_prep+0x2b6/0x360 [osd_ldiskfs] May 01 23:43:10 fir-md1-s1 kernel: [] mdt_obd_preprw+0x637/0x1060 [mdt] May 01 23:43:10 fir-md1-s1 kernel: [] tgt_brw_write+0xc7e/0x1a90 [ptlrpc] May 01 23:43:10 fir-md1-s1 kernel: [] ? lustre_msg_buf_v2+0x1b0/0x1b0 [ptlrpc] May 01 23:43:10 fir-md1-s1 kernel: [] ? 
lustre_msg_buf+0x17/0x60 [ptlrpc] May 01 23:43:10 fir-md1-s1 kernel: [] ? update_curr+0x14c/0x1e0 May 01 23:43:10 fir-md1-s1 kernel: [] ? account_entity_dequeue+0xae/0xd0 May 01 23:43:10 fir-md1-s1 kernel: [] ? tgt_lookup_reply+0x2d/0x190 [ptlrpc] May 01 23:43:10 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:43:10 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:43:10 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:43:10 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:43:10 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:43:10 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:43:10 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:43:10 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:43:10 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:43:10 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:43:10 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:10 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:43:10 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:10 fir-md1-s1 kernel: Code: 98 83 e2 30 48 81 c2 80 b7 01 00 48 03 14 c5 60 b9 b4 b7 4c 89 02 41 8b 40 08 85 c0 75 0f 0f 1f 44 00 00 f3 90 41 8b 40 08 85 c0 <74> f6 4d 8b 08 4d 85 c9 74 04 41 0f 18 09 8b 17 0f b7 c2 85 c0 May 01 23:43:13 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [mdt_io03_049:103296] May 01 23:43:13 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#10 stuck for 22s! 
[mdt_io02_062:103114] May 01 23:43:13 fir-md1-s1 kernel: Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) ldiskfs(OE) lustre(OE) lmv(OE) mdc(OE) osc(OE) lov(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) dell_rbu sunrpc vfat fat dm_round_robin amd64_edac_mod edac_mce_amd kvm_amd kvm ses irqbypass crc32_pclmul enclosure ghash_clmulni_intel May 01 23:43:13 fir-md1-s1 kernel: Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) ldiskfs(OE) lustre(OE) lmv(OE) mdc(OE) osc(OE) lov(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) dell_rbu sunrpc vfat fat dm_round_robin amd64_edac_mod edac_mce_amd kvm_amd kvm ses irqbypass crc32_pclmul enclosure ghash_clmulni_intel dcdbas aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ipmi_si pcspkr ipmi_devintf ccp i2c_piix4 dm_multipath sg k10temp ipmi_msghandler dm_mod acpi_power_meter knem(OE) ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) i2c_algo_bit drm_kms_helper mlx5_core(OE) syscopyarea sysfillrect sysimgblt fb_sys_fops mlxfw(OE) crct10dif_pclmul ttm devlink ahci crct10dif_common libahci drm mlx_compat(OE) tg3 crc32c_intel libata megaraid_sas drm_panel_orientation_quirks ptp pps_core mpt3sas(OE) raid_class scsi_transport_sas [last unloaded: libcfs] May 01 23:43:13 fir-md1-s1 kernel: CPU: 10 PID: 103114 Comm: mdt_io02_062 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:43:13 fir-md1-s1 kernel: Hardware name: Dell Inc.
PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:43:13 fir-md1-s1 kernel: task: ffff985cdbd69040 ti: ffff985cfaf20000 task.ti: ffff985cfaf20000 May 01 23:43:13 fir-md1-s1 kernel: RIP: 0010:[] May 01 23:43:13 fir-md1-s1 kernel: [] native_queued_spin_lock_slowpath+0x126/0x200 May 01 23:43:13 fir-md1-s1 kernel: RSP: 0018:ffff985cfaf238e8 EFLAGS: 00000246 May 01 23:43:13 fir-md1-s1 kernel: RAX: 0000000000000000 RBX: ffff98267ca42b78 RCX: 0000000000510000 May 01 23:43:13 fir-md1-s1 kernel: RDX: ffff982cff0db780 RSI: 0000000001610101 RDI: ffff982c9fc8c480 May 01 23:43:13 fir-md1-s1 kernel: RBP: ffff985cfaf238e8 R08: ffff984cff69b780 R09: 0000000000000000 May 01 23:43:13 fir-md1-s1 kernel: R10: 0000000000000000 R11: ffff985c8b319038 R12: ffff984cff69ac00 May 01 23:43:13 fir-md1-s1 kernel: R13: ffff985cdbd690a8 R14: 00ffffffb7a08d80 R15: ffff984cf34000a0 May 01 23:43:13 fir-md1-s1 kernel: FS: 00007f759b098700(0000) GS:ffff984cff680000(0000) knlGS:0000000000000000 May 01 23:43:13 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:43:13 fir-md1-s1 kernel: CR2: 00007f759e3fa000 CR3: 00000012f7610000 CR4: 00000000003407e0 May 01 23:43:13 fir-md1-s1 kernel: Call Trace: May 01 23:43:13 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf May 01 23:43:13 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30 May 01 23:43:13 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs] May 01 23:43:13 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x210/0x700 [ldiskfs] May 01 23:43:13 fir-md1-s1 kernel: [] ? ktime_get_ts64+0x52/0xf0 May 01 23:43:13 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs] May 01 23:43:13 fir-md1-s1 kernel: [] osd_write_prep+0x2b6/0x360 [osd_ldiskfs] May 01 23:43:13 fir-md1-s1 kernel: [] mdt_obd_preprw+0x637/0x1060 [mdt] May 01 23:43:13 fir-md1-s1 kernel: [] tgt_brw_write+0xc7e/0x1a90 [ptlrpc] May 01 23:43:13 fir-md1-s1 kernel: [] ? load_balance+0x178/0x9a0 May 01 23:43:13 fir-md1-s1 kernel: [] ? 
tgt_lookup_reply+0x2d/0x190 [ptlrpc] May 01 23:43:13 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:43:13 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:43:13 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:43:13 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:43:13 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:43:13 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:43:13 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:43:13 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:43:13 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:43:13 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:43:13 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:13 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:43:13 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:13 fir-md1-s1 kernel: Code: May 01 23:43:13 fir-md1-s1 kernel: 0d May 01 23:43:13 fir-md1-s1 kernel: 48 May 01 23:43:13 fir-md1-s1 kernel: 98 May 01 23:43:13 fir-md1-s1 kernel: 83 May 01 23:43:13 fir-md1-s1 kernel: e2 May 01 23:43:13 fir-md1-s1 kernel: 30 May 01 23:43:13 fir-md1-s1 kernel: 48 May 01 23:43:13 fir-md1-s1 kernel: 81 May 01 23:43:13 fir-md1-s1 kernel: c2 May 01 23:43:13 fir-md1-s1 kernel: 80 May 01 23:43:13 fir-md1-s1 kernel: b7 May 01 23:43:13 fir-md1-s1 kernel: 01 May 01 23:43:13 fir-md1-s1 kernel: 00 May 01 23:43:13 fir-md1-s1 kernel: 48 May 01 23:43:13 fir-md1-s1 kernel: 03 May 01 23:43:13 fir-md1-s1 kernel: 14 May 01 23:43:13 fir-md1-s1 kernel: c5 May 01 23:43:13 fir-md1-s1 kernel: 60 May 01 23:43:13 fir-md1-s1 kernel: b9 May 01 23:43:13 fir-md1-s1 kernel: b4 May 01 23:43:13 fir-md1-s1 kernel: b7 May 01 23:43:13 fir-md1-s1 kernel: 4c May 01 23:43:13 fir-md1-s1 kernel: 89 May 01 23:43:13 fir-md1-s1 kernel: 02 May 01 23:43:13 fir-md1-s1 
kernel: 41 May 01 23:43:13 fir-md1-s1 kernel: 8b May 01 23:43:13 fir-md1-s1 kernel: 40 May 01 23:43:13 fir-md1-s1 kernel: 08 May 01 23:43:13 fir-md1-s1 kernel: 85 May 01 23:43:13 fir-md1-s1 kernel: c0 May 01 23:43:13 fir-md1-s1 kernel: 75 May 01 23:43:13 fir-md1-s1 kernel: 0f May 01 23:43:13 fir-md1-s1 kernel: 0f May 01 23:43:13 fir-md1-s1 kernel: 1f May 01 23:43:13 fir-md1-s1 kernel: 44 May 01 23:43:13 fir-md1-s1 kernel: 00 May 01 23:43:13 fir-md1-s1 kernel: 00 May 01 23:43:13 fir-md1-s1 kernel: f3 May 01 23:43:13 fir-md1-s1 kernel: 90 May 01 23:43:13 fir-md1-s1 kernel: 41 May 01 23:43:13 fir-md1-s1 kernel: 8b May 01 23:43:13 fir-md1-s1 kernel: 40 May 01 23:43:13 fir-md1-s1 kernel: 08 May 01 23:43:13 fir-md1-s1 kernel: <85> May 01 23:43:13 fir-md1-s1 kernel: c0 May 01 23:43:13 fir-md1-s1 kernel: 74 May 01 23:43:13 fir-md1-s1 kernel: f6 May 01 23:43:13 fir-md1-s1 kernel: 4d May 01 23:43:13 fir-md1-s1 kernel: 8b May 01 23:43:13 fir-md1-s1 kernel: 08 May 01 23:43:13 fir-md1-s1 kernel: 4d May 01 23:43:13 fir-md1-s1 kernel: 85 May 01 23:43:13 fir-md1-s1 kernel: c9 May 01 23:43:13 fir-md1-s1 kernel: 74 May 01 23:43:13 fir-md1-s1 kernel: 04 May 01 23:43:13 fir-md1-s1 kernel: 41 May 01 23:43:13 fir-md1-s1 kernel: 0f May 01 23:43:13 fir-md1-s1 kernel: 18 May 01 23:43:13 fir-md1-s1 kernel: 09 May 01 23:43:13 fir-md1-s1 kernel: 8b May 01 23:43:13 fir-md1-s1 kernel: 17 May 01 23:43:13 fir-md1-s1 kernel: 0f May 01 23:43:13 fir-md1-s1 kernel: b7 May 01 23:43:13 fir-md1-s1 kernel: c2 May 01 23:43:13 fir-md1-s1 kernel: May 01 23:43:13 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#13 stuck for 22s! 
[mdt_io01_080:103078] May 01 23:43:13 fir-md1-s1 kernel: Modules linked in: May 01 23:43:13 fir-md1-s1 kernel: osp(OE) May 01 23:43:13 fir-md1-s1 kernel: mdd(OE) May 01 23:43:13 fir-md1-s1 kernel: lod(OE) May 01 23:43:13 fir-md1-s1 kernel: mdt(OE) May 01 23:43:13 fir-md1-s1 kernel: lfsck(OE) May 01 23:43:13 fir-md1-s1 kernel: mgs(OE) May 01 23:43:13 fir-md1-s1 kernel: mgc(OE) May 01 23:43:13 fir-md1-s1 kernel: osd_ldiskfs(OE) May 01 23:43:13 fir-md1-s1 kernel: lquota(OE) May 01 23:43:13 fir-md1-s1 kernel: ldiskfs(OE) May 01 23:43:13 fir-md1-s1 kernel: lustre(OE) May 01 23:43:13 fir-md1-s1 kernel: lmv(OE) May 01 23:43:13 fir-md1-s1 kernel: mdc(OE) May 01 23:43:13 fir-md1-s1 kernel: osc(OE) May 01 23:43:13 fir-md1-s1 kernel: lov(OE) May 01 23:43:13 fir-md1-s1 kernel: fid(OE) May 01 23:43:13 fir-md1-s1 kernel: fld(OE) May 01 23:43:13 fir-md1-s1 kernel: ko2iblnd(OE) May 01 23:43:13 fir-md1-s1 kernel: ptlrpc(OE) May 01 23:43:13 fir-md1-s1 kernel: obdclass(OE) May 01 23:43:13 fir-md1-s1 kernel: lnet(OE) May 01 23:43:13 fir-md1-s1 kernel: libcfs(OE) May 01 23:43:13 fir-md1-s1 kernel: rpcsec_gss_krb5 May 01 23:43:13 fir-md1-s1 kernel: auth_rpcgss May 01 23:43:13 fir-md1-s1 kernel: nfsv4 May 01 23:43:13 fir-md1-s1 kernel: dns_resolver May 01 23:43:13 fir-md1-s1 kernel: nfs May 01 23:43:13 fir-md1-s1 kernel: lockd May 01 23:43:13 fir-md1-s1 kernel: grace May 01 23:43:13 fir-md1-s1 kernel: fscache May 01 23:43:13 fir-md1-s1 kernel: rdma_ucm(OE) May 01 23:43:13 fir-md1-s1 kernel: ib_ucm(OE) May 01 23:43:13 fir-md1-s1 kernel: rdma_cm(OE) May 01 23:43:13 fir-md1-s1 kernel: iw_cm(OE) May 01 23:43:13 fir-md1-s1 kernel: ib_ipoib(OE) May 01 23:43:13 fir-md1-s1 kernel: ib_cm(OE) May 01 23:43:13 fir-md1-s1 kernel: ib_umad(OE) May 01 23:43:13 fir-md1-s1 kernel: mlx5_fpga_tools(OE) May 01 23:43:13 fir-md1-s1 kernel: mlx4_en(OE) May 01 23:43:13 fir-md1-s1 kernel: mlx4_ib(OE) May 01 23:43:13 fir-md1-s1 kernel: mlx4_core(OE) May 01 23:43:13 fir-md1-s1 kernel: dell_rbu May 01 23:43:13 
fir-md1-s1 kernel: sunrpc May 01 23:43:13 fir-md1-s1 kernel: vfat May 01 23:43:13 fir-md1-s1 kernel: fat May 01 23:43:13 fir-md1-s1 kernel: dm_round_robin May 01 23:43:13 fir-md1-s1 kernel: amd64_edac_mod May 01 23:43:13 fir-md1-s1 kernel: edac_mce_amd May 01 23:43:13 fir-md1-s1 kernel: kvm_amd May 01 23:43:13 fir-md1-s1 kernel: kvm May 01 23:43:13 fir-md1-s1 kernel: ses May 01 23:43:13 fir-md1-s1 kernel: irqbypass May 01 23:43:13 fir-md1-s1 kernel: crc32_pclmul May 01 23:43:13 fir-md1-s1 kernel: enclosure May 01 23:43:13 fir-md1-s1 kernel: ghash_clmulni_intel May 01 23:43:13 fir-md1-s1 kernel: dcdbas May 01 23:43:13 fir-md1-s1 kernel: aesni_intel May 01 23:43:13 fir-md1-s1 kernel: lrw May 01 23:43:13 fir-md1-s1 kernel: gf128mul May 01 23:43:13 fir-md1-s1 kernel: glue_helper May 01 23:43:13 fir-md1-s1 kernel: ablk_helper May 01 23:43:13 fir-md1-s1 kernel: cryptd May 01 23:43:13 fir-md1-s1 kernel: ipmi_si May 01 23:43:13 fir-md1-s1 kernel: pcspkr May 01 23:43:13 fir-md1-s1 kernel: ipmi_devintf May 01 23:43:13 fir-md1-s1 kernel: ccp May 01 23:43:13 fir-md1-s1 kernel: i2c_piix4 May 01 23:43:13 fir-md1-s1 kernel: dm_multipath May 01 23:43:13 fir-md1-s1 kernel: sg May 01 23:43:13 fir-md1-s1 kernel: k10temp May 01 23:43:13 fir-md1-s1 kernel: ipmi_msghandler May 01 23:43:13 fir-md1-s1 kernel: dm_mod May 01 23:43:13 fir-md1-s1 kernel: acpi_power_meter May 01 23:43:13 fir-md1-s1 kernel: knem(OE) May 01 23:43:13 fir-md1-s1 kernel: ip_tables May 01 23:43:13 fir-md1-s1 kernel: ext4 May 01 23:43:13 fir-md1-s1 kernel: mbcache May 01 23:43:13 fir-md1-s1 kernel: jbd2 May 01 23:43:13 fir-md1-s1 kernel: sd_mod May 01 23:43:13 fir-md1-s1 kernel: crc_t10dif May 01 23:43:13 fir-md1-s1 kernel: crct10dif_generic May 01 23:43:13 fir-md1-s1 kernel: mlx5_ib(OE) May 01 23:43:13 fir-md1-s1 kernel: ib_uverbs(OE) May 01 23:43:13 fir-md1-s1 kernel: ib_core(OE) May 01 23:43:13 fir-md1-s1 kernel: i2c_algo_bit May 01 23:43:13 fir-md1-s1 kernel: drm_kms_helper May 01 23:43:13 fir-md1-s1 kernel: 
mlx5_core(OE) May 01 23:43:13 fir-md1-s1 kernel: syscopyarea May 01 23:43:13 fir-md1-s1 kernel: sysfillrect May 01 23:43:13 fir-md1-s1 kernel: sysimgblt May 01 23:43:13 fir-md1-s1 kernel: fb_sys_fops May 01 23:43:13 fir-md1-s1 kernel: mlxfw(OE) May 01 23:43:13 fir-md1-s1 kernel: crct10dif_pclmul May 01 23:43:13 fir-md1-s1 kernel: ttm May 01 23:43:13 fir-md1-s1 kernel: devlink May 01 23:43:13 fir-md1-s1 kernel: ahci May 01 23:43:13 fir-md1-s1 kernel: crct10dif_common May 01 23:43:13 fir-md1-s1 kernel: libahci May 01 23:43:13 fir-md1-s1 kernel: drm May 01 23:43:13 fir-md1-s1 kernel: mlx_compat(OE) May 01 23:43:13 fir-md1-s1 kernel: tg3 May 01 23:43:13 fir-md1-s1 kernel: crc32c_intel May 01 23:43:13 fir-md1-s1 kernel: libata May 01 23:43:13 fir-md1-s1 kernel: megaraid_sas May 01 23:43:13 fir-md1-s1 kernel: drm_panel_orientation_quirks May 01 23:43:13 fir-md1-s1 kernel: ptp May 01 23:43:13 fir-md1-s1 kernel: pps_core May 01 23:43:13 fir-md1-s1 kernel: mpt3sas(OE) May 01 23:43:13 fir-md1-s1 kernel: raid_class May 01 23:43:13 fir-md1-s1 kernel: scsi_transport_sas May 01 23:43:13 fir-md1-s1 kernel: [last unloaded: libcfs] May 01 23:43:13 fir-md1-s1 kernel: May 01 23:43:13 fir-md1-s1 kernel: CPU: 13 PID: 103078 Comm: mdt_io01_080 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:43:13 fir-md1-s1 kernel: Hardware name: Dell Inc. 
PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:43:13 fir-md1-s1 kernel: task: ffff985cfe900000 ti: ffff985bc0140000 task.ti: ffff985bc0140000 May 01 23:43:13 fir-md1-s1 kernel: RIP: 0010:[] May 01 23:43:13 fir-md1-s1 kernel: [] native_queued_spin_lock_slowpath+0x122/0x200 May 01 23:43:13 fir-md1-s1 kernel: RSP: 0018:ffff985bc01438e8 EFLAGS: 00000246 May 01 23:43:13 fir-md1-s1 kernel: RAX: 0000000000000000 RBX: ffff98267ca42b78 RCX: 0000000000690000 May 01 23:43:13 fir-md1-s1 kernel: RDX: ffff984cff69b780 RSI: 0000000000510101 RDI: ffff982c9fc8c480 May 01 23:43:13 fir-md1-s1 kernel: RBP: ffff985bc01438e8 R08: ffff983cff6db780 R09: 0000000000000000 May 01 23:43:13 fir-md1-s1 kernel: R10: 0000000000000000 R11: ffff985c8b319038 R12: ffff983cff7dac00 May 01 23:43:13 fir-md1-s1 kernel: R13: ffff985cfe900068 R14: 00ff985bc0143850 R15: ffff984cf34000a0 May 01 23:43:13 fir-md1-s1 kernel: FS: 00007f7593fff700(0000) GS:ffff983cff6c0000(0000) knlGS:0000000000000000 May 01 23:43:13 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:43:13 fir-md1-s1 kernel: CR2: 00007f759e3fa000 CR3: 00000012f7610000 CR4: 00000000003407e0 May 01 23:43:13 fir-md1-s1 kernel: Call Trace: May 01 23:43:13 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf May 01 23:43:13 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30 May 01 23:43:13 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs] May 01 23:43:13 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x210/0x700 [ldiskfs] May 01 23:43:13 fir-md1-s1 kernel: [] ? 
ktime_get_ts64+0x52/0xf0 May 01 23:43:13 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs] May 01 23:43:13 fir-md1-s1 kernel: [] osd_write_prep+0x2b6/0x360 [osd_ldiskfs] May 01 23:43:13 fir-md1-s1 kernel: [] mdt_obd_preprw+0x637/0x1060 [mdt] May 01 23:43:13 fir-md1-s1 kernel: dcdbas aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ipmi_si pcspkr ipmi_devintf ccp i2c_piix4 dm_multipath sg k10temp ipmi_msghandler dm_mod acpi_power_meter knem(OE) ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) i2c_algo_bit drm_kms_helper mlx5_core(OE) syscopyarea sysfillrect sysimgblt fb_sys_fops mlxfw(OE) crct10dif_pclmul ttm devlink ahci crct10dif_common libahci drm mlx_compat(OE) tg3 crc32c_intel libata megaraid_sas drm_panel_orientation_quirks ptp pps_core mpt3sas(OE) raid_class scsi_transport_sas [last unloaded: libcfs] May 01 23:43:13 fir-md1-s1 kernel: CPU: 3 PID: 103296 Comm: mdt_io03_049 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:43:13 fir-md1-s1 kernel: Hardware name: Dell Inc. 
PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:43:13 fir-md1-s1 kernel: task: ffff982cf0b6d140 ti: ffff98283e6f8000 task.ti: ffff98283e6f8000 May 01 23:43:13 fir-md1-s1 kernel: RIP: 0010:[] [] native_queued_spin_lock_slowpath+0x122/0x200 May 01 23:43:13 fir-md1-s1 kernel: RSP: 0018:ffff98283e6fb8e8 EFLAGS: 00000246 May 01 23:43:13 fir-md1-s1 kernel: RAX: 0000000000000000 RBX: ffff984851f7a378 RCX: 0000000000190000 May 01 23:43:13 fir-md1-s1 kernel: RDX: ffff982cff05b780 RSI: 0000000001210101 RDI: ffff982c9fc8c480 May 01 23:43:13 fir-md1-s1 kernel: RBP: ffff98283e6fb8e8 R08: ffff985d3f41b780 R09: 0000000000000000 May 01 23:43:13 fir-md1-s1 kernel: R10: 0000000000000000 R11: ffff985c8b319038 R12: ffff985d3f41ac00 May 01 23:43:13 fir-md1-s1 kernel: R13: ffff982cf0b6d1a8 R14: 00ffffffb7a08d80 R15: ffff984cf34000a0 May 01 23:43:13 fir-md1-s1 kernel: FS: 00007f0117f95880(0000) GS:ffff985d3f400000(0000) knlGS:0000000000000000 May 01 23:43:13 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:43:13 fir-md1-s1 kernel: CR2: 00007f010587e03c CR3: 00000030346d6000 CR4: 00000000003407e0 May 01 23:43:13 fir-md1-s1 kernel: Call Trace: May 01 23:43:13 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf May 01 23:43:13 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30 May 01 23:43:13 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs] May 01 23:43:13 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x210/0x700 [ldiskfs] May 01 23:43:13 fir-md1-s1 kernel: [] ? ktime_get_ts64+0x52/0xf0 May 01 23:43:13 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs] May 01 23:43:13 fir-md1-s1 kernel: [] osd_write_prep+0x2b6/0x360 [osd_ldiskfs] May 01 23:43:13 fir-md1-s1 kernel: [] mdt_obd_preprw+0x637/0x1060 [mdt] May 01 23:43:13 fir-md1-s1 kernel: [] tgt_brw_write+0xc7e/0x1a90 [ptlrpc] May 01 23:43:13 fir-md1-s1 kernel: [] ? __enqueue_entity+0x78/0x80 May 01 23:43:13 fir-md1-s1 kernel: [] ? 
tgt_lookup_reply+0x2d/0x190 [ptlrpc] May 01 23:43:13 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:43:13 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:43:13 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:43:13 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:43:13 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:43:13 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:43:13 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:43:13 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:43:13 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:43:13 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:43:13 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:13 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:43:13 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:13 fir-md1-s1 kernel: Code: 13 48 c1 ea 0d 48 98 83 e2 30 48 81 c2 80 b7 01 00 48 03 14 c5 60 b9 b4 b7 4c 89 02 41 8b 40 08 85 c0 75 0f 0f 1f 44 00 00 f3 90 <41> 8b 40 08 85 c0 74 f6 4d 8b 08 4d 85 c9 74 04 41 0f 18 09 8b May 01 23:43:15 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#36 stuck for 22s! 
[mdt_io00_043:103023] May 01 23:43:15 fir-md1-s1 kernel: Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) ldiskfs(OE) lustre(OE) lmv(OE) mdc(OE) osc(OE) lov(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) dell_rbu sunrpc vfat fat dm_round_robin amd64_edac_mod edac_mce_amd kvm_amd kvm ses irqbypass crc32_pclmul enclosure ghash_clmulni_intel dcdbas aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ipmi_si pcspkr ipmi_devintf ccp i2c_piix4 dm_multipath sg k10temp ipmi_msghandler dm_mod acpi_power_meter knem(OE) ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif May 01 23:43:15 fir-md1-s1 kernel: crct10dif_generic mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) i2c_algo_bit drm_kms_helper mlx5_core(OE) syscopyarea sysfillrect sysimgblt fb_sys_fops mlxfw(OE) crct10dif_pclmul ttm devlink ahci crct10dif_common libahci drm mlx_compat(OE) tg3 crc32c_intel libata megaraid_sas drm_panel_orientation_quirks ptp pps_core mpt3sas(OE) raid_class scsi_transport_sas [last unloaded: libcfs] May 01 23:43:15 fir-md1-s1 kernel: CPU: 36 PID: 103023 Comm: mdt_io00_043 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:43:15 fir-md1-s1 kernel: Hardware name: Dell Inc. 
PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:43:15 fir-md1-s1 kernel: task: ffff98379d684100 ti: ffff9837162a0000 task.ti: ffff9837162a0000 May 01 23:43:15 fir-md1-s1 kernel: RIP: 0010:[] [] native_queued_spin_lock_slowpath+0x122/0x200 May 01 23:43:15 fir-md1-s1 kernel: RSP: 0018:ffff9837162a38e8 EFLAGS: 00000246 May 01 23:43:15 fir-md1-s1 kernel: RAX: 0000000000000000 RBX: ffff9837d662db78 RCX: 0000000001210000 May 01 23:43:15 fir-md1-s1 kernel: RDX: ffff983cff6db780 RSI: 0000000000690101 RDI: ffff982c9fc8c480 May 01 23:43:15 fir-md1-s1 kernel: RBP: ffff9837162a38e8 R08: ffff982cff05b780 R09: 0000000000000000 May 01 23:43:15 fir-md1-s1 kernel: R10: 0000000000000000 R11: ffff985c8b319038 R12: 0000000000000000 May 01 23:43:15 fir-md1-s1 kernel: R13: 0000000000000000 R14: 00ff981e3fd3ec00 R15: ffff984cf34000a0 May 01 23:43:15 fir-md1-s1 kernel: FS: 00007f67c209f740(0000) GS:ffff982cff040000(0000) knlGS:0000000000000000 May 01 23:43:15 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:43:15 fir-md1-s1 kernel: CR2: 00007f67c0de780d CR3: 00000012f7610000 CR4: 00000000003407e0 May 01 23:43:15 fir-md1-s1 kernel: Call Trace: May 01 23:43:15 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf May 01 23:43:15 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30 May 01 23:43:15 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs] May 01 23:43:15 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x210/0x700 [ldiskfs] May 01 23:43:15 fir-md1-s1 kernel: [] ? ktime_get_ts64+0x52/0xf0 May 01 23:43:15 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs] May 01 23:43:15 fir-md1-s1 kernel: [] osd_write_prep+0x2b6/0x360 [osd_ldiskfs] May 01 23:43:15 fir-md1-s1 kernel: [] mdt_obd_preprw+0x637/0x1060 [mdt] May 01 23:43:15 fir-md1-s1 kernel: [] tgt_brw_write+0xc7e/0x1a90 [ptlrpc] May 01 23:43:15 fir-md1-s1 kernel: [] ? load_balance+0x178/0x9a0 May 01 23:43:15 fir-md1-s1 kernel: [] ? 
__enqueue_entity+0x78/0x80 May 01 23:43:15 fir-md1-s1 kernel: [] ? tgt_lookup_reply+0x2d/0x190 [ptlrpc] May 01 23:43:15 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:43:15 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:43:15 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:43:15 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:43:15 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:43:15 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:43:15 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:43:15 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:43:15 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:43:15 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:43:15 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:15 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:43:15 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:15 fir-md1-s1 kernel: Code: 13 48 c1 ea 0d 48 98 83 e2 30 48 81 c2 80 b7 01 00 48 03 14 c5 60 b9 b4 b7 4c 89 02 41 8b 40 08 85 c0 75 0f 0f 1f 44 00 00 f3 90 <41> 8b 40 08 85 c0 74 f6 4d 8b 08 4d 85 c9 74 04 41 0f 18 09 8b May 01 23:43:15 fir-md1-s1 kernel: [] tgt_brw_write+0xc7e/0x1a90 [ptlrpc] May 01 23:43:15 fir-md1-s1 kernel: [] ? lustre_msg_buf_v2+0x1b0/0x1b0 [ptlrpc] May 01 23:43:15 fir-md1-s1 kernel: [] ? lustre_msg_buf+0x17/0x60 [ptlrpc] May 01 23:43:15 fir-md1-s1 kernel: [] ? update_curr+0x14c/0x1e0 May 01 23:43:15 fir-md1-s1 kernel: [] ? account_entity_dequeue+0xae/0xd0 May 01 23:43:15 fir-md1-s1 kernel: [] ? tgt_lookup_reply+0x2d/0x190 [ptlrpc] May 01 23:43:15 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:43:15 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:43:15 fir-md1-s1 kernel: [] ? 
ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:43:15 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:43:15 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:43:15 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:43:15 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:43:15 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:43:15 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:43:15 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:43:15 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:15 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:43:15 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:15 fir-md1-s1 kernel: Code: 13 48 c1 ea 0d 48 98 83 e2 30 48 81 c2 80 b7 01 00 48 03 14 c5 60 b9 b4 b7 4c 89 02 41 8b 40 08 85 c0 75 0f 0f 1f 44 00 00 f3 90 <41> 8b 40 08 85 c0 74 f6 4d 8b 08 4d 85 c9 74 04 41 0f 18 09 8b May 01 23:43:18 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#26 stuck for 23s! 
[mdt_io02_043:103027] May 01 23:43:18 fir-md1-s1 kernel: Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) ldiskfs(OE) lustre(OE) lmv(OE) mdc(OE) osc(OE) lov(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) dell_rbu sunrpc vfat fat dm_round_robin amd64_edac_mod edac_mce_amd kvm_amd kvm ses irqbypass crc32_pclmul enclosure ghash_clmulni_intel dcdbas aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ipmi_si pcspkr ipmi_devintf ccp i2c_piix4 dm_multipath sg k10temp ipmi_msghandler dm_mod acpi_power_meter knem(OE) ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif May 01 23:43:18 fir-md1-s1 kernel: crct10dif_generic mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) i2c_algo_bit drm_kms_helper mlx5_core(OE) syscopyarea sysfillrect sysimgblt fb_sys_fops mlxfw(OE) crct10dif_pclmul ttm devlink ahci crct10dif_common libahci drm mlx_compat(OE) tg3 crc32c_intel libata megaraid_sas drm_panel_orientation_quirks ptp pps_core mpt3sas(OE) raid_class scsi_transport_sas [last unloaded: libcfs] May 01 23:43:18 fir-md1-s1 kernel: CPU: 26 PID: 103027 Comm: mdt_io02_043 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:43:18 fir-md1-s1 kernel: Hardware name: Dell Inc. 
PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:43:18 fir-md1-s1 kernel: task: ffff982c812730c0 ti: ffff983a5ba80000 task.ti: ffff983a5ba80000 May 01 23:43:18 fir-md1-s1 kernel: RIP: 0010:[] [] native_queued_spin_lock_slowpath+0x122/0x200 May 01 23:43:18 fir-md1-s1 kernel: RSP: 0018:ffff983a5ba83800 EFLAGS: 00000246 May 01 23:43:18 fir-md1-s1 kernel: RAX: 0000000000000000 RBX: ffff983165526b60 RCX: 0000000000d10000 May 01 23:43:18 fir-md1-s1 kernel: RDX: ffff982cfeedb780 RSI: 0000000000610101 RDI: ffff982c9fc8c480 May 01 23:43:18 fir-md1-s1 kernel: RBP: ffff983a5ba83800 R08: ffff984cff79b780 R09: 0000000000000000 May 01 23:43:18 fir-md1-s1 kernel: R10: ffff984cff79f140 R11: ffffde3f6b96dc00 R12: 0000000000000000 May 01 23:43:18 fir-md1-s1 kernel: R13: ffff983a5ba837a0 R14: ffff9831655268d0 R15: 0000000000000000 May 01 23:43:18 fir-md1-s1 kernel: FS: 00007fa424097780(0000) GS:ffff984cff780000(0000) knlGS:0000000000000000 May 01 23:43:18 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:43:18 fir-md1-s1 kernel: Lustre: fir-MDT0000-lwp-MDT0002: Connection to fir-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete May 01 23:43:18 fir-md1-s1 kernel: Lustre: fir-MDT0000: Received new LWP connection from 0@lo, removing former export from same NID May 01 23:43:18 fir-md1-s1 kernel: CR2: 00007fa4240a8000 CR3: 00000012f7610000 CR4: 00000000003407e0 May 01 23:43:18 fir-md1-s1 kernel: Call Trace: May 01 23:43:18 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf May 01 23:43:18 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30 May 01 23:43:18 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs] May 01 23:43:18 fir-md1-s1 kernel: [] ldiskfs_ext_map_blocks+0x7b5/0xf60 [ldiskfs] May 01 23:43:18 fir-md1-s1 kernel: [] ? ktime_get+0x52/0xe0 May 01 23:43:18 fir-md1-s1 kernel: [] ? 
kiblnd_check_sends_locked+0xa72/0xe40 [ko2iblnd] May 01 23:43:18 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x98/0x700 [ldiskfs] May 01 23:43:18 fir-md1-s1 kernel: [] ? ktime_get_ts64+0x52/0xf0 May 01 23:43:18 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs] May 01 23:43:18 fir-md1-s1 kernel: [] osd_write_prep+0x2b6/0x360 [osd_ldiskfs] May 01 23:43:18 fir-md1-s1 kernel: [] mdt_obd_preprw+0x637/0x1060 [mdt] May 01 23:43:18 fir-md1-s1 kernel: [] tgt_brw_write+0xc7e/0x1a90 [ptlrpc] May 01 23:43:18 fir-md1-s1 kernel: [] ? lustre_msg_buf_v2+0x1b0/0x1b0 [ptlrpc] May 01 23:43:18 fir-md1-s1 kernel: [] ? lustre_msg_buf+0x17/0x60 [ptlrpc] May 01 23:43:18 fir-md1-s1 kernel: [] ? update_curr+0x14c/0x1e0 May 01 23:43:18 fir-md1-s1 kernel: [] ? account_entity_dequeue+0xae/0xd0 May 01 23:43:18 fir-md1-s1 kernel: [] ? __enqueue_entity+0x78/0x80 May 01 23:43:18 fir-md1-s1 kernel: [] ? tgt_lookup_reply+0x2d/0x190 [ptlrpc] May 01 23:43:18 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:43:18 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:43:18 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:43:18 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:43:18 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:43:18 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:43:18 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:43:18 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:43:18 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:43:18 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:43:18 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:18 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:43:18 fir-md1-s1 kernel: [] ? 
insert_kthread_work+0x40/0x40 May 01 23:43:18 fir-md1-s1 kernel: Code: 13 48 c1 ea 0d 48 98 83 e2 30 48 81 c2 80 b7 01 00 48 03 14 c5 60 b9 b4 b7 4c 89 02 41 8b 40 08 85 c0 75 0f 0f 1f 44 00 00 f3 90 <41> 8b 40 08 85 c0 74 f6 4d 8b 08 4d 85 c9 74 04 41 0f 18 09 8b May 01 23:43:19 fir-md1-s1 kernel: Lustre: 103137:0:(service.c:2011:ptlrpc_server_handle_req_in()) @@@ Slow req_in handling 29s req@ffff984763ab9b00 x1631768250398768/t0(0) o35->1029f32e-c536-81b0-6441-16ee4f005637@10.8.22.9@o2ib6:0/0 lens 392/0 e 0 to 0 dl 0 ref 1 fl New:/0/ffffffff rc 0/-1 May 01 23:43:19 fir-md1-s1 kernel: Lustre: mdt_readpage: This server is not able to keep up with request traffic (cpu-bound). May 01 23:43:19 fir-md1-s1 kernel: Lustre: 103137:0:(service.c:1541:ptlrpc_at_check_timed()) earlyQ=49 reqQ=448 recA=59, svcEst=20, delay=34252 May 01 23:43:19 fir-md1-s1 kernel: Lustre: 103137:0:(service.c:1322:ptlrpc_at_send_early_reply()) @@@ Already past deadline (-5s), not sending early reply. Consider increasing at_early_margin (5)? req@ffff984bf8a23c50 x1631604127569888/t0(0) o101->fir-MDT0000-lwp-OST001b_UUID@10.0.10.106@o2ib7:14/0 lens 456/0 e 0 to 0 dl 1556779394 ref 2 fl New:/0/ffffffff rc 0/-1 May 01 23:43:23 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! 
[mdt_io02_001:101733] May 01 23:43:23 fir-md1-s1 kernel: Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) ldiskfs(OE) lustre(OE) lmv(OE) mdc(OE) osc(OE) lov(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) dell_rbu sunrpc vfat fat dm_round_robin amd64_edac_mod edac_mce_amd kvm_amd kvm ses irqbypass crc32_pclmul enclosure ghash_clmulni_intel dcdbas aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ipmi_si pcspkr ipmi_devintf ccp i2c_piix4 dm_multipath sg k10temp ipmi_msghandler dm_mod acpi_power_meter knem(OE) ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif May 01 23:43:23 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#15 stuck for 22s! [mdt_io03_043:103125] May 01 23:43:23 fir-md1-s1 kernel: crct10dif_generic mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) i2c_algo_bit drm_kms_helper mlx5_core(OE) syscopyarea sysfillrect sysimgblt fb_sys_fops mlxfw(OE) crct10dif_pclmul ttm devlink ahci crct10dif_common libahci drm mlx_compat(OE) tg3 crc32c_intel libata megaraid_sas drm_panel_orientation_quirks ptp pps_core mpt3sas(OE) raid_class scsi_transport_sas May 01 23:43:23 fir-md1-s1 kernel: Modules linked in: May 01 23:43:23 fir-md1-s1 kernel: osp(OE) May 01 23:43:23 fir-md1-s1 kernel: mdd(OE) May 01 23:43:23 fir-md1-s1 kernel: lod(OE) May 01 23:43:23 fir-md1-s1 kernel: mdt(OE) May 01 23:43:23 fir-md1-s1 kernel: lfsck(OE) May 01 23:43:23 fir-md1-s1 kernel: mgs(OE) May 01 23:43:23 fir-md1-s1 kernel: mgc(OE) May 01 23:43:23 fir-md1-s1 kernel: osd_ldiskfs(OE) May 01 23:43:23 fir-md1-s1 kernel: lquota(OE) May 01 23:43:23 fir-md1-s1 kernel: ldiskfs(OE) May 01 23:43:23 fir-md1-s1 kernel: lustre(OE) May 01 23:43:23 fir-md1-s1 kernel: lmv(OE) May 01 23:43:23 fir-md1-s1 kernel: mdc(OE) May 01 
23:43:23 fir-md1-s1 kernel: osc(OE) May 01 23:43:23 fir-md1-s1 kernel: lov(OE) May 01 23:43:23 fir-md1-s1 kernel: fid(OE) May 01 23:43:23 fir-md1-s1 kernel: fld(OE) May 01 23:43:23 fir-md1-s1 kernel: ko2iblnd(OE) May 01 23:43:23 fir-md1-s1 kernel: ptlrpc(OE) May 01 23:43:23 fir-md1-s1 kernel: obdclass(OE) May 01 23:43:23 fir-md1-s1 kernel: lnet(OE) May 01 23:43:23 fir-md1-s1 kernel: libcfs(OE) May 01 23:43:23 fir-md1-s1 kernel: rpcsec_gss_krb5 May 01 23:43:23 fir-md1-s1 kernel: auth_rpcgss May 01 23:43:23 fir-md1-s1 kernel: nfsv4 May 01 23:43:23 fir-md1-s1 kernel: dns_resolver May 01 23:43:23 fir-md1-s1 kernel: nfs May 01 23:43:23 fir-md1-s1 kernel: lockd May 01 23:43:23 fir-md1-s1 kernel: grace May 01 23:43:23 fir-md1-s1 kernel: fscache May 01 23:43:23 fir-md1-s1 kernel: rdma_ucm(OE) May 01 23:43:23 fir-md1-s1 kernel: ib_ucm(OE) May 01 23:43:23 fir-md1-s1 kernel: rdma_cm(OE) May 01 23:43:23 fir-md1-s1 kernel: iw_cm(OE) May 01 23:43:23 fir-md1-s1 kernel: ib_ipoib(OE) May 01 23:43:23 fir-md1-s1 kernel: ib_cm(OE) May 01 23:43:23 fir-md1-s1 kernel: ib_umad(OE) May 01 23:43:23 fir-md1-s1 kernel: mlx5_fpga_tools(OE) May 01 23:43:23 fir-md1-s1 kernel: mlx4_en(OE) May 01 23:43:23 fir-md1-s1 kernel: mlx4_ib(OE) May 01 23:43:23 fir-md1-s1 kernel: mlx4_core(OE) May 01 23:43:23 fir-md1-s1 kernel: dell_rbu May 01 23:43:23 fir-md1-s1 kernel: sunrpc May 01 23:43:23 fir-md1-s1 kernel: vfat May 01 23:43:23 fir-md1-s1 kernel: fat May 01 23:43:23 fir-md1-s1 kernel: dm_round_robin May 01 23:43:23 fir-md1-s1 kernel: amd64_edac_mod May 01 23:43:23 fir-md1-s1 kernel: edac_mce_amd May 01 23:43:23 fir-md1-s1 kernel: kvm_amd May 01 23:43:23 fir-md1-s1 kernel: kvm May 01 23:43:23 fir-md1-s1 kernel: ses May 01 23:43:23 fir-md1-s1 kernel: irqbypass May 01 23:43:23 fir-md1-s1 kernel: crc32_pclmul May 01 23:43:23 fir-md1-s1 kernel: enclosure May 01 23:43:23 fir-md1-s1 kernel: ghash_clmulni_intel May 01 23:43:23 fir-md1-s1 kernel: dcdbas May 01 23:43:23 fir-md1-s1 kernel: aesni_intel May 01 
23:43:23 fir-md1-s1 kernel: lrw May 01 23:43:23 fir-md1-s1 kernel: gf128mul May 01 23:43:23 fir-md1-s1 kernel: glue_helper May 01 23:43:23 fir-md1-s1 kernel: ablk_helper May 01 23:43:23 fir-md1-s1 kernel: cryptd May 01 23:43:23 fir-md1-s1 kernel: ipmi_si May 01 23:43:23 fir-md1-s1 kernel: pcspkr May 01 23:43:23 fir-md1-s1 kernel: ipmi_devintf May 01 23:43:23 fir-md1-s1 kernel: ccp May 01 23:43:23 fir-md1-s1 kernel: i2c_piix4 May 01 23:43:23 fir-md1-s1 kernel: dm_multipath May 01 23:43:23 fir-md1-s1 kernel: sg May 01 23:43:23 fir-md1-s1 kernel: k10temp May 01 23:43:23 fir-md1-s1 kernel: ipmi_msghandler May 01 23:43:23 fir-md1-s1 kernel: dm_mod May 01 23:43:23 fir-md1-s1 kernel: acpi_power_meter May 01 23:43:23 fir-md1-s1 kernel: knem(OE) May 01 23:43:23 fir-md1-s1 kernel: ip_tables May 01 23:43:23 fir-md1-s1 kernel: ext4 May 01 23:43:23 fir-md1-s1 kernel: mbcache May 01 23:43:23 fir-md1-s1 kernel: jbd2 May 01 23:43:23 fir-md1-s1 kernel: sd_mod May 01 23:43:23 fir-md1-s1 kernel: crc_t10dif May 01 23:43:23 fir-md1-s1 kernel: crct10dif_generic May 01 23:43:23 fir-md1-s1 kernel: mlx5_ib(OE) May 01 23:43:23 fir-md1-s1 kernel: ib_uverbs(OE) May 01 23:43:23 fir-md1-s1 kernel: ib_core(OE) May 01 23:43:23 fir-md1-s1 kernel: i2c_algo_bit May 01 23:43:23 fir-md1-s1 kernel: drm_kms_helper May 01 23:43:23 fir-md1-s1 kernel: mlx5_core(OE) May 01 23:43:23 fir-md1-s1 kernel: syscopyarea May 01 23:43:23 fir-md1-s1 kernel: sysfillrect May 01 23:43:23 fir-md1-s1 kernel: sysimgblt May 01 23:43:23 fir-md1-s1 kernel: fb_sys_fops May 01 23:43:23 fir-md1-s1 kernel: mlxfw(OE) May 01 23:43:23 fir-md1-s1 kernel: crct10dif_pclmul May 01 23:43:23 fir-md1-s1 kernel: ttm May 01 23:43:23 fir-md1-s1 kernel: devlink May 01 23:43:23 fir-md1-s1 kernel: ahci May 01 23:43:23 fir-md1-s1 kernel: crct10dif_common May 01 23:43:23 fir-md1-s1 kernel: libahci May 01 23:43:23 fir-md1-s1 kernel: drm May 01 23:43:23 fir-md1-s1 kernel: mlx_compat(OE) May 01 23:43:23 fir-md1-s1 kernel: tg3 May 01 23:43:23 
fir-md1-s1 kernel: crc32c_intel May 01 23:43:23 fir-md1-s1 kernel: libata May 01 23:43:23 fir-md1-s1 kernel: megaraid_sas May 01 23:43:23 fir-md1-s1 kernel: drm_panel_orientation_quirks May 01 23:43:23 fir-md1-s1 kernel: ptp May 01 23:43:23 fir-md1-s1 kernel: pps_core May 01 23:43:23 fir-md1-s1 kernel: mpt3sas(OE) May 01 23:43:23 fir-md1-s1 kernel: raid_class May 01 23:43:23 fir-md1-s1 kernel: scsi_transport_sas May 01 23:43:23 fir-md1-s1 kernel: [last unloaded: libcfs] May 01 23:43:23 fir-md1-s1 kernel: May 01 23:43:23 fir-md1-s1 kernel: CPU: 15 PID: 103125 Comm: mdt_io03_043 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:43:23 fir-md1-s1 kernel: Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:43:23 fir-md1-s1 kernel: task: ffff985912f64100 ti: ffff9858407d0000 task.ti: ffff9858407d0000 May 01 23:43:23 fir-md1-s1 kernel: RIP: 0010:[] May 01 23:43:23 fir-md1-s1 kernel: [] native_queued_spin_lock_slowpath+0x128/0x200 May 01 23:43:23 fir-md1-s1 kernel: RSP: 0018:ffff9858407d38e8 EFLAGS: 00000246 May 01 23:43:23 fir-md1-s1 kernel: RAX: 0000000000000000 RBX: ffff984851f7a378 RCX: 0000000000790000 May 01 23:43:23 fir-md1-s1 kernel: RDX: ffff982cff01b780 RSI: 0000000001010101 RDI: ffff982c9fc8c480 May 01 23:43:23 fir-md1-s1 kernel: RBP: ffff9858407d38e8 R08: ffff985d3f4db780 R09: 0000000000000000 May 01 23:43:23 fir-md1-s1 kernel: R10: 0000000000000000 R11: ffff985c8b319038 R12: ffff985d3f55ac00 May 01 23:43:23 fir-md1-s1 kernel: R13: ffff985912f64168 R14: 00ff9858407d3850 R15: ffff984cf34000a0 May 01 23:43:23 fir-md1-s1 kernel: FS: 00007f63c1c68740(0000) GS:ffff985d3f4c0000(0000) knlGS:0000000000000000 May 01 23:43:23 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:43:23 fir-md1-s1 kernel: CR2: 00007ff884d9fd1c CR3: 00000012f7610000 CR4: 00000000003407e0 May 01 23:43:23 fir-md1-s1 kernel: Call Trace: May 01 23:43:23 fir-md1-s1 kernel: [] 
queued_spin_lock_slowpath+0xb/0xf May 01 23:43:23 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30 May 01 23:43:23 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs] May 01 23:43:23 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x210/0x700 [ldiskfs] May 01 23:43:23 fir-md1-s1 kernel: [] ? ktime_get_ts64+0x52/0xf0 May 01 23:43:23 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs] May 01 23:43:23 fir-md1-s1 kernel: [] osd_write_prep+0x2b6/0x360 [osd_ldiskfs] May 01 23:43:23 fir-md1-s1 kernel: [] mdt_obd_preprw+0x637/0x1060 [mdt] May 01 23:43:23 fir-md1-s1 kernel: [] tgt_brw_write+0xc7e/0x1a90 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? load_balance+0x178/0x9a0 May 01 23:43:23 fir-md1-s1 kernel: [] ? update_curr+0x14c/0x1e0 May 01 23:43:23 fir-md1-s1 kernel: [] ? account_entity_dequeue+0xae/0xd0 May 01 23:43:23 fir-md1-s1 kernel: [] ? __enqueue_entity+0x78/0x80 May 01 23:43:23 fir-md1-s1 kernel: [] ? tgt_lookup_reply+0x2d/0x190 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:43:23 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:43:23 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:43:23 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:43:23 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:23 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:43:23 fir-md1-s1 kernel: [] ? 
insert_kthread_work+0x40/0x40 May 01 23:43:23 fir-md1-s1 kernel: Code: May 01 23:43:23 fir-md1-s1 kernel: 98 May 01 23:43:23 fir-md1-s1 kernel: 83 May 01 23:43:23 fir-md1-s1 kernel: e2 May 01 23:43:23 fir-md1-s1 kernel: 30 May 01 23:43:23 fir-md1-s1 kernel: 48 May 01 23:43:23 fir-md1-s1 kernel: 81 May 01 23:43:23 fir-md1-s1 kernel: c2 May 01 23:43:23 fir-md1-s1 kernel: 80 May 01 23:43:23 fir-md1-s1 kernel: b7 May 01 23:43:23 fir-md1-s1 kernel: 01 May 01 23:43:23 fir-md1-s1 kernel: 00 May 01 23:43:23 fir-md1-s1 kernel: 48 May 01 23:43:23 fir-md1-s1 kernel: 03 May 01 23:43:23 fir-md1-s1 kernel: 14 May 01 23:43:23 fir-md1-s1 kernel: c5 May 01 23:43:23 fir-md1-s1 kernel: 60 May 01 23:43:23 fir-md1-s1 kernel: b9 May 01 23:43:23 fir-md1-s1 kernel: b4 May 01 23:43:23 fir-md1-s1 kernel: b7 May 01 23:43:23 fir-md1-s1 kernel: 4c May 01 23:43:23 fir-md1-s1 kernel: 89 May 01 23:43:23 fir-md1-s1 kernel: 02 May 01 23:43:23 fir-md1-s1 kernel: 41 May 01 23:43:23 fir-md1-s1 kernel: 8b May 01 23:43:23 fir-md1-s1 kernel: 40 May 01 23:43:23 fir-md1-s1 kernel: 08 May 01 23:43:23 fir-md1-s1 kernel: 85 May 01 23:43:23 fir-md1-s1 kernel: c0 May 01 23:43:23 fir-md1-s1 kernel: 75 May 01 23:43:23 fir-md1-s1 kernel: 0f May 01 23:43:23 fir-md1-s1 kernel: 0f May 01 23:43:23 fir-md1-s1 kernel: 1f May 01 23:43:23 fir-md1-s1 kernel: 44 May 01 23:43:23 fir-md1-s1 kernel: 00 May 01 23:43:23 fir-md1-s1 kernel: 00 May 01 23:43:23 fir-md1-s1 kernel: f3 May 01 23:43:23 fir-md1-s1 kernel: 90 May 01 23:43:23 fir-md1-s1 kernel: 41 May 01 23:43:23 fir-md1-s1 kernel: 8b May 01 23:43:23 fir-md1-s1 kernel: 40 May 01 23:43:23 fir-md1-s1 kernel: 08 May 01 23:43:23 fir-md1-s1 kernel: 85 May 01 23:43:23 fir-md1-s1 kernel: c0 May 01 23:43:23 fir-md1-s1 kernel: <74> May 01 23:43:23 fir-md1-s1 kernel: f6 May 01 23:43:23 fir-md1-s1 kernel: 4d May 01 23:43:23 fir-md1-s1 kernel: 8b May 01 23:43:23 fir-md1-s1 kernel: 08 May 01 23:43:23 fir-md1-s1 kernel: 4d May 01 23:43:23 fir-md1-s1 kernel: 85 May 01 23:43:23 fir-md1-s1 
kernel: c9 May 01 23:43:23 fir-md1-s1 kernel: 74 May 01 23:43:23 fir-md1-s1 kernel: 04 May 01 23:43:23 fir-md1-s1 kernel: 41 May 01 23:43:23 fir-md1-s1 kernel: 0f May 01 23:43:23 fir-md1-s1 kernel: 18 May 01 23:43:23 fir-md1-s1 kernel: 09 May 01 23:43:23 fir-md1-s1 kernel: 8b May 01 23:43:23 fir-md1-s1 kernel: 17 May 01 23:43:23 fir-md1-s1 kernel: 0f May 01 23:43:23 fir-md1-s1 kernel: b7 May 01 23:43:23 fir-md1-s1 kernel: c2 May 01 23:43:23 fir-md1-s1 kernel: 85 May 01 23:43:23 fir-md1-s1 kernel: c0 May 01 23:43:23 fir-md1-s1 kernel: May 01 23:43:23 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#21 stuck for 23s! [mdt_io01_085:103094] May 01 23:43:23 fir-md1-s1 kernel: Modules linked in: May 01 23:43:23 fir-md1-s1 kernel: osp(OE) May 01 23:43:23 fir-md1-s1 kernel: mdd(OE) May 01 23:43:23 fir-md1-s1 kernel: lod(OE) May 01 23:43:23 fir-md1-s1 kernel: mdt(OE) May 01 23:43:23 fir-md1-s1 kernel: lfsck(OE) May 01 23:43:23 fir-md1-s1 kernel: mgs(OE) May 01 23:43:23 fir-md1-s1 kernel: mgc(OE) May 01 23:43:23 fir-md1-s1 kernel: osd_ldiskfs(OE) May 01 23:43:23 fir-md1-s1 kernel: lquota(OE) May 01 23:43:23 fir-md1-s1 kernel: ldiskfs(OE) May 01 23:43:23 fir-md1-s1 kernel: lustre(OE) May 01 23:43:23 fir-md1-s1 kernel: lmv(OE) May 01 23:43:23 fir-md1-s1 kernel: mdc(OE) May 01 23:43:23 fir-md1-s1 kernel: osc(OE) May 01 23:43:23 fir-md1-s1 kernel: lov(OE) May 01 23:43:23 fir-md1-s1 kernel: fid(OE) May 01 23:43:23 fir-md1-s1 kernel: fld(OE) May 01 23:43:23 fir-md1-s1 kernel: ko2iblnd(OE) May 01 23:43:23 fir-md1-s1 kernel: ptlrpc(OE) May 01 23:43:23 fir-md1-s1 kernel: obdclass(OE) May 01 23:43:23 fir-md1-s1 kernel: lnet(OE) May 01 23:43:23 fir-md1-s1 kernel: libcfs(OE) May 01 23:43:23 fir-md1-s1 kernel: rpcsec_gss_krb5 May 01 23:43:23 fir-md1-s1 kernel: auth_rpcgss May 01 23:43:23 fir-md1-s1 kernel: nfsv4 May 01 23:43:23 fir-md1-s1 kernel: dns_resolver May 01 23:43:23 fir-md1-s1 kernel: nfs May 01 23:43:23 fir-md1-s1 kernel: lockd May 01 23:43:23 fir-md1-s1 kernel: grace 
May 01 23:43:23 fir-md1-s1 kernel: fscache May 01 23:43:23 fir-md1-s1 kernel: rdma_ucm(OE) May 01 23:43:23 fir-md1-s1 kernel: ib_ucm(OE) May 01 23:43:23 fir-md1-s1 kernel: rdma_cm(OE) May 01 23:43:23 fir-md1-s1 kernel: iw_cm(OE) May 01 23:43:23 fir-md1-s1 kernel: ib_ipoib(OE) May 01 23:43:23 fir-md1-s1 kernel: ib_cm(OE) May 01 23:43:23 fir-md1-s1 kernel: ib_umad(OE) May 01 23:43:23 fir-md1-s1 kernel: mlx5_fpga_tools(OE) May 01 23:43:23 fir-md1-s1 kernel: mlx4_en(OE) May 01 23:43:23 fir-md1-s1 kernel: mlx4_ib(OE) May 01 23:43:23 fir-md1-s1 kernel: mlx4_core(OE) May 01 23:43:23 fir-md1-s1 kernel: dell_rbu May 01 23:43:23 fir-md1-s1 kernel: sunrpc May 01 23:43:23 fir-md1-s1 kernel: vfat May 01 23:43:23 fir-md1-s1 kernel: fat May 01 23:43:23 fir-md1-s1 kernel: dm_round_robin May 01 23:43:23 fir-md1-s1 kernel: amd64_edac_mod May 01 23:43:23 fir-md1-s1 kernel: edac_mce_amd May 01 23:43:23 fir-md1-s1 kernel: kvm_amd May 01 23:43:23 fir-md1-s1 kernel: kvm May 01 23:43:23 fir-md1-s1 kernel: ses May 01 23:43:23 fir-md1-s1 kernel: irqbypass May 01 23:43:23 fir-md1-s1 kernel: crc32_pclmul May 01 23:43:23 fir-md1-s1 kernel: enclosure May 01 23:43:23 fir-md1-s1 kernel: ghash_clmulni_intel May 01 23:43:23 fir-md1-s1 kernel: dcdbas May 01 23:43:23 fir-md1-s1 kernel: aesni_intel May 01 23:43:23 fir-md1-s1 kernel: lrw May 01 23:43:23 fir-md1-s1 kernel: gf128mul May 01 23:43:23 fir-md1-s1 kernel: glue_helper May 01 23:43:23 fir-md1-s1 kernel: ablk_helper May 01 23:43:23 fir-md1-s1 kernel: cryptd May 01 23:43:23 fir-md1-s1 kernel: ipmi_si May 01 23:43:23 fir-md1-s1 kernel: pcspkr May 01 23:43:23 fir-md1-s1 kernel: ipmi_devintf May 01 23:43:23 fir-md1-s1 kernel: ccp May 01 23:43:23 fir-md1-s1 kernel: i2c_piix4 May 01 23:43:23 fir-md1-s1 kernel: dm_multipath May 01 23:43:23 fir-md1-s1 kernel: sg May 01 23:43:23 fir-md1-s1 kernel: k10temp May 01 23:43:23 fir-md1-s1 kernel: ipmi_msghandler May 01 23:43:23 fir-md1-s1 kernel: dm_mod May 01 23:43:23 fir-md1-s1 kernel: acpi_power_meter May 01 
23:43:23 fir-md1-s1 kernel: knem(OE) May 01 23:43:23 fir-md1-s1 kernel: ip_tables May 01 23:43:23 fir-md1-s1 kernel: ext4 May 01 23:43:23 fir-md1-s1 kernel: mbcache May 01 23:43:23 fir-md1-s1 kernel: jbd2 May 01 23:43:23 fir-md1-s1 kernel: sd_mod May 01 23:43:23 fir-md1-s1 kernel: crc_t10dif May 01 23:43:23 fir-md1-s1 kernel: crct10dif_generic May 01 23:43:23 fir-md1-s1 kernel: mlx5_ib(OE) May 01 23:43:23 fir-md1-s1 kernel: ib_uverbs(OE) May 01 23:43:23 fir-md1-s1 kernel: ib_core(OE) May 01 23:43:23 fir-md1-s1 kernel: i2c_algo_bit May 01 23:43:23 fir-md1-s1 kernel: drm_kms_helper May 01 23:43:23 fir-md1-s1 kernel: mlx5_core(OE) May 01 23:43:23 fir-md1-s1 kernel: syscopyarea May 01 23:43:23 fir-md1-s1 kernel: sysfillrect May 01 23:43:23 fir-md1-s1 kernel: sysimgblt May 01 23:43:23 fir-md1-s1 kernel: fb_sys_fops May 01 23:43:23 fir-md1-s1 kernel: mlxfw(OE) May 01 23:43:23 fir-md1-s1 kernel: crct10dif_pclmul May 01 23:43:23 fir-md1-s1 kernel: ttm May 01 23:43:23 fir-md1-s1 kernel: devlink May 01 23:43:23 fir-md1-s1 kernel: ahci May 01 23:43:23 fir-md1-s1 kernel: crct10dif_common May 01 23:43:23 fir-md1-s1 kernel: libahci May 01 23:43:23 fir-md1-s1 kernel: drm May 01 23:43:23 fir-md1-s1 kernel: mlx_compat(OE) May 01 23:43:23 fir-md1-s1 kernel: tg3 May 01 23:43:23 fir-md1-s1 kernel: crc32c_intel May 01 23:43:23 fir-md1-s1 kernel: libata May 01 23:43:23 fir-md1-s1 kernel: megaraid_sas May 01 23:43:23 fir-md1-s1 kernel: drm_panel_orientation_quirks May 01 23:43:23 fir-md1-s1 kernel: ptp May 01 23:43:23 fir-md1-s1 kernel: pps_core May 01 23:43:23 fir-md1-s1 kernel: mpt3sas(OE) May 01 23:43:23 fir-md1-s1 kernel: raid_class May 01 23:43:23 fir-md1-s1 kernel: scsi_transport_sas May 01 23:43:23 fir-md1-s1 kernel: [last unloaded: libcfs] May 01 23:43:23 fir-md1-s1 kernel: May 01 23:43:23 fir-md1-s1 kernel: CPU: 21 PID: 103094 Comm: mdt_io01_085 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:43:23 fir-md1-s1 kernel: Hardware name: Dell 
Inc. PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:43:23 fir-md1-s1 kernel: task: ffff982c9ff3c100 ti: ffff98596d728000 task.ti: ffff98596d728000 May 01 23:43:23 fir-md1-s1 kernel: RIP: 0010:[] May 01 23:43:23 fir-md1-s1 kernel: [] native_queued_spin_lock_slowpath+0x122/0x200 May 01 23:43:23 fir-md1-s1 kernel: RSP: 0018:ffff98596d72b8e8 EFLAGS: 00000246 May 01 23:43:23 fir-md1-s1 kernel: RAX: 0000000000000000 RBX: ffff98267ca42b78 RCX: 0000000000a90000 May 01 23:43:23 fir-md1-s1 kernel: RDX: ffff984cff79b780 RSI: 0000000000d10101 RDI: ffff982c9fc8c480 May 01 23:43:23 fir-md1-s1 kernel: RBP: ffff98596d72b8e8 R08: ffff983cff75b780 R09: 0000000000000000 May 01 23:43:23 fir-md1-s1 kernel: R10: 0000000000000000 R11: ffff985c8b319038 R12: ffff983cff65ac00 May 01 23:43:23 fir-md1-s1 kernel: R13: ffff982c9ff3c168 R14: 00ffffffb7a08d80 R15: ffff984cf34000a0 May 01 23:43:23 fir-md1-s1 kernel: FS: 00007fad68bc1880(0000) GS:ffff983cff740000(0000) knlGS:0000000000000000 May 01 23:43:23 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:43:23 fir-md1-s1 kernel: CR2: 00007ffcc3425f98 CR3: 00000012f7610000 CR4: 00000000003407e0 May 01 23:43:23 fir-md1-s1 kernel: Call Trace: May 01 23:43:23 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf May 01 23:43:23 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30 May 01 23:43:23 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs] May 01 23:43:23 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x210/0x700 [ldiskfs] May 01 23:43:23 fir-md1-s1 kernel: [] ? ktime_get_ts64+0x52/0xf0 May 01 23:43:23 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs] May 01 23:43:23 fir-md1-s1 kernel: [] osd_write_prep+0x2b6/0x360 [osd_ldiskfs] May 01 23:43:23 fir-md1-s1 kernel: [] mdt_obd_preprw+0x637/0x1060 [mdt] May 01 23:43:23 fir-md1-s1 kernel: [] tgt_brw_write+0xc7e/0x1a90 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? 
lustre_msg_buf_v2+0x1b0/0x1b0 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? lustre_msg_buf+0x17/0x60 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? update_curr+0x14c/0x1e0 May 01 23:43:23 fir-md1-s1 kernel: [] ? account_entity_dequeue+0xae/0xd0 May 01 23:43:23 fir-md1-s1 kernel: [] ? __enqueue_entity+0x78/0x80 May 01 23:43:23 fir-md1-s1 kernel: [] ? tgt_lookup_reply+0x2d/0x190 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:43:23 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:43:23 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:43:23 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:43:23 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:23 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:43:23 fir-md1-s1 kernel: [] ? 
insert_kthread_work+0x40/0x40 May 01 23:43:23 fir-md1-s1 kernel: Code: May 01 23:43:23 fir-md1-s1 kernel: 13 May 01 23:43:23 fir-md1-s1 kernel: 48 May 01 23:43:23 fir-md1-s1 kernel: c1 May 01 23:43:23 fir-md1-s1 kernel: ea May 01 23:43:23 fir-md1-s1 kernel: 0d May 01 23:43:23 fir-md1-s1 kernel: 48 May 01 23:43:23 fir-md1-s1 kernel: 98 May 01 23:43:23 fir-md1-s1 kernel: 83 May 01 23:43:23 fir-md1-s1 kernel: e2 May 01 23:43:23 fir-md1-s1 kernel: 30 May 01 23:43:23 fir-md1-s1 kernel: 48 May 01 23:43:23 fir-md1-s1 kernel: 81 May 01 23:43:23 fir-md1-s1 kernel: c2 May 01 23:43:23 fir-md1-s1 kernel: 80 May 01 23:43:23 fir-md1-s1 kernel: b7 May 01 23:43:23 fir-md1-s1 kernel: 01 May 01 23:43:23 fir-md1-s1 kernel: 00 May 01 23:43:23 fir-md1-s1 kernel: 48 May 01 23:43:23 fir-md1-s1 kernel: 03 May 01 23:43:23 fir-md1-s1 kernel: 14 May 01 23:43:23 fir-md1-s1 kernel: c5 May 01 23:43:23 fir-md1-s1 kernel: 60 May 01 23:43:23 fir-md1-s1 kernel: b9 May 01 23:43:23 fir-md1-s1 kernel: b4 May 01 23:43:23 fir-md1-s1 kernel: b7 May 01 23:43:23 fir-md1-s1 kernel: 4c May 01 23:43:23 fir-md1-s1 kernel: 89 May 01 23:43:23 fir-md1-s1 kernel: 02 May 01 23:43:23 fir-md1-s1 kernel: 41 May 01 23:43:23 fir-md1-s1 kernel: 8b May 01 23:43:23 fir-md1-s1 kernel: 40 May 01 23:43:23 fir-md1-s1 kernel: 08 May 01 23:43:23 fir-md1-s1 kernel: 85 May 01 23:43:23 fir-md1-s1 kernel: c0 May 01 23:43:23 fir-md1-s1 kernel: 75 May 01 23:43:23 fir-md1-s1 kernel: 0f May 01 23:43:23 fir-md1-s1 kernel: 0f May 01 23:43:23 fir-md1-s1 kernel: 1f May 01 23:43:23 fir-md1-s1 kernel: 44 May 01 23:43:23 fir-md1-s1 kernel: 00 May 01 23:43:23 fir-md1-s1 kernel: 00 May 01 23:43:23 fir-md1-s1 kernel: f3 May 01 23:43:23 fir-md1-s1 kernel: 90 May 01 23:43:23 fir-md1-s1 kernel: <41> May 01 23:43:23 fir-md1-s1 kernel: 8b May 01 23:43:23 fir-md1-s1 kernel: 40 May 01 23:43:23 fir-md1-s1 kernel: 08 May 01 23:43:23 fir-md1-s1 kernel: 85 May 01 23:43:23 fir-md1-s1 kernel: c0 May 01 23:43:23 fir-md1-s1 kernel: 74 May 01 23:43:23 fir-md1-s1 
kernel: f6 May 01 23:43:23 fir-md1-s1 kernel: 4d May 01 23:43:23 fir-md1-s1 kernel: 8b May 01 23:43:23 fir-md1-s1 kernel: 08 May 01 23:43:23 fir-md1-s1 kernel: 4d May 01 23:43:23 fir-md1-s1 kernel: 85 May 01 23:43:23 fir-md1-s1 kernel: c9 May 01 23:43:23 fir-md1-s1 kernel: 74 May 01 23:43:23 fir-md1-s1 kernel: 04 May 01 23:43:23 fir-md1-s1 kernel: 41 May 01 23:43:23 fir-md1-s1 kernel: 0f May 01 23:43:23 fir-md1-s1 kernel: 18 May 01 23:43:23 fir-md1-s1 kernel: 09 May 01 23:43:23 fir-md1-s1 kernel: 8b May 01 23:43:23 fir-md1-s1 kernel: May 01 23:43:23 fir-md1-s1 kernel: NMI watchdog: BUG: soft lockup - CPU#32 stuck for 23s! [mdt_io00_072:103262] May 01 23:43:23 fir-md1-s1 kernel: Modules linked in: May 01 23:43:23 fir-md1-s1 kernel: osp(OE) May 01 23:43:23 fir-md1-s1 kernel: mdd(OE) May 01 23:43:23 fir-md1-s1 kernel: lod(OE) May 01 23:43:23 fir-md1-s1 kernel: mdt(OE) May 01 23:43:23 fir-md1-s1 kernel: lfsck(OE) May 01 23:43:23 fir-md1-s1 kernel: mgs(OE) May 01 23:43:23 fir-md1-s1 kernel: mgc(OE) May 01 23:43:23 fir-md1-s1 kernel: osd_ldiskfs(OE) May 01 23:43:23 fir-md1-s1 kernel: lquota(OE) May 01 23:43:23 fir-md1-s1 kernel: ldiskfs(OE) May 01 23:43:23 fir-md1-s1 kernel: lustre(OE) May 01 23:43:23 fir-md1-s1 kernel: lmv(OE) May 01 23:43:23 fir-md1-s1 kernel: mdc(OE) May 01 23:43:23 fir-md1-s1 kernel: osc(OE) May 01 23:43:23 fir-md1-s1 kernel: lov(OE) May 01 23:43:23 fir-md1-s1 kernel: fid(OE) May 01 23:43:23 fir-md1-s1 kernel: fld(OE) May 01 23:43:23 fir-md1-s1 kernel: ko2iblnd(OE) May 01 23:43:23 fir-md1-s1 kernel: ptlrpc(OE) May 01 23:43:23 fir-md1-s1 kernel: obdclass(OE) May 01 23:43:23 fir-md1-s1 kernel: lnet(OE) May 01 23:43:23 fir-md1-s1 kernel: libcfs(OE) May 01 23:43:23 fir-md1-s1 kernel: rpcsec_gss_krb5 May 01 23:43:23 fir-md1-s1 kernel: auth_rpcgss May 01 23:43:23 fir-md1-s1 kernel: nfsv4 May 01 23:43:23 fir-md1-s1 kernel: dns_resolver May 01 23:43:23 fir-md1-s1 kernel: nfs May 01 23:43:23 fir-md1-s1 kernel: lockd May 01 23:43:23 fir-md1-s1 kernel: grace 
May 01 23:43:23 fir-md1-s1 kernel: fscache May 01 23:43:23 fir-md1-s1 kernel: rdma_ucm(OE) May 01 23:43:23 fir-md1-s1 kernel: ib_ucm(OE) May 01 23:43:23 fir-md1-s1 kernel: rdma_cm(OE) May 01 23:43:23 fir-md1-s1 kernel: iw_cm(OE) May 01 23:43:23 fir-md1-s1 kernel: ib_ipoib(OE) May 01 23:43:23 fir-md1-s1 kernel: ib_cm(OE) May 01 23:43:23 fir-md1-s1 kernel: ib_umad(OE) May 01 23:43:23 fir-md1-s1 kernel: mlx5_fpga_tools(OE) May 01 23:43:23 fir-md1-s1 kernel: mlx4_en(OE) May 01 23:43:23 fir-md1-s1 kernel: mlx4_ib(OE) May 01 23:43:23 fir-md1-s1 kernel: mlx4_core(OE) May 01 23:43:23 fir-md1-s1 kernel: dell_rbu May 01 23:43:23 fir-md1-s1 kernel: sunrpc May 01 23:43:23 fir-md1-s1 kernel: vfat May 01 23:43:23 fir-md1-s1 kernel: fat May 01 23:43:23 fir-md1-s1 kernel: dm_round_robin May 01 23:43:23 fir-md1-s1 kernel: amd64_edac_mod May 01 23:43:23 fir-md1-s1 kernel: edac_mce_amd May 01 23:43:23 fir-md1-s1 kernel: kvm_amd May 01 23:43:23 fir-md1-s1 kernel: kvm May 01 23:43:23 fir-md1-s1 kernel: ses May 01 23:43:23 fir-md1-s1 kernel: irqbypass May 01 23:43:23 fir-md1-s1 kernel: crc32_pclmul May 01 23:43:23 fir-md1-s1 kernel: enclosure May 01 23:43:23 fir-md1-s1 kernel: ghash_clmulni_intel May 01 23:43:23 fir-md1-s1 kernel: dcdbas May 01 23:43:23 fir-md1-s1 kernel: aesni_intel May 01 23:43:23 fir-md1-s1 kernel: lrw May 01 23:43:23 fir-md1-s1 kernel: gf128mul May 01 23:43:23 fir-md1-s1 kernel: glue_helper May 01 23:43:23 fir-md1-s1 kernel: ablk_helper May 01 23:43:23 fir-md1-s1 kernel: cryptd May 01 23:43:23 fir-md1-s1 kernel: ipmi_si May 01 23:43:23 fir-md1-s1 kernel: pcspkr May 01 23:43:23 fir-md1-s1 kernel: ipmi_devintf May 01 23:43:23 fir-md1-s1 kernel: ccp May 01 23:43:23 fir-md1-s1 kernel: i2c_piix4 May 01 23:43:23 fir-md1-s1 kernel: dm_multipath May 01 23:43:23 fir-md1-s1 kernel: sg May 01 23:43:23 fir-md1-s1 kernel: k10temp May 01 23:43:23 fir-md1-s1 kernel: ipmi_msghandler May 01 23:43:23 fir-md1-s1 kernel: dm_mod May 01 23:43:23 fir-md1-s1 kernel: acpi_power_meter May 01 
23:43:23 fir-md1-s1 kernel: knem(OE) May 01 23:43:23 fir-md1-s1 kernel: ip_tables May 01 23:43:23 fir-md1-s1 kernel: ext4 May 01 23:43:23 fir-md1-s1 kernel: mbcache May 01 23:43:23 fir-md1-s1 kernel: jbd2 May 01 23:43:23 fir-md1-s1 kernel: sd_mod May 01 23:43:23 fir-md1-s1 kernel: crc_t10dif May 01 23:43:23 fir-md1-s1 kernel: crct10dif_generic May 01 23:43:23 fir-md1-s1 kernel: mlx5_ib(OE) May 01 23:43:23 fir-md1-s1 kernel: ib_uverbs(OE) May 01 23:43:23 fir-md1-s1 kernel: ib_core(OE) May 01 23:43:23 fir-md1-s1 kernel: i2c_algo_bit May 01 23:43:23 fir-md1-s1 kernel: drm_kms_helper May 01 23:43:23 fir-md1-s1 kernel: mlx5_core(OE) May 01 23:43:23 fir-md1-s1 kernel: syscopyarea May 01 23:43:23 fir-md1-s1 kernel: sysfillrect May 01 23:43:23 fir-md1-s1 kernel: sysimgblt May 01 23:43:23 fir-md1-s1 kernel: fb_sys_fops May 01 23:43:23 fir-md1-s1 kernel: mlxfw(OE) May 01 23:43:23 fir-md1-s1 kernel: crct10dif_pclmul May 01 23:43:23 fir-md1-s1 kernel: ttm May 01 23:43:23 fir-md1-s1 kernel: devlink May 01 23:43:23 fir-md1-s1 kernel: ahci May 01 23:43:23 fir-md1-s1 kernel: crct10dif_common May 01 23:43:23 fir-md1-s1 kernel: libahci May 01 23:43:23 fir-md1-s1 kernel: drm May 01 23:43:23 fir-md1-s1 kernel: mlx_compat(OE) May 01 23:43:23 fir-md1-s1 kernel: tg3 May 01 23:43:23 fir-md1-s1 kernel: crc32c_intel May 01 23:43:23 fir-md1-s1 kernel: libata May 01 23:43:23 fir-md1-s1 kernel: megaraid_sas May 01 23:43:23 fir-md1-s1 kernel: drm_panel_orientation_quirks May 01 23:43:23 fir-md1-s1 kernel: ptp May 01 23:43:23 fir-md1-s1 kernel: pps_core May 01 23:43:23 fir-md1-s1 kernel: mpt3sas(OE) May 01 23:43:23 fir-md1-s1 kernel: raid_class May 01 23:43:23 fir-md1-s1 kernel: scsi_transport_sas May 01 23:43:23 fir-md1-s1 kernel: [last unloaded: libcfs] May 01 23:43:23 fir-md1-s1 kernel: May 01 23:43:23 fir-md1-s1 kernel: CPU: 32 PID: 103262 Comm: mdt_io00_072 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:43:23 fir-md1-s1 kernel: Hardware name: Dell 
Inc. PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:43:23 fir-md1-s1 kernel: task: ffff984cba239040 ti: ffff982bc9ed4000 task.ti: ffff982bc9ed4000 May 01 23:43:23 fir-md1-s1 kernel: RIP: 0010:[] May 01 23:43:23 fir-md1-s1 kernel: [] native_queued_spin_lock_slowpath+0x122/0x200 May 01 23:43:23 fir-md1-s1 kernel: RSP: 0018:ffff982bc9ed78e8 EFLAGS: 00000246 May 01 23:43:23 fir-md1-s1 kernel: RAX: 0000000000000000 RBX: ffff9837d662bb78 RCX: 0000000001010000 May 01 23:43:23 fir-md1-s1 kernel: RDX: ffff983cff75b780 RSI: 0000000000a90101 RDI: ffff982c9fc8c480 May 01 23:43:23 fir-md1-s1 kernel: RBP: ffff982bc9ed78e8 R08: ffff982cff01b780 R09: 0000000000000000 May 01 23:43:23 fir-md1-s1 kernel: R10: 0000000000000000 R11: ffff985c8b319038 R12: ffff982cfef9ac00 May 01 23:43:23 fir-md1-s1 kernel: R13: ffff984cba2390a8 R14: 00ffffffb7a08d80 R15: ffff984cf34000a0 May 01 23:43:23 fir-md1-s1 kernel: FS: 00007f1fd5eaa700(0000) GS:ffff982cff000000(0000) knlGS:0000000000000000 May 01 23:43:23 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:43:23 fir-md1-s1 kernel: CR2: 00007f41ebfcd1b0 CR3: 0000001038b88000 CR4: 00000000003407e0 May 01 23:43:23 fir-md1-s1 kernel: Call Trace: May 01 23:43:23 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf May 01 23:43:23 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30 May 01 23:43:23 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs] May 01 23:43:23 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x210/0x700 [ldiskfs] May 01 23:43:23 fir-md1-s1 kernel: [] ? ktime_get_ts64+0x52/0xf0 May 01 23:43:23 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs] May 01 23:43:23 fir-md1-s1 kernel: [] osd_write_prep+0x2b6/0x360 [osd_ldiskfs] May 01 23:43:23 fir-md1-s1 kernel: [] mdt_obd_preprw+0x637/0x1060 [mdt] May 01 23:43:23 fir-md1-s1 kernel: [] tgt_brw_write+0xc7e/0x1a90 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? load_balance+0x178/0x9a0 May 01 23:43:23 fir-md1-s1 kernel: [] ? 
update_curr+0x14c/0x1e0 May 01 23:43:23 fir-md1-s1 kernel: [] ? account_entity_dequeue+0xae/0xd0 May 01 23:43:23 fir-md1-s1 kernel: [] ? __enqueue_entity+0x78/0x80 May 01 23:43:23 fir-md1-s1 kernel: [] ? tgt_lookup_reply+0x2d/0x190 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:43:23 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:43:23 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:43:23 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:43:23 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:23 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:43:23 fir-md1-s1 kernel: [] ? 
insert_kthread_work+0x40/0x40 May 01 23:43:23 fir-md1-s1 kernel: Code: 13 48 c1 ea 0d 48 98 83 e2 30 48 81 c2 80 b7 01 00 48 03 14 c5 60 b9 b4 b7 4c 89 02 41 8b 40 08 85 c0 75 0f 0f 1f 44 00 00 f3 90 <41> 8b 40 08 85 c0 74 f6 4d 8b 08 4d 85 c9 74 04 41 0f 18 09 8b
May 01 23:43:23 fir-md1-s1 kernel: May 01 23:43:23 fir-md1-s1 kernel: [last unloaded: libcfs] May 01 23:43:23 fir-md1-s1 kernel: May 01 23:43:23 fir-md1-s1 kernel: CPU: 2 PID: 101733 Comm: mdt_io02_001 Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1 May 01 23:43:23 fir-md1-s1 kernel: Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018 May 01 23:43:23 fir-md1-s1 kernel: task: ffff982cf0b630c0 ti: ffff984cfa370000 task.ti: ffff984cfa370000 May 01 23:43:23 fir-md1-s1 kernel: RIP: 0010:[] [] native_queued_spin_lock_slowpath+0x122/0x200 May 01 23:43:23 fir-md1-s1 kernel: RSP: 0018:ffff984cfa3738e8 EFLAGS: 00000246 May 01 23:43:23 fir-md1-s1 kernel: RAX: 0000000000000000 RBX: ffff985c7a711b78 RCX: 0000000000110000 May 01 23:43:23 fir-md1-s1 kernel: RDX: ffff985d3f4db780 RSI: 0000000000790101 RDI: ffff982c9fc8c480 May 01 23:43:23 fir-md1-s1 kernel: RBP: ffff984cfa3738e8 R08: ffff984cff61b780 R09: 0000000000000000 May 01 23:43:23 fir-md1-s1 kernel: R10: 0000000000000000 R11: ffff985c8b319038 R12: ffff984cff61ac00 May 01 23:43:23 fir-md1-s1 kernel: R13: ffff982cf0b63128 R14: 00ffffffb7a08d80 R15: ffff984cf34000a0 May 01 23:43:23 fir-md1-s1 kernel: FS: 00007f759e3eb740(0000) GS:ffff984cff600000(0000) knlGS:0000000000000000 May 01 23:43:23 fir-md1-s1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 01 23:43:23 fir-md1-s1 kernel: CR2: 00007f759e3fa000 CR3: 00000012f7610000 CR4: 00000000003407e0 May 01 23:43:23 fir-md1-s1
kernel: Call Trace: May 01 23:43:23 fir-md1-s1 kernel: [] queued_spin_lock_slowpath+0xb/0xf May 01 23:43:23 fir-md1-s1 kernel: [] _raw_spin_lock+0x20/0x30 May 01 23:43:23 fir-md1-s1 kernel: [] ldiskfs_es_lru_add+0x57/0x90 [ldiskfs] May 01 23:43:23 fir-md1-s1 kernel: [] ldiskfs_map_blocks+0x210/0x700 [ldiskfs] May 01 23:43:23 fir-md1-s1 kernel: [] ? ktime_get_ts64+0x52/0xf0 May 01 23:43:23 fir-md1-s1 kernel: [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs] May 01 23:43:23 fir-md1-s1 kernel: [] osd_write_prep+0x2b6/0x360 [osd_ldiskfs] May 01 23:43:23 fir-md1-s1 kernel: [] mdt_obd_preprw+0x637/0x1060 [mdt] May 01 23:43:23 fir-md1-s1 kernel: [] tgt_brw_write+0xc7e/0x1a90 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? lustre_msg_buf_v2+0x1b0/0x1b0 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? lustre_msg_buf+0x17/0x60 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? update_curr+0x14c/0x1e0 May 01 23:43:23 fir-md1-s1 kernel: [] ? account_entity_dequeue+0xae/0xd0 May 01 23:43:23 fir-md1-s1 kernel: [] ? __enqueue_entity+0x78/0x80 May 01 23:43:23 fir-md1-s1 kernel: [] ? tgt_lookup_reply+0x2d/0x190 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] May 01 23:43:23 fir-md1-s1 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? default_wake_function+0x12/0x20 May 01 23:43:23 fir-md1-s1 kernel: [] ? __wake_up_common+0x5b/0x90 May 01 23:43:23 fir-md1-s1 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] May 01 23:43:23 fir-md1-s1 kernel: [] kthread+0xd1/0xe0 May 01 23:43:23 fir-md1-s1 kernel: [] ? 
insert_kthread_work+0x40/0x40 May 01 23:43:23 fir-md1-s1 kernel: [] ret_from_fork_nospec_begin+0xe/0x21 May 01 23:43:23 fir-md1-s1 kernel: [] ? insert_kthread_work+0x40/0x40 May 01 23:43:23 fir-md1-s1 kernel: Code: 13 48 c1 ea 0d 48 98 83 e2 30 48 81 c2 80 b7 01 00 48 03 14 c5 60 b9 b4 b7 4c 89 02 41 8b 40 08 85 c0 75 0f 0f 1f 44 00 00 f3 90 <41> 8b 40 08 85 c0 74 f6 4d 8b 08 4d 85 c9 74 04 41 0f 18 09 8b May 01 23:43:24 fir-md1-s1 kernel: Lustre: 103134:0:(service.c:2165:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (20:58s); client may timeout. req@ffff984b11785c50 x1631585929788768/t306055732550(0) o4->16749711-2a27-479b-83fc-14b2199ba6af@10.9.104.18@o2ib4:26/0 lens 8680/416 e 1 to 0 dl 1556779346 ref 1 fl Complete:/0/0 rc 0/0 May 01 23:43:24 fir-md1-s1 kernel: LustreError: 103039:0:(service.c:2128:ptlrpc_server_handle_request()) @@@ Dropping timed-out request from 12345-10.8.24.34@o2ib6: deadline 30:1s ago req@ffff984a87ac4800 x1631778399965216/t0(0) o35->b1ac7951-67b3-5d05-244d-b23c643bc210@10.8.24.34@o2ib6:23/0 lens 392/0 e 0 to 0 dl 1556779403 ref 1 fl Interpret:/0/ffffffff rc 0/-1 May 01 23:43:24 fir-md1-s1 kernel: Lustre: 103134:0:(service.c:2165:ptlrpc_server_handle_request()) Skipped 1263 previous similar messages May 01 23:43:26 fir-md1-s1 kernel: Lustre: 102928:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1556779385/real 1556779385] req@ffff98555c27d400 x1632254604141824/t0(0) o601->fir-MDT0000-lwp-MDT0002@0@lo:23/10 lens 336/336 e 1 to 1 dl 1556779406 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1 May 01 23:43:26 fir-md1-s1 kernel: Lustre: 102928:0:(client.c:2132:ptlrpc_expire_one_request()) Skipped 1 previous similar message May 01 23:43:38 fir-md1-s1 kernel: Lustre: fir-MDT0002: Client f37c3da1-0e56-86e1-dca2-c29b3ae80868 (at 10.9.112.9@o2ib4) reconnecting May 01 23:43:38 fir-md1-s1 kernel: Lustre: Skipped 351 previous similar messages May 01 23:43:38 fir-md1-s1 kernel: Lustre: 
fir-MDT0002: Connection restored to (at 10.9.112.9@o2ib4) May 01 23:43:38 fir-md1-s1 kernel: Lustre: Skipped 354 previous similar messages May 01 23:56:15 fir-md1-s1 kernel: Lustre: DEBUG MARKER: Wed May 1 23:56:15 2019 May 01 23:57:11 fir-md1-s1 kernel: bash (34456): drop_caches: 2