Sep 8 06:54:31 mds0 kernel: LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) ### lock callback timer expired after 24642s: evicting client at 12.0.6.139@tcp1 ns: mdt-THFS-MDT0000_UUID lock: ffff880492ac8380/0x81ad926eaa82affd lrc: 3/0,0 mode: PR/PR res: [0x200004f76:0x6698:0x0].0 bits 0x13 rrc: 467 type: IBT flags: 0x60200400000020 nid: 12.0.6.139@tcp1 remote: 0x66c5183129a45cf expref: 180 pid: 7092 timeout: 4970597002 lvb_type: 0 Sep 8 06:54:31 mds0 kernel: LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) Skipped 5 previous similar messages Sep 8 06:54:31 mds0 kernel: LustreError: 7257:0:(ldlm_lockd.c:1335:ldlm_handle_enqueue0()) ### lock on destroyed export ffff8810026e3800 ns: mdt-THFS-MDT0000_UUID lock: ffff880282fc61c0/0x81ad926eaa8e11f5 lrc: 3/0,0 mode: CR/CR res: [0x20000f671:0x296:0x0].0 bits 0x9 rrc: 1 type: IBT flags: 0x50200000000000 nid: 12.0.5.94@tcp1 remote: 0x48c63f905af907de expref: 4 pid: 7257 timeout: 0 lvb_type: 0 Sep 8 06:54:31 mds0 kernel: LustreError: 7257:0:(ldlm_lockd.c:1335:ldlm_handle_enqueue0()) Skipped 1 previous similar message Sep 8 06:54:31 mds0 kernel: Lustre: 7098:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (755:23090s); client may timeout. req@ffff88047e3ea000 x1573500961007016/t0(0) o101->5b259191-c1ef-4b38-f9c7-8bf40b706ddd@12.0.2.155@tcp1:0/0 lens 576/536 e 0 to 0 dl 1504801781 ref 1 fl Complete:/0/0 rc 0/0 Sep 8 06:54:31 mds0 kernel: Lustre: 7098:0:(service.c:2039:ptlrpc_server_handle_request()) Skipped 21 previous similar messages Sep 8 06:54:31 mds0 kernel: LNet: Service thread pid 7098 completed after 23439.97s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). Sep 8 06:54:31 mds0 kernel: LNet: Skipped 17 previous similar messages Sep 8 06:54:31 mds0 kernel: LustreError: 7098:0:(service.c:2007:ptlrpc_server_handle_request()) @@@ Dropping timed-out request from 12345-12.0.6.129@tcp1: deadline 755:105s ago Sep 8 06:54:31 mds0 kernel: req@ffff8802a8cee400 x1573320288657976/t0(0) o101->ffd151a8-2a0d-a238-b177-2f524a3c1ff5@12.0.6.129@tcp1:0/0 lens 584/0 e 0 to 0 dl 1504824766 ref 1 fl Interpret:/2/ffffffff rc 0/-1 Sep 8 06:54:31 mds0 kernel: LustreError: 7098:0:(service.c:2007:ptlrpc_server_handle_request()) Skipped 264 previous similar messages Sep 8 06:54:31 mds0 kernel: LustreError: 6967:0:(ldlm_lockd.c:1268:ldlm_handle_enqueue0()) ### lock on disconnected export ffff880fed7cf000 ns: mdt-THFS-MDT0000_UUID lock: ffff88100fcf4340/0x81ad926eaa8e1942 lrc: 2/0,0 mode: --/CR res: [0x200004f76:0x652e:0x0].0 bits 0x0 rrc: 1238 type: IBT flags: 0x40000000000000 nid: local remote: 0x3472174c8c7f1be0 expref: -99 pid: 6967 timeout: 0 lvb_type: 0 Sep 8 06:54:31 mds0 kernel: LustreError: 6967:0:(ldlm_lockd.c:1268:ldlm_handle_enqueue0()) Skipped 22 previous similar messages Sep 8 07:02:26 mds0 kernel: Lustre: THFS-MDT0000: Client c88db792-f9e2-03c4-4471-0af4311099c9 (at 12.0.2.19@tcp1) reconnecting Sep 8 07:02:26 mds0 kernel: Lustre: Skipped 8471 previous similar messages Sep 8 07:02:26 mds0 kernel: Lustre: THFS-MDT0000: Client c88db792-f9e2-03c4-4471-0af4311099c9 (at 12.0.2.19@tcp1) refused reconnection, still busy with 1 active RPCs Sep 8 07:02:26 mds0 kernel: Lustre: Skipped 8370 previous similar messages Sep 8 07:04:30 mds0 kernel: Lustre: lock timed out (enqueued at 1504824270, 1200s ago) Sep 8 07:04:30 mds0 kernel: Lustre: Skipped 11 previous similar messages Sep 8 07:04:32 mds0 kernel: LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) ### lock callback timer expired after 24041s: evicting client at 12.0.2.155@tcp1 ns: mdt-THFS-MDT0000_UUID lock: ffff880840b46500/0x81ad926eaa85d199 lrc: 3/0,0 mode: PR/PR res: [0x200004f76:0x6698:0x0].0 bits 0x13 rrc: 536 type: IBT flags: 0x60200400000020 nid: 12.0.2.155@tcp1 remote: 0xb46b971333610fc7 expref: 80 pid: 7098 timeout: 4971198005 lvb_type: 0 Sep 8 07:04:32 mds0 kernel: LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) Skipped 14 previous similar messages Sep 8 07:04:32 mds0 kernel: Lustre: 6984:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (755:23286s); client may timeout. req@ffff8802284c6400 x1573500744867600/t0(0) o101->a37fe3bf-2f6d-4d50-c3f9-d77b8e8d6041@12.0.2.95@tcp1:0/0 lens 576/536 e 0 to 0 dl 1504802186 ref 1 fl Complete:/0/0 rc 0/0 Sep 8 07:04:32 mds0 kernel: LNet: Service thread pid 7367 completed after 24040.98s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). Sep 8 07:04:32 mds0 kernel: LNet: Skipped 92 previous similar messages Sep 8 07:04:32 mds0 kernel: LustreError: 7317:0:(ldlm_lockd.c:1268:ldlm_handle_enqueue0()) ### lock on disconnected export ffff881013de9800 ns: mdt-THFS-MDT0000_UUID lock: ffff88101bad1740/0x81ad926eaa960210 lrc: 2/0,0 mode: --/CR res: [0x200004f76:0x65c8:0x0].0 bits 0x0 rrc: 284 type: IBT flags: 0x40000000000000 nid: local remote: 0x35c30e81c6b403b3 expref: -99 pid: 7317 timeout: 0 lvb_type: 0 Sep 8 07:04:32 mds0 kernel: LustreError: 7317:0:(ldlm_lockd.c:1268:ldlm_handle_enqueue0()) Skipped 37 previous similar messages Sep 8 07:04:32 mds0 kernel: LustreError: 7317:0:(service.c:2007:ptlrpc_server_handle_request()) @@@ Dropping timed-out request from 12345-12.0.3.207@tcp1: deadline 100:487s ago Sep 8 07:04:32 mds0 kernel: req@ffff880ffff06400 x1573501370385052/t0(0) o101->53439fda-0d39-881a-3206-2f3dfdd9fdc0@12.0.3.207@tcp1:0/0 lens 584/0 e 0 to 0 dl 1504824985 ref 1 fl Interpret:/2/ffffffff rc 0/-1 Sep 8 07:04:32 mds0 kernel: LustreError: 7317:0:(service.c:2007:ptlrpc_server_handle_request()) Skipped 97 previous similar messages Sep 8 07:04:32 mds0 kernel: Lustre: 6984:0:(service.c:2039:ptlrpc_server_handle_request()) Skipped 219 previous similar messages Sep 8 07:07:01 mds0 kernel: Lustre: 7257:0:(service.c:1347:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply Sep 8 07:07:01 mds0 kernel: req@ffff8807f8cf1400 x1573500603087284/t0(0) o101->d77f6df4-8d08-d91b-d72c-7bc1857f901b@12.0.2.87@tcp1:0/0 lens 576/3384 e 0 to 0 dl 1504825626 ref 2 fl Interpret:/0/0 rc 0/0 Sep 8 07:07:01 mds0 kernel: Lustre: 7257:0:(service.c:1347:ptlrpc_at_send_early_reply()) Skipped 29 previous similar messages Sep 8 07:12:26 mds0 kernel: Lustre: THFS-MDT0000: Client 902f9430-1cb5-bcd4-0b1b-7653215af491 (at 12.0.5.147@tcp1) reconnecting Sep 8 07:12:26 mds0 kernel: Lustre: Skipped 8320 previous similar messages Sep 8 07:12:26 mds0 kernel: Lustre: THFS-MDT0000: Client 902f9430-1cb5-bcd4-0b1b-7653215af491 (at 12.0.5.147@tcp1) refused reconnection, still busy with 1 active RPCs Sep 8 07:12:26 mds0 kernel: Lustre: Skipped 8293 previous similar messages Sep 8 07:14:31 mds0 kernel: Lustre: lock timed out (enqueued at 1504824871, 1200s ago) Sep 8 07:14:31 mds0 kernel: Lustre: Skipped 25 previous similar messages Sep 8 07:14:33 mds0 kernel: LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) ### lock callback timer expired after 24642s: evicting client at 12.0.2.95@tcp1 ns: mdt-THFS-MDT0000_UUID lock: ffff88029fb59cc0/0x81ad926eaa86fe76 lrc: 3/0,0 mode: PR/PR res: [0x200004f76:0x6698:0x0].0 bits 0x13 rrc: 467 type: IBT flags: 0x60200400000020 nid: 12.0.2.95@tcp1 remote: 0x95cab4bc88433f3f expref: 27 pid: 6984 timeout: 4971799023 lvb_type: 0 Sep 8 07:14:33 mds0 kernel: LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) Skipped 88 previous similar messages Sep 8 07:14:33 mds0 kernel: Lustre: 7201:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (755:23887s); client may timeout. req@ffff8802740dd400 x1573500404675252/t236318041314(0) o101->a3bf8a08-bfa0-38fe-f1d0-436089a9c9ed@12.0.2.3@tcp1:0/0 lens 584/600 e 0 to 0 dl 1504802186 ref 1 fl Complete:/0/0 rc 0/0 Sep 8 07:14:33 mds0 kernel: LustreError: 2297:0:(ldlm_lockd.c:1335:ldlm_handle_enqueue0()) ### lock on destroyed export ffff88029e921800 ns: mdt-THFS-MDT0000_UUID lock: ffff8801e47ac680/0x81ad926eaa88a395 lrc: 3/0,0 mode: PR/PR res: [0x200004f76:0x6698:0x0].0 bits 0x13 rrc: 447 type: IBT flags: 0x50200400000020 nid: 12.0.5.23@tcp1 remote: 0xf946874abebf9b8d expref: 2 pid: 2297 timeout: 0 lvb_type: 0 Sep 8 07:14:33 mds0 kernel: LNet: Service thread pid 7279 completed after 24626.82s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). Sep 8 07:14:33 mds0 kernel: LNet: Service thread pid 7064 completed after 24626.41s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). Sep 8 07:14:33 mds0 kernel: LNet: Skipped 17 previous similar messages Sep 8 07:14:33 mds0 kernel: LNet: Skipped 17 previous similar messages Sep 8 07:14:33 mds0 kernel: LustreError: 2297:0:(service.c:2007:ptlrpc_server_handle_request()) @@@ Dropping timed-out request from 12345-12.0.6.128@tcp1: deadline 755:264s ago Sep 8 07:14:33 mds0 kernel: req@ffff8804ca657400 x1577164543209636/t0(0) o101->e54d6ad6-4658-f6b7-1885-24cbdd99b4e1@12.0.6.128@tcp1:0/0 lens 576/0 e 0 to 0 dl 1504825809 ref 1 fl Interpret:/0/ffffffff rc 0/-1 Sep 8 07:14:33 mds0 kernel: LustreError: 2297:0:(service.c:2007:ptlrpc_server_handle_request()) Skipped 26 previous similar messages Sep 8 07:14:33 mds0 kernel: LustreError: 2297:0:(ldlm_lockd.c:1268:ldlm_handle_enqueue0()) ### lock on disconnected export ffff8804cba5f800 ns: mdt-THFS-MDT0000_UUID lock: ffff88081a5965c0/0x81ad926eaa97f44b lrc: 2/0,0 mode: --/CR res: [0x200004cd9:0x15745:0x0].0 bits 0x0 rrc: 266 type: IBT flags: 0x40000000000000 nid: local remote: 0x882966e6523718c0 expref: -99 pid: 2297 timeout: 0 lvb_type: 0 Sep 8 07:14:33 mds0 kernel: LustreError: 2297:0:(ldlm_lockd.c:1268:ldlm_handle_enqueue0()) Skipped 160 previous similar messages Sep 8 07:14:33 mds0 kernel: Lustre: 7201:0:(service.c:2039:ptlrpc_server_handle_request()) Skipped 420 previous similar messages Sep 8 07:17:02 mds0 kernel: Lustre: 7257:0:(service.c:1347:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply Sep 8 07:17:02 mds0 kernel: req@ffff88077c955400 x1573353295710972/t0(0) o101->8f05184a-31a7-58b8-6b2c-0aeaa04a29c6@12.0.6.11@tcp1:0/0 lens 584/0 e 0 to 0 dl 1504826227 ref 2 fl New:/0/ffffffff rc 0/-1 Sep 8 07:17:02 mds0 kernel: Lustre: 7257:0:(service.c:1347:ptlrpc_at_send_early_reply()) Skipped 82 previous similar messages Sep 8 07:22:26 mds0 kernel: Lustre: THFS-MDT0000: Client a6cf33ae-7f51-233a-2b87-055e488bbdc3 (at 12.0.4.207@tcp1) reconnecting Sep 8 07:22:26 mds0 kernel: Lustre: Skipped 8696 previous similar messages Sep 8 07:22:26 mds0 kernel: Lustre: THFS-MDT0000: Client a6cf33ae-7f51-233a-2b87-055e488bbdc3 (at 12.0.4.207@tcp1) refused reconnection, still busy with 1 active RPCs Sep 8 07:22:26 mds0 kernel: Lustre: Skipped 8598 previous similar messages Sep 8 07:24:32 mds0 kernel: Lustre: lock timed out (enqueued at 1504825472, 1200s ago) Sep 8 07:24:32 mds0 kernel: Lustre: Skipped 79 previous similar messages Sep 8 07:24:34 mds0 kernel: LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) ### lock callback timer expired after 25228s: evicting client at 12.0.4.93@tcp1 ns: mdt-THFS-MDT0000_UUID lock: ffff8802768c1580/0x81ad926eaa88a19d lrc: 3/0,0 mode: PR/PR res: [0x200004f76:0x6698:0x0].0 bits 0x13 rrc: 453 type: IBT flags: 0x60200400000020 nid: 12.0.4.93@tcp1 remote: 0x2bad76e0bf716402 expref: 16 pid: 7064 timeout: 4972400003 lvb_type: 0 Sep 8 07:24:34 mds0 kernel: LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) Skipped 17 previous similar messages Sep 8 07:24:34 mds0 kernel: LustreError: 7069:0:(ldlm_lockd.c:1268:ldlm_handle_enqueue0()) ### lock on disconnected export ffff8802d2a6a800 ns: mdt-THFS-MDT0000_UUID lock: ffff8808657ebd40/0x81ad926eaa98e7ed lrc: 2/0,0 mode: --/CR res: [0x200004f76:0x65c8:0x0].0 bits 0x0 rrc: 282 type: IBT flags: 0x40000000000000 nid: local remote: 0xa7be06c38d0e1785 expref: -99 pid: 7069 timeout: 0 lvb_type: 0 Sep 8 07:24:34 mds0 kernel: LustreError: 7024:0:(service.c:2007:ptlrpc_server_handle_request()) @@@ Dropping timed-out request from 12345-12.0.4.221@tcp1: deadline 100:498s ago Sep 8 07:24:34 mds0 kernel: req@ffff8804f9b15400 x1573501614389256/t0(0) o101->8945f9c7-f71d-b71d-c7f7-b94c49ebfbec@12.0.4.221@tcp1:0/0 lens 576/0 e 0 to 0 dl 1504826176 ref 1 fl Interpret:/2/ffffffff rc 0/-1 Sep 8 07:24:34 mds0 kernel: LustreError: 7024:0:(service.c:2007:ptlrpc_server_handle_request()) Skipped 384 previous similar messages Sep 8 07:24:34 mds0 kernel: Lustre: 7024:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (100:498s); client may timeout. req@ffff8804f9b15400 x1573501614389256/t0(0) o101->8945f9c7-f71d-b71d-c7f7-b94c49ebfbec@12.0.4.221@tcp1:0/0 lens 576/0 e 0 to 0 dl 1504826176 ref 1 fl Interpret:/2/ffffffff rc 0/-1 Sep 8 07:24:34 mds0 kernel: LNet: Service thread pid 7342 completed after 23439.96s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). Sep 8 07:24:34 mds0 kernel: LNet: Skipped 7 previous similar messages Sep 8 07:24:34 mds0 kernel: LustreError: 7019:0:(ldlm_lockd.c:1335:ldlm_handle_enqueue0()) ### lock on destroyed export ffff8810123fac00 ns: mdt-THFS-MDT0000_UUID lock: ffff880296149100/0x81ad926eaa98e8a3 lrc: 3/0,0 mode: CR/CR res: [0x2000106a1:0x1:0x0].0 bits 0x9 rrc: 1 type: IBT flags: 0x50200000000000 nid: 12.0.3.132@tcp1 remote: 0xe36109090ca1e7c0 expref: 3 pid: 7019 timeout: 0 lvb_type: 0 Sep 8 07:24:34 mds0 kernel: LustreError: 7069:0:(ldlm_lockd.c:1268:ldlm_handle_enqueue0()) Skipped 20 previous similar messages Sep 8 07:27:03 mds0 kernel: Lustre: 6965:0:(service.c:1347:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply Sep 8 07:27:03 mds0 kernel: req@ffff8807a61d3800 x1573500611746392/t0(0) o101->1a6fe463-b56d-d61e-eff0-7e5f179edb7f@12.0.2.84@tcp1:0/0 lens 576/3384 e 0 to 0 dl 1504826828 ref 2 fl Interpret:/0/0 rc 0/0 Sep 8 07:27:03 mds0 kernel: Lustre: 6965:0:(service.c:1347:ptlrpc_at_send_early_reply()) Skipped 329 previous similar messages Sep 8 07:32:26 mds0 kernel: Lustre: THFS-MDT0000: Client c0e5af9f-e41c-015c-5b1e-7e6bff22467c (at 12.0.2.207@tcp1) reconnecting Sep 8 07:32:26 mds0 kernel: Lustre: Skipped 7816 previous similar messages Sep 8 07:32:26 mds0 kernel: Lustre: THFS-MDT0000: Client c0e5af9f-e41c-015c-5b1e-7e6bff22467c (at 12.0.2.207@tcp1) refused reconnection, still busy with 1 active RPCs Sep 8 07:32:26 mds0 kernel: Lustre: Skipped 7700 previous similar messages Sep 8 07:34:33 mds0 kernel: Lustre: lock timed out (enqueued at 1504826073, 1200s ago) Sep 8 07:34:33 mds0 kernel: Lustre: Skipped 20 previous similar messages Sep 8 07:34:35 mds0 kernel: LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) ### lock callback timer expired after 24041s: evicting client at 12.0.5.135@tcp1 ns: mdt-THFS-MDT0000_UUID lock: ffff880febd982c0/0x81ad926eaa89f697 lrc: 3/0,0 mode: PR/PR res: [0x200004f76:0x6698:0x0].0 bits 0x13 rrc: 509 type: IBT flags: 0x60200400000020 nid: 12.0.5.135@tcp1 remote: 0xa9463dd64e82959a expref: 14 pid: 7342 timeout: 4973001003 lvb_type: 0 Sep 8 07:34:35 mds0 kernel: LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) Skipped 7 previous similar messages Sep 8 07:34:35 mds0 kernel: Lustre: 7140:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (755:23286s); client may timeout. req@ffff88101b9c1c00 x1573501664952672/t236318044857(0) o101->5d6fbfba-7156-3837-a585-b9018b3a83cb@12.0.4.205@tcp1:0/0 lens 584/600 e 0 to 0 dl 1504803989 ref 1 fl Complete:/0/0 rc 0/0 Sep 8 07:34:35 mds0 kernel: LNet: Service thread pid 7054 completed after 22838.99s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). Sep 8 07:34:35 mds0 kernel: LNet: Skipped 62 previous similar messages Sep 8 07:34:35 mds0 kernel: LustreError: 7054:0:(ldlm_lockd.c:1268:ldlm_handle_enqueue0()) ### lock on disconnected export ffff88100a367800 ns: mdt-THFS-MDT0000_UUID lock: ffff880ffa2f5480/0x81ad926eaa9cdb35 lrc: 2/0,0 mode: --/CR res: [0x200004f76:0x65c8:0x0].0 bits 0x0 rrc: 260 type: IBT flags: 0x40000000000000 nid: local remote: 0x67013e050e01e4d4 expref: -99 pid: 7054 timeout: 0 lvb_type: 0 Sep 8 07:34:35 mds0 kernel: LustreError: 7054:0:(service.c:2007:ptlrpc_server_handle_request()) @@@ Dropping timed-out request from 12345-12.0.3.18@tcp1: deadline 100:487s ago Sep 8 07:34:35 mds0 kernel: req@ffff8810082a9400 x1573501098052332/t0(0) o101->c78f9251-b59a-b791-587b-8ce848a26696@12.0.3.18@tcp1:0/0 lens 576/0 e 0 to 0 dl 1504826788 ref 1 fl Interpret:/2/ffffffff rc 0/-1 Sep 8 07:34:35 mds0 kernel: LustreError: 7054:0:(service.c:2007:ptlrpc_server_handle_request()) Skipped 803 previous similar messages Sep 8 07:34:35 mds0 kernel: Lustre: 7140:0:(service.c:2039:ptlrpc_server_handle_request()) Skipped 1021 previous similar messages Sep 8 07:37:04 mds0 kernel: Lustre: 7162:0:(service.c:1347:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply Sep 8 07:37:04 mds0 kernel: req@ffff881065f23400 x1573493766143816/t0(0) o101->0bfb2368-b21e-df0b-6e3c-dedcc2937e14@12.0.3.206@tcp1:0/0 lens 576/3384 e 0 to 0 dl 1504827429 ref 2 fl Interpret:/0/0 rc 0/0 Sep 8 07:37:04 mds0 kernel: Lustre: 7162:0:(service.c:1347:ptlrpc_at_send_early_reply()) Skipped 14 previous similar messages Sep 8 07:42:26 mds0 kernel: Lustre: THFS-MDT0000: Client d4f27fc6-9221-7b10-0ef3-378f9aa6da44 (at 12.0.5.8@tcp1) reconnecting Sep 8 07:42:26 mds0 kernel: Lustre: Skipped 7932 previous similar messages Sep 8 07:42:26 mds0 kernel: Lustre: THFS-MDT0000: Client d4f27fc6-9221-7b10-0ef3-378f9aa6da44 (at 12.0.5.8@tcp1) refused reconnection, still busy with 1 active RPCs Sep 8 07:42:26 mds0 kernel: Lustre: Skipped 7876 previous similar messages Sep 8 07:44:34 mds0 kernel: Lustre: lock timed out (enqueued at 1504826674, 1200s ago) Sep 8 07:44:34 mds0 kernel: Lustre: Skipped 17 previous similar messages Sep 8 07:44:36 mds0 kernel: LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) ### lock callback timer expired after 22839s: evicting client at 12.0.2.31@tcp1 ns: mdt-THFS-MDT0000_UUID lock: ffff881063607b80/0x81ad926eaa8a094b lrc: 3/0,0 mode: PR/PR res: [0x200004f76:0x6698:0x0].0 bits 0x13 rrc: 458 type: IBT flags: 0x60200400000020 nid: 12.0.2.31@tcp1 remote: 0xee0df105f7ed015e expref: 25 pid: 7106 timeout: 4973602007 lvb_type: 0 Sep 8 07:44:36 mds0 kernel: LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) Skipped 60 previous similar messages Sep 8 07:44:36 mds0 kernel: Lustre: 6978:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (755:22203s); client may timeout. req@ffff8810375ed000 x1573501726522384/t236318033442(0) o101->2c16a83b-942c-93e1-f506-0e920e196bc3@12.0.5.129@tcp1:0/0 lens 584/600 e 0 to 0 dl 1504805673 ref 1 fl Complete:/2/0 rc 0/0 Sep 8 07:44:36 mds0 kernel: LNet: Service thread pid 7314 completed after 21636.99s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). Sep 8 07:44:36 mds0 kernel: LNet: Skipped 11 previous similar messages Sep 8 07:44:36 mds0 kernel: LustreError: 7135:0:(service.c:2007:ptlrpc_server_handle_request()) @@@ Dropping timed-out request from 12345-10.23.0.3@tcp: deadline 755:419s ago Sep 8 07:44:36 mds0 kernel: req@ffff881028006800 x1575765629238136/t0(0) o49->d5484d08-d8c1-745d-fd88-0707ed6bab48@10.23.0.3@tcp:0/0 lens 464/0 e 0 to 0 dl 1504827457 ref 1 fl Interpret:/0/ffffffff rc 0/-1 Sep 8 07:44:36 mds0 kernel: LustreError: 7135:0:(service.c:2007:ptlrpc_server_handle_request()) Skipped 22 previous similar messages Sep 8 07:44:36 mds0 kernel: LustreError: 7314:0:(ldlm_lockd.c:1268:ldlm_handle_enqueue0()) ### lock on disconnected export ffff880ffe2f8400 ns: mdt-THFS-MDT0000_UUID lock: ffff8810130db2c0/0x81ad926eaa9cecb5 lrc: 2/0,0 mode: --/CR res: [0x200004f76:0x652e:0x0].0 bits 0x0 rrc: 1225 type: IBT flags: 0x40000000000000 nid: local remote: 0xee0df105f7ed015e expref: -99 pid: 7314 timeout: 0 lvb_type: 0 Sep 8 07:44:36 mds0 kernel: LustreError: 7314:0:(ldlm_lockd.c:1268:ldlm_handle_enqueue0()) Skipped 59 previous similar messages Sep 8 07:44:36 mds0 kernel: Lustre: 6978:0:(service.c:2039:ptlrpc_server_handle_request()) Skipped 572 previous similar messages Sep 8 07:47:05 mds0 kernel: Lustre: 7106:0:(service.c:1347:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply Sep 8 07:47:05 mds0 kernel: req@ffff88103a168800 x1573501793971844/t0(0) o101->f02d1070-a02c-47bf-7bd8-09e58ca984b2@12.0.5.90@tcp1:0/0 lens 576/3384 e 0 to 0 dl 1504828030 ref 2 fl Interpret:/0/0 rc 0/0 Sep 8 07:47:05 mds0 kernel: Lustre: 7106:0:(service.c:1347:ptlrpc_at_send_early_reply()) Skipped 250 previous similar messages Sep 8 07:52:26 mds0 kernel: Lustre: THFS-MDT0000: Client 55f2c44f-0678-cb2b-c934-8772919c83ed (at 12.0.4.214@tcp1) reconnecting Sep 8 07:52:26 mds0 kernel: Lustre: Skipped 8198 previous similar messages Sep 8 07:52:26 mds0 kernel: Lustre: THFS-MDT0000: Client 55f2c44f-0678-cb2b-c934-8772919c83ed (at 12.0.4.214@tcp1) refused reconnection, still busy with 1 active RPCs Sep 8 07:52:26 mds0 kernel: Lustre: Skipped 8120 previous similar messages Sep 8 07:54:35 mds0 kernel: Lustre: lock timed out (enqueued at 1504827275, 1200s ago) Sep 8 07:54:35 mds0 kernel: Lustre: Skipped 58 previous similar messages Sep 8 07:54:37 mds0 kernel: LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) ### lock callback timer expired after 22238s: evicting client at 12.0.3.29@tcp1 ns: mdt-THFS-MDT0000_UUID lock: ffff881003415d40/0x81ad926eaa8a1d56 lrc: 3/0,0 mode: PR/PR res: [0x200004f76:0x6698:0x0].0 bits 0x13 rrc: 458 type: IBT flags: 0x60200400000020 nid: 12.0.3.29@tcp1 remote: 0x4f30f508402fbeca expref: 18 pid: 7314 timeout: 4974203002 lvb_type: 0 Sep 8 07:54:37 mds0 kernel: LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) Skipped 10 previous similar messages Sep 8 07:54:37 mds0 kernel: Lustre: 7335:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (755:20417s); client may timeout. req@ffff8810099d2400 x1573501148369312/t0(0) o101->1510eccf-9728-8d7a-cc1a-df038ef8d7a1@12.0.3.136@tcp1:0/0 lens 576/536 e 0 to 0 dl 1504808060 ref 1 fl Complete:/0/0 rc 0/0 Sep 8 07:54:37 mds0 kernel: LNet: Service thread pid 7181 completed after 19833.99s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). Sep 8 07:54:37 mds0 kernel: LNet: Service thread pid 6963 completed after 21035.98s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). Sep 8 07:54:37 mds0 kernel: LNet: Skipped 11 previous similar messages Sep 8 07:54:37 mds0 kernel: LNet: Skipped 11 previous similar messages Sep 8 07:54:37 mds0 kernel: LustreError: 2299:0:(service.c:2007:ptlrpc_server_handle_request()) @@@ Dropping timed-out request from 12345-12.0.7.1@tcp1: deadline 755:1025s ago Sep 8 07:54:37 mds0 kernel: req@ffff8807b07cf400 x1573316797825008/t0(0) o101->d8634e89-052c-07a7-699b-54bf661099de@12.0.7.1@tcp1:0/0 lens 576/0 e 0 to 0 dl 1504827452 ref 1 fl Interpret:/2/ffffffff rc 0/-1 Sep 8 07:54:37 mds0 kernel: LustreError: 2299:0:(service.c:2007:ptlrpc_server_handle_request()) Skipped 494 previous similar messages Sep 8 07:54:37 mds0 kernel: LustreError: 7097:0:(ldlm_lockd.c:1268:ldlm_handle_enqueue0()) ### lock on disconnected export ffff880ff5a15000 ns: mdt-THFS-MDT0000_UUID lock: ffff88100b362d40/0x81ad926eaa9d230a lrc: 2/0,0 mode: --/CR res: [0x2000038f6:0x1:0x0].0 bits 0x0 rrc: 437 type: IBT flags: 0x40000000000000 nid: local remote: 0xa98966c613553b10 expref: -99 pid: 7097 timeout: 0 lvb_type: 0 Sep 8 07:54:37 mds0 kernel: LustreError: 7097:0:(ldlm_lockd.c:1268:ldlm_handle_enqueue0()) Skipped 28 previous similar messages Sep 8 07:54:37 mds0 kernel: Lustre: 7335:0:(service.c:2039:ptlrpc_server_handle_request()) Skipped 648 previous similar messages Sep 8 07:57:06 mds0 kernel: Lustre: 7255:0:(service.c:1347:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply Sep 8 07:57:06 mds0 kernel: req@ffff880ffbd76400 x1573501480315868/t0(0) o101->4064582c-24d0-6eaa-deee-6a42025f8e81@12.0.4.0@tcp1:0/0 lens 576/3384 e 0 to 0 dl 1504828631 ref 2 fl Interpret:/0/0 rc 0/0 Sep 8 07:57:06 mds0 kernel: Lustre: 7255:0:(service.c:1347:ptlrpc_at_send_early_reply()) Skipped 72 previous similar messages Sep 8 08:02:26 mds0 kernel: Lustre: THFS-MDT0000: Client 808111f6-fc85-f093-b72a-3dd03fbb841b (at 12.0.2.145@tcp1) reconnecting Sep 8 08:02:26 mds0 kernel: Lustre: Skipped 7901 previous similar messages Sep 8 08:02:26 mds0 kernel: Lustre: THFS-MDT0000: Client 808111f6-fc85-f093-b72a-3dd03fbb841b (at 12.0.2.145@tcp1) refused reconnection, still busy with 1 active RPCs Sep 8 08:02:26 mds0 kernel: Lustre: Skipped 7854 previous similar messages Sep 8 08:04:36 mds0 kernel: Lustre: lock timed out (enqueued at 1504827876, 1200s ago) Sep 8 08:04:36 mds0 kernel: Lustre: Skipped 5 previous similar messages Sep 8 08:04:38 mds0 kernel: LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) ### lock callback timer expired after 21036s: evicting client at 12.0.3.136@tcp1 ns: mdt-THFS-MDT0000_UUID lock: ffff881006db9100/0x81ad926eaa8a4009 lrc: 3/0,0 mode: PR/PR res: [0x200004f76:0x6698:0x0].0 bits 0x13 rrc: 469 type: IBT flags: 0x60200400000020 nid: 12.0.3.136@tcp1 remote: 0xddd327fca1cfc6ab expref: 18 pid: 7335 timeout: 4974804002 lvb_type: 0 Sep 8 08:04:38 mds0 kernel: LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) Skipped 10 previous similar messages Sep 8 08:04:38 mds0 kernel: Lustre: 7007:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (755:11266s); client may timeout. req@ffff8805ea4c6800 x1573500738748516/t0(0) o101->583c0ad0-cecb-82fc-5446-2c422ace1d2e@12.0.2.69@tcp1:0/0 lens 576/536 e 0 to 0 dl 1504817812 ref 1 fl Complete:/0/0 rc 0/0 Sep 8 08:04:38 mds0 kernel: LNet: Service thread pid 7200 completed after 12622.00s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). Sep 8 08:04:38 mds0 kernel: LNet: Skipped 22 previous similar messages Sep 8 08:04:38 mds0 kernel: LustreError: 7200:0:(ldlm_lockd.c:1268:ldlm_handle_enqueue0()) ### lock on disconnected export ffff8807d0ea3000 ns: mdt-THFS-MDT0000_UUID lock: ffff880241e3a680/0x81ad926eaa9e9f64 lrc: 2/0,0 mode: --/CR res: [0x200004f3f:0x1fbd9:0x0].0 bits 0x0 rrc: 270 type: IBT flags: 0x40000000000000 nid: local remote: 0x195ed68f730a1d52 expref: -99 pid: 7200 timeout: 0 lvb_type: 0 Sep 8 08:04:38 mds0 kernel: LustreError: 7200:0:(ldlm_lockd.c:1268:ldlm_handle_enqueue0()) Skipped 29 previous similar messages Sep 8 08:04:38 mds0 kernel: LustreError: 7200:0:(service.c:2007:ptlrpc_server_handle_request()) @@@ Dropping timed-out request from 12345-12.0.7.1@tcp1: deadline 100:499s ago Sep 8 08:04:38 mds0 kernel: req@ffff8805ada8c000 x1573316798186936/t0(0) o38->@:0/0 lens 400/0 e 0 to 0 dl 1504828579 ref 1 fl Interpret:/0/ffffffff rc 0/-1 Sep 8 08:04:38 mds0 kernel: LustreError: 7200:0:(service.c:2007:ptlrpc_server_handle_request()) Skipped 678 previous similar messages Sep 8 08:04:38 mds0 kernel: Lustre: 7007:0:(service.c:2039:ptlrpc_server_handle_request()) Skipped 253 previous similar messages Sep 8 08:05:54 mds0 kernel: Lustre: THFS-MDT0000: haven't heard from client 40e2be8e-1ff1-0b4f-e2c2-42e275bdd487 (at 12.0.4.143@tcp1) in 3081 seconds. I think it's dead, and I am evicting it. exp ffff88025e64e400, cur 1504829154 expire 1504829004 last 1504826073 Sep 8 08:05:54 mds0 kernel: Lustre: Skipped 4 previous similar messages Sep 8 08:07:07 mds0 kernel: Lustre: 6963:0:(service.c:1347:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply Sep 8 08:07:07 mds0 kernel: req@ffff880ff3e3cc00 x1573501141057944/t0(0) o101->abbd088a-87ea-2452-01e3-d3d2de50b85d@12.0.3.31@tcp1:0/0 lens 576/3384 e 0 to 0 dl 1504829232 ref 2 fl Interpret:/0/0 rc 0/0 Sep 8 08:07:07 mds0 kernel: Lustre: 6963:0:(service.c:1347:ptlrpc_at_send_early_reply()) Skipped 29 previous similar messages Sep 8 08:08:13 mds0 kernel: Lustre: MGS: non-config logname received: params Sep 8 08:08:13 mds0 kernel: Lustre: Skipped 9 previous similar messages Sep 8 08:11:04 mds0 kernel: Lustre: 24791:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1504829458/real 1504829458] req@ffff88046ca0dc00 x1577214557311512/t0(0) o39->THFS-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1504829464 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1 Sep 8 08:11:04 mds0 kernel: LustreError: 24791:0:(obd_mount_server.c:1420:server_put_super()) THFS-MDT0000: failed to disconnect lwp. (rc=-110) Sep 8 08:11:04 mds0 kernel: Lustre: Failing over THFS-MDT0000 Sep 8 08:11:04 mds0 kernel: LustreError: 7026:0:(ldlm_lockd.c:1335:ldlm_handle_enqueue0()) ### lock on destroyed export ffff88100a03a400 ns: mdt-THFS-MDT0000_UUID lock: ffff8810204f9300/0x81ad926eaa8cfd67 lrc: 3/0,0 mode: PR/PR res: [0x200004f76:0x6698:0x0].0 bits 0x13 rrc: 443 type: IBT flags: 0x50200400000020 nid: 12.0.2.30@tcp1 remote: 0xb78b136b54a4435 expref: 2 pid: 7026 timeout: 0 lvb_type: 0 Sep 8 08:11:04 mds0 kernel: LustreError: 7026:0:(ldlm_lockd.c:1335:ldlm_handle_enqueue0()) Skipped 1 previous similar message Sep 8 08:11:04 mds0 kernel: Lustre: THFS-MDT0000: Not available for connect from 12.0.3.65@tcp1 (stopping) Sep 8 08:11:04 mds0 kernel: Lustre: Skipped 121 previous similar messages Sep 8 08:11:06 mds0 kernel: LustreError: 3987:0:(client.c:1079:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ffff88100c7e8000 x1577214557311756/t0(0) o13->THFS-OST000a-osc-MDT0000@10.23.0.73@tcp:7/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1 Sep 8 08:11:06 mds0 kernel: LustreError: 3987:0:(client.c:1079:ptlrpc_import_delay_req()) Skipped 2 previous similar messages Sep 8 08:11:07 mds0 kernel: LustreError: 3991:0:(client.c:1079:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ffff88101b211400 x1577214557311864/t0(0) o13->THFS-OST0007-osc-MDT0000@10.23.0.72@tcp:7/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1 Sep 8 08:11:07 mds0 kernel: LustreError: 3991:0:(client.c:1079:ptlrpc_import_delay_req()) Skipped 26 previous similar messages Sep 8 08:11:08 mds0 kernel: Lustre: THFS-MDT0000: Not available for connect from 12.0.4.14@tcp1 (stopping) Sep 8 08:11:08 mds0 kernel: Lustre: Skipped 298 previous similar messages Sep 8 08:11:17 mds0 kernel: Lustre: THFS-MDT0000: Not available for connect from 12.0.3.221@tcp1 (stopping) Sep 8 08:11:17 mds0 kernel: Lustre: Skipped 111 previous similar messages Sep 8 08:11:31 mds0 kernel: LustreError: 137-5: THFS-MDT0000_UUID: not available for connect from 12.0.2.150@tcp1 (no target). If you are running an HA pair check that the target is mounted on the other server. Sep 8 08:11:31 mds0 kernel: LustreError: Skipped 13 previous similar messages Sep 8 08:11:35 mds0 kernel: LustreError: 137-5: THFS-MDT0000_UUID: not available for connect from 12.0.4.215@tcp1 (no target). If you are running an HA pair check that the target is mounted on the other server. Sep 8 08:11:35 mds0 kernel: LustreError: Skipped 88 previous similar messages Sep 8 08:11:37 mds0 kernel: Lustre: 24791:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1504829491/real 1504829491] req@ffff881015139000 x1577214557311892/t0(0) o251->MGC10.23.0.64@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1504829497 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1 Sep 8 08:11:43 mds0 kernel: LustreError: 137-5: THFS-MDT0000_UUID: not available for connect from 12.0.7.192@tcp1 (no target). If you are running an HA pair check that the target is mounted on the other server. Sep 8 08:11:43 mds0 kernel: LustreError: Skipped 168 previous similar messages Sep 8 08:11:51 mds0 kernel: Lustre: server umount THFS-MDT0000 complete Sep 8 08:12:27 mds0 kernel: LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. quota=on. Opts: Sep 8 08:12:27 mds0 kernel: Lustre: THFS-MDT0000: used disk, loading Sep 8 08:12:28 mds0 kernel: Lustre: 24916:0:(mdt_lproc.c:379:lprocfs_wr_identity_upcall()) THFS-MDT0000: disable "identity_upcall" with ACL enabled maybe cause unexpected "EACCESS" Sep 8 08:12:28 mds0 kernel: Lustre: MGS: non-config logname received: params Sep 8 08:12:28 mds0 kernel: LustreError: 11-0: THFS-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11. Sep 8 08:12:30 mds0 kernel: Lustre: THFS-MDD0000: changelog on Sep 8 08:12:30 mds0 kernel: Lustre: THFS-MDT0000: Will be in recovery for at least 5:00, or until 496 clients reconnect Sep 8 08:12:31 mds0 kernel: Lustre: THFS-MDT0000: Denying connection for new client 8a75a544-b2c4-e80a-5e7b-223e4f09dd49 (at 12.0.5.130@tcp1), waiting for all 496 known clients (13 recovered, 3 in progress, and 0 evicted) to recover in 14:34 Sep 8 08:12:31 mds0 kernel: Lustre: Skipped 28 previous similar messages Sep 8 08:12:36 mds0 kernel: Lustre: MGS: non-config logname received: params Sep 8 08:12:36 mds0 kernel: Lustre: Skipped 35 previous similar messages Sep 8 08:12:52 mds0 kernel: Lustre: MGS: non-config logname received: params Sep 8 08:12:52 mds0 kernel: Lustre: Skipped 376 previous similar messages Sep 8 08:12:58 mds0 kernel: Lustre: THFS-MDT0000: Recovery over after 0:28, of 496 clients 496 recovered and 0 were evicted. Sep 8 09:24:00 mds0 kernel: Lustre: MGS: haven't heard from client 9d22fe83-207f-101e-bea9-a3cadea3a659 (at 12.0.2.66@tcp1) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881016009800, cur 1504833840 expire 1504833690 last 1504833613 Sep 8 09:24:00 mds0 kernel: Lustre: Skipped 5 previous similar messages Sep 8 09:24:20 mds0 kernel: Lustre: THFS-MDT0000: haven't heard from client 833faabe-1727-671c-00c6-93dce877c09a (at 12.0.2.67@tcp1) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881022645c00, cur 1504833860 expire 1504833710 last 1504833633 Sep 8 09:24:20 mds0 kernel: Lustre: Skipped 5 previous similar messages Sep 8 09:24:58 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, tx_dropped, rmt_nic_id=66, skb_q_idx=81 Sep 8 09:24:58 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, tx_dropped, rmt_nic_id=66, skb_q_idx=449 Sep 8 09:41:32 mds0 kernel: Lustre: MGS: Client 9bdf1f2e-419f-1c92-a1c2-fbd87cb2a449 (at 12.0.6.68@tcp1) reconnecting Sep 8 09:41:32 mds0 kernel: Lustre: THFS-MDT0000: Client aa20e87f-60ef-d4c3-b636-52ee928340a3 (at 12.0.6.133@tcp1) refused reconnection, still busy with 1 active RPCs Sep 8 09:41:32 mds0 kernel: Lustre: Skipped 6016 previous similar messages Sep 8 09:41:32 mds0 kernel: Lustre: Skipped 6129 previous similar messages Sep 8 09:41:37 mds0 kernel: Lustre: MGS: non-config logname received: params Sep 8 09:41:37 mds0 kernel: Lustre: Skipped 187 previous similar messages Sep 8 09:41:41 mds0 kernel: Lustre: MGS: non-config logname received: params Sep 8 09:41:41 mds0 kernel: Lustre: Skipped 84 previous similar messages Sep 8 09:41:50 mds0 kernel: Lustre: MGS: non-config logname received: params Sep 8 09:41:50 mds0 kernel: Lustre: Skipped 24 previous similar messages Sep 8 09:41:52 mds0 kernel: Lustre: MGS: Export ffff8810000c1400 already connecting from 12.0.7.5@tcp1 Sep 8 09:43:06 mds0 kernel: Lustre: MGS: Client 20411448-e978-8a77-ef45-37d0318aa4e0 (at 12.0.7.70@tcp1) reconnecting Sep 8 09:43:06 mds0 kernel: Lustre: MGS: Client b4c98678-6d68-c286-7564-3a6e5e436a01 (at 12.0.6.138@tcp1) refused reconnection, still busy with 1 active RPCs Sep 8 09:43:06 mds0 kernel: Lustre: Skipped 143 previous similar messages Sep 8 09:43:06 mds0 kernel: Lustre: Skipped 1126 previous similar messages Sep 8 09:43:11 mds0 kernel: Lustre: MGS: non-config logname received: params Sep 8 09:43:11 mds0 kernel: Lustre: Skipped 148 previous similar messages Sep 8 09:43:45 mds0 kernel: Lustre: MGS: non-config logname received: params Sep 8 09:43:45 mds0 kernel: Lustre: Skipped 578 previous similar messages Sep 8 09:44:10 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, rx_dropped, rmt_nic_id=1285, skb_q_idx=122 Sep 8 09:44:10 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, rx_dropped, rmt_nic_id=1285, skb_q_idx=123 Sep 8 09:44:10 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, rx_dropped, rmt_nic_id=1221, skb_q_idx=130 Sep 8 09:44:10 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, rx_dropped, rmt_nic_id=1285, skb_q_idx=131 Sep 8 09:44:10 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, rx_dropped, rmt_nic_id=1221, skb_q_idx=135 Sep 8 09:45:19 mds0 kernel: LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) ### lock callback timer expired after 133s: evicting client at 12.0.3.25@tcp1 ns: mdt-THFS-MDT0000_UUID lock: ffff8807d7710280/0x81ad926eb027418d lrc: 3/0,0 mode: PR/PR res: [0x200004f76:0x6698:0x0].0 bits 0x13 rrc: 462 type: IBT flags: 0x60200400000020 nid: 12.0.3.25@tcp1 remote: 0x200481a75f963eae expref: 139 pid: 25158 timeout: 4980845617 lvb_type: 0 Sep 8 09:45:19 mds0 kernel: LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) Skipped 24 previous similar messages Sep 8 09:45:36 mds0 kernel: Lustre: THFS-MDT0000: Client c89ad29f-7a35-b7af-306b-9c8746e6b97c (at 12.0.2.217@tcp1) reconnecting Sep 8 09:45:36 mds0 kernel: Lustre: Skipped 1576 previous similar messages Sep 8 09:45:36 mds0 kernel: Lustre: THFS-MDT0000: Client c89ad29f-7a35-b7af-306b-9c8746e6b97c (at 12.0.2.217@tcp1) refused reconnection, still busy with 1 active RPCs Sep 8 09:45:36 mds0 kernel: Lustre: Skipped 592 previous similar messages Sep 8 09:45:41 mds0 kernel: Lustre: 25125:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (100:27s); client may timeout. req@ffff880826d84000 x1573500616447600/t0(0) o101->d152867a-163a-2037-2d97-a6f6b09eb1e9@12.0.2.74@tcp1:0/0 lens 576/0 e 0 to 0 dl 1504835114 ref 1 fl Interpret:/2/ffffffff rc 0/-1 Sep 8 09:45:41 mds0 kernel: Lustre: 25125:0:(service.c:2039:ptlrpc_server_handle_request()) Skipped 983 previous similar messages Sep 8 09:45:41 mds0 kernel: LustreError: 24881:0:(ldlm_lockd.c:1268:ldlm_handle_enqueue0()) ### lock on disconnected export ffff8810660a7000 ns: mdt-THFS-MDT0000_UUID lock: ffff880feee1e480/0x81ad926eb02d9e2c lrc: 2/0,0 mode: --/CR res: [0x200004f76:0x65c8:0x0].0 bits 0x0 rrc: 658 type: IBT flags: 0x40000000000000 nid: local remote: 0xf212b25b65afdcf9 expref: -99 pid: 24881 timeout: 0 lvb_type: 0 Sep 8 09:45:41 mds0 kernel: LustreError: 24881:0:(ldlm_lockd.c:1268:ldlm_handle_enqueue0()) Skipped 68 previous similar messages Sep 8 09:46:18 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, tx_dropped, rmt_nic_id=405, skb_q_idx=17 Sep 8 09:46:18 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, tx_dropped, rmt_nic_id=1157, skb_q_idx=21 Sep 8 09:46:18 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, tx_dropped, rmt_nic_id=386, skb_q_idx=47 Sep 8 09:46:18 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, tx_dropped, rmt_nic_id=406, skb_q_idx=61 Sep 8 09:46:18 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, tx_dropped, rmt_nic_id=405, skb_q_idx=74 Sep 8 09:46:18 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, tx_dropped, rmt_nic_id=1157, skb_q_idx=75 Sep 8 09:46:18 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, tx_dropped, rmt_nic_id=386, skb_q_idx=105 Sep 8 09:46:18 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, tx_dropped, rmt_nic_id=405, skb_q_idx=122 Sep 8 09:46:18 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, tx_dropped, rmt_nic_id=1157, skb_q_idx=123 Sep 8 09:46:18 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, tx_dropped, rmt_nic_id=406, skb_q_idx=127 Sep 8 09:46:18 mds0 kernel: Lustre: MGS: non-config logname received: params Sep 8 09:46:18 mds0 kernel: Lustre: Skipped 7 previous similar messages Sep 8 09:46:26 mds0 kernel: Lustre: lock timed out (enqueued at 1504834986, 200s ago) Sep 8 09:46:26 mds0 kernel: Lustre: Skipped 13 previous similar messages Sep 8 09:49:34 mds0 kernel: LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) ### lock callback timer expired after 388s: evicting client at 12.0.5.7@tcp1 ns: mdt-THFS-MDT0000_UUID lock: ffff88083c29cb40/0x81ad926eb0287214 lrc: 3/0,0 mode: PR/PR res: [0x200004f76:0x6698:0x0].0 bits 0x13 rrc: 529 type: IBT flags: 0x60200400000020 nid: 12.0.5.7@tcp1 remote: 0xfa00023f5dee89ba expref: 128 pid: 25033 timeout: 4981100273 lvb_type: 0 Sep 8 09:49:34 mds0 kernel: LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) Skipped 6 previous similar messages Sep 8 09:49:34 mds0 kernel: LNet: Service thread pid 25322 completed after 387.64s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). Sep 8 09:49:34 mds0 kernel: LustreError: 25363:0:(service.c:2007:ptlrpc_server_handle_request()) @@@ Dropping timed-out request from 12345-12.0.3.73@tcp1: deadline 100:133s ago Sep 8 09:49:34 mds0 kernel: req@ffff88101a426c00 x1573501199775676/t0(0) o38->@:0/0 lens 400/0 e 0 to 0 dl 1504835241 ref 1 fl Interpret:/0/ffffffff rc 0/-1 Sep 8 09:49:34 mds0 kernel: LustreError: 25363:0:(service.c:2007:ptlrpc_server_handle_request()) Skipped 604 previous similar messages Sep 8 09:49:34 mds0 kernel: Lustre: 25363:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (100:133s); client may timeout. req@ffff88101a426c00 x1573501199775676/t0(0) o38->@:0/0 lens 400/0 e 0 to 0 dl 1504835241 ref 1 fl Interpret:/0/ffffffff rc 0/-1 Sep 8 09:49:34 mds0 kernel: Lustre: 25363:0:(service.c:2039:ptlrpc_server_handle_request()) Skipped 2 previous similar messages Sep 8 09:49:34 mds0 kernel: LNet: Skipped 413 previous similar messages Sep 8 09:50:12 mds0 kernel: Lustre: MGS: non-config logname received: params Sep 8 09:50:12 mds0 kernel: Lustre: Skipped 12 previous similar messages Sep 8 09:50:37 mds0 kernel: Lustre: THFS-MDT0000: Client aa20e87f-60ef-d4c3-b636-52ee928340a3 (at 12.0.6.133@tcp1) reconnecting Sep 8 09:50:37 mds0 kernel: Lustre: Skipped 1122 previous similar messages Sep 8 09:50:37 mds0 kernel: Lustre: THFS-MDT0000: Client aa20e87f-60ef-d4c3-b636-52ee928340a3 (at 12.0.6.133@tcp1) refused reconnection, still busy with 1 active RPCs Sep 8 09:50:37 mds0 kernel: Lustre: Skipped 1112 previous similar messages Sep 8 09:51:57 mds0 kernel: Lustre: lock timed out (enqueued at 1504835119, 398s ago) Sep 8 09:51:57 mds0 kernel: Lustre: lock timed out (enqueued at 1504835119, 398s ago) Sep 8 09:51:57 mds0 kernel: Lustre: Skipped 259 previous similar messages Sep 8 09:52:42 mds0 kernel: __ratelimit: 12 callbacks suppressed Sep 8 09:52:42 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, tx_dropped, rmt_nic_id=1099, skb_q_idx=69 Sep 8 09:52:42 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, tx_dropped, rmt_nic_id=1099, skb_q_idx=90 Sep 8 09:52:42 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, tx_dropped, rmt_nic_id=1099, skb_q_idx=102 Sep 8 09:52:42 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, tx_dropped, rmt_nic_id=1221, skb_q_idx=189 Sep 8 09:52:42 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, tx_dropped, rmt_nic_id=1099, skb_q_idx=222 Sep 8 09:52:42 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, tx_dropped, rmt_nic_id=1285, skb_q_idx=323 Sep 8 09:52:42 mds0 kernel: Lustre: 28359:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1504835556/real 1504835556] req@ffff88086c8cfc00 x1577214557697308/t0(0) o39->THFS-MDT0000-lwp-MDT0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1504835562 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1 Sep 8 09:52:42 mds0 kernel: LustreError: 28359:0:(obd_mount_server.c:1420:server_put_super()) THFS-MDT0000: failed to disconnect lwp. (rc=-110) Sep 8 09:52:42 mds0 kernel: Lustre: Failing over THFS-MDT0000 Sep 8 09:52:42 mds0 kernel: LustreError: 25202:0:(ldlm_lockd.c:1335:ldlm_handle_enqueue0()) ### lock on destroyed export ffff880fffb47400 ns: mdt-THFS-MDT0000_UUID lock: ffff8805502af500/0x81ad926eb02d7fe0 lrc: 3/0,0 mode: PR/PR res: [0x200004f76:0x6698:0x0].0 bits 0x13 rrc: 534 type: IBT flags: 0x50200400000020 nid: 12.0.2.213@tcp1 remote: 0xe9dc209c6b0c7bbf expref: 3 pid: 25202 timeout: 0 lvb_type: 0 Sep 8 09:52:42 mds0 kernel: LNet: Service thread pid 24996 completed after 421.14s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). Sep 8 09:52:42 mds0 kernel: LustreError: 24996:0:(service.c:2007:ptlrpc_server_handle_request()) @@@ Dropping timed-out request from 12345-12.0.2.72@tcp1: deadline 100:321s ago Sep 8 09:52:42 mds0 kernel: req@ffff881023642800 x1573500614918812/t0(0) o38->@:0/0 lens 400/0 e 0 to 0 dl 1504835241 ref 1 fl Interpret:/0/ffffffff rc 0/-1 Sep 8 09:52:42 mds0 kernel: Lustre: 24996:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (100:321s); client may timeout. req@ffff881023642800 x1573500614918812/t0(0) o38->@:0/0 lens 400/0 e 0 to 0 dl 1504835241 ref 1 fl Interpret:/0/ffffffff rc 0/-1 Sep 8 09:52:42 mds0 kernel: Lustre: THFS-MDT0000: Not available for connect from 12.0.2.72@tcp1 (stopping) Sep 8 09:52:42 mds0 kernel: Lustre: Skipped 249 previous similar messages Sep 8 09:52:42 mds0 kernel: LustreError: 25202:0:(ldlm_lockd.c:1335:ldlm_handle_enqueue0()) Skipped 424 previous similar messages Sep 8 09:52:43 mds0 kernel: LustreError: 3995:0:(client.c:1079:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ffff881017b7b800 x1577214557697552/t0(0) o13->THFS-OST0007-osc-MDT0000@10.23.0.72@tcp:7/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1 Sep 8 09:52:43 mds0 kernel: LustreError: 3995:0:(client.c:1079:ptlrpc_import_delay_req()) Skipped 5 previous similar messages Sep 8 09:52:44 mds0 kernel: Lustre: THFS-MDT0000: Not available for connect from 12.0.2.23@tcp1 (stopping) Sep 8 09:52:44 mds0 kernel: Lustre: Skipped 276 previous similar messages Sep 8 09:52:48 mds0 kernel: LustreError: 137-5: THFS-MDT0000_UUID: not available for connect from 12.0.2.221@tcp1 (no target). If you are running an HA pair check that the target is mounted on the other server. Sep 8 09:52:48 mds0 kernel: LustreError: Skipped 212 previous similar messages Sep 8 09:52:50 mds0 kernel: LustreError: 137-5: THFS-MDT0000_UUID: not available for connect from 12.0.4.221@tcp1 (no target). If you are running an HA pair check that the target is mounted on the other server. Sep 8 09:52:50 mds0 kernel: LustreError: Skipped 21 previous similar messages Sep 8 09:52:54 mds0 kernel: Lustre: 28359:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1504835568/real 1504835568] req@ffff8807df1b4c00 x1577214557697684/t0(0) o251->MGC10.23.0.64@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1504835574 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1 Sep 8 09:52:55 mds0 kernel: LustreError: 137-5: THFS-MDT0000_UUID: not available for connect from 12.0.5.13@tcp1 (no target). If you are running an HA pair check that the target is mounted on the other server. Sep 8 09:52:55 mds0 kernel: LustreError: Skipped 52 previous similar messages Sep 8 09:52:55 mds0 kernel: Lustre: server umount THFS-MDT0000 complete Sep 8 09:53:15 mds0 kernel: EXT4-fs (dm-1): Couldn't mount because of unsupported optional features (1000) Sep 8 09:55:51 mds0 kernel: LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. quota=on. Opts: Sep 8 09:55:52 mds0 kernel: Lustre: THFS-MDT0000: used disk, loading Sep 8 09:55:52 mds0 kernel: Lustre: 28643:0:(mdt_lproc.c:379:lprocfs_wr_identity_upcall()) THFS-MDT0000: disable "identity_upcall" with ACL enabled maybe cause unexpected "EACCESS" Sep 8 09:55:52 mds0 kernel: Lustre: MGS: non-config logname received: params Sep 8 09:55:52 mds0 kernel: Lustre: Skipped 11 previous similar messages Sep 8 09:55:52 mds0 kernel: LustreError: 11-0: THFS-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11. Sep 8 09:55:54 mds0 kernel: Lustre: THFS-MDD0000: changelog on Sep 8 09:55:56 mds0 kernel: Lustre: THFS-MDT0000: Will be in recovery for at least 5:00, or until 532 clients reconnect Sep 8 09:56:06 mds0 kernel: Lustre: THFS-MDT0000: Denying connection for new client eb97435d-f441-f39f-efe4-2725f6be06c1 (at 12.0.5.7@tcp1), waiting for all 532 known clients (61 recovered, 15 in progress, and 0 evicted) to recover in 14:15 Sep 8 09:56:06 mds0 kernel: Lustre: Skipped 62 previous similar messages Sep 8 09:56:43 mds0 kernel: Lustre: THFS-MDT0000: Denying connection for new client 8945f9c7-f71d-b71d-c7f7-b94c49ebfbec (at 12.0.4.221@tcp1), waiting for all 532 known clients (174 recovered, 73 in progress, and 0 evicted) to recover in 13:38 Sep 8 09:56:43 mds0 kernel: Lustre: Skipped 2 previous similar messages Sep 8 09:57:01 mds0 kernel: Lustre: THFS-MDT0000: Client 602f9c6c-383d-95be-8852-7b8a2fc6ce61 (at 12.0.4.151@tcp1) reconnecting, waiting for 532 clients in recovery for 13:20 Sep 8 09:57:01 mds0 kernel: Lustre: Skipped 197 previous similar messages Sep 8 09:57:03 mds0 kernel: Lustre: THFS-MDT0000: Denying connection for new client ff8fdaac-1bd2-44fa-a2fb-46022b08a00b (at 12.0.2.154@tcp1), waiting for all 532 known clients (484 recovered, 46 in progress, and 0 evicted) to recover in 13:18 Sep 8 09:57:03 mds0 kernel: Lustre: Skipped 8 previous similar messages Sep 8 09:57:39 mds0 kernel: Lustre: THFS-MDT0000: Recovery over after 1:43, of 532 clients 532 recovered and 0 were evicted. Sep 8 09:59:06 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, rx_dropped, rmt_nic_id=1285, skb_q_idx=419 Sep 8 09:59:06 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, rx_dropped, rmt_nic_id=1285, skb_q_idx=420 Sep 8 09:59:06 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, rx_dropped, rmt_nic_id=1285, skb_q_idx=421 Sep 8 09:59:06 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, rx_dropped, rmt_nic_id=1285, skb_q_idx=422 Sep 8 09:59:06 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, rx_dropped, rmt_nic_id=1285, skb_q_idx=423 Sep 8 09:59:06 mds0 kernel: nio_dev 0000:82:00.0: gn0: _timer, rx_dropped, rmt_nic_id=1285, skb_q_idx=461