Mar 22 04:02:05 ALPL401 syslogd 1.4.1: restart.
Mar 22 04:10:40 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 22 04:22:01 ALPL401 kernel: md: syncing RAID array md0
Mar 22 04:22:01 ALPL401 kernel: md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
Mar 22 04:22:01 ALPL401 kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.
Mar 22 04:22:01 ALPL401 kernel: md: using 128k window, over a total of 521984 blocks.
Mar 22 04:22:01 ALPL401 kernel: md: delaying resync of md1 until md0 has finished resync (they share one or more physical units)
Mar 22 04:22:07 ALPL401 kernel: md: md0: sync done.
Mar 22 04:22:07 ALPL401 kernel: md: syncing RAID array md1
Mar 22 04:22:07 ALPL401 kernel: md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
Mar 22 04:22:07 ALPL401 kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.
Mar 22 04:22:07 ALPL401 kernel: md: using 128k window, over a total of 4192896 blocks.
Mar 22 04:22:07 ALPL401 kernel: RAID1 conf printout:
Mar 22 04:22:07 ALPL401 kernel: --- wd:2 rd:2
Mar 22 04:22:07 ALPL401 kernel: disk 0, wo:0, o:1, dev:sda1
Mar 22 04:22:07 ALPL401 kernel: disk 1, wo:0, o:1, dev:sdb1
Mar 22 04:22:53 ALPL401 kernel: md: md1: sync done.
Mar 22 04:22:53 ALPL401 kernel: RAID1 conf printout:
Mar 22 04:22:53 ALPL401 kernel: --- wd:2 rd:2
Mar 22 04:22:53 ALPL401 kernel: disk 0, wo:0, o:1, dev:sda2
Mar 22 04:22:53 ALPL401 kernel: disk 1, wo:0, o:1, dev:sdb2
Mar 22 05:10:40 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 22 06:10:40 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 22 07:10:40 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 22 08:10:40 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 22 09:10:40 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 22 10:10:39 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 22 11:10:40 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 22 12:10:40 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 22 12:45:26 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 5eb20aa9-deba-61b3-f130-3ff1ec8e2f4d (at 10.3.3.61@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 22 12:45:26 ALPL401 kernel: Lustre: Skipped 1 previous similar message
Mar 22 12:45:40 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 22 13:10:40 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 22 13:25:35 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 22 13:25:37 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client 2088aa40-f79e-335d-291f-512480fd9835 (at 10.3.2.50@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 22 13:25:37 ALPL401 kernel: Lustre: Skipped 4 previous similar messages
Mar 22 13:25:37 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client 2088aa40-f79e-335d-291f-512480fd9835 (at 10.3.2.50@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 22 13:25:37 ALPL401 kernel: Lustre: Skipped 1 previous similar message
Mar 22 14:10:40 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 22 14:25:09 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client 17201fea-d9f3-e71d-ddb7-112edacf4eca (at 10.3.3.23@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 22 14:25:09 ALPL401 kernel: Lustre: Skipped 2 previous similar messages
Mar 22 14:25:40 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 22 15:10:41 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 22 15:40:01 ALPL401 auditd[4130]: Audit daemon rotating log files
Mar 22 16:10:41 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 22 16:45:37 ALPL401 ntpd[6309]: synchronized to 10.3.8.66, stratum 4
Mar 22 17:10:41 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 22 17:19:47 ALPL401 ntpd[6309]: time reset +0.389454 s
Mar 22 17:23:29 ALPL401 ntpd[6309]: synchronized to LOCAL(0), stratum 10
Mar 22 17:26:43 ALPL401 ntpd[6309]: synchronized to 10.3.8.66, stratum 4
Mar 22 17:45:28 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client bdee69f6-b31c-7cad-c866-52f19dc24387 (at 10.3.2.44@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 22 17:45:28 ALPL401 kernel: Lustre: Skipped 4 previous similar messages
Mar 22 17:45:28 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client bdee69f6-b31c-7cad-c866-52f19dc24387 (at 10.3.2.44@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 22 17:45:28 ALPL401 kernel: Lustre: Skipped 2 previous similar messages
Mar 22 17:45:59 ALPL401 ntpd[6309]: synchronized to LOCAL(0), stratum 10
Mar 22 18:05:43 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 22 18:05:55 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 6c7f4ce1-106b-cfcd-42e0-5501be8985ad (at 10.3.3.33@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 22 18:05:55 ALPL401 kernel: Lustre: Skipped 1 previous similar message
Mar 22 18:05:56 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client 6c7f4ce1-106b-cfcd-42e0-5501be8985ad (at 10.3.3.33@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 22 18:06:12 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 22 18:10:41 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 22 18:25:10 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client 7d8798c4-9083-43f3-cc1b-394af2edf331 (at 10.3.3.40@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 22 18:25:10 ALPL401 kernel: Lustre: Skipped 3 previous similar messages
Mar 22 18:25:38 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 22 19:05:14 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client 1a48c227-e08d-be1b-a13c-47605d68afa0 (at 10.3.3.7@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 22 19:05:14 ALPL401 kernel: Lustre: Skipped 4 previous similar messages
Mar 22 19:05:36 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 22 19:10:41 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 22 20:10:41 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 22 21:10:41 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 22 21:45:54 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client 8baafe7a-e1a9-103a-3c7d-41e168436877 (at 10.3.4.1@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 22 21:45:54 ALPL401 kernel: Lustre: Skipped 4 previous similar messages
Mar 22 21:45:54 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 8baafe7a-e1a9-103a-3c7d-41e168436877 (at 10.3.4.1@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 22 22:10:41 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 22 23:10:41 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 23 00:10:42 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 23 01:10:41 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 23 02:10:42 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 23 03:10:42 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 23 03:51:08 ALPL401 ntpd[6309]: synchronized to 10.3.8.66, stratum 4
Mar 23 04:08:14 ALPL401 ntpd[6309]: synchronized to LOCAL(0), stratum 10
Mar 23 04:10:41 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 23 04:42:21 ALPL401 ntpd[6309]: synchronized to 10.3.8.66, stratum 4
Mar 23 04:59:25 ALPL401 ntpd[6309]: time reset -1.948684 s
Mar 23 05:03:18 ALPL401 ntpd[6309]: synchronized to LOCAL(0), stratum 10
Mar 23 05:04:23 ALPL401 ntpd[6309]: synchronized to 10.3.8.66, stratum 4
Mar 23 05:07:38 ALPL401 ntpd[6309]: synchronized to LOCAL(0), stratum 10
Mar 23 05:10:41 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 23 06:10:42 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 23 07:10:42 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 23 08:10:42 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 23 09:10:41 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 23 10:10:42 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 23 11:10:42 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 23 11:25:27 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client 9be28f2b-025a-4018-dac6-f9aebb505571 (at 10.3.4.24@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 23 11:25:27 ALPL401 kernel: Lustre: Skipped 3 previous similar messages
Mar 23 11:25:36 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 23 11:26:09 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 23 11:45:26 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client deba5140-718a-ef00-56c4-72a4ab6b2aee (at 10.3.3.36@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 23 11:45:26 ALPL401 kernel: Lustre: Skipped 9 previous similar messages
Mar 23 11:45:26 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client deba5140-718a-ef00-56c4-72a4ab6b2aee (at 10.3.3.36@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 23 11:45:40 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 23 12:05:28 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 8b8b057c-63f1-03a6-e188-3d645a1464c2 (at 10.3.2.15@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 23 12:05:28 ALPL401 kernel: Lustre: Skipped 3 previous similar messages
Mar 23 12:05:40 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 23 12:10:42 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 23 12:25:30 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 3aa17f0e-0972-f786-ebc7-5c6a3da82bda (at 10.3.2.48@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 23 12:25:30 ALPL401 kernel: Lustre: Skipped 4 previous similar messages
Mar 23 12:25:30 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client 3aa17f0e-0972-f786-ebc7-5c6a3da82bda (at 10.3.2.48@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 23 12:25:39 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 23 13:10:42 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 23 14:10:42 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 23 15:10:43 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 23 15:25:14 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 871b1ba7-dc43-b946-e07a-a1e01721103e (at 10.3.2.45@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 23 15:25:14 ALPL401 kernel: Lustre: Skipped 3 previous similar messages
Mar 23 15:25:38 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 23 15:45:16 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client 371901af-b2b2-b190-12ef-3ef7420075ea (at 10.3.2.32@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 23 15:45:16 ALPL401 kernel: Lustre: Skipped 4 previous similar messages
Mar 23 15:45:16 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 371901af-b2b2-b190-12ef-3ef7420075ea (at 10.3.2.32@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 23 15:45:16 ALPL401 kernel: Lustre: Skipped 7 previous similar messages
Mar 23 15:45:41 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 23 15:46:34 ALPL401 last message repeated 2 times
Mar 23 15:47:11 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 23 16:10:43 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 23 17:10:43 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 23 18:10:43 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 23 19:05:39 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client 16f325b0-f300-b443-5573-916730ce3073 (at 10.3.2.42@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 23 19:05:39 ALPL401 kernel: Lustre: Skipped 11 previous similar messages
Mar 23 19:05:41 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 23 19:10:43 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 23 19:45:33 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client c2333014-1207-b9a0-0b90-eb829d85e3f4 (at 10.3.4.7@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 23 19:45:33 ALPL401 kernel: Lustre: Skipped 4 previous similar messages
Mar 23 19:45:33 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client c2333014-1207-b9a0-0b90-eb829d85e3f4 (at 10.3.4.7@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 23 19:45:33 ALPL401 kernel: Lustre: Skipped 1 previous similar message
Mar 23 19:45:45 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 23 19:46:16 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 23 20:10:43 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 23 20:25:13 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client 2971cc12-a5ac-9fbf-987f-54fecb80ccf4 (at 10.3.2.49@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 23 20:25:13 ALPL401 kernel: Lustre: Skipped 7 previous similar messages
Mar 23 20:25:13 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 2971cc12-a5ac-9fbf-987f-54fecb80ccf4 (at 10.3.2.49@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 23 20:25:13 ALPL401 kernel: Lustre: Skipped 3 previous similar messages
Mar 23 20:25:39 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 23 21:10:43 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 23 22:05:30 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 4b37ea3f-3b31-a8d8-4b63-6f812ade7544 (at 10.3.9.32@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 23 22:06:15 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 23 22:10:43 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 23 23:10:43 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 24 00:10:43 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 24 00:45:35 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client db7c6214-23dd-14b8-d945-e97bdfc35149 (at 10.3.3.54@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 00:45:35 ALPL401 kernel: Lustre: Skipped 4 previous similar messages
Mar 24 00:45:35 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client db7c6214-23dd-14b8-d945-e97bdfc35149 (at 10.3.3.54@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 00:45:35 ALPL401 kernel: Lustre: Skipped 1 previous similar message
Mar 24 00:45:35 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 01:05:13 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client a3aac1c0-2b4a-1efe-1a4c-8069d851135c (at 10.3.9.1@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 01:05:13 ALPL401 kernel: Lustre: Skipped 2 previous similar messages
Mar 24 01:05:41 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 01:06:29 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 01:07:09 ALPL401 last message repeated 2 times
Mar 24 01:07:25 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client 118862bd-4bd7-cefb-5560-f5a1c59f8307 (at 10.3.9.26@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 01:07:25 ALPL401 kernel: Lustre: Skipped 19 previous similar messages
Mar 24 01:07:37 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 01:08:03 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 01:10:43 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 24 01:59:53 ALPL401 ntpd[6309]: synchronized to 10.3.8.66, stratum 4
Mar 24 02:10:42 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 24 02:17:26 ALPL401 ntpd[6309]: synchronized to LOCAL(0), stratum 10
Mar 24 02:34:03 ALPL401 ntpd[6309]: synchronized to 10.3.8.66, stratum 4
Mar 24 02:51:06 ALPL401 ntpd[6309]: synchronized to LOCAL(0), stratum 10
Mar 24 03:10:43 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 24 04:10:43 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 24 04:25:12 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 4749bf06-d4b2-2204-295e-460c7b6102fc (at 10.3.2.32@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 04:25:12 ALPL401 kernel: Lustre: Skipped 9 previous similar messages
Mar 24 04:25:37 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 04:26:08 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 05:10:44 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 24 06:10:43 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 24 06:25:11 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 78f8daca-f15d-dd2f-82dd-f9a356ee1be4 (at 10.3.1.55@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 06:25:11 ALPL401 kernel: Lustre: Skipped 9 previous similar messages
Mar 24 06:25:41 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 07:10:43 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 24 08:10:44 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 24 09:10:44 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 24 09:14:40 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client d380ad42-74b5-f939-cc25-15e6b5a3684b (at 10.3.8.18@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 09:14:40 ALPL401 kernel: Lustre: Skipped 4 previous similar messages
Mar 24 09:27:14 ALPL401 kernel: LustreError: 15489:0:(ldlm_lib.c:1921:target_send_reply_msg()) @@@ processing error (-107) req@ffff810633372800 x1494613909667477/t0 o400->@:0/0 lens 192/0 e 0 to 0 dl 1427160451 ref 1 fl Interpret:H/0/0 rc -107/0
Mar 24 09:27:14 ALPL401 kernel: LustreError: 15489:0:(ldlm_lib.c:1921:target_send_reply_msg()) Skipped 1 previous similar message
Mar 24 09:45:08 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client 75376e77-4ed0-0fb3-14b5-5514a030078a (at 10.3.3.62@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 09:45:08 ALPL401 kernel: Lustre: Skipped 4 previous similar messages
Mar 24 09:45:08 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 75376e77-4ed0-0fb3-14b5-5514a030078a (at 10.3.3.62@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 09:45:41 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 10:05:29 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client eae73e49-3988-9791-1d63-f29311e241da (at 10.3.2.37@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 10:05:29 ALPL401 kernel: Lustre: Skipped 3 previous similar messages
Mar 24 10:05:41 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 10:10:44 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 24 10:39:46 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client d380ad42-74b5-f939-cc25-15e6b5a3684b (at 10.3.8.18@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 10:39:46 ALPL401 kernel: Lustre: Skipped 4 previous similar messages
Mar 24 10:39:46 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client d380ad42-74b5-f939-cc25-15e6b5a3684b (at 10.3.8.18@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 10:39:46 ALPL401 kernel: Lustre: Skipped 3 previous similar messages
Mar 24 11:01:03 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 11:10:44 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 24 11:45:20 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client 2df29833-109b-194d-b263-0ab850b56514 (at 10.3.9.8@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 11:46:06 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 11:46:41 ALPL401 last message repeated 2 times
Mar 24 12:05:10 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 9ea92ddd-2c2f-f3e5-f4a2-359d2f875d1c (at 10.3.3.5@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 12:05:10 ALPL401 kernel: Lustre: Skipped 14 previous similar messages
Mar 24 12:05:10 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client 9ea92ddd-2c2f-f3e5-f4a2-359d2f875d1c (at 10.3.3.5@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 12:05:10 ALPL401 kernel: Lustre: Skipped 1 previous similar message
Mar 24 12:05:42 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 12:10:43 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 24 13:05:09 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client d6a31f10-3bf0-4e3e-c201-c3dd33a5dfeb (at 10.3.4.29@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 13:05:09 ALPL401 kernel: Lustre: Skipped 2 previous similar messages
Mar 24 13:06:10 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 13:06:20 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 13:10:44 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 24 13:25:09 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client 2a2700bf-2d19-e0ac-0e90-fbaad38ded83 (at 10.3.2.27@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 13:25:09 ALPL401 kernel: Lustre: Skipped 9 previous similar messages
Mar 24 13:25:09 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client 2a2700bf-2d19-e0ac-0e90-fbaad38ded83 (at 10.3.2.27@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 13:25:09 ALPL401 kernel: Lustre: Skipped 2 previous similar messages
Mar 24 13:25:40 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 13:57:41 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 14:05:34 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 14:05:38 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client c2848d9f-e3cd-49af-a13c-350eef1ff7ff (at 10.3.9.30@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 14:05:38 ALPL401 kernel: Lustre: Skipped 1 previous similar message
Mar 24 14:10:44 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 24 14:25:33 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 6ff885b3-49a6-e5dd-5b26-3156ac626eeb (at 10.3.4.41@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 14:25:33 ALPL401 kernel: Lustre: Skipped 4 previous similar messages
Mar 24 14:25:40 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 14:26:11 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 14:33:13 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client b11d177d-e1e5-b259-752f-9e543728b1e7 (at 10.3.3.55@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 14:33:13 ALPL401 kernel: Lustre: Skipped 9 previous similar messages
Mar 24 15:10:43 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 24 15:25:19 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client c510c9c0-7ad9-14ac-2fe6-82c3cc1a21ce (at 10.3.8.59@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 15:25:19 ALPL401 kernel: Lustre: Skipped 4 previous similar messages
Mar 24 15:25:19 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client c510c9c0-7ad9-14ac-2fe6-82c3cc1a21ce (at 10.3.8.59@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 15:25:19 ALPL401 kernel: Lustre: Skipped 1 previous similar message
Mar 24 15:25:48 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 16:05:31 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 5ca706ea-e3a1-45d0-634c-b771c3daf9f5 (at 10.3.4.14@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 16:05:31 ALPL401 kernel: Lustre: Skipped 2 previous similar messages
Mar 24 16:05:31 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client 5ca706ea-e3a1-45d0-634c-b771c3daf9f5 (at 10.3.4.14@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 16:05:31 ALPL401 kernel: Lustre: Skipped 15 previous similar messages
Mar 24 16:05:40 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 16:06:16 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 16:07:07 ALPL401 last message repeated 2 times
Mar 24 16:10:43 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 24 16:14:30 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client 22cb8b33-48b9-e4f2-c43a-ac3ac0dd5a52 (at 10.3.5.14@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 16:14:30 ALPL401 kernel: Lustre: Skipped 3 previous similar messages
Mar 24 16:25:19 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 2735af5c-a978-f6b0-7d57-e38299fbe065 (at 10.3.2.44@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 16:25:19 ALPL401 kernel: Lustre: Skipped 4 previous similar messages
Mar 24 16:25:35 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 16:26:06 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 16:27:17 ALPL401 last message repeated 2 times
Mar 24 16:27:33 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client 0c77fc6d-f2dd-3fe0-4327-3147c8649991 (at 10.3.7.8@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 16:27:33 ALPL401 kernel: Lustre: Skipped 19 previous similar messages
Mar 24 16:27:41 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 16:28:16 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 16:29:25 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client 6c25023e-9b5b-b857-c2c1-0eba6fc6c27b (at 10.3.8.61@o2ib) in 303 seconds. I think it's dead, and I am evicting it.
Mar 24 16:29:25 ALPL401 kernel: Lustre: Skipped 24 previous similar messages
Mar 24 16:29:57 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 6c25023e-9b5b-b857-c2c1-0eba6fc6c27b (at 10.3.8.61@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 16:29:57 ALPL401 kernel: Lustre: Skipped 5 previous similar messages
Mar 24 16:30:41 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 16:31:17 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 16:32:21 ALPL401 last message repeated 2 times
Mar 24 16:34:18 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 16:35:57 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client 563f892c-0835-685b-9b5f-e334974eb913 (at 10.3.8.27@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 16:35:57 ALPL401 kernel: Lustre: Skipped 8 previous similar messages
Mar 24 16:45:27 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client e6670c14-ccaa-7ca3-0b98-3bed46ae3cbf (at 10.3.4.1@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 16:45:27 ALPL401 kernel: Lustre: Skipped 4 previous similar messages
Mar 24 16:45:44 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 16:46:42 ALPL401 last message repeated 2 times
Mar 24 16:47:09 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 16:47:19 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 64ace192-3eb5-5c90-af06-3a44964e6356 (at 10.3.5.25@o2ib) in 313 seconds. I think it's dead, and I am evicting it.
Mar 24 16:47:19 ALPL401 kernel: Lustre: Skipped 14 previous similar messages
Mar 24 16:47:38 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 16:47:41 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client 64ace192-3eb5-5c90-af06-3a44964e6356 (at 10.3.5.25@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 16:47:41 ALPL401 kernel: Lustre: Skipped 3 previous similar messages
Mar 24 16:48:09 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 16:48:47 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 16:49:15 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 16:49:18 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 8a1ba4b5-c9a6-1934-d832-74003f70ff49 (at 10.3.6.26@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 16:49:18 ALPL401 kernel: Lustre: Skipped 19 previous similar messages
Mar 24 16:49:44 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 16:50:16 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 16:50:45 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 16:51:10 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 29e8432a-1fba-132f-181d-f56fd8277c5d (at 10.3.6.60@o2ib) in 326 seconds. I think it's dead, and I am evicting it.
Mar 24 16:51:10 ALPL401 kernel: Lustre: Skipped 19 previous similar messages
Mar 24 16:51:17 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 16:52:22 ALPL401 last message repeated 2 times
Mar 24 16:52:35 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 16:53:29 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 77d741a0-6c95-349d-f792-a36134d01f71 (at 10.3.9.21@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 16:53:29 ALPL401 kernel: Lustre: Skipped 11 previous similar messages
Mar 24 16:53:33 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 17:05:18 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 8cbf5bac-f4d5-5e0e-94cc-1ec138129c27 (at 10.3.1.54@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 17:05:18 ALPL401 kernel: Lustre: Skipped 3 previous similar messages
Mar 24 17:05:43 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 17:06:16 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 17:07:52 ALPL401 last message repeated 3 times
Mar 24 17:10:43 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base
Mar 24 17:25:41 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 17:25:53 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client bac6c40f-e5bb-1341-d909-3b334e442ee5 (at 10.3.9.4@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 17:25:53 ALPL401 kernel: Lustre: Skipped 19 previous similar messages
Mar 24 17:26:11 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 17:26:43 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 17:27:33 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 17:45:06 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client 4c101156-5fb9-e609-057d-a5772911f5ac (at 10.3.4.35@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 17:45:06 ALPL401 kernel: Lustre: Skipped 14 previous similar messages
Mar 24 17:45:38 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 17:46:09 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 17:46:41 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6
Mar 24 17:46:59 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client bc498f85-de82-101b-c23a-0997cf2fcc45 (at 10.3.5.59@o2ib) in 335 seconds. I think it's dead, and I am evicting it.
Mar 24 17:46:59 ALPL401 kernel: Lustre: Skipped 14 previous similar messages Mar 24 17:47:09 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 17:47:47 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 17:48:20 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 18:10:44 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 24 19:05:08 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client 1e3b58db-4664-fdee-1645-45f064aa6927 (at 10.3.2.21@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 24 19:05:08 ALPL401 kernel: Lustre: Skipped 9 previous similar messages Mar 24 19:05:41 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 19:10:43 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 24 20:05:40 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client fe687965-a30d-d7d2-5e57-8784a0b6b021 (at 10.3.5.27@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 24 20:05:40 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 24 20:05:40 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 20:06:17 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 20:07:22 ALPL401 last message repeated 2 times Mar 24 20:07:48 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 20:08:07 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client b51c8552-9201-2938-26b7-ffff9d7d4647 (at 10.3.9.12@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 24 20:08:07 ALPL401 kernel: Lustre: Skipped 24 previous similar messages Mar 24 20:08:23 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 20:09:04 ALPL401 last message repeated 2 times Mar 24 20:09:38 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 20:10:43 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 24 20:25:15 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client da21dba5-d358-2a48-56a7-475de274aba9 (at 10.3.3.6@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 24 20:25:15 ALPL401 kernel: Lustre: Skipped 19 previous similar messages Mar 24 20:25:34 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 20:26:08 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 20:26:41 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 21:05:41 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client cf231887-d4f8-2be1-37d2-e33b8f34f29a (at 10.3.2.23@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 24 21:05:41 ALPL401 kernel: Lustre: Skipped 14 previous similar messages Mar 24 21:05:41 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client cf231887-d4f8-2be1-37d2-e33b8f34f29a (at 10.3.2.23@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 24 21:05:41 ALPL401 kernel: Lustre: Skipped 7 previous similar messages Mar 24 21:05:43 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 21:06:40 ALPL401 last message repeated 2 times Mar 24 21:07:17 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 21:07:39 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client c158ca20-de8e-cafb-5222-99cb0882ef3a (at 10.3.5.3@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 24 21:07:39 ALPL401 kernel: Lustre: Skipped 11 previous similar messages Mar 24 21:07:40 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 21:10:43 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 24 21:15:00 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 4bb16068-86c1-baee-fce1-a2c1e7025235 (at 10.3.2.48@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 24 21:15:00 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 24 21:45:41 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 22:05:36 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client 053178b7-9725-282a-c942-f91a42a0bafd (at 10.3.5.9@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 24 22:05:36 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 24 22:05:43 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 22:06:48 ALPL401 last message repeated 2 times Mar 24 22:07:09 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 22:07:32 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client acc04756-7e86-f3b6-8ad7-5ff2c14a8813 (at 10.3.6.2@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 24 22:07:32 ALPL401 kernel: Lustre: Skipped 19 previous similar messages Mar 24 22:07:42 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 22:08:43 ALPL401 last message repeated 2 times Mar 24 22:09:13 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 22:09:33 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client b22a62f5-53c3-321f-c7fc-bba9fa817c4d (at 10.3.7.13@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 24 22:09:33 ALPL401 kernel: Lustre: Skipped 19 previous similar messages Mar 24 22:10:14 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 22:10:44 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 24 22:10:45 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 22:11:25 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client ee2781db-e9fa-b821-69f6-2e17000e10db (at 10.3.7.25@o2ib) in 299 seconds. I think it's dead, and I am evicting it. Mar 24 22:11:25 ALPL401 kernel: Lustre: Skipped 24 previous similar messages Mar 24 22:11:45 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 22:12:01 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client ee2781db-e9fa-b821-69f6-2e17000e10db (at 10.3.7.25@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 24 22:12:01 ALPL401 kernel: Lustre: Skipped 2 previous similar messages Mar 24 22:12:14 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 22:13:28 ALPL401 last message repeated 2 times Mar 24 22:25:35 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client a025d710-f376-b6fa-9bbd-0f0bf5cc3d5c (at 10.3.1.55@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 24 22:25:35 ALPL401 kernel: Lustre: Skipped 11 previous similar messages Mar 24 22:25:39 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 22:26:11 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 22:26:40 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 22:45:14 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client 82611064-8c23-8234-e2b4-aa7566161f8b (at 10.3.3.54@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 24 22:45:14 ALPL401 kernel: Lustre: Skipped 9 previous similar messages Mar 24 22:45:35 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 22:46:11 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 22:46:31 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 23:10:44 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 24 23:25:13 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client cf664f9f-9128-a41f-14d0-5e867802a805 (at 10.3.2.42@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 24 23:25:13 ALPL401 kernel: Lustre: Skipped 9 previous similar messages Mar 24 23:25:38 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 23:26:44 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 23:27:31 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 987be3cc-686f-1f0f-42bb-c79f133ddbce (at 10.3.5.17@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 24 23:27:31 ALPL401 kernel: Lustre: Skipped 19 previous similar messages Mar 24 23:27:52 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 23:29:23 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client cb78624a-8695-0620-bd2d-16d8619a28a0 (at 10.3.6.52@o2ib) in 292 seconds. I think it's dead, and I am evicting it. Mar 24 23:29:23 ALPL401 kernel: Lustre: Skipped 24 previous similar messages Mar 24 23:29:43 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 23:30:06 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client cb78624a-8695-0620-bd2d-16d8619a28a0 (at 10.3.6.52@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 24 23:30:06 ALPL401 kernel: Lustre: Skipped 3 previous similar messages Mar 24 23:30:11 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 23:30:44 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 23:31:16 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client cab17d39-9eb7-ebc9-619a-238bca42fdb6 (at 10.3.6.63@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 24 23:31:16 ALPL401 kernel: Lustre: Skipped 14 previous similar messages Mar 24 23:31:51 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 23:32:30 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client 4f4814fe-21d1-405b-7c61-68344f6d4be9 (at 10.3.8.58@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 24 23:32:30 ALPL401 kernel: Lustre: Skipped 9 previous similar messages Mar 24 23:33:00 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 23:33:35 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 23:45:12 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 13bd0fc3-6b61-3431-58ba-9f16579bd1b5 (at 10.3.2.56@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 24 23:45:12 ALPL401 kernel: Lustre: Skipped 5 previous similar messages Mar 24 23:45:41 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 23:46:10 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 23:47:04 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 195f5cb2-0c35-4ca9-834e-07dc416cd09e (at 10.3.5.43@o2ib) in 282 seconds. I think it's dead, and I am evicting it. Mar 24 23:47:04 ALPL401 kernel: Lustre: Skipped 14 previous similar messages Mar 24 23:47:16 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 23:47:40 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 23:47:57 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client 195f5cb2-0c35-4ca9-834e-07dc416cd09e (at 10.3.5.43@o2ib) in 335 seconds. 
I think it's dead, and I am evicting it. Mar 24 23:48:13 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 23:49:10 ALPL401 last message repeated 2 times Mar 24 23:49:13 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client da59401d-1401-388a-62de-55618c9c69fb (at 10.3.6.31@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 24 23:49:13 ALPL401 kernel: Lustre: Skipped 7 previous similar messages Mar 24 23:49:41 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 23:49:49 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client af3515a5-a19e-07d6-d4dd-f4d0b3e09650 (at 10.3.6.32@o2ib) in 310 seconds. I think it's dead, and I am evicting it. Mar 24 23:49:49 ALPL401 kernel: Lustre: Skipped 3 previous similar messages Mar 24 23:50:14 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client af3515a5-a19e-07d6-d4dd-f4d0b3e09650 (at 10.3.6.32@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 24 23:50:14 ALPL401 kernel: Lustre: Skipped 3 previous similar messages Mar 24 23:50:18 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 23:51:13 ALPL401 last message repeated 2 times Mar 24 23:51:34 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client f5c4d30f-a32d-9ce1-2112-bef37db39c87 (at 10.3.7.6@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 24 23:51:34 ALPL401 kernel: Lustre: Skipped 14 previous similar messages Mar 24 23:51:43 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 23:52:21 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 23:53:44 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 23:53:52 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 4143bbba-7f97-42fb-d866-a9542f5605f4 (at 10.3.8.41@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 24 23:53:52 ALPL401 kernel: Lustre: Skipped 20 previous similar messages Mar 24 23:54:18 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 24 23:56:25 ALPL401 last message repeated 2 times Mar 25 00:10:44 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 25 00:25:42 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 00:25:43 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client 1ed226f5-e144-23fd-70bf-b0a5f52dc3df (at 10.3.1.63@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 00:25:43 ALPL401 kernel: Lustre: Skipped 16 previous similar messages Mar 25 00:26:14 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 00:27:14 ALPL401 last message repeated 2 times Mar 25 00:27:35 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client a26c0d2f-4bad-07a4-ef78-8841534f5ea0 (at 10.3.4.55@o2ib) in 300 seconds. I think it's dead, and I am evicting it. Mar 25 00:27:35 ALPL401 kernel: Lustre: Skipped 19 previous similar messages Mar 25 00:27:40 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 00:28:10 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client a26c0d2f-4bad-07a4-ef78-8841534f5ea0 (at 10.3.4.55@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 00:28:10 ALPL401 kernel: Lustre: Skipped 1 previous similar message Mar 25 00:28:13 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 00:28:48 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 00:30:14 ALPL401 last message repeated 3 times Mar 25 00:31:20 ALPL401 last message repeated 2 times Mar 25 00:31:54 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 00:32:12 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client fc79150c-3db0-c1d5-09a1-723b7c81eefd (at 10.3.9.9@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 25 00:32:12 ALPL401 kernel: Lustre: Skipped 7 previous similar messages Mar 25 00:32:39 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 00:45:12 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 7ec8c9c3-814f-0a96-b9b8-b7b8c59f2998 (at 10.3.1.56@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 00:45:12 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 25 00:45:38 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 00:46:16 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 00:46:57 ALPL401 ntpd[6309]: synchronized to 10.3.8.66, stratum 4 Mar 25 00:50:11 ALPL401 ntpd[6309]: synchronized to LOCAL(0), stratum 10 Mar 25 01:02:57 ALPL401 ntpd[6309]: synchronized to 10.3.8.66, stratum 4 Mar 25 01:10:44 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 25 01:20:00 ALPL401 ntpd[6309]: time reset -2.610055 s Mar 25 01:23:44 ALPL401 ntpd[6309]: synchronized to LOCAL(0), stratum 10 Mar 25 01:25:06 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client cb90fe6f-dd66-cbab-f600-c593f8dfece2 (at 10.3.4.16@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 01:25:06 ALPL401 kernel: Lustre: Skipped 9 previous similar messages Mar 25 01:25:36 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 01:26:09 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 01:26:37 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 01:26:56 ALPL401 ntpd[6309]: synchronized to 10.3.8.66, stratum 4 Mar 25 01:26:58 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client 52a721fd-2f86-3fd0-e811-a09270aedaa0 (at 10.3.5.50@o2ib) in 324 seconds. I think it's dead, and I am evicting it. 
Mar 25 01:26:58 ALPL401 kernel: Lustre: Skipped 19 previous similar messages Mar 25 01:27:09 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 52a721fd-2f86-3fd0-e811-a09270aedaa0 (at 10.3.5.50@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 01:27:09 ALPL401 kernel: Lustre: Skipped 3 previous similar messages Mar 25 01:27:10 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 01:27:34 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 01:28:00 ALPL401 ntpd[6309]: synchronized to LOCAL(0), stratum 10 Mar 25 01:28:04 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 01:34:26 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 7be21a34-563c-aac1-3f2d-29cd1e9dbaac (at 10.3.4.45@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 01:34:26 ALPL401 kernel: Lustre: Skipped 5 previous similar messages Mar 25 01:45:35 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client 0d61af74-9a1c-2f5e-c60b-1c2666614119 (at 10.3.2.56@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 01:45:35 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 25 01:45:38 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client 0d61af74-9a1c-2f5e-c60b-1c2666614119 (at 10.3.2.56@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 01:45:38 ALPL401 kernel: Lustre: Skipped 14 previous similar messages Mar 25 01:45:39 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 01:46:39 ALPL401 last message repeated 2 times Mar 25 01:47:40 ALPL401 last message repeated 2 times Mar 25 02:05:27 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client c4bdd3e4-b65c-bf24-373b-fd6c8b7784a5 (at 10.3.1.64@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 25 02:05:27 ALPL401 kernel: Lustre: Skipped 9 previous similar messages Mar 25 02:05:44 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 02:06:08 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 02:10:44 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 25 02:25:19 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client f75901da-16b7-c350-ab6b-d3963d6ba867 (at 10.3.4.9@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 02:25:19 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 25 02:25:46 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 02:26:42 ALPL401 last message repeated 2 times Mar 25 02:27:06 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 02:27:12 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client d8cefc74-4a3d-ee98-ac2d-7cafda5a5353 (at 10.3.5.46@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 02:27:12 ALPL401 kernel: Lustre: Skipped 19 previous similar messages Mar 25 02:27:36 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 02:28:09 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 02:33:58 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client 725b7508-f3fd-b46f-d108-74dec363493c (at 10.3.4.9@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 02:33:58 ALPL401 kernel: Lustre: Skipped 9 previous similar messages Mar 25 03:05:04 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client e24930fe-11c2-5688-30f8-e2ce3121cd7b (at 10.3.3.33@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 25 03:05:04 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 25 03:05:35 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 03:06:08 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 03:06:36 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 03:07:01 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client d650251c-dd97-a113-9158-fc5dbb95e109 (at 10.3.6.1@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 03:07:01 ALPL401 kernel: Lustre: Skipped 9 previous similar messages Mar 25 03:07:08 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 03:08:12 ALPL401 last message repeated 2 times Mar 25 03:09:11 ALPL401 last message repeated 2 times Mar 25 03:09:31 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 298287b8-084c-21aa-a3ba-ed2679149cc5 (at 10.3.8.23@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 03:09:31 ALPL401 kernel: Lustre: Skipped 24 previous similar messages Mar 25 03:09:41 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 03:10:44 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 25 03:45:34 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client ff302238-eecb-109d-1387-cc15ea36e82e (at 10.3.3.49@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 03:45:34 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 25 03:45:39 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 04:10:44 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 25 04:25:21 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 09d0365e-274e-09fa-2705-f75deef6a7d5 (at 10.3.6.11@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 25 04:25:21 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 25 04:25:21 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 8e8c34c9-885e-9f4d-c4f0-c3e4b546bdf8 (at 10.3.6.14@o2ib) in 263 seconds. I think it's dead, and I am evicting it. Mar 25 04:25:21 ALPL401 kernel: Lustre: Skipped 1 previous similar message Mar 25 04:25:40 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 04:26:34 ALPL401 last message repeated 2 times Mar 25 04:27:08 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 04:27:13 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 7e4dc57c-6856-1184-a70b-845a992337a7 (at 10.3.6.19@o2ib) in 332 seconds. I think it's dead, and I am evicting it. Mar 25 04:27:13 ALPL401 kernel: Lustre: Skipped 17 previous similar messages Mar 25 04:27:16 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client 7e4dc57c-6856-1184-a70b-845a992337a7 (at 10.3.6.19@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 04:27:16 ALPL401 kernel: Lustre: Skipped 11 previous similar messages Mar 25 04:27:40 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 04:28:40 ALPL401 last message repeated 2 times Mar 25 04:29:13 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 04:45:22 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 4722d546-2618-74ab-396a-781c30a48802 (at 10.3.5.6@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 04:45:22 ALPL401 kernel: Lustre: Skipped 7 previous similar messages Mar 25 04:45:42 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 04:46:15 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 04:46:44 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 04:47:14 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 499e10d8-03e1-8909-24ba-ee5eb5cf2cc2 (at 10.3.6.27@o2ib) in 306 seconds. I think it's dead, and I am evicting it. 
Mar 25 04:47:14 ALPL401 kernel: Lustre: Skipped 24 previous similar messages Mar 25 04:47:18 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 04:47:41 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 04:47:43 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 499e10d8-03e1-8909-24ba-ee5eb5cf2cc2 (at 10.3.6.27@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 04:47:43 ALPL401 kernel: Lustre: Skipped 11 previous similar messages Mar 25 04:48:05 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 04:48:46 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 04:49:06 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client 3246709d-ff7e-492a-3087-2b9c6c3c1f85 (at 10.3.8.39@o2ib) in 303 seconds. I think it's dead, and I am evicting it. Mar 25 04:49:06 ALPL401 kernel: Lustre: Skipped 2 previous similar messages Mar 25 04:49:11 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 04:49:38 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 3246709d-ff7e-492a-3087-2b9c6c3c1f85 (at 10.3.8.39@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 04:49:38 ALPL401 kernel: Lustre: Skipped 5 previous similar messages Mar 25 04:49:40 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 04:50:14 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 04:55:59 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client 4e4dd571-2e18-3162-fcb6-a5e73d42f6c4 (at 10.3.6.21@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 04:55:59 ALPL401 kernel: Lustre: Skipped 3 previous similar messages Mar 25 05:05:10 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client 6ba0f53b-4824-b5b8-6419-5340fd23b896 (at 10.3.2.63@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 25 05:05:10 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 25 05:05:36 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 05:10:44 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 25 05:25:39 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client c83f56ff-7015-3298-0f94-d17ec5a7e7f2 (at 10.3.9.5@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 05:25:39 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 25 05:25:39 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 05:26:31 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 05:26:36 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 06:10:44 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 25 07:10:44 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 25 07:45:09 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client ff8f1df3-a289-1808-68a5-343af10d913c (at 10.3.3.42@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 07:45:09 ALPL401 kernel: Lustre: Skipped 9 previous similar messages Mar 25 07:45:38 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 07:46:36 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 08:10:44 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 25 08:25:35 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client 75383426-fef6-1906-5d08-1671f6df0d52 (at 10.3.4.32@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 08:25:35 ALPL401 kernel: Lustre: Skipped 9 previous similar messages Mar 25 08:25:35 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 75383426-fef6-1906-5d08-1671f6df0d52 (at 10.3.4.32@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 25 08:25:35 ALPL401 kernel: Lustre: Skipped 8 previous similar messages Mar 25 08:26:06 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 08:26:36 ALPL401 last message repeated 2 times Mar 25 08:45:11 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client d71afa90-aa04-dbb6-0b86-ae0829f92783 (at 10.3.4.3@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 08:45:11 ALPL401 kernel: Lustre: Skipped 5 previous similar messages Mar 25 08:45:36 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 08:46:09 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 08:47:16 ALPL401 last message repeated 2 times Mar 25 09:05:25 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client b970b091-28eb-ead4-196b-0d04e405ba79 (at 10.3.2.48@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 09:05:25 ALPL401 kernel: Lustre: Skipped 19 previous similar messages Mar 25 09:05:25 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client b970b091-28eb-ead4-196b-0d04e405ba79 (at 10.3.2.48@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 09:05:25 ALPL401 kernel: Lustre: Skipped 3 previous similar messages Mar 25 09:05:34 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 09:05:56 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 09:10:45 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 25 09:45:07 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client 7a0390e7-7fe0-70ec-05e9-1fee0359a268 (at 10.3.3.3@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 09:45:07 ALPL401 kernel: Lustre: Skipped 5 previous similar messages Mar 25 09:45:07 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 7a0390e7-7fe0-70ec-05e9-1fee0359a268 (at 10.3.3.3@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 25 09:45:07 ALPL401 kernel: Lustre: Skipped 15 previous similar messages Mar 25 09:45:40 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 09:46:40 ALPL401 last message repeated 2 times Mar 25 09:47:08 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 09:47:10 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client f6d289f2-9ea8-c2bd-6494-bbbaa9a23939 (at 10.3.6.37@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 09:47:10 ALPL401 kernel: Lustre: Skipped 3 previous similar messages Mar 25 09:47:34 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 09:48:10 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 09:48:37 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 10:05:16 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client 960cc361-53df-b8a1-7e3a-4bda4d38157c (at 10.3.9.17@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 10:05:16 ALPL401 kernel: Lustre: Skipped 14 previous similar messages Mar 25 10:10:45 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 25 10:25:03 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 48a4bc6f-7470-8393-01ee-3b23c68423a6 (at 10.3.5.18@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 10:25:03 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 25 10:25:38 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 10:26:30 ALPL401 last message repeated 2 times Mar 25 11:05:29 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client 20daa1a9-8c3d-99b6-5ccd-e4e134239792 (at 10.3.9.9@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 11:05:29 ALPL401 kernel: Lustre: Skipped 9 previous similar messages Mar 25 11:05:29 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 20daa1a9-8c3d-99b6-5ccd-e4e134239792 (at 10.3.9.9@o2ib) in 335 seconds. 
I think it's dead, and I am evicting it. Mar 25 11:05:29 ALPL401 kernel: Lustre: Skipped 3 previous similar messages Mar 25 11:06:04 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 11:10:44 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 25 11:25:30 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 11:45:24 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 5403955d-10f2-c2cd-fcae-0467148bafd2 (at 10.3.2.18@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 11:45:24 ALPL401 kernel: Lustre: Skipped 5 previous similar messages Mar 25 11:45:40 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 12:05:08 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 2b069ca9-f053-07d4-9915-581556f9486f (at 10.3.9.18@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 12:05:08 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 25 12:05:08 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client 2b069ca9-f053-07d4-9915-581556f9486f (at 10.3.9.18@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 12:05:08 ALPL401 kernel: Lustre: Skipped 15 previous similar messages Mar 25 12:05:36 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 12:06:53 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 12:07:03 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 12:10:45 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 25 12:25:19 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client d337a0da-fc80-a2b5-bdb7-85cb4af9793e (at 10.3.4.63@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 25 12:25:19 ALPL401 kernel: Lustre: Skipped 3 previous similar messages Mar 25 12:25:35 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 12:26:10 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 12:45:16 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client 3b0a7bc6-24ff-83ae-3387-071a8716cc71 (at 10.3.2.21@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 12:45:16 ALPL401 kernel: Lustre: Skipped 9 previous similar messages Mar 25 12:45:16 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client 3b0a7bc6-24ff-83ae-3387-071a8716cc71 (at 10.3.2.21@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 12:45:39 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 12:46:02 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 13:10:45 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 25 13:45:06 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client 74330132-bb8d-511f-de30-9d85ad22e3ae (at 10.3.2.3@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 13:45:06 ALPL401 kernel: Lustre: Skipped 3 previous similar messages Mar 25 14:05:43 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 14:05:59 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 46c143db-b9af-a11c-027a-a3cfa20aecd5 (at 10.3.9.5@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 14:05:59 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 25 14:05:59 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 46c143db-b9af-a11c-027a-a3cfa20aecd5 (at 10.3.9.5@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 25 14:05:59 ALPL401 kernel: Lustre: Skipped 1 previous similar message Mar 25 14:06:39 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 14:06:43 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 14:10:45 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 25 14:25:07 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client d1cfbd34-aea7-80bf-6cb7-3d324ba1b6dc (at 10.3.5.2@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 14:25:07 ALPL401 kernel: Lustre: Skipped 7 previous similar messages Mar 25 14:25:40 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 14:26:17 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 14:27:08 ALPL401 last message repeated 2 times Mar 25 14:27:20 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 7be528b9-848b-151c-a007-41f3dde6f7f7 (at 10.3.6.54@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 14:27:20 ALPL401 kernel: Lustre: Skipped 19 previous similar messages Mar 25 14:27:45 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 14:28:37 ALPL401 last message repeated 2 times Mar 25 14:45:39 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 14:45:40 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client e738b0ab-38e8-f8d9-87bf-d01ca301fef6 (at 10.3.3.3@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 14:45:40 ALPL401 kernel: Lustre: Skipped 14 previous similar messages Mar 25 15:10:46 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 25 15:25:09 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 34d88127-649c-999a-88d3-fb4cda08b3e6 (at 10.3.3.23@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 25 15:25:09 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 25 15:25:36 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 15:26:11 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 15:45:25 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client 220df50b-a9c3-622e-d3c1-144907e17d4e (at 10.3.2.21@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 15:45:25 ALPL401 kernel: Lustre: Skipped 9 previous similar messages Mar 25 15:45:25 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client 220df50b-a9c3-622e-d3c1-144907e17d4e (at 10.3.2.21@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 15:45:25 ALPL401 kernel: Lustre: Skipped 3 previous similar messages Mar 25 15:45:37 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 15:46:45 ALPL401 last message repeated 2 times Mar 25 15:47:12 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 16:00:44 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 6e94dc22-b462-936b-13da-8ca7350ae616 (at 10.3.1.38@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 16:00:44 ALPL401 kernel: Lustre: Skipped 15 previous similar messages Mar 25 16:01:53 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 16:05:10 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 9be58b88-13ae-cd7a-372a-a7b498fc32fa (at 10.3.2.20@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 16:05:10 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 25 16:05:37 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 16:10:46 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 25 17:05:33 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client 5c49a4c6-da43-d2f0-8ec7-ad757f7d47ef (at 10.3.2.9@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 25 17:05:33 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 25 17:05:40 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 17:06:12 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 17:10:46 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 25 17:25:18 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client f99c6118-e884-c819-2d98-13a483d04a16 (at 10.3.2.26@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 17:25:18 ALPL401 kernel: Lustre: Skipped 9 previous similar messages Mar 25 17:25:18 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client f99c6118-e884-c819-2d98-13a483d04a16 (at 10.3.2.26@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 17:25:18 ALPL401 kernel: Lustre: Skipped 2 previous similar messages Mar 25 17:25:34 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 17:26:09 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 17:45:39 ALPL401 last message repeated 2 times Mar 25 17:45:40 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client c8f17db1-3268-1faf-4a85-efd8f11a08ea (at 10.3.3.4@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 17:45:40 ALPL401 kernel: Lustre: Skipped 11 previous similar messages Mar 25 17:45:40 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client c8f17db1-3268-1faf-4a85-efd8f11a08ea (at 10.3.3.4@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 17:45:40 ALPL401 kernel: Lustre: Skipped 7 previous similar messages Mar 25 17:46:09 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 18:10:45 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 25 18:29:43 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client a8dbd026-f198-8295-3c09-a7a86a4eafdb (at 10.3.6.66@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 25 18:29:43 ALPL401 kernel: Lustre: Skipped 1 previous similar message Mar 25 18:29:46 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client a8dbd026-f198-8295-3c09-a7a86a4eafdb (at 10.3.6.66@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 18:29:49 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client a8dbd026-f198-8295-3c09-a7a86a4eafdb (at 10.3.6.66@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 18:29:49 ALPL401 kernel: Lustre: Skipped 1 previous similar message Mar 25 18:32:15 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client a8dbd026-f198-8295-3c09-a7a86a4eafdb (at 10.3.6.66@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 18:35:26 ALPL401 kernel: LustreError: 15434:0:(ldlm_lib.c:1921:target_send_reply_msg()) @@@ processing error (-107) req@ffff8105ef064800 x1478417547717609/t0 o101->@:0/0 lens 296/0 e 0 to 0 dl 1427279732 ref 1 fl Interpret:/0/0 rc -107/0 Mar 25 18:35:26 ALPL401 kernel: LustreError: 15434:0:(ldlm_lib.c:1921:target_send_reply_msg()) Skipped 13 previous similar messages Mar 25 18:40:13 ALPL401 kernel: LustreError: 15653:0:(ldlm_lib.c:1921:target_send_reply_msg()) @@@ processing error (-107) req@ffff810ae7482050 x1478417547769735/t0 o101->@:0/0 lens 304/0 e 0 to 0 dl 1427280019 ref 1 fl Interpret:/0/0 rc -107/0 Mar 25 18:40:13 ALPL401 kernel: LustreError: 15653:0:(ldlm_lib.c:1921:target_send_reply_msg()) Skipped 1 previous similar message Mar 25 18:41:01 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client a8dbd026-f198-8295-3c09-a7a86a4eafdb (at 10.3.6.66@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 25 18:41:22 ALPL401 kernel: LustreError: 15446:0:(ldlm_lib.c:1921:target_send_reply_msg()) @@@ processing error (-107) req@ffff8105c2f1a800 x1478417547778775/t0 o101->@:0/0 lens 296/0 e 0 to 0 dl 1427280125 ref 1 fl Interpret:/0/0 rc -107/0 Mar 25 18:41:22 ALPL401 kernel: LustreError: 15446:0:(ldlm_lib.c:1921:target_send_reply_msg()) Skipped 2 previous similar messages Mar 25 18:45:14 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 996b6e16-a1da-0599-0660-ef8616e82fe0 (at 10.3.4.2@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 18:45:14 ALPL401 kernel: Lustre: Skipped 1 previous similar message Mar 25 18:45:39 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 18:46:10 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 19:05:39 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client eba14250-22d6-e092-f2af-064faab033b5 (at 10.3.2.18@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 19:05:39 ALPL401 kernel: Lustre: Skipped 9 previous similar messages Mar 25 19:05:41 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 19:10:45 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 25 19:25:27 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 99c163c9-a48a-3f12-a45c-1ebe493b9698 (at 10.3.4.23@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 19:25:27 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 25 19:25:44 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 19:26:09 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 20:10:45 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 25 20:25:37 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 20:25:41 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client ae0aff64-ff9f-6c1b-db13-32926a25d3f1 (at 10.3.2.27@o2ib) in 335 seconds. 
I think it's dead, and I am evicting it. Mar 25 20:25:41 ALPL401 kernel: Lustre: Skipped 9 previous similar messages Mar 25 20:25:41 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client ae0aff64-ff9f-6c1b-db13-32926a25d3f1 (at 10.3.2.27@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 20:45:14 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 1cdd2eed-6f16-ecc7-5d45-dd676ee9d095 (at 10.3.3.58@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 20:45:14 ALPL401 kernel: Lustre: Skipped 3 previous similar messages Mar 25 20:45:42 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 20:46:15 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 21:10:45 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 25 22:05:15 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client ecc92ada-dbb4-dc5e-cbc6-80623f7fce13 (at 10.3.9.5@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 22:05:15 ALPL401 kernel: Lustre: Skipped 9 previous similar messages Mar 25 22:05:15 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client ecc92ada-dbb4-dc5e-cbc6-80623f7fce13 (at 10.3.9.5@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 22:05:15 ALPL401 kernel: Lustre: Skipped 1 previous similar message Mar 25 22:06:05 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 22:06:13 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 22:10:46 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 25 22:25:34 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client 40e046ec-c7bb-b99f-9b1a-a5b12600973a (at 10.3.2.21@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 25 22:25:34 ALPL401 kernel: Lustre: Skipped 7 previous similar messages Mar 25 22:25:39 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 22:26:11 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 22:43:37 ALPL401 kernel: Lustre: 15760:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1495855439883805 sent from LFS05-OST0051 to NID 10.3.2.21@o2ib 7s ago has timed out (7s prior to deadline). Mar 25 22:43:37 ALPL401 kernel: req@ffff810c29db0c00 x1495855439883805/t0 o106->@NET_0x500000a030215_UUID:15/16 lens 296/424 e 0 to 1 dl 1427294617 ref 2 fl Rpc:/0/0 rc 0/0 Mar 25 22:43:43 ALPL401 kernel: Lustre: 15591:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1495855439883806 sent from LFS05-OST0054 to NID 10.3.2.21@o2ib 13s ago has timed out (13s prior to deadline). Mar 25 22:43:43 ALPL401 kernel: req@ffff8105e8bb9800 x1495855439883806/t0 o106->@NET_0x500000a030215_UUID:15/16 lens 296/424 e 0 to 1 dl 1427294623 ref 2 fl Rpc:/0/0 rc 0/0 Mar 25 22:43:43 ALPL401 kernel: Lustre: 15591:0:(client.c:1529:ptlrpc_expire_one_request()) Skipped 1 previous similar message Mar 25 22:43:44 ALPL401 kernel: Lustre: 15760:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1495855439883805 sent from LFS05-OST0051 to NID 10.3.2.21@o2ib 7s ago has timed out (7s prior to deadline). Mar 25 22:43:44 ALPL401 kernel: req@ffff810c29db0c00 x1495855439883805/t0 o106->@NET_0x500000a030215_UUID:15/16 lens 296/424 e 0 to 1 dl 1427294624 ref 3 fl Rpc:/2/0 rc 0/0 Mar 25 22:43:44 ALPL401 kernel: Lustre: 15760:0:(client.c:1529:ptlrpc_expire_one_request()) Skipped 1 previous similar message Mar 25 22:43:51 ALPL401 kernel: Lustre: 15760:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1495855439883805 sent from LFS05-OST0051 to NID 10.3.2.21@o2ib 7s ago has timed out (7s prior to deadline). 
Mar 25 22:43:51 ALPL401 kernel: req@ffff810c29db0c00 x1495855439883805/t0 o106->@NET_0x500000a030215_UUID:15/16 lens 296/424 e 0 to 1 dl 1427294631 ref 4 fl Rpc:/2/0 rc 0/0 Mar 25 22:43:56 ALPL401 kernel: Lustre: 14667:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1495855439883804 sent from LFS05-OST0052 to NID 10.3.2.21@o2ib 13s ago has timed out (13s prior to deadline). Mar 25 22:43:56 ALPL401 kernel: req@ffff810134927000 x1495855439883804/t0 o106->@NET_0x500000a030215_UUID:15/16 lens 296/424 e 0 to 1 dl 1427294636 ref 3 fl Rpc:/2/0 rc 0/0 Mar 25 22:43:56 ALPL401 kernel: Lustre: 14667:0:(client.c:1529:ptlrpc_expire_one_request()) Skipped 1 previous similar message Mar 25 22:44:05 ALPL401 kernel: Lustre: 15760:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1495855439883805 sent from LFS05-OST0051 to NID 10.3.2.21@o2ib 7s ago has timed out (7s prior to deadline). Mar 25 22:44:05 ALPL401 kernel: req@ffff810c29db0c00 x1495855439883805/t0 o106->@NET_0x500000a030215_UUID:15/16 lens 296/424 e 0 to 1 dl 1427294645 ref 6 fl Rpc:/2/0 rc 0/0 Mar 25 22:44:05 ALPL401 kernel: Lustre: 15760:0:(client.c:1529:ptlrpc_expire_one_request()) Skipped 2 previous similar messages Mar 25 22:44:19 ALPL401 kernel: Lustre: 15760:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1495855439883805 sent from LFS05-OST0051 to NID 10.3.2.21@o2ib 7s ago has timed out (7s prior to deadline). Mar 25 22:44:19 ALPL401 kernel: req@ffff810c29db0c00 x1495855439883805/t0 o106->@NET_0x500000a030215_UUID:15/16 lens 296/424 e 0 to 1 dl 1427294659 ref 8 fl Rpc:/2/0 rc 0/0 Mar 25 22:44:19 ALPL401 kernel: Lustre: 15760:0:(client.c:1529:ptlrpc_expire_one_request()) Skipped 4 previous similar messages Mar 25 22:44:41 ALPL401 kernel: Lustre: 15760:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1495855439883805 sent from LFS05-OST0051 to NID 10.3.2.21@o2ib 7s ago has timed out (7s prior to deadline). 
Mar 25 22:44:41 ALPL401 kernel: req@ffff810c29db0c00 x1495855439883805/t0 o106->@NET_0x500000a030215_UUID:15/16 lens 296/424 e 0 to 1 dl 1427294681 ref 11 fl Rpc:/2/0 rc 0/0 Mar 25 22:44:41 ALPL401 kernel: Lustre: 15760:0:(client.c:1529:ptlrpc_expire_one_request()) Skipped 11 previous similar messages Mar 25 22:45:07 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client c9552633-dd01-5969-6634-ec04d558bc55 (at 10.3.2.21@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 22:45:07 ALPL401 kernel: Lustre: Skipped 9 previous similar messages Mar 25 22:45:07 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client c9552633-dd01-5969-6634-ec04d558bc55 (at 10.3.2.21@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 22:45:07 ALPL401 kernel: Lustre: Skipped 5 previous similar messages Mar 25 22:46:10 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 23:05:31 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 4e1621b4-714b-6052-4d58-5d985630298b (at 10.3.3.17@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 23:05:31 ALPL401 kernel: Lustre: Skipped 3 previous similar messages Mar 25 23:05:39 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 23:10:45 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 25 23:25:43 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client 8dfac636-94ea-3296-8118-41fe18b1c035 (at 10.3.3.17@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 23:25:43 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 25 23:25:43 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client 8dfac636-94ea-3296-8118-41fe18b1c035 (at 10.3.3.17@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 25 23:25:43 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 25 23:43:45 ALPL401 ntpd[6309]: synchronized to 10.3.8.66, stratum 4 Mar 25 23:45:13 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client c4270cb0-8ca8-3e88-759e-6434e9d6f30b (at 10.3.3.29@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 25 23:45:13 ALPL401 kernel: Lustre: Skipped 3 previous similar messages Mar 26 00:05:26 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client e5aa52e5-5b92-9e0b-3000-ff8e8fc6b8f1 (at 10.3.3.17@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 00:05:26 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 26 00:05:26 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client e5aa52e5-5b92-9e0b-3000-ff8e8fc6b8f1 (at 10.3.3.17@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 00:05:26 ALPL401 kernel: Lustre: Skipped 3 previous similar messages Mar 26 00:05:41 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 00:06:12 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 00:06:42 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 00:10:46 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 26 00:17:49 ALPL401 ntpd[6309]: time reset -2.739175 s Mar 26 00:21:24 ALPL401 ntpd[6309]: synchronized to LOCAL(0), stratum 10 Mar 26 00:24:40 ALPL401 ntpd[6309]: synchronized to 10.3.8.66, stratum 4 Mar 26 00:25:35 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client d11d00f6-79f5-1e09-16df-9561a0b8482a (at 10.3.3.29@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 00:25:35 ALPL401 kernel: Lustre: Skipped 5 previous similar messages Mar 26 00:25:36 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 00:45:15 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 75dee454-4c99-e4e2-faf0-4247ad97c895 (at 10.3.3.40@o2ib) in 335 seconds. 
I think it's dead, and I am evicting it. Mar 26 00:45:15 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 26 00:45:36 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 00:46:05 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 01:05:08 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client d1b8f965-4633-4af4-88c9-56a7f2127a09 (at 10.3.3.38@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 01:05:08 ALPL401 kernel: Lustre: Skipped 9 previous similar messages Mar 26 01:05:08 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client d1b8f965-4633-4af4-88c9-56a7f2127a09 (at 10.3.3.38@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 01:05:08 ALPL401 kernel: Lustre: Skipped 2 previous similar messages Mar 26 01:05:38 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 01:06:32 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 01:07:02 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 711c6cb2-02b8-befa-5f7b-ecefaba9fbcf (at 10.3.9.32@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 01:07:02 ALPL401 kernel: Lustre: Skipped 11 previous similar messages Mar 26 01:07:06 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 01:07:19 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 01:10:46 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 26 01:25:21 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client 38b46147-83e1-4aa0-fcd2-125a0dd1721a (at 10.3.9.31@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 26 01:25:21 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 26 01:25:49 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 02:10:46 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 26 03:10:46 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 26 03:25:16 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 89e6e133-a2ee-4082-096e-feb2581ddd06 (at 10.3.1.49@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 03:25:16 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 26 03:25:16 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client 89e6e133-a2ee-4082-096e-feb2581ddd06 (at 10.3.1.49@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 03:25:16 ALPL401 kernel: Lustre: Skipped 5 previous similar messages Mar 26 03:25:40 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 03:26:39 ALPL401 last message repeated 2 times Mar 26 04:05:10 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client 8293f4cf-f997-b9b6-04f2-05f1f2631e10 (at 10.3.5.23@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 04:05:10 ALPL401 kernel: Lustre: Skipped 8 previous similar messages Mar 26 04:05:39 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 04:06:08 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 04:10:47 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 26 04:25:36 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 879df44d-6567-a545-bd6e-31fb04a2ae6e (at 10.3.2.42@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 04:25:36 ALPL401 kernel: Lustre: Skipped 9 previous similar messages Mar 26 04:25:36 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client 879df44d-6567-a545-bd6e-31fb04a2ae6e (at 10.3.2.42@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 26 04:25:36 ALPL401 kernel: Lustre: Skipped 1 previous similar message Mar 26 04:25:37 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 04:45:31 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client ad48a26e-7b20-42b2-23da-c1a48385b9a9 (at 10.3.4.56@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 04:45:31 ALPL401 kernel: Lustre: Skipped 2 previous similar messages Mar 26 04:45:40 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 04:46:14 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 04:47:12 ALPL401 last message repeated 2 times Mar 26 04:52:53 ALPL401 ntpd[6309]: synchronized to LOCAL(0), stratum 10 Mar 26 04:54:56 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 34e1f0ec-c648-3d7b-bc5e-a6ca8218e646 (at 10.3.5.31@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 04:54:56 ALPL401 kernel: Lustre: Skipped 19 previous similar messages Mar 26 05:05:08 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client 00826c0e-8b85-9f14-57c6-81d43f64c03a (at 10.3.3.24@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 05:05:08 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 26 05:05:44 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 05:10:48 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 26 05:25:42 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 05:45:11 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client a63c1a20-81f9-35cc-2486-71915d4d8ee9 (at 10.3.1.59@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 05:45:11 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 26 05:45:11 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client a63c1a20-81f9-35cc-2486-71915d4d8ee9 (at 10.3.1.59@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 26 05:45:11 ALPL401 kernel: Lustre: Skipped 3 previous similar messages Mar 26 05:45:45 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 05:46:06 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 06:10:48 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 26 07:05:43 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client 15a51e4c-80fc-dd92-01fd-0a7922cc9823 (at 10.3.9.5@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 07:05:43 ALPL401 kernel: Lustre: Skipped 5 previous similar messages Mar 26 07:06:07 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 07:06:36 ALPL401 last message repeated 2 times Mar 26 07:10:49 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 26 07:25:19 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 05e86d25-4907-c7b8-9bbf-6abb70704f14 (at 10.3.3.11@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 07:25:19 ALPL401 kernel: Lustre: Skipped 14 previous similar messages Mar 26 07:25:39 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 08:05:14 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client 6e11be27-d508-8bb5-0ca9-e3c9085969c7 (at 10.3.3.32@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 08:05:14 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 26 08:05:14 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 6e11be27-d508-8bb5-0ca9-e3c9085969c7 (at 10.3.3.32@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 26 08:05:14 ALPL401 kernel: Lustre: Skipped 1 previous similar message Mar 26 08:05:42 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 08:10:49 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 26 08:25:11 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 1f7ad9ab-95ea-d07a-ec46-fb188d227130 (at 10.3.9.30@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 08:25:11 ALPL401 kernel: Lustre: Skipped 2 previous similar messages Mar 26 08:25:36 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 08:45:29 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client bb2cca60-1294-60e9-d2ca-f7ac42ad7c29 (at 10.3.2.20@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 08:45:29 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 26 08:45:29 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client bb2cca60-1294-60e9-d2ca-f7ac42ad7c29 (at 10.3.2.20@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 08:45:29 ALPL401 kernel: Lustre: Skipped 14 previous similar messages Mar 26 08:46:15 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 08:47:10 ALPL401 last message repeated 2 times Mar 26 08:47:41 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client 1ac0c044-dd7e-6e56-89a5-115b0197b11a (at 10.3.4.36@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 08:47:41 ALPL401 kernel: Lustre: Skipped 9 previous similar messages Mar 26 08:47:41 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 08:48:14 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 08:49:19 ALPL401 last message repeated 2 times Mar 26 08:49:40 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 883468c0-1df8-b323-080b-928e6bebafc2 (at 10.3.5.8@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 26 08:49:40 ALPL401 kernel: Lustre: Skipped 14 previous similar messages Mar 26 08:49:40 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 08:50:15 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 08:51:42 ALPL401 last message repeated 3 times Mar 26 08:52:03 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 901f948b-21e8-0389-a926-da662bc48de1 (at 10.3.5.54@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 08:52:03 ALPL401 kernel: Lustre: Skipped 24 previous similar messages Mar 26 08:52:14 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 08:53:16 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 08:53:44 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 08:54:13 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client da5c56b0-7550-8c77-e017-4b07dc92587e (at 10.3.6.33@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 08:54:13 ALPL401 kernel: Lustre: Skipped 19 previous similar messages Mar 26 08:54:42 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 08:55:45 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 08:56:05 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client 71f83b12-c11d-818f-e3a8-f22e096f82d7 (at 10.3.7.18@o2ib) in 334 seconds. I think it's dead, and I am evicting it. Mar 26 08:56:05 ALPL401 kernel: Lustre: Skipped 19 previous similar messages Mar 26 08:56:22 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 08:56:56 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 08:57:47 ALPL401 last message repeated 2 times Mar 26 08:58:06 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client 75dc59fc-c075-2001-e7b9-0cb5dd3786e4 (at 10.3.8.5@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 26 08:58:06 ALPL401 kernel: Lustre: Skipped 19 previous similar messages Mar 26 08:58:16 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 08:58:48 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 08:59:26 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 09:00:02 ALPL401 kernel: Lustre: LFS05-OST0051: haven't heard from client 838de325-db6a-b254-479a-504088178e71 (at 10.3.8.34@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 09:00:02 ALPL401 kernel: Lustre: Skipped 19 previous similar messages Mar 26 09:00:44 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 09:01:20 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 09:01:56 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 09:01:58 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client a83efbbd-14a2-0c13-d2b0-eff155c09366 (at 10.3.8.46@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 09:01:58 ALPL401 kernel: Lustre: Skipped 19 previous similar messages Mar 26 09:02:03 ALPL401 kernel: LustreError: 137-5: UUID 'LFS05-OST0057_UUID' is not available for connect (no target) Mar 26 09:02:03 ALPL401 kernel: LustreError: Skipped 13 previous similar messages Mar 26 09:02:03 ALPL401 kernel: LustreError: 15426:0:(ldlm_lib.c:1921:target_send_reply_msg()) @@@ processing error (-19) req@ffff810621ecc800 x1496665778290964/t0 o8->@:0/0 lens 368/0 e 0 to 0 dl 1427331873 ref 1 fl Interpret:/0/0 rc -19/0 Mar 26 09:02:03 ALPL401 kernel: LustreError: 15426:0:(ldlm_lib.c:1921:target_send_reply_msg()) Skipped 1 previous similar message Mar 26 09:02:18 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 09:02:25 ALPL401 kernel: LustreError: 137-5: UUID 'LFS05-OST0057_UUID' is not available for connect (no target) Mar 26 09:02:25 ALPL401 kernel: LustreError: 15678:0:(ldlm_lib.c:1921:target_send_reply_msg()) @@@ processing error (-19) req@ffff810abfe2b050 x1496665802408212/t0 
o8->@:0/0 lens 368/0 e 0 to 0 dl 1427331895 ref 1 fl Interpret:/0/0 rc -19/0 Mar 26 09:02:49 ALPL401 kernel: LustreError: 137-5: UUID 'LFS05-OST0057_UUID' is not available for connect (no target) Mar 26 09:02:49 ALPL401 kernel: LustreError: 14593:0:(ldlm_lib.c:1921:target_send_reply_msg()) @@@ processing error (-19) req@ffff81063ea39000 x1496665802408400/t0 o8->@:0/0 lens 368/0 e 0 to 0 dl 1427331919 ref 1 fl Interpret:/0/0 rc -19/0 Mar 26 09:03:20 ALPL401 kernel: LustreError: 137-5: UUID 'LFS05-OST0057_UUID' is not available for connect (no target) Mar 26 09:03:20 ALPL401 kernel: LustreError: 14693:0:(ldlm_lib.c:1921:target_send_reply_msg()) @@@ processing error (-19) req@ffff8105f19fc400 x1496665778291418/t0 o8->@:0/0 lens 368/0 e 0 to 0 dl 1427331950 ref 1 fl Interpret:/0/0 rc -19/0 Mar 26 09:03:52 ALPL401 kernel: LustreError: 137-5: UUID 'LFS05-OST0057_UUID' is not available for connect (no target) Mar 26 09:03:52 ALPL401 kernel: LustreError: 15513:0:(ldlm_lib.c:1921:target_send_reply_msg()) @@@ processing error (-19) req@ffff8105f353d000 x1496665802408707/t0 o8->@:0/0 lens 368/0 e 0 to 0 dl 1427331982 ref 1 fl Interpret:/0/0 rc -19/0 Mar 26 09:04:14 ALPL401 kernel: Lustre: LFS05-OST0053: haven't heard from client 221c008c-b2ff-bac0-f5f4-cb7e01b18635 (at 10.3.8.56@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 26 09:04:14 ALPL401 kernel: Lustre: Skipped 19 previous similar messages Mar 26 09:04:34 ALPL401 kernel: LustreError: 137-5: UUID 'LFS05-OST0057_UUID' is not available for connect (no target) Mar 26 09:04:34 ALPL401 kernel: LustreError: 14638:0:(ldlm_lib.c:1921:target_send_reply_msg()) @@@ processing error (-19) req@ffff810abe229450 x1496665778291946/t0 o8->@:0/0 lens 368/0 e 0 to 0 dl 1427332024 ref 1 fl Interpret:/0/0 rc -19/0 Mar 26 09:04:50 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 09:04:57 ALPL401 kernel: LustreError: 137-5: UUID 'LFS05-OST0057_UUID' is not available for connect (no target) Mar 26 09:04:57 ALPL401 kernel: LustreError: 14626:0:(ldlm_lib.c:1921:target_send_reply_msg()) @@@ processing error (-19) req@ffff8105f1a46c00 x1496665962840340/t0 o8->@:0/0 lens 368/0 e 0 to 0 dl 1427332047 ref 1 fl Interpret:/0/0 rc -19/0 Mar 26 09:05:21 ALPL401 kernel: LustreError: 137-5: UUID 'LFS05-OST0057_UUID' is not available for connect (no target) Mar 26 09:05:21 ALPL401 kernel: LustreError: Skipped 1 previous similar message Mar 26 09:05:21 ALPL401 kernel: LustreError: 15753:0:(ldlm_lib.c:1921:target_send_reply_msg()) @@@ processing error (-19) req@ffff8105f05e3000 x1496665962840528/t0 o8->@:0/0 lens 368/0 e 0 to 0 dl 1427332071 ref 1 fl Interpret:/0/0 rc -19/0 Mar 26 09:05:21 ALPL401 kernel: LustreError: 15753:0:(ldlm_lib.c:1921:target_send_reply_msg()) Skipped 1 previous similar message Mar 26 09:05:53 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 09:06:00 ALPL401 kernel: LustreError: 137-5: UUID 'LFS05-OST0057_UUID' is not available for connect (no target) Mar 26 09:06:00 ALPL401 kernel: LustreError: Skipped 2 previous similar messages Mar 26 09:06:00 ALPL401 kernel: LustreError: 15622:0:(ldlm_lib.c:1921:target_send_reply_msg()) @@@ processing error (-19) req@ffff810621905800 x1496666027852052/t0 o8->@:0/0 lens 368/0 e 0 to 0 dl 1427332110 ref 1 fl Interpret:/0/0 rc -19/0 Mar 26 09:06:00 ALPL401 kernel: 
LustreError: 15622:0:(ldlm_lib.c:1921:target_send_reply_msg()) Skipped 2 previous similar messages Mar 26 09:06:38 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 09:07:01 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 09:07:08 ALPL401 kernel: LustreError: 137-5: UUID 'LFS05-OST0057_UUID' is not available for connect (no target) Mar 26 09:07:08 ALPL401 kernel: LustreError: Skipped 7 previous similar messages Mar 26 09:07:08 ALPL401 kernel: LustreError: 14646:0:(ldlm_lib.c:1921:target_send_reply_msg()) @@@ processing error (-19) req@ffff8105f35b0400 x1496666099155220/t0 o8->@:0/0 lens 368/0 e 0 to 0 dl 1427332178 ref 1 fl Interpret:/0/0 rc -19/0 Mar 26 09:07:08 ALPL401 kernel: LustreError: 14646:0:(ldlm_lib.c:1921:target_send_reply_msg()) Skipped 7 previous similar messages Mar 26 09:08:23 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 09:09:21 ALPL401 kernel: LustreError: 137-5: UUID 'LFS05-OST0057_UUID' is not available for connect (no target) Mar 26 09:09:21 ALPL401 kernel: LustreError: Skipped 19 previous similar messages Mar 26 09:09:21 ALPL401 kernel: LustreError: 15697:0:(ldlm_lib.c:1921:target_send_reply_msg()) @@@ processing error (-19) req@ffff8105e865e000 x1496665778296090/t0 o8->@:0/0 lens 368/0 e 0 to 0 dl 1427332311 ref 1 fl Interpret:/0/0 rc -19/0 Mar 26 09:09:21 ALPL401 kernel: LustreError: 15697:0:(ldlm_lib.c:1921:target_send_reply_msg()) Skipped 19 previous similar messages Mar 26 09:10:49 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 26 09:13:46 ALPL401 kernel: LustreError: 0:0:(ldlm_lockd.c:315:waiting_locks_callback()) ### lock callback timer expired after 101s: evicting client at 10.3.2.42@o2ib ns: filter-LFS05-OST0052_UUID lock: ffff8109fe676600/0xa230552c8e1a726c lrc: 3/0,0 mode: PW/PW res: 363419942/0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->4095) flags: 0x20 remote: 0xae0e521e655767de expref: 25 pid: 15712 timeout 5069248931 Mar 26 09:13:46 
ALPL401 kernel: LustreError: 0:0:(ldlm_lockd.c:315:waiting_locks_callback()) Skipped 1 previous similar message Mar 26 09:13:52 ALPL401 kernel: LustreError: 15712:0:(ldlm_lib.c:1921:target_send_reply_msg()) @@@ processing error (-107) req@ffff8105ef064400 x1496648394215007/t0 o400->@:0/0 lens 192/0 e 0 to 0 dl 1427332448 ref 1 fl Interpret:H/0/0 rc -107/0 Mar 26 09:13:52 ALPL401 kernel: LustreError: 15712:0:(ldlm_lib.c:1921:target_send_reply_msg()) Skipped 42 previous similar messages Mar 26 09:13:57 ALPL401 kernel: LustreError: 137-5: UUID 'LFS05-OST0057_UUID' is not available for connect (no target) Mar 26 09:13:57 ALPL401 kernel: LustreError: Skipped 42 previous similar messages Mar 26 09:22:30 ALPL401 kernel: LustreError: 137-5: UUID 'LFS05-OST0057_UUID' is not available for connect (no target) Mar 26 09:22:30 ALPL401 kernel: LustreError: Skipped 63 previous similar messages Mar 26 09:22:30 ALPL401 kernel: LustreError: 15567:0:(ldlm_lib.c:1921:target_send_reply_msg()) @@@ processing error (-19) req@ffff8105e7d8ac00 x1496665802451236/t0 o8->@:0/0 lens 368/0 e 0 to 0 dl 1427333100 ref 1 fl Interpret:/0/0 rc -19/0 Mar 26 09:22:30 ALPL401 kernel: LustreError: 15567:0:(ldlm_lib.c:1921:target_send_reply_msg()) Skipped 64 previous similar messages Mar 26 09:25:43 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 09:26:44 ALPL401 last message repeated 2 times Mar 26 09:28:16 ALPL401 last message repeated 3 times Mar 26 09:32:31 ALPL401 kernel: LustreError: 137-5: UUID 'LFS05-OST0057_UUID' is not available for connect (no target) Mar 26 09:32:31 ALPL401 kernel: LustreError: Skipped 106 previous similar messages Mar 26 09:32:31 ALPL401 kernel: LustreError: 14691:0:(ldlm_lib.c:1921:target_send_reply_msg()) @@@ processing error (-19) req@ffff8105f353d000 x1496667370033094/t0 o8->@:0/0 lens 368/0 e 0 to 0 dl 1427333701 ref 1 fl Interpret:/0/0 rc -19/0 Mar 26 09:32:31 ALPL401 kernel: LustreError: 14691:0:(ldlm_lib.c:1921:target_send_reply_msg()) Skipped 106 
previous similar messages Mar 26 09:32:43 ALPL401 kernel: LustreError: 0:0:(ldlm_lockd.c:315:waiting_locks_callback()) ### lock callback timer expired after 100s: evicting client at 10.3.3.17@o2ib ns: filter-LFS05-OST0052_UUID lock: ffff810be78adc00/0xa230552c8e1a8352 lrc: 3/0,0 mode: PW/PW res: 363419942/0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->4095) flags: 0x20 remote: 0x42efddc13e7c0b83 expref: 5 pid: 15618 timeout 5070385941 Mar 26 09:35:30 ALPL401 kernel: sd 6:0:0:1: Warning! Received an indication that the LUN assignments on this target have changed. The Linux SCSI layer does not automatically remap LUN assignments. Mar 26 09:35:30 ALPL401 kernel: igb: eth2 NIC Link is Down Mar 26 09:35:32 ALPL401 kernel: igb: eth2 NIC Link is Up 10 Mbps Full Duplex, Flow Control: RX/TX Mar 26 09:35:44 ALPL401 kernel: sd 7:0:0:1: Warning! Received an indication that the LUN assignments on this target have changed. The Linux SCSI layer does not automatically remap LUN assignments. Mar 26 09:35:51 ALPL401 kernel: igb: eth2 NIC Link is Down Mar 26 09:35:53 ALPL401 kernel: igb: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX Mar 26 09:36:18 ALPL401 kernel: LDISKFS-fs warning (device dm-5): ldiskfs_multi_mount_protect: MMP interval 42 higher than expected, please wait. 
Mar 26 09:36:18 ALPL401 kernel: Mar 26 09:36:53 ALPL401 kernel: igb: eth2 NIC Link is Down Mar 26 09:37:01 ALPL401 kernel: LDISKFS-fs (dm-5): recovery complete Mar 26 09:37:01 ALPL401 kernel: LDISKFS-fs (dm-5): mounted filesystem with ordered data mode Mar 26 09:37:01 ALPL401 multipathd: dm-5: umount map (uevent) Mar 26 09:37:01 ALPL401 kernel: JBD: barrier-based sync failed on dm-5-8 - disabling barriers Mar 26 09:37:01 ALPL401 kernel: LDISKFS-fs (dm-5): mounted filesystem with ordered data mode Mar 26 09:37:03 ALPL401 kernel: Lustre: 28007:0:(filter.c:1003:filter_init_server_data()) RECOVERY: service LFS05-OST0055, 554 recoverable clients, 0 delayed clients, last_rcvd 51579108031 Mar 26 09:37:03 ALPL401 kernel: Lustre: 28007:0:(filter.c:1003:filter_init_server_data()) Skipped 1 previous similar message Mar 26 09:37:03 ALPL401 kernel: JBD: barrier-based sync failed on dm-5-8 - disabling barriers Mar 26 09:37:03 ALPL401 kernel: Lustre: LFS05-OST0055: Now serving LFS05-OST0055 on /dev/mpath/OST05 with recovery enabled Mar 26 09:37:03 ALPL401 kernel: Lustre: Skipped 1 previous similar message Mar 26 09:37:03 ALPL401 kernel: Lustre: LFS05-OST0055: Will be in recovery for at least 7:30, or until 554 clients reconnect Mar 26 09:37:03 ALPL401 kernel: Lustre: Skipped 1 previous similar message Mar 26 09:37:03 ALPL401 kernel: Lustre: 15411:0:(ldlm_lib.c:1817:target_queue_last_replay_reply()) LFS05-OST0055: 553 recoverable clients remain Mar 26 09:37:03 ALPL401 kernel: Lustre: 15411:0:(ldlm_lib.c:1817:target_queue_last_replay_reply()) Skipped 1379 previous similar messages Mar 26 09:37:04 ALPL401 kernel: LDISKFS-fs warning (device dm-6): ldiskfs_multi_mount_protect: MMP interval 42 higher than expected, please wait. 
Mar 26 09:37:04 ALPL401 kernel: Mar 26 09:37:19 ALPL401 kernel: Lustre: 15679:0:(ldlm_lib.c:1817:target_queue_last_replay_reply()) LFS05-OST0055: 430 recoverable clients remain Mar 26 09:37:19 ALPL401 kernel: Lustre: 15679:0:(ldlm_lib.c:1817:target_queue_last_replay_reply()) Skipped 122 previous similar messages Mar 26 09:37:23 ALPL401 kernel: igb: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX Mar 26 09:37:45 ALPL401 kernel: Lustre: 15738:0:(ldlm_lib.c:576:target_handle_reconnect()) LFS05-OST0055: fee7adbe-ed97-441d-7ac4-8f5b9fb2407f reconnecting Mar 26 09:37:48 ALPL401 kernel: LDISKFS-fs (dm-6): recovery complete Mar 26 09:37:48 ALPL401 kernel: LDISKFS-fs (dm-6): mounted filesystem with ordered data mode Mar 26 09:37:48 ALPL401 multipathd: dm-6: umount map (uevent) Mar 26 09:37:48 ALPL401 kernel: JBD: barrier-based sync failed on dm-6-8 - disabling barriers Mar 26 09:37:48 ALPL401 kernel: Lustre: 15622:0:(ldlm_lib.c:576:target_handle_reconnect()) LFS05-OST0055: 26b8c416-c1f5-a498-cc17-a0e79835cdac reconnecting Mar 26 09:37:48 ALPL401 kernel: Lustre: 15622:0:(ldlm_lib.c:576:target_handle_reconnect()) Skipped 149 previous similar messages Mar 26 09:37:48 ALPL401 kernel: LDISKFS-fs (dm-6): mounted filesystem with ordered data mode Mar 26 09:37:49 ALPL401 kernel: Lustre: 28330:0:(filter.c:1003:filter_init_server_data()) RECOVERY: service LFS05-OST0056, 550 recoverable clients, 0 delayed clients, last_rcvd 51589090675 Mar 26 09:37:49 ALPL401 kernel: JBD: barrier-based sync failed on dm-6-8 - disabling barriers Mar 26 09:37:49 ALPL401 kernel: Lustre: LFS05-OST0056: Now serving LFS05-OST0056 on /dev/mpath/OST06 with recovery enabled Mar 26 09:37:49 ALPL401 kernel: Lustre: LFS05-OST0056: Will be in recovery for at least 7:30, or until 550 clients reconnect Mar 26 09:37:50 ALPL401 kernel: LDISKFS-fs warning (device dm-7): ldiskfs_multi_mount_protect: MMP interval 42 higher than expected, please wait. 
Mar 26 09:37:50 ALPL401 kernel: Mar 26 09:37:52 ALPL401 kernel: Lustre: 15455:0:(ldlm_lib.c:1817:target_queue_last_replay_reply()) LFS05-OST0055: 170 recoverable clients remain Mar 26 09:37:52 ALPL401 kernel: Lustre: 15455:0:(ldlm_lib.c:1817:target_queue_last_replay_reply()) Skipped 259 previous similar messages Mar 26 09:37:54 ALPL401 kernel: Lustre: 15495:0:(ldlm_lib.c:576:target_handle_reconnect()) LFS05-OST0055: 5bae9bad-b216-9731-49ea-68f92f167c89 reconnecting Mar 26 09:38:14 ALPL401 kernel: Lustre: 15577:0:(ldlm_lib.c:576:target_handle_reconnect()) LFS05-OST0055: d39d7ed1-7a4a-184b-85e4-597c40fd04a9 reconnecting Mar 26 09:38:14 ALPL401 kernel: Lustre: 15577:0:(ldlm_lib.c:576:target_handle_reconnect()) Skipped 149 previous similar messages Mar 26 09:38:34 ALPL401 kernel: LDISKFS-fs (dm-7): recovery complete Mar 26 09:38:34 ALPL401 kernel: LDISKFS-fs (dm-7): mounted filesystem with ordered data mode Mar 26 09:38:34 ALPL401 multipathd: dm-7: umount map (uevent) Mar 26 09:38:34 ALPL401 kernel: JBD: barrier-based sync failed on dm-7-8 - disabling barriers Mar 26 09:38:35 ALPL401 kernel: LDISKFS-fs (dm-7): mounted filesystem with ordered data mode Mar 26 09:38:35 ALPL401 kernel: Lustre: 28620:0:(filter.c:1003:filter_init_server_data()) RECOVERY: service LFS05-OST0057, 541 recoverable clients, 0 delayed clients, last_rcvd 51586630838 Mar 26 09:38:35 ALPL401 kernel: JBD: barrier-based sync failed on dm-7-8 - disabling barriers Mar 26 09:38:35 ALPL401 kernel: Lustre: LFS05-OST0057: Now serving LFS05-OST0057 on /dev/mpath/OST07 with recovery enabled Mar 26 09:38:35 ALPL401 kernel: Lustre: LFS05-OST0057: Will be in recovery for at least 7:30, or until 541 clients reconnect Mar 26 09:38:35 ALPL401 kernel: LDISKFS-fs warning (device dm-8): ldiskfs_multi_mount_protect: MMP interval 42 higher than expected, please wait. 
Mar 26 09:38:35 ALPL401 kernel: Mar 26 09:38:41 ALPL401 kernel: Lustre: 15614:0:(ldlm_lib.c:576:target_handle_reconnect()) LFS05-OST0056: eb213ea3-479e-b802-bb09-3f117b090d69 reconnecting Mar 26 09:38:41 ALPL401 kernel: Lustre: 15614:0:(ldlm_lib.c:576:target_handle_reconnect()) Skipped 7 previous similar messages Mar 26 09:38:41 ALPL401 kernel: LustreError: 15535:0:(ldlm_lib.c:946:target_handle_connect()) LFS05-OST0057: denying connection for new client 10.3.6.57@o2ib (3787053a-d829-e08f-ec66-bb1855166ad9): 541 clients in recovery for 820s Mar 26 09:38:41 ALPL401 kernel: LustreError: 15648:0:(ldlm_lib.c:946:target_handle_connect()) LFS05-OST0057: denying connection for new client 10.3.8.49@o2ib (2239d008-05c1-1187-48bb-b7befcf09224): 541 clients in recovery for 820s Mar 26 09:39:00 ALPL401 kernel: Lustre: 15610:0:(ldlm_lib.c:1817:target_queue_last_replay_reply()) LFS05-OST0056: 70 recoverable clients remain Mar 26 09:39:00 ALPL401 kernel: Lustre: 15610:0:(ldlm_lib.c:1817:target_queue_last_replay_reply()) Skipped 871 previous similar messages Mar 26 09:39:23 ALPL401 kernel: LDISKFS-fs (dm-8): recovery complete Mar 26 09:39:23 ALPL401 kernel: LDISKFS-fs (dm-8): mounted filesystem with ordered data mode Mar 26 09:39:23 ALPL401 multipathd: dm-8: umount map (uevent) Mar 26 09:39:23 ALPL401 kernel: JBD: barrier-based sync failed on dm-8-8 - disabling barriers Mar 26 09:39:24 ALPL401 kernel: LDISKFS-fs (dm-8): mounted filesystem with ordered data mode Mar 26 09:39:24 ALPL401 kernel: Lustre: 29229:0:(filter.c:1003:filter_init_server_data()) RECOVERY: service LFS05-OST0058, 550 recoverable clients, 0 delayed clients, last_rcvd 51582043166 Mar 26 09:39:24 ALPL401 kernel: JBD: barrier-based sync failed on dm-8-8 - disabling barriers Mar 26 09:39:24 ALPL401 kernel: Lustre: LFS05-OST0058: Now serving LFS05-OST0058 on /dev/mpath/OST08 with recovery enabled Mar 26 09:39:24 ALPL401 kernel: Lustre: LFS05-OST0058: Will be in recovery for at least 7:30, or until 550 clients reconnect 
Mar 26 09:39:25 ALPL401 kernel: LDISKFS-fs warning (device dm-9): ldiskfs_multi_mount_protect: MMP interval 42 higher than expected, please wait. Mar 26 09:39:25 ALPL401 kernel: Mar 26 09:39:35 ALPL401 kernel: LustreError: 15729:0:(ldlm_lib.c:946:target_handle_connect()) LFS05-OST0057: denying connection for new client 10.3.8.53@o2ib (5bf23e13-0c6d-4e76-5a7b-24ff4966c20d): 273 clients in recovery for 766s Mar 26 09:39:35 ALPL401 kernel: LustreError: 15729:0:(ldlm_lib.c:946:target_handle_connect()) Skipped 6 previous similar messages Mar 26 09:39:35 ALPL401 kernel: Lustre: 15393:0:(ldlm_lib.c:576:target_handle_reconnect()) LFS05-OST0057: 8e4ef019-7af6-26c9-90eb-88463217be78 reconnecting Mar 26 09:39:35 ALPL401 kernel: Lustre: 15393:0:(ldlm_lib.c:576:target_handle_reconnect()) Skipped 190 previous similar messages Mar 26 09:39:36 ALPL401 kernel: LustreError: 14619:0:(ldlm_lib.c:946:target_handle_connect()) LFS05-OST0057: denying connection for new client 10.3.8.49@o2ib (2239d008-05c1-1187-48bb-b7befcf09224): 271 clients in recovery for 765s Mar 26 09:39:36 ALPL401 kernel: LustreError: 14619:0:(ldlm_lib.c:946:target_handle_connect()) Skipped 14 previous similar messages Mar 26 09:40:02 ALPL401 kernel: LustreError: 15682:0:(ldlm_lib.c:946:target_handle_connect()) LFS05-OST0057: denying connection for new client 10.3.2.20@o2ib (ae385a37-5b81-fadc-ab47-620d883a35ae): 143 clients in recovery for 739s Mar 26 09:40:02 ALPL401 kernel: LustreError: 15682:0:(ldlm_lib.c:946:target_handle_connect()) Skipped 6 previous similar messages Mar 26 09:40:07 ALPL401 kernel: LDISKFS-fs (dm-9): recovery complete Mar 26 09:40:07 ALPL401 kernel: LDISKFS-fs (dm-9): mounted filesystem with ordered data mode Mar 26 09:40:07 ALPL401 multipathd: dm-9: umount map (uevent) Mar 26 09:40:07 ALPL401 kernel: JBD: barrier-based sync failed on dm-9-8 - disabling barriers Mar 26 09:40:07 ALPL401 kernel: LDISKFS-fs (dm-9): mounted filesystem with ordered data mode Mar 26 09:40:07 ALPL401 kernel: Lustre: 
29621:0:(filter.c:1003:filter_init_server_data()) RECOVERY: service LFS05-OST0059, 554 recoverable clients, 0 delayed clients, last_rcvd 55875724785 Mar 26 09:40:07 ALPL401 kernel: JBD: barrier-based sync failed on dm-9-8 - disabling barriers Mar 26 09:40:07 ALPL401 kernel: Lustre: LFS05-OST0059: Now serving LFS05-OST0059 on /dev/mpath/OST09 with recovery enabled Mar 26 09:40:07 ALPL401 kernel: Lustre: LFS05-OST0059: Will be in recovery for at least 7:30, or until 554 clients reconnect Mar 26 09:40:23 ALPL401 kernel: LustreError: 15439:0:(ldlm_lib.c:946:target_handle_connect()) LFS05-OST0057: denying connection for new client 10.3.8.49@o2ib (2239d008-05c1-1187-48bb-b7befcf09224): 20 clients in recovery for 719s Mar 26 09:40:23 ALPL401 kernel: LustreError: 15439:0:(ldlm_lib.c:946:target_handle_connect()) Skipped 12 previous similar messages Mar 26 09:41:06 ALPL401 kernel: LustreError: 14692:0:(ldlm_lib.c:946:target_handle_connect()) LFS05-OST0057: denying connection for new client 10.3.5.58@o2ib (4c06ecbc-dd38-d3dd-e132-fb06fea787bc): 6 clients in recovery for 675s Mar 26 09:41:06 ALPL401 kernel: LustreError: 14692:0:(ldlm_lib.c:946:target_handle_connect()) Skipped 11 previous similar messages Mar 26 09:41:06 ALPL401 kernel: Lustre: 15491:0:(ldlm_lib.c:576:target_handle_reconnect()) LFS05-OST0059: 430d4c08-e9b2-d8a1-e610-86fb1a0fac13 reconnecting Mar 26 09:41:06 ALPL401 kernel: Lustre: 15491:0:(ldlm_lib.c:576:target_handle_reconnect()) Skipped 1436 previous similar messages Mar 26 09:41:08 ALPL401 kernel: Lustre: 15628:0:(ldlm_lib.c:1817:target_queue_last_replay_reply()) LFS05-OST0059: 245 recoverable clients remain Mar 26 09:41:08 ALPL401 kernel: Lustre: 15628:0:(ldlm_lib.c:1817:target_queue_last_replay_reply()) Skipped 1231 previous similar messages Mar 26 09:41:24 ALPL401 kernel: LustreError: 15442:0:(ldlm_lib.c:946:target_handle_connect()) LFS05-OST0057: denying connection for new client 10.3.8.46@o2ib (d6224002-f960-ab91-5335-cd97c5227ed9): 6 clients in 
recovery for 657s Mar 26 09:41:24 ALPL401 kernel: LustreError: 15442:0:(ldlm_lib.c:946:target_handle_connect()) Skipped 15 previous similar messages Mar 26 09:41:56 ALPL401 kernel: LustreError: 15764:0:(ldlm_lib.c:946:target_handle_connect()) LFS05-OST0057: denying connection for new client 10.3.8.51@o2ib (003d9c9d-b322-544f-f763-81a1165cba37): 6 clients in recovery for 625s Mar 26 09:41:56 ALPL401 kernel: LustreError: 15764:0:(ldlm_lib.c:946:target_handle_connect()) Skipped 29 previous similar messages Mar 26 09:42:32 ALPL401 kernel: LustreError: 15405:0:(ldlm_lib.c:1921:target_send_reply_msg()) @@@ processing error (-16) req@ffff8105efeac000 x1496665802464087/t0 o8->@:0/0 lens 368/264 e 0 to 0 dl 1427334302 ref 1 fl Interpret:/0/0 rc -16/0 Mar 26 09:42:32 ALPL401 kernel: LustreError: 15405:0:(ldlm_lib.c:1921:target_send_reply_msg()) Skipped 17284 previous similar messages Mar 26 09:43:02 ALPL401 kernel: LustreError: 14654:0:(ldlm_lib.c:946:target_handle_connect()) LFS05-OST0057: denying connection for new client 10.3.8.34@o2ib (7c6c3466-2f3e-cf62-756a-afc717adfb8c): 6 clients in recovery for 559s Mar 26 09:43:02 ALPL401 kernel: LustreError: 14654:0:(ldlm_lib.c:946:target_handle_connect()) Skipped 39 previous similar messages Mar 26 09:45:10 ALPL401 kernel: LustreError: 15670:0:(ldlm_lib.c:946:target_handle_connect()) LFS05-OST0057: denying connection for new client 10.3.6.57@o2ib (3787053a-d829-e08f-ec66-bb1855166ad9): 6 clients in recovery for 431s Mar 26 09:45:10 ALPL401 kernel: LustreError: 15670:0:(ldlm_lib.c:946:target_handle_connect()) Skipped 78 previous similar messages Mar 26 09:49:28 ALPL401 kernel: LustreError: 14690:0:(ldlm_lib.c:946:target_handle_connect()) LFS05-OST0057: denying connection for new client 10.3.8.51@o2ib (003d9c9d-b322-544f-f763-81a1165cba37): 6 clients in recovery for 173s Mar 26 09:49:28 ALPL401 kernel: LustreError: 14690:0:(ldlm_lib.c:946:target_handle_connect()) Skipped 165 previous similar messages Mar 26 09:50:44 ALPL401 kernel: 
Lustre: LFS05-OST0055: Recovery period over after 13:41, of 554 clients 548 recovered and 6 were evicted. Mar 26 09:50:44 ALPL401 kernel: Lustre: LFS05-OST0055: sending delayed replies to recovered clients Mar 26 09:50:44 ALPL401 kernel: LustreError: 27982:0:(filter_log.c:135:filter_cancel_cookies_cb()) error cancelling log cookies: rc = -19 Mar 26 09:50:44 ALPL401 kernel: Lustre: LFS05-OST0055: received MDS connection from 10.3.5.72@o2ib Mar 26 09:50:44 ALPL401 kernel: Lustre: 15739:0:(filter.c:3123:filter_destroy_precreated()) LFS05-OST0055: deleting orphan objects from 401706442 to 401710693, orphan objids won't be reused any more. Mar 26 09:51:33 ALPL401 kernel: Lustre: LFS05-OST0055: slow i_mutex 48s due to heavy IO load Mar 26 09:51:33 ALPL401 kernel: Lustre: Skipped 5 previous similar messages Mar 26 09:51:33 ALPL401 kernel: Lustre: LFS05-OST0055: slow journal start 47s due to heavy IO load Mar 26 09:51:33 ALPL401 kernel: Lustre: Skipped 194 previous similar messages Mar 26 09:51:33 ALPL401 kernel: Lustre: LFS05-OST0055: slow direct_io 47s due to heavy IO load Mar 26 09:51:33 ALPL401 kernel: Lustre: LFS05-OST0055: slow journal start 47s due to heavy IO load Mar 26 09:51:33 ALPL401 kernel: Lustre: LFS05-OST0055: slow commitrw commit 47s due to heavy IO load Mar 26 09:51:36 ALPL401 kernel: Lustre: LFS05-OST0056: Recovery period over after 13:42, of 550 clients 547 recovered and 2 were evicted. Mar 26 09:51:36 ALPL401 kernel: Lustre: LFS05-OST0056: sending delayed replies to recovered clients Mar 26 09:51:36 ALPL401 kernel: Lustre: LFS05-OST0056: received MDS connection from 10.3.5.72@o2ib Mar 26 09:51:36 ALPL401 kernel: Lustre: 15609:0:(filter.c:3123:filter_destroy_precreated()) LFS05-OST0056: deleting orphan objects from 414019838 to 414031004, orphan objids won't be reused any more. Mar 26 09:52:22 ALPL401 kernel: Lustre: LFS05-OST0057: Recovery period over after 13:41, of 541 clients 535 recovered and 6 were evicted. 
Mar 26 09:52:22 ALPL401 kernel: Lustre: LFS05-OST0057: sending delayed replies to recovered clients Mar 26 09:52:22 ALPL401 kernel: Lustre: LFS05-OST0057: received MDS connection from 10.3.5.72@o2ib Mar 26 09:52:22 ALPL401 kernel: Lustre: 14589:0:(filter.c:3123:filter_destroy_precreated()) LFS05-OST0057: deleting orphan objects from 385691556 to 385695368, orphan objids won't be reused any more. Mar 26 09:52:24 ALPL401 kernel: Lustre: LFS05-OST0056: slow i_mutex 48s due to heavy IO load Mar 26 09:52:24 ALPL401 kernel: Lustre: Skipped 12 previous similar messages Mar 26 09:52:24 ALPL401 kernel: Lustre: LFS05-OST0056: slow journal start 47s due to heavy IO load Mar 26 09:52:24 ALPL401 kernel: Lustre: Skipped 36 previous similar messages Mar 26 09:52:24 ALPL401 kernel: Lustre: LFS05-OST0056: slow brw_start 47s due to heavy IO load Mar 26 09:52:24 ALPL401 kernel: Lustre: Skipped 34 previous similar messages Mar 26 09:52:24 ALPL401 kernel: Lustre: LFS05-OST0056: slow journal start 48s due to heavy IO load Mar 26 09:52:24 ALPL401 kernel: Lustre: Skipped 1 previous similar message Mar 26 09:52:24 ALPL401 kernel: Lustre: LFS05-OST0056: slow parent lock 43s due to heavy IO load Mar 26 09:52:24 ALPL401 kernel: Lustre: Skipped 34 previous similar messages Mar 26 09:52:24 ALPL401 kernel: Lustre: LFS05-OST0056: slow direct_io 48s due to heavy IO load Mar 26 09:52:24 ALPL401 kernel: Lustre: Skipped 35 previous similar messages Mar 26 09:52:24 ALPL401 kernel: Lustre: LFS05-OST0056: slow quota init 48s due to heavy IO load Mar 26 09:52:24 ALPL401 kernel: Lustre: Skipped 2 previous similar messages Mar 26 09:52:24 ALPL401 kernel: Lustre: LFS05-OST0056: slow preprw_read setup 47s due to heavy IO load Mar 26 09:52:24 ALPL401 kernel: Lustre: LFS05-OST0056: slow preprw_read setup 47s due to heavy IO load Mar 26 09:52:42 ALPL401 kernel: Lustre: 15597:0:(ldlm_lib.c:805:target_handle_connect()) LFS05-OST0057: exp ffff810567184600 already connecting Mar 26 09:52:42 ALPL401 kernel: 
LustreError: 15597:0:(ldlm_lib.c:1921:target_send_reply_msg()) @@@ processing error (-114) req@ffff810331fcb800 x1496666185207939/t0 o8->@:0/0 lens 368/264 e 0 to 0 dl 1427334912 ref 1 fl Interpret:/0/0 rc -114/0 Mar 26 09:52:42 ALPL401 kernel: LustreError: 15597:0:(ldlm_lib.c:1921:target_send_reply_msg()) Skipped 475 previous similar messages Mar 26 09:52:42 ALPL401 kernel: Lustre: 14640:0:(ldlm_lib.c:805:target_handle_connect()) LFS05-OST0057: exp ffff810433364800 already connecting Mar 26 09:52:45 ALPL401 kernel: Lustre: 14654:0:(ldlm_lib.c:805:target_handle_connect()) LFS05-OST0057: exp ffff81024d0ae000 already connecting Mar 26 09:52:45 ALPL401 kernel: Lustre: 14654:0:(ldlm_lib.c:805:target_handle_connect()) Skipped 1 previous similar message Mar 26 09:52:47 ALPL401 kernel: Lustre: 14588:0:(ldlm_lib.c:805:target_handle_connect()) LFS05-OST0057: exp ffff81048d807800 already connecting Mar 26 09:52:47 ALPL401 kernel: Lustre: 14588:0:(ldlm_lib.c:805:target_handle_connect()) Skipped 1 previous similar message Mar 26 09:53:03 ALPL401 kernel: Lustre: 15742:0:(ldlm_lib.c:805:target_handle_connect()) LFS05-OST0057: exp ffff810567184600 already connecting Mar 26 09:53:03 ALPL401 kernel: Lustre: 15742:0:(ldlm_lib.c:805:target_handle_connect()) Skipped 3 previous similar messages Mar 26 09:53:06 ALPL401 kernel: Lustre: LFS05-OST0058: Recovery period over after 13:41, of 550 clients 548 recovered and 2 were evicted. Mar 26 09:53:06 ALPL401 kernel: Lustre: LFS05-OST0058: sending delayed replies to recovered clients Mar 26 09:53:06 ALPL401 kernel: Lustre: LFS05-OST0058: received MDS connection from 10.3.5.72@o2ib Mar 26 09:53:06 ALPL401 kernel: Lustre: 15396:0:(filter.c:3123:filter_destroy_precreated()) LFS05-OST0058: deleting orphan objects from 404462523 to 404474738, orphan objids won't be reused any more. 
Mar 26 09:53:08 ALPL401 kernel: Lustre: 15444:0:(ldlm_lib.c:805:target_handle_connect()) LFS05-OST0057: exp ffff81048d807800 already connecting Mar 26 09:53:08 ALPL401 kernel: Lustre: 15444:0:(ldlm_lib.c:805:target_handle_connect()) Skipped 4 previous similar messages Mar 26 09:53:14 ALPL401 kernel: Lustre: LFS05-OST0057: slow journal start 43s due to heavy IO load Mar 26 09:53:14 ALPL401 kernel: Lustre: LFS05-OST0057: slow journal start 43s due to heavy IO load Mar 26 09:53:14 ALPL401 kernel: Lustre: LFS05-OST0057: slow brw_start 43s due to heavy IO load Mar 26 09:53:14 ALPL401 kernel: Lustre: Skipped 8 previous similar messages Mar 26 09:53:14 ALPL401 kernel: Lustre: Skipped 8 previous similar messages Mar 26 09:53:14 ALPL401 kernel: Lustre: LFS05-OST0057: slow journal start 43s due to heavy IO load Mar 26 09:53:14 ALPL401 kernel: Lustre: Skipped 25 previous similar messages Mar 26 09:53:14 ALPL401 kernel: Lustre: LFS05-OST0057: slow journal start 52s due to heavy IO load Mar 26 09:53:14 ALPL401 kernel: Lustre: LFS05-OST0057: slow journal start 52s due to heavy IO load Mar 26 09:53:14 ALPL401 kernel: Lustre: LFS05-OST0057: slow brw_start 52s due to heavy IO load Mar 26 09:53:14 ALPL401 kernel: Lustre: Skipped 1 previous similar message Mar 26 09:53:14 ALPL401 kernel: Lustre: LFS05-OST0057: slow parent lock 43s due to heavy IO load Mar 26 09:53:14 ALPL401 kernel: Lustre: Skipped 22 previous similar messages Mar 26 09:53:14 ALPL401 kernel: Lustre: LFS05-OST0057: slow i_mutex 43s due to heavy IO load Mar 26 09:53:14 ALPL401 kernel: Lustre: Skipped 16 previous similar messages Mar 26 09:53:14 ALPL401 kernel: Lustre: LFS05-OST0057: slow direct_io 52s due to heavy IO load Mar 26 09:53:14 ALPL401 kernel: Lustre: LFS05-OST0057: slow direct_io 43s due to heavy IO load Mar 26 09:53:14 ALPL401 kernel: Lustre: Skipped 28 previous similar messages Mar 26 09:53:14 ALPL401 kernel: Lustre: LFS05-OST0057: slow i_mutex 52s due to heavy IO load Mar 26 09:53:14 ALPL401 kernel: 
Lustre: LFS05-OST0057: slow parent lock 52s due to heavy IO load Mar 26 09:53:14 ALPL401 kernel: Lustre: LFS05-OST0057: slow preprw_write setup 50s due to heavy IO load Mar 26 09:53:14 ALPL401 kernel: Lustre: LFS05-OST0057: slow preprw_write setup 43s due to heavy IO load Mar 26 09:53:14 ALPL401 kernel: Lustre: LFS05-OST0057: slow parent lock 52s due to heavy IO load Mar 26 09:53:14 ALPL401 kernel: Lustre: Skipped 4 previous similar messages Mar 26 09:53:14 ALPL401 kernel: Lustre: LFS05-OST0057: slow preprw_write setup 52s due to heavy IO load Mar 26 09:53:14 ALPL401 kernel: Lustre: Skipped 1 previous similar message Mar 26 09:53:14 ALPL401 kernel: Lustre: LFS05-OST0057: slow preprw_write setup 43s due to heavy IO load Mar 26 09:53:14 ALPL401 kernel: Lustre: LFS05-OST0057: slow i_mutex 52s due to heavy IO load Mar 26 09:53:14 ALPL401 kernel: Lustre: LFS05-OST0057: slow quota init 52s due to heavy IO load Mar 26 09:53:14 ALPL401 kernel: Lustre: LFS05-OST0057: slow journal start 50s due to heavy IO load Mar 26 09:53:14 ALPL401 kernel: Lustre: Skipped 23 previous similar messages Mar 26 09:53:14 ALPL401 kernel: Lustre: LFS05-OST0057: slow quota init 52s due to heavy IO load Mar 26 09:53:14 ALPL401 kernel: Lustre: LFS05-OST0057: slow brw_start 50s due to heavy IO load Mar 26 09:53:14 ALPL401 kernel: Lustre: Skipped 24 previous similar messages Mar 26 09:53:14 ALPL401 kernel: Lustre: Skipped 3 previous similar messages Mar 26 09:53:14 ALPL401 kernel: Lustre: LFS05-OST0057: slow direct_io 52s due to heavy IO load Mar 26 09:53:14 ALPL401 kernel: Lustre: Skipped 6 previous similar messages Mar 26 09:53:15 ALPL401 kernel: LustreError: 0:0:(ldlm_lockd.c:315:waiting_locks_callback()) ### lock callback timer expired after 151s: evicting client at 10.3.2.42@o2ib ns: filter-LFS05-OST0055_UUID lock: ffff8108637bbe00/0xa230552c8e2106e8 lrc: 3/0,0 mode: PW/PW res: 343278442/0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x20 remote: 
0xae0e521e6557682b expref: 13 pid: 14619 timeout 5071617466 Mar 26 09:53:26 ALPL401 kernel: Lustre: 15412:0:(ldlm_lib.c:576:target_handle_reconnect()) LFS05-OST0057: e793370c-7b48-fe22-d536-64294908cdaf reconnecting Mar 26 09:53:26 ALPL401 kernel: Lustre: 15412:0:(ldlm_lib.c:576:target_handle_reconnect()) Skipped 334 previous similar messages Mar 26 09:53:56 ALPL401 kernel: Lustre: LFS05-OST0059: Recovery period over after 13:42, of 554 clients 548 recovered and 6 were evicted. Mar 26 09:53:56 ALPL401 kernel: Lustre: LFS05-OST0059: sending delayed replies to recovered clients Mar 26 09:53:56 ALPL401 kernel: Lustre: LFS05-OST0059: received MDS connection from 10.3.5.72@o2ib Mar 26 09:53:56 ALPL401 kernel: Lustre: 15754:0:(filter.c:3123:filter_destroy_precreated()) LFS05-OST0059: deleting orphan objects from 388165587 to 388174634, orphan objids won't be reused any more. Mar 26 09:54:04 ALPL401 kernel: Lustre: LFS05-OST0058: slow i_mutex 49s due to heavy IO load Mar 26 09:54:04 ALPL401 kernel: Lustre: Skipped 6 previous similar messages Mar 26 09:54:04 ALPL401 kernel: Lustre: LFS05-OST0058: slow journal start 40s due to heavy IO load Mar 26 09:54:04 ALPL401 kernel: Lustre: LFS05-OST0058: slow brw_start 40s due to heavy IO load Mar 26 09:54:04 ALPL401 kernel: Lustre: Skipped 25 previous similar messages Mar 26 09:54:04 ALPL401 kernel: Lustre: Skipped 25 previous similar messages Mar 26 09:54:04 ALPL401 kernel: Lustre: LFS05-OST0058: slow journal start 50s due to heavy IO load Mar 26 09:54:04 ALPL401 kernel: Lustre: LFS05-OST0058: slow journal start 50s due to heavy IO load Mar 26 09:54:04 ALPL401 kernel: Lustre: Skipped 1 previous similar message Mar 26 09:54:04 ALPL401 kernel: Lustre: Skipped 59 previous similar messages Mar 26 09:54:04 ALPL401 kernel: Lustre: LFS05-OST0058: slow brw_start 50s due to heavy IO load Mar 26 09:54:04 ALPL401 kernel: Lustre: Skipped 1 previous similar message Mar 26 09:54:04 ALPL401 kernel: Lustre: LFS05-OST0058: slow journal start 40s 
due to heavy IO load Mar 26 09:54:04 ALPL401 kernel: Lustre: Skipped 91 previous similar messages Mar 26 09:54:04 ALPL401 kernel: Lustre: LFS05-OST0058: slow direct_io 50s due to heavy IO load Mar 26 09:54:04 ALPL401 kernel: Lustre: Skipped 31 previous similar messages Mar 26 09:54:04 ALPL401 kernel: Lustre: LFS05-OST0058: slow direct_io 40s due to heavy IO load Mar 26 09:54:04 ALPL401 kernel: Lustre: Skipped 56 previous similar messages Mar 26 09:54:04 ALPL401 kernel: Lustre: LFS05-OST0058: slow parent lock 40s due to heavy IO load Mar 26 09:54:04 ALPL401 kernel: Lustre: Skipped 107 previous similar messages Mar 26 09:54:04 ALPL401 kernel: Lustre: LFS05-OST0058: slow preprw_write setup 40s due to heavy IO load Mar 26 09:54:04 ALPL401 kernel: Lustre: Skipped 34 previous similar messages Mar 26 09:54:04 ALPL401 kernel: Lustre: LFS05-OST0058: slow i_mutex 50s due to heavy IO load Mar 26 09:54:04 ALPL401 kernel: Lustre: Skipped 13 previous similar messages Mar 26 09:54:43 ALPL401 kernel: Lustre: LFS05-OST0059: slow i_mutex 47s due to heavy IO load Mar 26 09:54:43 ALPL401 kernel: Lustre: Skipped 23 previous similar messages Mar 26 09:54:43 ALPL401 kernel: Lustre: LFS05-OST0059: slow journal start 38s due to heavy IO load Mar 26 09:54:43 ALPL401 kernel: Lustre: LFS05-OST0059: slow brw_start 38s due to heavy IO load Mar 26 09:54:43 ALPL401 kernel: Lustre: Skipped 49 previous similar messages Mar 26 09:54:43 ALPL401 kernel: Lustre: Skipped 52 previous similar messages Mar 26 09:54:43 ALPL401 kernel: Lustre: LFS05-OST0059: slow journal start 44s due to heavy IO load Mar 26 09:54:43 ALPL401 kernel: Lustre: Skipped 7 previous similar messages Mar 26 09:54:43 ALPL401 kernel: Lustre: LFS05-OST0059: slow parent lock 45s due to heavy IO load Mar 26 09:54:43 ALPL401 kernel: Lustre: Skipped 2 previous similar messages Mar 26 09:54:43 ALPL401 kernel: Lustre: LFS05-OST0059: slow preprw_write setup 38s due to heavy IO load Mar 26 09:54:43 ALPL401 kernel: Lustre: Skipped 2 previous 
similar messages Mar 26 09:54:43 ALPL401 kernel: Lustre: LFS05-OST0059: slow direct_io 47s due to heavy IO load Mar 26 09:54:43 ALPL401 kernel: Lustre: Skipped 87 previous similar messages Mar 26 09:54:43 ALPL401 kernel: Lustre: LFS05-OST0059: slow preprw_read setup 45s due to heavy IO load Mar 26 09:54:43 ALPL401 kernel: Lustre: Skipped 3 previous similar messages Mar 26 09:55:37 ALPL401 kernel: LustreError: 0:0:(ldlm_lockd.c:315:waiting_locks_callback()) ### lock callback timer expired after 151s: evicting client at 10.3.7.32@o2ib ns: filter-LFS05-OST0058_UUID lock: ffff81094e8db800/0xa230552c8e5f1d10 lrc: 3/0,0 mode: PR/PR res: 402527551/0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x20 remote: 0x850f4863599301e8 expref: 4770 pid: 15611 timeout 5071759114 Mar 26 09:56:26 ALPL401 kernel: LustreError: 0:0:(ldlm_lockd.c:315:waiting_locks_callback()) ### lock callback timer expired after 150s: evicting client at 10.3.3.17@o2ib ns: filter-LFS05-OST0059_UUID lock: ffff8108191c0a00/0xa230552c8e6eb4f3 lrc: 3/0,0 mode: PW/PW res: 329543457/0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x20 remote: 0x42efddc13e7f4d1d expref: 5 pid: 14595 timeout 5071808654 Mar 26 09:57:53 ALPL401 kernel: Lustre: Failing over LFS05-OST0059 Mar 26 09:57:53 ALPL401 kernel: LustreError: 29622:0:(client.c:859:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff8102fb92cc00 x1495855444201145/t0 o401->@NET_0x500000a030548_UUID:17/18 lens 2528/384 e 0 to 1 dl 0 ref 1 fl Rpc:N/0/0 rc 0/0 Mar 26 09:57:53 ALPL401 kernel: LustreError: 137-5: UUID 'LFS05-OST0059_UUID' is not available for connect (stopping) Mar 26 09:57:53 ALPL401 kernel: LustreError: Skipped 17160 previous similar messages Mar 26 09:57:56 ALPL401 kernel: Lustre: LFS05-OST0059: shutting down for failover; client state will be preserved. Mar 26 09:57:57 ALPL401 kernel: Lustre: OST LFS05-OST0059 has stopped. 
Mar 26 09:57:57 ALPL401 multipathd: dm-9: umount map (uevent) Mar 26 09:57:58 ALPL401 kernel: Lustre: server umount LFS05-OST0059 complete Mar 26 09:57:58 ALPL401 kernel: Lustre: Failing over LFS05-OST0058 Mar 26 09:57:58 ALPL401 kernel: LustreError: 29230:0:(client.c:859:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff810a5f73b000 x1495855444201491/t0 o401->@NET_0x500000a030548_UUID:17/18 lens 2528/384 e 0 to 1 dl 0 ref 1 fl Rpc:N/0/0 rc 0/0 Mar 26 09:58:00 ALPL401 kernel: Lustre: LFS05-OST0058: shutting down for failover; client state will be preserved. Mar 26 09:58:01 ALPL401 kernel: Lustre: OST LFS05-OST0058 has stopped. Mar 26 09:58:01 ALPL401 multipathd: dm-8: umount map (uevent) Mar 26 09:58:01 ALPL401 kernel: Lustre: server umount LFS05-OST0058 complete Mar 26 09:58:01 ALPL401 kernel: Lustre: Failing over LFS05-OST0057 Mar 26 09:58:01 ALPL401 kernel: LustreError: 28621:0:(client.c:859:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff8100b0422c00 x1495855444201797/t0 o401->@NET_0x500000a030548_UUID:17/18 lens 1120/384 e 0 to 1 dl 0 ref 1 fl Rpc:N/0/0 rc 0/0 Mar 26 09:58:12 ALPL401 kernel: Lustre: LFS05-OST0057: shutting down for failover; client state will be preserved. Mar 26 09:58:14 ALPL401 kernel: Lustre: OST LFS05-OST0057 has stopped. Mar 26 09:58:16 ALPL401 multipathd: dm-7: umount map (uevent) Mar 26 09:58:17 ALPL401 kernel: Lustre: server umount LFS05-OST0057 complete Mar 26 09:58:17 ALPL401 kernel: Lustre: Failing over LFS05-OST0056 Mar 26 09:58:17 ALPL401 kernel: LustreError: 28331:0:(client.c:859:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff8102d9739000 x1495855444202818/t0 o401->@NET_0x500000a030548_UUID:17/18 lens 2976/384 e 0 to 1 dl 0 ref 1 fl Rpc:N/0/0 rc 0/0 Mar 26 09:58:19 ALPL401 kernel: Lustre: LFS05-OST0056: shutting down for failover; client state will be preserved. Mar 26 09:58:19 ALPL401 kernel: Lustre: OST LFS05-OST0056 has stopped. 
Mar 26 09:58:19 ALPL401 multipathd: dm-6: umount map (uevent) Mar 26 09:58:20 ALPL401 kernel: Lustre: server umount LFS05-OST0056 complete Mar 26 09:58:20 ALPL401 kernel: Lustre: Failing over LFS05-OST0055 Mar 26 09:58:20 ALPL401 kernel: LustreError: 28008:0:(client.c:859:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff810857c15000 x1495855444202899/t0 o401->@NET_0x500000a030548_UUID:17/18 lens 416/384 e 0 to 1 dl 0 ref 1 fl Rpc:N/0/0 rc 0/0 Mar 26 09:58:22 ALPL401 kernel: Lustre: LFS05-OST0055: shutting down for failover; client state will be preserved. Mar 26 09:58:23 ALPL401 kernel: Lustre: OST LFS05-OST0055 has stopped. Mar 26 09:58:23 ALPL401 multipathd: dm-5: umount map (uevent) Mar 26 09:58:23 ALPL401 kernel: Lustre: server umount LFS05-OST0055 complete Mar 26 09:59:08 ALPL401 kernel: LustreError: 137-5: UUID 'LFS05-OST0056_UUID' is not available for connect (no target) Mar 26 09:59:08 ALPL401 kernel: LustreError: Skipped 7006 previous similar messages Mar 26 09:59:40 ALPL401 kernel: LustreError: 0:0:(ldlm_lockd.c:315:waiting_locks_callback()) ### lock callback timer expired after 101s: evicting client at 10.3.7.32@o2ib ns: filter-LFS05-OST0054_UUID lock: ffff8105aa859a00/0xa230552c8ead5833 lrc: 3/0,0 mode: PW/PW res: 414376023/0 rrc: 9 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x20 remote: 0x850f486359e99e14 expref: 4805 pid: 15548 timeout 5072002266 Mar 26 09:59:40 ALPL401 kernel: LustreError: 0:0:(ldlm_lockd.c:315:waiting_locks_callback()) Skipped 1 previous similar message Mar 26 09:59:47 ALPL401 kernel: LustreError: 0:0:(ldlm_lockd.c:315:waiting_locks_callback()) ### lock callback timer expired after 100s: evicting client at 10.3.7.40@o2ib ns: filter-LFS05-OST0051_UUID lock: ffff810432abe200/0xa230552c8ead7bcd lrc: 3/0,0 mode: PW/PW res: 362399914/0 rrc: 10 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x20 remote: 0xde5dd27abecf5402 expref: 4804 pid: 15679 timeout 5072009842 Mar 26 10:01:21 
ALPL401 kernel: LustreError: 0:0:(ldlm_lockd.c:315:waiting_locks_callback()) ### lock callback timer expired after 202s: evicting client at 10.3.7.64@o2ib ns: filter-LFS05-OST0054_UUID lock: ffff810b6319ce00/0xa230552c8ead5dab lrc: 3/0,0 mode: PW/PW res: 414376023/0 rrc: 8 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x20 remote: 0x7e14c4d1d667773c expref: 4801 pid: 14691 timeout 5072103020 Mar 26 10:01:33 ALPL401 kernel: LustreError: 0:0:(ldlm_lockd.c:315:waiting_locks_callback()) ### lock callback timer expired after 101s: evicting client at 10.3.7.39@o2ib ns: filter-LFS05-OST0054_UUID lock: ffff8101b6007200/0xa230552c8eadc6a4 lrc: 3/0,0 mode: PW/PW res: 413658149/0 rrc: 4 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x20 remote: 0x4d832bc0861e12a8 expref: 4804 pid: 15739 timeout 5072115235 Mar 26 10:01:33 ALPL401 kernel: LustreError: 0:0:(ldlm_lockd.c:315:waiting_locks_callback()) Skipped 2 previous similar messages Mar 26 10:03:02 ALPL401 kernel: LustreError: 0:0:(ldlm_lockd.c:315:waiting_locks_callback()) ### lock callback timer expired after 296s: evicting client at 10.3.7.43@o2ib ns: filter-LFS05-OST0054_UUID lock: ffff810594a79000/0xa230552c8ead9cce lrc: 3/0,0 mode: PW/PW res: 414376023/0 rrc: 6 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x20 remote: 0xc0610c7798a708ce expref: 4804 pid: 15755 timeout 5072204010 Mar 26 10:03:02 ALPL401 kernel: LustreError: 0:0:(ldlm_lockd.c:315:waiting_locks_callback()) Skipped 1 previous similar message Mar 26 10:03:04 ALPL401 kernel: LustreError: 14661:0:(ldlm_lib.c:1921:target_send_reply_msg()) @@@ processing error (-107) req@ffff8100540eb800 x1495124780240041/t0 o101->@:0/0 lens 296/0 e 0 to 0 dl 1427335480 ref 1 fl Interpret:/0/0 rc -107/0 Mar 26 10:03:04 ALPL401 kernel: LustreError: 14661:0:(ldlm_lib.c:1921:target_send_reply_msg()) Skipped 16353 previous similar messages Mar 26 10:03:35 ALPL401 kernel: LustreError: 
0:0:(ldlm_lockd.c:315:waiting_locks_callback()) ### lock callback timer expired after 101s: evicting client at 10.3.7.43@o2ib ns: filter-LFS05-OST0052_UUID lock: ffff81095615be00/0xa230552c8eade1fc lrc: 3/0,0 mode: PW/PW res: 422753209/0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x20 remote: 0xc0610c7798a85f9d expref: 4803 pid: 14615 timeout 5072237030 Mar 26 10:03:35 ALPL401 kernel: LustreError: 0:0:(ldlm_lockd.c:315:waiting_locks_callback()) Skipped 7 previous similar messages Mar 26 10:04:43 ALPL401 kernel: LustreError: 0:0:(ldlm_lockd.c:315:waiting_locks_callback()) ### lock callback timer expired after 391s: evicting client at 10.3.7.33@o2ib ns: filter-LFS05-OST0054_UUID lock: ffff810b16e25200/0xa230552c8eadc776 lrc: 3/0,0 mode: PW/PW res: 414376023/0 rrc: 4 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x20 remote: 0xd18a9242e2de58cf expref: 4801 pid: 15639 timeout 5072305012 Mar 26 10:05:12 ALPL401 kernel: Lustre: LFS05-OST0052: haven't heard from client 1411a38a-890a-52f5-be5e-7ac1aa9573d2 (at 10.3.3.54@o2ib) in 335 seconds. I think it's dead, and I am evicting it. 
Mar 26 10:05:12 ALPL401 kernel: Lustre: Skipped 9 previous similar messages Mar 26 10:05:40 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 10:06:04 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 10:07:15 ALPL401 kernel: LustreError: 0:0:(ldlm_lockd.c:315:waiting_locks_callback()) ### lock callback timer expired after 539s: evicting client at 10.3.7.58@o2ib ns: filter-LFS05-OST0054_UUID lock: ffff810b4e629200/0xa230552c8eadf749 lrc: 3/0,0 mode: PW/PW res: 414376023/0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x20 remote: 0xb19e507faa253c07 expref: 4800 pid: 15702 timeout 5072457013 Mar 26 10:07:15 ALPL401 kernel: LustreError: 0:0:(ldlm_lockd.c:315:waiting_locks_callback()) Skipped 8 previous similar messages Mar 26 10:10:50 ALPL401 : error getting update info: Cannot find a valid baseurl for repo: base Mar 26 10:11:33 ALPL401 kernel: LustreError: 0:0:(ldlm_lockd.c:315:waiting_locks_callback()) ### lock callback timer expired after 102s: evicting client at 10.3.7.44@o2ib ns: filter-LFS05-OST0052_UUID lock: ffff8105582a0400/0xa230552c8eb18ba1 lrc: 3/0,0 mode: PW/PW res: 429245526/0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x20 remote: 0x3ee0a469612e678 expref: 4800 pid: 15609 timeout 5072715604 Mar 26 10:11:33 ALPL401 kernel: LustreError: 0:0:(ldlm_lockd.c:315:waiting_locks_callback()) Skipped 19 previous similar messages Mar 26 10:14:10 ALPL401 kernel: LustreError: 137-5: UUID 'LFS05-OST0056_UUID' is not available for connect (no target) Mar 26 10:14:10 ALPL401 kernel: LustreError: Skipped 6499 previous similar messages Mar 26 10:14:10 ALPL401 kernel: LustreError: 15630:0:(ldlm_lib.c:1921:target_send_reply_msg()) @@@ processing error (-19) req@ffff81048b2fd800 x1496669789114148/t0 o8->@:0/0 lens 368/0 e 0 to 0 dl 1427336200 ref 1 fl Interpret:/0/0 rc -19/0 Mar 26 10:14:10 ALPL401 kernel: LustreError: 15630:0:(ldlm_lib.c:1921:target_send_reply_msg()) 
Skipped 40 previous similar messages Mar 26 10:14:31 ALPL401 kernel: LustreError: 137-5: UUID 'LFS05-OST0056_UUID' is not available for connect (no target) Mar 26 10:14:31 ALPL401 kernel: LustreError: Skipped 5 previous similar messages Mar 26 10:45:10 ALPL401 kernel: Lustre: LFS05-OST0050: haven't heard from client 460108d8-24d5-5d81-4cea-dba4a6f4c3b6 (at 10.3.2.16@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 10:45:10 ALPL401 kernel: Lustre: Skipped 9 previous similar messages Mar 26 10:45:46 ALPL401 kernel: ib_send_cm_mra: cm_id_priv->id.state: 0x6 Mar 26 10:46:49 ALPL401 last message repeated 2 times Mar 26 10:54:33 ALPL401 kernel: Lustre: LFS05-OST0054: haven't heard from client aa3dd8b7-33bc-86b2-094a-0b54208f7152 (at 10.3.2.18@o2ib) in 335 seconds. I think it's dead, and I am evicting it. Mar 26 10:54:33 ALPL401 kernel: Lustre: Skipped 14 previous similar messages