[root@localhost ~]# tail -f /var/log/messages
May 25 19:30:41 localhost kernel: LustreError: Skipped 38 previous similar messages
May 25 19:32:51 localhost kernel: LustreError: 137-5: lustre-OST0003_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
May 25 19:32:51 localhost kernel: LustreError: Skipped 77 previous similar messages
May 25 19:37:12 localhost kernel: LustreError: 137-5: lustre-OST0003_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
May 25 19:37:12 localhost kernel: LustreError: Skipped 155 previous similar messages
May 25 19:40:11 localhost kernel: LDISKFS-fs (sdd): mounted filesystem with ordered data mode. quota=on. Opts:
May 25 19:40:11 localhost kernel: Lustre: lustre-OST0003: Imperative Recovery enabled, recovery window shrunk from 60-180 down to 60-90
May 25 19:40:16 localhost kernel: Lustre: lustre-OST0003: Will be in recovery for at least 1:00, or until 3 clients reconnect
May 25 19:40:16 localhost kernel: Lustre: lustre-OST0003: Recovery over after 0:01, of 3 clients 3 recovered and 0 were evicted.
May 25 19:40:16 localhost kernel: Lustre: lustre-OST0003-osc-MDT0001: Connection restored to lustre-OST0003 (at 0@lo)
May 25 21:58:40 localhost kernel: Lustre: setting import lustre-MDT0000_UUID INACTIVE by administrator request
May 25 21:58:40 localhost kernel: Lustre: Fid capabilities renewed: 0
May 25 21:58:40 localhost kernel: Fid capabilities renewal ENOENT: 0
May 25 21:58:40 localhost kernel: Fid capabilities failed to renew: 0
May 25 21:58:40 localhost kernel: Fid capabilities renewal retries: 0
May 25 21:58:40 localhost kernel: Lustre: Unmounted lustre-client
May 25 21:58:44 localhost kernel: LustreError: 11-0: lustre-MDT0000-lwp-OST0000: operation obd_ping to node 0@lo failed: rc = -107
May 25 21:58:44 localhost kernel: LustreError: Skipped 2 previous similar messages
May 25 21:58:44 localhost kernel: Lustre: lustre-MDT0000-lwp-OST0000: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
May 25 21:58:44 localhost kernel: Lustre: Skipped 2 previous similar messages
May 25 21:58:44 localhost kernel: Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping)
May 25 21:58:44 localhost kernel: LustreError: 3600:0:(client.c:1074:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ffff8800425453c0 x1502149894916288/t0(0) o13->lustre-OST0003-osc-MDT0000@0@lo:7/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1
May 25 21:58:46 localhost kernel: Lustre: server umount lustre-MDT0000 complete
May 25 21:58:49 localhost kernel: LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
May 25 21:58:49 localhost kernel: LustreError: Skipped 107 previous similar messages
May 25 21:58:51 localhost kernel: LustreError: 11-0: lustre-OST0000-osc-MDT0001: operation ost_statfs to node 0@lo failed: rc = -107
May 25 21:58:51 localhost kernel: LustreError: Skipped 4 previous similar messages
May 25 21:58:51 localhost kernel: Lustre: lustre-OST0000-osc-MDT0001: Connection to lustre-OST0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
May 25 21:58:51 localhost kernel: Lustre: Skipped 4 previous similar messages
May 25 21:58:51 localhost kernel: Lustre: lustre-OST0000: Not available for connect from 0@lo (stopping)
May 25 21:58:51 localhost kernel: Lustre: Skipped 4 previous similar messages
May 25 21:58:53 localhost kernel: Lustre: server umount lustre-OST0000 complete
May 25 21:58:56 localhost kernel: LustreError: 11-0: lustre-OST0001-osc-MDT0001: operation ost_statfs to node 0@lo failed: rc = -107
May 25 21:58:56 localhost kernel: Lustre: lustre-OST0001-osc-MDT0001: Connection to lustre-OST0001 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
May 25 21:58:56 localhost kernel: Lustre: lustre-OST0001: Not available for connect from 0@lo (stopping)
May 25 21:58:56 localhost kernel: Lustre: 3600:0:(client.c:1939:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1432571329/real 1432571329] req@ffff880042568c80 x1502149894916316/t0(0) o400->MGC192.168.102.13@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1432571336 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
May 25 21:58:56 localhost kernel: LustreError: 166-1: MGC192.168.102.13@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail
May 25 21:58:59 localhost kernel: Lustre: server umount lustre-OST0001 complete
May 25 21:59:02 localhost kernel: Lustre: 3598:0:(client.c:1939:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1432571336/real 1432571336] req@ffff880042568c80 x1502149894916396/t0(0) o250->MGC192.168.102.13@tcp@0@lo:26/25 lens 520/544 e 0 to 1 dl 1432571342 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
May 25 21:59:17 localhost kernel: Lustre: 3598:0:(client.c:1939:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1432571346/real 1432571346] req@ffff880042568c80 x1502149894916436/t0(0) o250->MGC192.168.102.13@tcp@0@lo:26/25 lens 520/544 e 0 to 1 dl 1432571357 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
May 25 21:59:31 localhost kernel: Lustre: lustre-MDT0001: haven't heard from client 0c82d43c-cf5f-b882-2c36-9dbf732e43c7 (at 0@lo) in 52 seconds. I think it's dead, and I am evicting it. exp ffff88003dcda400, cur 1432571371 expire 1432571341 last 1432571319
May 25 21:59:34 localhost kernel: Lustre: lustre-OST0003: haven't heard from client 0c82d43c-cf5f-b882-2c36-9dbf732e43c7 (at 0@lo) in 55 seconds. I think it's dead, and I am evicting it. exp ffff88003dae3c00, cur 1432571374 expire 1432571344 last 1432571319
May 25 21:59:37 localhost kernel: Lustre: 3598:0:(client.c:1939:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1432571361/real 1432571361] req@ffff880042568c80 x1502149894916524/t0(0) o250->MGC192.168.102.13@tcp@0@lo:26/25 lens 520/544 e 0 to 1 dl 1432571377 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
May 25 21:59:56 localhost kernel: LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
May 25 21:59:56 localhost kernel: LustreError: Skipped 62 previous similar messages
May 25 22:00:02 localhost kernel: Lustre: 3598:0:(client.c:1939:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1432571381/real 1432571381] req@ffff880042568c80 x1502149894916640/t0(0) o250->MGC192.168.102.13@tcp@0@lo:26/25 lens 520/544 e 0 to 1 dl 1432571402 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
May 25 22:00:09 localhost kernel: Lustre: Failing over lustre-MDT0001
May 25 22:00:09 localhost kernel: Lustre: server umount lustre-MDT0001 complete
May 25 22:00:18 localhost kernel: Lustre: 3600:0:(client.c:1939:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1432571411/real 1432571411] req@ffff880048487080 x1502149894916820/t0(0) o400->lustre-MDT0001-lwp-OST0003@0@lo:12/10 lens 224/224 e 0 to 1 dl 1432571418 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
May 25 22:00:18 localhost kernel: Lustre: lustre-MDT0001-lwp-OST0003: Connection to lustre-MDT0001 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
May 25 22:00:36 localhost kernel: Lustre: 3598:0:(client.c:1939:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1432571411/real 1432571411] req@ffff8800456769c0 x1502149894916816/t0(0) o38->lustre-MDT0000-lwp-OST0003@0@lo:12/10 lens 520/544 e 0 to 1 dl 1432571436 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
May 25 22:00:36 localhost kernel: Lustre: 3598:0:(client.c:1939:ptlrpc_expire_one_request()) Skipped 3 previous similar messages
May 25 22:01:08 localhost kernel: ------------[ cut here ]------------
May 25 22:01:08 localhost kernel: WARNING: at fs/proc/generic.c:591 proc_register+0xb9/0x170() (Not tainted)
May 25 22:01:08 localhost kernel: Hardware name: VMware Virtual Platform
May 25 22:01:08 localhost kernel: proc_dir_entry 'lustre/osc' already registered
May 25 22:01:08 localhost kernel: Modules linked in: osc(+)(U) mdc(U) lmv(U) ofd(U) osp(U) lod(U) ost(U) mdt(U) mdd(U) osd_ldiskfs(U) ldiskfs(U) exportfs lquota(U) lfsck(U) jbd mgc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) ksocklnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables bridge stp llc autofs4 sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 vhost_net macvtap macvlan tun uinput microcode ppdev vmware_balloon parport_pc parport e1000 sg i2c_piix4 i2c_core shpchp ext4 jbd2 mbcache sd_mod crc_t10dif sr_mod cdrom mptspi mptscsih mptbase scsi_transport_spi pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: lmv]
May 25 22:01:08 localhost kernel: Pid: 9503, comm: modprobe Not tainted 2.6.32.431.5.1.el6_lustre #2
May 25 22:01:08 localhost kernel: Call Trace:
May 25 22:01:08 localhost kernel: [] ? warn_slowpath_common+0x87/0xc0
May 25 22:01:08 localhost kernel: [] ? warn_slowpath_fmt+0x46/0x50
May 25 22:01:08 localhost kernel: [] ? proc_register+0xb9/0x170
May 25 22:01:08 localhost kernel: [] ? proc_mkdir_mode+0x42/0x60
May 25 22:01:08 localhost kernel: [] ? proc_mkdir+0x16/0x20
May 25 22:01:08 localhost kernel: [] ? lprocfs_register+0x1b/0x80 [obdclass]
May 25 22:01:08 localhost kernel: [] ? class_register_type+0xb87/0xe50 [obdclass]
May 25 22:01:08 localhost kernel: [] ? osc_init+0x0/0x1f7 [osc]
May 25 22:01:08 localhost kernel: [] ? osc_init+0x9f/0x1f7 [osc]
May 25 22:01:08 localhost kernel: [] ? do_one_initcall+0x3c/0x1d0
May 25 22:01:08 localhost kernel: [] ? sys_init_module+0xdf/0x250
May 25 22:01:08 localhost kernel: [] ? system_call_fastpath+0x16/0x1b
May 25 22:01:08 localhost kernel: ---[ end trace decd986c598c1921 ]---
May 25 22:01:08 localhost kernel: ------------[ cut here ]------------
May 25 22:01:08 localhost kernel: WARNING: at fs/proc/generic.c:591 proc_register+0xb9/0x170() (Tainted: G W --------------- )
May 25 22:01:08 localhost kernel: Hardware name: VMware Virtual Platform
May 25 22:01:08 localhost kernel: proc_dir_entry 'lustre/lov' already registered
May 25 22:01:08 localhost kernel: Modules linked in: lov(+)(U) osc(U) mdc(U) lmv(U) ofd(U) osp(U) lod(U) ost(U) mdt(U) mdd(U) osd_ldiskfs(U) ldiskfs(U) exportfs lquota(U) lfsck(U) jbd mgc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) ksocklnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables bridge stp llc autofs4 sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 vhost_net macvtap macvlan tun uinput microcode ppdev vmware_balloon parport_pc parport e1000 sg i2c_piix4 i2c_core shpchp ext4 jbd2 mbcache sd_mod crc_t10dif sr_mod cdrom mptspi mptscsih mptbase scsi_transport_spi pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: lmv]
May 25 22:01:08 localhost kernel: Pid: 9513, comm: modprobe Tainted: G W --------------- 2.6.32.431.5.1.el6_lustre #2
May 25 22:01:08 localhost kernel: Call Trace:
May 25 22:01:08 localhost kernel: [] ? warn_slowpath_common+0x87/0xc0
May 25 22:01:08 localhost kernel: [] ? warn_slowpath_fmt+0x46/0x50
May 25 22:01:08 localhost kernel: [] ? proc_register+0xb9/0x170
May 25 22:01:08 localhost kernel: [] ? proc_mkdir_mode+0x42/0x60
May 25 22:01:08 localhost kernel: [] ? proc_mkdir+0x16/0x20
May 25 22:01:08 localhost kernel: [] ? lprocfs_register+0x1b/0x80 [obdclass]
May 25 22:01:08 localhost kernel: [] ? class_register_type+0xb87/0xe50 [obdclass]
May 25 22:01:08 localhost kernel: [] ? lov_init+0x0/0x1cf [lov]
May 25 22:01:08 localhost kernel: [] ? lov_init+0xbe/0x1cf [lov]
May 25 22:01:08 localhost kernel: [] ? do_one_initcall+0x3c/0x1d0
May 25 22:01:08 localhost kernel: [] ? sys_init_module+0xdf/0x250
May 25 22:01:08 localhost kernel: [] ? system_call_fastpath+0x16/0x1b
May 25 22:01:08 localhost kernel: ---[ end trace decd986c598c1922 ]---
May 25 22:01:08 localhost kernel: Lustre: Echo OBD driver; http://www.lustre.org/
May 25 22:01:10 localhost kernel: LDISKFS-fs (loop0): mounted filesystem with ordered data mode. quota=on. Opts:
May 25 22:01:10 localhost kernel: LDISKFS-fs (loop0): mounted filesystem with ordered data mode. quota=on. Opts:
May 25 22:01:11 localhost kernel: LDISKFS-fs (loop0): mounted filesystem with ordered data mode. quota=on. Opts:
May 25 22:01:12 localhost kernel: LDISKFS-fs (loop0): mounted filesystem with ordered data mode. quota=on. Opts:
May 25 22:01:12 localhost kernel: LDISKFS-fs (loop0): mounted filesystem with ordered data mode. quota=on. Opts:
May 25 22:01:24 localhost kernel: Lustre: 3598:0:(client.c:1939:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1432571463/real 1432571463] req@ffff88007b76f3c0 x1502149894916852/t0(0) o38->lustre-MDT0001-lwp-OST0003@0@lo:12/10 lens 520/544 e 0 to 1 dl 1432571484 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
May 25 22:01:24 localhost kernel: Lustre: 3598:0:(client.c:1939:ptlrpc_expire_one_request()) Skipped 4 previous similar messages
May 25 22:01:29 localhost kernel: Lustre: Evicted from MGS (at MGC192.168.102.13@tcp_0) after server handle changed from 0xfb9815dac6f74dee to 0xfb9815dac6f75702
May 25 22:01:29 localhost kernel: Lustre: MGC192.168.102.13@tcp: Connection restored to MGS (at 0@lo)
May 25 22:01:29 localhost kernel: Lustre: Skipped 2 previous similar messages
May 25 22:01:29 localhost kernel: Lustre: Setting parameter lustre-MDT0000-mdtlov.lov.stripesize in log lustre-MDT0000
May 25 22:01:29 localhost kernel: Lustre: ctl-lustre-MDT0000: No data found on store. Initialize space
May 25 22:01:30 localhost kernel: Lustre: lustre-MDT0000: new disk, initializing
May 25 22:01:30 localhost kernel: LDISKFS-fs (loop1): mounted filesystem with ordered data mode. quota=on. Opts:
May 25 22:01:30 localhost kernel: LDISKFS-fs (loop1): mounted filesystem with ordered data mode. quota=on. Opts:
May 25 22:01:30 localhost kernel: Lustre: srv-lustre-OST0000: No data found on store. Initialize space
May 25 22:01:30 localhost kernel: Lustre: Skipped 1 previous similar message
May 25 22:01:30 localhost kernel: LDISKFS-fs (loop2): mounted filesystem with ordered data mode. quota=on. Opts:
May 25 22:01:30 localhost kernel: LDISKFS-fs (loop2): mounted filesystem with ordered data mode. quota=on. Opts:
May 25 22:01:31 localhost kernel: Lustre: lustre-OST0001: new disk, initializing
May 25 22:01:31 localhost kernel: Lustre: Skipped 1 previous similar message
May 25 22:01:36 localhost kernel: LustreError: 167-0: lustre-MDT0000-lwp-OST0003: This client was evicted by lustre-MDT0000; in progress operations using this service will fail.
May 25 22:01:36 localhost kernel: Lustre: lustre-MDT0000-lwp-OST0003: Connection restored to lustre-MDT0000 (at 0@lo)
May 25 22:01:43 localhost kernel: Lustre: client enabled MDS capability!
May 25 22:01:43 localhost kernel: Lustre: client enabled OSS capability!
May 25 22:01:43 localhost kernel: Lustre: Mounted lustre-client
May 25 22:01:44 localhost kernel: Lustre: DEBUG MARKER: Using TIMEOUT=20
May 25 22:02:08 localhost kernel: LustreError: 137-5: lustre-MDT0001_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
May 25 22:02:08 localhost kernel: LustreError: Skipped 16 previous similar messages
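A transcript like this buries the interesting numbered LustreError codes (137-5, 11-0, 166-1, 167-0) under rate-limit noise. A minimal shell sketch for tallying which codes appear most often in such a log; the helper name and sample path are illustrative assumptions, and in practice you would point it at /var/log/messages:

```shell
#!/bin/sh
# Count numbered LustreError codes (e.g. "137-5") in a syslog-style file.
# "count_lustre_errors" is a hypothetical helper name, not a Lustre tool.
count_lustre_errors() {
    grep -o 'LustreError: [0-9][0-9]*-[0-9][0-9]*' "$1" | sort | uniq -c | sort -rn
}

# Demo on an embedded sample so the sketch is self-contained:
cat > /tmp/lustre_sample.log <<'EOF'
May 25 19:32:51 localhost kernel: LustreError: 137-5: lustre-OST0003_UUID: not available for connect
May 25 21:58:44 localhost kernel: LustreError: 11-0: lustre-MDT0000-lwp-OST0000: operation obd_ping failed
May 25 21:59:56 localhost kernel: LustreError: 137-5: lustre-MDT0000_UUID: not available for connect
EOF
count_lustre_errors /tmp/lustre_sample.log   # most frequent code listed first
```

Lines such as "LustreError: Skipped N previous similar messages" carry no code and are ignored by the pattern, so the tally reflects distinct failure types rather than log volume.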