Apr 18 11:35:33 com-0443 init: Id "S1" respawning too fast: disabled for 5 minutes
Apr 18 11:42:15 com-0443 init: Id "S1" respawning too fast: disabled for 5 minutes
Apr 18 11:48:57 com-0443 init: Id "S1" respawning too fast: disabled for 5 minutes
Apr 18 11:55:38 com-0443 init: Id "S1" respawning too fast: disabled for 5 minutes
Apr 18 12:02:19 com-0443 init: Id "S1" respawning too fast: disabled for 5 minutes
Apr 18 12:09:00 com-0443 init: Id "S1" respawning too fast: disabled for 5 minutes
Apr 18 12:15:41 com-0443 init: Id "S1" respawning too fast: disabled for 5 minutes
Apr 18 12:22:22 com-0443 init: Id "S1" respawning too fast: disabled for 5 minutes
Apr 18 12:29:03 com-0443 init: Id "S1" respawning too fast: disabled for 5 minutes
Apr 18 12:35:44 com-0443 init: Id "S1" respawning too fast: disabled for 5 minutes
Apr 18 12:36:59 com-0443 kernel: collect_groups_[7164]: segfault at 0000000000000010 rip 0000000000410924 rsp 00007fff41f9a270 error 6
Apr 18 12:37:11 com-0443 kernel: BUG: soft lockup - CPU#7 stuck for 10s! [sh:32397]
Apr 18 12:37:11 com-0443 kernel: CPU 7:
Apr 18 12:37:11 com-0443 kernel: Modules linked in: cpufreq_ondemand(U) mgc(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) raid0(U) xfs(U) uhci_hcd(U) rdma_ucm(U) qlgc_vnic(U) ib_sdp(U) rdma_cm(U) iw_cm(U) ib_addr(U) ib_uverbs(U) iw_cxgb3(U) cxgb3(U) dm_mirror(U) dm_log(U) dm_multipath(U) scsi_dh(U) dm_mod(U) video(U) hwmon(U) sbs(U) backlight(U) i2c_ec(U) button(U) battery(U) asus_acpi(U) acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) sr_mod(U) cdrom(U) sd_mod(U) sg(U) joydev(U) usb_storage(U) sata_nv(U) libata(U) i2c_nforce2(U) i2c_core(U) ohci_hcd(U) ehci_hcd(U) shpchp(U) pcspkr(U) serio_raw(U) scsi_mod(U) ipmi_devintf(U) ipmi_si(U) ipmi_msghandler(U) perfctr(U) nfs(U) lockd(U) fscache(U) nfs_acl(U) auth_rpcgss(U) sunrpc(U) ext3(U) jbd(U) fuse(U) gnbd(U) ib_ipoib(U) ib_cm(U) ib_sa(U) ipoib_helper(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) mlx4_ib(U) ib_umad(U) ib_mthca(U) ib_mad(U) ib_core(U) loop(U) mlx4_en(U) mlx4_core(U) tg3(U) libphy(U) forcedet
Apr 18 12:37:11 com-0443 kernel: (U) e1000(U) igb(U) e1000e(U) r8168(U)
Apr 18 12:37:11 com-0443 kernel: Pid: 32397, comm: sh Tainted: G 2.6.18-128.7.1.el5-pctr40-PAPI #5
Apr 18 12:37:11 com-0443 kernel: RIP: 0010:[] [] :lustre:llap_cast_private+0x13/0x90
Apr 18 12:37:11 com-0443 kernel: RSP: 0018:ffff810293c91cc8 EFLAGS: 00000246
Apr 18 12:37:11 com-0443 kernel: RAX: 0000000000000821 RBX: 0000000000000f0f RCX: 0000000000000034
Apr 18 12:37:11 com-0443 kernel: RDX: ffff81024db6d990 RSI: 00000000000000d0 RDI: ffff81023517cd90
Apr 18 12:37:11 com-0443 kernel: RBP: ffff81037ff03060 R08: ffff810230001600 R09: ffff810230001600
Apr 18 12:37:11 com-0443 kernel: R10: ffff81024db6dae0 R11: ffffffff88b6d510 R12: ffff8101eaa31400
Apr 18 12:37:11 com-0443 kernel: R13: ffff810429d942c0 R14: ffff81037ff03000 R15: ffff81023724e800
Apr 18 12:37:11 com-0443 kernel: FS: 00002b015ec7adb0(0000) GS:ffff8102372a00c0(0000) knlGS:0000000000000000
Apr 18 12:37:11 com-0443 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Apr 18 12:37:11 com-0443 kernel: CR2: 00002ad8d340b000 CR3: 0000000385731000 CR4: 00000000000006e0
Apr 18 12:37:11 com-0443 kernel:
Apr 18 12:37:11 com-0443 kernel: Call Trace:
Apr 18 12:37:11 com-0443 kernel: [] :lustre:ll_removepage+0x30/0x860
Apr 18 12:37:11 com-0443 kernel: [] :lustre:ll_releasepage+0x10/0x20
Apr 18 12:37:11 com-0443 kernel: [] __invalidate_mapping_pages+0x99/0x183
Apr 18 12:37:11 com-0443 kernel: [] drop_pagecache+0x97/0x12d
Apr 18 12:37:11 com-0443 kernel: [] do_proc_dointvec_minmax_conv+0x0/0x56
Apr 18 12:37:11 com-0443 kernel: [] drop_caches_sysctl_handler+0x1a/0x2c
Apr 18 12:37:11 com-0443 kernel: [] do_rw_proc+0xcb/0x126
Apr 18 12:37:11 com-0443 kernel: [] vfs_write+0xce/0x174
Apr 18 12:37:11 com-0443 kernel: [] sys_write+0x45/0x6e
Apr 18 12:37:11 com-0443 kernel: [] system_call+0x7e/0x83
Apr 18 12:37:12 com-0443 kernel:
Apr 18 12:42:25 com-0443 init: Id "S1" respawning too fast: disabled for 5 minutes
Apr 18 12:49:06 com-0443 init: Id "S1" respawning too fast: disabled for 5 minutes
Apr 18 12:55:47 com-0443 init: Id "S1" respawning too fast: disabled for 5 minutes
Apr 18 16:40:30 com-0443 syslogd 1.4.1: restart.
Apr 18 16:42:00 com-0443 syslogd: /dev/console: Interrupted system call
Apr 18 16:40:30 com-0443 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Apr 18 16:40:36 com-0443 kernel: md: md0 stopped.
Apr 18 16:40:36 com-0443 kernel: md: bind
Apr 18 16:40:36 com-0443 kernel: md: bind
Apr 18 16:40:36 com-0443 kernel: md: bind
Apr 18 16:40:36 com-0443 kernel: md: raid0 personality registered for level 0
Apr 18 16:40:36 com-0443 kernel: md0: setting max_sectors to 128, segment boundary to 32767
Apr 18 16:40:36 com-0443 kernel: raid0: looking at sdb
Apr 18 16:40:36 com-0443 kernel: raid0: comparing sdb(245117312) with sdb(245117312)
Apr 18 16:40:36 com-0443 kernel: raid0: END
Apr 18 16:40:36 com-0443 kernel: raid0: ==> UNIQUE
Apr 18 16:40:36 com-0443 kernel: raid0: 1 zones
Apr 18 16:40:36 com-0443 kernel: raid0: looking at sdd
Apr 18 16:42:00 com-0443 kernel: raid0: comparing sdd(245117312) with sdb(245117312)
Apr 18 16:42:01 com-0443 kernel: Kernel logging (proc) stopped.
Apr 18 16:42:01 com-0443 kernel: Kernel log daemon terminating.
Apr 18 16:42:05 com-0443 exiting on signal 15
Apr 18 16:43:35 com-0443 syslogd 1.4.1: restart.
Apr 18 16:43:36 com-0443 syslogd: /dev/console: Interrupted system call
Apr 18 16:43:35 com-0443 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Apr 18 16:43:36 com-0443 exiting on signal 15
Apr 18 16:44:08 com-0443 syslogd 1.4.1: restart.
Apr 18 16:46:08 com-0443 syslogd: /dev/console: Interrupted system call
Apr 18 16:44:08 com-0443 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Apr 18 16:44:08 com-0443 kernel: Lustre: OBD class driver, http://www.lustre.org/
Apr 18 16:44:08 com-0443 kernel: Lustre: Lustre Version: 1.8.4.ddn2.2
Apr 18 16:44:08 com-0443 kernel: Lustre: Build Version: 1.8.4.ddn2.2-20110211142229-PRISTINE-2.6.18-128.7.1.el5-pctr40-PAPI
Apr 18 16:44:08 com-0443 kernel: Lustre: Listener bound to ib0:10.12.20.94:987:mlx4_0
Apr 18 16:44:08 com-0443 kernel: Lustre: Register global MR array, MR size: 0xffffffffffffffff, array size: 1
Apr 18 16:44:08 com-0443 kernel: Lustre: Added LNI 10.12.20.94@o2ib [8/64/0/180]
Apr 18 16:44:09 com-0443 kernel: Lustre: Lustre Client File System; http://www.lustre.org/
Apr 18 16:44:09 com-0443 kernel: Lustre: MGC10.12.200.1@o2ib: Reactivating import
Apr 18 16:44:09 com-0443 kernel: Lustre: lfs-client-ffff81020436b400.llite: set parameter statahead_max=0
Apr 18 16:44:09 com-0443 kernel: LustreError: 11-0: an error occurred while communicating with 10.12.200.3@o2ib. The ost_connect operation failed with -19
Apr 18 16:44:09 com-0443 kernel: LustreError: 11-0: an error occurred while communicating with 10.12.200.3@o2ib. The ost_connect operation failed with -19
Apr 18 16:44:09 com-0443 kernel: Lustre: Client lfs-client has started
Apr 18 16:47:58 com-0443 init: Id "S1" respawning too fast: disabled for 5 minutes
Apr 18 16:48:36 com-0443 ntpd[7672]: synchronized to 10.10.0.3, stratum 4
Apr 18 16:54:39 com-0443 init: Id "S1" respawning too fast: disabled for 5 minutes