Script started on Wed Mar 27 11:29:44 2013
[root@wk1 DDN_20130326]#
[root@wk1 DDN_20130326]# crash ../vmlinux_2.6.18-308.11.1.el5 /share3/adm/dump/c083.vmcore.20130326

crash 4.1.2-4.el5.centos
Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.

GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

please wait... (gathering kmem slab cache data)
please wait... (gathering module symbol data)
please wait... (gathering task table data)
please wait...
(determining panic task)

      KERNEL: ../vmlinux_2.6.18-308.11.1.el5
    DUMPFILE: /share3/adm/dump/c083.vmcore.20130326
        CPUS: 8
        DATE: Tue Mar 26 11:47:31 2013
      UPTIME: 34 days, 20:57:37
LOAD AVERAGE: 1.02, 1.02, 1.00
       TASKS: 253
    NODENAME: c083
     RELEASE: 2.6.18-308.11.1.el5
     VERSION: #1 SMP Tue Jul 10 08:48:43 EDT 2012
     MACHINE: x86_64  (2992 Mhz)
      MEMORY: 31.5 GB
       PANIC: "SysRq : Trigger a crashdump"
         PID: 0
     COMMAND: "swapper"
        TASK: ffff81011dd7a080  (1 of 8)  [THREAD_INFO: ffff81011ddec000]
         CPU: 4
       STATE: TASK_RUNNING (SYSRQ)

crash>
crash> bt
PID: 0      TASK: ffff81011dd7a080  CPU: 4   COMMAND: "swapper"
 #0 [ffff81011ddf3d60] crash_kexec at ffffffff800b09ac
 #1 [ffff81011ddf3e20] sysrq_handle_crashdump at ffffffff801bbbfd
 #2 [ffff81011ddf3e30] __handle_sysrq at ffffffff801bb9cb
 #3 [ffff81011ddf3e70] receive_chars at ffffffff801ca675
 #4 [ffff81011ddf3ec0] serial8250_interrupt at ffffffff801cb80d
 #5 [ffff81011ddf3f20] handle_IRQ_event at ffffffff80010d1c
 #6 [ffff81011ddf3f50] __do_IRQ at ffffffff800be16b
 #7 [ffff81011ddf3f90] do_IRQ at ffffffff8006d4d1
--- ---
 #8 [ffff81011ddede38] ret_from_intr at ffffffff8005d615
    [exception RIP: mwait_idle_with_hints+102]
    RIP: ffffffff8006b9cf  RSP: ffff81011ddedee8  RFLAGS: 00000246
    RAX: 0000000000000000  RBX: 00000000000000ff  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: 0000000000000000  RDI: 0000000000000000
    RBP: 000ab2e79440cb92   R8: ffff81011ddec000   R9: 000000000000003c
    R10: ffff81011def0038  R11: 0000000000000246  R12: ffff81087aff8820
    R13: ffff81011dd7a080  R14: 0000000000000001  R15: 0000000000000000
    ORIG_RAX: fffffffffffffffc  CS: 0010  SS: 0018
 #9 [ffff81011ddedee8] mwait_idle at ffffffff80056c6a
#10 [ffff81011ddedef0] cpu_idle at ffffffff80048f67
crash>
crash>
crash> runq
CPU 0 RUNQUEUE: ffff810009004420
  CURRENT: PID: 0      TASK: ffffffff80319b60  COMMAND: "swapper"
  ACTIVE PRIO_ARRAY: ffff810009004d78
     [no tasks queued]
  EXPIRED PRIO_ARRAY: ffff810009004498
     [no tasks queued]

CPU 1 RUNQUEUE: ffff81000900caa0
  CURRENT: PID: 0      TASK: ffff81011dd0a100  COMMAND: "swapper"
  ACTIVE PRIO_ARRAY: ffff81000900d3f8
     [no tasks queued]
  EXPIRED PRIO_ARRAY: ffff81000900cb18
     [no tasks queued]

CPU 2 RUNQUEUE: ffff810009015120
  CURRENT: PID: 0      TASK: ffff81011dd60080  COMMAND: "swapper"
  ACTIVE PRIO_ARRAY: ffff810009015198
     [no tasks queued]
  EXPIRED PRIO_ARRAY: ffff810009015a78
     [no tasks queued]

CPU 3 RUNQUEUE: ffff81000901d7a0
  CURRENT: PID: 0      TASK: ffff81011dd6c100  COMMAND: "swapper"
  ACTIVE PRIO_ARRAY: ffff81000901d818
     [no tasks queued]
  EXPIRED PRIO_ARRAY: ffff81000901e0f8
     [no tasks queued]

CPU 4 RUNQUEUE: ffff810009025e20
  CURRENT: PID: 0      TASK: ffff81011dd7a080  COMMAND: "swapper"
  ACTIVE PRIO_ARRAY: ffff810009025e98
     [115] PID: 3575   TASK: ffff81087f863100  COMMAND: "klogd"
  EXPIRED PRIO_ARRAY: ffff810009026778
     [no tasks queued]

CPU 5 RUNQUEUE: ffff81000902e4a0
  CURRENT: PID: 0      TASK: ffff81011de20100  COMMAND: "swapper"
  ACTIVE PRIO_ARRAY: ffff81000902edf8
     [no tasks queued]
  EXPIRED PRIO_ARRAY: ffff81000902e518
     [no tasks queued]

CPU 6 RUNQUEUE: ffff810009036b20
  CURRENT: PID: 0      TASK: ffff81011de2f080  COMMAND: "swapper"
  ACTIVE PRIO_ARRAY: ffff810009036b98
     [no tasks queued]
  EXPIRED PRIO_ARRAY: ffff810009037478
     [no tasks queued]

CPU 7 RUNQUEUE: ffff81000903f1a0
  CURRENT: PID: 0      TASK: ffff81011deac100  COMMAND: "swapper"
  ACTIVE PRIO_ARRAY: ffff81000903f218
     [no tasks queued]
  EXPIRED PRIO_ARRAY: ffff81000903faf8
     [no tasks queued]
crash>
crash>
crash>
crash> bt -a
PID: 0      TASK: ffffffff80319b60  CPU: 0   COMMAND: "swapper"
 #0 [ffffffff804b4f20] crash_nmi_callback at ffffffff8007c228
 #1 [ffffffff804b4f40] do_nmi at ffffffff800658e5
 #2 [ffffffff804b4f50] nmi at ffffffff80064ecf
    [exception RIP: mwait_idle_with_hints+102]
    RIP: ffffffff8006b9cf  RSP: ffffffff80463f88  RFLAGS: 00000246
    RAX: 0000000000000000  RBX: ffffffff80056c5e  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: 0000000000000000  RDI: 0000000000000000
    RBP: 0000000000090000   R8: ffffffff80462000   R9: 0000000000000038
    R10:
ffff81011def00f8 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 --- --- #3 [ffffffff80463f88] mwait_idle_with_hints at ffffffff8006b9cf #4 [ffffffff80463f88] mwait_idle at ffffffff80056c6a #5 [ffffffff80463f90] cpu_idle at ffffffff80048f67 PID: 0 TASK: ffff81011dd0a100 CPU: 1 COMMAND: "swapper" #0 [ffff81011dd38f20] crash_nmi_callback at ffffffff8007c228 #1 [ffff81011dd38f40] do_nmi at ffffffff800658e5 #2 [ffff81011dd38f50] nmi at ffffffff80064ecf [exception RIP: mwait_idle_with_hints+102] RIP: ffffffff8006b9cf RSP: ffff81011dd2fee8 RFLAGS: 00000246 RAX: 0000000000000000 RBX: ffffffff80056c5e RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 00000000000000ff R8: ffff81011dd2e000 R9: 000000000000003d R10: ffff81011def0008 R11: 0000000000000248 R12: 0000000000000001 R13: ffffffff80439080 R14: 0000000000000100 R15: ffffffff8045b280 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 --- --- #3 [ffff81011dd2fee8] mwait_idle_with_hints at ffffffff8006b9cf #4 [ffff81011dd2fee8] mwait_idle at ffffffff80056c6a #5 [ffff81011dd2fef0] cpu_idle at ffffffff80048f67 PID: 0 TASK: ffff81011dd60080 CPU: 2 COMMAND: "swapper" #0 [ffff81011dd8cf20] crash_nmi_callback at ffffffff8007c228 #1 [ffff81011dd8cf40] do_nmi at ffffffff800658e5 #2 [ffff81011dd8cf50] nmi at ffffffff80064ecf [exception RIP: mwait_idle_with_hints+102] RIP: ffffffff8006b9cf RSP: ffff81011dd85ee8 RFLAGS: 00000246 RAX: 0000000000000000 RBX: ffffffff80056c5e RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 00000000000000ff R8: ffff81011dd84000 R9: 0000000000000038 R10: ffff81011def00f8 R11: 0000000000000286 R12: 0000000000000002 R13: ffffffff80439180 R14: 0000000000000200 R15: ffffffff8045b2a0 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 --- --- #3 [ffff81011dd85ee8] mwait_idle_with_hints at ffffffff8006b9cf  -- MORE -- forward: , or 
j backward: b or k quit: q  #4 [ffff81011dd85ee8] mwait_idle at ffffffff80056c6a #5 [ffff81011dd85ef0] cpu_idle at ffffffff80048f67 PID: 0 TASK: ffff81011dd6c100 CPU: 3 COMMAND: "swapper" #0 [ffff81011ddbbf20] crash_nmi_callback at ffffffff8007c228 #1 [ffff81011ddbbf40] do_nmi at ffffffff800658e5 #2 [ffff81011ddbbf50] nmi at ffffffff80064ecf [exception RIP: mwait_idle_with_hints+102] RIP: ffffffff8006b9cf RSP: ffff81011ddb9ee8 RFLAGS: 00000246 RAX: 0000000000000000 RBX: ffffffff80056c5e RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 00000000000000ff R8: ffff81011ddb8000 R9: 000000000000003d R10: ffff81011def0008 R11: 0000000000000246 R12: 0000000000000003 R13: ffffffff80439280 R14: 0000000000000300 R15: ffffffff8045b2c0 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 --- --- #3 [ffff81011ddb9ee8] mwait_idle_with_hints at ffffffff8006b9cf #4 [ffff81011ddb9ee8] mwait_idle at ffffffff80056c6a #5 [ffff81011ddb9ef0] cpu_idle at ffffffff80048f67 PID: 0 TASK: ffff81011dd7a080 CPU: 4 COMMAND: "swapper" #0 [ffff81011ddf3d60] crash_kexec at ffffffff800b09ac #1 [ffff81011ddf3e20] sysrq_handle_crashdump at ffffffff801bbbfd #2 [ffff81011ddf3e30] __handle_sysrq at ffffffff801bb9cb #3 [ffff81011ddf3e70] receive_chars at ffffffff801ca675 #4 [ffff81011ddf3ec0] serial8250_interrupt at ffffffff801cb80d #5 [ffff81011ddf3f20] handle_IRQ_event at ffffffff80010d1c #6 [ffff81011ddf3f50] __do_IRQ at ffffffff800be16b #7 [ffff81011ddf3f90] do_IRQ at ffffffff8006d4d1 --- --- #8 [ffff81011ddede38] ret_from_intr at ffffffff8005d615 [exception RIP: mwait_idle_with_hints+102] RIP: ffffffff8006b9cf RSP: ffff81011ddedee8 RFLAGS: 00000246 RAX: 0000000000000000 RBX: 00000000000000ff RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 000ab2e79440cb92 R8: ffff81011ddec000 R9: 000000000000003c R10: ffff81011def0038 R11: 0000000000000246 R12: ffff81087aff8820 R13: ffff81011dd7a080 R14: 0000000000000001 R15: 
0000000000000000 ORIG_RAX: fffffffffffffffc CS: 0010 SS: 0018 #9 [ffff81011ddedee8] mwait_idle at ffffffff80056c6a #10 [ffff81011ddedef0] cpu_idle at ffffffff80048f67 PID: 0 TASK: ffff81011de20100 CPU: 5 COMMAND: "swapper" #0 [ffff81011de48f20] crash_nmi_callback at ffffffff8007c228 #1 [ffff81011de48f40] do_nmi at ffffffff800658e5 #2 [ffff81011de48f50] nmi at ffffffff80064ecf [exception RIP: mwait_idle_with_hints+102] RIP: ffffffff8006b9cf RSP: ffff81011de41ee8 RFLAGS: 00000246  -- MORE -- forward: , or j backward: b or k quit: q  RAX: 0000000000000000 RBX: ffffffff80056c5e RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 00000000000000ff R8: ffff81011de40000 R9: 0000000000000039 R10: ffff81011def00c8 R11: 0000000000000246 R12: 0000000000000005 R13: ffffffff80439480 R14: 0000000000000500 R15: ffffffff8045b300 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 --- --- #3 [ffff81011de41ee8] mwait_idle_with_hints at ffffffff8006b9cf #4 [ffff81011de41ee8] mwait_idle at ffffffff80056c6a #5 [ffff81011de41ef0] cpu_idle at ffffffff80048f67 PID: 0 TASK: ffff81011de2f080 CPU: 6 COMMAND: "swapper" #0 [ffff81011de7df20] crash_nmi_callback at ffffffff8007c228 #1 [ffff81011de7df40] do_nmi at ffffffff800658e5 #2 [ffff81011de7df50] nmi at ffffffff80064ecf [exception RIP: mwait_idle_with_hints+102] RIP: ffffffff8006b9cf RSP: ffff81011de75ee8 RFLAGS: 00000246 RAX: 0000000000000000 RBX: ffffffff80056c5e RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 00000000000000ff R8: ffff81011de74000 R9: 000000000000003c R10: ffff81011def0038 R11: 0000000000000202 R12: 0000000000000006 R13: ffffffff80439580 R14: 0000000000000600 R15: ffffffff8045b320 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 --- --- #3 [ffff81011de75ee8] mwait_idle_with_hints at ffffffff8006b9cf #4 [ffff81011de75ee8] mwait_idle at ffffffff80056c6a #5 [ffff81011de75ef0] cpu_idle at ffffffff80048f67 PID: 0 TASK: ffff81011deac100 CPU: 7 COMMAND: 
"swapper" #0 [ffff81011ded1f20] crash_nmi_callback at ffffffff8007c228 #1 [ffff81011ded1f40] do_nmi at ffffffff800658e5 #2 [ffff81011ded1f50] nmi at ffffffff80064ecf [exception RIP: mwait_idle_with_hints+102] RIP: ffffffff8006b9cf RSP: ffff81011dec9ee8 RFLAGS: 00000246 RAX: 0000000000000000 RBX: ffffffff80056c5e RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 00000000000000ff R8: ffff81011dec8000 R9: 0000000000000039 R10: ffff81011def00c8 R11: 0000000000000202 R12: 0000000000000007 R13: ffffffff80439680 R14: 0000000000000700 R15: ffffffff8045b340 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 --- --- #3 [ffff81011dec9ee8] mwait_idle_with_hints at ffffffff8006b9cf #4 [ffff81011dec9ee8] mwait_idle at ffffffff80056c6a #5 [ffff81011dec9ef0] cpu_idle at ffffffff80048f67 [?1l>crash> crash> crash> crash> bt -f [?1h= PID: 0 TASK: ffff81011dd7a080 CPU: 4 COMMAND: "swapper" #0 [ffff81011ddf3d60] crash_kexec at ffffffff800b09ac ffff81011ddf3d68: 0000000000000000 0000000000000001 ffff81011ddf3d78: ffff81011dd7a080 ffff81087aff8820 ffff81011ddf3d88: 000ab2e79440cb92 00000000000000ff ffff81011ddf3d98: 0000000000000246 ffff81011def0038 ffff81011ddf3da8: 000000000000003c ffff81011ddec000 ffff81011ddf3db8: 0000000000000000 0000000000000000 ffff81011ddf3dc8: 0000000000000000 0000000000000000 ffff81011ddf3dd8: 0000000000000000 fffffffffffffffc ffff81011ddf3de8: ffffffff8006b9cf 0000000000000010 ffff81011ddf3df8: 0000000000000246 ffff81011ddedee8 ffff81011ddf3e08: 0000000000000018 0000000000000000 ffff81011ddf3e18: ffffffff803488e0 ffffffff801bbbfd #1 [ffff81011ddf3e20] sysrq_handle_crashdump at ffffffff801bbbfd ffff81011ddf3e28: 0000000000000063 ffffffff801bb9cb #2 [ffff81011ddf3e30] __handle_sysrq at ffffffff801bb9cb ffff81011ddf3e38: 0000000000000092 ffffffff805b3600 ffff81011ddf3e48: ffffffff805b3661 0000000000000063 ffff81011ddf3e58: 0000000000000100 ffff81011ddf3eec ffff81011ddf3e68: ffff81087ba38800 ffffffff801ca675 #3 
[ffff81011ddf3e70] receive_chars at ffffffff801ca675 ffff81011ddf3e78: ffff81011ddede38 ffff8100090272a0 ffff81011ddf3e88: ffff81011ddf3ee8 ffffffff805b3600 ffff81011ddf3e98: ffffffff805b36d8 ffffffff805b2630 ffff81011ddf3ea8: 0000000000000246 0000000000000000 ffff81011ddf3eb8: 0000000000000000 ffffffff801cb80d #4 [ffff81011ddf3ec0] serial8250_interrupt at ffffffff801cb80d ffff81011ddf3ec8: ffffffff8004f249 ffff81011ddede38 ffff81011ddf3ed8: 0000000300000046 ffff81011dde4000 ffff81011ddf3ee8: 0000006180448f90 0000000000000000 ffff81011ddf3ef8: ffff81087d4c7140 0000000000000003 ffff81011ddf3f08: 0000000000000000 ffff81011ddede38 ffff81011ddf3f18: ffff81011ddede38 ffffffff80010d1c #5 [ffff81011ddf3f20] handle_IRQ_event at ffffffff80010d1c ffff81011ddf3f28: 0000000000010000 ffffffff80449680 ffff81011ddf3f38: 0000000000000300 0000000000000003 ffff81011ddf3f48: ffff81087d4c7140 ffffffff800be16b #6 [ffff81011ddf3f50] __do_IRQ at ffffffff800be16b ffff81011ddf3f58: ffffffff804496bc 0000000000000003 ffff81011ddf3f68: ffff81011ddede38 0000000000000004 ffff81011ddf3f78: ffffffff80439380 0000000000000400 ffff81011ddf3f88: ffffffff8045b2e0 ffffffff8006d4d1 #7 [ffff81011ddf3f90] do_IRQ at ffffffff8006d4d1 --- --- #8 [ffff81011ddede38] ret_from_intr at ffffffff8005d615 [exception RIP: mwait_idle_with_hints+102] RIP: ffffffff8006b9cf RSP: ffff81011ddedee8 RFLAGS: 00000246  -- MORE -- forward: , or j backward: b or k quit: q  RAX: 0000000000000000 RBX: 00000000000000ff RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 000ab2e79440cb92 R8: ffff81011ddec000 R9: 000000000000003c R10: ffff81011def0038 R11: 0000000000000246 R12: ffff81087aff8820 R13: ffff81011dd7a080 R14: 0000000000000001 R15: 0000000000000000 ORIG_RAX: fffffffffffffffc CS: 0010 SS: 0018 ffff81011ddede40: 0000000000000001 ffff81011dd7a080 ffff81011ddede50: ffff81087aff8820 000ab2e79440cb92 ffff81011ddede60: 00000000000000ff 0000000000000246 ffff81011ddede70: ffff81011def0038 
000000000000003c ffff81011ddede80: ffff81011ddec000 0000000000000000 ffff81011ddede90: 0000000000000000 0000000000000000 ffff81011ddedea0: 0000000000000000 0000000000000000 ffff81011ddedeb0: fffffffffffffffc ffffffff8006b9cf ffff81011ddedec0: 0000000000000010 0000000000000246 ffff81011ddeded0: ffff81011ddedee8 0000000000000018 ffff81011ddedee0: 00000000000000ff ffffffff80056c6a #9 [ffff81011ddedee8] mwait_idle at ffffffff80056c6a ffff81011ddedef0: ffffffff80048f67 #10 [ffff81011ddedef0] cpu_idle at ffffffff80048f67 [?1l>crash> crash> crash> crash> ps [?1h= PID PPID CPU TASK ST %MEM VSZ RSS COMM > 0 0 0 ffffffff80319b60 RU 0.0 0 0 [swapper] > 0 1 1 ffff81011dd0a100 RU 0.0 0 0 [swapper] > 0 1 2 ffff81011dd60080 RU 0.0 0 0 [swapper] > 0 1 3 ffff81011dd6c100 RU 0.0 0 0 [swapper] > 0 1 4 ffff81011dd7a080 RU 0.0 0 0 [swapper] > 0 1 5 ffff81011de20100 RU 0.0 0 0 [swapper] > 0 1 6 ffff81011de2f080 RU 0.0 0 0 [swapper] > 0 1 7 ffff81011deac100 RU 0.0 0 0 [swapper] 1 0 2 ffff81011dcf97a0 IN 0.0 10372 648 init 2 1 0 ffff81011dcf9040 IN 0.0 0 0 [migration/0] 3 1 0 ffff81011dcfb7e0 IN 0.0 0 0 [ksoftirqd/0] 4 1 0 ffff81011dcfb080 IN 0.0 0 0 [watchdog/0] 5 1 1 ffff81011dcfc820 IN 0.0 0 0 [migration/1] 6 1 1 ffff81011dcfc0c0 IN 0.0 0 0 [ksoftirqd/1] 7 1 1 ffff81011dd0a860 IN 0.0 0 0 [watchdog/1] 8 1 2 ffff81011dd0d7a0 IN 0.0 0 0 [migration/2] 9 1 2 ffff81011dd0d040 IN 0.0 0 0 [ksoftirqd/2] 10 1 2 ffff81011dd607e0 IN 0.0 0 0 [watchdog/2] 11 1 3 ffff81011dd62820 IN 0.0 0 0 [migration/3] 12 1 3 ffff81011dd620c0 IN 0.0 0 0 [ksoftirqd/3] 13 1 3 ffff81011dd6c860 IN 0.0 0 0 [watchdog/3] 14 1 4 ffff81011dd6e7a0 IN 0.0 0 0 [migration/4] 15 1 4 ffff81011dd6e040 IN 0.0 0 0 [ksoftirqd/4] 16 1 4 ffff81011dd7a7e0 IN 0.0 0 0 [watchdog/4] 17 1 5 ffff81011de13820 IN 0.0 0 0 [migration/5] 18 1 5 ffff81011de130c0 IN 0.0 0 0 [ksoftirqd/5] 19 1 5 ffff81011de20860 IN 0.0 0 0 [watchdog/5] 20 1 6 ffff81011de237a0 IN 0.0 0 0 [migration/6] 21 1 6 ffff81011de23040 IN 0.0 0 0 [ksoftirqd/6] 22 1 6 
ffff81011de2f7e0 IN 0.0 0 0 [watchdog/6] 23 1 7 ffff81011de31820 IN 0.0 0 0 [migration/7] 24 1 7 ffff81011de310c0 IN 0.0 0 0 [ksoftirqd/7] 25 1 7 ffff81011deac860 IN 0.0 0 0 [watchdog/7] 26 1 0 ffff81087fe7a7a0 IN 0.0 0 0 [events/0] 27 1 1 ffff81087fe7a040 IN 0.0 0 0 [events/1] 28 1 2 ffff81087fe7b7e0 IN 0.0 0 0 [events/2] 29 1 3 ffff81087fe7b080 IN 0.0 0 0 [events/3] 30 1 4 ffff81087fe7c820 IN 0.0 0 0 [events/4] 31 1 5 ffff81087fe7c0c0 IN 0.0 0 0 [events/5] 32 1 6 ffff81087fe7d860 IN 0.0 0 0 [events/6] 33 1 7 ffff81087fe7d100 IN 0.0 0 0 [events/7] 34 1 3 ffff81087fe7e7a0 IN 0.0 0 0 [khelper] 307 1 0 ffff81087feb97e0 IN 0.0 0 0 [kthread] 318 307 0 ffff81087feb9080 IN 0.0 0 0 [kblockd/0] 319 307 1 ffff81087feba820 IN 0.0 0 0 [kblockd/1] 320 307 2 ffff81087feba0c0 IN 0.0 0 0 [kblockd/2] 321 307 3 ffff81087febd860 IN 0.0 0 0 [kblockd/3]  -- MORE -- forward: , or j backward: b or k quit: q  322 307 4 ffff81087febd100 IN 0.0 0 0 [kblockd/4] 323 307 5 ffff81087febf7a0 IN 0.0 0 0 [kblockd/5] 324 307 6 ffff81087febf040 IN 0.0 0 0 [kblockd/6] 325 307 7 ffff81087fec17e0 IN 0.0 0 0 [kblockd/7] 326 307 0 ffff81087feca040 IN 0.0 0 0 [kacpid] 490 307 0 ffff81087ff77820 IN 0.0 0 0 [cqueue/0] 491 307 1 ffff81087fecd7e0 IN 0.0 0 0 [cqueue/1] 492 307 2 ffff81087fecd080 IN 0.0 0 0 [cqueue/2] 493 307 3 ffff81087fecf820 IN 0.0 0 0 [cqueue/3] 494 307 4 ffff81087fecf0c0 IN 0.0 0 0 [cqueue/4] 495 307 5 ffff81087fed3860 IN 0.0 0 0 [cqueue/5] 496 307 6 ffff81087fed3100 IN 0.0 0 0 [cqueue/6] 497 307 7 ffff81087fed57a0 IN 0.0 0 0 [cqueue/7] 500 307 0 ffff81087fed5040 IN 0.0 0 0 [khubd] 502 307 0 ffff81087f80e860 IN 0.0 0 0 [kseriod] 619 307 0 ffff81087f80a080 IN 0.0 0 0 [khungtaskd] 620 307 0 ffff81087f8fb820 IN 0.0 0 0 [pdflush] 621 307 4 ffff81087f8fb0c0 IN 0.0 0 0 [pdflush] 622 307 5 ffff81087f8fe860 IN 0.0 0 0 [kswapd0] 623 307 0 ffff81087f8fe100 IN 0.0 0 0 [aio/0] 624 307 1 ffff81087f9017a0 IN 0.0 0 0 [aio/1] 625 307 2 ffff81087f901040 IN 0.0 0 0 [aio/2] 626 307 3 ffff81087f9027e0 IN 0.0 
0 0 [aio/3] 627 307 4 ffff81087f905820 IN 0.0 0 0 [aio/4] 628 307 5 ffff81087f815820 IN 0.0 0 0 [aio/5] 629 307 6 ffff81087f814080 IN 0.0 0 0 [aio/6] 630 307 7 ffff81087ffbe0c0 IN 0.0 0 0 [aio/7] 797 307 0 ffff81087ffc1860 IN 0.0 0 0 [kpsmoused] 877 307 0 ffff81087f8147e0 IN 0.0 0 0 [scsi_eh_0] 878 307 4 ffff81087f84d100 IN 0.0 0 0 [aacraid] 897 307 0 ffff81087f946100 IN 0.0 0 0 [ata/0] 898 307 1 ffff81087f94a7a0 IN 0.0 0 0 [ata/1] 899 307 2 ffff81087f94a040 IN 0.0 0 0 [ata/2] 900 307 3 ffff81087f94b7e0 IN 0.0 0 0 [ata/3] 901 307 4 ffff81087f94b080 IN 0.0 0 0 [ata/4] 902 307 5 ffff81087f94d820 IN 0.0 0 0 [ata/5] 903 307 6 ffff81087f94d0c0 IN 0.0 0 0 [ata/6] 904 307 7 ffff81087f950860 IN 0.0 0 0 [ata/7] 905 307 0 ffff81087f950100 IN 0.0 0 0 [ata_aux] 915 307 0 ffff81087f8ad860 IN 0.0 0 0 [scsi_eh_1] 916 307 0 ffff81087f8ad100 IN 0.0 0 0 [scsi_eh_2] 935 307 0 ffff81087f8b27e0 IN 0.0 0 0 [kstriped] 972 307 4 ffff81087fffb860 IN 0.0 0 0 [kjournald] 994 307 1 ffff81087ffc57e0 IN 0.0 0 0 [kauditd] 1022 1 3 ffff81087f863860 IN 0.0 25592 1564 udevd 1890 307 5 ffff81087ebf1100 IN 0.0 0 0 [mlx4] 1893 307 5 ffff81087c7fe7a0 IN 0.0 0 0 [mlx4_opreq] 2251 307 5 ffff81087d5ea080 IN 0.0 0 0 [mlx4_sense]  -- MORE -- forward: , or j backward: b or k quit: q  2271 307 5 ffff81087bcda860 IN 0.0 0 0 [mlx4_en] 2565 307 0 ffff81087de89040 IN 0.0 0 0 [kmpathd/0] 2566 307 1 ffff81087a59b860 IN 0.0 0 0 [kmpathd/1] 2567 307 2 ffff81087b588080 IN 0.0 0 0 [kmpathd/2] 2568 307 3 ffff81087d851080 IN 0.0 0 0 [kmpathd/3] 2569 307 4 ffff81087d6ff040 IN 0.0 0 0 [kmpathd/4] 2570 307 5 ffff81087d21a040 IN 0.0 0 0 [kmpathd/5] 2571 307 6 ffff81011de12820 IN 0.0 0 0 [kmpathd/6] 2572 307 7 ffff81087aca80c0 IN 0.0 0 0 [kmpathd/7] 2573 307 0 ffff81087d99e100 IN 0.0 0 0 [kmpath_handlerd] 2604 307 4 ffff81087b6ed820 IN 0.0 0 0 [kjournald] 2815 307 0 ffff81087d6607a0 IN 0.0 0 0 [mthcacatas] 2828 307 0 ffff81087c359040 IN 0.0 0 0 [mlx4_ib] 2831 307 6 ffff81087b1227e0 IN 0.0 0 0 [ib_mad1] 2833 307 0 
ffff81087ce55040 IN 0.0 0 0 [ib_mad2] 2914 307 0 ffff81087d9c70c0 IN 0.0 0 0 [ib_mcast] 2915 307 0 ffff81087b233100 IN 0.0 0 0 [ib_inform] 2916 307 6 ffff81087df9f100 IN 0.0 0 0 [local_sa] 2926 307 0 ffff81087c8b6860 IN 0.0 0 0 [ib_cm/0] 2927 307 1 ffff81087d889080 IN 0.0 0 0 [ib_cm/1] 2928 307 2 ffff81087caa30c0 IN 0.0 0 0 [ib_cm/2] 2929 307 3 ffff81087b3800c0 IN 0.0 0 0 [ib_cm/3] 2930 307 4 ffff81087d1cd860 IN 0.0 0 0 [ib_cm/4] 2931 307 5 ffff81087f9050c0 IN 0.0 0 0 [ib_cm/5] 2932 307 6 ffff81087bcda100 IN 0.0 0 0 [ib_cm/6] 2933 307 7 ffff81087f80e100 IN 0.0 0 0 [ib_cm/7] 2954 307 4 ffff81087c115080 IN 0.0 0 0 [ipoib] 2955 307 6 ffff81087e7f50c0 IN 0.0 0 0 [ipoib_auto_mode] 3002 307 6 ffff81087ae81040 IN 0.0 0 0 [ib_addr] 3011 307 0 ffff81087d9ac7e0 IN 0.0 0 0 [iw_cm_wq] 3022 307 0 ffff81087b0467e0 IN 0.0 0 0 [rdma_cm] 3101 307 0 ffff81087ce67100 IN 0.0 0 0 [iscsi_eh] 3180 307 0 ffff81087cd557e0 IN 0.0 0 0 [cnic_wq] 3189 307 0 ffff81087ae817a0 IN 0.0 0 0 [bnx2i_thread/0] 3191 307 1 ffff81087ccb80c0 IN 0.0 0 0 [bnx2i_thread/1] 3192 307 2 ffff81087c07e0c0 IN 0.0 0 0 [bnx2i_thread/2] 3193 307 3 ffff81087c6c47e0 IN 0.0 0 0 [bnx2i_thread/3] 3194 307 4 ffff81087dc7c040 IN 0.0 0 0 [bnx2i_thread/4] 3195 307 5 ffff81087c8ca080 IN 0.0 0 0 [bnx2i_thread/5] 3196 307 6 ffff81087caa3820 IN 0.0 0 0 [bnx2i_thread/6] 3197 307 7 ffff81087de897a0 IN 0.0 0 0 [bnx2i_thread/7] 3231 1 0 ffff81087c70f860 IN 0.1 28696 22544 iscsiuio 3232 1 7 ffff81087ffc1100 IN 0.1 28696 22544 iscsiuio 3235 1 0 ffff81087c3180c0 IN 0.1 28696 22544 iscsiuio 3236 1 2 ffff81087ba3a0c0 IN 0.0 4592 508 iscsid 3238 1 0 ffff81087d660040 IN 0.0 5096 3044 iscsid 3554 1 6 ffff81087f80a7e0 IN 0.0 92912 888 auditd 3555 1 4 ffff81087d18e820 IN 0.0 92912 888 auditd  -- MORE -- forward: , or j backward: b or k quit: q  3556 3554 2 ffff81087ce20820 IN 0.0 81828 836 audispd 3557 3554 5 ffff81011de120c0 IN 0.0 81828 836 audispd 3572 1 7 ffff81087cefa7a0 IN 0.0 5932 696 syslogd 3575 1 4 ffff81087f863100 RU 0.0 3828 440 
klogd 3589 1 7 ffff81087d930860 IN 0.0 9764 364 irqbalance 3600 1 7 ffff81087feca7a0 IN 0.0 8076 604 portmap 3636 307 0 ffff81087d6ff7a0 IN 0.0 0 0 [rpciod/0] 3638 307 1 ffff81087cc637e0 IN 0.0 0 0 [rpciod/1] 3639 307 2 ffff81087ce200c0 IN 0.0 0 0 [rpciod/2] 3640 307 3 ffff81087ba3a820 IN 0.0 0 0 [rpciod/3] 3641 307 4 ffff81087c07e820 IN 0.0 0 0 [rpciod/4] 3642 307 5 ffff81087dc7c7a0 IN 0.0 0 0 [rpciod/5] 3643 307 6 ffff81087d930100 IN 0.0 0 0 [rpciod/6] 3644 307 7 ffff81087d18e0c0 IN 0.0 0 0 [rpciod/7] 3657 1 0 ffff81087ff770c0 IN 0.0 10184 820 rpc.statd 3680 1 1 ffff81087aff80c0 IN 0.0 22812 576 rpc.idmapd 3764 1 4 ffff81087c6c4080 IN 0.0 31852 1432 dbus-daemon 3810 1 6 ffff81087f902080 IN 0.0 124260 1952 automount 3811 1 1 ffff81087cc63080 IN 0.0 124260 1952 automount 3812 1 1 ffff81087f8150c0 IN 0.0 124260 1952 automount 3815 1 4 ffff81087df9f860 IN 0.0 124260 1952 automount 3818 1 4 ffff81087d99e860 IN 0.0 124260 1952 automount 3847 1 6 ffff81087f84d860 IN 0.0 59796 1200 sshd 3860 1 6 ffff81087f8b2080 IN 0.0 22092 992 xinetd 3886 1 4 ffff81087da35080 IN 0.0 59964 312 uuidd 3895 1 4 ffff81087aff8820 IN 0.0 24936 2804 gmond 3970 1 1 ffff81086b32a820 IN 0.0 31132 1224 cfservd 4040 1 1 ffff81087d9c7820 IN 0.0 21508 3780 xfs 4057 1 4 ffff81087e7f5820 IN 0.0 17088 464 atd 4078 1 2 ffff81086acb20c0 IN 0.0 28964 4600 hald 4079 4078 6 ffff81086978e820 IN 0.0 20712 1176 hald-runner 4086 4079 7 ffff81086978e0c0 IN 0.0 10676 824 hald-addon-acpi 4093 4079 4 ffff81087ffc5080 IN 0.0 10680 824 hald-addon-keyb 4101 4079 2 ffff81087d8897e0 IN 0.0 10252 760 hald-addon-stor 4155 1 4 ffff81087f946860 IN 0.0 83444 2428 login 4156 1 3 ffff81086a6a0820 IN 0.0 3816 548 mingetty 4157 1 6 ffff81087b122080 IN 0.0 3816 548 mingetty 4158 1 0 ffff81087fffb100 IN 0.0 3816 548 mingetty 4159 1 0 ffff81086950c7e0 IN 0.0 3816 548 mingetty 4160 1 3 ffff81087c318820 IN 0.0 3816 552 mingetty 4163 1 6 ffff81087c1157e0 IN 0.0 3816 552 mingetty 4775 1 2 ffff81087c70f100 IN 0.0 214304 2476 nscd 4777 1 
6 ffff81087ccb8820 IN 0.0 214304 2476 nscd 4778 1 6 ffff81087cefa040 IN 0.0 214304 2476 nscd 4779 1 6 ffff81086acb2820 IN 0.0 214304 2476 nscd 4780 1 6 ffff810631270860 IN 0.0 214304 2476 nscd 4781 1 6 ffff8107eba81100 IN 0.0 214304 2476 nscd 4782 1 6 ffff81086a6a00c0 IN 0.0 214304 2476 nscd  -- MORE -- forward: , or j backward: b or k quit: q  4783 1 6 ffff8107ea7c47a0 IN 0.0 214304 2476 nscd 6458 1 2 ffff8104637b60c0 IN 0.0 0 0 [ldlm_bl_11] 7925 1 2 ffff8107e9239080 IN 0.0 0 0 [ldlm_bl_08] 7926 1 2 ffff8107e59687a0 IN 0.0 0 0 [ldlm_bl_09] 7927 1 6 ffff81085ce57820 IN 0.0 0 0 [ldlm_bl_10] 8871 12032 2 ffff81087fec1080 IN 0.0 59668 2548 pickup 9915 1 6 ffff81080d8eb7a0 IN 0.0 0 0 [obd_zombid] 9940 1 6 ffff8107f6bcb7e0 IN 0.0 0 0 [kiblnd_sd_00] 9941 1 6 ffff8107f6bcb080 IN 0.0 0 0 [kiblnd_sd_01] 9942 1 6 ffff810821aab820 IN 0.0 0 0 [kiblnd_sd_02] 9943 1 6 ffff810821aab0c0 IN 0.0 0 0 [kiblnd_sd_03] 9944 1 6 ffff8107f7a23860 IN 0.0 0 0 [kiblnd_sd_04] 9945 1 6 ffff8107f7a23100 IN 0.0 0 0 [kiblnd_sd_05] 9946 1 6 ffff8107f7fd57a0 IN 0.0 0 0 [kiblnd_sd_06] 9947 1 6 ffff8107f7fd5040 IN 0.0 0 0 [kiblnd_sd_07] 9948 1 0 ffff8108635097e0 IN 0.0 0 0 [kiblnd_connd] 9950 1 1 ffff8107f7edc040 IN 0.0 0 0 [ptlrpcd] 9951 1 3 ffff8107f7edc7a0 IN 0.0 0 0 [ptlrpcd-recov] 9952 1 3 ffff8107f7ede100 IN 0.0 0 0 [ll_ping] 10015 1 2 ffff8107f508b7a0 IN 0.0 0 0 [ldlm_bl_00] 10016 1 3 ffff8107f508b040 IN 0.0 0 0 [ldlm_bl_01] 10017 1 2 ffff8107f7537860 IN 0.0 0 0 [ldlm_cn_00] 10018 1 2 ffff8107ff88f860 IN 0.0 0 0 [ldlm_cn_01] 10019 1 2 ffff8107f7537100 IN 0.0 0 0 [lc_watchdogd] 10020 1 7 ffff8107f7eda080 IN 0.0 0 0 [ldlm_cb_00] 10021 1 7 ffff8107f703a040 IN 0.0 0 0 [ldlm_cb_01] 10022 1 2 ffff81085f337820 IN 0.0 0 0 [ldlm_elt] 10024 1 2 ffff8107f7038080 IN 0.0 0 0 [ll_close] 10027 1 3 ffff81080d8eb040 IN 0.0 0 0 [ll_close] 10030 1 7 ffff81087da357e0 IN 0.0 0 0 [ll_close] 10052 307 0 ffff81086950c080 IN 0.0 0 0 [nfsiod] 10053 1 6 ffff81087fe7e040 IN 0.0 0 0 [lockd] 10996 1 4 ffff81087b380820 IN 
0.0  124496   3796  sge_execd
  10997      1   2  ffff8107ec7f3080  IN   0.0  124496   3796  sge_execd
  10998      1   2  ffff81087c3597a0  IN   0.0  124496   3796  sge_execd
  10999      1   4  ffff8107f703a7a0  IN   0.0  124496   3796  sge_execd
  11000      1   6  ffff810867a707a0  IN   0.0  124496   3796  sge_execd
  11007      1   6  ffff81087b6ed0c0  IN   0.0       0      0  [ldlm_cb_02]
  11601   3847   4  ffff81087d1cd100  IN   0.0  101284   3772  sshd
  11603  11601   6  ffff81087ffbe820  IN   0.0  101284   1792  sshd
  11604  11603   6  ffff81087d21a7a0  IN   0.0   68556   1632  bash
  11731  11604   4  ffff810836495080  IN   0.0  138812   2612  su
  11732  11731   3  ffff81087ce557a0  IN   0.0   68544   1644  bash
  11871   4155   4  ffff81087a59b100  IN   0.0   68424   1604  bash
  12032      1   7  ffff8108685be860  IN   0.0   62580   2596  master
  12035  12032   4  ffff8107ed5f90c0  IN   0.0   59840   2660  qmgr
  13098      1   0  ffff81085f3370c0  IN   0.0   72964   1036  crond
  15251      1   3  ffff8107eba81860  IN   0.0   21632   7320  ntpd
  16477  10996   3  ffff810631270100  IN   0.0   16624   1464  sge_shepherd
  16478  16477   2  ffff8107f7ede860  UN   0.0   79740   5888  perl
  20014      1   1  ffff8107eb1f37a0  IN   0.0       0      0  [ldlm_bl_15]
  26874      1   4  ffff8104aa5db0c0  IN   0.0       0      0  [ldlm_bl_13]
  26875      1   5  ffff810863509080  IN   0.0       0      0  [ldlm_bl_14]
  26894      1   0  ffff8107ec46a7a0  IN   0.0       0      0  [ldlm_bl_02]
  28889      1   6  ffff8107ff88f100  IN   0.0       0      0  [ldlm_cb_03]
  28952      1   7  ffff81018c643080  IN   0.0       0      0  [ldlm_cb_04]
  29048      1   3  ffff8105fe731080  IN   0.0       0      0  [ldlm_bl_12]
  29432      1   1  ffff8107eb1f3040  IN   0.0       0      0  [ldlm_bl_03]
  29433      1   7  ffff8107ea9d07a0  IN   0.0       0      0  [ldlm_bl_04]
  29434      1   7  ffff8107eab5a7a0  IN   0.0       0      0  [ldlm_bl_05]
  29435      1   5  ffff8107ea7c4040  IN   0.0       0      0  [ldlm_bl_06]
  29436      1   6  ffff8107ead427a0  IN   0.0       0      0  [ldlm_bl_07]
crash>
crash>
crash>
crash> kmem
Usage: kmem [-f|-F|-p|-c|-C|-i|-s|-S|-v|-V|-n|-z] [-[l|L][a|i]] [slab] [[-P] address]
Enter "help kmem" for details.
crash>
crash> kmem -f
ZONE  NAME        SIZE     FREE      MEM_MAP       START_PADDR  START_MAPNR
  0   DMA         4096     2487  ffff810100000000            0            0
AREA    SIZE  FREE_AREA_STRUCT  BLOCKS   PAGES
  0       4k  ffff810000032858       5       5
  1       8k  ffff810000032870       3       6
  2      16k  ffff810000032888       3      12
  3      32k  ffff8100000328a0       4      32
  4      64k  ffff8100000328b8       4      64
  5     128k  ffff8100000328d0       4     128
  6     256k  ffff8100000328e8       1      64
  7     512k  ffff810000032900       1     128
  8    1024k  ffff810000032918       0       0
  9    2048k  ffff810000032930       0       0
 10    4096k  ffff810000032948       2    2048

ZONE  NAME        SIZE     FREE      MEM_MAP       START_PADDR  START_MAPNR
  1   DMA32    1044480   433449  ffff810100038000      1000000         4096
AREA    SIZE  FREE_AREA_STRUCT  BLOCKS   PAGES
  0       4k  ffff810000033358    1031    1031
  1       8k  ffff810000033370     979    1958
  2      16k  ffff810000033388     983    3932
  3      32k  ffff8100000333a0     838    6704
  4      64k  ffff8100000333b8     711   11376
  5     128k  ffff8100000333d0     414   13248
  6     256k  ffff8100000333e8     315   20160
  7     512k  ffff810000033400     360   46080
  8    1024k  ffff810000033418     319   81664
  9    2048k  ffff810000033430     193   98816
 10    4096k  ffff810000033448     145  148480

ZONE  NAME        SIZE     FREE      MEM_MAP       START_PADDR  START_MAPNR
  2   Normal   7864320  4983573  ffff810103800000    100000000      1048576
AREA    SIZE  FREE_AREA_STRUCT  BLOCKS   PAGES
  0       4k  ffff810000033e58  128347  128347
  1       8k  ffff810000033e70  127993  255986
  2      16k  ffff810000033e88  106962  427848
  3      32k  ffff810000033ea0   70726  565808
  4      64k  ffff810000033eb8   44147  706352
  5     128k  ffff810000033ed0   29613  947616
  6     256k  ffff810000033ee8   15828 1012992
  7     512k  ffff810000033f00    6763  865664
  8    1024k  ffff810000033f18     253   64768
  9    2048k  ffff810000033f30       4    2048
 10    4096k  ffff810000033f48       6    6144

ZONE  NAME        SIZE     FREE      MEM_MAP       START_PADDR  START_MAPNR
  3   HighMem        0        0                 0            0            0

nr_free_pages: 5419509  (verified)
crash>
crash>
crash> log
event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 2 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785519484 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 10s ago has timed out (10s prior to deadline).
  req@ffff8107e5710800 x1427389785519484/t0 o8->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364139989 ref 2 fl Rpc:N/0/0 rc 0/0
Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 84 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 36 previous similar messages
Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib.
Lustre: Skipped 33 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 18s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 26 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 6 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 5 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785523998 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 33s ago has timed out (33s prior to deadline).
  req@ffff8104e1e91c00 x1427389785523998/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364140636 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 88 previous similar messages
Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 32 previous similar messages
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 3 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (14)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib.
Lustre: Skipped 35 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0011-osc-ffff8107f8008c00: tried all connections, increasing latency to 11s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 17 previous similar messages
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (4)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 4 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785529526 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 29s ago has timed out (29s prior to deadline).
  req@ffff810449a28c00 x1427389785529526/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364141307 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 94 previous similar messages
Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 35 previous similar messages
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 3 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (33)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib.
Lustre: Skipped 35 previous similar messages
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0011-osc-ffff8107f8008c00: tried all connections, increasing latency to 34s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 26 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Skipped 1 previous similar message
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (14)
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Skipped 1 previous similar message
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 4 previous similar messages
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785533896 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 31s ago has timed out (31s prior to deadline).
  req@ffff8101ac00f400 x1427389785533896/t0 o400->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364141909 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 91 previous similar messages
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 28 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib.
Lustre: Skipped 27 previous similar messages
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 24s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 31 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 4 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Skipped 3 previous similar messages
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (24)
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Skipped 3 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785538009 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 74s ago has timed out (73s prior to deadline).
  req@ffff810608970400 x1427389785538009/t0 o400->share3-OST000f_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364142522 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 104 previous similar messages
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib.
Lustre: Skipped 33 previous similar messages
Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 34 previous similar messages
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 6s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 33 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 6 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 5 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib.
Lustre: Skipped 30 previous similar messages
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785543334 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 19s ago has timed out (19s prior to deadline).
  req@ffff8101b5e76000 x1427389785543334/t0 o400->share3-OST000f_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364143172 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 94 previous similar messages
Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 30 previous similar messages
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 3 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Skipped 3 previous similar messages
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (27)
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Skipped 3 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 4s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 31 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 4 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785547574 sent from share3-OST0010-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 35s ago has timed out (35s prior to deadline).
  req@ffff81048636a800 x1427389785547574/t0 o8->share3-OST0010_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364143776 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 91 previous similar messages
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: Skipped 32 previous similar messages
Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib.
Lustre: Skipped 31 previous similar messages
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 30s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 23 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 5 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 5 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785552127 sent from share3-OST000e-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 75s ago has timed out (75s prior to deadline).
  req@ffff8103cf456800 x1427389785552127/t0 o400->share3-OST000e_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364144415 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 108 previous similar messages
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 2s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 13 previous similar messages
Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 43 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection restored to service share3-OST000f using nid 172.26.8.140@o2ib.
Lustre: Skipped 45 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 6 previous similar messages
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Skipped 2 previous similar messages
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (33)
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Skipped 2 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 28s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 35 previous similar messages
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785556676 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 83s ago has timed out (83s prior to deadline).
  req@ffff8104ab925400 x1427389785556676/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364145031 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 114 previous similar messages
Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection to service share3-OST000f via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 42 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection restored to service share3-OST0010 using nid 172.26.8.140@o2ib.
Lustre: Skipped 42 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 3 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (47)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 4 previous similar messages
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 7 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785561610 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 63s ago has timed out (63s prior to deadline).
  req@ffff810645b9b400 x1427389785561610/t0 o400->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364145642 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 74 previous similar messages
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 48s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 19 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 167-0: This client was evicted by share3-OST0011; in progress operations using this service will fail.
Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib.
Lustre: Skipped 24 previous similar messages
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 25 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 6 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9951:0:(import.c:855:ptlrpc_connect_interpret()) share3-OST0010_UUID@172.26.8.140@o2ib changed server handle from 0x9d26d1a528bd06b to 0x9d26d1a52a864ad
LustreError: 167-0: This client was evicted by share3-OST0010; in progress operations using this service will fail.
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785565896 sent from share3-OST000e-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 66s ago has timed out (65s prior to deadline).
  req@ffff8103e4cdc000 x1427389785565896/t0 o400->share3-OST000e_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364146243 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 78 previous similar messages
Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection to service share3-OST000f via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: Skipped 29 previous similar messages
Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib.
Lustre: Skipped 29 previous similar messages
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 8s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 30 previous similar messages
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (37)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (27)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 6 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785570519 sent from share3-OST000e-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 53s ago has timed out (53s prior to deadline).
  req@ffff81084aad6000 x1427389785570519/t0 o400->share3-OST000e_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364146849 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 83 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection restored to service share3-OST000f using nid 172.26.8.140@o2ib.
Lustre: Skipped 33 previous similar messages
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 33 previous similar messages
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 2s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 25 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 4 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (40)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (16)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 3 previous similar messages
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 3 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (31)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785575211 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 31s ago has timed out (31s prior to deadline).
  req@ffff81039a2a9000 x1427389785575211/t0 o8->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364147480 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 97 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection restored to service share3-OST0010 using nid 172.26.8.140@o2ib.
Lustre: Skipped 32 previous similar messages
Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 33 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 24s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 33 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Skipped 1 previous similar message
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (10)
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Skipped 1 previous similar message
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 4 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785579558 sent from share3-OST0010-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 36s ago has timed out (36s prior to deadline).
req@ffff8107eb840000 x1427389785579558/t0 o8->share3-OST0010_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364148083 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 91 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection restored to service share3-OST000f using nid 172.26.8.140@o2ib. Lustre: Skipped 25 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 35s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 32 previous similar messages Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 28 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 7 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Skipped 1 previous similar message LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (17) LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000  
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785697760 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 65s ago has timed out (65s prior to deadline). req@ffff8103369a9c00 x1427389785697760/t0 o400->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364148693 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 83 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection restored to service share3-OST000f using nid 172.26.8.140@o2ib. Lustre: Skipped 32 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 32 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 10s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 28 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2927:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2927:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785702345 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.141@o2ib 55s ago has timed ou t (55s prior to deadline). req@ffff81021c99b800 x1427389785702345/t0 o400->share3-OST000c_UUID@172.26.8.141@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364149308 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 91 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib. 
Lustre: Skipped 31 previous similar messages Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection to service share3-OST000f via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 31 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 5s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 26 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Skipped 3 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (16) LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Skipped 3 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2926:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2926:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785706723 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 438s ago has timed out (438s prior to deadline).
req@ffff8105f61e6400 x1427389785706723/t0 o3->share3-OST000d_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 4 to 1 dl 1364150254 ref 2 fl Rpc:/2/0 rc -11/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 74 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 23 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 29 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 2s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 13 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 2 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 4 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (29) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785714430 sent from share3-OST000e-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 55s ago has timed ou t (55s prior to deadline). req@ffff81074e8ac800 x1427389785714430/t0 o400->share3-OST000e_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364150858 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 100 previous similar messages Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection restored to service share3-OST0010 using nid 172.26.8.140@o2ib. Lustre: Skipped 38 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. 
Lustre: Skipped 39 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 4 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 11s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 28 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 6 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785719358 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 31s ago has timed out (31s prior to deadline). req@ffff810358eb8800 x1427389785719358/t0 o400->share3-OST000f_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364151459 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 93 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 30 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib. Lustre: Skipped 28 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 2 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 9s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 29 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Skipped 1 previous similar message LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (25) LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 5 previous similar messages Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785724185 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 29s ago has timed ou t (29s prior to deadline). 
req@ffff8107c1232800 x1427389785724185/t0 o8->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364152090 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 76 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 26 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib. Lustre: Skipped 28 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (16) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 9s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 17 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 4 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (31) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc 
ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81030da64000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81048a462000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106486a2000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 8 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib. Lustre: Skipped 31 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785727753 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 1034s ago has timed out (1034s prior to deadline). req@ffff8105f61e6400 x1427389785727753/t0 o3->share3-OST000d_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 5 to 1 dl 1364153568 ref 2 fl Rpc:/2/0 rc -11/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 80 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 31 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785735827 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 26s ago has failed due to network error (127s prior to deadline).
req@ffff81075bc22800 x1427389785735827/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364153755 ref 1 fl Rpc:N/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785737295 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 28s ago has failed d ue to network error (119s prior to deadline). req@ffff81026cad5800 x1427389785737295/t0 o400->share3-OST000f_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364153922 ref 1 fl Rpc:N/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 9 previous similar messages Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection to service share3-OST000f via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. 
Lustre: Skipped 5 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 41s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 13 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Skipped 1 previous similar message LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (21) LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810672d16000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81068ae70000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81063beaa000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105e75de000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81063bef6000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102838fa000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785738319 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 627s ago has timed o ut (627s prior to deadline). 
req@ffff8105f61e6400 x1427389785738319/t0 o3->share3-OST000d_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 5 to 1 dl 1364154575 ref 2 fl Rpc:/2/0 rc -11/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 24 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 7 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 14 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000d-osc-ffff8107f8008c00: tried all connections, increasing latency to 2s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 4 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib.
The ost_connect operation failed with -16 LustreError: Skipped 1 previous similar message LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (17) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 4s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 24 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 2 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: tx_queue, 3 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (53) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106bde28000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810441074000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (20) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 16s 
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 18 previous similar messages Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785747141 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 1s ago has failed du e to network error (60s prior to deadline). req@ffff810751d1a800 x1427389785747141/t0 o8->share3-OST000f_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364155266 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 34 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 109 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib. Lustre: Skipped 33 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 5 previous similar messages LustreError: 167-0: This client was evicted by share3-OST000f; in progress operations using this service will fail. 
LustreError: 22620:0:(rw.c:1341:ll_issue_page_read()) page ffff81010ecf04c8 map ffff8102d2d38a30 index 16640 flags 438100000000a21 count 7 priv ffff810475837620: read queue failed: rc -5 LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff8108135f2400 x1427389785747753/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 LustreError: 25050:0:(ldlm_resource.c:519:ldlm_namespace_cleanup()) Namespace share3-OST000f-osc-ffff8107f8008c00 resource refcount nonzero (1) after lock cleanup; forcing cleanup. LustreError: 25050:0:(ldlm_resource.c:524:ldlm_namespace_cleanup()) Resource: ffff81015953c500 (28447854/0/0/0) (rc: 1) LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff8107f0751000 x1427389785747755/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810850840000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810642c2a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102c635c000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8107d9ee4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810571896000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81084c622000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out
tx: active_txs, 3 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (26) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 24 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785750871 sent from share3-OST000e-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 97s ago has timed ou t (97s prior to deadline). req@ffff8105f61e6800 x1427389785750871/t0 o400->share3-OST000e_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364155840 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 59 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. 
Lustre: Skipped 25 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 55s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 19 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81076d1c0000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 5 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106317a6000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81016e432000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105f1da6000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81026e876000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81085d3d4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81012bd24000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc 
ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (21) Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785755775 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 14s ago has failed due to network error (95s prior to deadline). req@ffff81052baa7400 x1427389785755775/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364156555 ref 1 fl Rpc:N/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 76 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 23 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib.
Lustre: Skipped 24 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 71s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 11 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105f1da6000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104212b2000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101638e0000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810500b8a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810170f2c000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81064aec4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104ee3b0000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102efe76000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81016047c000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA 
with 172.26.8.140@o2ib (13) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785759767 sent from share3-OST0010-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 46s ago has timed ou t (46s prior to deadline). req@ffff81027be21000 x1427389785759767/t0 o8->share3-OST0010_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364157084 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 71 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 24 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 60s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 16 previous similar messages LustreError: 167-0: This client was evicted by share3-OST000f; in progress operations using this service will fail. LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff81066aadd800 x1427389785758010/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e  1 to 1 dl 1364157069 ref 2 fl Rpc:EX/0/0 rc -4/0 LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) Skipped 26 previous similar messages LustreError: 23379:0:(rw.c:1341:ll_issue_page_read()) page ffff81011bf605a8 map ffff8102d2d38a30 index 37120 flags 7f8100000000a21 count 7 priv ffff810738e53660: read queue failed:  rc -5 LustreError: 26704:0:(ldlm_resource.c:519:ldlm_namespace_cleanup()) Namespace share3-OST000f-osc-ffff8107f8008c00 resource refcount nonzero (1) after lock cleanup; forcing cleanup. 
LustreError: 26704:0:(ldlm_resource.c:524:ldlm_namespace_cleanup()) Resource: ffff81015953c500 (28447854/0/0/0) (rc: 1) LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff81087e994000 x1427389785760214/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection restored to service share3-OST000f using nid 172.26.8.140@o2ib. Lustre: Skipped 20 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103d3c48000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810260760000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 3 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (17) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (42) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib.
The ost_connect operation failed with -16 LustreError: Skipped 3 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785764376 sent from share3-OST0010-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 31s ago has timed out (31s prior to deadline). req@ffff81051e02b800 x1427389785764376/t0 o400->share3-OST0010_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364157734 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 58 previous similar messages Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 19 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (4) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81054bc8e000 Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib.
Lustre: Skipped 24 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 9s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 22 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81084374a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810237516000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105c399e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81038de82000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105f96ea000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105c399e000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81085b05a000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 2 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 6 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785768970 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 35s ago has timed out (35s prior to deadline). req@ffff8104596a5800 x1427389785768970/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364158363 ref 2 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 74 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 26 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (45) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib.
Lustre: Skipped 26 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 21s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 12 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81046b0e2000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 167-0: This client was evicted by share3-OST000f; in progress operations using this service will fail. LustreError: 23379:0:(rw.c:1341:ll_issue_page_read()) page ffff810110da90e8 map ffff8102d2d38a30 index 53760 flags 4d0100000000a21 count 7 priv ffff810322f5de00: read queue failed:  rc -5 LustreError: 28230:0:(ldlm_resource.c:519:ldlm_namespace_cleanup()) Namespace share3-OST000f-osc-ffff8107f8008c00 resource refcount nonzero (1) after lock cleanup; forcing cleanup. 
LustreError: 28230:0:(ldlm_resource.c:524:ldlm_namespace_cleanup()) Resource: ffff81015953c500 (28447854/0/0/0) (rc: 1) LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff81063326b400 x1427389785772077/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102286ac000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81072b2d8000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81030d2de000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 64 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785773170 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 62s ago has timed out (62s prior to deadline). req@ffff8107e33aa000 x1427389785773170/t0 o8->share3-OST000f_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364158994 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 78 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib.
Lustre: Skipped 28 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 30 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 57s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 10 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81062e266000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81048c6dc000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104dd9fc000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810615280000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103de75c000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8107a52b4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105c7ed6000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810463046000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError:
9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810615280000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 9 previous similar messages Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection restored to service share3-OST000f using nid 172.26.8.140@o2ib. Lustre: Skipped 28 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: tx_queue, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (52) Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785778584 sent from share3-OST0010-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 27s ago has failed due to network error (105s prior to deadline). req@ffff8103dacec400 x1427389785778584/t0 o400->share3-OST0010_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364159758 ref 1 fl Rpc:N/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81086231c000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810850ee2000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 42 previous similar messages Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 28 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105c7ed6000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81040482e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810757704000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81026ee78000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101f46cc000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810821170000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104f4336000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (30) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 39s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection restored to service share3-OST0010 using nid 172.26.8.140@o2ib. 
Lustre: Skipped 17 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785782720 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 123s ago has timed out (123s prior to deadline). req@ffff810220b38c00 x1427389785782720/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364160301 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 38 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 19 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 7 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 5s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 17 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810706852000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81017c5e4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81017c5e6000 LustreError:
9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81054c452000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81059c4b2000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810321338000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81032133a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81064fe68000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810321304000 Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib. Lustre: Skipped 28 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785787189 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 109s ago has timed out (109s prior to deadline). req@ffff810342148800 x1427389785787189/t0 o400->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364160937 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 94 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 30 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 8 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103c21f6000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0011-osc-ffff8107f8008c00: tried all connections, increasing latency to 23s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 13 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785791951 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 17s ago has failed due to network error (61s prior to deadline). req@ffff8106e4d1f800 x1427389785791951/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364161589 ref 1 fl Rpc:N/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 65 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib.
Lustre: Skipped 23 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103412fa000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81061bde0000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810619f60000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810838144000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810619f62000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810838146000 Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 23 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 2 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (11) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 25s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 15 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib.
The ost_connect operation failed with -16 LustreError: Skipped 1 previous similar message LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (39) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103becf4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810619f60000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810619f62000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81027def4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103412fa000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785795410 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 144s ago has timed out (144s prior to deadline). req@ffff8105cbbe8c00 x1427389785795410/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 0 to 1 dl 1364162163 ref 2 fl Rpc:/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 58 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 2 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib.
Lustre: Skipped 20 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 20 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (17) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102a30e6000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81024c2ac000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81080471e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81024c2ae000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81060d72c000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810161c50000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81060d72e000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785798034 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 615s ago has timed out (615s prior to deadline).
req@ffff8105f61e6400 x1427389785798034/t0 o3->share3-OST000d_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 5 to 1 dl 1364162975 ref 2 fl Rpc:/2/0 rc -11/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 36 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 7 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 13 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 106 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib.
The ost_connect operation failed with -16 LustreError: Skipped 2 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 32s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 7 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 3 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (15) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810522b42000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106cae0a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105afc64000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102d7176000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102d7172000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102d7174000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81059aa88000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (28) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc 
ffff8103fc5d6000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785805882 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 130s ago has timed out (130s prior to deadline). req@ffff81079dfd8400 x1427389785805882/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 0 to 1 dl 1364163585 ref 2 fl Rpc:/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 76 previous similar messages Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 21 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib. Lustre: Skipped 20 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib.
The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 18s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 23 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (18) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2928:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2928:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (21) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9951:0:(import.c:855:ptlrpc_connect_interpret()) share3-OST000f_UUID@172.26.8.140@o2ib changed server handle from 0x9d26d1a52b54be2 to 0x9d26d1a52ba9e7a LustreError: 167-0: This client was evicted by share3-OST000f; in progress operations using this service will fail.
LustreError: 22620:0:(rw.c:1341:ll_issue_page_read()) page ffff810104a81208 map ffff8102d2d38a30 index 166400 flags 150100000000a21 count 7 priv ffff81063fbd1e40: read queue failed: rc -5 LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff810868fdf400 x1427389785810762/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 LustreError: 22367:0:(rw.c:1341:ll_issue_page_read()) page ffff810104a81208 map ffff8102d2d38a30 index 166400 flags 150100000000a21 count 5 priv ffff81063fbd1e40: read queue failed: rc -5 LustreError: 533:0:(ldlm_resource.c:519:ldlm_namespace_cleanup()) Namespace share3-OST000f-osc-ffff8107f8008c00 resource refcount nonzero (1) after lock cleanup; forcing cleanup. LustreError: 533:0:(ldlm_resource.c:524:ldlm_namespace_cleanup()) Resource: ffff81015953c500 (28447854/0/0/0) (rc: 1) Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785810453 sent from share3-OST0010-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 57s ago has timed out (57s prior to deadline). req@ffff8102af8e6800 x1427389785810453/t0 o400->share3-OST0010_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364164186 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 74 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 27 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 27 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib.
The ost_connect operation failed with -16 LustreError: Skipped 5 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810204262000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81038bc60000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105522ee000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101416a8000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104a9ac8000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 13s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 20 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (25) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (5) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 2 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785814405 sent from 
share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 148s ago has timed out (148s prior to deadline). req@ffff8104d6ee7000 x1427389785814405/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 0 to 1 dl 1364164807 ref 2 fl Rpc:/2/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 83 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 21 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib. Lustre: Skipped 20 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 36s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 30 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81066b028000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810144946000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785817861 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 1305s ago has timed out (1305s prior to deadline). req@ffff8107faa18800 x1427389785817861/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 4 to 1 dl 1364166483 ref 2 fl Rpc:/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 56 previous similar messages Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection to service share3-OST000f via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 13 previous similar messages Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection restored to service share3-OST000f using nid 172.26.8.140@o2ib. 
Lustre: Skipped 15 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000d-osc-ffff8107f8008c00: tried all connections, increasing latency to 15s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 7 previous similar messages Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785830102 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 7s ago has timed out (7s prior to deadline). req@ffff8103bca9d000 x1427389785830102/t0 o8->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364166566 ref 2 fl Rpc:N/0/0 rc 0/0 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 12 previous similar messages Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 2 previous similar messages Lustre: Skipped 3 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib. Lustre: Skipped 1 previous similar message LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 2 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 5s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 6 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810667c98000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785831198 sent from share3-OST000e-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 43s ago has timed out (43s prior to deadline). req@ffff810450b85400 x1427389785831198/t0 o400->share3-OST000e_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364166771 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 23 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. 
Lustre: Skipped 9 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810775eb6000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810775ebe000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810221004000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8107b2906000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102af542000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 176 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 15s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 3 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib. Lustre: Skipped 11 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810775eb6000 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785833566 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 31s ago has timed ou t (31s prior to deadline). 
req@ffff8104f44e4c00 x1427389785833566/t0 o8->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364167072 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 34 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 11 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 17s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 9 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib. Lustre: Skipped 11 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8107d53c4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e559c000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810775ebe000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104ddf3e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81036e070000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 2 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785837870 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 28s ago has failed due to network error (113s prior to deadline). req@ffff8106a4f0a400 x1427389785837870/t0 o400->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364167766 ref 1 fl Rpc:N/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 46 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 26 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. 
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 28 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81043acfc000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81043acfc000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81043acf8000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 10 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 60s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 7 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (37) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785878445 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 65s ago has timed ou t (65s prior to deadline). 
req@ffff81038e7c5c00 x1427389785878445/t0 o8->share3-OST000f_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364168288 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 73 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib. Lustre: Skipped 23 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 26 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (13) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104aa1da000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81043acf8000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 2 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 6 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 49s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 7 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785882538 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 113s ago has timed out (113s prior to deadline). req@ffff8105e4290800 x1427389785882538/t0 o400->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364168891 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 46 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81051bfae000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81051bfac000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810572812000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810343d22000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105a1240000 Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection restored to service share3-OST0010 using nid 172.26.8.140@o2ib. Lustre: Skipped 24 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. 
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106d5d92000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105fcd06000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81053b6a6000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106d5d94000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810775df4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810343d22000 Lustre: Skipped 25 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81074b57c000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104d4ccc000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810150c62000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81012290e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103ab7ee000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810513c72000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 59s Lustre: 
9952:0:(import.c:517:import_select_connection()) Skipped 15 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785887335 sent from share3-OST000e-osc-ffff8107f8008c00 to NID 172.26.8.141@o2ib 121s ago has timed out (121s prior to deadline). req@ffff81037babac00 x1427389785887335/t0 o400->share3-OST000e_UUID@172.26.8.141@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364169499 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 100 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81051ca52000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106771e4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810718c44000 Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection restored to service share3-OST000f using nid 172.26.8.140@o2ib. Lustre: Skipped 18 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81045b702000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105f4af8000 Lustre: Skipped 18 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 1 previous similar message LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (4) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105e796e000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810763e5e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105e2224000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 18s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 19 previous similar messages Lustre: 9951:0:(import.c:855:ptlrpc_connect_interpret()) share3-OST000c_UUID@172.26.8.140@o2ib changed server handle from 0x9d26d1a2de9ec8a to 0x9d26d1a52c03281 LustreError: 167-0: This client was evicted by share3-OST000c; in progress operations using this service will fail. LustreError: 167-0: This client was evicted by share3-OST000e; in progress operations using this service will fail. Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785892577 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 42s ago has timed ou t (42s prior to deadline). 
req@ffff81070e202800 x1427389785892577/t0 o8->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364170122 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 57 previous similar messages LustreError: 167-0: This client was evicted by share3-OST0011; in progress operations using this service will fail. LustreError: 167-0: This client was evicted by share3-OST0010; in progress operations using this service will fail. LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (10) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 18 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81068d9ac000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101e0340000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81068d9a8000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81068d9aa000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103e6638000 Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. 
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81068d9aa000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81068d9ac000 Lustre: Skipped 19 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785897715 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 21s ago has failed due to network error (69s prior to deadline). req@ffff8102a1c6b000 x1427389785897715/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364170822 ref 1 fl Rpc:N/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101f3c20000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101f3c22000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104cc3f8000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810153dfa000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81030fcbe000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 58 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0011-osc-ffff8107f8008c00: tried all connections, increasing latency to 39s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 16 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib. Lustre: Skipped 19 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101bb940000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81046e26c000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81067432a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81073e038000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810620842000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (15) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. 
Lustre: Skipped 26 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785902441 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 32s ago has timed ou t (32s prior to deadline). req@ffff81077349b800 x1427389785902441/t0 o8->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364171432 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 91 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106a0d74000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104a551c000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104a5500000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810796654000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102354d8000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 7 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0011-osc-ffff8107f8008c00: tried all connections, increasing latency to 25s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 22 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib. 
Lustre: Skipped 22 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81032cb98000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81043ff0a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810796654000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102354d8000 Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81043ff0a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81045fa5c000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81045fa58000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102ae41a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106a0d74000 Lustre: 2930:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2930:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 4 previous similar messages Lustre: Skipped 20 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 3 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (13) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc 
ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785907208 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 133s ago has timed out (133s prior to deadline). req@ffff81077214a000 x1427389785907208/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 0 to 1 dl 1364172145 ref 2 fl Rpc:/2/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 70 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 3 previous similar messages Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection restored to service share3-OST000f using nid 172.26.8.140@o2ib. Lustre: Skipped 17 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81059a050000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (6) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810233bfa000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81015d2e8000 Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 22 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 54s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 11 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785911540 sent from share3-OST0010-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 113s ago has timed out (113s prior to deadline). req@ffff8105b5a95c00 x1427389785911540/t0 o400->share3-OST0010_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364172748 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 57 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 3 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (20) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 167-0: This client was evicted by share3-OST000f; in progress operations using this service will fail. LustreError: 22367:0:(rw.c:1341:ll_issue_page_read()) page ffff81010dda1610 map ffff810134b4ae70 index 93952 flags 3f0100000000a21 count 7 priv ffff8107bf10fc30: read queue failed: rc -5 LustreError: 22620:0:(rw.c:1341:ll_issue_page_read()) page ffff81010dda1610 map ffff810134b4ae70 index 93952 flags 3f0100000000a21 count 6 priv ffff8107bf10fc30: read queue failed: rc -5 LustreError: 11981:0:(ldlm_resource.c:519:ldlm_namespace_cleanup()) Namespace share3-OST000f-osc-ffff8107f8008c00 resource refcount nonzero (1) after lock cleanup; forcing cleanup.
LustreError: 11981:0:(ldlm_resource.c:524:ldlm_namespace_cleanup()) Resource: ffff8107ea3be500 (28447855/0/0/0) (rc: 1) Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib. Lustre: Skipped 26 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 6 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (10) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9951:0:(import.c:855:ptlrpc_connect_interpret()) share3-OST0010_UUID@172.26.8.140@o2ib changed server handle from 0x9d26d1a52c04209 to 0x9d26d1a52c3495e LustreError: 167-0: This client was evicted by share3-OST0010; in progress operations using this service will fail. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 26 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 8s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 16 previous similar messages Lustre: 9951:0:(import.c:855:ptlrpc_connect_interpret()) share3-OST000f_UUID@172.26.8.140@o2ib changed server handle from 0x9d26d1a52c32ca8 to 0x9d26d1a52c35dcb LustreError: 167-0: This client was evicted by share3-OST000f; in progress operations using this service will fail. LustreError: 22367:0:(rw.c:1341:ll_issue_page_read()) page ffff8101087b1db0 map ffff8102d2d38a30 index 6144 flags 268100000000a21 count 7 priv ffff81070c778c70: read queue failed: rc -5 LustreError: 12262:0:(ldlm_resource.c:519:ldlm_namespace_cleanup()) Namespace share3-OST000f-osc-ffff8107f8008c00 resource refcount nonzero (1) after lock cleanup; forcing cleanup. LustreError: 12262:0:(ldlm_resource.c:524:ldlm_namespace_cleanup()) Resource: ffff8107e7336800 (28447854/0/0/0) (rc: 1) LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff81012849d400 x1427389785953866/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) Skipped 28 previous similar messages LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff81012849dc00 x1427389785953868/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) Skipped 1 previous similar message Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785954181 sent from share3-OST000e-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 49s ago has timed out (49s prior to deadline).
req@ffff81057d8ab800 x1427389785954181/t0 o400->share3-OST000e_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364173352 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 68 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 123 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 3 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (25) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103686a2000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105892ac000 Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib. Lustre: Skipped 23 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105c1e5c000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810677558000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810389c48000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810173cc6000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101a705a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103244d4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103686a2000 Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 23 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000d-osc-ffff8107f8008c00: tried all connections, increasing latency to 35s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 16 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785958312 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 55s ago has timed out (55s prior to deadline). req@ffff8105b2c9a800 x1427389785958312/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364173958 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 59 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (19) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 Lustre: 22367:0:(rw.c:2323:ll_readpage()) ino 576389176 page 16042 (65708032) not covered by a lock (mmap?). check debug logs. LustreError: Skipped 3 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib.
Lustre: Skipped 22 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101a705a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810173cc6000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105c1e5c000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81039356a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81084793e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103c10d0000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 3 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (22) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810342a06000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81070ce34000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 24s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 28 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: 
Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 21 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785963174 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 9s ago has failed due to network error (67s prior to deadline). req@ffff8104af7c1800 x1427389785963174/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364174645 ref 1 fl Rpc:N/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 85 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106ea146000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810777620000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101342e2000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810656f58000 Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib. Lustre: Skipped 22 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib.
The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810300194000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 20s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 14 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 29 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785967061 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 55s ago has timed out (55s prior to deadline). req@ffff810425e37400 x1427389785967061/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364175204 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 95 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib.
Lustre: Skipped 24 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (3) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 3 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 167-0: This client was evicted by share3-OST000c; in progress operations using this service will fail. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (24) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 48s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 27 previous similar messages LustreError: 167-0: This client was evicted by share3-OST000f; in progress operations using this service will fail.
LustreError: 22367:0:(rw.c:1341:ll_issue_page_read()) page ffff810114eb7770 map ffff8102d2d38a30 index 28928 flags 5f8100000000a21 count 7 priv ffff8105d460a0e0: read queue failed: rc -5 LustreError: 14807:0:(ldlm_resource.c:519:ldlm_namespace_cleanup()) Namespace share3-OST000f-osc-ffff8107f8008c00 resource refcount nonzero (1) after lock cleanup; forcing cleanup. LustreError: 14807:0:(ldlm_resource.c:524:ldlm_namespace_cleanup()) Resource: ffff8107e7336800 (28447854/0/0/0) (rc: 1) LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff810868fdf400 x1427389785971687/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) Skipped 25 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785971452 sent from share3-OST000e-osc-ffff8107f8008c00 to NID 172.26.8.141@o2ib 65s ago has timed out (65s prior to deadline). req@ffff8104d2fa3000 x1427389785971452/t0 o400->share3-OST000e_UUID@172.26.8.141@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364175818 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 84 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (32) Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103c3c44000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104720ec000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8107c187c000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810362874000 Lustre: Skipped 24 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 24 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 5 previous similar messages LustreError: 167-0: This client was evicted by share3-OST000c; in progress operations using this service will fail. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785976250 sent from share3-OST000e-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 39s ago has timed out (39s prior to deadline).
req@ffff8101dd5f4400 x1427389785976250/t0 o8->share3-OST000e_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364176431 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 57 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 53s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 7 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (45) Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 25 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib.
Lustre: Skipped 23 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (3) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9951:0:(import.c:855:ptlrpc_connect_interpret()) share3-OST000f_UUID@172.26.8.140@o2ib changed server handle from 0x9d26d1a52c59056 to 0x9d26d1a52c691ea LustreError: 167-0: This client was evicted by share3-OST000f; in progress operations using this service will fail. LustreError: 16046:0:(ldlm_resource.c:519:ldlm_namespace_cleanup()) Namespace share3-OST000f-osc-ffff8107f8008c00 resource refcount nonzero (3) after lock cleanup; forcing cleanup. LustreError: 22620:0:(file.c:999:ll_glimpse_size()) obd_enqueue returned rc -5, returning -EIO LustreError: 22620:0:(llite_mmap.c:210:ll_tree_unlock()) couldn't unlock -5 LustreError: 16046:0:(ldlm_resource.c:524:ldlm_namespace_cleanup()) Resource: ffff8107e7336800 (28447854/0/0/0) (rc: 1) LustreError: 23379:0:(llite_mmap.c:210:ll_tree_unlock()) couldn't unlock -5 LustreError: 23379:0:(llite_mmap.c:210:ll_tree_unlock()) Skipped 1 previous similar message LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (49) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 
9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (10) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785980478 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.141@o2ib 113s ago has timed out (113s prior to deadline). req@ffff810273112c00 x1427389785980478/t0 o400->share3-OST000f_UUID@172.26.8.141@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364177041 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 65 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 49s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 23 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (50) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81043a32a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101f889c000 Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 22 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 3 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (14) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 25 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 5 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101bf80a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785985705 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 14s ago has failed due to network error (65s prior to deadline).
req@ffff81074a78a400 x1427389785985705/t0 o400->share3-OST000f_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364177718 ref 1 fl Rpc:N/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103c7cfa000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81087c1f0000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101f52ea000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105d9c04000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 81 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 3 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (44) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 44s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 24 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 21 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (46) Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 4 previous similar messages Lustre: Skipped 27 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (0) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785990131 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 50s ago has timed out (50s prior to deadline). req@ffff8107f6b4fc00 x1427389785990131/t0 o8->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364178308 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 67 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000d-osc-ffff8107f8008c00: tried all connections, increasing latency to 33s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 17 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810309f1e000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib.
The ost_connect operation failed with -16 LustreError: Skipped 3 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib. Lustre: Skipped 15 previous similar messages LustreError: 167-0: This client was evicted by share3-OST000c; in progress operations using this service will fail. Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 17 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810200dc8000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104a20bc000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810264904000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 11 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81065f52a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810475dc2000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810174a98000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81051759a000 Lustre: 9951:0:(import.c:855:ptlrpc_connect_interpret()) 
share3-OST000c_UUID@172.26.8.140@o2ib changed server handle from 0x9d26d1a52c777c4 to 0x9d26d1a52c7bfbc LustreError: 167-0: This client was evicted by share3-OST000c; in progress operations using this service will fail. LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (18) Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389785996262 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 18s ago has failed d ue to network error (75s prior to deadline). req@ffff81025ee45400 x1427389785996262/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364179028 ref 1 fl Rpc:N/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 77 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 30s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 23 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection restored to service share3-OST0010 using nid 172.26.8.140@o2ib. Lustre: Skipped 23 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. 
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 24 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (47) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9951:0:(import.c:855:ptlrpc_connect_interpret()) share3-OST000f_UUID@172.26.8.140@o2ib changed server handle from 0x9d26d1a52c691ea to 0x9d26d1a52c8b0be LustreError: 167-0: This client was evicted by share3-OST000f; in progress operations using this service will fail. LustreError: 23379:0:(rw.c:1341:ll_issue_page_read()) page ffff810108d1f738 map ffff8102d2d38a30 index 61696 flags 280100000000a21 count 5 priv ffff8101ee599a50: read queue failed:  rc -5 LustreError: 19319:0:(ldlm_resource.c:519:ldlm_namespace_cleanup()) Namespace share3-OST000f-osc-ffff8107f8008c00 resource refcount nonzero (1) after lock cleanup; forcing cleanup. LustreError: 19319:0:(ldlm_resource.c:524:ldlm_namespace_cleanup()) Resource: ffff81012515ab00 (28447854/0/0/0) (rc: 1) LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff810163126400 x1427389785999838/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e  0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 Lustre: 22620:0:(rw.c:2323:ll_readpage()) ino 576389180 page 2306 (9445376) not covered by a lock (mmap?). check debug logs. 
Lustre: 22620:0:(rw.c:2323:ll_readpage()) Skipped 2 previous similar messages Lustre: 22367:0:(rw.c:2323:ll_readpage()) ino 576389180 page 2306 (9445376) not covered by a lock (mmap?). check debug logs. Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786000743 sent from share3-OST000e-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 21s ago has failed due to network error (75s prior to deadline). req@ffff810389458c00 x1427389786000743/t0 o400->share3-OST000e_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364179628 ref 1 fl Rpc:N/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 95 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (25) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0011-osc-ffff8107f8008c00: tried all connections, increasing latency to 11s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 37 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib. Lustre: Skipped 30 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 7 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (39) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105d593a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81046f3b8000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81019d418000 Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 33 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 26875:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -11 from cancel RPC: canceling anyway LustreError: 26875:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -11 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81048bcca000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81083cb1a000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81052e37c000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810495dde000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105b85f0000 
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8107b3a7a000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 207 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81079c454000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81048fbe8000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810201eb2000 LustreError: 7925:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -11 from cancel RPC: canceling anyway LustreError: 20014:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -11 LustreError: 29048:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -11 LustreError: 7925:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Skipped 3 previous similar messages LustreError: 167-0: This client was evicted by share3-OST000c; in progress operations using this service will fail. 
LustreError: 23379:0:(rw.c:1341:ll_issue_page_read()) page ffff810105765ed8 map ffff8102d2d38a30 index 63489 flags 188100000000a21 count 5 priv ffff8107d045f490: read queue failed: rc -5 LustreError: 22367:0:(rw.c:1341:ll_issue_page_read()) page ffff8101140f4630 map ffff810134b4ae70 index 11264 flags 5b8100000000821 count 5 priv ffff810876bb51f0: read queue failed: rc -5 LustreError: 22367:0:(rw.c:1341:ll_issue_page_read()) page ffff8101140f4630 map ffff810134b4ae70 index 11264 flags 5b8100000000821 count 5 priv ffff810876bb51f0: read queue failed: rc -5 LustreError: 22620:0:(rw.c:1341:ll_issue_page_read()) page ffff8101140f4630 map ffff810134b4ae70 index 11264 flags 5b8100000000821 count 5 priv ffff810876bb51f0: read queue failed: rc -5 LustreError: 22620:0:(rw.c:1341:ll_issue_page_read()) page ffff8101140f4630 map ffff810134b4ae70 index 11264 flags 5b8100000000821 count 5 priv ffff810876bb51f0: read queue failed: rc -5 LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff8103c9b2fc00 x1427389786005107/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff810151744000 x1427389786005112/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) Skipped 2 previous similar messages LustreError: 20272:0:(ldlm_resource.c:519:ldlm_namespace_cleanup()) Namespace share3-OST000f-osc-ffff8107f8008c00 resource refcount nonzero (1) after lock cleanup; forcing cleanup. 
LustreError: 20272:0:(ldlm_resource.c:519:ldlm_namespace_cleanup()) Skipped 1 previous similar message LustreError: 20272:0:(ldlm_resource.c:524:ldlm_namespace_cleanup()) Resource: ffff81012515ab00 (28447854/0/0/0) (rc: 1) LustreError: 20272:0:(ldlm_resource.c:524:ldlm_namespace_cleanup()) Skipped 1 previous similar message Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786006398 sent from share3-OST0010-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 15s ago has failed d ue to network error (119s prior to deadline). req@ffff810522880000 x1427389786006398/t0 o400->share3-OST0010_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364180397 ref 1 fl Rpc:N/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81041d58e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81083cb1a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101e2972000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106ca812000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 55 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 52s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 8 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib. Lustre: Skipped 25 previous similar messages Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. 
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 22 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Skipped 1 previous similar message LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (23) LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106ad89e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103382ba000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 8 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786008849 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 807s ago has timed o ut (807s prior to deadline). req@ffff8105f61e6400 x1427389786008849/t0 o3->share3-OST000d_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 6 to 1 dl 1364181432 ref 2 fl Rpc:/2/0 rc -11/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 54 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. 
Lustre: Skipped 6 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000d-osc-ffff8107f8008c00: tried all connections, increasing latency to 37s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 11 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 12 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (32) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81076a22e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101b5bda000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810446cd2000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81032bb90000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81062d172000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81027c828000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81085043c000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105b1826000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810854bea000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 2 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786019644 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 9s ago has failed du e to network error (131s prior to deadline). req@ffff81031dc9fc00 x1427389786019644/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364182284 ref 1 fl Rpc:N/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 47 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. 
Lustre: Skipped 22 previous similar messages Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection restored to service share3-OST0010 using nid 172.26.8.140@o2ib. Lustre: Skipped 21 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 49s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 8 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81030eff0000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101d0598000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81047faa2000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101d059a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102dea2c000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786020604 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 804s ago has timed o ut (804s prior to deadline). req@ffff8105f61e6400 x1427389786020604/t0 o3->share3-OST000d_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 6 to 1 dl 1364183107 ref 2 fl Rpc:/2/0 rc -11/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 20 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. 
Lustre: Skipped 6 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 7 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 1 previous similar message LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 3 seconds LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Skipped 1 previous similar message LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (49) LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 44s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 3 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810121d74000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810858080000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105f38b8000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104d4874000 
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (1) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786029885 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 115s ago has timed o ut (115s prior to deadline). req@ffff81047ac1d400 x1427389786029885/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364183718 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 58 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 23 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib. 
Lustre: Skipped 22 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81085c964000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81056ddf8000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810593526000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105ed754000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0011-osc-ffff8107f8008c00: tried all connections, increasing latency to 46s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 31 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 3 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (30) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81056ddf8000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 21 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786034374 sent from share3-OST0010-osc-ffff8107f8008c00 to NID 172.26.8.141@o2ib 77s ago has timed out (77s prior to deadline). req@ffff810868fdf400 x1427389786034374/t0 o400->share3-OST0010_UUID@172.26.8.141@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364184348 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 97 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. 
Lustre: Skipped 23 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2930:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810810e26000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8107ea568000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106c7104000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810810e24000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 19s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 30 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81083909a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104f1f84000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104f1f86000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810255b20000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810463b42000 LustreError: 
9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102dbce0000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 3 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786050495 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 802s ago has timed o ut (802s prior to deadline). req@ffff8105f61e6400 x1427389786050495/t0 o3->share3-OST000d_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 6 to 1 dl 1364185539 ref 2 fl Rpc:/2/0 rc -11/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 51 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 19 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. 
Lustre: Skipped 22 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81061fb4c000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102a2c72000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102d7a32000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 34s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 5 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Skipped 2 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (37) LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Skipped 2 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81059e394000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81061fb4c000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102c766e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104a9230000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810673ad2000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106ab308000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810869452000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104d8aac000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81059e394000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81035d07e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810673ad2000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request 
x1427389786060648 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 12s ago has failed due to network error (111s prior to deadline). req@ffff810308615c00 x1427389786060648/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364186239 ref 1 fl Rpc:N/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81024c0cc000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81024c0ce000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81019e1e8000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81035d07e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810144876000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810673ad2000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 47 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 22 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib. 
Lustre: Skipped 24 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 170 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 48s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 9 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 8 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (45) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786064631 sent from share3-OST000e-osc-ffff8107f8008c00 to NID 172.26.8.141@o2ib 142s ago has timed out (141s prior to deadline). req@ffff81012849d400 x1427389786064631/t0 o400->share3-OST000e_UUID@172.26.8.141@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364186790 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 56 previous similar messages LustreError: 167-0: This client was evicted by share3-OST0010; in progress operations using this service will fail. LustreError: Skipped 3 previous similar messages Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection restored to service share3-OST0010 using nid 172.26.8.140@o2ib. 
Lustre: Skipped 13 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103cee88000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8107c9c18000 Lustre: Skipped 19 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786066566 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 806s ago has timed out (806s prior to deadline). req@ffff8105f61e6400 x1427389786066566/t0 o3->share3-OST000d_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 6 to 1 dl 1364187717 ref 2 fl Rpc:/2/0 rc -11/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 8 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 3 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. 
Lustre: Skipped 7 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810749602000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810242cf8000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 19s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 11 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 28s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 6 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 32s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 3 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 1 previous similar message LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 3 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (35) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786080429 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 35s ago has timed out (35s prior to deadline). req@ffff810721811800 x1427389786080429/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 0 to 1 dl 1364188330 ref 2 fl Rpc:/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 86 previous similar messages Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection to service share3-OST000f via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. 
Lustre: Skipped 28 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81035bd7a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101a4e2c000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101a4e24000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81057cf76000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib. Lustre: Skipped 25 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000d-osc-ffff8107f8008c00: tried all connections, increasing latency to 46s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 12 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786084844 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 35s ago has timed out (35s prior to deadline). req@ffff8107e53c0800 x1427389786084844/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364188963 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 63 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. 
Lustre: Skipped 22 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (17) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib. Lustre: Skipped 25 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (6) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0011-osc-ffff8107f8008c00: tried all connections, increasing latency to 12s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 17 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (45) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102dafbc000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102dafb8000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (16) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 3 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (1) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786090105 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.141@o2ib 107s ago has timed out (107s prior to deadline). req@ffff8101b0b80400 x1427389786090105/t0 o400->share3-OST000d_UUID@172.26.8.141@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364189585 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 74 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib. 
Lustre: Skipped 25 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (38) Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81069e5c8000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8107ced12000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103d37ec000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8107ced14000 Lustre: Skipped 27 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 39s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 19 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786095281 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 27s ago has failed due to network error (143s prior to deadline). 
req@ffff8103b96bbc00 x1427389786095281/t0 o400->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364190321 ref 1 fl Rpc:N/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 26 previous similar messages Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection restored to service share3-OST000f using nid 172.26.8.140@o2ib. Lustre: Skipped 17 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810804bfe000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8107bf658000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81029d9b6000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (20) Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. 
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 23 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 27s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 5 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (7) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 167-0: This client was evicted by share3-OST000f; in progress operations using this service will fail. LustreError: 22620:0:(rw.c:1341:ll_issue_page_read()) page ffff810105ad2f70 map ffff8102d2d38a30 index 6400 flags 198100000000a21 count 5 priv ffff8104ec28f770: read queue failed: rc -5 LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff8103c72cd800 x1427389786098044/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e  0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) Skipped 4 previous similar messages LustreError: 29608:0:(ldlm_resource.c:519:ldlm_namespace_cleanup()) Namespace share3-OST000f-osc-ffff8107f8008c00 resource refcount nonzero (1) after lock cleanup; forcing cleanup. 
LustreError: 29608:0:(ldlm_resource.c:519:ldlm_namespace_cleanup()) Skipped 1 previous similar message LustreError: 29608:0:(ldlm_resource.c:524:ldlm_namespace_cleanup()) Resource: ffff81083ccbb380 (28447854/0/0/0) (rc: 1) LustreError: 29608:0:(ldlm_resource.c:524:ldlm_namespace_cleanup()) Skipped 1 previous similar message LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (48) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786098827 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.141@o2ib 121s ago has timed out (121s prior to deadline). req@ffff81050807a400 x1427389786098827/t0 o400->share3-OST000c_UUID@172.26.8.141@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364190824 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 56 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 7 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib. 
Lustre: Skipped 16 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (1) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 18 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 25s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 20 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810466300000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81046d86a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810257a12000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 167-0: This client was evicted by share3-OST000f; in progress operations using this service will fail. 
LustreError: 22367:0:(rw.c:1341:ll_issue_page_read()) page ffff810116d21840 map ffff8102d2d38a30 index 23040 flags 680100000000a21 count 7 priv ffff8103841694d0: read queue failed: rc -5 LustreError: 30387:0:(ldlm_resource.c:519:ldlm_namespace_cleanup()) Namespace share3-OST000f-osc-ffff8107f8008c00 resource refcount nonzero (1) after lock cleanup; forcing cleanup. LustreError: 30387:0:(ldlm_resource.c:524:ldlm_namespace_cleanup()) Resource: ffff81083ccbb380 (28447854/0/0/0) (rc: 1) LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff8108135f2400 x1427389786104291/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) Skipped 29 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786104106 sent from share3-OST0010-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 57s ago has timed out (57s prior to deadline). req@ffff810480108000 x1427389786104106/t0 o400->share3-OST0010_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364191435 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 62 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 24 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810247ac4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810391816000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 3 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection to service share3-OST000f via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 26 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (45) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0011-osc-ffff8107f8008c00: tried all connections, increasing latency to 21s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 10 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104c85e8000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106592c6000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810751540000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101f04b2000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81081f55e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101cda40000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81050f046000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 
172.26.8.140@o2ib Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786108988 sent from share3-OST000e-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 12s ago has failed due to network error (95s prior to deadline). req@ffff81065978e800 x1427389786108988/t0 o400->share3-OST000e_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364192156 ref 1 fl Rpc:N/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104c85e8000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 83 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 22 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 2 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. 
Lustre: Skipped 21 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 7s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 30 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (38) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81039f94e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104f38a2000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786131164 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 29s ago has timed out (29s prior to deadline). req@ffff8101a75b1800 x1427389786131164/t0 o400->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364192682 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 86 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (17) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. 
Lustre: Skipped 28 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 5 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 26 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0011-osc-ffff8107f8008c00: tried all connections, increasing latency to 18s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 20 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786147159 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 55s ago has timed out (55s prior to deadline). 
req@ffff810863811c00 x1427389786147159/t0 o400->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364193283 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 75 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (37) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810658b28000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81040a736000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106c8f56000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81048c758000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 22 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (43) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81058f31a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 6 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 31 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 37s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 18 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102a213a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81075ee44000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786151755 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 85s ago has timed ou t (85s prior to deadline). 
req@ffff81037728f800 x1427389786151755/t0 o400->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364193888 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 83 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (17) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection restored to service share3-OST000f using nid 172.26.8.140@o2ib. Lustre: Skipped 28 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (43) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101a6e3e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104b836c000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81067737a000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810724f1a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810132d60000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101a6e3e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81067737a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104b836c000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810351564000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81040830c000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786154338 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 799s ago has timed out (799s prior to deadline). req@ffff8105f61e6400 x1427389786154338/t0 o3->share3-OST000d_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 6 to 1 dl 1364194936 ref 2 fl Rpc:/2/0 rc -11/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 23 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 24 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib.
Lustre: Skipped 13 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (48) Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 4 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 43s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 7 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 2 previous similar messages Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 5 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 44s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 5 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106a122e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106f6fec000 Lustre: 2933:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810682a10000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81031d6ac000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81032d348000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81048b71c000 Lustre: 2933:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000d-osc-ffff8107f8008c00: tried all connections, increasing latency to 22s LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786165805 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 59s ago has timed ou t (59s prior to deadline). 
req@ffff810380643400 x1427389786165805/t0 o400->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364195537 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 76 previous similar messages Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 12 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (21) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection restored to service share3-OST0010 using nid 172.26.8.140@o2ib. Lustre: Skipped 19 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786166406 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 777s ago has timed out (777s prior to deadline). req@ffff8105f61e6400 x1427389786166406/t0 o3->share3-OST000d_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 6 to 1 dl 1364196324 ref 2 fl Rpc:/2/0 rc -11/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 18 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 4 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib.
Lustre: Skipped 4 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (3) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 2 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 41s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 15 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 3 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786171887 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 796s ago has timed out (796s prior to deadline). req@ffff8105f61e6400 x1427389786171887/t0 o3->share3-OST000d_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 6 to 1 dl 1364197124 ref 2 fl Rpc:/2/0 rc -11/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 8 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 4 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib.
Lustre: Skipped 4 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101c6b94000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81043960c000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 38s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 3 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 43s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 3 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (8) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 30s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 30s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103775fa000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103775f8000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81070d15a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81017fe2e000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 30s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 3 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000d-osc-ffff8107f8008c00: tried all connections, increasing latency to 6s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 5 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81015b102000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103775f8000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81070d15a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103775fa000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib.
The ost_connect operation failed with -16 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786182479 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 12s ago has timed ou t (12s prior to deadline). req@ffff8107ef281400 x1427389786182479/t0 o8->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364197725 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 63 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 20 previous similar messages Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection to service share3-OST000f via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 21 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8107d9b30000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81017fe2e000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 2 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000d-osc-ffff8107f8008c00: tried all connections, increasing latency to 15s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 2 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 2 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (40) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (8) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 45s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 8 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 1 previous similar message Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786187606 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 8s ago has failed due to network error (55s prior to deadline).
req@ffff8106e1153800 x1427389786187606/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364198433 ref 1 fl Rpc:N/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 85 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 27 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib. Lustre: Skipped 24 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 3 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104ba9ae000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102dea18000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81019d9e2000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106d2a14000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81071a31c000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000d-osc-ffff8107f8008c00: tried all connections, increasing latency to 12s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 30 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 22 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786192253 sent from share3-OST0010-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 41s ago has timed ou t (41s prior to deadline). 
req@ffff81069bf08c00 x1427389786192253/t0 o400->share3-OST0010_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364199094 ref 2 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 94 previous similar messages Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 20 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81022f268000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 5 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810488414000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103962ca000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103c94a6000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106c90b2000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81043220c000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message Lustre:
9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 33s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 14 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib. Lustre: Skipped 20 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (0) Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786199236 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 0s ago has failed du e to network error (177s prior to deadline). req@ffff8105cf554c00 x1427389786199236/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 0 to 1 dl 1364200220 ref 2 fl Rpc:/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81051aa94000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 58 previous similar messages Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection to service share3-OST000f via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 19 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (41) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8107ba706000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81062025e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105a74bc000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106bc1ca000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104cdd60000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810259a78000 Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 10 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 60s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 6 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 167-0: This client was evicted by share3-OST000f; in progress operations using this service will fail. 
LustreError: 23379:0:(rw.c:1341:ll_issue_page_read()) page ffff81011b26f270 map ffff810134b4ae70 index 82432 flags 7c0100000000a21 count 7 priv ffff8107c1fc1c30: read queue failed: rc -5 LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff8107ef27a400 x1427389786203097/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff8101aed34800 x1427389786203106/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) Skipped 8 previous similar messages LustreError: 22620:0:(rw.c:1341:ll_issue_page_read()) page ffff81011b26f270 map ffff810134b4ae70 index 82432 flags 7c0100000000a21 count 5 priv ffff8107c1fc1c30: read queue failed: rc -5 LustreError: 22367:0:(rw.c:1341:ll_issue_page_read()) page ffff81011b26f270 map ffff810134b4ae70 index 82432 flags 7c0100000000a21 count 5 priv ffff8107c1fc1c30: read queue failed: rc -5 LustreError: 23379:0:(rw.c:1341:ll_issue_page_read()) page ffff81011b26f270 map ffff810134b4ae70 index 82432 flags 7c0100000000a21 count 6 priv ffff8107c1fc1c30: read queue failed: rc -5 LustreError: 9478:0:(ldlm_resource.c:519:ldlm_namespace_cleanup()) Namespace share3-OST000f-osc-ffff8107f8008c00 resource refcount nonzero (1) after lock cleanup; forcing cleanup. LustreError: 9478:0:(ldlm_resource.c:524:ldlm_namespace_cleanup()) Resource: ffff8104b89466c0 (28447855/0/0/0) (rc: 1) Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786204119 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 18s ago has failed due to network error (99s prior to deadline).
req@ffff8101fb658c00 x1427389786204119/t0 o400->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364200752 ref 1 fl Rpc:N/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 58 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 26 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib.
Lustre: Skipped 21 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81061c798000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810671916000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81039b2de000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810343572000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8107d160a000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101ee1f0000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 22 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0011-osc-ffff8107f8008c00: tried all connections, increasing latency to 15s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 15 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786208291 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 21s ago has timed ou t (21s prior to deadline). 
req@ffff8105dfdb1c00 x1427389786208291/t0 o8->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364201286 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 87 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 25 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 5 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib. 
Lustre: Skipped 29 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 18s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 12 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810671916000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786212035 sent from share3-OST0010-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 65s ago has timed out (65s prior to deadline).
 req@ffff81034479f400 x1427389786212035/t0 o400->share3-OST0010_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364201893 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 87 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810343572000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810308d36000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102af9da000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102af9de000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102ace7e000
Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 30 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810402686000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101ee858000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810343572000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810716858000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101fc222000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 3 previous similar messages
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib.
Lustre: Skipped 26 previous similar messages
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (49)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810402686000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101ee858000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810716858000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101fc222000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810343572000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81081a452000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 57s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 11 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786216406 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 115s ago has timed out (115s prior to deadline).
 req@ffff81038be98000 x1427389786216406/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364202543 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 66 previous similar messages
Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 20 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 4 previous similar messages
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (17)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102af9da000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81042bc10000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810573f42000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81027fa08000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81077a870000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81050502e000
Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection restored to service share3-OST0010 using nid 172.26.8.140@o2ib.
Lustre: Skipped 22 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101fc222000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810343572000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810666d70000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810716858000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810682fbe000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810402686000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103ab742000
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0011-osc-ffff8107f8008c00: tried all connections, increasing latency to 25s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 10 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786221137 sent from share3-OST0010-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 65s ago has timed out (65s prior to deadline).
 req@ffff8101a1c58400 x1427389786221137/t0 o400->share3-OST0010_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364203143 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 65 previous similar messages
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 22 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages
Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection restored to service share3-OST0010 using nid 172.26.8.140@o2ib.
Lustre: Skipped 22 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 3 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 23s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 14 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786225849 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 27s ago has timed out (27s prior to deadline).
 req@ffff8107108fc000 x1427389786225849/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364203764 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 77 previous similar messages
Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 28 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (16)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib.
Lustre: Skipped 31 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 6 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 5 previous similar messages
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0011-osc-ffff8107f8008c00: tried all connections, increasing latency to 12s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 27 previous similar messages
Lustre: 9951:0:(import.c:855:ptlrpc_connect_interpret()) share3-OST000f_UUID@172.26.8.140@o2ib changed server handle from 0x9d26d1a52eca6d6 to 0x9d26d1a531553b1
LustreError: 167-0: This client was evicted by share3-OST000f; in progress operations using this service will fail.
LustreError: 23379:0:(rw.c:1341:ll_issue_page_read()) page ffff81011d23a6e0 map ffff8102d2d38a30 index 21760 flags 850100000000a21 count 5 priv ffff810612c7ecb0: read queue failed: rc -5
LustreError: 22367:0:(rw.c:1341:ll_issue_page_read()) page ffff81010aafc298 map ffff810134b4ae70 index 92676 flags 308100000000821 count 6 priv ffff81058e633dc0: read queue failed: rc -5
LustreError: 22620:0:(rw.c:1341:ll_issue_page_read()) page ffff81010aafc298 map ffff810134b4ae70 index 92676 flags 308100000000821 count 5 priv ffff81058e633dc0: read queue failed: rc -5
LustreError: 22367:0:(rw.c:1341:ll_issue_page_read()) page ffff81010aafc298 map ffff810134b4ae70 index 92676 flags 308100000000821 count 5 priv ffff81058e633dc0: read queue failed: rc -5
LustreError: 22620:0:(rw.c:1341:ll_issue_page_read()) page ffff81010aafc298 map ffff810134b4ae70 index 92676 flags 308100000000821 count 5 priv ffff81058e633dc0: read queue failed: rc -5
LustreError: 12955:0:(ldlm_resource.c:519:ldlm_namespace_cleanup()) Namespace share3-OST000f-osc-ffff8107f8008c00 resource refcount nonzero (1) after lock cleanup; forcing cleanup.
LustreError: 12955:0:(ldlm_resource.c:524:ldlm_namespace_cleanup()) Resource: ffff8107e608b500 (28447854/0/0/0) (rc: 1)
LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff8104f44e4c00 x1427389786229694/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0
LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) Skipped 21 previous similar messages
Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786230215 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 10s ago has timed out (10s prior to deadline).
 req@ffff81056ddf0400 x1427389786230215/t0 o8->share3-OST000f_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364204375 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 97 previous similar messages
Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 27 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (22)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection restored to service share3-OST0010 using nid 172.26.8.140@o2ib.
Lustre: Skipped 29 previous similar messages
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (37)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105e8a04000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101d5134000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81085abc8000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105c4a9e000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102b5912000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 6 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106e7870000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786233236 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 129s ago has timed out (129s prior to deadline).
 req@ffff8104f1186800 x1427389786233236/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 0 to 1 dl 1364204977 ref 2 fl Rpc:/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 74 previous similar messages
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 27 previous similar messages
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 38s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 21 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (16)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection restored to service share3-OST000f using nid 172.26.8.140@o2ib.
Lustre: Skipped 16 previous similar messages
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (17)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810541344000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8107ec3b3a80
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810353008000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103f6b6c000
LustreError: 167-0: This client was evicted by share3-OST0010; in progress operations using this service will fail.
LustreError: 167-0: This client was evicted by share3-OST0011; in progress operations using this service will fail.
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 5 previous similar messages
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786238017 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 126s ago has timed out (125s prior to deadline).
 req@ffff8105ca59e800 x1427389786238017/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364205578 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 49 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 32s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 11 previous similar messages
Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: Skipped 17 previous similar messages
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (9)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib.
Lustre: Skipped 22 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 6 previous similar messages
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786243676 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 12s ago has failed due to network error (75s prior to deadline).
 req@ffff81081738f000 x1427389786243676/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364206327 ref 1 fl Rpc:N/0/0 rc 0/0
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81074aba8000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810175b2c000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106ddb9e000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102d24f4000
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 74 previous similar messages
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000d-osc-ffff8107f8008c00: tried all connections, increasing latency to 27s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 9 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 31 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786245164 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 776s ago has timed out (776s prior to deadline).
 req@ffff8105f61e6400 x1427389786245164/t0 o3->share3-OST000d_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 6 to 1 dl 1364207240 ref 2 fl Rpc:/2/0 rc -11/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 31 previous similar messages
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib.
Lustre: Skipped 26 previous similar messages
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 2s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 10 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 1 previous similar message
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (25)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection restored to service share3-OST000f using nid 172.26.8.140@o2ib.
Lustre: Skipped 5 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib.
Lustre: Skipped 6 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810700c4e000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message
Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786255128 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 29s ago has timed out (29s prior to deadline).
 req@ffff8102376d6400 x1427389786255128/t0 o8->share3-OST000f_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364207863 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 87 previous similar messages
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 27 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection restored to service share3-OST0010 using nid 172.26.8.140@o2ib.
Lustre: Skipped 13 previous similar messages
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 25s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 42 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810174c8e000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103c6f42000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810626ffe000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81066c584000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 2 previous similar messages
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 4 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101e20a4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810504320000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810316782000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102adeb2000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786259109 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 105s ago has timed out (105s prior to deadline).
 req@ffff810501ff8800 x1427389786259109/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364208507 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 85 previous similar messages
Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 24 previous similar messages
Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib.
Lustre: Skipped 24 previous similar messages
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 56s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 13 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103c6f42000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810379332000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81047cb36000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810463312000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 3 previous similar messages
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib.
Lustre: Skipped 19 previous similar messages
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786264236 sent from share3-OST000e-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 27s ago has timed out (27s prior to deadline).
req@ffff8105eb009000 x1427389786264236/t0 o400->share3-OST000e_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364209154 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 66 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 20 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 61s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 20 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 2 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 167-0: This client was evicted by share3-OST000f; in progress operations using this service will fail. LustreError: 23379:0:(rw.c:1341:ll_issue_page_read()) page ffff8101041b0fe8 map ffff8102d2d38a30 index 70144 flags 128100000000a21 count 6 priv ffff8101d3c9fa50: read queue failed:  rc -5 LustreError: 17967:0:(ldlm_resource.c:519:ldlm_namespace_cleanup()) Namespace share3-OST000f-osc-ffff8107f8008c00 resource refcount nonzero (1) after lock cleanup; forcing cleanup. 
LustreError: 17967:0:(ldlm_resource.c:524:ldlm_namespace_cleanup()) Resource: ffff8107e608b500 (28447854/0/0/0) (rc: 1) LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff810509332c00 x1427389786265183/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) Skipped 28 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106f7de6000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 27 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (24) Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786268714 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 25s ago has failed due to network error (75s prior to deadline).
req@ffff8108694f1400 x1427389786268714/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364209852 ref 1 fl Rpc:N/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102adeb2000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81070b868000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104adf20000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101627c8000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81016da6e000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 85 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 26 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib.
The ost_connect operation failed with -16 LustreError: Skipped 6 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810200166000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 6 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000d-osc-ffff8107f8008c00: tried all connections, increasing latency to 51s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 21 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81070b868000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810200166000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: tx_queue, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (51) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81084214e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810503bc4000 Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection restored to service share3-OST0010 using nid 172.26.8.140@o2ib. 
Lustre: Skipped 21 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786312770 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.141@o2ib 117s ago has timed out (117s prior to deadline). req@ffff8107c11f1c00 x1427389786312770/t0 o400->share3-OST000d_UUID@172.26.8.141@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364210444 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 62 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 25 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 3 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 53s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 11 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786317350 sent from share3-OST000e-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 43s ago has timed out (43s prior to deadline).
req@ffff8106abad7c00 x1427389786317350/t0 o400->share3-OST000e_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364211045 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 61 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 2 previous similar messages Lustre: Skipped 21 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib. Lustre: Skipped 23 previous similar messages Lustre: 9951:0:(import.c:855:ptlrpc_connect_interpret()) share3-OST000f_UUID@172.26.8.140@o2ib changed server handle from 0x9d26d1a5334f85c to 0x9d26d1a53371f41 LustreError: 167-0: This client was evicted by share3-OST000f; in progress operations using this service will fail.
LustreError: 22620:0:(rw.c:1341:ll_issue_page_read()) page ffff8101089bc650 map ffff8102d2d38a30 index 83712 flags 270100000000a21 count 7 priv ffff81074386e060: read queue failed:  rc -5 LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff8104210eb000 x1427389786317549/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e  0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff810868fdf000 x1427389786317559/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e  0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0 LustreError: 9950:0:(client.c:857:ptlrpc_import_delay_req()) Skipped 9 previous similar messages LustreError: 23379:0:(rw.c:1341:ll_issue_page_read()) page ffff8101089bc650 map ffff8102d2d38a30 index 83712 flags 270100000000a21 count 5 priv ffff81074386e060: read queue failed:  rc -5 LustreError: 19615:0:(ldlm_resource.c:519:ldlm_namespace_cleanup()) Namespace share3-OST000f-osc-ffff8107f8008c00 resource refcount nonzero (1) after lock cleanup; forcing cleanup. 
LustreError: 19615:0:(ldlm_resource.c:524:ldlm_namespace_cleanup()) Resource: ffff8107e608b500 (28447854/0/0/0) (rc: 1) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8107897cc000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81061d9ee000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104f69ba000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8107dde0e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810439054000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81060397e000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 7 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 18s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 26 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (14) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 23379:0:(rw.c:2323:ll_readpage()) ino 576389176 page 93693 (383766528) not covered by a lock (mmap?). check debug logs. Lustre: 22367:0:(rw.c:2323:ll_readpage()) ino 576389176 page 93693 (383766528) not covered by a lock (mmap?). check debug logs. 
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (37) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81021d2e8000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81087fc2e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81062e8e8000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81068b792000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105c4712000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (4) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786321318 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.141@o2ib 91s ago has timed out (91s prior to deadline). req@ffff81016ef0e800 x1427389786321318/t0 o400->share3-OST0011_UUID@172.26.8.141@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364211653 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 91 previous similar messages Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 25 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib. Lustre: Skipped 26 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81044d7ea000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81044d7f0000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 159 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0011-osc-ffff8107f8008c00: tried all connections, increasing latency to 23s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 25 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (20) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 3 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786325313 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 150s ago has timed o ut (150s prior to deadline). 
req@ffff8106ba3db800 x1427389786325313/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 0 to 1 dl 1364212288 ref 2 fl Rpc:/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 89 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 21 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib. Lustre: Skipped 22 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105d9c0e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8108500a4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8107c5140000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 3s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 30 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib.
The ost_connect operation failed with -16 LustreError: Skipped 3 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810280c6e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8107ea826000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105d5934000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810513d34000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104b58c2000 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786330093 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 41s ago has timed ou t (41s prior to deadline). req@ffff8101ad4f8000 x1427389786330093/t0 o8->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364212891 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 86 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (27) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 23 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. 
Lustre: Skipped 24 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8107e385e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101e6df4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81077e60a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81049c6c6000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106e8c36000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8107fcb74000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 43s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 20 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (9) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 7 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (10) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101ec798000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81043b2a4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810314dce000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103d60da000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104353e4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81072b7fa000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786334096 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 169s ago has timed out (169s prior to deadline). req@ffff810205a50800 x1427389786334096/t0 o3->share3-OST000f_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 0 to 1 dl 1364213580 ref 2 fl Rpc:/2/0 rc -11/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 48 previous similar messages Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection restored to service share3-OST000f using nid 172.26.8.140@o2ib.
Lustre: Skipped 20 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786334087 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 793s ago has timed o ut (792s prior to deadline). req@ffff8105f61e6400 x1427389786334087/t0 o3->share3-OST000d_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 6 to 1 dl 1364214203 ref 2 fl Rpc:/2/0 rc -11/0 Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 19 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810709130000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81061e976000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810709132000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810761f7c000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810411be6000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 2 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. 
Lustre: Skipped 1 previous similar message Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 43s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 5 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 6 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (21) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81019720e000 Lustre: 2930:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102a3816000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102a3814000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81077ddb4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (34) Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. 
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 6 previous similar messages Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786344161 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 51s ago has timed out (51s prior to deadline). req@ffff810389c97000 x1427389786344161/t0 o8->share3-OST000f_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364214817 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 42 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106bd6ca000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104d7fba000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81046060e000 Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib. Lustre: Skipped 20 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 34s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 16 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 13 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 2 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 2 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e5b42000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786348280 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 67s ago has timed ou t (67s prior to deadline). req@ffff8102a4606400 x1427389786348280/t0 o400->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364215419 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 86 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101f5d8a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection restored to service share3-OST0010 using nid 172.26.8.140@o2ib. 
Lustre: Skipped 23 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102bd1ae000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104884be000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81036900c000 Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 25 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 3 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0011-osc-ffff8107f8008c00: tried all connections, increasing latency to 48s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 21 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 5 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786352644 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 30s ago has timed ou t (30s prior to deadline).  
req@ffff8108135f2c00 x1427389786352644/t0 o8->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364216028 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 68 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib. Lustre: Skipped 18 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 16 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib.
The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000d-osc-ffff8107f8008c00: tried all connections, increasing latency to 20s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 18 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786356888 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 38s ago has timed ou t (38s prior to deadline). req@ffff8104898e8800 x1427389786356888/t0 o8->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364216647 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 75 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib. Lustre: Skipped 18 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. 
Lustre: Skipped 21 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (18) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 5 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 2 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 6s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 28 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786361896 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 37s ago has timed ou t (37s prior to deadline). 
req@ffff81069bfa3c00 x1427389786361896/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364217276 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 80 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (12) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection restored to service share3-OST0010 using nid 172.26.8.140@o2ib. Lustre: Skipped 32 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (14) Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 33 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810701c44000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810591e52000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 8 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786366024 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 49s ago has timed ou t (49s prior to deadline). 
req@ffff81032ce2d400 x1427389786366024/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364217876 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 76 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (29) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103fa18e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102096ca000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0011-osc-ffff8107f8008c00: tried all connections, increasing latency to 29s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 17 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib. 
Lustre: Skipped 28 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105debfa000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105debfc000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810411418000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81037a648000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105debfe000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (7) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 26 previous similar messages Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786370618 sent from share3-OST000e-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 23s ago has timed ou t (23s prior to deadline). 
req@ffff8103ed48d800 x1427389786370618/t0 o8->share3-OST000e_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364218485 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 64 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810494466000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81069ad74000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810467330000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810374f4a000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81034a436000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8103b6c64000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 3 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000d-osc-ffff8107f8008c00: tried all connections, increasing latency to 20s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 15 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105663d4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810467330000 Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib.
Lustre: Skipped 14 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (13) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2926:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2926:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81033e7b4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81045c11c000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81045c11e000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810467330000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8105663d4000 Lustre: Skipped 18 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786374533 sent from share3-OST0010-osc-ffff8107f8008c00 to NID 172.26.8.141@o2ib 111s ago has timed o ut (111s prior to deadline). 
req@ffff8102edaad800 x1427389786374533/t0 o400->share3-OST0010_UUID@172.26.8.141@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364219089 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 78 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 31s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 16 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106de4d6000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810303c38000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81030eb2a000 Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection restored to service share3-OST0010 using nid 172.26.8.140@o2ib. 
Lustre: Skipped 17 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 21 previous similar messages Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 17 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786379254 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 37s ago has timed out (37s prior to deadline). req@ffff810817a68000 x1427389786379254/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364219714 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 78 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (18) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib.
The ost_connect operation failed with -16 LustreError: Skipped 3 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 36s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 22 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (23) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9951:0:(import.c:855:ptlrpc_connect_interpret()) share3-OST000c_UUID@172.26.8.140@o2ib changed server handle from 0x9d26d1a52c9acef to 0x9d26d1a5384b3ec LustreError: 167-0: This client was evicted by share3-OST000c; in progress operations using this service will fail. Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib. 
Lustre: Skipped 25 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102a3ab0000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101bc050000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101e8ff6000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8107bdf36000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8106fd730000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101e8ff2000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8101e8ff4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786385760 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 30s ago has timed ou t (30s prior to deadline). req@ffff810759337800 x1427389786385760/t0 o8->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364220335 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 74 previous similar messages Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. 
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8102086ae000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810608aca000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810608acc000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810608ace000 Lustre: Skipped 26 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0011-osc-ffff8107f8008c00: tried all connections, increasing latency to 33s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 26 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786386226 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 780s ago has timed out (780s prior to deadline). req@ffff8105f61e6400 x1427389786386226/t0 o3->share3-OST000d_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 6 to 1 dl 1364221158 ref 2 fl Rpc:/2/0 rc -11/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 7 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 4 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib.
Lustre: Skipped 17 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: tx_queue, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (50) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810634e04000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810381338000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810373f50000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff810634e02000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff81055a4a6000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 52s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786396165 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 119s ago has timed o ut (119s prior to deadline). 
req@ffff8105ae1a4000 x1427389786396165/t0 o400->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364221771 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 41 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 18 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib. Lustre: Skipped 15 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (15) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 5 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (22) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0011-osc-ffff8107f8008c00: tried all connections, increasing latency to 5s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 36 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786412931 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 28s ago has timed out (28s prior to deadline). req@ffff8102c18ed800 x1427389786412931/t0 o8->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364222376 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 91 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 29 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 27 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786411825 sent from share3-OST000e-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 762s ago has timed o ut (31s prior to deadline). req@ffff810551af3c00 x1427389786411825/t0 o400->share3-OST000e_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364222321 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 16 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 10 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 2s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 15 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 11 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (34) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786440403 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 85s ago has timed ou t (85s prior to deadline). req@ffff8103e8510400 x1427389786440403/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364223656 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 129 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. 
Lustre: Skipped 37 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (36) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection restored to service share3-OST000f using nid 172.26.8.140@o2ib. Lustre: Skipped 39 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 6 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0011-osc-ffff8107f8008c00: tried all connections, increasing latency to 35s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 43 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 4 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (15) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc 
ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786444978 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 43s ago has timed ou t (43s prior to deadline). req@ffff81052efdc800 x1427389786444978/t0 o400->share3-OST000f_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364224270 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 81 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 35 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 35 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0011-osc-ffff8107f8008c00: tried all connections, increasing latency to 8s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 25 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (9) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786449744 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 33s ago has timed ou t (33s prior to deadline). req@ffff8106fe010000 x1427389786449744/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364224885 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 80 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 34 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib. 
Lustre: Skipped 34 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000  -- MORE -- forward: , or j backward: b or k quit: q LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 6 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 5s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 18 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (15) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786454126 sent from share3-OST0010-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 32s ago has timed ou t (31s prior to deadline). 
req@ffff8107eddebc00 x1427389786454126/t0 o400->share3-OST0010_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364225527 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 101 previous similar messages Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 34 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 2 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib. Lustre: Skipped 34 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 7 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 18s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 37 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 2 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (11) Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786484843 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 12s ago has failed d ue to network error (49s prior to deadline). req@ffff810858317400 x1427389786484843/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364226201 ref 1 fl Rpc:N/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 92 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 32 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib.  
Lustre: Skipped 32 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (12) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000d-osc-ffff8107f8008c00: tried all connections, increasing latency to 25s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 12 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 5 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786504986 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 67s ago has timed out (67s prior to deadline). req@ffff8108567f0000 x1427389786504986/t0 o400->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364226769 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 83 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib.
Lustre: Skipped 34 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 35 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (7) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 10s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 14 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 5 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786509757 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 55s ago has timed ou t (55s prior to deadline). 
req@ffff810175ef1800 x1427389786509757/t0 o400->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364227382 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 75 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 35 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 35 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages  -- MORE -- forward: , or j backward: b or k quit: q LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 22s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 9 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 5 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786517386 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 27s ago has timed ou t (27s prior to deadline). req@ffff810276f1a800 x1427389786517386/t0 o400->share3-OST000f_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364228004 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 84 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib. Lustre: Skipped 31 previous similar messages Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection to service share3-OST000f via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 36 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 6 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 10s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 16 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786521364 sent from share3-OST0010-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 81s ago has timed ou t (81s prior to deadline). req@ffff8103c7ec3800 x1427389786521364/t0 o400->share3-OST0010_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364228608 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 101 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib. Lustre: Skipped 41 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. 
Lustre: Skipped 41 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 2 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 9s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 16 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16  -- MORE -- forward: , or j backward: b or k quit: q LustreError: Skipped 6 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786526320 sent from share3-OST0010-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 9s ago has timed out  (9s prior to deadline). req@ffff810602747800 x1427389786526320/t0 o8->share3-OST0010_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364229210 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 91 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib. 
Lustre: Skipped 38 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 40 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 40s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 12 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (12) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 6 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786569229 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 95s ago has timed ou t (95s prior to deadline). 
req@ffff8107843eec00 x1427389786569229/t0 o400->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364229822 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 1110 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib. Lustre: Skipped 27 previous similar messages Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 28 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 8 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (18) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 35s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 27 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: 
active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (27) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 5 previous similar messages Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786574073 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 30s ago has timed out (30s prior to deadline). req@ffff8107ef281400 x1427389786574073/t0 o8->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364230472 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 99 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib. Lustre: Skipped 31 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 31 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 28s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 33 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 2 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (21) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786578678 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 3s ago has failed du e to network error (22s prior to deadline). req@ffff81087e994000 x1427389786578678/t0 o8->share3-OST000f_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364231104 ref 1 fl Rpc:N/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 91 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 7 previous similar messages Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection restored to service share3-OST000f using nid 172.26.8.140@o2ib. Lustre: Skipped 30 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 29 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 2 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0011-osc-ffff8107f8008c00: tried all connections, increasing latency to 14s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 22 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786620492 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 14s ago has failed d ue to network error (43s prior to deadline). 
req@ffff81017a3a2c00 x1427389786620492/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364231745 ref 1 fl Rpc:N/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 77 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib. Lustre: Skipped 35 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 33 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 5 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 6s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 14 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786624483 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 49s ago has timed ou t (49s prior to deadline). req@ffff810532701800 x1427389786624483/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364232326 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 91 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -114 LustreError: Skipped 4 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib. 
Lustre: Skipped 36 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection to service share3-OST000f via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 36 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 4 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 12s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 17 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786628731 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 79s ago has timed ou t (79s prior to deadline). req@ffff81023df5bc00 x1427389786628731/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364232931 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 81 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 6 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 32 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 33 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786633331 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 53s ago has timed out (53s prior to deadline).
req@ffff8105d0e05000 x1427389786633331/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364233534 ref 1 fl Rpc:N/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 96 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 7 previous similar messages Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection restored to service share3-OST000f using nid 172.26.8.140@o2ib. Lustre: Skipped 36 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 35 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786636928 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 766s ago has timed o ut (766s prior to deadline). 
req@ffff8105f61e6400 x1427389786636928/t0 o3->share3-OST000d_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 6 to 1 dl 1364234719 ref 2 fl Rpc:/2/0 rc -11/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 46 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 11 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 17 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 2s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 27 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 2 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000d-osc-ffff8107f8008c00: tried all connections, increasing latency to 5s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 17 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (10) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 6 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (23) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 11s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 7 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786646431 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 45s ago has timed out (45s prior to deadline). req@ffff810291298400 x1427389786646431/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364235322 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 127 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 44 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 1 previous similar message Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib. Lustre: Skipped 41 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 11s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 12 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 7 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 2 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786650787 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 69s ago has timed ou t (69s prior to deadline). 
req@ffff8104bcff3c00 x1427389786650787/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364235946 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 64 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 30 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 28 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 2s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 11 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib.
The ost_connect operation failed with -16 LustreError: Skipped 5 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (49) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786655370 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 65s ago has timed ou t (65s prior to deadline). req@ffff81018a1c5c00 x1427389786655370/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364236567 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 104 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 41 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. 
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 41 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 26s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 29 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 132 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 5 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 167-0: This client was evicted by share3-OST000c; in progress operations using this service will fail. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786659780 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.141@o2ib 117s ago has timed o ut (117s prior to deadline). 
req@ffff810585c2e800 x1427389786659780/t0 o400->share3-OST000f_UUID@172.26.8.141@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364237194 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 70 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib. Lustre: Skipped 27 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 29 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 29s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 13 previous similar messages LustreError: 167-0: This client was evicted by share3-OST000f; in progress operations using this service will fail. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 5 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 4 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786664969 sent from share3-OST0010-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 35s ago has timed out (35s prior to deadline). req@ffff810454f40400 x1427389786664969/t0 o400->share3-OST0010_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364237840 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 85 previous similar messages Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection to service share3-OST000f via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 32 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib.
Lustre: Skipped 33 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (15) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000d-osc-ffff8107f8008c00: tried all connections, increasing latency to 2s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 23 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (17) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 7 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786669454 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 41s ago has timed ou t (41s prior to deadline). 
req@ffff810295451c00 x1427389786669454/t0 o400->share3-OST000f_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364238468 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 98 previous similar messages Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection to service share3-OST000f via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 27 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (20) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 4 previous similar messages Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection restored to service share3-OST0010 using nid 172.26.8.140@o2ib. 
Lustre: Skipped 29 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 21s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 39 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (36) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 5 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786673898 sent from share3-OST000e-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 26s ago has timed out (25s prior to deadline).
req@ffff8104ce23f400 x1427389786673898/t0 o400->share3-OST000e_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364239102 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 92 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 35 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection restored to service share3-OST000f using nid 172.26.8.140@o2ib. Lustre: Skipped 34 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (15) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 7 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 37s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 23 previous similar messages Lustre: 9951:0:(import.c:855:ptlrpc_connect_interpret()) share3-OST0010_UUID@172.26.8.140@o2ib changed server handle from 0x9d26d1a53264462 to 0x9d26d1a54e5d8ad LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 167-0: This client was evicted by share3-OST0010; in progress operations using this service will fail. 
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 7 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786678102 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 20s ago has timed ou t (20s prior to deadline). req@ffff81025efeb000 x1427389786678102/t0 o8->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364239708 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 111 previous similar messages Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection to service share3-OST000f via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 42 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib. 
Lustre: Skipped 42 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000d-osc-ffff8107f8008c00: tried all connections, increasing latency to 2s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 18 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 4 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 6 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786681824 sent from share3-OST0010-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 85s ago has timed out (85s prior to deadline).
req@ffff8101a824bc00 x1427389786681824/t0 o400->share3-OST0010_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364240310 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 109 previous similar messages Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection to service share3-OST000f via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 43 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib. Lustre: Skipped 44 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 37s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 15 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786686275 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 89s ago has timed ou t (89s prior to deadline). 
req@ffff810713fe7800 x1427389786686275/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364240916 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 69 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 5 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 28 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 31 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 3s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 19 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (25) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786690933 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 33s ago has timed out (33s prior to deadline). req@ffff810149a25000 x1427389786690933/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364241535 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 119 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 5 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 37 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 35 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (16) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 6 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 25s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 48 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (5) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786695835 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 18s ago has failed d ue to network error (65s prior to deadline). req@ffff810181f85000 x1427389786695835/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364242217 ref 1 fl Rpc:N/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 77 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 1 previous similar message Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 29 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 29 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 5 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786700349 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 17s ago has timed out (17s prior to deadline). req@ffff810851daa000 x1427389786700349/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364242794 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 81 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib.
The ost_connect operation failed with -16 LustreError: Skipped 7 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib. Lustre: Skipped 31 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 35 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 2s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 9 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (18) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786704990 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 37s ago has timed ou t (37s prior to deadline). 
req@ffff8102f561c400 x1427389786704990/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364243448 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 98 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib. Lustre: Skipped 40 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 6 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 41 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 6 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 10s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 36 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (13) LustreError: 
9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786709117 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 61s ago has timed out (61s prior to deadline). req@ffff810248cbfc00 x1427389786709117/t0 o400->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364244063 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 84 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection restored to service share3-OST0010 using nid 172.26.8.140@o2ib. Lustre: Skipped 30 previous similar messages Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 29 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 1 previous similar message LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (16) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 14s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 19 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786713602 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 34s ago has timed ou t (34s prior to deadline). req@ffff81055e2d7400 x1427389786713602/t0 o8->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364244690 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 79 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib. 
Lustre: Skipped 35 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 6 previous similar messages Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 36 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 7 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 45s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 17 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786717998 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 21s ago has timed ou t (21s prior to deadline). 
req@ffff8101ca037c00 x1427389786717998/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364245323 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 71 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection restored to service share3-OST000f using nid 172.26.8.140@o2ib. Lustre: Skipped 32 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 5 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 32 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 6s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 23 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 5 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed
out RDMA with 172.26.8.140@o2ib (18) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786722813 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 37s ago has timed ou t (37s prior to deadline). req@ffff8107f6b4fc00 x1427389786722813/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364245977 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 113 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib. Lustre: Skipped 40 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 6 previous similar messages Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. 
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 40 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (9) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (35) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 11s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 26 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (35) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 7 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786727533 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 41s ago has timed ou t (41s 
prior to deadline). req@ffff8103941cd400 x1427389786727533/t0 o8->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364246578 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 100 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib. Lustre: Skipped 31 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 29 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 6 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0011-osc-ffff8107f8008c00: tried all connections, increasing latency to 3s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 25 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre:
share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 32 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786732500 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 26s ago has timed ou t (25s prior to deadline). req@ffff810185e04000 x1427389786732500/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364247252 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 82 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 33 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 5 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (47) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (36) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 25s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 34 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786736767 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 65s ago has timed ou t (65s prior to deadline). 
req@ffff8101de53a000 x1427389786736767/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364247869 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 105 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib. Lustre: Skipped 34 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 34 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000d-osc-ffff8107f8008c00: tried all connections, increasing latency to 2s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 13 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786741175 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 19s ago has timed out (19s prior to deadline).
req@ffff8103424b6400 x1427389786741175/t0 o400->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364248471 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 62 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (32) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection restored to service share3-OST000f using nid 172.26.8.140@o2ib. Lustre: Skipped 33 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 34 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (13) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 4 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0011-osc-ffff8107f8008c00: tried all connections, increasing latency to 6s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 43 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786745146 sent from share3-OST0010-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 85s ago has timed ou t (85s prior to deadline). req@ffff81039f51b000 x1427389786745146/t0 o400->share3-OST0010_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364249076 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 101 previous similar messages Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 24 previous similar messages Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib.  
Lustre: Skipped 29 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 4 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 9 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (18) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000d-osc-ffff8107f8008c00: tried all connections, increasing latency to 7s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 24 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786750176 sent from share3-OST000e-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 17s ago has timed out (17s prior to deadline).
req@ffff8101e18c7800 x1427389786750176/t0 o400->share3-OST000e_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364249721 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 107 previous similar messages Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection to service share3-OST000f via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 34 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib. Lustre: Skipped 31 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (17) LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 5 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 6 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000d-osc-ffff8107f8008c00: tried all connections, increasing latency to 11s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 39 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786831584 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 37s ago has timed out (37s prior to deadline). req@ffff8101e1225000 x1427389786831584/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364250328 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 98 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. Lustre: Skipped 31 previous similar messages Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib.
Lustre: Skipped 34 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 5 previous similar messages LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16 LustreError: Skipped 7 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 28s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 17 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786835998 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 18s ago has failed d ue to network error (95s prior to deadline). 
req@ffff810294c02400 x1427389786835998/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364251022 ref 1 fl Rpc:N/0/0 rc 0/0 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 84 previous similar messages Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recove ry to complete. Lustre: Skipped 30 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib. Lustre: Skipped 36 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. 
The ost_connect operation failed with -16 LustreError: Skipped 5 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 5 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 7s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 16 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786839580 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 99s ago has timed out (99s prior to deadline). req@ffff8101b0fc9800 x1427389786839580/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364251546 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 113 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (24) Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete. LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: Skipped 50 previous similar messages Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib.
Lustre: Skipped 47 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (26)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (8)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 7 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 4 previous similar messages
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 14s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 33 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786844515 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 13s ago has timed out (13s prior to deadline).
req@ffff81027664b800 x1427389786844515/t0 o8->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364252151 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 119 previous similar messages
Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 38 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib.
Lustre: Skipped 33 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib.
The ost_connect operation failed with -16
LustreError: Skipped 7 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 23s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 40 previous similar messages
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786850082 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 57s ago has timed out (57s prior to deadline).
req@ffff8103592be000 x1427389786850082/t0 o400->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364252759 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 91 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 29 previous similar messages
Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection restored to service share3-OST0010 using nid 172.26.8.140@o2ib.
Lustre: Skipped 32 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 5 previous similar messages
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (14)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (9)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 5 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786854133 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 65s ago has timed out (65s prior to deadline).
req@ffff81087eb1b800 x1427389786854133/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364253364 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 98 previous similar messages
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000d-osc-ffff8107f8008c00: tried all connections, increasing latency to 17s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 30 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection restored to service share3-OST000f using nid 172.26.8.140@o2ib.
Lustre: Skipped 34 previous similar messages
Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection to service share3-OST000f via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: Skipped 40 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 8 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (22)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib.
The ost_connect operation failed with -16
LustreError: Skipped 5 previous similar messages
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 19s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 13 previous similar messages
Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786913523 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 24s ago has timed out (24s prior to deadline).
req@ffff8108135f2800 x1427389786913523/t0 o8->share3-OST000f_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364253993 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 812 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (25)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib.
Lustre: Skipped 24 previous similar messages
Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 24 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 6 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 6 previous similar messages
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 8s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 26 previous similar messages
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786917928 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 67s ago has timed out (67s prior to deadline).
req@ffff8106f6eb3c00 x1427389786917928/t0 o400->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364254594 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 92 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (16)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib.
Lustre: Skipped 36 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection to service share3-OST000f via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 37 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 6 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786922100 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 79s ago has timed out (79s prior to deadline).
req@ffff8106a9c3f000 x1427389786922100/t0 o400->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364255206 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 96 previous similar messages
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (31)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 12s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 33 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib.
Lustre: Skipped 34 previous similar messages
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 6 previous similar messages
Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 33 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786926890 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 22s ago has timed out (22s prior to deadline).
req@ffff8107f6b4f000 x1427389786926890/t0 o8->share3-OST000f_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364255820 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 90 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0010-osc-ffff8107f8008c00: tried all connections, increasing latency to 20s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 19 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib.
Lustre: Skipped 32 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib.
The ost_connect operation failed with -16
LustreError: Skipped 7 previous similar messages
Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 33 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (8)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786930687 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 99s ago has timed out (99s prior to deadline).
req@ffff810150be7000 x1427389786930687/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364256426 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 73 previous similar messages
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0011-osc-ffff8107f8008c00: tried all connections, increasing latency to 2s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 14 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib.
Lustre: Skipped 34 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 5 previous similar messages
Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection to service share3-OST0011 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 34 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786935553 sent from share3-OST0011-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 31s ago has timed out (31s prior to deadline).
req@ffff810136bfac00 x1427389786935553/t0 o400->share3-OST0011_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364257033 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 91 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 4 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST0011-osc-ffff8107f8008c00: tried all connections, increasing latency to 10s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 14 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib.
Lustre: Skipped 36 previous similar messages
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 4 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection to service share3-OST000f via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 40 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786986101 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 81s ago has timed out (81s prior to deadline).
req@ffff81071b2cb800 x1427389786986101/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364257636 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 1958 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (12)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 11s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 24 previous similar messages
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 7 previous similar messages
Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib.
Lustre: Skipped 41 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection to service share3-OST000f via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 40 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 8 previous similar messages
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (23)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786990926 sent from share3-OST000e-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 21s ago has timed out (21s prior to deadline).
req@ffff810471efcc00 x1427389786990926/t0 o400->share3-OST000e_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364258248 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 81 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 7s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 24 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib.
Lustre: Skipped 33 previous similar messages
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 4 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 35 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 6 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786995323 sent from share3-OST0010-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 31s ago has timed out (31s prior to deadline).
req@ffff8101f70b5000 x1427389786995323/t0 o400->share3-OST0010_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364258883 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 89 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000f-osc-ffff8107f8008c00: tried all connections, increasing latency to 18s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 16 previous similar messages
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (12)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib.
Lustre: Skipped 34 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 5 previous similar messages
Lustre: share3-OST0010-osc-ffff8107f8008c00: Connection to service share3-OST0010 via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 34 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389786999358 sent from share3-OST0010-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 41s ago has timed out (41s prior to deadline).
req@ffff8107eddf0400 x1427389786999358/t0 o400->share3-OST0010_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364259487 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 89 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection restored to service share3-OST000f using nid 172.26.8.140@o2ib.
Lustre: Skipped 41 previous similar messages
Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection to service share3-OST000f via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2930:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2930:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages
Lustre: Skipped 35 previous similar messages
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 7 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389787003521 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 95s ago has timed out (95s prior to deadline).
req@ffff8107a63dd800 x1427389787003521/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364260097 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 74 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 2s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 11 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection restored to service share3-OST000f using nid 172.26.8.140@o2ib.
Lustre: Skipped 27 previous similar messages
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (13)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 32 previous similar messages
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib.
The ost_connect operation failed with -16 LustreError: Skipped 2 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 2 previous similar messages Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389787008380 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 31s ago has timed ou t (31s prior to deadline). req@ffff810292911800 x1427389787008380/t0 o400->share3-OST000c_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364260708 ref 1 fl Rpc:N/0/0 rc 0/0 Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 83 previous similar messages LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (15)  -- MORE -- forward: , or j backward: b or k quit: q LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 11s Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 33 previous similar messages LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000 Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib. 
Lustre: Skipped 37 previous similar messages
Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 30 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 5 previous similar messages
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (8)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 5 previous similar messages
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389787012621 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 74s ago has timed out (73s prior to deadline).
  req@ffff8102689d0400 x1427389787012621/t0 o400->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364261325 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 84 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 3s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 14 previous similar messages
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (23)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib.
Lustre: Skipped 30 previous similar messages
Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 32 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 5 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 7 previous similar messages
Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389787017127 sent from share3-OST0010-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 23s ago has timed out (23s prior to deadline).
  req@ffff810804362800 x1427389787017127/t0 o8->share3-OST0010_UUID@172.26.8.140@o2ib:28/4 lens 368/584 e 0 to 1 dl 1364261945 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9951:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 78 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 17s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 21 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib.
Lustre: Skipped 33 previous similar messages
Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 33 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 4 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389787021081 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 75s ago has timed out (75s prior to deadline).
  req@ffff8108458c2000 x1427389787021081/t0 o400->share3-OST000d_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364262552 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 81 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 5 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 1 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (12)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection restored to service share3-OST000c using nid 172.26.8.140@o2ib.
Lustre: Skipped 33 previous similar messages
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000d-osc-ffff8107f8008c00: tried all connections, increasing latency to 13s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 11 previous similar messages
Lustre: share3-OST000f-osc-ffff8107f8008c00: Connection to service share3-OST000f via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 33 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 5 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 7924:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389787026178 sent from share3-OST000e-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 30s ago has timed out (30s prior to deadline).
  req@ffff810868fdf000 x1427389787026178/t0 o13->share3-OST000e_UUID@172.26.8.140@o2ib:7/4 lens 192/528 e 0 to 1 dl 1364263180 ref 1 fl Rpc:/0/0 rc 0/0
Lustre: 7924:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 87 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 3 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389787034668 sent from share3-OST000d-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 783s ago has timed out (783s prior to deadline).
  req@ffff8105f61e6400 x1427389787034668/t0 o3->share3-OST000d_UUID@172.26.8.140@o2ib:6/4 lens 448/592 e 6 to 1 dl 1364264260 ref 2 fl Rpc:/2/0 rc -11/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 71 previous similar messages
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 35 previous similar messages
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib.
Lustre: Skipped 41 previous similar messages
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 2s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 11 previous similar messages
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 4 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (21)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 6 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 4 previous similar messages
Lustre: share3-OST0011-osc-ffff8107f8008c00: Connection restored to service share3-OST0011 using nid 172.26.8.140@o2ib.
Lustre: Skipped 6 previous similar messages
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000d-osc-ffff8107f8008c00: tried all connections, increasing latency to 5s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 17 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 5 previous similar messages
Lustre: share3-OST000c-osc-ffff8107f8008c00: Connection to service share3-OST000c via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 11 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib.
Lustre: Skipped 16 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 7s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 12 previous similar messages
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 1 previous similar message
Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection to service share3-OST000e via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 22 previous similar messages
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389787088375 sent from share3-OST000f-osc-ffff8107f8008c00 to NID 172.26.8.140@o2ib 33s ago has timed out (33s prior to deadline).
  req@ffff81035253d000 x1427389787088375/t0 o400->share3-OST000f_UUID@172.26.8.140@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364264860 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 112 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000e-osc-ffff8107f8008c00: tried all connections, increasing latency to 10s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 13 previous similar messages
Lustre: share3-OST000e-osc-ffff8107f8008c00: Connection restored to service share3-OST000e using nid 172.26.8.140@o2ib.
Lustre: Skipped 17 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (9)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 2 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Conn race 172.26.8.140@o2ib
Lustre: 2932:0:(o2iblnd_cb.c:2259:kiblnd_passive_connect()) Skipped 4 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) @@@ Request x1427389787092986 sent from share3-OST000c-osc-ffff8107f8008c00 to NID 172.26.8.141@o2ib 35s ago has timed out (35s prior to deadline).
  req@ffff8106c03ab000 x1427389787092986/t0 o400->share3-OST000c_UUID@172.26.8.141@o2ib:28/4 lens 192/384 e 0 to 1 dl 1364265462 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9950:0:(client.c:1486:ptlrpc_expire_one_request()) Skipped 79 previous similar messages
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection to service share3-OST000d via nid 172.26.8.140@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 34 previous similar messages
Lustre: share3-OST000d-osc-ffff8107f8008c00: Connection restored to service share3-OST000d using nid 172.26.8.140@o2ib.
Lustre: Skipped 34 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
Lustre: 9952:0:(import.c:517:import_select_connection()) share3-OST000c-osc-ffff8107f8008c00: tried all connections, increasing latency to 12s
Lustre: 9952:0:(import.c:517:import_select_connection()) Skipped 15 previous similar messages
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 11 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (28)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
LustreError: 11-0: an error occurred while communicating with 172.26.8.140@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 6 previous similar messages
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (17)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
SysRq : HELP : loglevel0-8 reBoot Crashdump tErm Full kIll thaw-filesystems(J) saK showMem Nice powerOff showPc unRaw Sync showTasks Unmount shoWcpus
LustreError: 9948:0:(o2iblnd_cb.c:2914:kiblnd_check_txs()) Timed out tx: active_txs, 0 seconds
LustreError: 9948:0:(o2iblnd_cb.c:2977:kiblnd_check_conns()) Timed out RDMA with 172.26.8.140@o2ib (30)
LustreError: 9948:0:(events.c:198:client_bulk_callback()) event type 1, status -103, desc ffff8104e84f4000
SysRq : HELP : loglevel0-8 reBoot Crashdump tErm Full kIll thaw-filesystems(J) saK showMem Nice powerOff showPc unRaw Sync showTasks Unmount shoWcpus
SysRq : Emergency Sync
Emergency Sync complete
SysRq : Emergency Remount R/O
Emergency Remount complete
SysRq : Trigger a crashdump
crash>
crash>
crash>
crash> exit
[root@wk1 DDN_20130326]#
[root@wk1 DDN_20130326]# exit
exit
Script done on Wed Mar 27 11:36:19 2013