[ 0.000000] Initializing cgroup subsys cpuset [ 0.000000] Initializing cgroup subsys cpu [ 0.000000] Initializing cgroup subsys cpuacct [ 0.000000] Linux version 3.10.0-957.27.2.el7_lustre.pl2.x86_64 (sthiell@oak-rbh01) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-39) (GCC) ) #1 SMP Thu Nov 7 15:26:16 PST 2019 [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.10.0-957.27.2.el7_lustre.pl2.x86_64 root=UUID=abdfca31-9e32-4c60-981c-98bd3cab6b0a ro crashkernel=auto nomodeset console=ttyS0,115200 LANG=en_US.UTF-8 [ 0.000000] e820: BIOS-provided physical RAM map: [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000008efff] usable [ 0.000000] BIOS-e820: [mem 0x000000000008f000-0x000000000008ffff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x0000000000090000-0x000000000009ffff] usable [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000004f882fff] usable [ 0.000000] BIOS-e820: [mem 0x000000004f883000-0x000000005788bfff] reserved [ 0.000000] BIOS-e820: [mem 0x000000005788c000-0x000000006cacefff] usable [ 0.000000] BIOS-e820: [mem 0x000000006cacf000-0x000000006efcefff] reserved [ 0.000000] BIOS-e820: [mem 0x000000006efcf000-0x000000006fdfefff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x000000006fdff000-0x000000006fffefff] ACPI data [ 0.000000] BIOS-e820: [mem 0x000000006ffff000-0x000000006fffffff] usable [ 0.000000] BIOS-e820: [mem 0x0000000070000000-0x000000008fffffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fec10000-0x00000000fec10fff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fed80000-0x00000000fed80fff] reserved [ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000107f37ffff] usable [ 0.000000] BIOS-e820: [mem 0x000000107f380000-0x000000107fffffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000001080000000-0x000000207ff7ffff] usable [ 0.000000] BIOS-e820: [mem 0x000000207ff80000-0x000000207fffffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000002080000000-0x000000307ff7ffff] usable [ 0.000000] BIOS-e820: [mem 0x000000307ff80000-0x000000307fffffff] reserved [ 
0.000000] BIOS-e820: [mem 0x0000003080000000-0x000000407ff7ffff] usable [ 0.000000] BIOS-e820: [mem 0x000000407ff80000-0x000000407fffffff] reserved [ 0.000000] NX (Execute Disable) protection: active [ 0.000000] e820: update [mem 0x3705b020-0x3708cc5f] usable ==> usable [ 0.000000] e820: update [mem 0x37029020-0x3705ac5f] usable ==> usable [ 0.000000] e820: update [mem 0x37020020-0x3702805f] usable ==> usable [ 0.000000] e820: update [mem 0x37007020-0x3701f65f] usable ==> usable [ 0.000000] extended physical RAM map: [ 0.000000] reserve setup_data: [mem 0x0000000000000000-0x000000000008efff] usable [ 0.000000] reserve setup_data: [mem 0x000000000008f000-0x000000000008ffff] ACPI NVS [ 0.000000] reserve setup_data: [mem 0x0000000000090000-0x000000000009ffff] usable [ 0.000000] reserve setup_data: [mem 0x0000000000100000-0x000000003700701f] usable [ 0.000000] reserve setup_data: [mem 0x0000000037007020-0x000000003701f65f] usable [ 0.000000] reserve setup_data: [mem 0x000000003701f660-0x000000003702001f] usable [ 0.000000] reserve setup_data: [mem 0x0000000037020020-0x000000003702805f] usable [ 0.000000] reserve setup_data: [mem 0x0000000037028060-0x000000003702901f] usable [ 0.000000] reserve setup_data: [mem 0x0000000037029020-0x000000003705ac5f] usable [ 0.000000] reserve setup_data: [mem 0x000000003705ac60-0x000000003705b01f] usable [ 0.000000] reserve setup_data: [mem 0x000000003705b020-0x000000003708cc5f] usable [ 0.000000] reserve setup_data: [mem 0x000000003708cc60-0x000000004f882fff] usable [ 0.000000] reserve setup_data: [mem 0x000000004f883000-0x000000005788bfff] reserved [ 0.000000] reserve setup_data: [mem 0x000000005788c000-0x000000006cacefff] usable [ 0.000000] reserve setup_data: [mem 0x000000006cacf000-0x000000006efcefff] reserved [ 0.000000] reserve setup_data: [mem 0x000000006efcf000-0x000000006fdfefff] ACPI NVS [ 0.000000] reserve setup_data: [mem 0x000000006fdff000-0x000000006fffefff] ACPI data [ 0.000000] reserve setup_data: [mem 
0x000000006ffff000-0x000000006fffffff] usable [ 0.000000] reserve setup_data: [mem 0x0000000070000000-0x000000008fffffff] reserved [ 0.000000] reserve setup_data: [mem 0x00000000fec10000-0x00000000fec10fff] reserved [ 0.000000] reserve setup_data: [mem 0x00000000fed80000-0x00000000fed80fff] reserved [ 0.000000] reserve setup_data: [mem 0x0000000100000000-0x000000107f37ffff] usable [ 0.000000] reserve setup_data: [mem 0x000000107f380000-0x000000107fffffff] reserved [ 0.000000] reserve setup_data: [mem 0x0000001080000000-0x000000207ff7ffff] usable [ 0.000000] reserve setup_data: [mem 0x000000207ff80000-0x000000207fffffff] reserved [ 0.000000] reserve setup_data: [mem 0x0000002080000000-0x000000307ff7ffff] usable [ 0.000000] reserve setup_data: [mem 0x000000307ff80000-0x000000307fffffff] reserved [ 0.000000] reserve setup_data: [mem 0x0000003080000000-0x000000407ff7ffff] usable [ 0.000000] reserve setup_data: [mem 0x000000407ff80000-0x000000407fffffff] reserved [ 0.000000] efi: EFI v2.50 by Dell Inc. 
[ 0.000000] efi: ACPI=0x6fffe000 ACPI 2.0=0x6fffe014 SMBIOS=0x6eab5000 SMBIOS 3.0=0x6eab3000 [ 0.000000] efi: mem00: type=3, attr=0xf, range=[0x0000000000000000-0x0000000000001000) (0MB) [ 0.000000] efi: mem01: type=2, attr=0xf, range=[0x0000000000001000-0x0000000000002000) (0MB) [ 0.000000] efi: mem02: type=7, attr=0xf, range=[0x0000000000002000-0x0000000000010000) (0MB) [ 0.000000] efi: mem03: type=3, attr=0xf, range=[0x0000000000010000-0x0000000000014000) (0MB) [ 0.000000] efi: mem04: type=7, attr=0xf, range=[0x0000000000014000-0x0000000000063000) (0MB) [ 0.000000] efi: mem05: type=3, attr=0xf, range=[0x0000000000063000-0x000000000008f000) (0MB) [ 0.000000] efi: mem06: type=10, attr=0xf, range=[0x000000000008f000-0x0000000000090000) (0MB) [ 0.000000] efi: mem07: type=3, attr=0xf, range=[0x0000000000090000-0x00000000000a0000) (0MB) [ 0.000000] efi: mem08: type=4, attr=0xf, range=[0x0000000000100000-0x0000000000120000) (0MB) [ 0.000000] efi: mem09: type=7, attr=0xf, range=[0x0000000000120000-0x0000000000c00000) (10MB) [ 0.000000] efi: mem10: type=3, attr=0xf, range=[0x0000000000c00000-0x0000000001000000) (4MB) [ 0.000000] efi: mem11: type=2, attr=0xf, range=[0x0000000001000000-0x000000000267b000) (22MB) [ 0.000000] efi: mem12: type=7, attr=0xf, range=[0x000000000267b000-0x0000000004000000) (25MB) [ 0.000000] efi: mem13: type=4, attr=0xf, range=[0x0000000004000000-0x000000000403b000) (0MB) [ 0.000000] efi: mem14: type=7, attr=0xf, range=[0x000000000403b000-0x0000000037007000) (815MB) [ 0.000000] efi: mem15: type=2, attr=0xf, range=[0x0000000037007000-0x000000004eee6000) (382MB) [ 0.000000] efi: mem16: type=7, attr=0xf, range=[0x000000004eee6000-0x000000004eeea000) (0MB) [ 0.000000] efi: mem17: type=2, attr=0xf, range=[0x000000004eeea000-0x000000004eeec000) (0MB) [ 0.000000] efi: mem18: type=1, attr=0xf, range=[0x000000004eeec000-0x000000004f009000) (1MB) [ 0.000000] efi: mem19: type=2, attr=0xf, range=[0x000000004f009000-0x000000004f128000) (1MB) [ 0.000000] efi: 
mem20: type=1, attr=0xf, range=[0x000000004f128000-0x000000004f237000) (1MB) [ 0.000000] efi: mem21: type=3, attr=0xf, range=[0x000000004f237000-0x000000004f883000) (6MB) [ 0.000000] efi: mem22: type=0, attr=0xf, range=[0x000000004f883000-0x000000005788c000) (128MB) [ 0.000000] efi: mem23: type=3, attr=0xf, range=[0x000000005788c000-0x000000005796e000) (0MB) [ 0.000000] efi: mem24: type=4, attr=0xf, range=[0x000000005796e000-0x000000005b4cf000) (59MB) [ 0.000000] efi: mem25: type=3, attr=0xf, range=[0x000000005b4cf000-0x000000005b8cf000) (4MB) [ 0.000000] efi: mem26: type=7, attr=0xf, range=[0x000000005b8cf000-0x0000000067b63000) (194MB) [ 0.000000] efi: mem27: type=4, attr=0xf, range=[0x0000000067b63000-0x0000000067b70000) (0MB) [ 0.000000] efi: mem28: type=7, attr=0xf, range=[0x0000000067b70000-0x0000000067b74000) (0MB) [ 0.000000] efi: mem29: type=4, attr=0xf, range=[0x0000000067b74000-0x00000000681aa000) (6MB) [ 0.000000] efi: mem30: type=7, attr=0xf, range=[0x00000000681aa000-0x00000000681ab000) (0MB) [ 0.000000] efi: mem31: type=4, attr=0xf, range=[0x00000000681ab000-0x00000000681b5000) (0MB) [ 0.000000] efi: mem32: type=7, attr=0xf, range=[0x00000000681b5000-0x00000000681b6000) (0MB) [ 0.000000] efi: mem33: type=4, attr=0xf, range=[0x00000000681b6000-0x00000000681ba000) (0MB) [ 0.000000] efi: mem34: type=7, attr=0xf, range=[0x00000000681ba000-0x00000000681bb000) (0MB) [ 0.000000] efi: mem35: type=4, attr=0xf, range=[0x00000000681bb000-0x00000000681cc000) (0MB) [ 0.000000] efi: mem36: type=7, attr=0xf, range=[0x00000000681cc000-0x00000000681cd000) (0MB) [ 0.000000] efi: mem37: type=4, attr=0xf, range=[0x00000000681cd000-0x00000000681d2000) (0MB) [ 0.000000] efi: mem38: type=7, attr=0xf, range=[0x00000000681d2000-0x00000000681d3000) (0MB) [ 0.000000] efi: mem39: type=4, attr=0xf, range=[0x00000000681d3000-0x00000000681db000) (0MB) [ 0.000000] efi: mem40: type=7, attr=0xf, range=[0x00000000681db000-0x00000000681dc000) (0MB) [ 0.000000] efi: mem41: type=4, 
attr=0xf, range=[0x00000000681dc000-0x00000000681de000) (0MB) [ 0.000000] efi: mem42: type=7, attr=0xf, range=[0x00000000681de000-0x00000000681df000) (0MB) [ 0.000000] efi: mem43: type=4, attr=0xf, range=[0x00000000681df000-0x00000000681f0000) (0MB) [ 0.000000] efi: mem44: type=7, attr=0xf, range=[0x00000000681f0000-0x00000000681f1000) (0MB) [ 0.000000] efi: mem45: type=4, attr=0xf, range=[0x00000000681f1000-0x00000000681f4000) (0MB) [ 0.000000] efi: mem46: type=7, attr=0xf, range=[0x00000000681f4000-0x00000000681f6000) (0MB) [ 0.000000] efi: mem47: type=4, attr=0xf, range=[0x00000000681f6000-0x00000000681ff000) (0MB) [ 0.000000] efi: mem48: type=7, attr=0xf, range=[0x00000000681ff000-0x0000000068200000) (0MB) [ 0.000000] efi: mem49: type=4, attr=0xf, range=[0x0000000068200000-0x0000000068202000) (0MB) [ 0.000000] efi: mem50: type=7, attr=0xf, range=[0x0000000068202000-0x0000000068203000) (0MB) [ 0.000000] efi: mem51: type=4, attr=0xf, range=[0x0000000068203000-0x0000000068207000) (0MB) [ 0.000000] efi: mem52: type=7, attr=0xf, range=[0x0000000068207000-0x0000000068208000) (0MB) [ 0.000000] efi: mem53: type=4, attr=0xf, range=[0x0000000068208000-0x000000006853d000) (3MB) [ 0.000000] efi: mem54: type=7, attr=0xf, range=[0x000000006853d000-0x000000006853e000) (0MB) [ 0.000000] efi: mem55: type=4, attr=0xf, range=[0x000000006853e000-0x0000000068552000) (0MB) [ 0.000000] efi: mem56: type=7, attr=0xf, range=[0x0000000068552000-0x0000000068554000) (0MB) [ 0.000000] efi: mem57: type=4, attr=0xf, range=[0x0000000068554000-0x0000000068564000) (0MB) [ 0.000000] efi: mem58: type=7, attr=0xf, range=[0x0000000068564000-0x0000000068565000) (0MB) [ 0.000000] efi: mem59: type=4, attr=0xf, range=[0x0000000068565000-0x000000006857a000) (0MB) [ 0.000000] efi: mem60: type=7, attr=0xf, range=[0x000000006857a000-0x000000006857b000) (0MB) [ 0.000000] efi: mem61: type=4, attr=0xf, range=[0x000000006857b000-0x000000006858b000) (0MB) [ 0.000000] efi: mem62: type=7, attr=0xf, 
range=[0x000000006858b000-0x000000006858c000) (0MB) [ 0.000000] efi: mem63: type=4, attr=0xf, range=[0x000000006858c000-0x00000000685b4000) (0MB) [ 0.000000] efi: mem64: type=7, attr=0xf, range=[0x00000000685b4000-0x00000000685b5000) (0MB) [ 0.000000] efi: mem65: type=4, attr=0xf, range=[0x00000000685b5000-0x00000000685cf000) (0MB) [ 0.000000] efi: mem66: type=7, attr=0xf, range=[0x00000000685cf000-0x00000000685d0000) (0MB) [ 0.000000] efi: mem67: type=4, attr=0xf, range=[0x00000000685d0000-0x00000000685eb000) (0MB) [ 0.000000] efi: mem68: type=7, attr=0xf, range=[0x00000000685eb000-0x00000000685ec000) (0MB) [ 0.000000] efi: mem69: type=4, attr=0xf, range=[0x00000000685ec000-0x000000006862f000) (0MB) [ 0.000000] efi: mem70: type=7, attr=0xf, range=[0x000000006862f000-0x0000000068630000) (0MB) [ 0.000000] efi: mem71: type=4, attr=0xf, range=[0x0000000068630000-0x0000000068641000) (0MB) [ 0.000000] efi: mem72: type=7, attr=0xf, range=[0x0000000068641000-0x0000000068643000) (0MB) [ 0.000000] efi: mem73: type=4, attr=0xf, range=[0x0000000068643000-0x0000000068648000) (0MB) [ 0.000000] efi: mem74: type=7, attr=0xf, range=[0x0000000068648000-0x0000000068649000) (0MB) [ 0.000000] efi: mem75: type=4, attr=0xf, range=[0x0000000068649000-0x0000000068658000) (0MB) [ 0.000000] efi: mem76: type=7, attr=0xf, range=[0x0000000068658000-0x0000000068659000) (0MB) [ 0.000000] efi: mem77: type=4, attr=0xf, range=[0x0000000068659000-0x000000006867a000) (0MB) [ 0.000000] efi: mem78: type=7, attr=0xf, range=[0x000000006867a000-0x000000006867b000) (0MB) [ 0.000000] efi: mem79: type=4, attr=0xf, range=[0x000000006867b000-0x00000000686da000) (0MB) [ 0.000000] efi: mem80: type=7, attr=0xf, range=[0x00000000686da000-0x00000000686db000) (0MB) [ 0.000000] efi: mem81: type=4, attr=0xf, range=[0x00000000686db000-0x00000000686de000) (0MB) [ 0.000000] efi: mem82: type=7, attr=0xf, range=[0x00000000686de000-0x00000000686df000) (0MB) [ 0.000000] efi: mem83: type=4, attr=0xf, 
range=[0x00000000686df000-0x00000000686e5000) (0MB) [ 0.000000] efi: mem84: type=7, attr=0xf, range=[0x00000000686e5000-0x00000000686e6000) (0MB) [ 0.000000] efi: mem85: type=4, attr=0xf, range=[0x00000000686e6000-0x00000000686e8000) (0MB) [ 0.000000] efi: mem86: type=7, attr=0xf, range=[0x00000000686e8000-0x00000000686e9000) (0MB) [ 0.000000] efi: mem87: type=4, attr=0xf, range=[0x00000000686e9000-0x00000000686ed000) (0MB) [ 0.000000] efi: mem88: type=7, attr=0xf, range=[0x00000000686ed000-0x00000000686ee000) (0MB) [ 0.000000] efi: mem89: type=4, attr=0xf, range=[0x00000000686ee000-0x00000000686f6000) (0MB) [ 0.000000] efi: mem90: type=7, attr=0xf, range=[0x00000000686f6000-0x00000000686f7000) (0MB) [ 0.000000] efi: mem91: type=4, attr=0xf, range=[0x00000000686f7000-0x0000000068701000) (0MB) [ 0.000000] efi: mem92: type=7, attr=0xf, range=[0x0000000068701000-0x0000000068702000) (0MB) [ 0.000000] efi: mem93: type=4, attr=0xf, range=[0x0000000068702000-0x0000000068704000) (0MB) [ 0.000000] efi: mem94: type=7, attr=0xf, range=[0x0000000068704000-0x0000000068705000) (0MB) [ 0.000000] efi: mem95: type=4, attr=0xf, range=[0x0000000068705000-0x0000000068722000) (0MB) [ 0.000000] efi: mem96: type=7, attr=0xf, range=[0x0000000068722000-0x0000000068723000) (0MB) [ 0.000000] efi: mem97: type=4, attr=0xf, range=[0x0000000068723000-0x000000006b8cf000) (49MB) [ 0.000000] efi: mem98: type=7, attr=0xf, range=[0x000000006b8cf000-0x000000006b8d0000) (0MB) [ 0.000000] efi: mem99: type=3, attr=0xf, range=[0x000000006b8d0000-0x000000006cacf000) (17MB) [ 0.000000] efi: mem100: type=6, attr=0x800000000000000f, range=[0x000000006cacf000-0x000000006cbcf000) (1MB) [ 0.000000] efi: mem101: type=5, attr=0x800000000000000f, range=[0x000000006cbcf000-0x000000006cdcf000) (2MB) [ 0.000000] efi: mem102: type=0, attr=0xf, range=[0x000000006cdcf000-0x000000006efcf000) (34MB) [ 0.000000] efi: mem103: type=10, attr=0xf, range=[0x000000006efcf000-0x000000006fdff000) (14MB) [ 0.000000] efi: mem104: 
type=9, attr=0xf, range=[0x000000006fdff000-0x000000006ffff000) (2MB) [ 0.000000] efi: mem105: type=4, attr=0xf, range=[0x000000006ffff000-0x0000000070000000) (0MB) [ 0.000000] efi: mem106: type=7, attr=0xf, range=[0x0000000100000000-0x000000107f380000) (63475MB) [ 0.000000] efi: mem107: type=7, attr=0xf, range=[0x0000001080000000-0x000000207ff80000) (65535MB) [ 0.000000] efi: mem108: type=7, attr=0xf, range=[0x0000002080000000-0x000000307ff80000) (65535MB) [ 0.000000] efi: mem109: type=7, attr=0xf, range=[0x0000003080000000-0x000000407ff80000) (65535MB) [ 0.000000] efi: mem110: type=0, attr=0x9, range=[0x0000000070000000-0x0000000080000000) (256MB) [ 0.000000] efi: mem111: type=11, attr=0x800000000000000f, range=[0x0000000080000000-0x0000000090000000) (256MB) [ 0.000000] efi: mem112: type=11, attr=0x800000000000000f, range=[0x00000000fec10000-0x00000000fec11000) (0MB) [ 0.000000] efi: mem113: type=11, attr=0x800000000000000f, range=[0x00000000fed80000-0x00000000fed81000) (0MB) [ 0.000000] efi: mem114: type=0, attr=0x0, range=[0x000000107f380000-0x0000001080000000) (12MB) [ 0.000000] efi: mem115: type=0, attr=0x0, range=[0x000000207ff80000-0x0000002080000000) (0MB) [ 0.000000] efi: mem116: type=0, attr=0x0, range=[0x000000307ff80000-0x0000003080000000) (0MB) [ 0.000000] efi: mem117: type=0, attr=0x0, range=[0x000000407ff80000-0x0000004080000000) (0MB) [ 0.000000] SMBIOS 3.2.0 present. [ 0.000000] DMI: Dell Inc. 
PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019 [ 0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved [ 0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable [ 0.000000] e820: last_pfn = 0x407ff80 max_arch_pfn = 0x400000000 [ 0.000000] MTRR default type: uncachable [ 0.000000] MTRR fixed ranges enabled: [ 0.000000] 00000-9FFFF write-back [ 0.000000] A0000-FFFFF uncachable [ 0.000000] MTRR variable ranges enabled: [ 0.000000] 0 base 0000FF000000 mask FFFFFF000000 write-protect [ 0.000000] 1 base 000000000000 mask FFFF80000000 write-back [ 0.000000] 2 base 000070000000 mask FFFFF0000000 uncachable [ 0.000000] 3 disabled [ 0.000000] 4 disabled [ 0.000000] 5 disabled [ 0.000000] 6 disabled [ 0.000000] 7 disabled [ 0.000000] TOM2: 0000004080000000 aka 264192M [ 0.000000] PAT configuration [0-7]: WB WC UC- UC WB WP UC- UC [ 0.000000] e820: last_pfn = 0x70000 max_arch_pfn = 0x400000000 [ 0.000000] Base memory trampoline at [ffff884bc0099000] 99000 size 24576 [ 0.000000] Using GB pages for direct mapping [ 0.000000] BRK [0x318cc53000, 0x318cc53fff] PGTABLE [ 0.000000] BRK [0x318cc54000, 0x318cc54fff] PGTABLE [ 0.000000] BRK [0x318cc55000, 0x318cc55fff] PGTABLE [ 0.000000] BRK [0x318cc56000, 0x318cc56fff] PGTABLE [ 0.000000] BRK [0x318cc57000, 0x318cc57fff] PGTABLE [ 0.000000] BRK [0x318cc58000, 0x318cc58fff] PGTABLE [ 0.000000] BRK [0x318cc59000, 0x318cc59fff] PGTABLE [ 0.000000] BRK [0x318cc5a000, 0x318cc5afff] PGTABLE [ 0.000000] BRK [0x318cc5b000, 0x318cc5bfff] PGTABLE [ 0.000000] BRK [0x318cc5c000, 0x318cc5cfff] PGTABLE [ 0.000000] BRK [0x318cc5d000, 0x318cc5dfff] PGTABLE [ 0.000000] BRK [0x318cc5e000, 0x318cc5efff] PGTABLE [ 0.000000] RAMDISK: [mem 0x3708d000-0x383d1fff] [ 0.000000] Early table checksum verification disabled [ 0.000000] ACPI: RSDP 000000006fffe014 00024 (v02 DELL ) [ 0.000000] ACPI: XSDT 000000006fffd0e8 000AC (v01 DELL PE_SC3 00000002 DELL 00000001) [ 0.000000] ACPI: FACP 000000006fff0000 00114 (v06 DELL PE_SC3 00000002 DELL 
00000001) [ 0.000000] ACPI: DSDT 000000006ffdc000 1038C (v02 DELL PE_SC3 00000002 DELL 00000001) [ 0.000000] ACPI: FACS 000000006fdd3000 00040 [ 0.000000] ACPI: SSDT 000000006fffc000 000D2 (v02 DELL PE_SC3 00000002 MSFT 04000000) [ 0.000000] ACPI: BERT 000000006fffb000 00030 (v01 DELL BERT 00000001 DELL 00000001) [ 0.000000] ACPI: HEST 000000006fffa000 006DC (v01 DELL HEST 00000001 DELL 00000001) [ 0.000000] ACPI: SSDT 000000006fff9000 00294 (v01 DELL PE_SC3 00000001 AMD 00000001) [ 0.000000] ACPI: SRAT 000000006fff8000 00420 (v03 DELL PE_SC3 00000001 AMD 00000001) [ 0.000000] ACPI: MSCT 000000006fff7000 0004E (v01 DELL PE_SC3 00000000 AMD 00000001) [ 0.000000] ACPI: SLIT 000000006fff6000 0003C (v01 DELL PE_SC3 00000001 AMD 00000001) [ 0.000000] ACPI: CRAT 000000006fff3000 02DC0 (v01 DELL PE_SC3 00000001 AMD 00000001) [ 0.000000] ACPI: EINJ 000000006fff2000 00150 (v01 DELL PE_SC3 00000001 AMD 00000001) [ 0.000000] ACPI: SLIC 000000006fff1000 00024 (v01 DELL PE_SC3 00000002 DELL 00000001) [ 0.000000] ACPI: HPET 000000006ffef000 00038 (v01 DELL PE_SC3 00000002 DELL 00000001) [ 0.000000] ACPI: APIC 000000006ffee000 004B2 (v03 DELL PE_SC3 00000002 DELL 00000001) [ 0.000000] ACPI: MCFG 000000006ffed000 0003C (v01 DELL PE_SC3 00000002 DELL 00000001) [ 0.000000] ACPI: SSDT 000000006ffdb000 00629 (v02 DELL xhc_port 00000001 INTL 20170119) [ 0.000000] ACPI: IVRS 000000006ffda000 00210 (v02 DELL PE_SC3 00000001 AMD 00000000) [ 0.000000] ACPI: SSDT 000000006ffd8000 01658 (v01 AMD CPMCMN 00000001 INTL 20170119) [ 0.000000] ACPI: Local APIC address 0xfee00000 [ 0.000000] SRAT: PXM 0 -> APIC 0x00 -> Node 0 [ 0.000000] SRAT: PXM 0 -> APIC 0x01 -> Node 0 [ 0.000000] SRAT: PXM 0 -> APIC 0x02 -> Node 0 [ 0.000000] SRAT: PXM 0 -> APIC 0x03 -> Node 0 [ 0.000000] SRAT: PXM 0 -> APIC 0x04 -> Node 0 [ 0.000000] SRAT: PXM 0 -> APIC 0x05 -> Node 0 [ 0.000000] SRAT: PXM 0 -> APIC 0x08 -> Node 0 [ 0.000000] SRAT: PXM 0 -> APIC 0x09 -> Node 0 [ 0.000000] SRAT: PXM 0 -> APIC 0x0a -> Node 0 [ 
0.000000] SRAT: PXM 0 -> APIC 0x0b -> Node 0 [ 0.000000] SRAT: PXM 0 -> APIC 0x0c -> Node 0 [ 0.000000] SRAT: PXM 0 -> APIC 0x0d -> Node 0 [ 0.000000] SRAT: PXM 1 -> APIC 0x10 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 0x11 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 0x12 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 0x13 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 0x14 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 0x15 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 0x18 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 0x19 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 0x1a -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 0x1b -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 0x1c -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 0x1d -> Node 1 [ 0.000000] SRAT: PXM 2 -> APIC 0x20 -> Node 2 [ 0.000000] SRAT: PXM 2 -> APIC 0x21 -> Node 2 [ 0.000000] SRAT: PXM 2 -> APIC 0x22 -> Node 2 [ 0.000000] SRAT: PXM 2 -> APIC 0x23 -> Node 2 [ 0.000000] SRAT: PXM 2 -> APIC 0x24 -> Node 2 [ 0.000000] SRAT: PXM 2 -> APIC 0x25 -> Node 2 [ 0.000000] SRAT: PXM 2 -> APIC 0x28 -> Node 2 [ 0.000000] SRAT: PXM 2 -> APIC 0x29 -> Node 2 [ 0.000000] SRAT: PXM 2 -> APIC 0x2a -> Node 2 [ 0.000000] SRAT: PXM 2 -> APIC 0x2b -> Node 2 [ 0.000000] SRAT: PXM 2 -> APIC 0x2c -> Node 2 [ 0.000000] SRAT: PXM 2 -> APIC 0x2d -> Node 2 [ 0.000000] SRAT: PXM 3 -> APIC 0x30 -> Node 3 [ 0.000000] SRAT: PXM 3 -> APIC 0x31 -> Node 3 [ 0.000000] SRAT: PXM 3 -> APIC 0x32 -> Node 3 [ 0.000000] SRAT: PXM 3 -> APIC 0x33 -> Node 3 [ 0.000000] SRAT: PXM 3 -> APIC 0x34 -> Node 3 [ 0.000000] SRAT: PXM 3 -> APIC 0x35 -> Node 3 [ 0.000000] SRAT: PXM 3 -> APIC 0x38 -> Node 3 [ 0.000000] SRAT: PXM 3 -> APIC 0x39 -> Node 3 [ 0.000000] SRAT: PXM 3 -> APIC 0x3a -> Node 3 [ 0.000000] SRAT: PXM 3 -> APIC 0x3b -> Node 3 [ 0.000000] SRAT: PXM 3 -> APIC 0x3c -> Node 3 [ 0.000000] SRAT: PXM 3 -> APIC 0x3d -> Node 3 [ 0.000000] SRAT: Node 0 PXM 0 [mem 0x00000000-0x0009ffff] [ 0.000000] SRAT: Node 0 PXM 0 [mem 0x00100000-0x7fffffff] [ 0.000000] SRAT: Node 0 PXM 0 [mem 
0x100000000-0x107fffffff] [ 0.000000] SRAT: Node 1 PXM 1 [mem 0x1080000000-0x207fffffff] [ 0.000000] SRAT: Node 2 PXM 2 [mem 0x2080000000-0x307fffffff] [ 0.000000] SRAT: Node 3 PXM 3 [mem 0x3080000000-0x407fffffff] [ 0.000000] NUMA: Initialized distance table, cnt=4 [ 0.000000] NUMA: Node 0 [mem 0x00000000-0x0009ffff] + [mem 0x00100000-0x7fffffff] -> [mem 0x00000000-0x7fffffff] [ 0.000000] NUMA: Node 0 [mem 0x00000000-0x7fffffff] + [mem 0x100000000-0x107fffffff] -> [mem 0x00000000-0x107fffffff] [ 0.000000] NODE_DATA(0) allocated [mem 0x107f359000-0x107f37ffff] [ 0.000000] NODE_DATA(1) allocated [mem 0x207ff59000-0x207ff7ffff] [ 0.000000] NODE_DATA(2) allocated [mem 0x307ff59000-0x307ff7ffff] [ 0.000000] NODE_DATA(3) allocated [mem 0x407ff58000-0x407ff7efff] [ 0.000000] Reserving 176MB of memory at 704MB for crashkernel (System RAM: 261692MB) [ 0.000000] Zone ranges: [ 0.000000] DMA [mem 0x00001000-0x00ffffff] [ 0.000000] DMA32 [mem 0x01000000-0xffffffff] [ 0.000000] Normal [mem 0x100000000-0x407ff7ffff] [ 0.000000] Movable zone start for each node [ 0.000000] Early memory node ranges [ 0.000000] node 0: [mem 0x00001000-0x0008efff] [ 0.000000] node 0: [mem 0x00090000-0x0009ffff] [ 0.000000] node 0: [mem 0x00100000-0x4f882fff] [ 0.000000] node 0: [mem 0x5788c000-0x6cacefff] [ 0.000000] node 0: [mem 0x6ffff000-0x6fffffff] [ 0.000000] node 0: [mem 0x100000000-0x107f37ffff] [ 0.000000] node 1: [mem 0x1080000000-0x207ff7ffff] [ 0.000000] node 2: [mem 0x2080000000-0x307ff7ffff] [ 0.000000] node 3: [mem 0x3080000000-0x407ff7ffff] [ 0.000000] Initmem setup node 0 [mem 0x00001000-0x107f37ffff] [ 0.000000] On node 0 totalpages: 16661989 [ 0.000000] DMA zone: 64 pages used for memmap [ 0.000000] DMA zone: 1126 pages reserved [ 0.000000] DMA zone: 3998 pages, LIFO batch:0 [ 0.000000] DMA32 zone: 6380 pages used for memmap [ 0.000000] DMA32 zone: 408263 pages, LIFO batch:31 [ 0.000000] Normal zone: 253902 pages used for memmap [ 0.000000] Normal zone: 16249728 pages, LIFO 
batch:31 [ 0.000000] Initmem setup node 1 [mem 0x1080000000-0x207ff7ffff] [ 0.000000] On node 1 totalpages: 16777088 [ 0.000000] Normal zone: 262142 pages used for memmap [ 0.000000] Normal zone: 16777088 pages, LIFO batch:31 [ 0.000000] Initmem setup node 2 [mem 0x2080000000-0x307ff7ffff] [ 0.000000] On node 2 totalpages: 16777088 [ 0.000000] Normal zone: 262142 pages used for memmap [ 0.000000] Normal zone: 16777088 pages, LIFO batch:31 [ 0.000000] Initmem setup node 3 [mem 0x3080000000-0x407ff7ffff] [ 0.000000] On node 3 totalpages: 16777088 [ 0.000000] Normal zone: 262142 pages used for memmap [ 0.000000] Normal zone: 16777088 pages, LIFO batch:31 [ 0.000000] ACPI: PM-Timer IO Port: 0x408 [ 0.000000] ACPI: Local APIC address 0xfee00000 [ 0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x10] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x20] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x30] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x04] lapic_id[0x08] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x05] lapic_id[0x18] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x06] lapic_id[0x28] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x07] lapic_id[0x38] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x08] lapic_id[0x02] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x09] lapic_id[0x12] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x0a] lapic_id[0x22] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x0b] lapic_id[0x32] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x0c] lapic_id[0x0a] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x0d] lapic_id[0x1a] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x0e] lapic_id[0x2a] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x0f] lapic_id[0x3a] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x10] lapic_id[0x04] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x11] lapic_id[0x14] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x12] lapic_id[0x24] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x13] 
lapic_id[0x34] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x14] lapic_id[0x0c] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x15] lapic_id[0x1c] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x16] lapic_id[0x2c] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x17] lapic_id[0x3c] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x18] lapic_id[0x01] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x19] lapic_id[0x11] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x1a] lapic_id[0x21] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x1b] lapic_id[0x31] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x1c] lapic_id[0x09] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x1d] lapic_id[0x19] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x1e] lapic_id[0x29] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x1f] lapic_id[0x39] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x20] lapic_id[0x03] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x21] lapic_id[0x13] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x22] lapic_id[0x23] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x23] lapic_id[0x33] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x24] lapic_id[0x0b] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x25] lapic_id[0x1b] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x26] lapic_id[0x2b] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x27] lapic_id[0x3b] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x28] lapic_id[0x05] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x29] lapic_id[0x15] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x2a] lapic_id[0x25] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x2b] lapic_id[0x35] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x2c] lapic_id[0x0d] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x2d] lapic_id[0x1d] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x2e] lapic_id[0x2d] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x2f] lapic_id[0x3d] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x30] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x31] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x32] lapic_id[0x00] disabled) [ 0.000000] ACPI: 
LAPIC (acpi_id[0x33] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x34] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x35] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x36] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x37] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x38] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x39] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x3a] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x3b] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x3c] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x3d] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x3e] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x3f] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x40] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x41] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x42] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x43] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x44] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x45] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x46] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x47] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x48] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x49] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x4a] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x4b] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x4c] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x4d] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x4e] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x4f] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x50] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x51] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC 
(acpi_id[0x52] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x53] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x54] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x55] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x56] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x57] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x58] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x59] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x5a] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x5b] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x5c] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x5d] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x5e] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x5f] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x60] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x61] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x62] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x63] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x64] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x65] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x66] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x67] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x68] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x69] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x6a] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x6b] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x6c] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x6d] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x6e] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x6f] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x70] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x71] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x72] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x73] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x74] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x75] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x76] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x77] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x78] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x79] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x7a] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x7b] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x7c] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x7d] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x7e] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x7f] lapic_id[0x00] disabled)
[ 0.000000] ACPI: LAPIC_NMI (acpi_id[0xff] high edge lint[0x1])
[ 0.000000] ACPI: IOAPIC (id[0x80] address[0xfec00000] gsi_base[0])
[ 0.000000] IOAPIC[0]: apic_id 128, version 33, address 0xfec00000, GSI 0-23
[ 0.000000] ACPI: IOAPIC (id[0x81] address[0xfd880000] gsi_base[24])
[ 0.000000] IOAPIC[1]: apic_id 129, version 33, address 0xfd880000, GSI 24-55
[ 0.000000] ACPI: IOAPIC (id[0x82] address[0xe0900000] gsi_base[56])
[ 0.000000] IOAPIC[2]: apic_id 130, version 33, address 0xe0900000, GSI 56-87
[ 0.000000] ACPI: IOAPIC (id[0x83] address[0xc5900000] gsi_base[88])
[ 0.000000] IOAPIC[3]: apic_id 131, version 33, address 0xc5900000, GSI 88-119
[ 0.000000] ACPI: IOAPIC (id[0x84] address[0xaa900000] gsi_base[120])
[ 0.000000] IOAPIC[4]: apic_id 132, version 33, address 0xaa900000, GSI 120-151
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
[ 0.000000] ACPI: IRQ0 used by override.
[ 0.000000] ACPI: IRQ9 used by override.
[ 0.000000] Using ACPI (MADT) for SMP configuration information
[ 0.000000] ACPI: HPET id: 0x10228201 base: 0xfed00000
[ 0.000000] smpboot: Allowing 128 CPUs, 80 hotplug CPUs
[ 0.000000] PM: Registered nosave memory: [mem 0x0008f000-0x0008ffff]
[ 0.000000] PM: Registered nosave memory: [mem 0x000a0000-0x000fffff]
[ 0.000000] PM: Registered nosave memory: [mem 0x37007000-0x37007fff]
[ 0.000000] PM: Registered nosave memory: [mem 0x3701f000-0x3701ffff]
[ 0.000000] PM: Registered nosave memory: [mem 0x37020000-0x37020fff]
[ 0.000000] PM: Registered nosave memory: [mem 0x37028000-0x37028fff]
[ 0.000000] PM: Registered nosave memory: [mem 0x37029000-0x37029fff]
[ 0.000000] PM: Registered nosave memory: [mem 0x3705a000-0x3705afff]
[ 0.000000] PM: Registered nosave memory: [mem 0x3705b000-0x3705bfff]
[ 0.000000] PM: Registered nosave memory: [mem 0x3708c000-0x3708cfff]
[ 0.000000] PM: Registered nosave memory: [mem 0x4f883000-0x5788bfff]
[ 0.000000] PM: Registered nosave memory: [mem 0x6cacf000-0x6efcefff]
[ 0.000000] PM: Registered nosave memory: [mem 0x6efcf000-0x6fdfefff]
[ 0.000000] PM: Registered nosave memory: [mem 0x6fdff000-0x6fffefff]
[ 0.000000] PM: Registered nosave memory: [mem 0x70000000-0x8fffffff]
[ 0.000000] PM: Registered nosave memory: [mem 0x90000000-0xfec0ffff]
[ 0.000000] PM: Registered nosave memory: [mem 0xfec10000-0xfec10fff]
[ 0.000000] PM: Registered nosave memory: [mem 0xfec11000-0xfed7ffff]
[ 0.000000] PM: Registered nosave memory: [mem 0xfed80000-0xfed80fff]
[ 0.000000] PM: Registered nosave memory: [mem 0xfed81000-0xffffffff]
[ 0.000000] PM: Registered nosave memory: [mem 0x107f380000-0x107fffffff]
[ 0.000000] PM: Registered nosave memory: [mem 0x207ff80000-0x207fffffff]
[ 0.000000] PM: Registered nosave memory: [mem 0x307ff80000-0x307fffffff]
[ 0.000000] e820: [mem 0x90000000-0xfec0ffff] available for PCI devices
[ 0.000000] Booting paravirtualized kernel on bare hardware
[ 0.000000] setup_percpu: NR_CPUS:5120 nr_cpumask_bits:128 nr_cpu_ids:128 nr_node_ids:4
[ 0.000000] PERCPU: Embedded 38 pages/cpu @ffff885bfee00000 s118784 r8192 d28672 u262144
[ 0.000000] pcpu-alloc: s118784 r8192 d28672 u262144 alloc=1*2097152
[ 0.000000] pcpu-alloc: [0] 000 004 008 012 016 020 024 028
[ 0.000000] pcpu-alloc: [0] 032 036 040 044 048 052 056 060
[ 0.000000] pcpu-alloc: [0] 064 068 072 076 080 084 088 092
[ 0.000000] pcpu-alloc: [0] 096 100 104 108 112 116 120 124
[ 0.000000] pcpu-alloc: [1] 001 005 009 013 017 021 025 029
[ 0.000000] pcpu-alloc: [1] 033 037 041 045 049 053 057 061
[ 0.000000] pcpu-alloc: [1] 065 069 073 077 081 085 089 093
[ 0.000000] pcpu-alloc: [1] 097 101 105 109 113 117 121 125
[ 0.000000] pcpu-alloc: [2] 002 006 010 014 018 022 026 030
[ 0.000000] pcpu-alloc: [2] 034 038 042 046 050 054 058 062
[ 0.000000] pcpu-alloc: [2] 066 070 074 078 082 086 090 094
[ 0.000000] pcpu-alloc: [2] 098 102 106 110 114 118 122 126
[ 0.000000] pcpu-alloc: [3] 003 007 011 015 019 023 027 031
[ 0.000000] pcpu-alloc: [3] 035 039 043 047 051 055 059 063
[ 0.000000] pcpu-alloc: [3] 067 071 075 079 083 087 091 095
[ 0.000000] pcpu-alloc: [3] 099 103 107 111 115 119 123 127
[ 0.000000] Built 4 zonelists in Zone order, mobility grouping on. Total pages: 65945355
[ 0.000000] Policy zone: Normal
[ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.10.0-957.27.2.el7_lustre.pl2.x86_64 root=UUID=abdfca31-9e32-4c60-981c-98bd3cab6b0a ro crashkernel=auto nomodeset console=ttyS0,115200 LANG=en_US.UTF-8
[ 0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
[ 0.000000] x86/fpu: xstate_offset[2]: 0240, xstate_sizes[2]: 0100
[ 0.000000] xsave: enabled xstate_bv 0x7, cntxt size 0x340 using standard form
[ 0.000000] Memory: 9613428k/270532096k available (7676k kernel code, 2559084k absent, 4654532k reserved, 6045k data, 1876k init)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=128, Nodes=4
[ 0.000000] Hierarchical RCU implementation.
[ 0.000000] RCU restricting CPUs from NR_CPUS=5120 to nr_cpu_ids=128.
[ 0.000000] NR_IRQS:327936 nr_irqs:3624 0
[ 0.000000] Console: colour dummy device 80x25
[ 0.000000] console [ttyS0] enabled
[ 0.000000] allocated 1072693248 bytes of page_cgroup
[ 0.000000] please try 'cgroup_disable=memory' option if you don't want memory cgroups
[ 0.000000] Enabling automatic NUMA balancing. Configure with numa_balancing= or the kernel.numa_balancing sysctl
[ 0.000000] hpet clockevent registered
[ 0.000000] tsc: Fast TSC calibration using PIT
[ 0.000000] tsc: Detected 1996.233 MHz processor
[ 0.000057] Calibrating delay loop (skipped), value calculated using timer frequency.. 3992.46 BogoMIPS (lpj=1996233)
[ 0.010704] pid_max: default: 131072 minimum: 1024
[ 0.016200] Security Framework initialized
[ 0.020319] SELinux: Initializing.
[ 0.023882] SELinux: Starting in permissive mode
[ 0.023883] Yama: becoming mindful.
[ 0.044373] Dentry cache hash table entries: 33554432 (order: 16, 268435456 bytes)
[ 0.100613] Inode-cache hash table entries: 16777216 (order: 15, 134217728 bytes)
[ 0.128398] Mount-cache hash table entries: 524288 (order: 10, 4194304 bytes)
[ 0.135808] Mountpoint-cache hash table entries: 524288 (order: 10, 4194304 bytes)
[ 0.144968] Initializing cgroup subsys memory
[ 0.149361] Initializing cgroup subsys devices
[ 0.153817] Initializing cgroup subsys freezer
[ 0.158272] Initializing cgroup subsys net_cls
[ 0.162727] Initializing cgroup subsys blkio
[ 0.167006] Initializing cgroup subsys perf_event
[ 0.171731] Initializing cgroup subsys hugetlb
[ 0.176186] Initializing cgroup subsys pids
[ 0.180381] Initializing cgroup subsys net_prio
[ 0.184990] tseg: 0070000000
[ 0.190618] LVT offset 2 assigned for vector 0xf4
[ 0.195351] Last level iTLB entries: 4KB 1024, 2MB 1024, 4MB 512
[ 0.201371] Last level dTLB entries: 4KB 1536, 2MB 1536, 4MB 768
[ 0.207385] tlb_flushall_shift: 6
[ 0.210734] Speculative Store Bypass: Mitigation: Speculative Store Bypass disabled via prctl and seccomp
[ 0.220307] FEATURE SPEC_CTRL Not Present
[ 0.224328] FEATURE IBPB_SUPPORT Present
[ 0.228265] Spectre V2 : Enabling Indirect Branch Prediction Barrier
[ 0.234701] Spectre V2 : Mitigation: Full retpoline
[ 0.240136] Freeing SMP alternatives: 28k freed
[ 0.246554] ACPI: Core revision 20130517
[ 0.255273] ACPI: All ACPI Tables successfully acquired
[ 0.265617] ftrace: allocating 29216 entries in 115 pages
[ 0.605906] Switched APIC routing to physical flat.
[ 0.612819] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[ 0.628830] smpboot: CPU0: AMD EPYC 7401P 24-Core Processor (fam: 17, model: 01, stepping: 02)
[ 0.709602] random: fast init done
[ 0.740602] APIC calibration not consistent with PM-Timer: 101ms instead of 100ms
[ 0.748083] APIC delta adjusted to PM-Timer: 623827 (636297)
[ 0.753775] Performance Events: Fam17h core perfctr, AMD PMU driver.
[ 0.760210] ... version: 0
[ 0.764220] ... bit width: 48
[ 0.768320] ... generic registers: 6
[ 0.772331] ... value mask: 0000ffffffffffff
[ 0.777646] ... max period: 00007fffffffffff
[ 0.782958] ... fixed-purpose events: 0
[ 0.786971] ... event mask: 000000000000003f
[ 0.795293] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
[ 0.803374] smpboot: Booting Node 1, Processors #1 OK
[ 0.816584] smpboot: Booting Node 2, Processors #2 OK
[ 0.829787] smpboot: Booting Node 3, Processors #3 OK
[ 0.842978] smpboot: Booting Node 0, Processors #4 OK
[ 0.856161] smpboot: Booting Node 1, Processors #5 OK
[ 0.869338] smpboot: Booting Node 2, Processors #6 OK
[ 0.882521] smpboot: Booting Node 3, Processors #7 OK
[ 0.895702] smpboot: Booting Node 0, Processors #8 OK
[ 0.909097] smpboot: Booting Node 1, Processors #9 OK
[ 0.922283] smpboot: Booting Node 2, Processors #10 OK
[ 0.935561] smpboot: Booting Node 3, Processors #11 OK
[ 0.948832] smpboot: Booting Node 0, Processors #12 OK
[ 0.962103] smpboot: Booting Node 1, Processors #13 OK
[ 0.975386] smpboot: Booting Node 2, Processors #14 OK
[ 0.988654] smpboot: Booting Node 3, Processors #15 OK
[ 1.001925] smpboot: Booting Node 0, Processors #16 OK
[ 1.015313] smpboot: Booting Node 1, Processors #17 OK
[ 1.028591] smpboot: Booting Node 2, Processors #18 OK
[ 1.041871] smpboot: Booting Node 3, Processors #19 OK
[ 1.055140] smpboot: Booting Node 0, Processors #20 OK
[ 1.068406] smpboot: Booting Node 1, Processors #21 OK
[ 1.081673] smpboot: Booting Node 2, Processors #22 OK
[ 1.094954] smpboot: Booting Node 3, Processors #23 OK
[ 1.108223] smpboot: Booting Node 0, Processors #24 OK
[ 1.121961] smpboot: Booting Node 1, Processors #25 OK
[ 1.135211] smpboot: Booting Node 2, Processors #26 OK
[ 1.148450] smpboot: Booting Node 3, Processors #27 OK
[ 1.161694] smpboot: Booting Node 0, Processors #28 OK
[ 1.174922] smpboot: Booting Node 1, Processors #29 OK
[ 1.188157] smpboot: Booting Node 2, Processors #30 OK
[ 1.201388] smpboot: Booting Node 3, Processors #31 OK
[ 1.214612] smpboot: Booting Node 0, Processors #32 OK
[ 1.227944] smpboot: Booting Node 1, Processors #33 OK
[ 1.241290] smpboot: Booting Node 2, Processors #34 OK
[ 1.254626] smpboot: Booting Node 3, Processors #35 OK
[ 1.267855] smpboot: Booting Node 0, Processors #36 OK
[ 1.281080] smpboot: Booting Node 1, Processors #37 OK
[ 1.294424] smpboot: Booting Node 2, Processors #38 OK
[ 1.307779] smpboot: Booting Node 3, Processors #39 OK
[ 1.321112] smpboot: Booting Node 0, Processors #40 OK
[ 1.334447] smpboot: Booting Node 1, Processors #41 OK
[ 1.347786] smpboot: Booting Node 2, Processors #42 OK
[ 1.361116] smpboot: Booting Node 3, Processors #43 OK
[ 1.374350] smpboot: Booting Node 0, Processors #44 OK
[ 1.387582] smpboot: Booting Node 1, Processors #45 OK
[ 1.400925] smpboot: Booting Node 2, Processors #46 OK
[ 1.414261] smpboot: Booting Node 3, Processors #47
[ 1.426969] Brought up 48 CPUs
[ 1.430228] smpboot: Max logical packages: 3
[ 1.434504] smpboot: Total of 48 processors activated (191638.36 BogoMIPS)
[ 1.722736] node 0 initialised, 15462980 pages in 274ms
[ 1.731289] node 1 initialised, 15989367 pages in 278ms
[ 1.731863] node 2 initialised, 15989367 pages in 279ms
[ 1.740479] node 3 initialised, 15984547 pages in 287ms
[ 1.747523] devtmpfs: initialized
[ 1.773255] EVM: security.selinux
[ 1.776574] EVM: security.ima
[ 1.779546] EVM: security.capability
[ 1.783222] PM: Registering ACPI NVS region [mem 0x0008f000-0x0008ffff] (4096 bytes)
[ 1.790965] PM: Registering ACPI NVS region [mem 0x6efcf000-0x6fdfefff] (14876672 bytes)
[ 1.800618] atomic64 test passed for x86-64 platform with CX8 and with SSE
[ 1.807491] pinctrl core: initialized pinctrl subsystem
[ 1.812824] RTC time: 13:57:21, date: 12/10/19
[ 1.817428] NET: Registered protocol family 16
[ 1.822227] ACPI FADT declares the system doesn't support PCIe ASPM, so disable it
[ 1.829800] ACPI: bus type PCI registered
[ 1.833812] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
[ 1.840396] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
[ 1.849698] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820
[ 1.856489] PCI: Using configuration type 1 for base access
[ 1.862073] PCI: Dell System detected, enabling pci=bfsort.
[ 1.878192] ACPI: Added _OSI(Module Device)
[ 1.882380] ACPI: Added _OSI(Processor Device)
[ 1.886825] ACPI: Added _OSI(3.0 _SCP Extensions)
[ 1.891531] ACPI: Added _OSI(Processor Aggregator Device)
[ 1.896931] ACPI: Added _OSI(Linux-Dell-Video)
[ 1.902200] ACPI: EC: Look up EC in DSDT
[ 1.903182] ACPI: Executed 2 blocks of module-level executable AML code
[ 1.915243] ACPI: Interpreter enabled
[ 1.918917] ACPI: (supports S0 S5)
[ 1.922325] ACPI: Using IOAPIC for interrupt routing
[ 1.927502] HEST: Table parsing has been initialized.
[ 1.932563] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[ 1.941719] ACPI: Enabled 1 GPEs in block 00 to 1F
[ 1.953400] ACPI: PCI Interrupt Link [LNKA] (IRQs 4 5 7 10 11 14 15) *0
[ 1.960312] ACPI: PCI Interrupt Link [LNKB] (IRQs 4 5 7 10 11 14 15) *0
[ 1.967217] ACPI: PCI Interrupt Link [LNKC] (IRQs 4 5 7 10 11 14 15) *0
[ 1.974124] ACPI: PCI Interrupt Link [LNKD] (IRQs 4 5 7 10 11 14 15) *0
[ 1.981032] ACPI: PCI Interrupt Link [LNKE] (IRQs 4 5 7 10 11 14 15) *0
[ 1.987942] ACPI: PCI Interrupt Link [LNKF] (IRQs 4 5 7 10 11 14 15) *0
[ 1.994847] ACPI: PCI Interrupt Link [LNKG] (IRQs 4 5 7 10 11 14 15) *0
[ 2.001753] ACPI: PCI Interrupt Link [LNKH] (IRQs 4 5 7 10 11 14 15) *0
[ 2.008802] ACPI: PCI Root Bridge [PC00] (domain 0000 [bus 00-3f])
[ 2.014985] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI]
[ 2.023203] acpi PNP0A08:00: PCIe AER handled by firmware
[ 2.028647] acpi PNP0A08:00: _OSC: platform does not support [SHPCHotplug]
[ 2.035592] acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug PME PCIeCapability]
[ 2.043241] acpi PNP0A08:00: FADT indicates ASPM is unsupported, using BIOS configuration
[ 2.051702] PCI host bridge to bus 0000:00
[ 2.055803] pci_bus 0000:00: root bus resource [io 0x0000-0x03af window]
[ 2.062586] pci_bus 0000:00: root bus resource [io 0x03e0-0x0cf7 window]
[ 2.069371] pci_bus 0000:00: root bus resource [mem 0x000c0000-0x000c3fff window]
[ 2.076850] pci_bus 0000:00: root bus resource [mem 0x000c4000-0x000c7fff window]
[ 2.084332] pci_bus 0000:00: root bus resource [mem 0x000c8000-0x000cbfff window]
[ 2.091811] pci_bus 0000:00: root bus resource [mem 0x000cc000-0x000cffff window]
[ 2.099290] pci_bus 0000:00: root bus resource [mem 0x000d0000-0x000d3fff window]
[ 2.106770] pci_bus 0000:00: root bus resource [mem 0x000d4000-0x000d7fff window]
[ 2.114249] pci_bus 0000:00: root bus resource [mem 0x000d8000-0x000dbfff window]
[ 2.121729] pci_bus 0000:00: root bus resource [mem 0x000dc000-0x000dffff window]
[ 2.129207] pci_bus 0000:00: root bus resource [mem 0x000e0000-0x000e3fff window]
[ 2.136689] pci_bus 0000:00: root bus resource [mem 0x000e4000-0x000e7fff window]
[ 2.144166] pci_bus 0000:00: root bus resource [mem 0x000e8000-0x000ebfff window]
[ 2.151645] pci_bus 0000:00: root bus resource [mem 0x000ec000-0x000effff window]
[ 2.159125] pci_bus 0000:00: root bus resource [mem 0x000f0000-0x000fffff window]
[ 2.166604] pci_bus 0000:00: root bus resource [io 0x0d00-0x3fff window]
[ 2.173389] pci_bus 0000:00: root bus resource [mem 0xe1000000-0xfebfffff window]
[ 2.180869] pci_bus 0000:00: root bus resource [mem 0x10000000000-0x2bf3fffffff window]
[ 2.188871] pci_bus 0000:00: root bus resource [bus 00-3f]
[ 2.194364] pci 0000:00:00.0: [1022:1450] type 00 class 0x060000
[ 2.194447] pci 0000:00:00.2: [1022:1451] type 00 class 0x080600
[ 2.194534] pci 0000:00:01.0: [1022:1452] type 00 class 0x060000
[ 2.194612] pci 0000:00:02.0: [1022:1452] type 00 class 0x060000
[ 2.194686] pci 0000:00:03.0: [1022:1452] type 00 class 0x060000
[ 2.194747] pci 0000:00:03.1: [1022:1453] type 01 class 0x060400
[ 2.195405] pci 0000:00:03.1: PME# supported from D0 D3hot D3cold
[ 2.195505] pci 0000:00:04.0: [1022:1452] type 00 class 0x060000
[ 2.195586] pci 0000:00:07.0: [1022:1452] type 00 class 0x060000
[ 2.195647] pci 0000:00:07.1: [1022:1454] type 01 class 0x060400
[ 2.196395] pci 0000:00:07.1: PME# supported from D0 D3hot D3cold
[ 2.196475] pci 0000:00:08.0: [1022:1452] type 00 class 0x060000
[ 2.196536] pci 0000:00:08.1: [1022:1454] type 01 class 0x060400
[ 2.197366] pci 0000:00:08.1: PME# supported from D0 D3hot D3cold
[ 2.197482] pci 0000:00:14.0: [1022:790b] type 00 class 0x0c0500
[ 2.197682] pci 0000:00:14.3: [1022:790e] type 00 class 0x060100
[ 2.197886] pci 0000:00:18.0: [1022:1460] type 00 class 0x060000
[ 2.197939] pci 0000:00:18.1: [1022:1461] type 00 class 0x060000
[ 2.197991] pci 0000:00:18.2: [1022:1462] type 00 class 0x060000
[ 2.198043] pci 0000:00:18.3: [1022:1463] type 00 class 0x060000
[ 2.198093] pci 0000:00:18.4: [1022:1464] type 00 class 0x060000
[ 2.198143] pci 0000:00:18.5: [1022:1465] type 00 class 0x060000
[ 2.198193] pci 0000:00:18.6: [1022:1466] type 00 class 0x060000
[ 2.198243] pci 0000:00:18.7: [1022:1467] type 00 class 0x060000
[ 2.198293] pci 0000:00:19.0: [1022:1460] type 00 class 0x060000
[ 2.198347] pci 0000:00:19.1: [1022:1461] type 00 class 0x060000
[ 2.198402] pci 0000:00:19.2: [1022:1462] type 00 class 0x060000
[ 2.198458] pci 0000:00:19.3: [1022:1463] type 00 class 0x060000
[ 2.198512] pci 0000:00:19.4: [1022:1464] type 00 class 0x060000
[ 2.198566] pci 0000:00:19.5: [1022:1465] type 00 class 0x060000
[ 2.198621] pci 0000:00:19.6: [1022:1466] type 00 class 0x060000
[ 2.198674] pci 0000:00:19.7: [1022:1467] type 00 class 0x060000
[ 2.198727] pci 0000:00:1a.0: [1022:1460] type 00 class 0x060000
[ 2.198783] pci 0000:00:1a.1: [1022:1461] type 00 class 0x060000
[ 2.198834] pci 0000:00:1a.2: [1022:1462] type 00 class 0x060000
[ 2.198888] pci 0000:00:1a.3: [1022:1463] type 00 class 0x060000
[ 2.198943] pci 0000:00:1a.4: [1022:1464] type 00 class 0x060000
[ 2.198997] pci 0000:00:1a.5: [1022:1465] type 00 class 0x060000
[ 2.199051] pci 0000:00:1a.6: [1022:1466] type 00 class 0x060000
[ 2.199106] pci 0000:00:1a.7: [1022:1467] type 00 class 0x060000
[ 2.199160] pci 0000:00:1b.0: [1022:1460] type 00 class 0x060000
[ 2.199215] pci 0000:00:1b.1: [1022:1461] type 00 class 0x060000
[ 2.199268] pci 0000:00:1b.2: [1022:1462] type 00 class 0x060000
[ 2.199322] pci 0000:00:1b.3: [1022:1463] type 00 class 0x060000
[ 2.199374] pci 0000:00:1b.4: [1022:1464] type 00 class 0x060000
[ 2.199428] pci 0000:00:1b.5: [1022:1465] type 00 class 0x060000
[ 2.199481] pci 0000:00:1b.6: [1022:1466] type 00 class 0x060000
[ 2.199535] pci 0000:00:1b.7: [1022:1467] type 00 class 0x060000
[ 2.200407] pci 0000:01:00.0: [15b3:101b] type 00 class 0x020700
[ 2.200552] pci 0000:01:00.0: reg 0x10: [mem 0xe2000000-0xe3ffffff 64bit pref]
[ 2.200787] pci 0000:01:00.0: reg 0x30: [mem 0xfff00000-0xffffffff pref]
[ 2.201193] pci 0000:01:00.0: PME# supported from D3cold
[ 2.201470] pci 0000:00:03.1: PCI bridge to [bus 01]
[ 2.206443] pci 0000:00:03.1: bridge window [mem 0xe2000000-0xe3ffffff 64bit pref]
[ 2.206529] pci 0000:02:00.0: [1022:145a] type 00 class 0x130000
[ 2.206628] pci 0000:02:00.2: [1022:1456] type 00 class 0x108000
[ 2.206646] pci 0000:02:00.2: reg 0x18: [mem 0xf7300000-0xf73fffff]
[ 2.206658] pci 0000:02:00.2: reg 0x24: [mem 0xf7400000-0xf7401fff]
[ 2.206735] pci 0000:02:00.3: [1022:145f] type 00 class 0x0c0330
[ 2.206748] pci 0000:02:00.3: reg 0x10: [mem 0xf7200000-0xf72fffff 64bit]
[ 2.206798] pci 0000:02:00.3: PME# supported from D0 D3hot D3cold
[ 2.206858] pci 0000:00:07.1: PCI bridge to [bus 02]
[ 2.211830] pci 0000:00:07.1: bridge window [mem 0xf7200000-0xf74fffff]
[ 2.212405] pci 0000:03:00.0: [1022:1455] type 00 class 0x130000
[ 2.212514] pci 0000:03:00.1: [1022:1468] type 00 class 0x108000
[ 2.212532] pci 0000:03:00.1: reg 0x18: [mem 0xf7000000-0xf70fffff]
[ 2.212545] pci 0000:03:00.1: reg 0x24: [mem 0xf7100000-0xf7101fff]
[ 2.212636] pci 0000:00:08.1: PCI bridge to [bus 03]
[ 2.217603] pci 0000:00:08.1: bridge window [mem 0xf7000000-0xf71fffff]
[ 2.217619] pci_bus 0000:00: on NUMA node 0
[ 2.218002] ACPI: PCI Root Bridge [PC01] (domain 0000 [bus 40-7f])
[ 2.224191] acpi PNP0A08:01: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI]
[ 2.232408] acpi PNP0A08:01: PCIe AER handled by firmware
[ 2.237850] acpi PNP0A08:01: _OSC: platform does not support [SHPCHotplug]
[ 2.244798] acpi PNP0A08:01: _OSC: OS now controls [PCIeHotplug PME PCIeCapability]
[ 2.252449] acpi PNP0A08:01: FADT indicates ASPM is unsupported, using BIOS configuration
[ 2.260866] PCI host bridge to bus 0000:40
[ 2.264965] pci_bus 0000:40: root bus resource [io 0x4000-0x7fff window]
[ 2.271749] pci_bus 0000:40: root bus resource [mem 0xc6000000-0xe0ffffff window]
[ 2.279230] pci_bus 0000:40: root bus resource [mem 0x2bf40000000-0x47e7fffffff window]
[ 2.287228] pci_bus 0000:40: root bus resource [bus 40-7f]
[ 2.292719] pci 0000:40:00.0: [1022:1450] type 00 class 0x060000
[ 2.292791] pci 0000:40:00.2: [1022:1451] type 00 class 0x080600
[ 2.292883] pci 0000:40:01.0: [1022:1452] type 00 class 0x060000
[ 2.292958] pci 0000:40:02.0: [1022:1452] type 00 class 0x060000
[ 2.293033] pci 0000:40:03.0: [1022:1452] type 00 class 0x060000
[ 2.293106] pci 0000:40:04.0: [1022:1452] type 00 class 0x060000
[ 2.293187] pci 0000:40:07.0: [1022:1452] type 00 class 0x060000
[ 2.293247] pci 0000:40:07.1: [1022:1454] type 01 class 0x060400
[ 2.293794] pci 0000:40:07.1: PME# supported from D0 D3hot D3cold
[ 2.293873] pci 0000:40:08.0: [1022:1452] type 00 class 0x060000
[ 2.293936] pci 0000:40:08.1: [1022:1454] type 01 class 0x060400
[ 2.294049] pci 0000:40:08.1: PME# supported from D0 D3hot D3cold
[ 2.294748] pci 0000:41:00.0: [1022:145a] type 00 class 0x130000
[ 2.294858] pci 0000:41:00.2: [1022:1456] type 00 class 0x108000
[ 2.294878] pci 0000:41:00.2: reg 0x18: [mem 0xdb300000-0xdb3fffff]
[ 2.294892] pci 0000:41:00.2: reg 0x24: [mem 0xdb400000-0xdb401fff]
[ 2.294977] pci 0000:41:00.3: [1022:145f] type 00 class 0x0c0330
[ 2.294991] pci 0000:41:00.3: reg 0x10: [mem 0xdb200000-0xdb2fffff 64bit]
[ 2.295045] pci 0000:41:00.3: PME# supported from D0 D3hot D3cold
[ 2.295107] pci 0000:40:07.1: PCI bridge to [bus 41]
[ 2.300075] pci 0000:40:07.1: bridge window [mem 0xdb200000-0xdb4fffff]
[ 2.300172] pci 0000:42:00.0: [1022:1455] type 00 class 0x130000
[ 2.300290] pci 0000:42:00.1: [1022:1468] type 00 class 0x108000
[ 2.300310] pci 0000:42:00.1: reg 0x18: [mem 0xdb000000-0xdb0fffff]
[ 2.300324] pci 0000:42:00.1: reg 0x24: [mem 0xdb100000-0xdb101fff]
[ 2.300421] pci 0000:40:08.1: PCI bridge to [bus 42]
[ 2.305388] pci 0000:40:08.1: bridge window [mem 0xdb000000-0xdb1fffff]
[ 2.305402] pci_bus 0000:40: on NUMA node 1
[ 2.305577] ACPI: PCI Root Bridge [PC02] (domain 0000 [bus 80-bf])
[ 2.311757] acpi PNP0A08:02: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI]
[ 2.319967] acpi PNP0A08:02: PCIe AER handled by firmware
[ 2.325410] acpi PNP0A08:02: _OSC: platform does not support [SHPCHotplug]
[ 2.332357] acpi PNP0A08:02: _OSC: OS now controls [PCIeHotplug PME PCIeCapability]
[ 2.340009] acpi PNP0A08:02: FADT indicates ASPM is unsupported, using BIOS configuration
[ 2.348446] PCI host bridge to bus 0000:80
[ 2.352549] pci_bus 0000:80: root bus resource [io 0x03b0-0x03df window]
[ 2.359336] pci_bus 0000:80: root bus resource [mem 0x000a0000-0x000bffff window]
[ 2.366814] pci_bus 0000:80: root bus resource [io 0x8000-0xbfff window]
[ 2.373602] pci_bus 0000:80: root bus resource [mem 0xab000000-0xc5ffffff window]
[ 2.381079] pci_bus 0000:80: root bus resource [mem 0x47e80000000-0x63dbfffffff window]
[ 2.389080] pci_bus 0000:80: root bus resource [bus 80-bf]
[ 2.394571] pci 0000:80:00.0: [1022:1450] type 00 class 0x060000
[ 2.394642] pci 0000:80:00.2: [1022:1451] type 00 class 0x080600
[ 2.394731] pci 0000:80:01.0: [1022:1452] type 00 class 0x060000
[ 2.394796] pci 0000:80:01.1: [1022:1453] type 01 class 0x060400
[ 2.395417] pci 0000:80:01.1: PME# supported from D0 D3hot D3cold
[ 2.395492] pci 0000:80:01.2: [1022:1453] type 01 class 0x060400
[ 2.395623] pci 0000:80:01.2: PME# supported from D0 D3hot D3cold
[ 2.395704] pci 0000:80:02.0: [1022:1452] type 00 class 0x060000
[ 2.395780] pci 0000:80:03.0: [1022:1452] type 00 class 0x060000
[ 2.395842] pci 0000:80:03.1: [1022:1453] type 01 class 0x060400
[ 2.396423] pci 0000:80:03.1: PME# supported from D0 D3hot D3cold
[ 2.396519] pci 0000:80:04.0: [1022:1452] type 00 class 0x060000
[ 2.396602] pci 0000:80:07.0: [1022:1452] type 00 class 0x060000
[ 2.396665] pci 0000:80:07.1: [1022:1454] type 01 class 0x060400
[ 2.396773] pci 0000:80:07.1: PME# supported from D0 D3hot D3cold
[ 2.396852] pci 0000:80:08.0: [1022:1452] type 00 class 0x060000
[ 2.396915] pci 0000:80:08.1: [1022:1454] type 01 class 0x060400
[ 2.397432] pci 0000:80:08.1: PME# supported from D0 D3hot D3cold
[ 2.397643] pci 0000:81:00.0: [14e4:165f] type 00 class 0x020000
[ 2.397669] pci 0000:81:00.0: reg 0x10: [mem 0xac230000-0xac23ffff 64bit pref]
[ 2.397684] pci 0000:81:00.0: reg 0x18: [mem 0xac240000-0xac24ffff 64bit pref]
[ 2.397699] pci 0000:81:00.0: reg 0x20: [mem 0xac250000-0xac25ffff 64bit pref]
[ 2.397709] pci 0000:81:00.0: reg 0x30: [mem 0xfffc0000-0xffffffff pref]
[ 2.397785] pci 0000:81:00.0: PME# supported from D0 D3hot D3cold
[ 2.397880] pci 0000:81:00.1: [14e4:165f] type 00 class 0x020000
[ 2.397905] pci 0000:81:00.1: reg 0x10: [mem 0xac200000-0xac20ffff 64bit pref]
[ 2.397920] pci 0000:81:00.1: reg 0x18: [mem 0xac210000-0xac21ffff 64bit pref]
[ 2.397935] pci 0000:81:00.1: reg 0x20: [mem 0xac220000-0xac22ffff 64bit pref]
[ 2.397945] pci 0000:81:00.1: reg 0x30: [mem 0xfffc0000-0xffffffff pref]
[ 2.398023] pci 0000:81:00.1: PME# supported from D0 D3hot D3cold
[ 2.398111] pci 0000:80:01.1: PCI bridge to [bus 81]
[ 2.403083] pci 0000:80:01.1: bridge window [mem 0xac200000-0xac2fffff 64bit pref]
[ 2.403405] pci 0000:82:00.0: [1556:be00] type 01 class 0x060400
[ 2.405801] pci 0000:80:01.2: PCI bridge to [bus 82-83]
[ 2.411028] pci 0000:80:01.2: bridge window [mem 0xc0000000-0xc08fffff]
[ 2.411033] pci 0000:80:01.2: bridge window [mem 0xab000000-0xabffffff 64bit pref]
[ 2.411080] pci 0000:83:00.0: [102b:0536] type 00 class 0x030000
[ 2.411099] pci 0000:83:00.0: reg 0x10: [mem 0xab000000-0xabffffff pref]
[ 2.411110] pci 0000:83:00.0: reg 0x14: [mem 0xc0808000-0xc080bfff]
[ 2.411121] pci 0000:83:00.0: reg 0x18: [mem 0xc0000000-0xc07fffff]
[ 2.411263] pci 0000:82:00.0: PCI bridge to [bus 83]
[ 2.416241] pci 0000:82:00.0: bridge window [mem 0xc0000000-0xc08fffff]
[ 2.416247] pci 0000:82:00.0: bridge window [mem 0xab000000-0xabffffff 64bit pref]
[ 2.416437] pci 0000:84:00.0: [1000:00d1] type 00 class 0x010700
[ 2.416460] pci 0000:84:00.0: reg 0x10: [mem 0xac000000-0xac0fffff 64bit pref]
[ 2.416470] pci 0000:84:00.0: reg 0x18: [mem 0xac100000-0xac1fffff 64bit pref]
[ 2.416477] pci 0000:84:00.0: reg 0x20: [mem 0xc0d00000-0xc0dfffff]
[ 2.416484] pci 0000:84:00.0: reg 0x24: [io 0x8000-0x80ff]
[ 2.416493] pci 0000:84:00.0: reg 0x30: [mem 0xfffc0000-0xffffffff pref]
[ 2.416545] pci 0000:84:00.0: supports D1 D2
[ 2.418799] pci 0000:80:03.1: PCI bridge to [bus 84]
[ 2.423768] pci 0000:80:03.1: bridge window [io 0x8000-0x8fff]
[ 2.423770] pci 0000:80:03.1: bridge window [mem 0xc0d00000-0xc0dfffff]
[ 2.423774] pci 0000:80:03.1: bridge window [mem 0xac000000-0xac1fffff 64bit pref]
[ 2.423855] pci 0000:85:00.0: [1022:145a] type 00 class 0x130000
[ 2.423962] pci 0000:85:00.2: [1022:1456] type 00 class 0x108000
[ 2.423980] pci 0000:85:00.2: reg 0x18: [mem 0xc0b00000-0xc0bfffff]
[ 2.423994] pci 0000:85:00.2: reg 0x24: [mem 0xc0c00000-0xc0c01fff]
[ 2.424086] pci 0000:80:07.1: PCI bridge to [bus 85]
[ 2.429055] pci 0000:80:07.1: bridge window [mem 0xc0b00000-0xc0cfffff]
[ 2.429480] pci 0000:86:00.0: [1022:1455] type 00 class 0x130000
[ 2.429597] pci 0000:86:00.1: [1022:1468] type 00 class 0x108000
[ 2.429617] pci 0000:86:00.1: reg 0x18: [mem 0xc0900000-0xc09fffff]
[ 2.429631] pci 0000:86:00.1: reg 0x24: [mem 0xc0a00000-0xc0a01fff]
[ 2.429719] pci 0000:86:00.2: [1022:7901] type 00 class 0x010601
[ 2.429751] pci 0000:86:00.2: reg 0x24: [mem 0xc0a02000-0xc0a02fff]
[ 2.429790] pci 0000:86:00.2: PME# supported from D3hot D3cold
[ 2.429858] pci 0000:80:08.1: PCI bridge to [bus 86]
[ 2.434827] pci 0000:80:08.1: bridge window [mem 0xc0900000-0xc0afffff]
[ 2.434853] pci_bus 0000:80: on NUMA node 2
[ 2.435025] ACPI: PCI Root Bridge [PC03] (domain 0000 [bus c0-ff])
[ 2.441202] acpi PNP0A08:03: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI]
[ 2.449413] acpi PNP0A08:03: PCIe AER handled by firmware
[ 2.454855] acpi PNP0A08:03: _OSC: platform does not support [SHPCHotplug]
[ 2.461804] acpi PNP0A08:03: _OSC: OS now controls [PCIeHotplug PME PCIeCapability]
[ 2.469456] acpi PNP0A08:03: FADT indicates ASPM is unsupported, using BIOS configuration
[ 2.477779] acpi PNP0A08:03: host bridge window [mem 0x63dc0000000-0xffffffffffff window] ([0x80000000000-0xffffffffffff] ignored, not CPU addressable)
[ 2.491418] PCI host bridge to bus 0000:c0
[ 2.495517] pci_bus 0000:c0: root bus resource [io 0xc000-0xffff window]
[ 2.502302] pci_bus 0000:c0: root bus resource [mem 0x90000000-0xaaffffff window]
[ 2.509782] pci_bus 0000:c0: root bus resource [mem 0x63dc0000000-0x7ffffffffff window]
[ 2.517782] pci_bus 0000:c0: root bus resource [bus c0-ff]
[ 2.523271] pci 0000:c0:00.0: [1022:1450] type 00 class 0x060000
[ 2.523342] pci 0000:c0:00.2: [1022:1451] type 00 class 0x080600
[ 2.523431] pci 0000:c0:01.0: [1022:1452] type 00 class 0x060000
[ 2.523493] pci 0000:c0:01.1: [1022:1453] type 01 class 0x060400
[ 2.523727] pci 0000:c0:01.1: PME# supported from D0 D3hot D3cold
[ 2.523826] pci 0000:c0:02.0: [1022:1452] type 00 class 0x060000
[ 2.523901] pci 0000:c0:03.0: [1022:1452] type 00 class 0x060000
[ 2.523975] pci 0000:c0:04.0: [1022:1452] type 00 class 0x060000
[ 2.524055] pci 0000:c0:07.0: [1022:1452] type 00 class 0x060000
[ 2.524117] pci 0000:c0:07.1: [1022:1454] type 01 class 0x060400
[ 2.524700] pci 0000:c0:07.1: PME# supported from D0 D3hot D3cold
[ 2.524776] pci 0000:c0:08.0: [1022:1452] type 00 class 0x060000
[ 2.524839] pci 0000:c0:08.1: [1022:1454] type 01 class 0x060400
[ 2.524951] pci 0000:c0:08.1: PME# supported from D0 D3hot D3cold
[ 2.525153] pci 0000:c1:00.0: [1000:005f] type 00 class 0x010400
[ 2.525166] pci 0000:c1:00.0: reg 0x10: [io 0xc000-0xc0ff]
[ 2.525176] pci 0000:c1:00.0: reg 0x14: [mem 0xa5500000-0xa550ffff 64bit]
[ 2.525186] pci 0000:c1:00.0: reg 0x1c: [mem 0xa5400000-0xa54fffff 64bit]
[ 2.525198] pci 0000:c1:00.0: reg 0x30: [mem 0xfff00000-0xffffffff pref]
[ 2.525248] pci 0000:c1:00.0: supports D1 D2
[ 2.525298] pci 0000:c0:01.1: PCI bridge to [bus c1]
[ 2.530264] pci 0000:c0:01.1: bridge window [io 0xc000-0xcfff]
[ 2.530266] pci 0000:c0:01.1: bridge window [mem 0xa5400000-0xa55fffff]
[ 2.530709] pci 0000:c2:00.0: [1022:145a] type 00 class 0x130000
[ 2.530817] pci 0000:c2:00.2: [1022:1456] type 00 class 0x108000
[ 2.530836] pci 0000:c2:00.2: reg 0x18: [mem 0xa5200000-0xa52fffff]
[ 2.530850] pci 0000:c2:00.2: reg 0x24: [mem 0xa5300000-0xa5301fff]
[ 2.530942] pci 0000:c0:07.1: PCI bridge to [bus c2]
[ 2.535916] pci 0000:c0:07.1: bridge window [mem 0xa5200000-0xa53fffff]
[ 2.536012] pci 0000:c3:00.0: [1022:1455] type 00 class 0x130000
[ 2.536128] pci 0000:c3:00.1: [1022:1468] type 00 class 0x108000
[ 2.536147] pci 0000:c3:00.1: reg 0x18: [mem 0xa5000000-0xa50fffff]
[ 2.536162] pci 0000:c3:00.1: reg 0x24: [mem 0xa5100000-0xa5101fff]
[ 2.536260] pci 0000:c0:08.1: PCI bridge to [bus c3]
[ 2.541228] pci 0000:c0:08.1: bridge window [mem 0xa5000000-0xa51fffff]
[ 2.541245] pci_bus 0000:c0: on NUMA node 3
[ 2.543411] vgaarb: device added: PCI:0000:83:00.0,decodes=io+mem,owns=io+mem,locks=none
[ 2.551499] vgaarb: loaded
[ 2.554214] vgaarb: bridge control possible 0000:83:00.0
[ 2.559641] SCSI subsystem initialized
[ 2.563419] ACPI: bus type USB registered
[ 2.567447] usbcore: registered new interface driver usbfs
[ 2.572942] usbcore: registered new interface driver hub
[ 2.578478] usbcore: registered new device driver usb
[ 2.583853] EDAC MC: Ver: 3.0.0
[ 2.587256] PCI: Using ACPI for IRQ routing
[ 2.610408] PCI: pci_cache_line_size set to 64 bytes
[ 2.610562] e820: reserve RAM buffer [mem 0x0008f000-0x0008ffff]
[ 2.610564] e820: reserve RAM buffer [mem 0x37007020-0x37ffffff]
[ 2.610565] e820: reserve RAM buffer [mem 0x37020020-0x37ffffff]
[ 2.610567] e820: reserve RAM buffer [mem 0x37029020-0x37ffffff]
[ 2.610568] e820: reserve RAM buffer [mem 0x3705b020-0x37ffffff]
[ 2.610569] e820: reserve RAM buffer [mem 0x4f883000-0x4fffffff]
[ 2.610571] e820: reserve RAM buffer [mem 0x6cacf000-0x6fffffff]
[ 2.610572] e820: reserve RAM buffer [mem 0x107f380000-0x107fffffff]
[ 2.610574] e820: reserve RAM buffer [mem 0x207ff80000-0x207fffffff]
[ 2.610575] e820: reserve RAM buffer [mem 0x307ff80000-0x307fffffff]
[ 2.610576] e820: reserve RAM buffer [mem 0x407ff80000-0x407fffffff]
[ 2.610831] NetLabel: Initializing
[ 2.614240] NetLabel: domain hash size = 128
[ 2.618600] NetLabel: protocols = UNLABELED CIPSOv4
[ 2.623582] NetLabel: unlabeled traffic allowed by default
[ 2.629355] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
[ 2.634337] hpet0: 3 comparators, 32-bit 14.318180 MHz counter
[ 2.642349] Switched to clocksource hpet
[ 2.651087] pnp: PnP ACPI init
[ 2.654161] ACPI: bus type PNP registered
[ 2.658368] system 00:00: [mem 0x80000000-0x8fffffff] has been reserved
[ 2.664995] system 00:00: Plug and Play ACPI device, IDs PNP0c01 (active)
[ 2.665050] pnp 00:01: Plug and Play ACPI device, IDs PNP0b00 (active)
[ 2.665250] pnp 00:02: Plug and Play ACPI device, IDs PNP0501 (active)
[ 2.665438] pnp 00:03: Plug and Play ACPI device, IDs PNP0501 (active)
[ 2.665577] pnp: PnP ACPI: found 4 devices
[ 2.669682] ACPI: bus type PNP unregistered
[ 2.681193] pci 0000:01:00.0: can't claim BAR 6 [mem 0xfff00000-0xffffffff pref]: no compatible bridge window
[ 2.691110] pci 0000:81:00.0: can't claim BAR 6 [mem 0xfffc0000-0xffffffff pref]: no compatible bridge window
[ 2.701022] pci 0000:81:00.1: can't claim BAR 6 [mem 0xfffc0000-0xffffffff pref]: no compatible bridge window
[ 2.710938] pci 0000:84:00.0: can't claim BAR 6 [mem 0xfffc0000-0xffffffff pref]: no compatible bridge window
[ 2.720852] pci 0000:c1:00.0: can't claim BAR 6 [mem 0xfff00000-0xffffffff pref]: no compatible bridge window
[ 2.730790] pci 0000:00:03.1: BAR 14: assigned [mem 0xe1000000-0xe10fffff]
[ 2.737674] pci 0000:01:00.0: BAR 6: assigned [mem 0xe1000000-0xe10fffff pref]
[ 2.744902] pci 0000:00:03.1: PCI bridge to [bus 01]
[ 2.749879] pci 0000:00:03.1: bridge window [mem 0xe1000000-0xe10fffff]
[ 2.756671] pci 0000:00:03.1: bridge window [mem 0xe2000000-0xe3ffffff 64bit pref]
[ 2.764422] pci 0000:00:07.1: PCI bridge to [bus 02]
[ 2.769395] pci 0000:00:07.1: bridge window [mem 0xf7200000-0xf74fffff]
[ 2.776191] pci 0000:00:08.1: PCI bridge to [bus 03]
[ 2.781165] pci 0000:00:08.1: bridge window [mem 0xf7000000-0xf71fffff]
[ 2.787964] pci_bus 0000:00: resource 4 [io 0x0000-0x03af window]
[ 2.787966] pci_bus 0000:00: resource 5 [io 0x03e0-0x0cf7 window]
[ 2.787967] pci_bus 0000:00: resource 6 [mem 0x000c0000-0x000c3fff window]
[ 2.787969] pci_bus 0000:00: resource 7 [mem 0x000c4000-0x000c7fff window]
[ 2.787971] pci_bus 0000:00: resource 8 [mem 0x000c8000-0x000cbfff window]
[ 2.787973] pci_bus 0000:00: resource 9 [mem 0x000cc000-0x000cffff window]
[ 2.787975] pci_bus 0000:00: resource 10 [mem 0x000d0000-0x000d3fff window]
[ 2.787976] pci_bus 0000:00: resource 11 [mem 0x000d4000-0x000d7fff window]
[ 2.787978] pci_bus 0000:00: resource 12 [mem 0x000d8000-0x000dbfff window]
[ 2.787980] pci_bus 0000:00: resource 13 [mem 0x000dc000-0x000dffff window]
[ 2.787981] pci_bus 0000:00: resource 14 [mem 0x000e0000-0x000e3fff window]
[ 2.787983] pci_bus 0000:00: resource 15 [mem 0x000e4000-0x000e7fff window]
[ 2.787985] pci_bus 0000:00: resource 16 [mem 0x000e8000-0x000ebfff window]
[ 2.787986] pci_bus 0000:00: resource 17 [mem 0x000ec000-0x000effff window]
[ 2.787988] pci_bus 0000:00: resource 18 [mem 0x000f0000-0x000fffff window]
[ 2.787990] pci_bus 0000:00: resource 19 [io 0x0d00-0x3fff window]
[ 2.787991] pci_bus 0000:00: resource
20 [mem 0xe1000000-0xfebfffff window] [ 2.787993] pci_bus 0000:00: resource 21 [mem 0x10000000000-0x2bf3fffffff window] [ 2.787995] pci_bus 0000:01: resource 1 [mem 0xe1000000-0xe10fffff] [ 2.787997] pci_bus 0000:01: resource 2 [mem 0xe2000000-0xe3ffffff 64bit pref] [ 2.787999] pci_bus 0000:02: resource 1 [mem 0xf7200000-0xf74fffff] [ 2.788001] pci_bus 0000:03: resource 1 [mem 0xf7000000-0xf71fffff] [ 2.788013] pci 0000:40:07.1: PCI bridge to [bus 41] [ 2.792986] pci 0000:40:07.1: bridge window [mem 0xdb200000-0xdb4fffff] [ 2.799782] pci 0000:40:08.1: PCI bridge to [bus 42] [ 2.804755] pci 0000:40:08.1: bridge window [mem 0xdb000000-0xdb1fffff] [ 2.811553] pci_bus 0000:40: resource 4 [io 0x4000-0x7fff window] [ 2.811554] pci_bus 0000:40: resource 5 [mem 0xc6000000-0xe0ffffff window] [ 2.811556] pci_bus 0000:40: resource 6 [mem 0x2bf40000000-0x47e7fffffff window] [ 2.811558] pci_bus 0000:41: resource 1 [mem 0xdb200000-0xdb4fffff] [ 2.811560] pci_bus 0000:42: resource 1 [mem 0xdb000000-0xdb1fffff] [ 2.811591] pci 0000:80:01.1: BAR 14: assigned [mem 0xac300000-0xac3fffff] [ 2.818474] pci 0000:81:00.0: BAR 6: assigned [mem 0xac300000-0xac33ffff pref] [ 2.825702] pci 0000:81:00.1: BAR 6: assigned [mem 0xac340000-0xac37ffff pref] [ 2.832930] pci 0000:80:01.1: PCI bridge to [bus 81] [ 2.837906] pci 0000:80:01.1: bridge window [mem 0xac300000-0xac3fffff] [ 2.844699] pci 0000:80:01.1: bridge window [mem 0xac200000-0xac2fffff 64bit pref] [ 2.852450] pci 0000:82:00.0: PCI bridge to [bus 83] [ 2.857424] pci 0000:82:00.0: bridge window [mem 0xc0000000-0xc08fffff] [ 2.864219] pci 0000:82:00.0: bridge window [mem 0xab000000-0xabffffff 64bit pref] [ 2.871969] pci 0000:80:01.2: PCI bridge to [bus 82-83] [ 2.877209] pci 0000:80:01.2: bridge window [mem 0xc0000000-0xc08fffff] [ 2.884004] pci 0000:80:01.2: bridge window [mem 0xab000000-0xabffffff 64bit pref] [ 2.891753] pci 0000:84:00.0: BAR 6: no space for [mem size 0x00040000 pref] [ 2.898805] pci 0000:84:00.0: BAR 6: failed to 
assign [mem size 0x00040000 pref] [ 2.906208] pci 0000:80:03.1: PCI bridge to [bus 84] [ 2.911182] pci 0000:80:03.1: bridge window [io 0x8000-0x8fff] [ 2.917285] pci 0000:80:03.1: bridge window [mem 0xc0d00000-0xc0dfffff] [ 2.924078] pci 0000:80:03.1: bridge window [mem 0xac000000-0xac1fffff 64bit pref] [ 2.931827] pci 0000:80:07.1: PCI bridge to [bus 85] [ 2.936801] pci 0000:80:07.1: bridge window [mem 0xc0b00000-0xc0cfffff] [ 2.943600] pci 0000:80:08.1: PCI bridge to [bus 86] [ 2.948579] pci 0000:80:08.1: bridge window [mem 0xc0900000-0xc0afffff] [ 2.955377] pci_bus 0000:80: resource 4 [io 0x03b0-0x03df window] [ 2.955379] pci_bus 0000:80: resource 5 [mem 0x000a0000-0x000bffff window] [ 2.955381] pci_bus 0000:80: resource 6 [io 0x8000-0xbfff window] [ 2.955382] pci_bus 0000:80: resource 7 [mem 0xab000000-0xc5ffffff window] [ 2.955384] pci_bus 0000:80: resource 8 [mem 0x47e80000000-0x63dbfffffff window] [ 2.955386] pci_bus 0000:81: resource 1 [mem 0xac300000-0xac3fffff] [ 2.955387] pci_bus 0000:81: resource 2 [mem 0xac200000-0xac2fffff 64bit pref] [ 2.955389] pci_bus 0000:82: resource 1 [mem 0xc0000000-0xc08fffff] [ 2.955391] pci_bus 0000:82: resource 2 [mem 0xab000000-0xabffffff 64bit pref] [ 2.955392] pci_bus 0000:83: resource 1 [mem 0xc0000000-0xc08fffff] [ 2.955394] pci_bus 0000:83: resource 2 [mem 0xab000000-0xabffffff 64bit pref] [ 2.955396] pci_bus 0000:84: resource 0 [io 0x8000-0x8fff] [ 2.955397] pci_bus 0000:84: resource 1 [mem 0xc0d00000-0xc0dfffff] [ 2.955399] pci_bus 0000:84: resource 2 [mem 0xac000000-0xac1fffff 64bit pref] [ 2.955401] pci_bus 0000:85: resource 1 [mem 0xc0b00000-0xc0cfffff] [ 2.955402] pci_bus 0000:86: resource 1 [mem 0xc0900000-0xc0afffff] [ 2.955418] pci 0000:c1:00.0: BAR 6: no space for [mem size 0x00100000 pref] [ 2.962472] pci 0000:c1:00.0: BAR 6: failed to assign [mem size 0x00100000 pref] [ 2.969872] pci 0000:c0:01.1: PCI bridge to [bus c1] [ 2.974848] pci 0000:c0:01.1: bridge window [io 0xc000-0xcfff] [ 2.980950] pci 
0000:c0:01.1: bridge window [mem 0xa5400000-0xa55fffff] [ 2.987748] pci 0000:c0:07.1: PCI bridge to [bus c2] [ 2.992727] pci 0000:c0:07.1: bridge window [mem 0xa5200000-0xa53fffff] [ 2.999524] pci 0000:c0:08.1: PCI bridge to [bus c3] [ 3.004496] pci 0000:c0:08.1: bridge window [mem 0xa5000000-0xa51fffff] [ 3.011294] pci_bus 0000:c0: resource 4 [io 0xc000-0xffff window] [ 3.011296] pci_bus 0000:c0: resource 5 [mem 0x90000000-0xaaffffff window] [ 3.011298] pci_bus 0000:c0: resource 6 [mem 0x63dc0000000-0x7ffffffffff window] [ 3.011300] pci_bus 0000:c1: resource 0 [io 0xc000-0xcfff] [ 3.011301] pci_bus 0000:c1: resource 1 [mem 0xa5400000-0xa55fffff] [ 3.011303] pci_bus 0000:c2: resource 1 [mem 0xa5200000-0xa53fffff] [ 3.011305] pci_bus 0000:c3: resource 1 [mem 0xa5000000-0xa51fffff] [ 3.011396] NET: Registered protocol family 2 [ 3.016446] TCP established hash table entries: 524288 (order: 10, 4194304 bytes) [ 3.024608] TCP bind hash table entries: 65536 (order: 8, 1048576 bytes) [ 3.031441] TCP: Hash tables configured (established 524288 bind 65536) [ 3.038096] TCP: reno registered [ 3.041452] UDP hash table entries: 65536 (order: 9, 2097152 bytes) [ 3.048060] UDP-Lite hash table entries: 65536 (order: 9, 2097152 bytes) [ 3.055271] NET: Registered protocol family 1 [ 3.060101] pci 0000:83:00.0: Boot video device [ 3.060139] PCI: CLS 64 bytes, default 64 [ 3.060199] Unpacking initramfs... 
[ 3.330460] Freeing initrd memory: 19732k freed
[ 3.337218] AMD-Vi: IOMMU performance counters supported
[ 3.342604] AMD-Vi: IOMMU performance counters supported
[ 3.347957] AMD-Vi: IOMMU performance counters supported
[ 3.353318] AMD-Vi: IOMMU performance counters supported
[ 3.359956] iommu: Adding device 0000:00:01.0 to group 0
[ 3.365978] iommu: Adding device 0000:00:02.0 to group 1
[ 3.372004] iommu: Adding device 0000:00:03.0 to group 2
[ 3.378113] iommu: Adding device 0000:00:03.1 to group 3
[ 3.384150] iommu: Adding device 0000:00:04.0 to group 4
[ 3.390161] iommu: Adding device 0000:00:07.0 to group 5
[ 3.396208] iommu: Adding device 0000:00:07.1 to group 6
[ 3.402190] iommu: Adding device 0000:00:08.0 to group 7
[ 3.408234] iommu: Adding device 0000:00:08.1 to group 8
[ 3.414245] iommu: Adding device 0000:00:14.0 to group 9
[ 3.419579] iommu: Adding device 0000:00:14.3 to group 9
[ 3.425718] iommu: Adding device 0000:00:18.0 to group 10
[ 3.431143] iommu: Adding device 0000:00:18.1 to group 10
[ 3.436566] iommu: Adding device 0000:00:18.2 to group 10
[ 3.441991] iommu: Adding device 0000:00:18.3 to group 10
[ 3.447418] iommu: Adding device 0000:00:18.4 to group 10
[ 3.452842] iommu: Adding device 0000:00:18.5 to group 10
[ 3.458270] iommu: Adding device 0000:00:18.6 to group 10
[ 3.463698] iommu: Adding device 0000:00:18.7 to group 10
[ 3.469890] iommu: Adding device 0000:00:19.0 to group 11
[ 3.475320] iommu: Adding device 0000:00:19.1 to group 11
[ 3.480743] iommu: Adding device 0000:00:19.2 to group 11
[ 3.486167] iommu: Adding device 0000:00:19.3 to group 11
[ 3.491590] iommu: Adding device 0000:00:19.4 to group 11
[ 3.497019] iommu: Adding device 0000:00:19.5 to group 11
[ 3.502444] iommu: Adding device 0000:00:19.6 to group 11
[ 3.507870] iommu: Adding device 0000:00:19.7 to group 11
[ 3.514046] iommu: Adding device 0000:00:1a.0 to group 12
[ 3.519477] iommu: Adding device 0000:00:1a.1 to group 12
[ 3.524898] iommu: Adding device 0000:00:1a.2 to group 12
[ 3.530325] iommu: Adding device 0000:00:1a.3 to group 12
[ 3.535751] iommu: Adding device 0000:00:1a.4 to group 12
[ 3.541175] iommu: Adding device 0000:00:1a.5 to group 12
[ 3.546601] iommu: Adding device 0000:00:1a.6 to group 12
[ 3.552029] iommu: Adding device 0000:00:1a.7 to group 12
[ 3.558211] iommu: Adding device 0000:00:1b.0 to group 13
[ 3.563640] iommu: Adding device 0000:00:1b.1 to group 13
[ 3.569068] iommu: Adding device 0000:00:1b.2 to group 13
[ 3.574491] iommu: Adding device 0000:00:1b.3 to group 13
[ 3.579917] iommu: Adding device 0000:00:1b.4 to group 13
[ 3.585353] iommu: Adding device 0000:00:1b.5 to group 13
[ 3.590774] iommu: Adding device 0000:00:1b.6 to group 13
[ 3.596201] iommu: Adding device 0000:00:1b.7 to group 13
[ 3.602332] iommu: Adding device 0000:01:00.0 to group 14
[ 3.608430] iommu: Adding device 0000:02:00.0 to group 15
[ 3.614568] iommu: Adding device 0000:02:00.2 to group 16
[ 3.620670] iommu: Adding device 0000:02:00.3 to group 17
[ 3.626780] iommu: Adding device 0000:03:00.0 to group 18
[ 3.632877] iommu: Adding device 0000:03:00.1 to group 19
[ 3.638973] iommu: Adding device 0000:40:01.0 to group 20
[ 3.645061] iommu: Adding device 0000:40:02.0 to group 21
[ 3.651179] iommu: Adding device 0000:40:03.0 to group 22
[ 3.657260] iommu: Adding device 0000:40:04.0 to group 23
[ 3.663324] iommu: Adding device 0000:40:07.0 to group 24
[ 3.669360] iommu: Adding device 0000:40:07.1 to group 25
[ 3.675414] iommu: Adding device 0000:40:08.0 to group 26
[ 3.681451] iommu: Adding device 0000:40:08.1 to group 27
[ 3.687490] iommu: Adding device 0000:41:00.0 to group 28
[ 3.693537] iommu: Adding device 0000:41:00.2 to group 29
[ 3.699572] iommu: Adding device 0000:41:00.3 to group 30
[ 3.705646] iommu: Adding device 0000:42:00.0 to group 31
[ 3.711656] iommu: Adding device 0000:42:00.1 to group 32
[ 3.717747] iommu: Adding device 0000:80:01.0 to group 33
[ 3.723778] iommu: Adding device 0000:80:01.1 to group 34
[ 3.729894] iommu: Adding device 0000:80:01.2 to group 35
[ 3.735957] iommu: Adding device 0000:80:02.0 to group 36
[ 3.742002] iommu: Adding device 0000:80:03.0 to group 37
[ 3.748009] iommu: Adding device 0000:80:03.1 to group 38
[ 3.754053] iommu: Adding device 0000:80:04.0 to group 39
[ 3.760127] iommu: Adding device 0000:80:07.0 to group 40
[ 3.766122] iommu: Adding device 0000:80:07.1 to group 41
[ 3.772195] iommu: Adding device 0000:80:08.0 to group 42
[ 3.778262] iommu: Adding device 0000:80:08.1 to group 43
[ 3.784300] iommu: Adding device 0000:81:00.0 to group 44
[ 3.789751] iommu: Adding device 0000:81:00.1 to group 44
[ 3.795808] iommu: Adding device 0000:82:00.0 to group 45
[ 3.801220] iommu: Adding device 0000:83:00.0 to group 45
[ 3.807241] iommu: Adding device 0000:84:00.0 to group 46
[ 3.813275] iommu: Adding device 0000:85:00.0 to group 47
[ 3.819316] iommu: Adding device 0000:85:00.2 to group 48
[ 3.825366] iommu: Adding device 0000:86:00.0 to group 49
[ 3.831396] iommu: Adding device 0000:86:00.1 to group 50
[ 3.837434] iommu: Adding device 0000:86:00.2 to group 51
[ 3.843517] iommu: Adding device 0000:c0:01.0 to group 52
[ 3.849572] iommu: Adding device 0000:c0:01.1 to group 53
[ 3.855610] iommu: Adding device 0000:c0:02.0 to group 54
[ 3.861692] iommu: Adding device 0000:c0:03.0 to group 55
[ 3.867764] iommu: Adding device 0000:c0:04.0 to group 56
[ 3.873837] iommu: Adding device 0000:c0:07.0 to group 57
[ 3.879913] iommu: Adding device 0000:c0:07.1 to group 58
[ 3.886000] iommu: Adding device 0000:c0:08.0 to group 59
[ 3.892021] iommu: Adding device 0000:c0:08.1 to group 60
[ 3.900506] iommu: Adding device 0000:c1:00.0 to group 61
[ 3.906549] iommu: Adding device 0000:c2:00.0 to group 62
[ 3.912569] iommu: Adding device 0000:c2:00.2 to group 63
[ 3.918654] iommu: Adding device 0000:c3:00.0 to group 64
[ 3.924672] iommu: Adding device 0000:c3:00.1 to group 65
[ 3.930264] AMD-Vi: Found IOMMU at 0000:00:00.2 cap 0x40
[ 3.935583] AMD-Vi: Extended features (0xf77ef22294ada):
[ 3.940903] PPR NX GT IA GA PC GA_vAPIC
[ 3.945035] AMD-Vi: Found IOMMU at 0000:40:00.2 cap 0x40
[ 3.950357] AMD-Vi: Extended features (0xf77ef22294ada):
[ 3.955679] PPR NX GT IA GA PC GA_vAPIC
[ 3.959813] AMD-Vi: Found IOMMU at 0000:80:00.2 cap 0x40
[ 3.965133] AMD-Vi: Extended features (0xf77ef22294ada):
[ 3.970456] PPR NX GT IA GA PC GA_vAPIC
[ 3.974598] AMD-Vi: Found IOMMU at 0000:c0:00.2 cap 0x40
[ 3.979923] AMD-Vi: Extended features (0xf77ef22294ada):
[ 3.985241] PPR NX GT IA GA PC GA_vAPIC
[ 3.989384] AMD-Vi: Interrupt remapping enabled
[ 3.993924] AMD-Vi: virtual APIC enabled
[ 3.997923] pci 0000:00:00.2: irq 26 for MSI/MSI-X
[ 3.998019] pci 0000:40:00.2: irq 27 for MSI/MSI-X
[ 3.998107] pci 0000:80:00.2: irq 28 for MSI/MSI-X
[ 3.998186] pci 0000:c0:00.2: irq 29 for MSI/MSI-X
[ 3.998243] AMD-Vi: Lazy IO/TLB flushing enabled
[ 4.004577] perf: AMD NB counters detected
[ 4.008725] perf: AMD LLC counters detected
[ 4.018979] sha1_ssse3: Using SHA-NI optimized SHA-1 implementation
[ 4.025336] sha256_ssse3: Using SHA-256-NI optimized SHA-256 implementation
[ 4.033948] futex hash table entries: 32768 (order: 9, 2097152 bytes)
[ 4.040572] Initialise system trusted keyring
[ 4.044978] audit: initializing netlink socket (disabled)
[ 4.050406] type=2000 audit(1575986239.206:1): initialized
[ 4.081291] HugeTLB registered 1 GB page size, pre-allocated 0 pages
[ 4.087657] HugeTLB registered 2 MB page size, pre-allocated 0 pages
[ 4.095279] zpool: loaded
[ 4.097917] zbud: loaded
[ 4.100820] VFS: Disk quotas dquot_6.6.0
[ 4.104856] Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[ 4.111675] msgmni has been set to 32768
[ 4.115696] Key type big_key registered
[ 4.119541] SELinux: Registering netfilter hooks
[ 4.121989] NET: Registered protocol family 38
[ 4.126452] Key type asymmetric registered
[ 4.130556] Asymmetric key parser 'x509' registered
[ 4.135494] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 248)
[ 4.143062] io scheduler noop registered
[ 4.146997] io scheduler deadline registered (default)
[ 4.152181] io scheduler cfq registered
[ 4.156028] io scheduler mq-deadline registered
[ 4.160568] io scheduler kyber registered
[ 4.165134] pcieport 0000:00:03.1: irq 30 for MSI/MSI-X
[ 4.166082] pcieport 0000:00:07.1: irq 31 for MSI/MSI-X
[ 4.167091] pcieport 0000:00:08.1: irq 33 for MSI/MSI-X
[ 4.167382] pcieport 0000:40:07.1: irq 34 for MSI/MSI-X
[ 4.167694] pcieport 0000:40:08.1: irq 36 for MSI/MSI-X
[ 4.168009] pcieport 0000:80:01.1: irq 37 for MSI/MSI-X
[ 4.168731] pcieport 0000:80:01.2: irq 38 for MSI/MSI-X
[ 4.168939] pcieport 0000:80:03.1: irq 39 for MSI/MSI-X
[ 4.169731] pcieport 0000:80:07.1: irq 41 for MSI/MSI-X
[ 4.169975] pcieport 0000:80:08.1: irq 43 for MSI/MSI-X
[ 4.170211] pcieport 0000:c0:01.1: irq 44 for MSI/MSI-X
[ 4.170974] pcieport 0000:c0:07.1: irq 46 for MSI/MSI-X
[ 4.171177] pcieport 0000:c0:08.1: irq 48 for MSI/MSI-X
[ 4.171274] pcieport 0000:00:03.1: Signaling PME through PCIe PME interrupt
[ 4.178242] pci 0000:01:00.0: Signaling PME through PCIe PME interrupt
[ 4.184777] pcie_pme 0000:00:03.1:pcie001: service driver pcie_pme loaded
[ 4.184791] pcieport 0000:00:07.1: Signaling PME through PCIe PME interrupt
[ 4.191759] pci 0000:02:00.0: Signaling PME through PCIe PME interrupt
[ 4.198293] pci 0000:02:00.2: Signaling PME through PCIe PME interrupt
[ 4.204828] pci 0000:02:00.3: Signaling PME through PCIe PME interrupt
[ 4.211365] pcie_pme 0000:00:07.1:pcie001: service driver pcie_pme loaded
[ 4.211378] pcieport 0000:00:08.1: Signaling PME through PCIe PME interrupt
[ 4.218340] pci 0000:03:00.0: Signaling PME through PCIe PME interrupt
[ 4.224876] pci 0000:03:00.1: Signaling PME through PCIe PME interrupt
[ 4.231412] pcie_pme 0000:00:08.1:pcie001: service driver pcie_pme loaded
[ 4.231431] pcieport 0000:40:07.1: Signaling PME through PCIe PME interrupt
[ 4.238394] pci 0000:41:00.0: Signaling PME through PCIe PME interrupt
[ 4.244928] pci 0000:41:00.2: Signaling PME through PCIe PME interrupt
[ 4.251465] pci 0000:41:00.3: Signaling PME through PCIe PME interrupt
[ 4.257999] pcie_pme 0000:40:07.1:pcie001: service driver pcie_pme loaded
[ 4.258015] pcieport 0000:40:08.1: Signaling PME through PCIe PME interrupt
[ 4.264985] pci 0000:42:00.0: Signaling PME through PCIe PME interrupt
[ 4.271518] pci 0000:42:00.1: Signaling PME through PCIe PME interrupt
[ 4.278056] pcie_pme 0000:40:08.1:pcie001: service driver pcie_pme loaded
[ 4.278075] pcieport 0000:80:01.1: Signaling PME through PCIe PME interrupt
[ 4.285038] pci 0000:81:00.0: Signaling PME through PCIe PME interrupt
[ 4.291574] pci 0000:81:00.1: Signaling PME through PCIe PME interrupt
[ 4.298111] pcie_pme 0000:80:01.1:pcie001: service driver pcie_pme loaded
[ 4.298126] pcieport 0000:80:01.2: Signaling PME through PCIe PME interrupt
[ 4.305093] pci 0000:82:00.0: Signaling PME through PCIe PME interrupt
[ 4.311629] pci 0000:83:00.0: Signaling PME through PCIe PME interrupt
[ 4.318165] pcie_pme 0000:80:01.2:pcie001: service driver pcie_pme loaded
[ 4.318179] pcieport 0000:80:03.1: Signaling PME through PCIe PME interrupt
[ 4.325148] pci 0000:84:00.0: Signaling PME through PCIe PME interrupt
[ 4.331686] pcie_pme 0000:80:03.1:pcie001: service driver pcie_pme loaded
[ 4.331701] pcieport 0000:80:07.1: Signaling PME through PCIe PME interrupt
[ 4.338668] pci 0000:85:00.0: Signaling PME through PCIe PME interrupt
[ 4.345204] pci 0000:85:00.2: Signaling PME through PCIe PME interrupt
[ 4.351739] pcie_pme 0000:80:07.1:pcie001: service driver pcie_pme loaded
[ 4.351753] pcieport 0000:80:08.1: Signaling PME through PCIe PME interrupt
[ 4.358716] pci 0000:86:00.0: Signaling PME through PCIe PME interrupt
[ 4.365249] pci 0000:86:00.1: Signaling PME through PCIe PME interrupt
[ 4.371784] pci 0000:86:00.2: Signaling PME through PCIe PME interrupt
[ 4.378321] pcie_pme 0000:80:08.1:pcie001: service driver pcie_pme loaded
[ 4.378334] pcieport 0000:c0:01.1: Signaling PME through PCIe PME interrupt
[ 4.385295] pci 0000:c1:00.0: Signaling PME through PCIe PME interrupt
[ 4.391831] pcie_pme 0000:c0:01.1:pcie001: service driver pcie_pme loaded
[ 4.391845] pcieport 0000:c0:07.1: Signaling PME through PCIe PME interrupt
[ 4.398808] pci 0000:c2:00.0: Signaling PME through PCIe PME interrupt
[ 4.405342] pci 0000:c2:00.2: Signaling PME through PCIe PME interrupt
[ 4.411877] pcie_pme 0000:c0:07.1:pcie001: service driver pcie_pme loaded
[ 4.411889] pcieport 0000:c0:08.1: Signaling PME through PCIe PME interrupt
[ 4.418855] pci 0000:c3:00.0: Signaling PME through PCIe PME interrupt
[ 4.425390] pci 0000:c3:00.1: Signaling PME through PCIe PME interrupt
[ 4.431924] pcie_pme 0000:c0:08.1:pcie001: service driver pcie_pme loaded
[ 4.431943] pci_hotplug: PCI Hot Plug PCI Core version: 0.5
[ 4.437527] pciehp: PCI Express Hot Plug Controller Driver version: 0.4
[ 4.444194] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
[ 4.451011] efifb: probing for efifb
[ 4.454607] efifb: framebuffer at 0xab000000, mapped to 0xffff9e8759800000, using 3072k, total 3072k
[ 4.463737] efifb: mode is 1024x768x32, linelength=4096, pages=1
[ 4.469753] efifb: scrolling: redraw
[ 4.473341] efifb: Truecolor: size=8:8:8:8, shift=24:16:8:0
[ 4.494669] Console: switching to colour frame buffer device 128x48
[ 4.516373] fb0: EFI VGA frame buffer device
[ 4.520756] input: Power Button as /devices/LNXSYSTM:00/device:00/PNP0C0C:00/input/input0
[ 4.528943] ACPI: Power Button [PWRB]
[ 4.532665] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input1
[ 4.540070] ACPI: Power Button [PWRF]
[ 4.544952] GHES: APEI firmware first mode is enabled by APEI bit and WHEA _OSC.
[ 4.552431] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[ 4.579629] 00:02: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
[ 4.606176] 00:03: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
[ 4.612247] Non-volatile memory driver v1.3
[ 4.616475] Linux agpgart interface v0.103
[ 4.623064] crash memory driver: version 1.1
[ 4.627575] rdac: device handler registered
[ 4.631823] hp_sw: device handler registered
[ 4.636107] emc: device handler registered
[ 4.640398] alua: device handler registered
[ 4.644630] libphy: Fixed MDIO Bus: probed
[ 4.648792] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[ 4.655331] ehci-pci: EHCI PCI platform driver
[ 4.659796] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[ 4.665990] ohci-pci: OHCI PCI platform driver
[ 4.670455] uhci_hcd: USB Universal Host Controller Interface driver
[ 4.676954] xhci_hcd 0000:02:00.3: xHCI Host Controller
[ 4.682261] xhci_hcd 0000:02:00.3: new USB bus registered, assigned bus number 1
[ 4.689770] xhci_hcd 0000:02:00.3: hcc params 0x0270f665 hci version 0x100 quirks 0x00000410
[ 4.698251] xhci_hcd 0000:02:00.3: irq 50 for MSI/MSI-X
[ 4.698275] xhci_hcd 0000:02:00.3: irq 51 for MSI/MSI-X
[ 4.698294] xhci_hcd 0000:02:00.3: irq 52 for MSI/MSI-X
[ 4.698317] xhci_hcd 0000:02:00.3: irq 53 for MSI/MSI-X
[ 4.698336] xhci_hcd 0000:02:00.3: irq 54 for MSI/MSI-X
[ 4.698365] xhci_hcd 0000:02:00.3: irq 55 for MSI/MSI-X
[ 4.698384] xhci_hcd 0000:02:00.3: irq 56 for MSI/MSI-X
[ 4.698402] xhci_hcd 0000:02:00.3: irq 57 for MSI/MSI-X
[ 4.698542] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
[ 4.705336] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 4.712563] usb usb1: Product: xHCI Host Controller
[ 4.717450] usb usb1: Manufacturer: Linux 3.10.0-957.27.2.el7_lustre.pl2.x86_64 xhci-hcd
[ 4.725546] usb usb1: SerialNumber: 0000:02:00.3
[ 4.730288] hub 1-0:1.0: USB hub found
[ 4.734053] hub 1-0:1.0: 2 ports detected
[ 4.738311] xhci_hcd 0000:02:00.3: xHCI Host Controller
[ 4.743600] xhci_hcd 0000:02:00.3: new USB bus registered, assigned bus number 2
[ 4.751027] usb usb2: We don't know the algorithms for LPM for this host, disabling LPM.
[ 4.759135] usb usb2: New USB device found, idVendor=1d6b, idProduct=0003
[ 4.765933] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 4.773162] usb usb2: Product: xHCI Host Controller
[ 4.778048] usb usb2: Manufacturer: Linux 3.10.0-957.27.2.el7_lustre.pl2.x86_64 xhci-hcd
[ 4.786143] usb usb2: SerialNumber: 0000:02:00.3
[ 4.790859] hub 2-0:1.0: USB hub found
[ 4.794625] hub 2-0:1.0: 2 ports detected
[ 4.798969] xhci_hcd 0000:41:00.3: xHCI Host Controller
[ 4.804279] xhci_hcd 0000:41:00.3: new USB bus registered, assigned bus number 3
[ 4.811789] xhci_hcd 0000:41:00.3: hcc params 0x0270f665 hci version 0x100 quirks 0x00000410
[ 4.820274] xhci_hcd 0000:41:00.3: irq 59 for MSI/MSI-X
[ 4.820295] xhci_hcd 0000:41:00.3: irq 60 for MSI/MSI-X
[ 4.820316] xhci_hcd 0000:41:00.3: irq 61 for MSI/MSI-X
[ 4.820334] xhci_hcd 0000:41:00.3: irq 62 for MSI/MSI-X
[ 4.820363] xhci_hcd 0000:41:00.3: irq 63 for MSI/MSI-X
[ 4.820389] xhci_hcd 0000:41:00.3: irq 64 for MSI/MSI-X
[ 4.820408] xhci_hcd 0000:41:00.3: irq 65 for MSI/MSI-X
[ 4.820426] xhci_hcd 0000:41:00.3: irq 66 for MSI/MSI-X
[ 4.820578] usb usb3: New USB device found, idVendor=1d6b, idProduct=0002
[ 4.827373] usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 4.834599] usb usb3: Product: xHCI Host Controller
[ 4.839487] usb usb3: Manufacturer: Linux 3.10.0-957.27.2.el7_lustre.pl2.x86_64 xhci-hcd
[ 4.847583] usb usb3: SerialNumber: 0000:41:00.3
[ 4.852319] hub 3-0:1.0: USB hub found
[ 4.856090] hub 3-0:1.0: 2 ports detected
[ 4.860361] xhci_hcd 0000:41:00.3: xHCI Host Controller
[ 4.865638] xhci_hcd 0000:41:00.3: new USB bus registered, assigned bus number 4
[ 4.873074] usb usb4: We don't know the algorithms for LPM for this host, disabling LPM.
[ 4.881182] usb usb4: New USB device found, idVendor=1d6b, idProduct=0003
[ 4.887978] usb usb4: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 4.895205] usb usb4: Product: xHCI Host Controller
[ 4.900095] usb usb4: Manufacturer: Linux 3.10.0-957.27.2.el7_lustre.pl2.x86_64 xhci-hcd
[ 4.908189] usb usb4: SerialNumber: 0000:41:00.3
[ 4.912907] hub 4-0:1.0: USB hub found
[ 4.916668] hub 4-0:1.0: 2 ports detected
[ 4.920928] usbcore: registered new interface driver usbserial_generic
[ 4.927469] usbserial: USB Serial support registered for generic
[ 4.933523] i8042: PNP: No PS/2 controller found. Probing ports directly.
[ 5.049371] usb 1-1: new high-speed USB device number 2 using xhci_hcd
[ 5.171369] usb 3-1: new high-speed USB device number 2 using xhci_hcd
[ 5.181154] usb 1-1: New USB device found, idVendor=0424, idProduct=2744
[ 5.187869] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[ 5.195014] usb 1-1: Product: USB2734
[ 5.198686] usb 1-1: Manufacturer: Microchip Tech
[ 5.231176] hub 1-1:1.0: USB hub found
[ 5.235153] hub 1-1:1.0: 4 ports detected
[ 5.291404] usb 2-1: new SuperSpeed USB device number 2 using xhci_hcd
[ 5.301329] usb 3-1: New USB device found, idVendor=1604, idProduct=10c0
[ 5.308036] usb 3-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[ 5.310527] usb 2-1: New USB device found, idVendor=0424, idProduct=5744
[ 5.310528] usb 2-1: New USB device strings: Mfr=2, Product=3, SerialNumber=0
[ 5.310530] usb 2-1: Product: USB5734
[ 5.310531] usb 2-1: Manufacturer: Microchip Tech
[ 5.311171] hub 2-1:1.0: USB hub found
[ 5.311524] hub 2-1:1.0: 4 ports detected
[ 5.312577] usb: port power management may be unreliable
[ 5.353219] hub 3-1:1.0: USB hub found
[ 5.357202] hub 3-1:1.0: 4 ports detected
[ 5.973760] i8042: No controller found
[ 5.977538] tsc: Refined TSC clocksource calibration: 1996.249 MHz
[ 5.977657] mousedev: PS/2 mouse device common for all mice
[ 5.977882] rtc_cmos 00:01: RTC can wake from S4
[ 5.978233] rtc_cmos 00:01: rtc core: registered rtc_cmos as rtc0
[ 5.978336] rtc_cmos 00:01: alarms up to one month, y3k, 114 bytes nvram, hpet irqs
[ 5.978418] cpuidle: using governor menu
[ 5.978675] EFI Variables Facility v0.08 2004-May-17
[ 6.004660] hidraw: raw HID events driver (C) Jiri Kosina
[ 6.004778] usbcore: registered new interface driver usbhid
[ 6.004779] usbhid: USB HID core driver
[ 6.004863] drop_monitor: Initializing network drop monitor service
[ 6.005021] TCP: cubic registered
[ 6.005025] Initializing XFRM netlink socket
[ 6.005249] NET: Registered protocol family 10
[ 6.005808] NET: Registered protocol family 17
[ 6.005811] mpls_gso: MPLS GSO support
[ 6.007054] mce: Using 23 MCE banks
[ 6.007102] microcode: CPU0: patch_level=0x08001250
[ 6.007113] microcode: CPU1: patch_level=0x08001250
[ 6.007126] microcode: CPU2: patch_level=0x08001250
[ 6.011017] microcode: CPU3: patch_level=0x08001250
[ 6.011036] microcode: CPU4: patch_level=0x08001250
[ 6.011053] microcode: CPU5: patch_level=0x08001250
[ 6.011069] microcode: CPU6: patch_level=0x08001250
[ 6.011081] microcode: CPU7: patch_level=0x08001250
[ 6.011090] microcode: CPU8: patch_level=0x08001250
[ 6.011101] microcode: CPU9: patch_level=0x08001250
[ 6.011112] microcode: CPU10: patch_level=0x08001250
[ 6.011122] microcode: CPU11: patch_level=0x08001250
[ 6.011132] microcode: CPU12: patch_level=0x08001250
[ 6.011143] microcode: CPU13: patch_level=0x08001250
[ 6.011153] microcode: CPU14: patch_level=0x08001250
[ 6.011164] microcode: CPU15: patch_level=0x08001250
[ 6.011175] microcode: CPU16: patch_level=0x08001250
[ 6.011187] microcode: CPU17: patch_level=0x08001250
[ 6.011197] microcode: CPU18: patch_level=0x08001250
[ 6.011207] microcode: CPU19: patch_level=0x08001250
[ 6.011216] microcode: CPU20: patch_level=0x08001250
[ 6.011227] microcode: CPU21: patch_level=0x08001250
[ 6.011238] microcode: CPU22: patch_level=0x08001250
[ 6.011248] microcode: CPU23: patch_level=0x08001250
[ 6.011259] microcode: CPU24: patch_level=0x08001250
[ 6.011267] microcode: CPU25: patch_level=0x08001250
[ 6.011278] microcode: CPU26: patch_level=0x08001250
[ 6.011284] microcode: CPU27: patch_level=0x08001250
[ 6.011292] microcode: CPU28: patch_level=0x08001250
[ 6.011303] microcode: CPU29: patch_level=0x08001250
[ 6.011313] microcode: CPU30: patch_level=0x08001250
[ 6.011324] microcode: CPU31: patch_level=0x08001250
[ 6.011335] microcode: CPU32: patch_level=0x08001250
[ 6.011343] microcode: CPU33: patch_level=0x08001250
[ 6.011354] microcode: CPU34: patch_level=0x08001250
[ 6.011376] microcode: CPU35: patch_level=0x08001250
[ 6.011384] microcode: CPU36: patch_level=0x08001250
[ 6.011393] microcode: CPU37: patch_level=0x08001250
[ 6.011404] microcode: CPU38: patch_level=0x08001250
[ 6.011414] microcode: CPU39: patch_level=0x08001250
[ 6.011422] microcode: CPU40: patch_level=0x08001250
[ 6.011433] microcode: CPU41: patch_level=0x08001250
[ 6.011444] microcode: CPU42: patch_level=0x08001250
[ 6.011455] microcode: CPU43: patch_level=0x08001250
[ 6.011463] microcode: CPU44: patch_level=0x08001250
[ 6.011471] microcode: CPU45: patch_level=0x08001250
[ 6.011482] microcode: CPU46: patch_level=0x08001250
[ 6.011493] microcode: CPU47: patch_level=0x08001250
[ 6.011544] microcode: Microcode Update Driver: v2.01 , Peter Oruba
[ 6.011699] PM: Hibernation image not present or could not be loaded.
[ 6.011703] Loading compiled-in X.509 certificates
[ 6.011724] Loaded X.509 cert 'CentOS Linux kpatch signing key: ea0413152cde1d98ebdca3fe6f0230904c9ef717'
[ 6.011738] Loaded X.509 cert 'CentOS Linux Driver update signing key: 7f421ee0ab69461574bb358861dbe77762a4201b'
[ 6.012132] Loaded X.509 cert 'CentOS Linux kernel signing key: 468656045a39b52ff2152c315f6198c3e658f24d'
[ 6.012145] registered taskstats version 1
[ 6.014239] Key type trusted registered
[ 6.015794] Key type encrypted registered
[ 6.015852] IMA: No TPM chip found, activating TPM-bypass! (rc=-19)
[ 6.017833] Magic number: 15:608:988
[ 6.017837] machinecheck machinecheck1: hash matches
[ 6.017877] clockevents clockevent61: hash matches
[ 6.018009] memory memory1632: hash matches
[ 6.018037] memory memory1186: hash matches
[ 6.018060] memory memory845: hash matches
[ 6.018097] memory memory399: hash matches
[ 6.019105] rtc_cmos 00:01: setting system clock to 2019-12-10 13:57:25 UTC (1575986245)
[ 6.410259] usb 3-1.1: new high-speed USB device number 3 using xhci_hcd
[ 6.410269] Switched to clocksource tsc
[ 6.421810] Freeing unused kernel memory: 1876k freed
[ 6.427034] Write protecting the kernel read-only data: 12288k
[ 6.434264] Freeing unused kernel memory: 504k freed
[ 6.440609] Freeing unused kernel memory: 596k freed
[ 6.494336] usb 3-1.1: New USB device found, idVendor=1604, idProduct=10c0
[ 6.496250] systemd[1]: systemd 219 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 -SECCOMP +BLKID +ELFUTILS +KMOD +IDN)
[ 6.497365] systemd[1]: Detected architecture x86-64.
[ 6.497367] systemd[1]: Running in initial RAM disk.
[ 6.529213] usb 3-1.1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[ 6.537229] hub 3-1.1:1.0: USB hub found
[ 6.542208] hub 3-1.1:1.0: 4 ports detected
[ 6.553516] systemd[1]: Set hostname to .
[ 6.588862] systemd[1]: Reached target Timers.
[ 6.597658] systemd[1]: Created slice Root Slice.
[ 6.608488] systemd[1]: Created slice System Slice.
[ 6.611373] usb 3-1.4: new high-speed USB device number 4 using xhci_hcd
[ 6.624421] systemd[1]: Reached target Slices.
[ 6.633482] systemd[1]: Listening on Journal Socket.
[ 6.644425] systemd[1]: Reached target Local File Systems.
[ 6.655901] systemd[1]: Starting Journal Service...
[ 6.665831] systemd[1]: Starting Setup Virtual Console...
[ 6.685337] usb 3-1.4: New USB device found, idVendor=1604, idProduct=10c0 [ 6.685339] usb 3-1.4: New USB device strings: Mfr=0, Product=0, SerialNumber=0 [ 6.688376] systemd[1]: Starting Create list of required static device nodes for the current kernel... [ 6.697234] hub 3-1.4:1.0: USB hub found [ 6.697584] hub 3-1.4:1.0: 4 ports detected [ 6.729459] systemd[1]: Listening on udev Control Socket. [ 6.740448] systemd[1]: Listening on udev Kernel Socket. [ 6.751425] systemd[1]: Reached target Sockets. [ 6.760986] systemd[1]: Starting dracut cmdline hook... [ 6.770808] systemd[1]: Starting Apply Kernel Variables... [ 6.780421] systemd[1]: Reached target Swap. [ 6.789680] systemd[1]: Started Journal Service. [ 6.946237] pps_core: LinuxPPS API ver. 1 registered [ 6.951211] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti [ 6.964480] PTP clock support registered [ 6.970891] megasas: 07.705.02.00-rh1 [ 6.970992] mlx_compat: loading out-of-tree module taints kernel. [ 6.982686] mlx_compat: module verification failed: signature and/or required key missing - tainting kernel [ 6.993344] megaraid_sas 0000:c1:00.0: FW now in Ready state [ 6.999044] megaraid_sas 0000:c1:00.0: 64 bit DMA mask and 32 bit consistent mask [ 6.999400] megaraid_sas 0000:c1:00.0: irq 68 for MSI/MSI-X [ 6.999422] megaraid_sas 0000:c1:00.0: irq 69 for MSI/MSI-X [ 6.999447] megaraid_sas 0000:c1:00.0: irq 70 for MSI/MSI-X [ 6.999468] megaraid_sas 0000:c1:00.0: irq 71 for MSI/MSI-X [ 6.999489] megaraid_sas 0000:c1:00.0: irq 72 for MSI/MSI-X [ 6.999509] megaraid_sas 0000:c1:00.0: irq 73 for MSI/MSI-X [ 6.999531] megaraid_sas 0000:c1:00.0: irq 74 for MSI/MSI-X [ 6.999552] megaraid_sas 0000:c1:00.0: irq 75 for MSI/MSI-X [ 6.999575] megaraid_sas 0000:c1:00.0: irq 76 for MSI/MSI-X [ 6.999596] megaraid_sas 0000:c1:00.0: irq 77 for MSI/MSI-X [ 6.999616] megaraid_sas 0000:c1:00.0: irq 78 for MSI/MSI-X [ 6.999642] megaraid_sas 0000:c1:00.0: irq 79 for MSI/MSI-X [ 6.999672] megaraid_sas 
0000:c1:00.0: irq 80 for MSI/MSI-X [ 6.999697] megaraid_sas 0000:c1:00.0: irq 81 for MSI/MSI-X [ 6.999721] megaraid_sas 0000:c1:00.0: irq 82 for MSI/MSI-X [ 6.999744] megaraid_sas 0000:c1:00.0: irq 83 for MSI/MSI-X [ 6.999769] megaraid_sas 0000:c1:00.0: irq 84 for MSI/MSI-X [ 6.999793] megaraid_sas 0000:c1:00.0: irq 85 for MSI/MSI-X [ 6.999817] megaraid_sas 0000:c1:00.0: irq 86 for MSI/MSI-X [ 6.999843] megaraid_sas 0000:c1:00.0: irq 87 for MSI/MSI-X [ 6.999865] megaraid_sas 0000:c1:00.0: irq 88 for MSI/MSI-X [ 6.999888] megaraid_sas 0000:c1:00.0: irq 89 for MSI/MSI-X [ 6.999913] megaraid_sas 0000:c1:00.0: irq 90 for MSI/MSI-X [ 6.999937] megaraid_sas 0000:c1:00.0: irq 91 for MSI/MSI-X [ 6.999966] megaraid_sas 0000:c1:00.0: irq 92 for MSI/MSI-X [ 6.999993] megaraid_sas 0000:c1:00.0: irq 93 for MSI/MSI-X [ 7.000018] megaraid_sas 0000:c1:00.0: irq 94 for MSI/MSI-X [ 7.000044] megaraid_sas 0000:c1:00.0: irq 95 for MSI/MSI-X [ 7.000068] megaraid_sas 0000:c1:00.0: irq 96 for MSI/MSI-X [ 7.000092] megaraid_sas 0000:c1:00.0: irq 97 for MSI/MSI-X [ 7.000116] megaraid_sas 0000:c1:00.0: irq 98 for MSI/MSI-X [ 7.000140] megaraid_sas 0000:c1:00.0: irq 99 for MSI/MSI-X [ 7.000161] megaraid_sas 0000:c1:00.0: irq 100 for MSI/MSI-X [ 7.000181] megaraid_sas 0000:c1:00.0: irq 101 for MSI/MSI-X [ 7.000201] megaraid_sas 0000:c1:00.0: irq 102 for MSI/MSI-X [ 7.000221] megaraid_sas 0000:c1:00.0: irq 103 for MSI/MSI-X [ 7.000240] megaraid_sas 0000:c1:00.0: irq 104 for MSI/MSI-X [ 7.000262] megaraid_sas 0000:c1:00.0: irq 105 for MSI/MSI-X [ 7.000281] megaraid_sas 0000:c1:00.0: irq 106 for MSI/MSI-X [ 7.000299] megaraid_sas 0000:c1:00.0: irq 107 for MSI/MSI-X [ 7.000319] megaraid_sas 0000:c1:00.0: irq 108 for MSI/MSI-X [ 7.000339] megaraid_sas 0000:c1:00.0: irq 109 for MSI/MSI-X [ 7.000358] megaraid_sas 0000:c1:00.0: irq 110 for MSI/MSI-X [ 7.000386] megaraid_sas 0000:c1:00.0: irq 111 for MSI/MSI-X [ 7.000406] megaraid_sas 0000:c1:00.0: irq 112 for MSI/MSI-X [ 7.000427] megaraid_sas 
0000:c1:00.0: irq 113 for MSI/MSI-X [ 7.000451] megaraid_sas 0000:c1:00.0: irq 114 for MSI/MSI-X [ 7.000475] megaraid_sas 0000:c1:00.0: irq 115 for MSI/MSI-X [ 7.000632] megaraid_sas 0000:c1:00.0: firmware supports msix : (96) [ 7.000634] megaraid_sas 0000:c1:00.0: current msix/online cpus : (48/48) [ 7.000635] megaraid_sas 0000:c1:00.0: RDPQ mode : (disabled) [ 7.000638] megaraid_sas 0000:c1:00.0: Current firmware supports maximum commands: 928 LDIO threshold: 237 [ 7.000969] megaraid_sas 0000:c1:00.0: Configured max firmware commands: 927 [ 7.003315] megaraid_sas 0000:c1:00.0: FW supports sync cache : No [ 7.049263] Compat-mlnx-ofed backport release: 1c4bf42 [ 7.055144] Backport based on mlnx_ofed/mlnx-ofa_kernel-4.0.git 1c4bf42 [ 7.063141] compat.git: mlnx_ofed/mlnx-ofa_kernel-4.0.git [ 7.076381] tg3.c:v3.137 (May 11, 2014) [ 7.076534] libata version 3.00 loaded. [ 7.082410] mpt3sas version 31.00.00.00 loaded [ 7.087709] mpt3sas_cm0: 63 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (263565236 kB) [ 7.093865] tg3 0000:81:00.0 eth0: Tigon3 [partno(BCM95720) rev 5720000] (PCI Express) MAC address d0:94:66:34:4a:7d [ 7.093867] tg3 0000:81:00.0 eth0: attached PHY is 5720C (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[1]) [ 7.093869] tg3 0000:81:00.0 eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1] [ 7.093870] tg3 0000:81:00.0 eth0: dma_rwctrl[00000001] dma_mask[64-bit] [ 7.121162] tg3 0000:81:00.1 eth1: Tigon3 [partno(BCM95720) rev 5720000] (PCI Express) MAC address d0:94:66:34:4a:7e [ 7.121165] tg3 0000:81:00.1 eth1: attached PHY is 5720C (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[1]) [ 7.121167] tg3 0000:81:00.1 eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1] [ 7.121169] tg3 0000:81:00.1 eth1: dma_rwctrl[00000001] dma_mask[64-bit] [ 7.178468] ahci 0000:86:00.2: version 3.0 [ 7.178886] ahci 0000:86:00.2: irq 120 for MSI/MSI-X [ 7.178892] ahci 0000:86:00.2: irq 121 for MSI/MSI-X [ 7.178897] ahci 0000:86:00.2: irq 122 for MSI/MSI-X [ 
7.178901] ahci 0000:86:00.2: irq 123 for MSI/MSI-X [ 7.178905] ahci 0000:86:00.2: irq 124 for MSI/MSI-X [ 7.178908] ahci 0000:86:00.2: irq 125 for MSI/MSI-X [ 7.178912] ahci 0000:86:00.2: irq 126 for MSI/MSI-X [ 7.178916] ahci 0000:86:00.2: irq 127 for MSI/MSI-X [ 7.178919] ahci 0000:86:00.2: irq 128 for MSI/MSI-X [ 7.178923] ahci 0000:86:00.2: irq 129 for MSI/MSI-X [ 7.178927] ahci 0000:86:00.2: irq 130 for MSI/MSI-X [ 7.178930] ahci 0000:86:00.2: irq 131 for MSI/MSI-X [ 7.178935] ahci 0000:86:00.2: irq 132 for MSI/MSI-X [ 7.178938] ahci 0000:86:00.2: irq 133 for MSI/MSI-X [ 7.178942] ahci 0000:86:00.2: irq 134 for MSI/MSI-X [ 7.178945] ahci 0000:86:00.2: irq 135 for MSI/MSI-X [ 7.179034] ahci 0000:86:00.2: AHCI 0001.0301 32 slots 1 ports 6 Gbps 0x1 impl SATA mode [ 7.188024] ahci 0000:86:00.2: flags: 64bit ncq sntf ilck pm led clo only pmp fbs pio slum part [ 7.199692] scsi host2: ahci [ 7.200384] mpt3sas_cm0: IOC Number : 0 [ 7.200387] mpt3sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k [ 7.200506] mpt3sas 0000:84:00.0: irq 136 for MSI/MSI-X [ 7.200531] mpt3sas 0000:84:00.0: irq 137 for MSI/MSI-X [ 7.200560] mpt3sas 0000:84:00.0: irq 138 for MSI/MSI-X [ 7.200585] mpt3sas 0000:84:00.0: irq 139 for MSI/MSI-X [ 7.200608] mpt3sas 0000:84:00.0: irq 140 for MSI/MSI-X [ 7.200630] mpt3sas 0000:84:00.0: irq 141 for MSI/MSI-X [ 7.200654] mpt3sas 0000:84:00.0: irq 142 for MSI/MSI-X [ 7.200678] mpt3sas 0000:84:00.0: irq 143 for MSI/MSI-X [ 7.200717] mpt3sas 0000:84:00.0: irq 144 for MSI/MSI-X [ 7.200751] mpt3sas 0000:84:00.0: irq 145 for MSI/MSI-X [ 7.200777] mpt3sas 0000:84:00.0: irq 146 for MSI/MSI-X [ 7.200799] mpt3sas 0000:84:00.0: irq 147 for MSI/MSI-X [ 7.200821] mpt3sas 0000:84:00.0: irq 148 for MSI/MSI-X [ 7.200844] mpt3sas 0000:84:00.0: irq 149 for MSI/MSI-X [ 7.200867] mpt3sas 0000:84:00.0: irq 150 for MSI/MSI-X [ 7.200890] mpt3sas 0000:84:00.0: irq 151 for MSI/MSI-X [ 7.200917] mpt3sas 0000:84:00.0: irq 152 for MSI/MSI-X [ 7.200938] 
mpt3sas 0000:84:00.0: irq 153 for MSI/MSI-X [ 7.200960] mpt3sas 0000:84:00.0: irq 154 for MSI/MSI-X [ 7.200983] mpt3sas 0000:84:00.0: irq 155 for MSI/MSI-X [ 7.201005] mpt3sas 0000:84:00.0: irq 156 for MSI/MSI-X [ 7.201028] mpt3sas 0000:84:00.0: irq 157 for MSI/MSI-X [ 7.201050] mpt3sas 0000:84:00.0: irq 158 for MSI/MSI-X [ 7.201073] mpt3sas 0000:84:00.0: irq 159 for MSI/MSI-X [ 7.201104] mpt3sas 0000:84:00.0: irq 160 for MSI/MSI-X [ 7.201125] mpt3sas 0000:84:00.0: irq 161 for MSI/MSI-X [ 7.201150] mpt3sas 0000:84:00.0: irq 162 for MSI/MSI-X [ 7.201173] mpt3sas 0000:84:00.0: irq 163 for MSI/MSI-X [ 7.201196] mpt3sas 0000:84:00.0: irq 164 for MSI/MSI-X [ 7.201219] mpt3sas 0000:84:00.0: irq 165 for MSI/MSI-X [ 7.201244] mpt3sas 0000:84:00.0: irq 166 for MSI/MSI-X [ 7.201266] mpt3sas 0000:84:00.0: irq 167 for MSI/MSI-X [ 7.201293] mpt3sas 0000:84:00.0: irq 168 for MSI/MSI-X [ 7.201316] mpt3sas 0000:84:00.0: irq 169 for MSI/MSI-X [ 7.201338] mpt3sas 0000:84:00.0: irq 170 for MSI/MSI-X [ 7.201362] mpt3sas 0000:84:00.0: irq 171 for MSI/MSI-X [ 7.201392] mpt3sas 0000:84:00.0: irq 172 for MSI/MSI-X [ 7.201415] mpt3sas 0000:84:00.0: irq 173 for MSI/MSI-X [ 7.201437] mpt3sas 0000:84:00.0: irq 174 for MSI/MSI-X [ 7.201459] mpt3sas 0000:84:00.0: irq 175 for MSI/MSI-X [ 7.201486] mpt3sas 0000:84:00.0: irq 176 for MSI/MSI-X [ 7.201510] mpt3sas 0000:84:00.0: irq 177 for MSI/MSI-X [ 7.201534] mpt3sas 0000:84:00.0: irq 178 for MSI/MSI-X [ 7.201557] mpt3sas 0000:84:00.0: irq 179 for MSI/MSI-X [ 7.201579] mpt3sas 0000:84:00.0: irq 180 for MSI/MSI-X [ 7.201601] mpt3sas 0000:84:00.0: irq 181 for MSI/MSI-X [ 7.201624] mpt3sas 0000:84:00.0: irq 182 for MSI/MSI-X [ 7.201646] mpt3sas 0000:84:00.0: irq 183 for MSI/MSI-X [ 7.202352] mpt3sas0-msix0: PCI-MSI-X enabled: IRQ 136 [ 7.202353] mpt3sas0-msix1: PCI-MSI-X enabled: IRQ 137 [ 7.202354] mpt3sas0-msix2: PCI-MSI-X enabled: IRQ 138 [ 7.202354] mpt3sas0-msix3: PCI-MSI-X enabled: IRQ 139 [ 7.202355] mpt3sas0-msix4: PCI-MSI-X enabled: IRQ 140 
[ 7.202356] mpt3sas0-msix5: PCI-MSI-X enabled: IRQ 141 [ 7.202356] mpt3sas0-msix6: PCI-MSI-X enabled: IRQ 142 [ 7.202357] mpt3sas0-msix7: PCI-MSI-X enabled: IRQ 143 [ 7.202358] mpt3sas0-msix8: PCI-MSI-X enabled: IRQ 144 [ 7.202358] mpt3sas0-msix9: PCI-MSI-X enabled: IRQ 145 [ 7.202359] mpt3sas0-msix10: PCI-MSI-X enabled: IRQ 146 [ 7.202359] mpt3sas0-msix11: PCI-MSI-X enabled: IRQ 147 [ 7.202360] mpt3sas0-msix12: PCI-MSI-X enabled: IRQ 148 [ 7.202361] mpt3sas0-msix13: PCI-MSI-X enabled: IRQ 149 [ 7.202361] mpt3sas0-msix14: PCI-MSI-X enabled: IRQ 150 [ 7.202362] mpt3sas0-msix15: PCI-MSI-X enabled: IRQ 151 [ 7.202363] mpt3sas0-msix16: PCI-MSI-X enabled: IRQ 152 [ 7.202363] mpt3sas0-msix17: PCI-MSI-X enabled: IRQ 153 [ 7.202364] mpt3sas0-msix18: PCI-MSI-X enabled: IRQ 154 [ 7.202364] mpt3sas0-msix19: PCI-MSI-X enabled: IRQ 155 [ 7.202365] mpt3sas0-msix20: PCI-MSI-X enabled: IRQ 156 [ 7.202371] mpt3sas0-msix21: PCI-MSI-X enabled: IRQ 157 [ 7.202371] mpt3sas0-msix22: PCI-MSI-X enabled: IRQ 158 [ 7.202372] mpt3sas0-msix23: PCI-MSI-X enabled: IRQ 159 [ 7.202373] mpt3sas0-msix24: PCI-MSI-X enabled: IRQ 160 [ 7.202373] mpt3sas0-msix25: PCI-MSI-X enabled: IRQ 161 [ 7.202374] mpt3sas0-msix26: PCI-MSI-X enabled: IRQ 162 [ 7.202374] mpt3sas0-msix27: PCI-MSI-X enabled: IRQ 163 [ 7.202375] mpt3sas0-msix28: PCI-MSI-X enabled: IRQ 164 [ 7.202376] mpt3sas0-msix29: PCI-MSI-X enabled: IRQ 165 [ 7.202376] mpt3sas0-msix30: PCI-MSI-X enabled: IRQ 166 [ 7.202377] mpt3sas0-msix31: PCI-MSI-X enabled: IRQ 167 [ 7.202377] mpt3sas0-msix32: PCI-MSI-X enabled: IRQ 168 [ 7.202378] mpt3sas0-msix33: PCI-MSI-X enabled: IRQ 169 [ 7.202379] mpt3sas0-msix34: PCI-MSI-X enabled: IRQ 170 [ 7.202379] mpt3sas0-msix35: PCI-MSI-X enabled: IRQ 171 [ 7.202380] mpt3sas0-msix36: PCI-MSI-X enabled: IRQ 172 [ 7.202380] mpt3sas0-msix37: PCI-MSI-X enabled: IRQ 173 [ 7.202381] mpt3sas0-msix38: PCI-MSI-X enabled: IRQ 174 [ 7.202382] mpt3sas0-msix39: PCI-MSI-X enabled: IRQ 175 [ 7.202382] mpt3sas0-msix40: PCI-MSI-X 
enabled: IRQ 176 [ 7.202383] mpt3sas0-msix41: PCI-MSI-X enabled: IRQ 177 [ 7.202383] mpt3sas0-msix42: PCI-MSI-X enabled: IRQ 178 [ 7.202384] mpt3sas0-msix43: PCI-MSI-X enabled: IRQ 179 [ 7.202385] mpt3sas0-msix44: PCI-MSI-X enabled: IRQ 180 [ 7.202385] mpt3sas0-msix45: PCI-MSI-X enabled: IRQ 181 [ 7.202386] mpt3sas0-msix46: PCI-MSI-X enabled: IRQ 182 [ 7.202386] mpt3sas0-msix47: PCI-MSI-X enabled: IRQ 183 [ 7.202388] mpt3sas_cm0: iomem(0x00000000ac000000), mapped(0xffff9e875a200000), size(1048576) [ 7.202389] mpt3sas_cm0: ioport(0x0000000000008000), size(256) [ 7.270108] mlx5_core 0000:01:00.0: firmware version: 20.26.1040 [ 7.270136] mlx5_core 0000:01:00.0: 126.016 Gb/s available PCIe bandwidth, limited by 8 GT/s x16 link at 0000:00:03.1 (capable of 252.048 Gb/s with 16 GT/s x16 link) [ 7.278375] mpt3sas_cm0: IOC Number : 0 [ 7.278377] mpt3sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k [ 7.289002] ata1: SATA max UDMA/133 abar m4096@0xc0a02000 port 0xc0a02100 irq 120 [ 7.357376] megaraid_sas 0000:c1:00.0: Init cmd return status SUCCESS for SCSI host 0 [ 7.378375] megaraid_sas 0000:c1:00.0: firmware type : Legacy(64 VD) firmware [ 7.378377] megaraid_sas 0000:c1:00.0: controller type : iMR(0MB) [ 7.378378] megaraid_sas 0000:c1:00.0: Online Controller Reset(OCR) : Enabled [ 7.378379] megaraid_sas 0000:c1:00.0: Secure JBOD support : No [ 7.378380] megaraid_sas 0000:c1:00.0: NVMe passthru support : No [ 7.399883] megaraid_sas 0000:c1:00.0: INIT adapter done [ 7.399885] megaraid_sas 0000:c1:00.0: Jbod map is not supported megasas_setup_jbod_map 5146 [ 7.426272] megaraid_sas 0000:c1:00.0: pci id : (0x1000)/(0x005f)/(0x1028)/(0x1f4b) [ 7.426273] megaraid_sas 0000:c1:00.0: unevenspan support : yes [ 7.426274] megaraid_sas 0000:c1:00.0: firmware crash dump : no [ 7.426275] megaraid_sas 0000:c1:00.0: jbod sync map : no [ 7.426279] scsi host0: Avago SAS based MegaRAID driver [ 7.446461] scsi 0:2:0:0: Direct-Access DELL PERC H330 Mini 4.30 PQ: 0 ANSI: 5 
[ 7.490164] mpt3sas_cm0: Allocated physical memory: size(38831 kB) [ 7.490165] mpt3sas_cm0: Current Controller Queue Depth(7564), Max Controller Queue Depth(7680) [ 7.490166] mpt3sas_cm0: Scatter Gather Elements per IO(128) [ 7.525760] mlx5_core 0000:01:00.0: irq 185 for MSI/MSI-X [ 7.525781] mlx5_core 0000:01:00.0: irq 186 for MSI/MSI-X [ 7.525799] mlx5_core 0000:01:00.0: irq 187 for MSI/MSI-X [ 7.525817] mlx5_core 0000:01:00.0: irq 188 for MSI/MSI-X [ 7.525836] mlx5_core 0000:01:00.0: irq 189 for MSI/MSI-X [ 7.525856] mlx5_core 0000:01:00.0: irq 190 for MSI/MSI-X [ 7.525875] mlx5_core 0000:01:00.0: irq 191 for MSI/MSI-X [ 7.525894] mlx5_core 0000:01:00.0: irq 192 for MSI/MSI-X [ 7.525915] mlx5_core 0000:01:00.0: irq 193 for MSI/MSI-X [ 7.525933] mlx5_core 0000:01:00.0: irq 194 for MSI/MSI-X [ 7.525952] mlx5_core 0000:01:00.0: irq 195 for MSI/MSI-X [ 7.525969] mlx5_core 0000:01:00.0: irq 196 for MSI/MSI-X [ 7.525988] mlx5_core 0000:01:00.0: irq 197 for MSI/MSI-X [ 7.526008] mlx5_core 0000:01:00.0: irq 198 for MSI/MSI-X [ 7.526027] mlx5_core 0000:01:00.0: irq 199 for MSI/MSI-X [ 7.526046] mlx5_core 0000:01:00.0: irq 200 for MSI/MSI-X [ 7.526063] mlx5_core 0000:01:00.0: irq 201 for MSI/MSI-X [ 7.526081] mlx5_core 0000:01:00.0: irq 202 for MSI/MSI-X [ 7.526100] mlx5_core 0000:01:00.0: irq 203 for MSI/MSI-X [ 7.526118] mlx5_core 0000:01:00.0: irq 204 for MSI/MSI-X [ 7.526137] mlx5_core 0000:01:00.0: irq 205 for MSI/MSI-X [ 7.526156] mlx5_core 0000:01:00.0: irq 206 for MSI/MSI-X [ 7.526175] mlx5_core 0000:01:00.0: irq 207 for MSI/MSI-X [ 7.526197] mlx5_core 0000:01:00.0: irq 208 for MSI/MSI-X [ 7.526215] mlx5_core 0000:01:00.0: irq 209 for MSI/MSI-X [ 7.526233] mlx5_core 0000:01:00.0: irq 210 for MSI/MSI-X [ 7.526251] mlx5_core 0000:01:00.0: irq 211 for MSI/MSI-X [ 7.526270] mlx5_core 0000:01:00.0: irq 212 for MSI/MSI-X [ 7.526294] mlx5_core 0000:01:00.0: irq 213 for MSI/MSI-X [ 7.526318] mlx5_core 0000:01:00.0: irq 214 for MSI/MSI-X [ 7.526336] mlx5_core 0000:01:00.0: 
irq 215 for MSI/MSI-X [ 7.526354] mlx5_core 0000:01:00.0: irq 216 for MSI/MSI-X [ 7.526376] mlx5_core 0000:01:00.0: irq 217 for MSI/MSI-X [ 7.526394] mlx5_core 0000:01:00.0: irq 218 for MSI/MSI-X [ 7.526419] mlx5_core 0000:01:00.0: irq 219 for MSI/MSI-X [ 7.526437] mlx5_core 0000:01:00.0: irq 220 for MSI/MSI-X [ 7.526455] mlx5_core 0000:01:00.0: irq 221 for MSI/MSI-X [ 7.526474] mlx5_core 0000:01:00.0: irq 222 for MSI/MSI-X [ 7.526491] mlx5_core 0000:01:00.0: irq 223 for MSI/MSI-X [ 7.526509] mlx5_core 0000:01:00.0: irq 224 for MSI/MSI-X [ 7.526528] mlx5_core 0000:01:00.0: irq 225 for MSI/MSI-X [ 7.526546] mlx5_core 0000:01:00.0: irq 226 for MSI/MSI-X [ 7.526564] mlx5_core 0000:01:00.0: irq 227 for MSI/MSI-X [ 7.526581] mlx5_core 0000:01:00.0: irq 228 for MSI/MSI-X [ 7.526599] mlx5_core 0000:01:00.0: irq 229 for MSI/MSI-X [ 7.526618] mlx5_core 0000:01:00.0: irq 230 for MSI/MSI-X [ 7.526636] mlx5_core 0000:01:00.0: irq 231 for MSI/MSI-X [ 7.526654] mlx5_core 0000:01:00.0: irq 232 for MSI/MSI-X [ 7.526672] mlx5_core 0000:01:00.0: irq 233 for MSI/MSI-X [ 7.527930] mlx5_core 0000:01:00.0: Port module event: module 0, Cable plugged [ 7.528177] mlx5_core 0000:01:00.0: mlx5_pcie_event:303:(pid 318): PCIe slot advertised sufficient power (27W). 
[ 7.535912] mlx5_core 0000:01:00.0: mlx5_fw_tracer_start:776:(pid 299): FWTracer: Ownership granted and active [ 7.619390] ata1: SATA link down (SStatus 0 SControl 300) [ 7.717794] mlx5_ib: Mellanox Connect-IB Infiniband driver v4.7-1.0.0 [ 7.753579] sd 0:2:0:0: [sda] 233308160 512-byte logical blocks: (119 GB/111 GiB) [ 7.761247] sd 0:2:0:0: [sda] Write Protect is off [ 7.763731] mpt3sas_cm0: FW Package Version(12.00.00.00) [ 7.764000] mpt3sas_cm0: SAS3616: FWVersion(12.00.00.00), ChipRevision(0x02), BiosVersion(00.00.00.00) [ 7.764004] mpt3sas_cm0: Protocol=(Initiator,Target,NVMe), Capabilities=(TLR,EEDP,Diag Trace Buffer,Task Set Full,NCQ) [ 7.764073] mpt3sas 0000:84:00.0: Enabled Extended Tags as Controller Supports [ 7.764091] mpt3sas_cm0: : host protection capabilities enabled DIF1 DIF2 DIF3 [ 7.764102] scsi host1: Fusion MPT SAS Host [ 7.764316] mpt3sas_cm0: sending port enable !! [ 7.814591] sd 0:2:0:0: [sda] Mode Sense: 1f 00 10 08 [ 7.814642] sd 0:2:0:0: [sda] Write cache: disabled, read cache: disabled, supports DPO and FUA [ 7.825031] sda: sda1 sda2 sda3 [ 7.828696] sd 0:2:0:0: [sda] Attached SCSI disk [ 7.935645] random: crng init done [ 9.709734] mpt3sas_cm0: hba_port entry: ffff887bf0b86880, port: 255 is added to hba_port list [ 9.870406] mpt3sas_cm0: host_add: handle(0x0001), sas_addr(0x500605b00db90c00), phys(21) [ 9.878981] mpt3sas_cm0: detecting: handle(0x0011), sas_address(0x300705b00db90c00), phy(16) [ 9.887421] mpt3sas_cm0: REPORT_LUNS: handle(0x0011), retries(0) [ 9.893456] mpt3sas_cm0: TEST_UNIT_READY: handle(0x0011), lun(0) [ 9.899901] scsi 1:0:0:0: Enclosure LSI VirtualSES 03 PQ: 0 ANSI: 7 [ 9.908032] scsi 1:0:0:0: set ignore_delay_remove for handle(0x0011) [ 9.914385] scsi 1:0:0:0: SES: handle(0x0011), sas_addr(0x300705b00db90c00), phy(16), device_name(0x300705b00db90c00) [ 9.924984] scsi 1:0:0:0: enclosure logical id(0x300605b00d110c00), slot(16) [ 9.932116] scsi 1:0:0:0: enclosure level(0x0000), connector name( C3 ) [ 9.938834] scsi 
1:0:0:0: serial_number(300605B00D110C00) [ 9.944236] scsi 1:0:0:0: qdepth(1), tagged(0), simple(0), ordered(0), scsi_level(8), cmd_que(0) [ 9.954948] mpt3sas_cm0: log_info(0x31200206): originator(PL), code(0x20), sub_code(0x0206) [ 9.974899] mpt3sas_cm0: detecting: handle(0x0017), sas_address(0x500a0984db2fa920), phy(8) [ 9.983260] mpt3sas_cm0: REPORT_LUNS: handle(0x0017), retries(0) [ 9.989406] mpt3sas_cm0: REPORT_LUNS: handle(0x0017), retries(1) [ 9.998499] mpt3sas_cm0: TEST_UNIT_READY: handle(0x0017), lun(0) [ 10.004844] mpt3sas_cm0: detecting: handle(0x0017), sas_address(0x500a0984db2fa920), phy(8) [ 10.013211] mpt3sas_cm0: REPORT_LUNS: handle(0x0017), retries(0) [ 10.020190] mpt3sas_cm0: TEST_UNIT_READY: handle(0x0017), lun(0) [ 10.026627] mpt3sas_cm0: detecting: handle(0x0017), sas_address(0x500a0984db2fa920), phy(8) [ 10.034994] mpt3sas_cm0: REPORT_LUNS: handle(0x0017), retries(0) [ 10.041943] mpt3sas_cm0: TEST_UNIT_READY: handle(0x0017), lun(0) [ 10.048490] scsi 1:0:1:0: Direct-Access DELL MD34xx 0825 PQ: 0 ANSI: 5 [ 10.056681] scsi 1:0:1:0: SSP: handle(0x0017), sas_addr(0x500a0984db2fa920), phy(8), device_name(0x500a0984db2fa920) [ 10.067197] scsi 1:0:1:0: enclosure logical id(0x300605b00d110c00), slot(5) [ 10.074241] scsi 1:0:1:0: enclosure level(0x0000), connector name( C1 ) [ 10.080961] scsi 1:0:1:0: serial_number(021815000354 ) [ 10.086362] scsi 1:0:1:0: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1) [ 10.109353] scsi 1:0:1:1: Direct-Access DELL MD34xx 0825 PQ: 0 ANSI: 5 [ 10.117518] scsi 1:0:1:1: SSP: handle(0x0017), sas_addr(0x500a0984db2fa920), phy(8), device_name(0x500a0984db2fa920) [ 10.128028] scsi 1:0:1:1: enclosure logical id(0x300605b00d110c00), slot(5) [ 10.135075] scsi 1:0:1:1: enclosure level(0x0000), connector name( C1 ) [ 10.141793] scsi 1:0:1:1: serial_number(021815000354 ) [ 10.147192] scsi 1:0:1:1: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1) [ 10.180011] scsi 1:0:1:1: Mode 
parameters changed [ 10.196606] scsi 1:0:1:2: Direct-Access DELL MD34xx 0825 PQ: 0 ANSI: 5 [ 10.204799] scsi 1:0:1:2: SSP: handle(0x0017), sas_addr(0x500a0984db2fa920), phy(8), device_name(0x500a0984db2fa920) [ 10.215310] scsi 1:0:1:2: enclosure logical id(0x300605b00d110c00), slot(5) [ 10.222357] scsi 1:0:1:2: enclosure level(0x0000), connector name( C1 ) [ 10.229077] scsi 1:0:1:2: serial_number(021815000354 ) [ 10.234475] scsi 1:0:1:2: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1) [ 10.243652] scsi 1:0:1:2: Mode parameters changed [ 10.264605] scsi 1:0:1:31: Direct-Access DELL Universal Xport 0825 PQ: 0 ANSI: 5 [ 10.272875] scsi 1:0:1:31: SSP: handle(0x0017), sas_addr(0x500a0984db2fa920), phy(8), device_name(0x500a0984db2fa920) [ 10.283474] scsi 1:0:1:31: enclosure logical id(0x300605b00d110c00), slot(5) [ 10.290607] scsi 1:0:1:31: enclosure level(0x0000), connector name( C1 ) [ 10.297413] scsi 1:0:1:31: serial_number(021815000354 ) [ 10.302900] scsi 1:0:1:31: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1) [ 10.373868] mpt3sas_cm0: detecting: handle(0x0018), sas_address(0x500a0984dfa1fa20), phy(0) [ 10.382222] mpt3sas_cm0: REPORT_LUNS: handle(0x0018), retries(0) [ 10.388369] mpt3sas_cm0: REPORT_LUNS: handle(0x0018), retries(1) [ 10.397821] mpt3sas_cm0: TEST_UNIT_READY: handle(0x0018), lun(0) [ 10.404138] mpt3sas_cm0: detecting: handle(0x0018), sas_address(0x500a0984dfa1fa20), phy(0) [ 10.412504] mpt3sas_cm0: REPORT_LUNS: handle(0x0018), retries(0) [ 10.420410] mpt3sas_cm0: TEST_UNIT_READY: handle(0x0018), lun(0) [ 10.426696] mpt3sas_cm0: detecting: handle(0x0018), sas_address(0x500a0984dfa1fa20), phy(0) [ 10.435056] mpt3sas_cm0: REPORT_LUNS: handle(0x0018), retries(0) [ 10.441814] mpt3sas_cm0: TEST_UNIT_READY: handle(0x0018), lun(0) [ 10.448339] scsi 1:0:2:0: Direct-Access DELL MD34xx 0825 PQ: 0 ANSI: 5 [ 10.456524] scsi 1:0:2:0: SSP: handle(0x0018), sas_addr(0x500a0984dfa1fa20), phy(0), 
device_name(0x500a0984dfa1fa20) [ 10.467035] scsi 1:0:2:0: enclosure logical id(0x300605b00d110c00), slot(13) [ 10.474170] scsi 1:0:2:0: enclosure level(0x0000), connector name( C3 ) [ 10.480892] scsi 1:0:2:0: serial_number(021825001369 ) [ 10.486297] scsi 1:0:2:0: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1) [ 10.509366] scsi 1:0:2:1: Direct-Access DELL MD34xx 0825 PQ: 0 ANSI: 5 [ 10.517527] scsi 1:0:2:1: SSP: handle(0x0018), sas_addr(0x500a0984dfa1fa20), phy(0), device_name(0x500a0984dfa1fa20) [ 10.528042] scsi 1:0:2:1: enclosure logical id(0x300605b00d110c00), slot(13) [ 10.535175] scsi 1:0:2:1: enclosure level(0x0000), connector name( C3 ) [ 10.541891] scsi 1:0:2:1: serial_number(021825001369 ) [ 10.547294] scsi 1:0:2:1: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1) [ 10.556486] scsi 1:0:2:1: Mode parameters changed [ 10.572610] scsi 1:0:2:31: Direct-Access DELL Universal Xport 0825 PQ: 0 ANSI: 5 [ 10.580907] scsi 1:0:2:31: SSP: handle(0x0018), sas_addr(0x500a0984dfa1fa20), phy(0), device_name(0x500a0984dfa1fa20) [ 10.591507] scsi 1:0:2:31: enclosure logical id(0x300605b00d110c00), slot(13) [ 10.598726] scsi 1:0:2:31: enclosure level(0x0000), connector name( C3 ) [ 10.605531] scsi 1:0:2:31: serial_number(021825001369 ) [ 10.611019] scsi 1:0:2:31: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1) [ 10.634002] mpt3sas_cm0: detecting: handle(0x0019), sas_address(0x500a0984da0f9b14), phy(12) [ 10.642438] mpt3sas_cm0: REPORT_LUNS: handle(0x0019), retries(0) [ 10.648583] mpt3sas_cm0: REPORT_LUNS: handle(0x0019), retries(1) [ 10.655510] mpt3sas_cm0: TEST_UNIT_READY: handle(0x0019), lun(0) [ 10.661837] mpt3sas_cm0: detecting: handle(0x0019), sas_address(0x500a0984da0f9b14), phy(12) [ 10.670288] mpt3sas_cm0: REPORT_LUNS: handle(0x0019), retries(0) [ 10.676942] mpt3sas_cm0: TEST_UNIT_READY: handle(0x0019), lun(0) [ 10.683617] scsi 1:0:3:0: Direct-Access DELL MD34xx 0825 PQ: 0 ANSI: 5 [ 
10.692122] scsi 1:0:3:0: SSP: handle(0x0019), sas_addr(0x500a0984da0f9b14), phy(12), device_name(0x500a0984da0f9b14) [ 10.702720] scsi 1:0:3:0: enclosure logical id(0x300605b00d110c00), slot(1) [ 10.709766] scsi 1:0:3:0: enclosure level(0x0000), connector name( C0 ) [ 10.716489] scsi 1:0:3:0: serial_number(021812047179 ) [ 10.721893] scsi 1:0:3:0: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1) [ 10.742240] scsi 1:0:3:1: Direct-Access DELL MD34xx 0825 PQ: 0 ANSI: 5 [ 10.750408] scsi 1:0:3:1: SSP: handle(0x0019), sas_addr(0x500a0984da0f9b14), phy(12), device_name(0x500a0984da0f9b14) [ 10.761013] scsi 1:0:3:1: enclosure logical id(0x300605b00d110c00), slot(1) [ 10.768057] scsi 1:0:3:1: enclosure level(0x0000), connector name( C0 ) [ 10.774763] scsi 1:0:3:1: serial_number(021812047179 ) [ 10.780168] scsi 1:0:3:1: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1) [ 10.802596] scsi 1:0:3:2: Direct-Access DELL MD34xx 0825 PQ: 0 ANSI: 5 [ 10.810762] scsi 1:0:3:2: SSP: handle(0x0019), sas_addr(0x500a0984da0f9b14), phy(12), device_name(0x500a0984da0f9b14) [ 10.821358] scsi 1:0:3:2: enclosure logical id(0x300605b00d110c00), slot(1) [ 10.828405] scsi 1:0:3:2: enclosure level(0x0000), connector name( C0 ) [ 10.835122] scsi 1:0:3:2: serial_number(021812047179 ) [ 10.840523] scsi 1:0:3:2: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1) [ 10.860594] scsi 1:0:3:31: Direct-Access DELL Universal Xport 0825 PQ: 0 ANSI: 5 [ 10.868845] scsi 1:0:3:31: SSP: handle(0x0019), sas_addr(0x500a0984da0f9b14), phy(12), device_name(0x500a0984da0f9b14) [ 10.879530] scsi 1:0:3:31: enclosure logical id(0x300605b00d110c00), slot(1) [ 10.886662] scsi 1:0:3:31: enclosure level(0x0000), connector name( C0 ) [ 10.893466] scsi 1:0:3:31: serial_number(021812047179 ) [ 10.898954] scsi 1:0:3:31: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1) [ 10.922197] mpt3sas_cm0: detecting: handle(0x001a), 
sas_address(0x500a0984dfa20c14), phy(4) [ 10.930549] mpt3sas_cm0: REPORT_LUNS: handle(0x001a), retries(0) [ 10.936694] mpt3sas_cm0: REPORT_LUNS: handle(0x001a), retries(1) [ 10.944363] mpt3sas_cm0: TEST_UNIT_READY: handle(0x001a), lun(0) [ 10.950678] mpt3sas_cm0: detecting: handle(0x001a), sas_address(0x500a0984dfa20c14), phy(4) [ 10.959038] mpt3sas_cm0: REPORT_LUNS: handle(0x001a), retries(0) [ 10.965752] mpt3sas_cm0: TEST_UNIT_READY: handle(0x001a), lun(0) [ 10.972678] scsi 1:0:4:0: Direct-Access DELL MD34xx 0825 PQ: 0 ANSI: 5 [ 10.980879] scsi 1:0:4:0: SSP: handle(0x001a), sas_addr(0x500a0984dfa20c14), phy(4), device_name(0x500a0984dfa20c14) [ 10.991391] scsi 1:0:4:0: enclosure logical id(0x300605b00d110c00), slot(9) [ 10.998438] scsi 1:0:4:0: enclosure level(0x0000), connector name( C2 ) [ 11.005156] scsi 1:0:4:0: serial_number(021825001558 ) [ 11.010556] scsi 1:0:4:0: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1) [ 11.031246] scsi 1:0:4:1: Direct-Access DELL MD34xx 0825 PQ: 0 ANSI: 5 [ 11.039416] scsi 1:0:4:1: SSP: handle(0x001a), sas_addr(0x500a0984dfa20c14), phy(4), device_name(0x500a0984dfa20c14) [ 11.049926] scsi 1:0:4:1: enclosure logical id(0x300605b00d110c00), slot(9) [ 11.056972] scsi 1:0:4:1: enclosure level(0x0000), connector name( C2 ) [ 11.063677] scsi 1:0:4:1: serial_number(021825001558 ) [ 11.069081] scsi 1:0:4:1: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1) [ 11.091615] scsi 1:0:4:31: Direct-Access DELL Universal Xport 0825 PQ: 0 ANSI: 5 [ 11.099863] scsi 1:0:4:31: SSP: handle(0x001a), sas_addr(0x500a0984dfa20c14), phy(4), device_name(0x500a0984dfa20c14) [ 11.110465] scsi 1:0:4:31: enclosure logical id(0x300605b00d110c00), slot(9) [ 11.117595] scsi 1:0:4:31: enclosure level(0x0000), connector name( C2 ) [ 11.124404] scsi 1:0:4:31: serial_number(021825001558 ) [ 11.129889] scsi 1:0:4:31: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1) [ 15.966439] mpt3sas_cm0: port 
enable: SUCCESS
[ 15.971332] scsi 1:0:1:0: rdac: LUN 0 (IOSHIP) (owned)
[ 15.976746] sd 1:0:1:0: [sdb] 926167040 512-byte logical blocks: (474 GB/441 GiB)
[ 15.984248] sd 1:0:1:0: [sdb] 4096-byte physical blocks
[ 15.989584] scsi 1:0:1:1: rdac: LUN 1 (IOSHIP) (unowned)
[ 15.995193] sd 1:0:1:0: [sdb] Write Protect is off
[ 16.000015] sd 1:0:1:0: [sdb] Mode Sense: 83 00 10 08
[ 16.000018] sd 1:0:1:1: [sdc] 37449707520 512-byte logical blocks: (19.1 TB/17.4 TiB)
[ 16.000200] sd 1:0:1:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 16.000238] scsi 1:0:1:2: rdac: LUN 2 (IOSHIP) (owned)
[ 16.000492] sd 1:0:1:2: [sdd] 37449707520 512-byte logical blocks: (19.1 TB/17.4 TiB)
[ 16.000844] scsi 1:0:2:0: rdac: LUN 0 (IOSHIP) (owned)
[ 16.000965] sd 1:0:1:2: [sdd] Write Protect is off
[ 16.000967] sd 1:0:1:2: [sdd] Mode Sense: 83 00 10 08
[ 16.001085] sd 1:0:2:0: [sde] 37449707520 512-byte logical blocks: (19.1 TB/17.4 TiB)
[ 16.001179] sd 1:0:1:2: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 16.001366] scsi 1:0:2:1: rdac: LUN 1 (IOSHIP) (unowned)
[ 16.001616] sd 1:0:2:0: [sde] Write Protect is off
[ 16.001618] sd 1:0:2:0: [sde] Mode Sense: 83 00 10 08
[ 16.001674] sd 1:0:2:1: [sdf] 37449707520 512-byte logical blocks: (19.1 TB/17.4 TiB)
[ 16.001812] sd 1:0:2:0: [sde] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 16.002050] scsi 1:0:3:0: rdac: LUN 0 (IOSHIP) (unowned)
[ 16.002265] sd 1:0:2:1: [sdf] Write Protect is off
[ 16.002266] sd 1:0:2:1: [sdf] Mode Sense: 83 00 10 08
[ 16.002346] sd 1:0:3:0: [sdg] 926167040 512-byte logical blocks: (474 GB/441 GiB)
[ 16.002348] sd 1:0:3:0: [sdg] 4096-byte physical blocks
[ 16.002464] sd 1:0:2:1: [sdf] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 16.002642] scsi 1:0:3:1: rdac: LUN 1 (IOSHIP) (owned)
[ 16.002876] sd 1:0:3:0: [sdg] Write Protect is off
[ 16.002881] sd 1:0:3:0: [sdg] Mode Sense: 83 00 10 08
[ 16.002932] sd 1:0:3:1: [sdh] 37449707520 512-byte logical blocks: (19.1 TB/17.4 TiB)
[ 16.003118] sd 1:0:1:0: [sdb] Attached SCSI disk
[ 16.003172] sd 1:0:3:0: [sdg] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 16.003450] scsi 1:0:3:2: rdac: LUN 2 (IOSHIP) (unowned)
[ 16.003670] sd 1:0:3:1: [sdh] Write Protect is off
[ 16.003672] sd 1:0:3:1: [sdh] Mode Sense: 83 00 10 08
[ 16.003733] sd 1:0:3:2: [sdi] 37449707520 512-byte logical blocks: (19.1 TB/17.4 TiB)
[ 16.003927] sd 1:0:3:1: [sdh] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 16.004115] scsi 1:0:4:0: rdac: LUN 0 (IOSHIP) (unowned)
[ 16.004460] sd 1:0:4:0: [sdj] 37449707520 512-byte logical blocks: (19.1 TB/17.4 TiB)
[ 16.004474] sd 1:0:3:2: [sdi] Write Protect is off
[ 16.004476] sd 1:0:3:2: [sdi] Mode Sense: 83 00 10 08
[ 16.004720] sd 1:0:3:2: [sdi] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 16.004777] sd 1:0:2:0: [sde] Attached SCSI disk
[ 16.005100] scsi 1:0:4:1: rdac: LUN 1 (IOSHIP) (owned)
[ 16.005387] sd 1:0:4:0: [sdj] Write Protect is off
[ 16.005389] sd 1:0:4:0: [sdj] Mode Sense: 83 00 10 08
[ 16.005410] sd 1:0:4:1: [sdk] 37449707520 512-byte logical blocks: (19.1 TB/17.4 TiB)
[ 16.005673] sd 1:0:4:0: [sdj] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 16.006122] sd 1:0:4:1: [sdk] Write Protect is off
[ 16.006124] sd 1:0:4:1: [sdk] Mode Sense: 83 00 10 08
[ 16.006309] sd 1:0:4:1: [sdk] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 16.006346] sd 1:0:1:2: [sdd] Attached SCSI disk
[ 16.006465] sd 1:0:2:1: [sdf] Attached SCSI disk
[ 16.007661] sd 1:0:3:1: [sdh] Attached SCSI disk
[ 16.007861] sd 1:0:3:0: [sdg] Attached SCSI disk
[ 16.008478] sd 1:0:3:2: [sdi] Attached SCSI disk
[ 16.008983] sd 1:0:4:1: [sdk] Attached SCSI disk
[ 16.009111] sd 1:0:4:0: [sdj] Attached SCSI disk
[ 16.274287] sd 1:0:1:1: [sdc] Write Protect is off
[ 16.279087] sd 1:0:1:1: [sdc] Mode Sense: 83 00 10 08
[ 16.279223] sd 1:0:1:1: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 16.290329] sd 1:0:1:1: [sdc] Attached SCSI disk
[ 16.389265] EXT4-fs (sda2): mounted filesystem with ordered data mode. Opts: (null)
[ 16.611424] systemd-journald[351]: Received SIGTERM from PID 1 (systemd).
[ 16.641077] SELinux: Disabled at runtime.
[ 16.645799] SELinux: Unregistering netfilter hooks
[ 16.688452] type=1404 audit(1575986256.170:2): selinux=0 auid=4294967295 ses=4294967295
[ 16.712833] ip_tables: (C) 2000-2006 Netfilter Core Team
[ 16.719187] systemd[1]: Inserted module 'ip_tables'
[ 16.803389] EXT4-fs (sda2): re-mounted. Opts: (null)
[ 16.814268] systemd-journald[4902]: Received request to flush runtime journal from PID 1
[ 16.864530] piix4_smbus 0000:00:14.0: SMBus Host Controller at 0xb00, revision 0
[ 16.872674] piix4_smbus 0000:00:14.0: Using register 0x2e for SMBus port selection
[ 16.902658] device-mapper: uevent: version 1.0.3
[ 16.908694] device-mapper: ioctl: 4.37.1-ioctl (2018-04-03) initialised: dm-devel@redhat.com
[ 16.919459] ACPI Error: No handler for Region [SYSI] (ffff884d29af6a68) [IPMI] (20130517/evregion-162)
[ 16.931722] ACPI Error: Region IPMI (ID=7) has no handler (20130517/exfldio-305)
[ 16.941958] ACPI Error: Method parse/execution failed [\_SB_.PMI0._GHL] (Node ffff884d29af35a0), AE_NOT_EXIST (20130517/psparse-536)
[ 16.959492] ACPI Error: Method parse/execution failed [\_SB_.PMI0._PMC] (Node ffff884d29af3500), AE_NOT_EXIST (20130517/psparse-536)
[ 16.972325] ACPI Exception: AE_NOT_EXIST, Evaluating _PMC (20130517/power_meter-753)
[ 16.972415] ccp 0000:02:00.2: 3 command queues available
[ 16.972471] ccp 0000:02:00.2: irq 235 for MSI/MSI-X
[ 16.972494] ccp 0000:02:00.2: irq 236 for MSI/MSI-X
[ 16.972553] ccp 0000:02:00.2: Queue 2 can access 4 LSB regions
[ 16.972555] ccp 0000:02:00.2: Queue 3 can access 4 LSB regions
[ 16.972556] ccp 0000:02:00.2: Queue 4 can access 4 LSB regions
[ 16.972558] ccp 0000:02:00.2: Queue 0 gets LSB 4
[ 16.972560] ccp 0000:02:00.2: Queue 1 gets LSB 5
[ 16.972561] ccp 0000:02:00.2: Queue 2 gets LSB 6
[ 16.973375] ccp 0000:02:00.2: enabled
[ 16.973578] ccp 0000:03:00.1: 5 command queues available
[ 16.973628] ccp 0000:03:00.1: irq 238 for MSI/MSI-X
[ 16.973676] ccp 0000:03:00.1: Queue 0 can access 7 LSB regions
[ 16.973678] ccp 0000:03:00.1: Queue 1 can access 7 LSB regions
[ 16.973680] ccp 0000:03:00.1: Queue 2 can access 7 LSB regions
[ 16.973683] ccp 0000:03:00.1: Queue 3 can access 7 LSB regions
[ 16.973685] ccp 0000:03:00.1: Queue 4 can access 7 LSB regions
[ 16.973686] ccp 0000:03:00.1: Queue 0 gets LSB 1
[ 16.973688] ccp 0000:03:00.1: Queue 1 gets LSB 2
[ 16.973689] ccp 0000:03:00.1: Queue 2 gets LSB 3
[ 16.973690] ccp 0000:03:00.1: Queue 3 gets LSB 4
[ 16.973691] ccp 0000:03:00.1: Queue 4 gets LSB 5
[ 16.974629] ccp 0000:03:00.1: enabled
[ 16.974833] ccp 0000:41:00.2: 3 command queues available
[ 16.974876] ccp 0000:41:00.2: irq 240 for MSI/MSI-X
[ 16.974897] ccp 0000:41:00.2: irq 241 for MSI/MSI-X
[ 16.974945] ccp 0000:41:00.2: Queue 2 can access 4 LSB regions
[ 16.974948] ccp 0000:41:00.2: Queue 3 can access 4 LSB regions
[ 16.974950] ccp 0000:41:00.2: Queue 4 can access 4 LSB regions
[ 16.974951] ccp 0000:41:00.2: Queue 0 gets LSB 4
[ 16.974953] ccp 0000:41:00.2: Queue 1 gets LSB 5
[ 16.974954] ccp 0000:41:00.2: Queue 2 gets LSB 6
[ 16.975610] ccp 0000:41:00.2: enabled
[ 16.975756] ccp 0000:42:00.1: 5 command queues available
[ 16.975800] ccp 0000:42:00.1: irq 243 for MSI/MSI-X
[ 16.975827] ccp 0000:42:00.1: Queue 0 can access 7 LSB regions
[ 16.975829] ccp 0000:42:00.1: Queue 1 can access 7 LSB regions
[ 16.975831] ccp 0000:42:00.1: Queue 2 can access 7 LSB regions
[ 16.975833] ccp 0000:42:00.1: Queue 3 can access 7 LSB regions
[ 16.975835] ccp 0000:42:00.1: Queue 4 can access 7 LSB regions
[ 16.975836] ccp 0000:42:00.1: Queue 0 gets LSB 1
[ 16.975837] ccp 0000:42:00.1: Queue 1 gets LSB 2
[ 16.975838] ccp 0000:42:00.1: Queue 2 gets LSB 3
[ 16.975839] ccp 0000:42:00.1: Queue 3 gets LSB 4
[ 16.975840] ccp 0000:42:00.1: Queue 4 gets LSB 5
[ 16.977575] ccp 0000:42:00.1: enabled
[ 16.977821] ccp 0000:85:00.2: 3 command queues available
[ 16.977868] ccp 0000:85:00.2: irq 245 for MSI/MSI-X
[ 16.977892] ccp 0000:85:00.2: irq 246 for MSI/MSI-X
[ 16.977988] ccp 0000:85:00.2: Queue 2 can access 4 LSB regions
[ 16.977991] ccp 0000:85:00.2: Queue 3 can access 4 LSB regions
[ 16.977993] ccp 0000:85:00.2: Queue 4 can access 4 LSB regions
[ 16.977995] ccp 0000:85:00.2: Queue 0 gets LSB 4
[ 16.977996] ccp 0000:85:00.2: Queue 1 gets LSB 5
[ 16.977997] ccp 0000:85:00.2: Queue 2 gets LSB 6
[ 16.978994] ccp 0000:85:00.2: enabled
[ 16.979134] ccp 0000:86:00.1: 5 command queues available
[ 16.979175] ccp 0000:86:00.1: irq 248 for MSI/MSI-X
[ 16.979274] ccp 0000:86:00.1: Queue 0 can access 7 LSB regions
[ 16.979276] ccp 0000:86:00.1: Queue 1 can access 7 LSB regions
[ 16.979278] ccp 0000:86:00.1: Queue 2 can access 7 LSB regions
[ 16.979280] ccp 0000:86:00.1: Queue 3 can access 7 LSB regions
[ 16.979282] ccp 0000:86:00.1: Queue 4 can access 7 LSB regions
[ 16.979283] ccp 0000:86:00.1: Queue 0 gets LSB 1
[ 16.979285] ccp 0000:86:00.1: Queue 1 gets LSB 2
[ 16.979286] ccp 0000:86:00.1: Queue 2 gets LSB 3
[ 16.979287] ccp 0000:86:00.1: Queue 3 gets LSB 4
[ 16.979288] ccp 0000:86:00.1: Queue 4 gets LSB 5
[ 16.981240] ccp 0000:86:00.1: enabled
[ 16.982416] ccp 0000:c2:00.2: 3 command queues available
[ 16.982484] ccp 0000:c2:00.2: irq 250 for MSI/MSI-X
[ 16.982509] ccp 0000:c2:00.2: irq 251 for MSI/MSI-X
[ 16.982616] ccp 0000:c2:00.2: Queue 2 can access 4 LSB regions
[ 16.982619] ccp 0000:c2:00.2: Queue 3 can access 4 LSB regions
[ 16.982621] ccp 0000:c2:00.2: Queue 4 can access 4 LSB regions
[ 16.982623] ccp 0000:c2:00.2: Queue 0 gets LSB 4
[ 16.982624] ccp 0000:c2:00.2: Queue 1 gets LSB 5
[ 16.982625] ccp 0000:c2:00.2: Queue 2 gets LSB 6
[ 16.983623] ccp 0000:c2:00.2: enabled
[ 16.983797] ccp 0000:c3:00.1: 5 command queues available
[ 16.983844] ccp 0000:c3:00.1: irq 253 for MSI/MSI-X
[ 16.983931] ccp 0000:c3:00.1: Queue 0 can access 7 LSB regions
[ 16.983934] ccp 0000:c3:00.1: Queue 1 can access 7 LSB regions
[ 16.983936] ccp 0000:c3:00.1: Queue 2 can access 7 LSB regions
[ 16.983938] ccp 0000:c3:00.1: Queue 3 can access 7 LSB regions
[ 16.983940] ccp 0000:c3:00.1: Queue 4 can access 7 LSB regions
[ 16.983942] ccp 0000:c3:00.1: Queue 0 gets LSB 1
[ 16.983943] ccp 0000:c3:00.1: Queue 1 gets LSB 2
[ 16.983944] ccp 0000:c3:00.1: Queue 2 gets LSB 3
[ 16.983945] ccp 0000:c3:00.1: Queue 3 gets LSB 4
[ 16.983947] ccp 0000:c3:00.1: Queue 4 gets LSB 5
[ 16.989108] ccp 0000:c3:00.1: enabled
[ 17.073519] cryptd: max_cpu_qlen set to 1000
[ 17.130530] sd 0:2:0:0: Attached scsi generic sg0 type 0
[ 17.130675] scsi 1:0:0:0: Attached scsi generic sg1 type 13
[ 17.130779] sd 1:0:1:0: Attached scsi generic sg2 type 0
[ 17.130825] sd 1:0:1:1: Attached scsi generic sg3 type 0
[ 17.130864] sd 1:0:1:2: Attached scsi generic sg4 type 0
[ 17.130956] scsi 1:0:1:31: Attached scsi generic sg5 type 0
[ 17.131017] sd 1:0:2:0: Attached scsi generic sg6 type 0
[ 17.131057] sd 1:0:2:1: Attached scsi generic sg7 type 0
[ 17.131122] scsi 1:0:2:31: Attached scsi generic sg8 type 0
[ 17.131159] sd 1:0:3:0: Attached scsi generic sg9 type 0
[ 17.131196] sd 1:0:3:1: Attached scsi generic sg10 type 0
[ 17.131239] sd 1:0:3:2: Attached scsi generic sg11 type 0
[ 17.131275] scsi 1:0:3:31: Attached scsi generic sg12 type 0
[ 17.131309] sd 1:0:4:0: Attached scsi generic sg13 type 0
[ 17.131342] sd 1:0:4:1: Attached scsi generic sg14 type 0
[ 17.131383] scsi 1:0:4:31: Attached scsi generic sg15 type 0
[ 17.594981] ipmi message handler version 39.2
[ 17.601922] AVX2 version of gcm_enc/dec engaged.
[ 17.607805] AES CTR mode by8 optimization enabled
[ 17.614644] input: PC Speaker as /devices/platform/pcspkr/input/input2
[ 17.615567] ipmi device interface
[ 17.632800] sd 1:0:1:0: Embedded Enclosure Device
[ 17.634408] alg: No test for __gcm-aes-aesni (__driver-gcm-aes-aesni)
[ 17.634538] alg: No test for __generic-gcm-aes-aesni (__driver-generic-gcm-aes-aesni)
[ 17.656010] IPMI System Interface driver
[ 17.658503] sd 1:0:1:1: Embedded Enclosure Device
[ 17.661407] sd 1:0:1:2: Embedded Enclosure Device
[ 17.664548] scsi 1:0:1:31: Embedded Enclosure Device
[ 17.667469] sd 1:0:2:0: Embedded Enclosure Device
[ 17.670044] sd 1:0:2:1: Embedded Enclosure Device
[ 17.672599] scsi 1:0:2:31: Embedded Enclosure Device
[ 17.674879] sd 1:0:3:0: Embedded Enclosure Device
[ 17.678321] sd 1:0:3:1: Embedded Enclosure Device
[ 17.679122] ipmi_si dmi-ipmi-si.0: ipmi_platform: probing via SMBIOS
[ 17.679125] ipmi_si: SMBIOS: io 0xca8 regsize 1 spacing 4 irq 10
[ 17.679126] ipmi_si: Adding SMBIOS-specified kcs state machine
[ 17.679185] ipmi_si IPI0001:00: ipmi_platform: probing via ACPI
[ 17.679211] ipmi_si IPI0001:00: [io 0x0ca8] regsize 1 spacing 4 irq 10
[ 17.679212] ipmi_si dmi-ipmi-si.0: Removing SMBIOS-specified kcs state machine in favor of ACPI
[ 17.679213] ipmi_si: Adding ACPI-specified kcs state machine
[ 17.679409] ipmi_si: Trying ACPI-specified kcs state machine at i/o address 0xca8, slave address 0x20, irq 10
[ 17.681026] sd 1:0:3:2: Embedded Enclosure Device
[ 17.684034] scsi 1:0:3:31: Embedded Enclosure Device
[ 17.686994] sd 1:0:4:0: Embedded Enclosure Device
[ 17.689152] sd 1:0:4:1: Embedded Enclosure Device
[ 17.691225] scsi 1:0:4:31: Embedded Enclosure Device
[ 17.693318] ses 1:0:0:0: Attached Enclosure device
[ 17.697778] dcdbas dcdbas: Dell Systems Management Base Driver (version 5.6.0-3.3)
[ 17.705501] ipmi_si IPI0001:00: The BMC does not support setting the recv irq bit, compensating, but the BMC needs to be fixed.
[ 17.710579] ipmi_si IPI0001:00: Using irq 10
[ 17.735040] ipmi_si IPI0001:00: Found new BMC (man_id: 0x0002a2, prod_id: 0x0100, dev_id: 0x20)
[ 17.848121] ipmi_si IPI0001:00: IPMI kcs interface initialized
[ 17.909013] kvm: Nested Paging enabled
[ 17.916024] MCE: In-kernel MCE decoding enabled.
[ 17.924279] AMD64 EDAC driver v3.4.0
[ 17.927883] EDAC amd64: DRAM ECC enabled.
[ 17.931905] EDAC amd64: F17h detected (node 0).
[ 17.936493] EDAC MC: UMC0 chip selects:
[ 17.936495] EDAC amd64: MC: 0: 0MB 1: 0MB
[ 17.941207] EDAC amd64: MC: 2: 16383MB 3: 16383MB
[ 17.941208] EDAC amd64: MC: 4: 0MB 5: 0MB
[ 17.941211] EDAC amd64: MC: 6: 0MB 7: 0MB
[ 17.941214] EDAC MC: UMC1 chip selects:
[ 17.941215] EDAC amd64: MC: 0: 0MB 1: 0MB
[ 17.941216] EDAC amd64: MC: 2: 16383MB 3: 16383MB
[ 17.941216] EDAC amd64: MC: 4: 0MB 5: 0MB
[ 17.941217] EDAC amd64: MC: 6: 0MB 7: 0MB
[ 17.941217] EDAC amd64: using x8 syndromes.
[ 17.941218] EDAC amd64: MCT channel count: 2
[ 17.941390] EDAC MC0: Giving out device to 'amd64_edac' 'F17h': DEV 0000:00:18.3
[ 17.941396] EDAC amd64: DRAM ECC enabled.
[ 17.941397] EDAC amd64: F17h detected (node 1).
[ 17.941439] EDAC MC: UMC0 chip selects:
[ 17.941440] EDAC amd64: MC: 0: 0MB 1: 0MB
[ 17.941441] EDAC amd64: MC: 2: 16383MB 3: 16383MB
[ 17.941442] EDAC amd64: MC: 4: 0MB 5: 0MB
[ 17.941443] EDAC amd64: MC: 6: 0MB 7: 0MB
[ 17.941445] EDAC MC: UMC1 chip selects:
[ 17.941446] EDAC amd64: MC: 0: 0MB 1: 0MB
[ 17.941446] EDAC amd64: MC: 2: 16383MB 3: 16383MB
[ 17.941447] EDAC amd64: MC: 4: 0MB 5: 0MB
[ 17.941449] EDAC amd64: MC: 6: 0MB 7: 0MB
[ 17.941451] EDAC amd64: using x8 syndromes.
[ 17.941458] EDAC amd64: MCT channel count: 2
[ 17.941623] EDAC MC1: Giving out device to 'amd64_edac' 'F17h': DEV 0000:00:19.3
[ 17.941629] EDAC amd64: DRAM ECC enabled.
[ 17.941630] EDAC amd64: F17h detected (node 2).
[ 17.941674] EDAC MC: UMC0 chip selects:
[ 17.941674] EDAC amd64: MC: 0: 0MB 1: 0MB
[ 17.941675] EDAC amd64: MC: 2: 16383MB 3: 16383MB
[ 17.941677] EDAC amd64: MC: 4: 0MB 5: 0MB
[ 17.941679] EDAC amd64: MC: 6: 0MB 7: 0MB
[ 17.941682] EDAC MC: UMC1 chip selects:
[ 17.941682] EDAC amd64: MC: 0: 0MB 1: 0MB
[ 17.941683] EDAC amd64: MC: 2: 16383MB 3: 16383MB
[ 17.941684] EDAC amd64: MC: 4: 0MB 5: 0MB
[ 17.941684] EDAC amd64: MC: 6: 0MB 7: 0MB
[ 17.941685] EDAC amd64: using x8 syndromes.
[ 17.941685] EDAC amd64: MCT channel count: 2
[ 17.942533] EDAC MC2: Giving out device to 'amd64_edac' 'F17h': DEV 0000:00:1a.3
[ 17.942540] EDAC amd64: DRAM ECC enabled.
[ 17.942541] EDAC amd64: F17h detected (node 3).
[ 17.942588] EDAC MC: UMC0 chip selects:
[ 17.942589] EDAC amd64: MC: 0: 0MB 1: 0MB
[ 17.942590] EDAC amd64: MC: 2: 16383MB 3: 16383MB
[ 17.942591] EDAC amd64: MC: 4: 0MB 5: 0MB
[ 17.942592] EDAC amd64: MC: 6: 0MB 7: 0MB
[ 17.942595] EDAC MC: UMC1 chip selects:
[ 17.942596] EDAC amd64: MC: 0: 0MB 1: 0MB
[ 17.942596] EDAC amd64: MC: 2: 16383MB 3: 16383MB
[ 17.942597] EDAC amd64: MC: 4: 0MB 5: 0MB
[ 17.942598] EDAC amd64: MC: 6: 0MB 7: 0MB
[ 17.942598] EDAC amd64: using x8 syndromes.
[ 17.942598] EDAC amd64: MCT channel count: 2
[ 17.943731] EDAC MC3: Giving out device to 'amd64_edac' 'F17h': DEV 0000:00:1b.3
[ 17.943863] EDAC PCI0: Giving out device to module 'amd64_edac' controller 'EDAC PCI controller': DEV '0000:00:18.0' (POLLED)
[ 44.398072] device-mapper: multipath round-robin: version 1.2.0 loaded
[ 62.904673] Adding 4194300k swap on /dev/sda3. Priority:-2 extents:1 across:4194300k FS
[ 62.946694] type=1305 audit(1575986302.427:3): audit_pid=11921 old=0 auid=4294967295 ses=4294967295 res=1
[ 62.967496] RPC: Registered named UNIX socket transport module.
[ 62.973668] RPC: Registered udp transport module.
[ 62.979757] RPC: Registered tcp transport module.
[ 62.985849] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 63.635709] mlx5_core 0000:01:00.0: slow_pci_heuristic:5575:(pid 12222): Max link speed = 100000, PCI BW = 126016
[ 63.646032] mlx5_core 0000:01:00.0: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[ 63.654307] mlx5_core 0000:01:00.0: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[ 64.094186] tg3 0000:81:00.0: irq 254 for MSI/MSI-X
[ 64.094200] tg3 0000:81:00.0: irq 255 for MSI/MSI-X
[ 64.094212] tg3 0000:81:00.0: irq 256 for MSI/MSI-X
[ 64.094223] tg3 0000:81:00.0: irq 257 for MSI/MSI-X
[ 64.094237] tg3 0000:81:00.0: irq 258 for MSI/MSI-X
[ 64.220377] IPv6: ADDRCONF(NETDEV_UP): em1: link is not ready
[ 67.717106] tg3 0000:81:00.0 em1: Link is up at 1000 Mbps, full duplex
[ 67.723660] tg3 0000:81:00.0 em1: Flow control is on for TX and on for RX
[ 67.730479] tg3 0000:81:00.0 em1: EEE is enabled
[ 67.735123] IPv6: ADDRCONF(NETDEV_CHANGE): em1: link becomes ready
[ 68.574384] IPv6: ADDRCONF(NETDEV_UP): ib0: link is not ready
[ 68.864842] IPv6: ADDRCONF(NETDEV_CHANGE): ib0: link becomes ready
[ 72.711096] FS-Cache: Loaded
[ 72.741270] FS-Cache: Netfs 'nfs' registered for caching
[ 72.750774] Key type dns_resolver registered
[ 72.779275] NFS: Registering the id_resolver key type
[ 72.785509] Key type id_resolver registered
[ 72.791060] Key type id_legacy registered
[ 1328.760841] LNet: HW NUMA nodes: 4, HW CPU cores: 48, npartitions: 4
[ 1328.768503] alg: No test for adler32 (adler32-zlib)
[ 1329.568321] Lustre: Lustre: Build Version: 2.12.3_4_g142b4d4
[ 1329.674607] LNet: 38618:0:(config.c:1627:lnet_inet_enumerate()) lnet: Ignoring interface em2: it's down
[ 1329.684382] LNet: Using FastReg for registration
[ 1329.700801] LNet: Added LNI 10.0.10.51@o2ib7 [8/256/0/180]
[ 1426.899279] LDISKFS-fs (dm-3): file extents enabled, maximum tree depth=5
[ 1426.985480] LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,acl,no_mbcache,nodelalloc
[ 1428.467837] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.110.62@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 1444.620305] LustreError: 38844:0:(mgc_request.c:249:do_config_log_add()) MGC10.0.10.51@o2ib7: failed processing log, type 1: rc = -5
[ 1458.043384] LNetError: 318:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[ 1463.675411] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.54@o2ib7: 0 seconds
[ 1464.708427] LNetError: 38679:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) lpni 10.0.10.54@o2ib7 added to recovery queue. Health = 900
[ 1467.675434] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.54@o2ib7: 0 seconds
[ 1469.675458] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.54@o2ib7: 0 seconds
[ 1469.685693] LustreError: 137-5: fir-MDT0000_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 1475.675493] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.54@o2ib7: 0 seconds
[ 1484.675562] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.54@o2ib7: 0 seconds
[ 1492.947931] LustreError: 137-5: fir-MDT0000_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 1497.675630] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.54@o2ib7: 0 seconds
[ 1497.685716] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 2 previous similar messages
[ 1504.731690] LNetError: 318:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[ 1504.744775] Lustre: fir-MDT0001: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900
[ 1504.913217] Lustre: fir-MDD0001: changelog on
[ 1504.921482] Lustre: fir-MDT0001: in recovery but waiting for the first client to connect
[ 1504.945588] Lustre: fir-MDT0001: Will be in recovery for at least 2:30, or until 1290 clients reconnect
[ 1505.506292] Lustre: fir-MDT0001: Connection restored to fir-MDT0002-mdtlov_UUID (at 10.0.10.53@o2ib7)
[ 1506.040259] Lustre: fir-MDT0001: Connection restored to ac744819-a0e9-dce1-af3e-f5ed5c20fc63 (at 10.9.104.14@o2ib4)
[ 1506.050696] Lustre: Skipped 81 previous similar messages
[ 1507.097988] Lustre: fir-MDT0001: Connection restored to 0f0b3b20-bdc9-8cb4-ded0-8300910ff5a5 (at 10.9.107.21@o2ib4)
[ 1507.108418] Lustre: Skipped 92 previous similar messages
[ 1509.154821] Lustre: fir-MDT0001: Connection restored to 882378af-0b41-73ee-5c10-5cc51464645c (at 10.9.108.22@o2ib4)
[ 1509.165256] Lustre: Skipped 110 previous similar messages
[ 1513.155634] Lustre: fir-MDT0001: Connection restored to 8660dc7a-172c-047f-9f20-55fab5f17314 (at 10.9.104.57@o2ib4)
[ 1513.166065] Lustre: Skipped 712 previous similar messages
[ 1514.731770] LNetError: 318:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[ 1515.077117] Lustre: 39206:0:(ldlm_lib.c:1765:extend_recovery_timer()) fir-MDT0001: extended recovery timer reaching hard limit: 900, extend: 1
[ 1518.050907] Lustre: 39206:0:(ldlm_lib.c:1765:extend_recovery_timer()) fir-MDT0001: extended recovery timer reaching hard limit: 900, extend: 1
[ 1518.095567] Lustre: fir-MDT0001: Recovery over after 0:13, of 1290 clients 1290 recovered and 0 were evicted.
[ 1521.548090] Lustre: fir-MDT0001: Connection restored to fir-MDT0001-lwp-OST0034_UUID (at 10.0.10.109@o2ib7)
[ 1521.557839] Lustre: Skipped 328 previous similar messages
[ 1529.723875] LNetError: 318:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[ 1544.731949] LNetError: 318:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[ 1559.740051] LNetError: 318:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[ 1574.676134] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.54@o2ib7: 0 seconds
[ 1574.686214] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 1 previous similar message
[ 1589.724241] LNetError: 318:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[ 1589.735995] LNetError: 318:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 1 previous similar message
[ 1619.676430] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.54@o2ib7: 0 seconds
[ 1619.686513] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 1 previous similar message
[ 1633.276659] LustreError: 11-0: fir-OST0007-osc-MDT0001: operation ost_statfs to node 10.0.10.101@o2ib7 failed: rc = -107
[ 1633.276665] Lustre: fir-OST0003-osc-MDT0001: Connection to fir-OST0003 (at 10.0.10.101@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[ 1633.303588] LustreError: Skipped 5 previous similar messages
[ 1634.732537] LNetError: 318:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[ 1634.744295] LNetError: 318:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 2 previous similar messages
[ 1688.256254] LustreError: 137-5: fir-MDT0000_UUID: not available for connect from 10.0.10.102@o2ib7 (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 1709.676999] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.54@o2ib7: 0 seconds
[ 1709.687080] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 1 previous similar message
[ 1709.696309] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[ 1709.708215] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 4 previous similar messages
[ 1720.451549] Lustre: fir-MDT0001: Connection restored to (at 10.0.10.102@o2ib7)
[ 1720.458875] Lustre: Skipped 49 previous similar messages
[ 1721.141203] LustreError: 137-5: fir-MDT0000_UUID: not available for connect from 10.0.10.102@o2ib7 (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 1781.516578] Lustre: fir-MDT0001: Connection restored to 10.0.10.102@o2ib7 (at 10.0.10.102@o2ib7)
[ 1781.525365] Lustre: Skipped 1 previous similar message
[ 1787.077712] LustreError: 137-5: fir-MDT0000_UUID: not available for connect from 10.0.10.102@o2ib7 (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 1839.677835] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.54@o2ib7: 0 seconds
[ 1839.687915] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 3 previous similar messages
[ 1839.697225] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[ 1839.709149] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 8 previous similar messages
[ 1847.193526] LustreError: 137-5: fir-MDT0000_UUID: not available for connect from 10.0.10.102@o2ib7 (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 1857.748141] Lustre: fir-OST000b-osc-MDT0001: Connection restored to 10.0.10.102@o2ib7 (at 10.0.10.102@o2ib7)
[ 1857.757988] Lustre: Skipped 2 previous similar messages
[ 1907.213265] LustreError: 137-5: fir-MDT0000_UUID: not available for connect from 10.0.10.102@o2ib7 (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 1971.488955] LustreError: 137-5: fir-MDT0000_UUID: not available for connect from 10.0.10.102@o2ib7 (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 2030.670265] Lustre: fir-MDT0001: Connection restored to 10.0.10.102@o2ib7 (at 10.0.10.102@o2ib7)
[ 2030.679056] Lustre: Skipped 4 previous similar messages
[ 2088.927629] LustreError: 11-0: fir-OST0013-osc-MDT0001: operation ost_statfs to node 10.0.10.103@o2ib7 failed: rc = -107
[ 2088.927638] Lustre: fir-OST0011-osc-MDT0001: Connection to fir-OST0011 (at 10.0.10.103@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[ 2088.927640] Lustre: Skipped 5 previous similar messages
[ 2088.959785] LustreError: Skipped 4 previous similar messages
[ 2109.679590] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.54@o2ib7: 0 seconds
[ 2109.689679] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 8 previous similar messages
[ 2109.698991] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[ 2109.710918] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 18 previous similar messages
[ 2131.188180] LustreError: 137-5: fir-MDT0000_UUID: not available for connect from 10.0.10.104@o2ib7 (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 2288.545961] Lustre: fir-MDT0001: Connection restored to 10.0.10.104@o2ib7 (at 10.0.10.104@o2ib7)
[ 2288.554764] Lustre: Skipped 5 previous similar messages
[ 2420.324143] LustreError: 137-5: fir-MDT0000_UUID: not available for connect from 10.0.10.104@o2ib7 (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 2420.341509] LustreError: Skipped 4 previous similar messages
[ 2559.682679] LustreError: 11-0: fir-OST0021-osc-MDT0001: operation ost_statfs to node 10.0.10.105@o2ib7 failed: rc = -107
[ 2559.682687] Lustre: fir-OST001d-osc-MDT0001: Connection to fir-OST001d (at 10.0.10.105@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[ 2559.682690] Lustre: Skipped 5 previous similar messages
[ 2559.714851] LustreError: Skipped 6 previous similar messages
[ 2569.698735] LustreError: 11-0: fir-OST002f-osc-MDT0001: operation ost_statfs to node 10.0.10.107@o2ib7 failed: rc = -107
[ 2569.698742] Lustre: fir-OST0027-osc-MDT0001: Connection to fir-OST0027 (at 10.0.10.107@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[ 2569.698745] Lustre: Skipped 5 previous similar messages
[ 2569.730900] LustreError: Skipped 5 previous similar messages
[ 2584.722812] LustreError: 11-0: fir-OST0035-osc-MDT0001: operation ost_statfs to node 10.0.10.109@o2ib7 failed: rc = -107
[ 2584.722832] Lustre: fir-OST0033-osc-MDT0001: Connection to fir-OST0033 (at 10.0.10.109@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[ 2584.722834] Lustre: Skipped 5 previous similar messages
[ 2584.755051] LustreError: Skipped 5 previous similar messages
[ 2594.738863] LustreError: 11-0: fir-OST003d-osc-MDT0001: operation ost_statfs to node 10.0.10.111@o2ib7 failed: rc = -107
[ 2594.738878] Lustre: fir-OST0047-osc-MDT0001: Connection to fir-OST0047 (at 10.0.10.111@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[ 2594.738881] Lustre: Skipped 5 previous similar messages
[ 2594.771020] LustreError: Skipped 5 previous similar messages
[ 2614.771012] LustreError: 11-0: fir-OST0051-osc-MDT0001: operation ost_statfs to node 10.0.10.113@o2ib7 failed: rc = -107
[ 2614.771024] Lustre: fir-OST004b-osc-MDT0001: Connection to fir-OST004b (at 10.0.10.113@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[ 2614.771027] Lustre: Skipped 4 previous similar messages
[ 2614.803172] LustreError: Skipped 5 previous similar messages
[ 2624.738950] LNetError: 39686:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[ 2624.750884] LNetError: 39686:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 34 previous similar messages
[ 2729.683623] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.54@o2ib7: 0 seconds
[ 2729.693705] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 5 previous similar messages
[ 2803.516262] Lustre: fir-MDT0001: Connection restored to (at 10.0.10.114@o2ib7)
[ 2803.523584] Lustre: Skipped 26 previous similar messages
[ 2944.937773] LustreError: 137-5: fir-MDT0000_UUID: not available for connect from 10.0.10.114@o2ib7 (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 2944.955144] LustreError: Skipped 29 previous similar messages
[ 3240.686792] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[ 3240.698710] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 41 previous similar messages
[ 3330.687315] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.54@o2ib7: 0 seconds
[ 3330.697401] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 32 previous similar messages
[ 3823.624570] LustreError: 11-0: fir-MDT0000-osp-MDT0001: operation ldlm_enqueue to node 10.0.10.52@o2ib7 failed: rc = -107
[ 3823.635524] LustreError: Skipped 6 previous similar messages
[ 3823.641194] Lustre: fir-MDT0000-osp-MDT0001: Connection to fir-MDT0000 (at 10.0.10.52@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[ 3823.657191] Lustre: Skipped 11 previous similar messages
[ 3845.690517] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[ 3845.702420] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 41 previous similar messages
[ 3845.938932] LustreError: 137-5: fir-MDT0000_UUID: not available for connect from 10.9.101.20@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 3845.956306] LustreError: Skipped 6 previous similar messages
[ 3918.737966] LNetError: 38679:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.54@o2ib7: -125
[ 3931.871144] LDISKFS-fs (dm-0): file extents enabled, maximum tree depth=5
[ 3931.955991] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,acl,no_mbcache,nodelalloc
[ 3939.691105] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.54@o2ib7: 1 seconds
[ 3939.701192] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 45 previous similar messages
[ 3961.749235] LNetError: 38679:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.54@o2ib7: -125
[ 3975.760323] LNetError: 38679:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.54@o2ib7: -125
[ 3975.773843] Lustre: fir-MDT0001: Connection restored to 10.0.10.51@o2ib7 (at 0@lo)
[ 3975.781433] Lustre: Skipped 52 previous similar messages
[ 3993.771429] LNetError: 38679:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.54@o2ib7: -125
[ 3993.785185] Lustre: fir-MDT0000: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900
[ 3993.875512] Lustre: fir-MDD0000: changelog on
[ 3993.880825] Lustre: fir-MDT0000: in recovery but waiting for the first client to connect
[ 3994.138910] Lustre: fir-MDT0000: Will be in recovery for at least 2:30, or until 1290 clients reconnect
[ 3995.900506] LustreError: 39388:0:(tgt_handler.c:525:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff886bf346f080 x1652453607966240/t0(0) o601->fir-MDT0000-lwp-OST002c_UUID@10.0.10.107@o2ib7:221/0 lens 336/0 e 0 to 0 dl 1575990241 ref 1 fl Interpret:/0/ffffffff rc 0/-1
[ 3995.926482] LustreError: 39388:0:(tgt_handler.c:525:tgt_filter_recovery_request()) Skipped 1 previous similar message
[ 3996.655844] LustreError: 38901:0:(tgt_handler.c:525:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff886bf525f080 x1652453608045088/t0(0) o601->fir-MDT0000-lwp-OST0026_UUID@10.0.10.107@o2ib7:222/0 lens 336/0 e 0 to 0 dl 1575990242 ref 1 fl Interpret:/0/ffffffff rc 0/-1
[ 3996.681811] LustreError: 38901:0:(tgt_handler.c:525:tgt_filter_recovery_request()) Skipped 15 previous similar messages
[ 3998.433428] LustreError: 38905:0:(tgt_handler.c:525:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff888becad0900 x1652453160710704/t0(0) o601->fir-MDT0000-lwp-OST0012_UUID@10.0.10.103@o2ib7:449/0 lens 336/0 e 0 to 0 dl 1575990469 ref 1 fl Interpret:/0/ffffffff rc 0/-1
[ 3998.459376] LustreError: 38905:0:(tgt_handler.c:525:tgt_filter_recovery_request()) Skipped 16 previous similar messages
[ 4000.446504] LustreError: 39445:0:(tgt_handler.c:525:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff885bd1ec9680 x1652542437489184/t0(0) o601->fir-MDT0000-lwp-OST003b_UUID@10.0.10.110@o2ib7:419/0 lens 336/0 e 0 to 0 dl 1575990439 ref 1 fl Interpret:/0/ffffffff rc 0/-1
[ 4000.472487] LustreError: 39445:0:(tgt_handler.c:525:tgt_filter_recovery_request()) Skipped 49 previous similar messages
[ 4001.036520] LustreError: 167-0: fir-MDT0000-lwp-MDT0001: This client was evicted by fir-MDT0000; in progress operations using this service will fail.
[ 4004.449704] LustreError: 38899:0:(tgt_handler.c:525:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff885be7b91b00 x1652453261776256/t0(0) o601->fir-MDT0000-lwp-OST001e_UUID@10.0.10.105@o2ib7:423/0 lens 336/0 e 0 to 0 dl 1575990443 ref 1 fl Interpret:/0/ffffffff rc 0/-1 [ 4004.475654] LustreError: 38899:0:(tgt_handler.c:525:tgt_filter_recovery_request()) Skipped 81 previous similar messages [ 4012.460841] LustreError: 40968:0:(tgt_handler.c:525:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff888bd9aa8000 x1652452417930336/t0(0) o601->fir-MDT0000-lwp-OST0046_UUID@10.0.10.111@o2ib7:493/0 lens 336/0 e 0 to 0 dl 1575990513 ref 1 fl Interpret:/0/ffffffff rc 0/-1 [ 4012.486789] LustreError: 40968:0:(tgt_handler.c:525:tgt_filter_recovery_request()) Skipped 333 previous similar messages [ 4014.593218] Lustre: 40919:0:(ldlm_lib.c:1765:extend_recovery_timer()) fir-MDT0000: extended recovery timer reaching hard limit: 900, extend: 1 [ 4014.606002] Lustre: 40919:0:(ldlm_lib.c:1765:extend_recovery_timer()) Skipped 1 previous similar message [ 4014.664030] Lustre: fir-MDT0000: Recovery over after 0:21, of 1290 clients 1290 recovered and 0 were evicted. [ 4058.211823] Lustre: 38718:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1575990290/real 0] req@ffff887be3563f00 x1652542748116640/t0(0) o400->MGC10.0.10.51@o2ib7@10.0.10.52@o2ib7:26/25 lens 224/224 e 0 to 1 dl 1575990297 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1 [ 4058.239100] LustreError: 166-1: MGC10.0.10.51@o2ib7: Connection to MGS (at 10.0.10.52@o2ib7) was lost; in progress operations using this service will fail [ 4450.694256] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. 
Health = 900 [ 4450.706166] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 77 previous similar messages [ 4540.694759] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.54@o2ib7: 0 seconds [ 4540.704842] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 81 previous similar messages [ 5053.697768] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900 [ 5053.709677] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 72 previous similar messages [ 5142.699240] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.52@o2ib7: 0 seconds [ 5142.709325] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 70 previous similar messages [ 5666.701387] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900 [ 5666.713297] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 65 previous similar messages [ 5727.493774] LustreError: 11-0: fir-MDT0003-osp-MDT0000: operation out_update to node 10.0.10.53@o2ib7 failed: rc = -107 [ 5727.504563] Lustre: fir-MDT0003-osp-MDT0000: Connection to fir-MDT0003 (at 10.0.10.53@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [ 5727.520578] Lustre: Skipped 1 previous similar message [ 5729.765853] LustreError: 11-0: fir-MDT0003-osp-MDT0001: operation mds_statfs to node 10.0.10.53@o2ib7 failed: rc = -107 [ 5729.776640] Lustre: fir-MDT0003-osp-MDT0001: Connection to fir-MDT0003 (at 10.0.10.53@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [ 5799.881055] LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.8.18.21@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[ 5799.898359] LustreError: Skipped 1384 previous similar messages [ 5825.959689] Lustre: fir-MDT0000: Connection restored to 10.0.10.54@o2ib7 (at 10.0.10.54@o2ib7) [ 5825.968308] Lustre: Skipped 1389 previous similar messages [ 6023.414730] Lustre: Failing over fir-MDT0001 [ 6023.451836] LustreError: 39319:0:(ldlm_lockd.c:2324:ldlm_cancel_handler()) ldlm_cancel from 10.9.101.40@o2ib4 arrived at 1575992262 with bad export cookie 14105850204137837618 [ 6023.452580] Lustre: fir-MDT0001: Not available for connect from 10.8.21.2@o2ib6 (stopping) [ 6023.475767] LustreError: 39319:0:(ldlm_lock.c:2710:ldlm_lock_dump_handle()) ### ### ns: mdt-fir-MDT0001_UUID lock: ffff887bbe6a33c0/0xc3c20c0650922e5c lrc: 3/0,0 mode: CR/CR res: [0x240039b5a:0x34fc:0x0].0x0 bits 0x8/0x0 rrc: 25 type: IBT flags: 0x40000000000000 nid: 10.9.101.40@o2ib4 remote: 0x6b5a3d080a587381 expref: 33 pid: 39419 timeout: 0 lvb_type: 3 [ 6023.864861] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.8.24.1@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server. [ 6023.882059] LustreError: Skipped 1386 previous similar messages [ 6024.071544] LustreError: 11-0: fir-MDT0001-osp-MDT0000: operation mds_statfs to node 0@lo failed: rc = -107 [ 6024.081294] Lustre: fir-MDT0001-osp-MDT0000: Connection to fir-MDT0001 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 6024.334543] Lustre: server umount fir-MDT0001 complete [ 6024.888141] LustreError: 41013:0:(ldlm_lockd.c:2324:ldlm_cancel_handler()) ldlm_cancel from 10.9.101.36@o2ib4 arrived at 1575992264 with bad export cookie 14105850204137842210 [ 6102.707328] Lustre: fir-MDT0000: Connection restored to 10.0.10.52@o2ib7 (at 10.0.10.52@o2ib7) [ 6102.715960] Lustre: Skipped 4 previous similar messages [ 6189.020375] LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. 
Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 6189.344878] Lustre: Evicted from MGS (at 10.0.10.51@o2ib7) after server handle changed from 0xbba64b52f5111279 to 0xc3c20c065249ff79 [ 6257.424810] Lustre: MGS: Connection restored to 620d1500-6c55-6b8e-5d6d-f72ab08ba0d5 (at 10.9.102.46@o2ib4) [ 6257.434550] Lustre: Skipped 500 previous similar messages [ 6973.666668] Lustre: MGS: haven't heard from client 192d92f7-72ee-e6af-660a-7b659d74bb6a (at 10.9.0.62@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bdde66400, cur 1575993213 expire 1575993063 last 1575992986 [ 6981.634236] Lustre: fir-MDT0000: haven't heard from client 82b9ac9e-bd42-fb9c-cb3e-f327857b510c (at 10.9.0.62@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff884c23446000, cur 1575993221 expire 1575993071 last 1575992994 [ 7744.145627] Lustre: MGS: Connection restored to 4437bcee-37db-e5e0-e8d5-9055e0a77d74 (at 10.9.106.45@o2ib4) [ 7744.155363] Lustre: Skipped 806 previous similar messages [ 9429.858138] Lustre: MGS: Connection restored to (at 10.9.0.62@o2ib4) [ 9431.648137] Lustre: fir-MDT0000: haven't heard from client cec884d3-ca4b-8127-2f6b-7762665aa5f8 (at 10.9.0.64@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba9191800, cur 1575995671 expire 1575995521 last 1575995444 [10538.751961] Lustre: MGS: Connection restored to fbefd9c2-b03e-16ab-7b85-ec9f835d33da (at 10.9.105.22@o2ib4) [10538.761715] Lustre: Skipped 1 previous similar message [11619.365002] Lustre: MGS: Connection restored to (at 10.9.0.64@o2ib4) [12052.674909] Lustre: fir-MDT0000: haven't heard from client fb9a2d5e-e9b3-4fb9-b988-9954fcfb0920 (at 10.8.0.66@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887ba9461800, cur 1575998292 expire 1575998142 last 1575998065 [12052.696525] Lustre: Skipped 1 previous similar message [14105.058904] Lustre: MGS: Connection restored to fb9a2d5e-e9b3-4fb9-b988-9954fcfb0920 (at 10.8.0.66@o2ib6) [14105.068476] Lustre: Skipped 1 previous similar message [15085.815112] Lustre: MGS: Connection restored to (at 10.9.110.39@o2ib4) [15085.821737] Lustre: Skipped 1 previous similar message [15594.692622] Lustre: MGS: haven't heard from client c4027f9f-ee2a-72d3-b779-f352c130a817 (at 10.8.21.28@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887d4b8a7400, cur 1576001834 expire 1576001684 last 1576001607 [15594.713656] Lustre: Skipped 1 previous similar message [17331.170897] Lustre: MGS: Connection restored to 5d310239-acc9-4 (at 10.9.108.39@o2ib4) [17670.273460] Lustre: MGS: Connection restored to (at 10.8.21.14@o2ib6) [17670.280001] Lustre: Skipped 1 previous similar message [17684.870317] Lustre: MGS: Connection restored to 98c710cf-a183-35fe-d60d-8494e153f1c3 (at 10.8.21.13@o2ib6) [17684.879972] Lustre: Skipped 1 previous similar message [17687.350375] Lustre: MGS: Connection restored to 172ec88c-3454-1411-8e15-a9b5202e9e30 (at 10.8.21.8@o2ib6) [17687.359943] Lustre: Skipped 3 previous similar messages [17697.299183] Lustre: MGS: Connection restored to 40a204f8-61bd-7bf5-8e8b-66a640362528 (at 10.8.21.28@o2ib6) [17697.308836] Lustre: Skipped 3 previous similar messages [17707.506559] Lustre: MGS: Connection restored to 13210e44-89fd-e522-ded9-67ae564de904 (at 10.8.21.20@o2ib6) [17707.516212] Lustre: Skipped 1 previous similar message [17723.912006] Lustre: MGS: Connection restored to (at 10.8.20.18@o2ib6) [17723.918550] Lustre: Skipped 11 previous similar messages [17757.867638] Lustre: MGS: Connection restored to (at 10.8.20.15@o2ib6) [17757.874171] Lustre: Skipped 27 previous similar messages [17847.625729] Lustre: MGS: Connection restored to 5f11dd29-1211-44a2-2612-f8309cf085b3 (at 10.8.21.18@o2ib6) 
[17847.635383] Lustre: Skipped 31 previous similar messages [18699.322808] Lustre: MGS: Connection restored to (at 10.8.22.12@o2ib6) [18699.329341] Lustre: Skipped 1 previous similar message [18722.187538] Lustre: MGS: Connection restored to 92c08489-d99f-9692-0d8e-5d862ef77698 (at 10.8.22.5@o2ib6) [18722.197106] Lustre: Skipped 1 previous similar message [18796.400265] Lustre: MGS: Connection restored to (at 10.8.20.8@o2ib6) [18796.406721] Lustre: Skipped 5 previous similar messages [20764.092335] Lustre: MGS: Connection restored to (at 10.8.22.4@o2ib6) [20764.098787] Lustre: Skipped 1 previous similar message [25839.679520] Lustre: MGS: Connection restored to (at 10.8.20.27@o2ib6) [25839.686057] Lustre: Skipped 1 previous similar message [28048.761950] Lustre: fir-MDT0000: haven't heard from client 8fbd1a16-d09d-1ef7-e10d-4e68dc0a9f97 (at 10.8.23.32@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba9775c00, cur 1576014288 expire 1576014138 last 1576014061 [28048.783676] Lustre: Skipped 25 previous similar messages [30224.092136] Lustre: MGS: Connection restored to 8fbd1a16-d09d-1ef7-e10d-4e68dc0a9f97 (at 10.8.23.32@o2ib6) [30224.101810] Lustre: Skipped 1 previous similar message [31837.235359] Lustre: MGS: Connection restored to 0bbd53e2-6989-83e6-f126-86a473496205 (at 10.8.21.36@o2ib6) [31837.245014] Lustre: Skipped 1 previous similar message [34293.808430] Lustre: fir-MDT0000: haven't heard from client ee4590b6-1057-e690-5db0-89b0af3963cd (at 10.8.22.30@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887ba94db000, cur 1576020533 expire 1576020383 last 1576020306 [34293.830134] Lustre: Skipped 1 previous similar message [34669.767415] Lustre: MGS: Connection restored to c20915b7-72a8-8f0f-a961-7c81095a2283 (at 10.8.23.29@o2ib6) [34669.777079] Lustre: Skipped 1 previous similar message [36378.430328] Lustre: MGS: Connection restored to ee4590b6-1057-e690-5db0-89b0af3963cd (at 10.8.22.30@o2ib6) [36378.439981] Lustre: Skipped 1 previous similar message [39842.421792] Lustre: MGS: Connection restored to (at 10.8.23.20@o2ib6) [39842.428330] Lustre: Skipped 1 previous similar message [44495.623183] Lustre: MGS: Connection restored to 43d748a2-b8c5-e7f9-8b00-d16d4390ff4d (at 10.8.22.6@o2ib6) [44495.632751] Lustre: Skipped 1 previous similar message [45199.867986] Lustre: fir-MDT0000: haven't heard from client b6bab463-5f5c-8f5c-f09a-8f0ce0f6e1cd (at 10.8.21.31@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba9393800, cur 1576031439 expire 1576031289 last 1576031212 [45199.889718] Lustre: Skipped 1 previous similar message [45275.864511] Lustre: fir-MDT0000: haven't heard from client 7515dbe4-f1c8-844a-9186-76f9c6288c34 (at 10.9.104.2@o2ib4) in 222 seconds. I think it's dead, and I am evicting it. 
exp ffff887ba96ff000, cur 1576031515 expire 1576031365 last 1576031293 [45275.886223] Lustre: Skipped 9 previous similar messages [46427.989255] Lustre: MGS: Connection restored to (at 10.9.114.14@o2ib4) [46427.995875] Lustre: Skipped 1 previous similar message [46475.261379] Lustre: MGS: Connection restored to (at 10.8.19.6@o2ib6) [46475.267837] Lustre: Skipped 1 previous similar message [46664.182712] Lustre: MGS: Connection restored to (at 10.9.110.71@o2ib4) [46664.189337] Lustre: Skipped 1 previous similar message [46703.521112] Lustre: MGS: Connection restored to (at 10.9.107.9@o2ib4) [46703.527653] Lustre: Skipped 1 previous similar message [46787.977596] Lustre: MGS: Connection restored to e8872901-9e69-2d9a-e57a-55077a64186b (at 10.9.109.25@o2ib4) [46787.987341] Lustre: Skipped 1 previous similar message [46918.263755] Lustre: MGS: Connection restored to 2ad8ff13-d978-9373-7245-882c6479cc4c (at 10.9.110.63@o2ib4) [46918.273499] Lustre: Skipped 1 previous similar message [46935.186516] Lustre: MGS: Connection restored to (at 10.9.110.62@o2ib4) [46935.193146] Lustre: Skipped 5 previous similar messages [47159.520317] Lustre: MGS: Connection restored to b5acf087-1850-f5e1-236a-4cc1bab1a9f0 (at 10.9.104.34@o2ib4) [47159.530061] Lustre: Skipped 5 previous similar messages [47267.141330] Lustre: MGS: Connection restored to b6bab463-5f5c-8f5c-f09a-8f0ce0f6e1cd (at 10.8.21.31@o2ib6) [47267.150980] Lustre: Skipped 7 previous similar messages [47484.359051] Lustre: MGS: Connection restored to 7eb73248-6ba3-525f-5dd7-7492b5394353 (at 10.8.28.9@o2ib6) [47484.368640] Lustre: Skipped 9 previous similar messages [49517.888394] Lustre: fir-MDT0000: haven't heard from client aadbd140-afe6-3cc5-5efa-1bf64465f6e7 (at 10.8.20.34@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887ba935d800, cur 1576035757 expire 1576035607 last 1576035530 [49517.910101] Lustre: Skipped 27 previous similar messages [51628.758894] Lustre: MGS: Connection restored to aadbd140-afe6-3cc5-5efa-1bf64465f6e7 (at 10.8.20.34@o2ib6) [51628.768548] Lustre: Skipped 17 previous similar messages [56371.222172] Lustre: MGS: Connection restored to 55ff50e7-08a4-be07-5499-ccc18f03f2c9 (at 10.8.23.17@o2ib6) [56371.231826] Lustre: Skipped 1 previous similar message [63338.070275] Lustre: MGS: Connection restored to 77f07ca8-e3bd-72f6-4ac1-3da8889522b3 (at 10.8.22.19@o2ib6) [63338.079944] Lustre: Skipped 1 previous similar message [63840.180855] Lustre: MGS: Connection restored to (at 10.8.20.5@o2ib6) [63840.187303] Lustre: Skipped 1 previous similar message [69476.008413] Lustre: fir-MDT0000: haven't heard from client 09a03217-f2a1-2632-097f-38339f6cbc7c (at 10.8.22.1@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba97e9000, cur 1576055715 expire 1576055565 last 1576055488 [69476.030047] Lustre: Skipped 1 previous similar message [69526.346880] Lustre: MGS: Connection restored to 37c7e464-6686-fdc0-1c81-eae75026a910 (at 10.8.22.2@o2ib6) [69526.356455] Lustre: Skipped 1 previous similar message [71421.391276] Lustre: MGS: Connection restored to 1b1ace85-4b01-f903-bb83-ddb9142a20b0 (at 10.8.23.25@o2ib6) [71421.400935] Lustre: Skipped 1 previous similar message [71595.960113] Lustre: MGS: Connection restored to (at 10.8.22.1@o2ib6) [71595.966566] Lustre: Skipped 1 previous similar message [72786.030811] Lustre: fir-MDT0000: haven't heard from client d48dfcab-ce8f-b93c-3409-a3e76df7c945 (at 10.8.23.22@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887ba9243400, cur 1576059025 expire 1576058875 last 1576058798 [72786.052510] Lustre: Skipped 1 previous similar message [74967.332878] Lustre: MGS: Connection restored to (at 10.8.23.22@o2ib6) [74967.339413] Lustre: Skipped 1 previous similar message [90733.802783] Lustre: MGS: Connection restored to (at 10.8.22.7@o2ib6) [90733.809231] Lustre: Skipped 1 previous similar message [95530.182682] Lustre: fir-MDT0000: haven't heard from client 5a6b489d-8a0c-1dc7-c222-8c5330c92213 (at 10.8.8.20@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888be35ea000, cur 1576081769 expire 1576081619 last 1576081542 [95530.204317] Lustre: Skipped 1 previous similar message [95711.169211] Lustre: MGS: haven't heard from client 672e75f9-4fe3-4 (at 10.9.109.25@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886bf5ad0400, cur 1576081950 expire 1576081800 last 1576081723 [95711.188504] Lustre: Skipped 17 previous similar messages [95718.410032] Lustre: MGS: Connection restored to 714da8dd-1047-4 (at 10.9.107.20@o2ib4) [95718.417961] Lustre: Skipped 1 previous similar message [95965.630803] Lustre: MGS: Connection restored to (at 10.9.110.71@o2ib4) [95965.637426] Lustre: Skipped 1 previous similar message [95980.615239] Lustre: MGS: Connection restored to e8872901-9e69-2d9a-e57a-55077a64186b (at 10.9.109.25@o2ib4) [95980.624981] Lustre: Skipped 1 previous similar message [96963.347706] Lustre: MGS: Connection restored to (at 10.9.117.46@o2ib4) [96963.354331] Lustre: Skipped 1 previous similar message [96996.203330] Lustre: MGS: Connection restored to 4c497e0b-ea41-4 (at 10.8.9.1@o2ib6) [96996.210989] Lustre: Skipped 1 previous similar message [97210.428873] Lustre: MGS: Connection restored to 8a77a7b3-28b8-5200-390a-7fe51bf1be0a (at 10.8.7.5@o2ib6) [97210.438361] Lustre: Skipped 1 previous similar message [97303.768927] Lustre: MGS: Connection restored to (at 10.9.101.60@o2ib4) [97303.775551] Lustre: Skipped 1 previous similar 
message [97320.432227] Lustre: MGS: Connection restored to 54fd6f2e-cb6c-4 (at 10.9.101.57@o2ib4) [97320.440151] Lustre: Skipped 1 previous similar message [97330.612917] Lustre: MGS: Connection restored to c658bf97-675e-4 (at 10.9.101.59@o2ib4) [97330.620844] Lustre: Skipped 1 previous similar message [97419.901319] Lustre: MGS: Connection restored to 5a6b489d-8a0c-1dc7-c222-8c5330c92213 (at 10.8.8.20@o2ib6) [97419.910884] Lustre: Skipped 1 previous similar message [97624.213395] Lustre: MGS: Connection restored to fc841094-f1fd-2756-1968-f74105b220e6 (at 10.8.8.30@o2ib6) [97624.222964] Lustre: Skipped 1 previous similar message [97875.445012] Lustre: MGS: Connection restored to (at 10.9.102.48@o2ib4) [97875.451642] Lustre: Skipped 5 previous similar messages [98333.584180] Lustre: MGS: Connection restored to 6676e5f3-c59e-c628-05b4-c9153b23c3f7 (at 10.8.21.16@o2ib6) [98333.593858] Lustre: Skipped 3 previous similar messages [106300.766587] Lustre: MGS: Connection restored to 5ce2e68e-76b2-bbc3-75c5-66a5c2b02651 (at 10.8.23.15@o2ib6) [106300.776331] Lustre: Skipped 3 previous similar messages [107711.237837] Lustre: fir-MDT0000: haven't heard from client 45ffa07c-203c-dad9-8f0d-e714fc6465b8 (at 10.8.22.11@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886bf9936800, cur 1576093950 expire 1576093800 last 1576093723 [107711.259633] Lustre: Skipped 3 previous similar messages [109388.246940] Lustre: fir-MDT0000: haven't heard from client 704e8622-7442-8eb3-b4e3-c86a69ef45af (at 10.8.20.21@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887ba958c800, cur 1576095627 expire 1576095477 last 1576095400 [109388.268742] Lustre: Skipped 1 previous similar message [109777.164640] Lustre: MGS: Connection restored to (at 10.8.22.11@o2ib6) [109777.171281] Lustre: Skipped 1 previous similar message [109788.569226] Lustre: MGS: Connection restored to 4f86dcb5-8d8c-1599-bd44-005eb718eb65 (at 10.8.22.10@o2ib6) [109788.578969] Lustre: Skipped 1 previous similar message [109844.309564] Lustre: MGS: Connection restored to (at 10.8.23.23@o2ib6) [109844.316184] Lustre: Skipped 1 previous similar message [110243.384853] Lustre: 39434:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576096475/real 1576096475] req@ffff884c235b2400 x1652542886044416/t0(0) o104->fir-MDT0000@10.9.112.17@o2ib4:15/16 lens 296/224 e 0 to 1 dl 1576096482 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [110250.411899] Lustre: 39434:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576096482/real 1576096482] req@ffff884c235b2400 x1652542886044416/t0(0) o104->fir-MDT0000@10.9.112.17@o2ib4:15/16 lens 296/224 e 0 to 1 dl 1576096489 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [110257.438933] Lustre: 39434:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576096489/real 1576096489] req@ffff884c235b2400 x1652542886044416/t0(0) o104->fir-MDT0000@10.9.112.17@o2ib4:15/16 lens 296/224 e 0 to 1 dl 1576096496 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [110264.465975] Lustre: 39434:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576096496/real 1576096496] req@ffff884c235b2400 x1652542886044416/t0(0) o104->fir-MDT0000@10.9.112.17@o2ib4:15/16 lens 296/224 e 0 to 1 dl 1576096503 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [110271.493014] Lustre: 39434:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576096503/real 1576096503] req@ffff884c235b2400 
x1652542886044416/t0(0) o104->fir-MDT0000@10.9.112.17@o2ib4:15/16 lens 296/224 e 0 to 1 dl 1576096510 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [110285.520095] Lustre: 39434:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576096517/real 1576096517] req@ffff884c235b2400 x1652542886044416/t0(0) o104->fir-MDT0000@10.9.112.17@o2ib4:15/16 lens 296/224 e 0 to 1 dl 1576096524 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [110285.547535] Lustre: 39434:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 1 previous similar message [110306.557212] Lustre: 39434:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576096538/real 1576096538] req@ffff884c235b2400 x1652542886044416/t0(0) o104->fir-MDT0000@10.9.112.17@o2ib4:15/16 lens 296/224 e 0 to 1 dl 1576096545 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [110306.584636] Lustre: 39434:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 2 previous similar messages [110341.594418] Lustre: 39434:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576096573/real 1576096573] req@ffff884c235b2400 x1652542886044416/t0(0) o104->fir-MDT0000@10.9.112.17@o2ib4:15/16 lens 296/224 e 0 to 1 dl 1576096580 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [110341.621853] Lustre: 39434:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 4 previous similar messages [110390.631711] LustreError: 39434:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) ### client (nid 10.9.112.17@o2ib4) failed to reply to blocking AST (req@ffff884c235b2400 x1652542886044416 status 0 rc -110), evict it ns: mdt-fir-MDT0000_UUID lock: ffff888bf167de80/0xc3c20c06994047fb lrc: 4/0,0 mode: PR/PR res: [0x20003963a:0x2ae:0x0].0x0 bits 0x13/0x0 rrc: 4 type: IBT flags: 0x60200400000020 nid: 10.9.112.17@o2ib4 remote: 0x66c20da50bf8090b expref: 420 pid: 39396 timeout: 110532 lvb_type: 0 [110390.674684] LustreError: 138-a: fir-MDT0000: A client on nid 10.9.112.17@o2ib4 was 
evicted due to a lock blocking callback time out: rc -110 [110390.687397] LustreError: 38883:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 154s: evicting client at 10.9.112.17@o2ib4 ns: mdt-fir-MDT0000_UUID lock: ffff888bf167de80/0xc3c20c06994047fb lrc: 3/0,0 mode: PR/PR res: [0x20003963a:0x2ae:0x0].0x0 bits 0x13/0x0 rrc: 4 type: IBT flags: 0x60200400000020 nid: 10.9.112.17@o2ib4 remote: 0x66c20da50bf8090b expref: 421 pid: 39396 timeout: 0 lvb_type: 0 [110427.272490] Lustre: MGS: haven't heard from client b8b1dc75-1715-2d9e-e1ec-b7625b32320e (at 10.9.112.17@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bc6a69400, cur 1576096666 expire 1576096516 last 1576096439 [110427.293700] Lustre: Skipped 1 previous similar message [111505.819344] Lustre: MGS: Connection restored to 704e8622-7442-8eb3-b4e3-c86a69ef45af (at 10.8.20.21@o2ib6) [111505.829095] Lustre: Skipped 1 previous similar message [111537.359939] Lustre: MGS: Connection restored to c3415e6e-dda3-8602-28df-a932f656881d (at 10.9.112.17@o2ib4) [111537.369771] Lustre: Skipped 1 previous similar message [111897.734219] Lustre: MGS: Connection restored to 4c497e0b-ea41-4 (at 10.8.9.1@o2ib6) [111897.741975] Lustre: Skipped 1 previous similar message [111912.296698] Lustre: MGS: Connection restored to (at 10.8.23.13@o2ib6) [111912.303324] Lustre: Skipped 1 previous similar message [112006.625030] Lustre: MGS: Connection restored to 37c7e464-6686-fdc0-1c81-eae75026a910 (at 10.8.22.2@o2ib6) [112006.634694] Lustre: Skipped 1 previous similar message [112104.542981] Lustre: MGS: Connection restored to b34be8aa-32d9-4 (at 10.9.113.13@o2ib4) [112104.550995] Lustre: Skipped 1 previous similar message [112185.692037] Lustre: MGS: Connection restored to (at 10.9.101.60@o2ib4) [112185.698754] Lustre: Skipped 1 previous similar message [112517.853689] Lustre: MGS: Connection restored to (at 10.8.24.7@o2ib6) [112517.860231] Lustre: Skipped 3 previous similar messages 
[114021.282924] Lustre: MGS: haven't heard from client f5dfb63b-1da5-2f76-47e2-80171bbf932c (at 10.8.22.16@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bf9705400, cur 1576100260 expire 1576100110 last 1576100033 [114035.274945] Lustre: fir-MDT0000: haven't heard from client 000d6715-906a-fe00-99d9-1ba39760e7f7 (at 10.8.22.16@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba935c400, cur 1576100274 expire 1576100124 last 1576100047 [114035.296759] Lustre: Skipped 1 previous similar message [114514.278734] Lustre: fir-MDT0000: haven't heard from client 85fbdf3d-35db-072c-03b7-e9977baaa2bf (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba935e800, cur 1576100753 expire 1576100603 last 1576100526 [114514.300550] Lustre: Skipped 1 previous similar message [114724.218705] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [114724.225329] Lustre: Skipped 1 previous similar message [116099.725736] Lustre: MGS: Connection restored to (at 10.8.23.8@o2ib6) [116099.732269] Lustre: Skipped 1 previous similar message [116107.624175] Lustre: MGS: Connection restored to (at 10.8.23.18@o2ib6) [116107.630797] Lustre: Skipped 1 previous similar message [116118.281975] Lustre: MGS: Connection restored to (at 10.8.22.18@o2ib6) [116118.288601] Lustre: Skipped 1 previous similar message [116135.466570] Lustre: MGS: Connection restored to 94396c8b-eccd-7da2-de85-f79420b2e641 (at 10.8.23.33@o2ib6) [116135.476314] Lustre: Skipped 3 previous similar messages [117205.303986] Lustre: MGS: haven't heard from client f9e2b822-92c5-4 (at 10.9.117.46@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8859cf6ce400, cur 1576103444 expire 1576103294 last 1576103217 [117205.323350] Lustre: Skipped 1 previous similar message [117359.697927] Lustre: MGS: Connection restored to (at 10.9.117.46@o2ib4) [117359.704634] Lustre: Skipped 5 previous similar messages [117599.025292] Lustre: MGS: Connection restored to (at 10.8.21.2@o2ib6) [117599.031829] Lustre: Skipped 1 previous similar message [117610.390683] Lustre: MGS: Connection restored to (at 10.8.22.32@o2ib6) [117610.397307] Lustre: Skipped 1 previous similar message [119275.831968] Lustre: MGS: Connection restored to (at 10.8.23.27@o2ib6) [119275.838593] Lustre: Skipped 1 previous similar message [119525.398440] Lustre: MGS: Connection restored to (at 10.8.22.27@o2ib6) [119525.405058] Lustre: Skipped 1 previous similar message [129326.787361] perf: interrupt took too long (2501 > 2500), lowering kernel.perf_event_max_sample_rate to 79000 [133257.254526] Lustre: MGS: Connection restored to 0aa269ad-def9-3be3-d596-fd7c0af955fb (at 10.8.20.26@o2ib6) [133257.264275] Lustre: Skipped 1 previous similar message [135435.247206] Lustre: MGS: Connection restored to 207217ac-1163-df36-3120-8bf6c3ecbb93 (at 10.8.23.21@o2ib6) [135435.256952] Lustre: Skipped 1 previous similar message [142975.533244] Lustre: MGS: Connection restored to e15078c5-8209-4 (at 10.8.25.17@o2ib6) [142975.541160] Lustre: Skipped 1 previous similar message [143019.466050] Lustre: fir-MDT0000: haven't heard from client e15078c5-8209-4 (at 10.8.25.17@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba9499800, cur 1576129258 expire 1576129108 last 1576129031 [143019.486043] Lustre: Skipped 1 previous similar message [143029.457389] Lustre: MGS: haven't heard from client 99698eca-49dd-4 (at 10.8.25.17@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff888bf43fc400, cur 1576129268 expire 1576129118 last 1576129041 [143404.454000] Lustre: fir-MDT0000: haven't heard from client 208ccf09-d6ca-4 (at 10.8.25.17@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887926a88400, cur 1576129643 expire 1576129493 last 1576129416 [144544.824976] Lustre: MGS: Connection restored to e15078c5-8209-4 (at 10.8.25.17@o2ib6) [144544.832892] Lustre: Skipped 1 previous similar message [145024.470553] Lustre: MGS: haven't heard from client d223256d-1b6e-4 (at 10.8.25.17@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88797e232000, cur 1576131263 expire 1576131113 last 1576131036 [145024.489855] Lustre: Skipped 1 previous similar message [147441.110111] Lustre: MGS: Connection restored to (at 10.8.22.20@o2ib6) [147441.116735] Lustre: Skipped 1 previous similar message [147447.227204] Lustre: MGS: Connection restored to bd358c1a-07c6-3f9f-7c84-efdb04e29ef9 (at 10.8.21.1@o2ib6) [147447.236862] Lustre: Skipped 1 previous similar message [147511.519053] Lustre: MGS: Connection restored to (at 10.8.22.26@o2ib6) [147511.525684] Lustre: Skipped 1 previous similar message [150214.532159] Lustre: MGS: haven't heard from client 62793aa5-4457-2f74-3453-81c7d0efe754 (at 10.8.22.22@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff885bc761f000, cur 1576136453 expire 1576136303 last 1576136226 [150214.553259] Lustre: Skipped 1 previous similar message [152295.270276] Lustre: MGS: Connection restored to 687b1eea-b865-b791-9de5-a67096eac725 (at 10.8.23.26@o2ib6) [152295.280034] Lustre: Skipped 1 previous similar message [152308.524139] Lustre: MGS: Connection restored to ca09bd61-a4b3-111c-b997-9c7823236764 (at 10.8.22.17@o2ib6) [152308.533878] Lustre: Skipped 1 previous similar message [152310.828657] Lustre: MGS: Connection restored to 00850750-7463-78da-94ee-623be2781c44 (at 10.8.22.22@o2ib6) [152310.838401] Lustre: Skipped 1 previous similar message [152327.635083] Lustre: MGS: Connection restored to a507eb44-8ff1-13e2-fab8-30d1823663f8 (at 10.8.22.24@o2ib6) [152327.644829] Lustre: Skipped 1 previous similar message [152557.605390] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576138195/real 1576138195] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576138796 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [152557.633614] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 7 previous similar messages [152557.643455] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [152557.659828] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [152557.669751] Lustre: Skipped 1 previous similar message [152572.536713] Lustre: MGS: Connection restored to (at 10.8.22.14@o2ib6) [153159.129041] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576138796/real 1576138796] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576139397 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 
[153159.157253] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[153159.173601] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[153159.183529] Lustre: Skipped 1 previous similar message
[153313.663975] Lustre: 40805:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576138796/real 1576138796] req@ffff887be1e81b00 x1652542919618352/t0(0) o5->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 432/432 e 0 to 1 dl 1576139552 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
[153313.692294] LustreError: 40805:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0000: cannot cleanup orphans: rc = -107
[153759.804652] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576139397/real 1576139397] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576139998 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[153759.832887] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[153759.849235] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[154070.709518] Lustre: 40805:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576139553/real 1576139553] req@ffff887bf70f2d00 x1652542920146672/t0(0) o5->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 432/432 e 0 to 1 dl 1576140309 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
[154070.737889] LustreError: 40805:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0000: cannot cleanup orphans: rc = -107
[154363.777284] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576139998/real 1576139998] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576140599 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[154363.805516] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[154363.821886] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[154531.398703] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.8.25.17@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[154531.416077] LustreError: Skipped 2725 previous similar messages
[154631.751433] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.8.25.17@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[154827.755086] Lustre: 40805:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576140310/real 1576140310] req@ffff887805158d80 x1652542920402592/t0(0) o5->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 432/432 e 0 to 1 dl 1576141066 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
[154827.783401] LustreError: 40805:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0000: cannot cleanup orphans: rc = -107
[154965.067958] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[154965.084298] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[154965.094213] Lustre: Skipped 2 previous similar messages
[155052.900500] Lustre: fir-OST005e-osc-MDT0000: Connection to fir-OST005e (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[155142.540009] Lustre: fir-MDT0000: haven't heard from client 619199f2-141e-aa07-09cb-eb294e06c3f1 (at 10.9.116.4@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba9397c00, cur 1576141381 expire 1576141231 last 1576141154
[155565.743557] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576141203/real 1576141203] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576141804 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[155565.771783] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
[155565.781621] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[155565.797976] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[155565.807892] Lustre: Skipped 1 previous similar message
[155584.800676] LustreError: 40805:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0000: cannot cleanup orphans: rc = -107
[155809.913979] Lustre: fir-OST005e-osc-MDT0000: Connection to fir-OST005e (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[155970.593918] Lustre: fir-OST0058-osc-MDT0000: Connection to fir-OST0058 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[156167.395052] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576141804/real 1576141804] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576142405 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[156167.423259] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 3 previous similar messages
[156167.433110] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[156167.449444] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[156167.459370] Lustre: Skipped 2 previous similar messages
[156186.817184] LustreError: 40805:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0000: cannot cleanup orphans: rc = -107
[156566.493381] Lustre: fir-OST005e-osc-MDT0000: Connection to fir-OST005e (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[156767.702570] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576142405/real 1576142405] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576143006 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[156767.730778] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 3 previous similar messages
[156767.740806] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[156767.750746] Lustre: Skipped 2 previous similar messages
[156943.834630] LustreError: 40805:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0000: cannot cleanup orphans: rc = -107
[157321.698062] Lustre: fir-OST005e-osc-MDT0000: Connection to fir-OST005e (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[157321.714224] Lustre: Skipped 2 previous similar messages
[157369.322355] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576143006/real 1576143006] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576143607 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[157369.350575] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
[157369.360574] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[157369.370516] Lustre: Skipped 1 previous similar message
[157700.852583] LustreError: 40805:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0000: cannot cleanup orphans: rc = -107
[157969.782273] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576143607/real 1576143607] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576144208 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[157969.810496] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
[157969.820326] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[157969.836512] Lustre: Skipped 2 previous similar messages
[157969.842009] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[157969.851946] Lustre: Skipped 1 previous similar message
[158457.870200] LustreError: 40805:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0000: cannot cleanup orphans: rc = -107
[158571.041867] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576144208/real 1576144208] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576144809 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[158571.070093] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 4 previous similar messages
[158571.079939] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[158571.096124] Lustre: Skipped 3 previous similar messages
[158571.101629] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[158571.111557] Lustre: Skipped 3 previous similar messages
[159172.005404] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576144809/real 1576144809] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576145410 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[159172.033635] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 3 previous similar messages
[159172.043490] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[159172.059675] Lustre: Skipped 3 previous similar messages
[159172.065187] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[159172.075122] Lustre: Skipped 3 previous similar messages
[159214.887655] LustreError: 40805:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0000: cannot cleanup orphans: rc = -107
[159773.105912] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576145410/real 1576145410] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576146011 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[159773.134136] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 4 previous similar messages
[159773.143967] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[159773.160159] Lustre: Skipped 3 previous similar messages
[159773.165672] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[159773.175604] Lustre: Skipped 3 previous similar messages
[159971.905089] LustreError: 40805:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0000: cannot cleanup orphans: rc = -107
[160374.412423] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576146011/real 1576146011] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576146612 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[160374.440630] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 3 previous similar messages
[160374.450461] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[160374.466652] Lustre: Skipped 2 previous similar messages
[160374.472156] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[160374.482072] Lustre: Skipped 2 previous similar messages
[160728.922437] LustreError: 40805:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0000: cannot cleanup orphans: rc = -107
[160974.824811] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576146612/real 1576146612] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576147213 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[160974.853018] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 4 previous similar messages
[160974.862846] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[160974.879019] Lustre: Skipped 3 previous similar messages
[160974.884516] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[160974.894436] Lustre: Skipped 3 previous similar messages
[161485.939764] LustreError: 40805:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0000: cannot cleanup orphans: rc = -107
[161575.707262] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576147213/real 1576147213] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576147814 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[161575.735471] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 5 previous similar messages
[161575.745303] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[161575.761478] Lustre: Skipped 4 previous similar messages
[161575.767036] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[161575.776962] Lustre: Skipped 4 previous similar messages
[162176.622694] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576147814/real 1576147814] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576148415 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[162176.650919] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 4 previous similar messages
[162176.660767] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[162176.676941] Lustre: Skipped 4 previous similar messages
[162176.682487] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[162176.692404] Lustre: Skipped 4 previous similar messages
[162242.957085] LustreError: 40805:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0000: cannot cleanup orphans: rc = -107
[162389.572161] Lustre: fir-MDT0000: haven't heard from client 33fb836e-8923-4 (at 10.9.113.13@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888b8c72bc00, cur 1576148628 expire 1576148478 last 1576148401
[162389.592234] Lustre: Skipped 1 previous similar message
[162674.576529] Lustre: MGS: haven't heard from client 112f7644-c2be-8370-fe29-78c9940c58ee (at 10.9.103.9@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bc47de000, cur 1576148913 expire 1576148763 last 1576148686
[162674.597626] Lustre: Skipped 1 previous similar message
[162778.642103] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576148415/real 1576148415] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576149016 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[162778.670308] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 5 previous similar messages
[162778.680142] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[162778.696329] Lustre: Skipped 4 previous similar messages
[162778.701808] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[162778.711753] Lustre: Skipped 6 previous similar messages
[162999.974358] LustreError: 40805:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0000: cannot cleanup orphans: rc = -107
[163379.677509] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576149017/real 1576149017] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576149618 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[163379.705714] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 4 previous similar messages
[163379.715547] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[163379.731733] Lustre: Skipped 3 previous similar messages
[163379.737225] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[163379.747176] Lustre: Skipped 3 previous similar messages
[163756.991697] LustreError: 40805:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0000: cannot cleanup orphans: rc = -107
[163981.552975] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576149618/real 1576149618] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576150219 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[163981.581174] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 6 previous similar messages
[163981.591012] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[163981.607184] Lustre: Skipped 4 previous similar messages
[163981.612679] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[163981.622615] Lustre: Skipped 4 previous similar messages
[164514.009060] LustreError: 40805:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0000: cannot cleanup orphans: rc = -107
[164583.388441] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576150220/real 1576150220] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576150821 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[164583.416644] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 7 previous similar messages
[164583.426481] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[164583.442658] Lustre: Skipped 5 previous similar messages
[164583.448140] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[164583.458077] Lustre: Skipped 5 previous similar messages
[165184.287862] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576150821/real 1576150821] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576151422 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[165184.316066] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 4 previous similar messages
[165184.325899] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[165184.342078] Lustre: Skipped 4 previous similar messages
[165184.347549] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[165184.357484] Lustre: Skipped 4 previous similar messages
[165271.026371] LustreError: 40805:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0000: cannot cleanup orphans: rc = -107
[165785.171319] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576151422/real 1576151422] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576152023 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[165785.199529] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 5 previous similar messages
[165785.209365] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[165785.225538] Lustre: Skipped 3 previous similar messages
[165785.231022] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[165785.240947] Lustre: Skipped 3 previous similar messages
[165963.630136] Lustre: fir-MDT0000: haven't heard from client a83208a9-361d-4 (at 10.9.112.4@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bed3dfc00, cur 1576152202 expire 1576152052 last 1576151975
[165963.650113] Lustre: Skipped 1 previous similar message
[165965.607660] Lustre: MGS: haven't heard from client 97bcf7cb-bf78-4 (at 10.9.112.4@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bf3c82000, cur 1576152204 expire 1576152054 last 1576151977
[165965.626959] Lustre: Skipped 1 previous similar message
[166028.043723] LustreError: 40805:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0000: cannot cleanup orphans: rc = -107
[166300.617222] Lustre: fir-MDT0000: haven't heard from client 46023962-0c0f-4f56-ba25-877d19751e9f (at 10.8.18.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba973d400, cur 1576152539 expire 1576152389 last 1576152312
[166300.639036] Lustre: Skipped 1 previous similar message
[166386.166785] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576152023/real 1576152023] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576152624 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[166386.195010] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 5 previous similar messages
[166386.204848] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[166386.221022] Lustre: Skipped 3 previous similar messages
[166386.226492] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[166386.236432] Lustre: Skipped 7 previous similar messages
[166785.061124] LustreError: 40805:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0000: cannot cleanup orphans: rc = -107
[166987.338313] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576152624/real 1576152624] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576153225 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[166987.366515] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 7 previous similar messages
[166987.376349] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[166987.392527] Lustre: Skipped 5 previous similar messages
[166987.398010] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[166987.407946] Lustre: Skipped 5 previous similar messages
[167140.628491] Lustre: MGS: haven't heard from client da9227f5-bd81-94e6-98e1-3a8a3bec89b0 (at 10.9.103.17@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888be173f000, cur 1576153379 expire 1576153229 last 1576153152
[167140.649690] Lustre: Skipped 1 previous similar message
[167542.078638] LustreError: 40805:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0000: cannot cleanup orphans: rc = -107
[167588.237897] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576153225/real 1576153225] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576153826 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[167588.266104] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 6 previous similar messages
[167588.275937] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[167588.292113] Lustre: Skipped 4 previous similar messages
[167588.297608] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[167588.307539] Lustre: Skipped 6 previous similar messages
[167906.783777] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0000: cannot cleanup orphans: rc = -107
[168020.800457] LustreError: 40817:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005c-osc-MDT0000: cannot cleanup orphans: rc = -11
[168067.840734] LustreError: 40809:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0058-osc-MDT0000: cannot cleanup orphans: rc = -11
[168189.153427] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576153826/real 1576153826] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576154427 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[168189.181626] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 5 previous similar messages
[168189.191456] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[168189.207628] Lustre: Skipped 4 previous similar messages
[168189.213186] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[168189.223101] Lustre: Skipped 4 previous similar messages
[168191.105453] LustreError: 40801:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0054-osc-MDT0000: cannot cleanup orphans: rc = -11
[168299.096080] LustreError: 40805:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0000: cannot cleanup orphans: rc = -107
[168350.698383] LustreError: 40813:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005a-osc-MDT0000: cannot cleanup orphans: rc = -107
[168663.801219] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0000: cannot cleanup orphans: rc = -107
[168777.817883] LustreError: 40817:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005c-osc-MDT0000: cannot cleanup orphans: rc = -107
[168790.228951] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576154427/real 1576154427] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576155028 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[168790.257158] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 8 previous similar messages
[168790.266993] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[168790.283197] Lustre: Skipped 4 previous similar messages
[168790.288722] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[168790.298646] Lustre: Skipped 4 previous similar messages
[168948.122846] LustreError: 40801:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0054-osc-MDT0000: cannot cleanup orphans: rc = -107
[168948.135970] LustreError: 40801:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 1 previous similar message
[169391.248410] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576155028/real 1576155028] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576155629 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[169391.276634] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 7 previous similar messages
[169391.286466] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[169391.302639] Lustre: Skipped 3 previous similar messages
[169391.308117] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[169391.318051] Lustre: Skipped 3 previous similar messages
[169420.818587] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0000: cannot cleanup orphans: rc = -107
[169420.831707] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 2 previous similar messages
[169482.616854] Lustre: fir-MDT0000: haven't heard from client 27dd63c4-0630-b8af-eb2d-2f38c1747230 (at 10.8.19.5@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba97b6400, cur 1576155721 expire 1576155571 last 1576155494
[169482.638563] Lustre: Skipped 1 previous similar message
[169992.171867] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576155629/real 1576155629] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576156230 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[169992.200088] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 12 previous similar messages
[169992.210008] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[169992.226180] Lustre: Skipped 5 previous similar messages
[169992.231650] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[169992.241604] Lustre: Skipped 5 previous similar messages
[170156.618651] Lustre: fir-MDT0000: haven't heard from client d4e78436-48cb-55f2-4bab-88419072f51d (at 10.9.103.16@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba97e8400, cur 1576156395 expire 1576156245 last 1576156168
[170156.640530] Lustre: Skipped 1 previous similar message
[170177.846947] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0000: cannot cleanup orphans: rc = -107
[170177.860070] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 5 previous similar messages
[170592.911313] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576156230/real 1576156230] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576156831 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[170592.939541] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 9 previous similar messages
[170592.949371] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[170592.965542] Lustre: Skipped 4 previous similar messages
[170592.971038] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[170592.980975] Lustre: Skipped 4 previous similar messages
[170790.633859] Lustre: MGS: haven't heard from client babd4767-8aaa-fdee-2202-d9471210976a (at 10.9.104.20@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885be4138800, cur 1576157029 expire 1576156879 last 1576156802
[170790.655039] Lustre: Skipped 1 previous similar message
[170934.875293] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0000: cannot cleanup orphans: rc = -107
[170934.888419] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 5 previous similar messages
[171085.635209] Lustre: fir-MDT0000: haven't heard from client ee45735a-3c72-071c-fe40-2e82d3a751bd (at 10.8.7.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba9067000, cur 1576157324 expire 1576157174 last 1576157097
[171085.656920] Lustre: Skipped 1 previous similar message
[171194.210805] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576156831/real 1576156831] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576157432 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[171194.239032] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 13 previous similar messages
[171194.249006] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[171194.265313] Lustre: Skipped 4 previous similar messages
[171194.270817] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[171194.280752] Lustre: Skipped 4 previous similar messages
[171603.016295] INFO: task mdt01_016:39241 blocked for more than 120 seconds.
[171603.023179] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[171603.031099] mdt01_016 D ffff887bbfeea080 0 39241 2 0x00000080
[171603.038286] Call Trace:
[171603.040848] [] ? lquota_disk_read+0xf2/0x390 [lquota]
[171603.047658] [] schedule+0x29/0x70
[171603.052717] [] rwsem_down_write_failed+0x225/0x3a0
[171603.059261] [] ? cfs_hash_lookup+0xa2/0xd0 [libcfs]
[171603.065899] [] call_rwsem_down_write_failed+0x17/0x30
[171603.072709] [] down_write+0x2d/0x3d
[171603.077973] [] lod_qos_statfs_update+0x97/0x2b0 [lod]
[171603.084772] [] lod_qos_prep_create+0x16a/0x1890 [lod]
[171603.091591] [] ? qsd_op_begin+0x262/0x4b0 [lquota]
[171603.098134] [] ? osd_declare_qid+0x200/0x4a0 [osd_ldiskfs]
[171603.105368] [] ?
osd_declare_inode_qid+0x27b/0x430 [osd_ldiskfs] [171603.113137] [] lod_prepare_create+0x215/0x2e0 [lod] [171603.119760] [] lod_declare_striped_create+0x1ee/0x980 [lod] [171603.127082] [] ? lod_sub_declare_create+0xdf/0x210 [lod] [171603.134158] [] lod_declare_create+0x204/0x590 [lod] [171603.140802] [] ? lu_context_refill+0x19/0x50 [obdclass] [171603.147785] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [171603.155721] [] mdd_declare_create+0x4c/0xcb0 [mdd] [171603.162277] [] mdd_create+0x847/0x14e0 [mdd] [171603.168307] [] mdt_reint_open+0x224f/0x3240 [mdt] [171603.174786] [] ? upcall_cache_get_entry+0x218/0x8b0 [obdclass] [171603.182369] [] mdt_reint_rec+0x83/0x210 [mdt] [171603.188479] [] mdt_reint_internal+0x6e3/0xaf0 [mdt] [171603.195123] [] ? mdt_intent_fixup_resent+0x36/0x220 [mdt] [171603.202284] [] mdt_intent_open+0x82/0x3a0 [mdt] [171603.208568] [] ? lprocfs_counter_add+0xf9/0x160 [obdclass] [171603.215819] [] mdt_intent_policy+0x435/0xd80 [mdt] [171603.222358] [] ? cfs_hash_bd_add_locked+0x24/0x80 [libcfs] [171603.229594] [] ? mdt_intent_fixup_resent+0x220/0x220 [mdt] [171603.236866] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [171603.243666] [] ? cfs_hash_bd_add_locked+0x63/0x80 [libcfs] [171603.250894] [] ? cfs_hash_add+0xbe/0x1a0 [libcfs] [171603.257381] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [171603.264572] [] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc] [171603.272189] [] tgt_enqueue+0x62/0x210 [ptlrpc] [171603.278432] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [171603.285433] [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] [171603.293103] [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] [171603.300288] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [171603.308066] [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] [171603.314948] [] ? __wake_up+0x44/0x50 [171603.320312] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [171603.326703] [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] [171603.334189] [] kthread+0xd1/0xe0 [171603.339194] [] ? 
insert_kthread_work+0x40/0x40 [171603.345381] [] ret_from_fork_nospec_begin+0xe/0x21 [171603.351917] [] ? insert_kthread_work+0x40/0x40 [171603.358125] INFO: task mdt00_016:39323 blocked for more than 120 seconds. [171603.365004] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [171603.372923] mdt00_016 D ffff887bbf735140 0 39323 2 0x00000080 [171603.380123] Call Trace: [171603.382672] [] ? lquota_disk_read+0xf2/0x390 [lquota] [171603.389461] [] schedule+0x29/0x70 [171603.394538] [] rwsem_down_write_failed+0x225/0x3a0 [171603.401089] [] ? cfs_hash_lookup+0xa2/0xd0 [libcfs] [171603.407708] [] ? __radix_tree_lookup+0x84/0xf0 [171603.413910] [] call_rwsem_down_write_failed+0x17/0x30 [171603.420704] [] down_write+0x2d/0x3d [171603.425947] [] lod_qos_statfs_update+0x97/0x2b0 [lod] [171603.432766] [] lod_qos_prep_create+0x16a/0x1890 [lod] [171603.439577] [] ? qsd_op_begin+0x262/0x4b0 [lquota] [171603.446117] [] ? osd_declare_qid+0x200/0x4a0 [osd_ldiskfs] [171603.453369] [] ? osd_declare_inode_qid+0x27b/0x430 [osd_ldiskfs] [171603.461139] [] lod_prepare_create+0x215/0x2e0 [lod] [171603.467765] [] lod_declare_striped_create+0x1ee/0x980 [lod] [171603.475084] [] ? lod_sub_declare_create+0xdf/0x210 [lod] [171603.482144] [] lod_declare_create+0x204/0x590 [lod] [171603.488812] [] ? lu_context_refill+0x19/0x50 [obdclass] [171603.495781] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [171603.503719] [] mdd_declare_create+0x4c/0xcb0 [mdd] [171603.510269] [] mdd_create+0x847/0x14e0 [mdd] [171603.516308] [] mdt_reint_open+0x224f/0x3240 [mdt] [171603.522786] [] ? upcall_cache_get_entry+0x218/0x8b0 [obdclass] [171603.530367] [] mdt_reint_rec+0x83/0x210 [mdt] [171603.536494] [] mdt_reint_internal+0x6e3/0xaf0 [mdt] [171603.543125] [] ? mdt_intent_fixup_resent+0x36/0x220 [mdt] [171603.550277] [] mdt_intent_open+0x82/0x3a0 [mdt] [171603.556580] [] ? 
lprocfs_counter_add+0xf9/0x160 [obdclass] [171603.563811] [] mdt_intent_policy+0x435/0xd80 [mdt] [171603.570370] [] ? ldlm_lock_create+0xa4/0x9f0 [ptlrpc] [171603.577196] [] ? mdt_intent_fixup_resent+0x220/0x220 [mdt] [171603.584445] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [171603.591247] [] ? cfs_hash_bd_add_locked+0x63/0x80 [libcfs] [171603.598511] [] ? cfs_hash_add+0xbe/0x1a0 [libcfs] [171603.605006] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [171603.612176] [] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc] [171603.619800] [] tgt_enqueue+0x62/0x210 [ptlrpc] [171603.626019] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [171603.633021] [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] [171603.640706] [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] [171603.647887] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [171603.655666] [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] [171603.662560] [] ? __wake_up+0x44/0x50 [171603.667911] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [171603.674295] [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] [171603.681782] [] kthread+0xd1/0xe0 [171603.686773] [] ? insert_kthread_work+0x40/0x40 [171603.692979] [] ret_from_fork_nospec_begin+0xe/0x21 [171603.699514] [] ? insert_kthread_work+0x40/0x40 [171603.705724] INFO: task mdt00_038:39399 blocked for more than 120 seconds. [171603.712599] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [171603.720511] mdt00_038 D ffff885bf330a080 0 39399 2 0x00000080 [171603.727712] Call Trace: [171603.730259] [] ? lquota_disk_read+0xf2/0x390 [lquota] [171603.737051] [] schedule+0x29/0x70 [171603.742108] [] rwsem_down_write_failed+0x225/0x3a0 [171603.748642] [] ? cfs_hash_lookup+0xa2/0xd0 [libcfs] [171603.755274] [] call_rwsem_down_write_failed+0x17/0x30 [171603.762072] [] down_write+0x2d/0x3d [171603.767313] [] lod_qos_statfs_update+0x97/0x2b0 [lod] [171603.774132] [] lod_qos_prep_create+0x16a/0x1890 [lod] [171603.780941] [] ? 
qsd_op_begin+0x262/0x4b0 [lquota] [171603.787481] [] ? osd_declare_qid+0x200/0x4a0 [osd_ldiskfs] [171603.794736] [] ? osd_declare_inode_qid+0x27b/0x430 [osd_ldiskfs] [171603.802506] [] lod_prepare_create+0x215/0x2e0 [lod] [171603.809129] [] lod_declare_striped_create+0x1ee/0x980 [lod] [171603.816469] [] ? lod_sub_declare_create+0xdf/0x210 [lod] [171603.823525] [] lod_declare_create+0x204/0x590 [lod] [171603.830167] [] ? lu_context_refill+0x19/0x50 [obdclass] [171603.837157] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [171603.845093] [] mdd_declare_create+0x4c/0xcb0 [mdd] [171603.851633] [] mdd_create+0x847/0x14e0 [mdd] [171603.857673] [] mdt_reint_open+0x224f/0x3240 [mdt] [171603.864139] [] ? upcall_cache_get_entry+0x218/0x8b0 [obdclass] [171603.871725] [] mdt_reint_rec+0x83/0x210 [mdt] [171603.877833] [] mdt_reint_internal+0x6e3/0xaf0 [mdt] [171603.884464] [] ? mdt_intent_fixup_resent+0x36/0x220 [mdt] [171603.891630] [] mdt_intent_open+0x82/0x3a0 [mdt] [171603.897920] [] ? lprocfs_counter_add+0xf9/0x160 [obdclass] [171603.905160] [] mdt_intent_policy+0x435/0xd80 [mdt] [171603.911719] [] ? mdt_intent_fixup_resent+0x220/0x220 [mdt] [171603.918969] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [171603.925766] [] ? cfs_hash_bd_add_locked+0x63/0x80 [libcfs] [171603.933020] [] ? cfs_hash_add+0xbe/0x1a0 [libcfs] [171603.939513] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [171603.946682] [] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc] [171603.954302] [] tgt_enqueue+0x62/0x210 [ptlrpc] [171603.960519] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [171603.967523] [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] [171603.975217] [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] [171603.982382] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [171603.990159] [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] [171603.997049] [] ? __wake_up+0x44/0x50 [171604.002391] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [171604.008781] [] ? 
ptlrpc_register_service+0xf80/0xf80 [ptlrpc] [171604.016266] [] kthread+0xd1/0xe0 [171604.021243] [] ? insert_kthread_work+0x40/0x40 [171604.027445] [] ret_from_fork_nospec_begin+0xe/0x21 [171604.033978] [] ? insert_kthread_work+0x40/0x40 [171683.197657] LNet: Service thread pid 39241 was inactive for 200.37s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [171683.214678] Pid: 39241, comm: mdt01_016 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [171683.224938] Call Trace: [171683.227499] [] call_rwsem_down_write_failed+0x17/0x30 [171683.234322] [] lod_qos_statfs_update+0x97/0x2b0 [lod] [171683.241166] [] lod_qos_prep_create+0x16a/0x1890 [lod] [171683.247998] [] lod_prepare_create+0x215/0x2e0 [lod] [171683.254647] [] lod_declare_striped_create+0x1ee/0x980 [lod] [171683.261994] [] lod_declare_create+0x204/0x590 [lod] [171683.268673] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [171683.276636] [] mdd_declare_create+0x4c/0xcb0 [mdd] [171683.283230] [] mdd_create+0x847/0x14e0 [mdd] [171683.289277] [] mdt_reint_open+0x224f/0x3240 [mdt] [171683.295785] [] mdt_reint_rec+0x83/0x210 [mdt] [171683.301924] [] mdt_reint_internal+0x6e3/0xaf0 [mdt] [171683.308602] [] mdt_intent_open+0x82/0x3a0 [mdt] [171683.314929] [] mdt_intent_policy+0x435/0xd80 [mdt] [171683.321515] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [171683.328365] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [171683.335572] [] tgt_enqueue+0x62/0x210 [ptlrpc] [171683.341826] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [171683.348889] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [171683.356687] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [171683.363114] [] kthread+0xd1/0xe0 [171683.368119] [] ret_from_fork_nospec_begin+0xe/0x21 [171683.374695] [] 0xffffffffffffffff [171683.379819] LustreError: dumping log to /tmp/lustre-log.1576157921.39241 [171691.903713] LustreError: 
40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0000: cannot cleanup orphans: rc = -107 [171691.916838] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 5 previous similar messages [171705.725786] LNet: Service thread pid 39399 was inactive for 224.36s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [171705.742804] Pid: 39399, comm: mdt00_038 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [171705.753067] Call Trace: [171705.755626] [] osp_precreate_reserve+0x2e8/0x800 [osp] [171705.762546] [] osp_declare_create+0x199/0x5b0 [osp] [171705.769224] [] lod_sub_declare_create+0xdf/0x210 [lod] [171705.776145] [] lod_qos_declare_object_on+0xbe/0x3a0 [lod] [171705.783324] [] lod_alloc_qos.constprop.18+0x10f4/0x1840 [lod] [171705.790851] [] lod_qos_prep_create+0x12d7/0x1890 [lod] [171705.797776] [] lod_prepare_create+0x215/0x2e0 [lod] [171705.804422] [] lod_declare_striped_create+0x1ee/0x980 [lod] [171705.811781] [] lod_declare_create+0x204/0x590 [lod] [171705.818428] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [171705.826399] [] mdd_declare_create+0x4c/0xcb0 [mdd] [171705.832964] [] mdd_create+0x847/0x14e0 [mdd] [171705.839040] [] mdt_reint_open+0x224f/0x3240 [mdt] [171705.845555] [] mdt_reint_rec+0x83/0x210 [mdt] [171705.851705] [] mdt_reint_internal+0x6e3/0xaf0 [mdt] [171705.858364] [] mdt_intent_open+0x82/0x3a0 [mdt] [171705.864689] [] mdt_intent_policy+0x435/0xd80 [mdt] [171705.871262] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [171705.878131] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [171705.885326] [] tgt_enqueue+0x62/0x210 [ptlrpc] [171705.891597] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [171705.898631] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [171705.906459] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [171705.912878] [] kthread+0xd1/0xe0 [171705.917893] [] 
ret_from_fork_nospec_begin+0xe/0x21 [171705.924456] [] 0xffffffffffffffff [171705.929580] LustreError: dumping log to /tmp/lustre-log.1576157944.39399 [171708.264232] LNet: Service thread pid 39399 completed after 226.90s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [171708.280477] LNet: Skipped 1 previous similar message [171721.630128] Lustre: fir-MDT0000: haven't heard from client 2d6a9cf7-46ee-4 (at 10.8.7.5@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886a73a17800, cur 1576157960 expire 1576157810 last 1576157733 [171721.649961] Lustre: Skipped 1 previous similar message [171795.158306] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576157432/real 1576157432] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576158033 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [171795.186580] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 7 previous similar messages [171795.196440] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [171795.212645] Lustre: Skipped 3 previous similar messages [171795.218173] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [171795.228133] Lustre: Skipped 5 previous similar messages [171797.628418] Lustre: fir-MDT0000: haven't heard from client 19c70918-a172-38a5-2512-02b987cb686f (at 10.9.116.8@o2ib4) in 152 seconds. I think it's dead, and I am evicting it. exp ffff888bed3da400, cur 1576158036 expire 1576157886 last 1576157884 [171797.650229] Lustre: Skipped 1 previous similar message [171872.638787] Lustre: MGS: haven't heard from client 02299f50-fd55-88da-2f6a-7032b0997b32 (at 10.9.116.8@o2ib4) in 227 seconds. 
I think it's dead, and I am evicting it. exp ffff888bf5254000, cur 1576158111 expire 1576157961 last 1576157884 [172397.209793] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576158033/real 1576158033] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576158634 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [172397.237994] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 9 previous similar messages [172397.247833] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [172397.264002] Lustre: Skipped 4 previous similar messages [172397.269491] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [172397.279429] Lustre: Skipped 6 previous similar messages [172448.932101] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0000: cannot cleanup orphans: rc = -107 [172448.945253] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 5 previous similar messages [172488.637228] Lustre: fir-MDT0000: haven't heard from client 75c6d6d0-df4c-7543-716f-77a06d0b577a (at 10.9.103.68@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bed3d9800, cur 1576158727 expire 1576158577 last 1576158500 [172500.641755] Lustre: MGS: haven't heard from client f12efaec-23ad-f3f4-5f09-9dc112b40215 (at 10.9.103.68@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bd0a2c800, cur 1576158739 expire 1576158589 last 1576158512 [172684.046561] INFO: task mdt01_016:39241 blocked for more than 120 seconds. [172684.053445] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[172684.061376] mdt01_016 D ffff887bbfeea080 0 39241 2 0x00000080 [172684.068581] Call Trace: [172684.071157] [] ? update_curr+0x14c/0x1e0 [172684.076868] [] schedule+0x29/0x70 [172684.081937] [] rwsem_down_write_failed+0x225/0x3a0 [172684.088487] [] call_rwsem_down_write_failed+0x17/0x30 [172684.095304] [] down_write+0x2d/0x3d [172684.100547] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [172684.107978] [] ? try_to_del_timer_sync+0x5e/0x90 [172684.114362] [] ? del_timer_sync+0x52/0x60 [172684.120136] [] ? schedule_timeout+0x170/0x2d0 [172684.126261] [] ? lod_qos_statfs_update+0x3c/0x2b0 [lod] [172684.133254] [] ? lod_prepare_avoidance+0x375/0x780 [lod] [172684.140342] [] lod_qos_prep_create+0x12d7/0x1890 [lod] [172684.147282] [] ? ldlm_inodebits_alloc_lock+0x66/0x180 [ptlrpc] [172684.154879] [] ? wake_up_state+0x20/0x20 [172684.160575] [] lod_declare_instantiate_components+0x9a/0x1d0 [lod] [172684.168544] [] lod_declare_layout_change+0xb65/0x10f0 [lod] [172684.175887] [] mdd_declare_layout_change+0x62/0x120 [mdd] [172684.183057] [] mdd_layout_change+0x882/0x1000 [mdd] [172684.189745] [] ? mdt_object_lock_internal+0x70/0x360 [mdt] [172684.196999] [] mdt_layout_change+0x337/0x430 [mdt] [172684.203594] [] mdt_intent_layout+0x7ee/0xcc0 [mdt] [172684.210193] [] ? lustre_msg_buf+0x17/0x60 [ptlrpc] [172684.216747] [] mdt_intent_policy+0x435/0xd80 [mdt] [172684.223337] [] ? mdt_intent_open+0x3a0/0x3a0 [mdt] [172684.229916] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [172684.236736] [] ? cfs_hash_bd_add_locked+0x63/0x80 [libcfs] [172684.244008] [] ? cfs_hash_add+0xbe/0x1a0 [libcfs] [172684.250484] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [172684.257673] [] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc] [172684.265304] [] tgt_enqueue+0x62/0x210 [ptlrpc] [172684.271533] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [172684.278544] [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] [172684.286253] [] ? 
ktime_get_real_seconds+0xe/0x10 [libcfs] [172684.293423] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [172684.301196] [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] [172684.308109] [] ? __wake_up+0x44/0x50 [172684.313476] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [172684.319866] [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] [172684.327366] [] kthread+0xd1/0xe0 [172684.332338] [] ? insert_kthread_work+0x40/0x40 [172684.338524] [] ret_from_fork_nospec_begin+0xe/0x21 [172684.345069] [] ? insert_kthread_work+0x40/0x40 [172684.351261] INFO: task mdt03_023:39352 blocked for more than 120 seconds. [172684.358139] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [172684.366068] mdt03_023 D ffff887bbf659040 0 39352 2 0x00000080 [172684.373253] Call Trace: [172684.375795] [] schedule+0x29/0x70 [172684.380856] [] rwsem_down_write_failed+0x225/0x3a0 [172684.387396] [] call_rwsem_down_write_failed+0x17/0x30 [172684.394186] [] down_write+0x2d/0x3d [172684.399448] [] lod_qos_statfs_update+0x97/0x2b0 [lod] [172684.406249] [] lod_qos_prep_create+0x16a/0x1890 [lod] [172684.413072] [] ? ldlm_inodebits_alloc_lock+0x66/0x180 [ptlrpc] [172684.420661] [] ? wake_up_state+0x20/0x20 [172684.426348] [] ? ldlm_cli_enqueue_local+0x272/0x830 [ptlrpc] [172684.433757] [] lod_declare_instantiate_components+0x9a/0x1d0 [lod] [172684.441704] [] lod_declare_layout_change+0xb65/0x10f0 [lod] [172684.449018] [] mdd_declare_layout_change+0x62/0x120 [mdd] [172684.456160] [] mdd_layout_change+0x882/0x1000 [mdd] [172684.462798] [] ? mdt_object_lock_internal+0x70/0x360 [mdt] [172684.470026] [] mdt_layout_change+0x337/0x430 [mdt] [172684.476562] [] mdt_intent_layout+0x7ee/0xcc0 [mdt] [172684.483141] [] ? lustre_msg_buf+0x17/0x60 [ptlrpc] [172684.489683] [] mdt_intent_policy+0x435/0xd80 [mdt] [172684.496230] [] ? mdt_intent_open+0x3a0/0x3a0 [mdt] [172684.502802] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [172684.509613] [] ? 
cfs_hash_bd_add_locked+0x63/0x80 [libcfs] [172684.516842] [] ? cfs_hash_add+0xbe/0x1a0 [libcfs] [172684.523326] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [172684.530505] [] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc] [172684.538113] [] tgt_enqueue+0x62/0x210 [ptlrpc] [172684.544366] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [172684.551365] [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] [172684.559047] [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] [172684.566231] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [172684.574007] [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] [172684.580913] [] ? __wake_up+0x44/0x50 [172684.586260] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [172684.592648] [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] [172684.600149] [] kthread+0xd1/0xe0 [172684.605117] [] ? insert_kthread_work+0x40/0x40 [172684.611307] [] ret_from_fork_nospec_begin+0xe/0x21 [172684.617857] [] ? insert_kthread_work+0x40/0x40 [172684.624051] INFO: task mdt01_031:39382 blocked for more than 120 seconds. [172684.630924] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [172684.638858] mdt01_031 D ffff887bbf325140 0 39382 2 0x00000080 [172684.646066] Call Trace: [172684.648606] [] ? update_curr+0x14c/0x1e0 [172684.654277] [] schedule+0x29/0x70 [172684.659352] [] rwsem_down_write_failed+0x225/0x3a0 [172684.665880] [] call_rwsem_down_write_failed+0x17/0x30 [172684.672674] [] down_write+0x2d/0x3d [172684.677932] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [172684.685328] [] ? try_to_del_timer_sync+0x5e/0x90 [172684.691689] [] ? del_timer_sync+0x52/0x60 [172684.697476] [] ? schedule_timeout+0x170/0x2d0 [172684.703594] [] ? lod_qos_statfs_update+0x3c/0x2b0 [lod] [172684.710564] [] ? lod_prepare_avoidance+0x375/0x780 [lod] [172684.717640] [] lod_qos_prep_create+0x12d7/0x1890 [lod] [172684.724547] [] ? ldlm_inodebits_alloc_lock+0x66/0x180 [ptlrpc] [172684.732120] [] ? 
wake_up_state+0x20/0x20 [172684.737809] [] lod_declare_instantiate_components+0x9a/0x1d0 [lod] [172684.745732] [] lod_declare_layout_change+0xb65/0x10f0 [lod] [172684.753048] [] mdd_declare_layout_change+0x62/0x120 [mdd] [172684.760203] [] mdd_layout_change+0x882/0x1000 [mdd] [172684.766830] [] ? mdt_object_lock_internal+0x70/0x360 [mdt] [172684.774070] [] mdt_layout_change+0x337/0x430 [mdt] [172684.780608] [] mdt_intent_layout+0x7ee/0xcc0 [mdt] [172684.787165] [] ? lustre_msg_buf+0x17/0x60 [ptlrpc] [172684.793724] [] mdt_intent_policy+0x435/0xd80 [mdt] [172684.800265] [] ? mdt_intent_open+0x3a0/0x3a0 [mdt] [172684.806826] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [172684.813640] [] ? cfs_hash_bd_add_locked+0x63/0x80 [libcfs] [172684.820871] [] ? cfs_hash_add+0xbe/0x1a0 [libcfs] [172684.827342] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [172684.834529] [] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc] [172684.842150] [] tgt_enqueue+0x62/0x210 [ptlrpc] [172684.848365] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [172684.855382] [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] [172684.863046] [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] [172684.870227] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [172684.878016] [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] [172684.884896] [] ? __wake_up+0x44/0x50 [172684.890261] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [172684.896650] [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] [172684.904135] [] kthread+0xd1/0xe0 [172684.909140] [] ? insert_kthread_work+0x40/0x40 [172684.915325] [] ret_from_fork_nospec_begin+0xe/0x21 [172684.921851] [] ? insert_kthread_work+0x40/0x40 [172684.928054] INFO: task mdt01_032:39384 blocked for more than 120 seconds. [172684.934928] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [172684.942842] mdt01_032 D ffff887bbf368000 0 39384 2 0x00000080 [172684.950050] Call Trace: [172684.952592] [] ? 
update_curr+0x14c/0x1e0 [172684.958266] [] schedule+0x29/0x70 [172684.963337] [] rwsem_down_write_failed+0x225/0x3a0 [172684.969866] [] call_rwsem_down_write_failed+0x17/0x30 [172684.976660] [] down_write+0x2d/0x3d [172684.981917] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [172684.989332] [] ? try_to_del_timer_sync+0x5e/0x90 [172684.995708] [] ? del_timer_sync+0x52/0x60 [172685.001471] [] ? schedule_timeout+0x170/0x2d0 [172685.007578] [] ? lod_qos_statfs_update+0x3c/0x2b0 [lod] [172685.014578] [] ? lod_prepare_avoidance+0x375/0x780 [lod] [172685.021638] [] lod_qos_prep_create+0x12d7/0x1890 [lod] [172685.028539] [] ? ldlm_inodebits_alloc_lock+0x66/0x180 [ptlrpc] [172685.036145] [] ? wake_up_state+0x20/0x20 [172685.041815] [] lod_declare_instantiate_components+0x9a/0x1d0 [lod] [172685.049753] [] lod_declare_layout_change+0xb65/0x10f0 [lod] [172685.057082] [] mdd_declare_layout_change+0x62/0x120 [mdd] [172685.064226] [] mdd_layout_change+0x882/0x1000 [mdd] [172685.070866] [] ? mdt_object_lock_internal+0x70/0x360 [mdt] [172685.078093] [] mdt_layout_change+0x337/0x430 [mdt] [172685.084631] [] mdt_intent_layout+0x7ee/0xcc0 [mdt] [172685.091210] [] ? lustre_msg_buf+0x17/0x60 [ptlrpc] [172685.097753] [] mdt_intent_policy+0x435/0xd80 [mdt] [172685.104294] [] ? mdt_intent_open+0x3a0/0x3a0 [mdt] [172685.110851] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [172685.117649] [] ? cfs_hash_bd_add_locked+0x63/0x80 [libcfs] [172685.124890] [] ? cfs_hash_add+0xbe/0x1a0 [libcfs] [172685.131362] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [172685.138535] [] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc] [172685.146151] [] tgt_enqueue+0x62/0x210 [ptlrpc] [172685.152367] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [172685.159360] [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] [172685.167035] [] ? ktime_get_real_seconds+0xe/0x10 [libcfs] [172685.174219] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [172685.181997] [] ? 
ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[172685.188889] [] ? __wake_up+0x44/0x50
[172685.194235] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[172685.200634] [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[172685.208129] [] kthread+0xd1/0xe0
[172685.213096] [] ? insert_kthread_work+0x40/0x40
[172685.219294] [] ret_from_fork_nospec_begin+0xe/0x21
[172685.225819] [] ? insert_kthread_work+0x40/0x40
[172685.232005] INFO: task mdt00_038:39399 blocked for more than 120 seconds.
[172685.238911] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[172685.246827] mdt00_038 D ffff885bf330a080 0 39399 2 0x00000080
[172685.254026] Call Trace:
[172685.256570] [] ? update_curr+0x14c/0x1e0
[172685.262238] [] schedule+0x29/0x70
[172685.267314] [] rwsem_down_write_failed+0x225/0x3a0
[172685.273843] [] call_rwsem_down_write_failed+0x17/0x30
[172685.280639] [] down_write+0x2d/0x3d
[172685.285894] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[172685.293291] [] ? try_to_del_timer_sync+0x5e/0x90
[172685.299668] [] ? del_timer_sync+0x52/0x60
[172685.305414] [] ? schedule_timeout+0x170/0x2d0
[172685.311515] [] ? lod_qos_statfs_update+0x3c/0x2b0 [lod]
[172685.318498] [] ? lod_prepare_avoidance+0x375/0x780 [lod]
[172685.325556] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[172685.332465] [] ? ldlm_inodebits_alloc_lock+0x66/0x180 [ptlrpc]
[172685.340068] [] ? wake_up_state+0x20/0x20
[172685.345740] [] lod_declare_instantiate_components+0x9a/0x1d0 [lod]
[172685.353687] [] lod_declare_layout_change+0xb65/0x10f0 [lod]
[172685.361015] [] mdd_declare_layout_change+0x62/0x120 [mdd]
[172685.368161] [] mdd_layout_change+0x882/0x1000 [mdd]
[172685.374809] [] ? mdt_object_lock_internal+0x70/0x360 [mdt]
[172685.382036] [] mdt_layout_change+0x337/0x430 [mdt]
[172685.388575] [] mdt_intent_layout+0x7ee/0xcc0 [mdt]
[172685.395154] [] ? lustre_msg_buf+0x17/0x60 [ptlrpc]
[172685.401693] [] mdt_intent_policy+0x435/0xd80 [mdt]
[172685.408229] [] ? mdt_intent_open+0x3a0/0x3a0 [mdt]
[172685.414803] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[172685.421600] [] ? cfs_hash_bd_add_locked+0x63/0x80 [libcfs]
[172685.428844] [] ? cfs_hash_add+0xbe/0x1a0 [libcfs]
[172685.435343] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[172685.442512] [] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc]
[172685.450145] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[172685.456361] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[172685.463357] [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[172685.471021] [] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[172685.478189] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[172685.485980] [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[172685.492859] [] ? __wake_up+0x44/0x50
[172685.498201] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[172685.504622] [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[172685.512108] [] kthread+0xd1/0xe0
[172685.517076] [] ? insert_kthread_work+0x40/0x40
[172685.523279] [] ret_from_fork_nospec_begin+0xe/0x21
[172685.529808] [] ? insert_kthread_work+0x40/0x40
[172685.536001] INFO: task mdt02_049:39433 blocked for more than 120 seconds.
[172685.542889] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[172685.550805] mdt02_049 D ffff887bbee330c0 0 39433 2 0x00000080
[172685.558003] Call Trace:
[172685.560545] [] ? update_curr+0x14c/0x1e0
[172685.566214] [] schedule+0x29/0x70
[172685.571275] [] rwsem_down_write_failed+0x225/0x3a0
[172685.577804] [] call_rwsem_down_write_failed+0x17/0x30
[172685.584596] [] down_write+0x2d/0x3d
[172685.589845] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[172685.597241] [] ? try_to_del_timer_sync+0x5e/0x90
[172685.603603] [] ? del_timer_sync+0x52/0x60
[172685.609372] [] ? schedule_timeout+0x170/0x2d0
[172685.615476] [] ? lod_qos_statfs_update+0x3c/0x2b0 [lod]
[172685.622454] [] ? lod_prepare_avoidance+0x375/0x780 [lod]
[172685.629543] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[172685.636449] [] ? ldlm_inodebits_alloc_lock+0x66/0x180 [ptlrpc]
[172685.644024] [] ? wake_up_state+0x20/0x20
[172685.649704] [] lod_declare_instantiate_components+0x9a/0x1d0 [lod]
[172685.657630] [] lod_declare_layout_change+0xb65/0x10f0 [lod]
[172685.664966] [] mdd_declare_layout_change+0x62/0x120 [mdd]
[172685.672112] [] mdd_layout_change+0x882/0x1000 [mdd]
[172685.678744] [] ? mdt_object_lock_internal+0x70/0x360 [mdt]
[172685.685996] [] mdt_layout_change+0x337/0x430 [mdt]
[172685.692533] [] mdt_intent_layout+0x7ee/0xcc0 [mdt]
[172685.699097] [] ? lustre_msg_buf+0x17/0x60 [ptlrpc]
[172685.705638] [] mdt_intent_policy+0x435/0xd80 [mdt]
[172685.712182] [] ? mdt_intent_open+0x3a0/0x3a0 [mdt]
[172685.718754] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[172685.725553] [] ? cfs_hash_bd_add_locked+0x63/0x80 [libcfs]
[172685.732803] [] ? cfs_hash_add+0xbe/0x1a0 [libcfs]
[172685.739288] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[172685.746460] [] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc]
[172685.754080] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[172685.760297] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[172685.767297] [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[172685.774959] [] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[172685.782125] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[172685.789916] [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[172685.796791] [] ? __wake_up+0x44/0x50
[172685.802135] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[172685.808540] [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[172685.816043] [] kthread+0xd1/0xe0
[172685.821017] [] ? insert_kthread_work+0x40/0x40
[172685.827223] [] ret_from_fork_nospec_begin+0xe/0x21
[172685.833771] [] ? insert_kthread_work+0x40/0x40
[172757.891881] LNet: Service thread pid 39399 was inactive for 200.74s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[172757.908902] Pid: 39399, comm: mdt00_038 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[172757.919178] Call Trace:
[172757.921737] [] osp_precreate_reserve+0x2e8/0x800 [osp]
[172757.928648] [] osp_declare_create+0x199/0x5b0 [osp]
[172757.935310] [] lod_sub_declare_create+0xdf/0x210 [lod]
[172757.942229] [] lod_qos_declare_object_on+0xbe/0x3a0 [lod]
[172757.949409] [] lod_alloc_qos.constprop.18+0x10f4/0x1840 [lod]
[172757.956927] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[172757.963850] [] lod_declare_instantiate_components+0x9a/0x1d0 [lod]
[172757.971816] [] lod_declare_layout_change+0xb65/0x10f0 [lod]
[172757.979173] [] mdd_declare_layout_change+0x62/0x120 [mdd]
[172757.986351] [] mdd_layout_change+0x882/0x1000 [mdd]
[172757.993013] [] mdt_layout_change+0x337/0x430 [mdt]
[172757.999586] [] mdt_intent_layout+0x7ee/0xcc0 [mdt]
[172758.006160] [] mdt_intent_policy+0x435/0xd80 [mdt]
[172758.012724] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[172758.019585] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[172758.026779] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[172758.033029] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[172758.040065] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[172758.047882] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[172758.054311] [] kthread+0xd1/0xe0
[172758.059315] [] ret_from_fork_nospec_begin+0xe/0x21
[172758.065889] [] 0xffffffffffffffff
[172758.071002] LustreError: dumping log to /tmp/lustre-log.1576158996.39399
[172758.078469] Pid: 39384, comm: mdt01_032 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[172758.088752] Call Trace:
[172758.091299] [] call_rwsem_down_write_failed+0x17/0x30
[172758.098125] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[172758.105580] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[172758.112499] [] lod_declare_instantiate_components+0x9a/0x1d0 [lod]
[172758.120462] [] lod_declare_layout_change+0xb65/0x10f0 [lod]
[172758.127807] [] mdd_declare_layout_change+0x62/0x120 [mdd]
[172758.134989] [] mdd_layout_change+0x882/0x1000 [mdd]
[172758.141660] [] mdt_layout_change+0x337/0x430 [mdt]
[172758.148232] [] mdt_intent_layout+0x7ee/0xcc0 [mdt]
[172758.154806] [] mdt_intent_policy+0x435/0xd80 [mdt]
[172758.161383] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[172758.168236] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[172758.175438] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[172758.181695] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[172758.188717] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[172758.196535] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[172758.202948] [] kthread+0xd1/0xe0
[172758.207954] [] ret_from_fork_nospec_begin+0xe/0x21
[172758.214508] [] 0xffffffffffffffff
[172758.219612] Pid: 39241, comm: mdt01_016 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[172758.229871] Call Trace:
[172758.232422] [] call_rwsem_down_write_failed+0x17/0x30
[172758.239236] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[172758.246688] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[172758.253597] [] lod_declare_instantiate_components+0x9a/0x1d0 [lod]
[172758.261558] [] lod_declare_layout_change+0xb65/0x10f0 [lod]
[172758.268901] [] mdd_declare_layout_change+0x62/0x120 [mdd]
[172758.276084] [] mdd_layout_change+0x882/0x1000 [mdd]
[172758.282734] [] mdt_layout_change+0x337/0x430 [mdt]
[172758.289312] [] mdt_intent_layout+0x7ee/0xcc0 [mdt]
[172758.295889] [] mdt_intent_policy+0x435/0xd80 [mdt]
[172758.302467] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[172758.309334] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[172758.316518] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[172758.322772] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[172758.329795] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[172758.337617] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[172758.344035] [] kthread+0xd1/0xe0
[172758.349042] [] ret_from_fork_nospec_begin+0xe/0x21
[172758.355596] [] 0xffffffffffffffff
[172758.360700] Pid: 39382, comm: mdt01_031 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[172758.370955] Call Trace:
[172758.373500] [] call_rwsem_down_write_failed+0x17/0x30
[172758.380315] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[172758.387758] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[172758.394667] [] lod_declare_instantiate_components+0x9a/0x1d0 [lod]
[172758.402629] [] lod_declare_layout_change+0xb65/0x10f0 [lod]
[172758.409973] [] mdd_declare_layout_change+0x62/0x120 [mdd]
[172758.417152] [] mdd_layout_change+0x882/0x1000 [mdd]
[172758.423805] [] mdt_layout_change+0x337/0x430 [mdt]
[172758.430380] [] mdt_intent_layout+0x7ee/0xcc0 [mdt]
[172758.436943] [] mdt_intent_policy+0x435/0xd80 [mdt]
[172758.443519] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[172758.450360] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[172758.457568] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[172758.463810] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[172758.470845] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[172758.478648] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[172758.485076] [] kthread+0xd1/0xe0
[172758.490070] [] ret_from_fork_nospec_begin+0xe/0x21
[172758.496636] [] 0xffffffffffffffff
[172764.035919] LNet: Service thread pid 39352 was inactive for 200.04s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[172764.052940] LNet: Skipped 3 previous similar messages
[172764.058088] Pid: 39352, comm: mdt03_023 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[172764.068372] Call Trace:
[172764.070925] [] call_rwsem_down_write_failed+0x17/0x30
[172764.077750] [] lod_qos_statfs_update+0x97/0x2b0 [lod]
[172764.084604] [] lod_qos_prep_create+0x16a/0x1890 [lod]
[172764.091433] [] lod_declare_instantiate_components+0x9a/0x1d0 [lod]
[172764.099398] [] lod_declare_layout_change+0xb65/0x10f0 [lod]
[172764.106740] [] mdd_declare_layout_change+0x62/0x120 [mdd]
[172764.113934] [] mdd_layout_change+0x882/0x1000 [mdd]
[172764.120598] [] mdt_layout_change+0x337/0x430 [mdt]
[172764.127207] [] mdt_intent_layout+0x7ee/0xcc0 [mdt]
[172764.133771] [] mdt_intent_policy+0x435/0xd80 [mdt]
[172764.140357] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[172764.147212] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[172764.154422] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[172764.160684] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[172764.167728] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[172764.175529] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[172764.181959] [] kthread+0xd1/0xe0
[172764.186975] [] ret_from_fork_nospec_begin+0xe/0x21
[172764.193552] [] 0xffffffffffffffff
[172764.198661] LustreError: dumping log to /tmp/lustre-log.1576159002.39352
[172772.227965] LNet: Service thread pid 39239 was inactive for 200.70s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[172772.240909] LustreError: dumping log to /tmp/lustre-log.1576159010.39239
[172805.840267] INFO: task mdt02_010:39239 blocked for more than 120 seconds.
[172805.847150] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[172805.855068] mdt02_010 D ffff887bbfee8000 0 39239 2 0x00000080
[172805.862279] Call Trace:
[172805.864834] [] ? lquota_disk_read+0xf2/0x390 [lquota]
[172805.871646] [] schedule+0x29/0x70
[172805.876704] [] rwsem_down_write_failed+0x225/0x3a0
[172805.883249] [] ? cfs_hash_lookup+0xa2/0xd0 [libcfs]
[172805.889888] [] ? __radix_tree_lookup+0x84/0xf0
[172805.896089] [] call_rwsem_down_write_failed+0x17/0x30
[172805.902894] [] down_write+0x2d/0x3d
[172805.908157] [] lod_qos_statfs_update+0x97/0x2b0 [lod]
[172805.914973] [] lod_qos_prep_create+0x16a/0x1890 [lod]
[172805.921782] [] ? qsd_op_begin+0x262/0x4b0 [lquota]
[172805.928329] [] ? osd_declare_qid+0x200/0x4a0 [osd_ldiskfs]
[172805.935582] [] ? osd_declare_inode_qid+0x27b/0x430 [osd_ldiskfs]
[172805.943332] [] lod_prepare_create+0x215/0x2e0 [lod]
[172805.949963] [] lod_declare_striped_create+0x1ee/0x980 [lod]
[172805.957298] [] ? lod_sub_declare_create+0xdf/0x210 [lod]
[172805.964356] [] lod_declare_create+0x204/0x590 [lod]
[172805.971005] [] ? lu_context_refill+0x19/0x50 [obdclass]
[172805.978002] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[172805.985926] [] mdd_declare_create+0x4c/0xcb0 [mdd]
[172805.992461] [] mdd_create+0x847/0x14e0 [mdd]
[172805.998488] [] mdt_reint_open+0x224f/0x3240 [mdt]
[172806.004953] [] ? upcall_cache_get_entry+0x218/0x8b0 [obdclass]
[172806.012554] [] mdt_reint_rec+0x83/0x210 [mdt]
[172806.018665] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[172806.025294] [] ? mdt_intent_fixup_resent+0x36/0x220 [mdt]
[172806.032458] [] mdt_intent_open+0x82/0x3a0 [mdt]
[172806.038761] [] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[172806.045998] [] mdt_intent_policy+0x435/0xd80 [mdt]
[172806.052559] [] ? mdt_intent_fixup_resent+0x220/0x220 [mdt]
[172806.059831] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[172806.066633] [] ? cfs_hash_bd_add_locked+0x63/0x80 [libcfs]
[172806.073885] [] ? cfs_hash_add+0xbe/0x1a0 [libcfs]
[172806.080357] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[172806.087530] [] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc]
[172806.095172] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[172806.101396] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[172806.108413] [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[172806.116076] [] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[172806.123257] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[172806.131047] [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[172806.137930] [] ? __wake_up+0x44/0x50
[172806.143293] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[172806.149653] [] ? __schedule+0x42a/0x860
[172806.155264] [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[172806.162764] [] kthread+0xd1/0xe0
[172806.167740] [] ? insert_kthread_work+0x40/0x40
[172806.173930] [] ret_from_fork_nospec_begin+0xe/0x21
[172806.180478] [] ? insert_kthread_work+0x40/0x40
[172825.988284] LNet: Service thread pid 39269 was inactive for 200.49s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[172826.001233] LustreError: dumping log to /tmp/lustre-log.1576159064.39269
[172851.521774] LNet: Service thread pid 39399 completed after 294.37s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[172856.708475] LNet: Service thread pid 39255 was inactive for 200.36s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[172856.721423] LustreError: dumping log to /tmp/lustre-log.1576159095.39255
[172871.556547] LNet: Service thread pid 39367 was inactive for 200.02s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[172871.569495] LustreError: dumping log to /tmp/lustre-log.1576159109.39367
[172921.644534] Lustre: fir-MDT0000: haven't heard from client 030cce72-3f78-2631-9a21-d2dac6dcbefa (at 10.8.19.1@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba9029000, cur 1576159160 expire 1576159010 last 1576158933
[172997.325305] LustreError: 39375:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576158935, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff887af3215580/0xc3c20c06c19a7142 lrc: 3/1,0 mode: --/PR res: [0x200000406:0x1b2:0x0].0x0 bits 0x13/0x0 rrc: 29 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39375 timeout: 0 lvb_type: 0
[172997.364977] LustreError: dumping log to /tmp/lustre-log.1576159235.39375
[172997.952305] LustreError: 39341:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576158936, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff887acca37080/0xc3c20c06c19a718f lrc: 3/1,0 mode: --/PR res: [0x200000406:0x1b2:0x0].0x0 bits 0x13/0x0 rrc: 29 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39341 timeout: 0 lvb_type: 0
[172997.989300] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576158635/real 1576158635] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576159236 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[172997.989303] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 11 previous similar messages
[172997.989310] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[172997.989311] Lustre: Skipped 5 previous similar messages
[172997.989461] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[172997.989463] Lustre: Skipped 5 previous similar messages
[172998.066661] LustreError: 39341:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 3 previous similar messages
[172999.058323] LustreError: 39323:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576158937, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff8852deede780/0xc3c20c06c19a74c9 lrc: 3/1,0 mode: --/PR res: [0x200000406:0x1b2:0x0].0x0 bits 0x13/0x0 rrc: 29 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39323 timeout: 0 lvb_type: 0
[172999.097897] LustreError: 39323:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 2 previous similar messages
[173004.414340] LustreError: 39364:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576158942, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff8874977bf080/0xc3c20c06c19a7a79 lrc: 3/1,0 mode: --/PR res: [0x200000406:0x1b2:0x0].0x0 bits 0x13/0x0 rrc: 29 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39364 timeout: 0 lvb_type: 0
[173004.453891] LustreError: 39364:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 1 previous similar message
[173007.749358] LNet: Service thread pid 39236 was inactive for 310.36s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[173007.762312] LustreError: dumping log to /tmp/lustre-log.1576159246.39236
[173008.773365] LustreError: dumping log to /tmp/lustre-log.1576159247.39395
[173009.797373] LustreError: dumping log to /tmp/lustre-log.1576159248.39375
[173010.821376] LustreError: dumping log to /tmp/lustre-log.1576159249.39341
[173011.845386] LustreError: dumping log to /tmp/lustre-log.1576159250.39267
[173042.439582] LustreError: 39346:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576158980, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff8877ae0d0b40/0xc3c20c06c19a96a3 lrc: 3/1,0 mode: --/PR res: [0x200029791:0x7f50:0x0].0x0 bits 0x13/0x0 rrc: 13 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39346 timeout: 0 lvb_type: 0
[173042.479225] LustreError: 39346:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 1 previous similar message
[173051.522802] LNet: Service thread pid 39384 completed after 494.34s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[173067.141736] LNet: Service thread pid 39364 was inactive for 362.72s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[173067.158758] Pid: 39364, comm: mdt00_030 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[173067.169020] Call Trace:
[173067.171581] [] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[173067.178619] [] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[173067.185899] [] mdt_object_local_lock+0x50b/0xb20 [mdt]
[173067.192841] [] mdt_object_lock_internal+0x70/0x360 [mdt]
[173067.199931] [] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[173067.206949] [] mdt_intent_getattr+0x2b5/0x480 [mdt]
[173067.213608] [] mdt_intent_policy+0x435/0xd80 [mdt]
[173067.220192] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[173067.227041] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[173067.234248] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[173067.240517] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[173067.247560] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[173067.255364] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[173067.261789] [] kthread+0xd1/0xe0
[173067.266794] [] ret_from_fork_nospec_begin+0xe/0x21
[173067.273370] [] 0xffffffffffffffff
[173067.278481] LustreError: dumping log to /tmp/lustre-log.1576159305.39364
[173114.643135] Lustre: MGS: haven't heard from client 4da0631e-0b9c-7c27-c6e2-66c5a8c0b673 (at 10.9.101.53@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888be8230c00, cur 1576159353 expire 1576159203 last 1576159126
[173114.664315] Lustre: Skipped 3 previous similar messages
[173151.750216] Lustre: 39384:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff886bf21a8900 x1651216130790064/t0(0) o101->a1acf167-afde-6f5a-879d-1a7c0814f282@10.9.117.21@o2ib4:255/0 lens 376/1600 e 19 to 0 dl 1576159395 ref 2 fl Interpret:/0/0 rc 0/0
[173153.158219] LNet: Service thread pid 39419 was inactive for 410.69s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[173153.175241] Pid: 39419, comm: mdt02_045 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[173153.185505] Call Trace:
[173153.188061] [] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[173153.195100] [] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[173153.202400] [] mdt_object_local_lock+0x50b/0xb20 [mdt]
[173153.209325] [] mdt_object_lock_internal+0x70/0x360 [mdt]
[173153.216422] [] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[173153.223434] [] mdt_intent_getattr+0x2b5/0x480 [mdt]
[173153.230105] [] mdt_intent_policy+0x435/0xd80 [mdt]
[173153.236675] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[173153.243538] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[173153.250733] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[173153.257006] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[173153.264036] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[173153.271853] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[173153.278286] [] kthread+0xd1/0xe0
[173153.283300] [] ret_from_fork_nospec_begin+0xe/0x21
[173153.289865] [] 0xffffffffffffffff
[173153.294984] LustreError: dumping log to /tmp/lustre-log.1576159391.39419
[173153.302338] Pid: 39346, comm: mdt02_026 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[173153.312615] Call Trace:
[173153.315160] [] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[173153.322180] [] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[173153.329467] [] mdt_object_local_lock+0x50b/0xb20 [mdt]
[173153.336379] [] mdt_object_lock_internal+0x70/0x360 [mdt]
[173153.343473] [] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[173153.350469] [] mdt_intent_getattr+0x2b5/0x480 [mdt]
[173153.357131] [] mdt_intent_policy+0x435/0xd80 [mdt]
[173153.363695] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[173153.370548] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[173153.377735] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[173153.384005] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[173153.391039] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[173153.398837] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[173153.405252] [] kthread+0xd1/0xe0
[173153.410272] [] ret_from_fork_nospec_begin+0xe/0x21
[173153.416822] [] 0xffffffffffffffff
[173155.206226] Pid: 39407, comm: mdt01_036 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[173155.216484] Call Trace:
[173155.219045] [] call_rwsem_down_write_failed+0x17/0x30
[173155.225892] [] lod_qos_statfs_update+0x97/0x2b0 [lod]
[173155.232747] [] lod_qos_prep_create+0x16a/0x1890 [lod]
[173155.239584] [] lod_prepare_create+0x215/0x2e0 [lod]
[173155.246259] [] lod_declare_striped_create+0x1ee/0x980 [lod]
[173155.253605] [] lod_declare_create+0x204/0x590 [lod]
[173155.260269] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[173155.268243] [] mdd_declare_create+0x4c/0xcb0 [mdd]
[173155.274821] [] mdd_create+0x847/0x14e0 [mdd]
[173155.280864] [] mdt_reint_open+0x224f/0x3240 [mdt]
[173155.287372] [] mdt_reint_rec+0x83/0x210 [mdt]
[173155.293509] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[173155.300182] [] mdt_intent_open+0x82/0x3a0 [mdt]
[173155.306493] [] mdt_intent_policy+0x435/0xd80 [mdt]
[173155.313078] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[173155.319936] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[173155.327156] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[173155.333403] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[173155.340447] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[173155.348249] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[173155.354676] [] kthread+0xd1/0xe0
[173155.359680] [] ret_from_fork_nospec_begin+0xe/0x21
[173155.366258] [] 0xffffffffffffffff
[173155.371365] LustreError: dumping log to /tmp/lustre-log.1576159393.39407
[173158.226523] Lustre: fir-MDT0000: Client a1acf167-afde-6f5a-879d-1a7c0814f282 (at 10.9.117.21@o2ib4) reconnecting
[173159.430264] Lustre: 39266:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff888bf873cc80 x1649559130108000/t0(0) o101->a8d84424-9b8a-5525-fab4-b5243bf0dc64@10.9.104.22@o2ib4:262/0 lens 376/1600 e 19 to 0 dl 1576159402 ref 2 fl Interpret:/0/0 rc 0/0
[173159.459434] Lustre: 39266:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message
[173164.994919] Lustre: fir-MDT0000: Client a8d84424-9b8a-5525-fab4-b5243bf0dc64 (at 10.9.104.22@o2ib4) reconnecting
[173165.005182] Lustre: Skipped 1 previous similar message
[173165.830299] Lustre: 39277:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff887bf9fa3f00 x1649312727839232/t0(0) o101->e19e1947-897d-03aa-f267-2edb615db310@10.9.110.41@o2ib4:269/0 lens 1888/3288 e 19 to 0 dl 1576159409 ref 2 fl Interpret:/0/0 rc 0/0
[173172.526437] Lustre: fir-MDT0000: Client e19e1947-897d-03aa-f267-2edb615db310 (at 10.9.110.41@o2ib4) reconnecting
[173205.960537] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0000: cannot cleanup orphans: rc = -107
[173205.973659] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 5 previous similar messages
[173219.962619] Lustre: 39411:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff887ba5484c80 x1651840017167536/t0(0) o101->20841216-9d8b-7794-9459-ced18b617ae2@10.9.114.3@o2ib4:323/0 lens 1792/3288 e 7 to 0 dl 1576159463 ref 2 fl Interpret:/0/0 rc 0/0
[173226.497582] Lustre: fir-MDT0000: Client 20841216-9d8b-7794-9459-ced18b617ae2 (at 10.9.114.3@o2ib4) reconnecting
[173250.950833] Lustre: 39220:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff886bf5ab9f80 x1648842511409920/t0(0) o101->68425483-9450-d7a7-cad3-736e62941d5a@10.9.110.18@o2ib4:354/0 lens 376/1600 e 5 to 0 dl 1576159494 ref 2 fl Interpret:/0/0 rc 0/0
[173251.524213] LNet: Service thread pid 39352 completed after 687.52s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[173251.540463] LNet: Skipped 20 previous similar messages
[173552.000595] LustreError: 39265:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576159490, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff8852dd9d33c0/0xc3c20c06c19ce3d7 lrc: 3/1,0 mode: --/PR res: [0x2000376b8:0x1706e:0x0].0x0 bits 0x13/0x0 rrc: 4 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39265 timeout: 0 lvb_type: 0
[173598.139860] Lustre: 40805:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576159080/real 1576159080] req@ffff887bbd98b180 x1652542930629712/t0(0) o5->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 432/432 e 0 to 1 dl 1576159836 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
[173598.168151] Lustre: 40805:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 8 previous similar messages
[173598.728869] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[173598.745069] Lustre: Skipped 4 previous similar messages
[173598.750548] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[173598.760528] Lustre: Skipped 9 previous similar messages
[173651.528180] LustreError: 39371:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576159589, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff885d87215a00/0xc3c20c06c19e4d99 lrc: 3/1,0 mode: --/PR res: [0x200037a5a:0xbae0:0x0].0x0 bits 0x13/0x0 rrc: 4 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39371 timeout: 0 lvb_type: 0
[173651.567747] LustreError: 39371:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 1 previous similar message
[173685.652231] Lustre: fir-MDT0000: haven't heard from client 7ac0db55-de36-c1c6-f1a9-d7191d6b9947 (at 10.9.103.29@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba91cfc00, cur 1576159924 expire 1576159774 last 1576159697
[173685.674106] Lustre: Skipped 1 previous similar message
[173751.528766] LustreError: 39389:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576159689, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff8852deec3840/0xc3c20c06c19f6545 lrc: 3/1,0 mode: --/PR res: [0x2000389b9:0x11efe:0x0].0x0 bits 0x13/0x0 rrc: 4 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39389 timeout: 0 lvb_type: 0
[173751.568414] LustreError: 39389:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 1 previous similar message
[173851.528354] LustreError: 39343:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576159789, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff888bf3b71f80/0xc3c20c06c1a07b54 lrc: 3/0,1 mode: --/CW res: [0x200029791:0x7f50:0x0].0x0 bits 0x2/0x0 rrc: 14 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39343 timeout: 0 lvb_type: 0
[173851.567932] LustreError: 39343:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 8 previous similar messages
[173851.585085] Lustre: fir-MDT0000: Client dc8b0e50-2be4-ddc9-1be7-a287c814d044 (at 10.9.110.46@o2ib4) reconnecting
[173851.595362] Lustre: Skipped 1 previous similar message
[173962.989016] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0000: cannot cleanup orphans: rc = -107
[173963.002136] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 5 previous similar messages
[174199.948407] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576159837/real 1576159837] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576160438 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[174199.976662] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 9 previous similar messages
[174199.986524] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[174200.002765] Lustre: Skipped 4 previous similar messages
[174200.008248] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[174200.018206] Lustre: Skipped 6 previous similar messages
[174720.017497] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0000: cannot cleanup orphans: rc = -107
[174720.030643] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 5 previous similar messages
[174800.975969] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576160438/real 1576160438] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576161039 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[174801.004175] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 10 previous similar messages
[174801.014099] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[174801.030288] Lustre: Skipped 3 previous similar messages
[174801.035774] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[174801.045708] Lustre: Skipped 3 previous similar messages
[175402.027498] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576161039/real 1576161039] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576161640 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[175402.055700] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 9 previous similar messages
[175402.065535] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[175402.081712] Lustre: Skipped 4 previous similar messages
[175402.087209] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[175402.097128] Lustre: Skipped 4 previous similar messages
[175477.045943] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0000: cannot cleanup orphans: rc = -107
[175477.059068] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 5 previous similar messages
[175688.732216] LustreError: 39414:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576161627, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff8855fafebf00/0xc3c20c06c1b29d18 lrc: 3/1,0 mode: --/PR res: [0x200029791:0x7f50:0x0].0x0 bits 0x13/0x0 rrc: 24 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39414 timeout: 0 lvb_type: 0
[175688.771885] LustreError: 39414:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 1 previous similar message
[175723.960380] LustreError: 38893:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576161662, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff887bc6b87980/0xc3c20c06c1b2c0f1 lrc: 3/1,0 mode: --/PR res: [0x200029791:0x7f50:0x0].0x0 bits 0x13/0x0 rrc: 24 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 38893 timeout: 0 lvb_type: 0
[175752.235163] Lustre: fir-MDT0000: Client de02546b-f416-f3b2-d476-06fb4a31366f (at 10.8.7.20@o2ib6) reconnecting
[175803.276840] LustreError: 39432:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576161741, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff8859c7d18480/0xc3c20c06c1b33264 lrc: 3/1,0 mode: --/PR res: [0x200039577:0x11b6:0x0].0x0 bits 0x13/0x0 rrc: 42 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39432 timeout: 0 lvb_type: 0
[175803.316487] LustreError: 39432:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 24 previous similar messages
[175855.185602] Lustre: fir-MDT0000: Client 1d444526-0c94-9229-34be-9d214c0c6bbd (at 10.9.101.46@o2ib4) reconnecting
[176003.694998] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576161640/real 1576161640] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576162241 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[176003.723223] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 11 previous similar messages
[176003.733150] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[176003.749364] Lustre: Skipped 5 previous similar messages
[176003.754840] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[176003.764770] Lustre: Skipped 7 previous similar messages
[176146.903832] Lustre: 38889:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff885b94335100 x1649443527617904/t0(0) o101->75ca7fbe-4dbb-5345-e1bf-3a337b10784c@10.9.117.38@o2ib4:230/0 lens 1800/3288 e 4 to 0 dl 1576162390 ref 2 fl Interpret:/0/0 rc 0/0
[176151.447846] LNet: Service thread pid 39329 was inactive for 599.37s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[176151.464867] LNet: Skipped 2 previous similar messages
[176151.470013] Pid: 39329, comm: mdt00_019 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[176151.480292] Call Trace:
[176151.482850] [] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[176151.489904] [] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[176151.497220] [] mdt_object_local_lock+0x438/0xb20 [mdt]
[176151.504141] [] mdt_object_lock_internal+0x70/0x360 [mdt]
[176151.511235] [] mdt_object_lock+0x20/0x30 [mdt]
[176151.517461] [] mdt_reint_open+0x106a/0x3240 [mdt]
[176151.523958] [] mdt_reint_rec+0x83/0x210 [mdt]
[176151.530098] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[176151.536759] [] mdt_intent_open+0x82/0x3a0 [mdt]
[176151.543068] [] mdt_intent_policy+0x435/0xd80 [mdt]
[176151.549656] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[176151.556514] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[176151.563735] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[176151.569990] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[176151.577045] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[176151.584850] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[176151.591287] [] kthread+0xd1/0xe0
[176151.596294] [] ret_from_fork_nospec_begin+0xe/0x21
[176151.602878] [] 0xffffffffffffffff
[176151.607987] LustreError: dumping log to /tmp/lustre-log.1576162389.39329
[176152.075855] LustreError: 39358:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576162090, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff887ad67ebf00/0xc3c20c06c1b6f236
lrc: 3/0,1 mode: --/CW res: [0x200029791:0x7f50:0x0].0x0 bits 0x2/0x0 rrc: 29 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39358 timeout: 0 lvb_type: 0 [176152.115411] LustreError: 39358:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 18 previous similar messages [176152.141702] Lustre: fir-MDT0000: Client 02eb8135-4034-bcb2-8df8-77d00506e76a (at 10.8.7.15@o2ib6) reconnecting [176152.151795] Lustre: Skipped 5 previous similar messages [176153.495858] LNet: Service thread pid 39346 was inactive for 601.41s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [176153.512877] Pid: 39346, comm: mdt02_026 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [176153.523144] Call Trace: [176153.525706] [] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [176153.532747] [] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [176153.540052] [] mdt_object_local_lock+0x50b/0xb20 [mdt] [176153.546970] [] mdt_object_lock_internal+0x70/0x360 [mdt] [176153.554073] [] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [176153.561094] [] mdt_intent_getattr+0x2b5/0x480 [mdt] [176153.567769] [] mdt_intent_policy+0x435/0xd80 [mdt] [176153.574367] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [176153.581235] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [176153.588437] [] tgt_enqueue+0x62/0x210 [ptlrpc] [176153.594716] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [176153.601765] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [176153.609581] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [176153.616031] [] kthread+0xd1/0xe0 [176153.621042] [] ret_from_fork_nospec_begin+0xe/0x21 [176153.627647] [] 0xffffffffffffffff [176153.632768] LustreError: dumping log to /tmp/lustre-log.1576162391.39346 [176153.640137] Pid: 39269, comm: mdt02_018 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [176153.650440] Call Trace: [176153.652988] [] call_rwsem_down_write_failed+0x17/0x30 
[176153.659806] [] lod_qos_statfs_update+0x97/0x2b0 [lod] [176153.666658] [] lod_qos_prep_create+0x16a/0x1890 [lod] [176153.673495] [] lod_prepare_create+0x215/0x2e0 [lod] [176153.680160] [] lod_declare_striped_create+0x1ee/0x980 [lod] [176153.687526] [] lod_declare_create+0x204/0x590 [lod] [176153.694189] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [176153.702166] [] mdd_declare_create+0x4c/0xcb0 [mdd] [176153.708727] [] mdd_create+0x847/0x14e0 [mdd] [176153.714806] [] mdt_reint_open+0x224f/0x3240 [mdt] [176153.721299] [] mdt_reint_rec+0x83/0x210 [mdt] [176153.727455] [] mdt_reint_internal+0x6e3/0xaf0 [mdt] [176153.734101] [] mdt_intent_open+0x82/0x3a0 [mdt] [176153.740416] [] mdt_intent_policy+0x435/0xd80 [mdt] [176153.746978] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [176153.753833] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [176153.761042] [] tgt_enqueue+0x62/0x210 [ptlrpc] [176153.767293] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [176153.774329] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [176153.782130] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [176153.788561] [] kthread+0xd1/0xe0 [176153.793555] [] ret_from_fork_nospec_begin+0xe/0x21 [176153.800120] [] 0xffffffffffffffff [176153.805257] Pid: 39371, comm: mdt02_033 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [176153.815522] Call Trace: [176153.818069] [] call_rwsem_down_write_failed+0x17/0x30 [176153.824927] [] lod_qos_statfs_update+0x97/0x2b0 [lod] [176153.831755] [] lod_qos_prep_create+0x16a/0x1890 [lod] [176153.838596] [] lod_prepare_create+0x215/0x2e0 [lod] [176153.845295] [] lod_declare_striped_create+0x1ee/0x980 [lod] [176153.852653] [] lod_declare_create+0x204/0x590 [lod] [176153.859349] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [176153.867314] [] mdd_declare_create+0x4c/0xcb0 [mdd] [176153.873921] [] mdd_create+0x847/0x14e0 [mdd] [176153.879976] [] mdt_reint_open+0x224f/0x3240 [mdt] [176153.886506] [] mdt_reint_rec+0x83/0x210 [mdt] 
[176153.892646] [] mdt_reint_internal+0x6e3/0xaf0 [mdt] [176153.899343] [] mdt_intent_open+0x82/0x3a0 [mdt] [176153.905654] [] mdt_intent_policy+0x435/0xd80 [mdt] [176153.912262] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [176153.919115] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [176153.926345] [] tgt_enqueue+0x62/0x210 [ptlrpc] [176153.932602] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [176153.939656] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [176153.947463] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [176153.953913] [] kthread+0xd1/0xe0 [176153.958918] [] ret_from_fork_nospec_begin+0xe/0x21 [176153.965508] [] 0xffffffffffffffff [176234.074373] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0000: cannot cleanup orphans: rc = -107 [176234.087497] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 5 previous similar messages [176315.288827] LNet: Service thread pid 38897 was inactive for 763.21s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [176315.305849] LNet: Skipped 2 previous similar messages [176315.310998] Pid: 38897, comm: mdt03_001 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [176315.321278] Call Trace: [176315.323830] [] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [176315.330867] [] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [176315.338163] [] mdt_object_local_lock+0x50b/0xb20 [mdt] [176315.345089] [] mdt_object_lock_internal+0x70/0x360 [mdt] [176315.352193] [] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [176315.359199] [] mdt_intent_getattr+0x2b5/0x480 [mdt] [176315.365868] [] mdt_intent_policy+0x435/0xd80 [mdt] [176315.372440] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [176315.379302] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [176315.386499] [] tgt_enqueue+0x62/0x210 [ptlrpc] [176315.392763] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [176315.399808] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [176315.407628] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [176315.414040] [] kthread+0xd1/0xe0 [176315.419056] [] ret_from_fork_nospec_begin+0xe/0x21 [176315.425619] [] 0xffffffffffffffff [176315.430741] LustreError: dumping log to /tmp/lustre-log.1576162553.38897 [176352.077456] LNet: Service thread pid 39371 completed after 799.99s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). 
[176352.093781] LNet: Skipped 4 previous similar messages
[176605.154504] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576162242/real 1576162242] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576162843 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[176605.182726] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 9 previous similar messages
[176605.192561] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[176605.208738] Lustre: Skipped 4 previous similar messages
[176605.214253] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[176605.224196] Lustre: Skipped 25 previous similar messages
[176852.122958] LustreError: 39229:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576162790, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff8855f753d100/0xc3c20c06c1bcc2c7 lrc: 3/0,1 mode: --/PW res: [0x200039577:0x11b6:0x0].0x8ec40924 bits 0x2/0x0 rrc: 3 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39229 timeout: 0 lvb_type: 0
[176852.163033] LustreError: 39229:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 13 previous similar messages
[176852.163508] Lustre: fir-MDT0000: Client c104d961-ddd0-a5eb-3382-4ecbd88b591c (at 10.8.18.16@o2ib6) reconnecting
[176852.163510] Lustre: Skipped 15 previous similar messages
[176852.872295] Lustre: fir-MDT0000: Client 4fb4463b-4df1-b2ca-bcaf-03821e29c498 (at 10.8.8.31@o2ib6) reconnecting
[176852.882382] Lustre: Skipped 13 previous similar messages
[176991.102773] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0000: cannot cleanup orphans: rc = -107
[176991.115900] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 5 previous similar messages
[177206.126017] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576162843/real 1576162843] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576163444 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[177206.154235] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 11 previous similar messages
[177206.164157] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[177206.180315] Lustre: Skipped 4 previous similar messages
[177206.185792] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[177206.195706] Lustre: Skipped 19 previous similar messages
[177283.666727] Lustre: fir-MDT0000: haven't heard from client 3bd651a1-07e6-0cec-1800-45156860eb64 (at 10.9.110.39@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bdf49ec00, cur 1576163522 expire 1576163372 last 1576163295
[177283.688640] Lustre: Skipped 1 previous similar message
[177283.693902] LustreError: 39384:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) ### client (nid 10.9.110.39@o2ib4) failed to reply to blocking AST (req@ffff885db815e300 x1652542932057200 status 0 rc -5), evict it ns: mdt-fir-MDT0000_UUID lock: ffff885861b9e300/0xc3c20c06c1c07621 lrc: 4/0,0 mode: PR/PR res: [0x20003963a:0x2ae:0x0].0x0 bits 0x13/0x0 rrc: 4 type: IBT flags: 0x60200400000020 nid: 10.9.110.39@o2ib4 remote: 0x3535cd2e4ab6440 expref: 419 pid: 39328 timeout: 177424 lvb_type: 0
[177283.736621] LustreError: 138-a: fir-MDT0000: A client on nid 10.9.110.39@o2ib4 was evicted due to a lock blocking callback time out: rc -5
[177291.682641] Lustre: MGS: haven't heard from client 29b6614b-d9b6-3c4b-cd6c-79cb079428c5 (at 10.9.110.39@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885b73820c00, cur 1576163530 expire 1576163380 last 1576163303
[177748.131278] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0000: cannot cleanup orphans: rc = -107
[177748.144402] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 5 previous similar messages
[177807.065616] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576163444/real 1576163444] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576164045 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[177807.093822] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 10 previous similar messages
[177807.103743] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[177807.119892] Lustre: Skipped 3 previous similar messages
[177807.125403] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[177807.135324] Lustre: Skipped 3 previous similar messages
[177971.533627] LustreError: 39230:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576163909, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff8863aabea880/0xc3c20c06c1c583ca lrc: 3/1,0 mode: --/PR res: [0x200039577:0x11b6:0x0].0x0 bits 0x13/0x0 rrc: 29 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39230 timeout: 0 lvb_type: 0
[177971.573266] LustreError: 39230:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 19 previous similar messages
[178007.063842] LustreError: 39326:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576163945, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff88792d670000/0xc3c20c06c1c5bab9 lrc: 3/1,0 mode: --/PR res: [0x200039577:0x11b6:0x0].0x0 bits 0x13/0x0 rrc: 29 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39326 timeout: 0 lvb_type: 0
[178010.018854] LNet: Service thread pid 39341 was inactive for 360.81s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[178010.035877] Pid: 39341, comm: mdt02_025 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[178010.046140] Call Trace:
[178010.048698] [] osp_precreate_reserve+0x2e8/0x800 [osp]
[178010.055618] [] osp_declare_create+0x199/0x5b0 [osp]
[178010.062280] [] lod_sub_declare_create+0xdf/0x210 [lod]
[178010.069198] [] lod_qos_declare_object_on+0xbe/0x3a0 [lod]
[178010.076387] [] lod_alloc_qos.constprop.18+0x10f4/0x1840 [lod]
[178010.083906] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[178010.090828] [] lod_declare_instantiate_components+0x9a/0x1d0 [lod]
[178010.098779] [] lod_declare_layout_change+0xb65/0x10f0 [lod]
[178010.106133] [] mdd_declare_layout_change+0x62/0x120 [mdd]
[178010.113313] [] mdd_layout_change+0x882/0x1000 [mdd]
[178010.119975] [] mdt_layout_change+0x337/0x430 [mdt]
[178010.126561] [] mdt_intent_layout+0x7ee/0xcc0 [mdt]
[178010.133146] [] mdt_intent_policy+0x435/0xd80 [mdt]
[178010.139718] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[178010.146588] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[178010.153793] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[178010.160057] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[178010.167089] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[178010.174915] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[178010.181327] [] kthread+0xd1/0xe0
[178010.186342] [] ret_from_fork_nospec_begin+0xe/0x21
[178010.192907] [] 0xffffffffffffffff
[178010.198029] LustreError: dumping log to /tmp/lustre-log.1576164248.39341
[178032.546994] LNet: Service thread pid 39258 was inactive for 361.02s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[178032.564015] Pid: 39258, comm: mdt02_014 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[178032.574277] Call Trace:
[178032.576837] [] call_rwsem_down_write_failed+0x17/0x30
[178032.583676] [] lod_qos_statfs_update+0x97/0x2b0 [lod]
[178032.590526] [] lod_qos_prep_create+0x16a/0x1890 [lod]
[178032.597360] [] lod_prepare_create+0x215/0x2e0 [lod]
[178032.604023] [] lod_declare_striped_create+0x1ee/0x980 [lod]
[178032.611379] [] lod_declare_create+0x204/0x590 [lod]
[178032.618039] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[178032.626010] [] mdd_declare_create+0x4c/0xcb0 [mdd]
[178032.632572] [] mdd_create+0x847/0x14e0 [mdd]
[178032.638628] [] mdt_reint_open+0x224f/0x3240 [mdt]
[178032.645114] [] mdt_reint_rec+0x83/0x210 [mdt]
[178032.651264] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[178032.657939] [] mdt_intent_open+0x82/0x3a0 [mdt]
[178032.664269] [] mdt_intent_policy+0x435/0xd80 [mdt]
[178032.670851] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[178032.677726] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[178032.684921] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[178032.691193] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[178032.698226] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[178032.706039] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[178032.712455] [] kthread+0xd1/0xe0
[178032.717472] [] ret_from_fork_nospec_begin+0xe/0x21
[178032.724049] [] 0xffffffffffffffff
[178032.729173] LustreError: dumping log to /tmp/lustre-log.1576164271.39258
[178052.420463] LNet: Service thread pid 39341 completed after 403.22s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[178067.363210] LNet: Service thread pid 39384 was inactive for 414.94s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[178067.380234] Pid: 39384, comm: mdt01_032 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[178067.390493] Call Trace:
[178067.393046] [] osp_precreate_reserve+0x2e8/0x800 [osp]
[178067.399966] [] osp_declare_create+0x199/0x5b0 [osp]
[178067.406644] [] lod_sub_declare_create+0xdf/0x210 [lod]
[178067.413564] [] lod_qos_declare_object_on+0xbe/0x3a0 [lod]
[178067.420745] [] lod_alloc_qos.constprop.18+0x10f4/0x1840 [lod]
[178067.428262] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[178067.435184] [] lod_prepare_create+0x215/0x2e0 [lod]
[178067.441835] [] lod_declare_striped_create+0x1ee/0x980 [lod]
[178067.449201] [] lod_declare_create+0x204/0x590 [lod]
[178067.455848] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[178067.463819] [] mdd_declare_create+0x4c/0xcb0 [mdd]
[178067.470382] [] mdd_create+0x847/0x14e0 [mdd]
[178067.476449] [] mdt_reint_open+0x224f/0x3240 [mdt]
[178067.482940] [] mdt_reint_rec+0x83/0x210 [mdt]
[178067.489091] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[178067.495750] [] mdt_intent_open+0x82/0x3a0 [mdt]
[178067.502073] [] mdt_intent_policy+0x435/0xd80 [mdt]
[178067.508646] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[178067.515508] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[178067.522703] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[178067.528969] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[178067.535998] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[178067.543827] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[178067.550246] [] kthread+0xd1/0xe0
[178067.555261] [] ret_from_fork_nospec_begin+0xe/0x21
[178067.561825] [] 0xffffffffffffffff
[178067.566945] LustreError: dumping log to /tmp/lustre-log.1576164305.39384
[178078.627277] LNet: Service thread pid 39250 was inactive for 426.20s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[178078.644302] Pid: 39250, comm: mdt00_010 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[178078.654560] Call Trace:
[178078.657118] [] call_rwsem_down_write_failed+0x17/0x30
[178078.663942] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[178078.671411] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[178078.678329] [] lod_prepare_create+0x215/0x2e0 [lod]
[178078.684998] [] lod_declare_striped_create+0x1ee/0x980 [lod]
[178078.692372] [] lod_declare_create+0x204/0x590 [lod]
[178078.699037] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[178078.707003] [] mdd_declare_create+0x4c/0xcb0 [mdd]
[178078.713581] [] mdd_create+0x847/0x14e0 [mdd]
[178078.719625] [] mdt_reint_open+0x224f/0x3240 [mdt]
[178078.726132] [] mdt_reint_rec+0x83/0x210 [mdt]
[178078.732270] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[178078.738956] [] mdt_intent_open+0x82/0x3a0 [mdt]
[178078.745271] [] mdt_intent_policy+0x435/0xd80 [mdt]
[178078.751854] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[178078.758715] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[178078.765931] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[178078.772189] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[178078.779233] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[178078.787044] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[178078.793468] [] kthread+0xd1/0xe0
[178078.798487] [] ret_from_fork_nospec_begin+0xe/0x21
[178078.805069] [] 0xffffffffffffffff
[178078.810202] LustreError: dumping log to /tmp/lustre-log.1576164317.39250
[178080.675286] Pid: 39364, comm: mdt00_030 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[178080.685552] Call Trace:
[178080.688102] [] call_rwsem_down_write_failed+0x17/0x30
[178080.694923] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[178080.702394] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[178080.709316] [] lod_prepare_create+0x215/0x2e0 [lod]
[178080.715980] [] lod_declare_striped_create+0x1ee/0x980 [lod]
[178080.723325] [] lod_declare_create+0x204/0x590 [lod]
[178080.729986] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[178080.737944] [] mdd_declare_create+0x4c/0xcb0 [mdd]
[178080.744522] [] mdd_create+0x847/0x14e0 [mdd]
[178080.750563] [] mdt_reint_open+0x224f/0x3240 [mdt]
[178080.757075] [] mdt_reint_rec+0x83/0x210 [mdt]
[178080.763217] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[178080.769863] [] mdt_intent_open+0x82/0x3a0 [mdt]
[178080.776188] [] mdt_intent_policy+0x435/0xd80 [mdt]
[178080.782760] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[178080.789624] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[178080.796825] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[178080.803089] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[178080.810121] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[178080.817941] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[178080.824361] [] kthread+0xd1/0xe0
[178080.829375] [] ret_from_fork_nospec_begin+0xe/0x21
[178080.835937] [] 0xffffffffffffffff
[178080.841038] LustreError: dumping log to /tmp/lustre-log.1576164319.39364
[178086.819328] LNet: Service thread pid 39230 was inactive for 415.28s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[178086.832279] LNet: Skipped 9 previous similar messages
[178086.837425] LustreError: dumping log to /tmp/lustre-log.1576164325.39230
[178113.443491] LNet: Service thread pid 39268 was inactive for 361.02s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[178113.456442] LustreError: dumping log to /tmp/lustre-log.1576164351.39268
[178127.779577] LNet: Service thread pid 39382 was inactive for 414.45s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[178127.792532] LustreError: dumping log to /tmp/lustre-log.1576164366.39382
[178143.192684] LustreError: 39328:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576164081, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff885d6f11b600/0xc3c20c06c1c6f3d6 lrc: 3/1,0 mode: --/PR res: [0x200039577:0x11b6:0x0].0x0 bits 0x13/0x0 rrc: 29 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39328 timeout: 0 lvb_type: 0
[178143.232338] LustreError: 39328:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 1 previous similar message
[178152.421329] LNet: Service thread pid 39364 completed after 499.27s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[178152.437578] LNet: Skipped 6 previous similar messages
[178219.695970] Lustre: fir-MDT0000: haven't heard from client c1504d4c-7504-c251-de3c-6f26c7b8e7d5 (at 10.9.102.26@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba92a7400, cur 1576164458 expire 1576164308 last 1576164231
[178407.709255] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576164045/real 1576164045] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576164646 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[178407.737455] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 10 previous similar messages
[178407.747379] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[178407.763553] Lustre: Skipped 4 previous similar messages
[178407.769035] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[178407.778970] Lustre: Skipped 4 previous similar messages
[178505.159835] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0000: cannot cleanup orphans: rc = -107
[178505.172952] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 5 previous similar messages
[178931.686996] Lustre: fir-MDT0000: haven't heard from client 3c020cd0-089d-acb1-e879-86429192cebf (at 10.8.27.2@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba97b0000, cur 1576165170 expire 1576165020 last 1576164943
[178931.708707] Lustre: Skipped 1 previous similar message
[179009.288760] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576164646/real 1576164646] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576165247 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[179009.316967] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 12 previous similar messages
[179009.326885] Lustre: fir-OST0056-osc-MDT0000: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[179009.343036] Lustre: Skipped 5 previous similar messages
[179009.348501] Lustre: fir-OST0056-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[179009.358424] Lustre: Skipped 5 previous similar messages
[179165.745679] LNetError: 38662:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 0 seconds
[179165.755762] LNetError: 38662:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Timed out RDMA with 10.0.10.115@o2ib7 (6): c: 0, oc: 0, rc: 8
[179165.768097] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[179172.745713] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.115@o2ib7: 1 seconds
[179172.755972] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 59 previous similar messages
[179198.933922] Lustre: MGS: Received new LWP connection from 10.8.8.19@o2ib6, removing former export from same NID
[179200.142011] Lustre: MGS: Received new LWP connection from 10.9.102.58@o2ib4, removing former export from same NID
[179200.152363] Lustre: Skipped 1 previous similar message
[179203.053524] Lustre: MGS: Received new LWP connection from 10.9.108.9@o2ib4, removing former export from same NID
[179205.820776] Lustre: MGS: Received new LWP connection from 10.9.110.63@o2ib4, removing former export from same NID
[179205.831137] Lustre: Skipped 2 previous similar messages
[179209.903126] Lustre: MGS: Received new LWP connection from 10.9.103.13@o2ib4, removing former export from same NID
[179209.913475] Lustre: Skipped 6 previous similar messages
[179211.346409] Lustre: fir-MDT0000: Client 0713f3a9-f297-cd73-69ad-d70a0f44846f (at 10.9.104.62@o2ib4) reconnecting
[179213.609743] Lustre: fir-MDT0000: Client 09403296-99cb-0352-a342-f41333f5025e (at 10.9.107.69@o2ib4) reconnecting
[179218.037676] Lustre: MGS: Received new LWP connection from 10.9.105.9@o2ib4, removing former export from same NID
[179218.047938] Lustre: Skipped 8 previous similar messages
[179222.994841] Lustre: fir-MDT0000: Client ee8a44d1-a255-3904-d785-781d851ce5cc (at 10.9.107.65@o2ib4) reconnecting
[179226.591040] LNetError: 38679:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) lpni 10.0.10.211@o2ib7 added to recovery queue. Health = 900
[179231.388140] Lustre: fir-MDT0000: Client 91b198e6-ce9d-6e88-4a8a-d97e9eaae698 (at 10.8.26.36@o2ib6) reconnecting
[179234.150801] Lustre: MGS: Received new LWP connection from 10.9.112.2@o2ib4, removing former export from same NID
[179234.161066] Lustre: Skipped 35 previous similar messages
[179238.678064] LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.9.108.14@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[179238.739740] Lustre: fir-MDT0000: Client d021ee3d-37fa-4 (at 10.8.28.7@o2ib6) reconnecting
[179238.748011] Lustre: Skipped 2 previous similar messages
[179241.604129] LNetError: 38679:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) lpni 10.0.10.210@o2ib7 added to recovery queue. Health = 900
[179244.746145] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[179244.758141] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 6 previous similar messages
[179249.746166] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.115@o2ib7: 1 seconds
[179249.756418] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 19 previous similar messages
[179253.227647] Lustre: fir-MDT0000: Client b9d3876a-bb59-06ed-126e-3827677a4444 (at 10.9.104.3@o2ib4) reconnecting
[179253.237815] Lustre: Skipped 5 previous similar messages
[179256.617218] LNetError: 38679:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) lpni 10.0.10.212@o2ib7 added to recovery queue. Health = 900
[179256.630247] LNetError: 38679:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) Skipped 1 previous similar message
[179261.640248] LNetError: 38679:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) lpni 10.0.10.202@o2ib7 added to recovery queue. Health = 900
[179261.653286] LNetError: 38679:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) Skipped 1 previous similar message
[179262.188255] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0000: cannot cleanup orphans: rc = -11
[179262.201287] LustreError: 40821:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 5 previous similar messages
[179266.631098] Lustre: MGS: Received new LWP connection from 10.9.105.65@o2ib4, removing former export from same NID
[179266.641452] Lustre: Skipped 106 previous similar messages
[179266.663277] LNetError: 38679:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) lpni 10.0.10.209@o2ib7 added to recovery queue. Health = 900
[179266.676309] LNetError: 38679:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) Skipped 2 previous similar messages
[179269.829836] Lustre: fir-MDT0000: Client 3cd8d17d-c015-47f4-5929-0823e94a86fa (at 10.9.110.34@o2ib4) reconnecting
[179269.840096] Lustre: Skipped 8 previous similar messages
[179275.924074] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.9.104.8@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[179289.785796] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.8.7.18@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[179289.803075] LustreError: Skipped 1 previous similar message
[179300.687488] LNetError: 38679:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) lpni 10.0.10.203@o2ib7 added to recovery queue. Health = 900
[179300.700521] LNetError: 38679:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) Skipped 1 previous similar message
[179302.983920] Lustre: fir-MDT0000: Client 54a4e18f-2dbf-9330-244f-d38b0011d1d4 (at 10.9.103.65@o2ib4) reconnecting
[179302.994193] Lustre: Skipped 25 previous similar messages
[179311.270303] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.103.46@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[179311.287789] LustreError: Skipped 5 previous similar messages
[179312.364547] LustreError: 96473:0:(ldlm_lib.c:3256:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff886bfd423850 x1648782579006384/t0(0) o256->1b4e0033-5092-73e2-ee39-e46ff9b43fe9@10.8.28.8@o2ib6:412/0 lens 304/240 e 0 to 0 dl 1576165592 ref 1 fl Interpret:/0/0 rc 0/0
[179313.248553] LustreError: 96472:0:(ldlm_lib.c:3256:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff888be926b850 x1649567732887344/t0(0) o256->351465b9-9a15-eaaf-e2ff-273afe28ffed@10.9.104.27@o2ib4:413/0 lens 304/240 e 0 to 0 dl 1576165593 ref 1 fl Interpret:/0/0 rc 0/0
[179313.272874] LustreError: 96472:0:(ldlm_lib.c:3256:target_bulk_io()) Skipped 1 previous similar message
[179325.710638] LNetError: 38679:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) lpni 10.0.10.202@o2ib7 added to recovery queue. Health = 900
[179325.723714] LNetError: 38679:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) Skipped 4 previous similar messages
[179330.790491] Lustre: MGS: Received new LWP connection from 10.9.106.19@o2ib4, removing former export from same NID
[179330.800841] Lustre: Skipped 206 previous similar messages
[179344.674297] Lustre: fir-MDT0000: haven't heard from client fir-MDT0000-lwp-OST0054_UUID (at 10.0.10.115@o2ib7) in 227 seconds. I think it's dead, and I am evicting it.
exp ffff886be695e800, cur 1576165583 expire 1576165433 last 1576165356 [179344.695486] Lustre: Skipped 1 previous similar message [179348.914018] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.9.108.50@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [179348.931477] LustreError: Skipped 82 previous similar messages [179351.697279] Lustre: MGS: haven't heard from client b3aef711-fb13-218a-11cf-7e4e4d6f4a51 (at 10.0.10.115@o2ib7) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bdddea400, cur 1576165590 expire 1576165440 last 1576165363 [179351.718463] Lustre: Skipped 5 previous similar messages [179378.311067] Lustre: fir-MDT0000: Client 4e8251e5-eb6b-473d-1b55-6cf68aeb84d4 (at 10.9.105.59@o2ib4) reconnecting [179378.321328] Lustre: Skipped 44 previous similar messages [179398.747060] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900 [179398.759054] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 8 previous similar messages [179400.747066] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.115@o2ib7: 2 seconds [179400.757323] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 45 previous similar messages [179508.949494] Lustre: fir-MDT0000: Client cce2fc1a-d500-0fbb-5491-2d32b40f4df2 (at 10.8.20.10@o2ib6) reconnecting [179508.959675] Lustre: Skipped 64 previous similar messages [179576.748166] LNetError: 38662:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 1 seconds [179576.758258] LNetError: 38662:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Timed out RDMA with 10.0.10.115@o2ib7 (7): c: 0, oc: 0, rc: 8 [179588.314526] Lustre: MGS: Received new LWP connection from 10.9.107.9@o2ib4, removing former export from same NID [179588.324786] Lustre: Skipped 92 previous similar messages [179609.860375] Lustre: 
38698:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576165247/real 1576165247] req@ffff886bcde4ad00 x1652542919103808/t0(0) o6->fir-OST0056-osc-MDT0000@10.0.10.115@o2ib7:28/4 lens 544/432 e 4 to 1 dl 1576165848 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [179609.888575] Lustre: 38698:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 30 previous similar messages [179659.330780] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.9.107.9@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [179659.348156] LustreError: Skipped 118 previous similar messages [179673.330975] Lustre: MGS: Connection restored to (at 10.9.107.9@o2ib4) [179673.337597] Lustre: Skipped 629 previous similar messages [179688.702844] Lustre: MGS: haven't heard from client e3242d1f-bdca-4a42-9a91-79e078549196 (at 10.9.103.24@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bdde1d400, cur 1576165927 expire 1576165777 last 1576165700 [179879.373920] Lustre: MGS: Received new LWP connection from 10.9.107.9@o2ib4, removing former export from same NID [179879.384193] Lustre: Skipped 2 previous similar messages [179885.375125] Lustre: fir-MDT0000: Client d833ee08-9e03-4 (at 10.9.107.9@o2ib4) reconnecting [179885.383480] Lustre: Skipped 4 previous similar messages [179891.374581] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.107.9@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [179891.391963] LustreError: Skipped 2 previous similar messages [180037.688909] Lustre: fir-MDT0000: haven't heard from client 4c5e6f33-2d0c-f229-3fed-c30688bbed72 (at 10.9.116.19@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887ba902d000, cur 1576166276 expire 1576166126 last 1576166049 [180037.710798] Lustre: Skipped 1 previous similar message [180050.684238] Lustre: MGS: haven't heard from client 5540cff7-da1b-d42f-e90b-ff6d64672f1b (at 10.9.102.16@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888be08d5000, cur 1576166289 expire 1576166139 last 1576166062 [180050.705446] Lustre: Skipped 18 previous similar messages [180113.689067] Lustre: fir-MDT0000: haven't heard from client 29e66763-b95c-3d3e-5532-53facc0d6b7a (at 10.9.109.32@o2ib4) in 220 seconds. I think it's dead, and I am evicting it. exp ffff887ba9607400, cur 1576166352 expire 1576166202 last 1576166132 [180113.710968] Lustre: Skipped 20 previous similar messages [180124.779547] LustreError: 42578:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576166063, 300s ago); not entering recovery in server code, just going back to sleep ns: MGS lock: ffff888b827572c0/0xc3c20c06c1d5896f lrc: 3/0,1 mode: --/EX res: [0x726966:0x2:0x0].0x0 rrc: 1257 type: PLN flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 42578 timeout: 0 lvb_type: 0 [180131.053585] LustreError: 166-1: MGC10.0.10.51@o2ib7: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [180131.066451] LustreError: 38884:0:(ldlm_request.c:147:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576166069, 300s ago), entering recovery for MGS@10.0.10.51@o2ib7 ns: MGC10.0.10.51@o2ib7 lock: ffff888bf5e3a400/0xc3c20c06c1d5a06e lrc: 4/1,0 mode: --/CR res: [0x726966:0x2:0x0].0x0 rrc: 2 type: PLN flags: 0x1000000000000 nid: local remote: 0xc3c20c06c1d5a075 expref: -99 pid: 38884 timeout: 0 lvb_type: 0 [180131.103834] LustreError: 96734:0:(ldlm_resource.c:1147:ldlm_resource_complain()) MGC10.0.10.51@o2ib7: namespace resource [0x726966:0x2:0x0].0x0 (ffff888bb2664180) refcount nonzero (1) after lock cleanup; forcing cleanup. 
[180264.443315] LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.9.107.9@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[180264.460690] LustreError: Skipped 5 previous similar messages
[180273.447598] Lustre: fir-MDT0000: Connection restored to (at 10.9.107.9@o2ib4)
[180273.454916] Lustre: Skipped 1254 previous similar messages
[180293.084175] Lustre: fir-MDT0000: haven't heard from client 646257db-4a10-1d7d-1435-2f2425d1bdb2 (at 10.8.18.26@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bdd4b2800, cur 1576166531 expire 1576166381 last 1576166304
[180293.105984] Lustre: Skipped 3 previous similar messages
[180435.077312] Lustre: MGS: Received new LWP connection from 10.8.23.20@o2ib6, removing former export from same NID
[180435.087574] Lustre: Skipped 1242 previous similar messages
[180436.315429] LustreError: 166-1: MGC10.0.10.51@o2ib7: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail
[180436.328298] LustreError: 38884:0:(ldlm_request.c:147:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576166374, 300s ago), entering recovery for MGS@10.0.10.51@o2ib7 ns: MGC10.0.10.51@o2ib7 lock: ffff888bd0ec1200/0xc3c20c06c1d7745d lrc: 4/1,0 mode: --/CR res: [0x726966:0x2:0x0].0x0 rrc: 2 type: PLN flags: 0x1000000000000 nid: local remote: 0xc3c20c06c1d77464 expref: -99 pid: 38884 timeout: 0 lvb_type: 0
[180436.365674] LustreError: 96855:0:(ldlm_resource.c:1147:ldlm_resource_complain()) MGC10.0.10.51@o2ib7: namespace resource [0x726966:0x2:0x0].0x0 (ffff888bb2664840) refcount nonzero (1) after lock cleanup; forcing cleanup.
[180451.560515] Lustre: fir-MDT0000: Client d833ee08-9e03-4 (at 10.9.107.9@o2ib4) reconnecting
[180451.568864] Lustre: Skipped 3 previous similar messages
[180742.347216] LustreError: 166-1: MGC10.0.10.51@o2ib7: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail
[180742.360098] LustreError: 38884:0:(ldlm_request.c:147:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576166680, 300s ago), entering recovery for MGS@10.0.10.51@o2ib7 ns: MGC10.0.10.51@o2ib7 lock: ffff887ad67e8480/0xc3c20c06c1d97ca9 lrc: 4/1,0 mode: --/CR res: [0x726966:0x2:0x0].0x0 rrc: 2 type: PLN flags: 0x1000000000000 nid: local remote: 0xc3c20c06c1d97cb0 expref: -99 pid: 38884 timeout: 0 lvb_type: 0
[180742.397473] LustreError: 96937:0:(ldlm_resource.c:1147:ldlm_resource_complain()) MGC10.0.10.51@o2ib7: namespace resource [0x726966:0x2:0x0].0x0 (ffff888bb2665ec0) refcount nonzero (1) after lock cleanup; forcing cleanup.
[180766.718483] Lustre: fir-MDT0000: haven't heard from client db44fcc6-df61-0a83-7c51-af3e9a77d479 (at 10.8.7.13@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba9776c00, cur 1576167005 expire 1576166855 last 1576166778
[180766.740198] Lustre: Skipped 29 previous similar messages
[180804.593586] LustreError: 39253:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576166742, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff886a6ee64800/0xc3c20c06c1da8f4d lrc: 3/1,0 mode: --/PR res: [0x200039577:0x11b6:0x0].0x0 bits 0x13/0x0 rrc: 32 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39253 timeout: 0 lvb_type: 0
[180834.557757] LustreError: 39416:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576166772, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff887ba4700d80/0xc3c20c06c1dab7f6 lrc: 3/1,0 mode: --/PR res: [0x200039577:0x11b6:0x0].0x0 bits 0x13/0x0 rrc: 32 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39416 timeout: 0 lvb_type: 0
[180844.355819] LustreError: 39250:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576166782, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff8852d79d4ec0/0xc3c20c06c1dabd8a lrc: 3/1,0 mode: --/PR res: [0x200039577:0x11b6:0x0].0x0 bits 0x13/0x0 rrc: 32 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39250 timeout: 0 lvb_type: 0
[180880.728866] Lustre: fir-OST0058-osc-MDT0000: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[180880.738785] Lustre: Skipped 2499 previous similar messages
[180883.307042] LustreError: 39265:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576166821, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff8877ada68000/0xc3c20c06c1db5f71 lrc: 3/1,0 mode: --/PR res: [0x200039577:0x11b6:0x0].0x0 bits 0x13/0x0 rrc: 35 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39265 timeout: 0 lvb_type: 0
[180890.548079] LNet: Service thread pid 39268 was inactive for 411.17s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[180890.565101] LNet: Skipped 1 previous similar message
[180890.570161] Pid: 39268, comm: mdt02_017 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[180890.580438] Call Trace:
[180890.582991] [] osp_precreate_reserve+0x2e8/0x800 [osp]
[180890.589911] [] osp_declare_create+0x199/0x5b0 [osp]
[180890.596573] [] lod_sub_declare_create+0xdf/0x210 [lod]
[180890.603490] [] lod_qos_declare_object_on+0xbe/0x3a0 [lod]
[180890.610695] [] lod_alloc_qos.constprop.18+0x10f4/0x1840 [lod]
[180890.618221] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[180890.625146] [] lod_declare_instantiate_components+0x9a/0x1d0 [lod]
[180890.633096] [] lod_declare_layout_change+0xb65/0x10f0 [lod]
[180890.640466] [] mdd_declare_layout_change+0x62/0x120 [mdd]
[180890.647641] [] mdd_layout_change+0x882/0x1000 [mdd]
[180890.654299] [] mdt_layout_change+0x337/0x430 [mdt]
[180890.660872] [] mdt_intent_layout+0x7ee/0xcc0 [mdt]
[180890.667449] [] mdt_intent_policy+0x435/0xd80 [mdt]
[180890.674012] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[180890.680885] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[180890.688093] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[180890.694359] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[180890.701382] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[180890.709196] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[180890.715613] [] kthread+0xd1/0xe0
[180890.720628] [] ret_from_fork_nospec_begin+0xe/0x21
[180890.727192] [] 0xffffffffffffffff
[180890.732314] LustreError: dumping log to /tmp/lustre-log.1576167129.39268
[180890.748094] Pid: 39269, comm: mdt02_018 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[180890.758387] Call Trace:
[180890.760937] [] call_rwsem_down_write_failed+0x17/0x30
[180890.767760] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[180890.775189] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[180890.782112] [] lod_declare_instantiate_components+0x9a/0x1d0 [lod]
[180890.790079] [] lod_declare_layout_change+0xb65/0x10f0 [lod]
[180890.797417] [] mdd_declare_layout_change+0x62/0x120 [mdd]
[180890.804600] [] mdd_layout_change+0x882/0x1000 [mdd]
[180890.811265] [] mdt_layout_change+0x337/0x430 [mdt]
[180890.817842] [] mdt_intent_layout+0x7ee/0xcc0 [mdt]
[180890.824408] [] mdt_intent_policy+0x435/0xd80 [mdt]
[180890.830996] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[180890.837849] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[180890.845055] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[180890.851308] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[180890.858344] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[180890.866146] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[180890.872587] [] kthread+0xd1/0xe0
[180890.877588] [] ret_from_fork_nospec_begin+0xe/0x21
[180890.884158] [] 0xffffffffffffffff
[180890.889262] Pid: 39375, comm: mdt02_034 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[180890.899551] Call Trace:
[180890.902101] [] call_rwsem_down_write_failed+0x17/0x30
[180890.908914] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[180890.916358] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[180890.923267] [] lod_declare_instantiate_components+0x9a/0x1d0 [lod]
[180890.931233] [] lod_declare_layout_change+0xb65/0x10f0 [lod]
[180890.938589] [] mdd_declare_layout_change+0x62/0x120 [mdd]
[180890.945772] [] mdd_layout_change+0x882/0x1000 [mdd]
[180890.952422] [] mdt_layout_change+0x337/0x430 [mdt]
[180890.959012] [] mdt_intent_layout+0x7ee/0xcc0 [mdt]
[180890.965580] [] mdt_intent_policy+0x435/0xd80 [mdt]
[180890.972152] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[180890.978995] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[180890.986194] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[180890.992439] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[180890.999473] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[180891.007276] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[180891.013702] [] kthread+0xd1/0xe0
[180891.018696] [] ret_from_fork_nospec_begin+0xe/0x21
[180891.025264] [] 0xffffffffffffffff
[180892.596099] LNet: Service thread pid 39382 was inactive for 412.93s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[180892.613124] LNet: Skipped 2 previous similar messages
[180892.618268] Pid: 39382, comm: mdt01_031 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[180892.628527] Call Trace:
[180892.631076] [] call_rwsem_down_write_failed+0x17/0x30
[180892.637895] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[180892.645322] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[180892.652229] [] lod_declare_instantiate_components+0x9a/0x1d0 [lod]
[180892.660192] [] lod_declare_layout_change+0xb65/0x10f0 [lod]
[180892.667534] [] mdd_declare_layout_change+0x62/0x120 [mdd]
[180892.674702] [] mdd_layout_change+0x882/0x1000 [mdd]
[180892.681351] [] mdt_layout_change+0x337/0x430 [mdt]
[180892.687928] [] mdt_intent_layout+0x7ee/0xcc0 [mdt]
[180892.694489] [] mdt_intent_policy+0x435/0xd80 [mdt]
[180892.701063] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[180892.707912] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[180892.715114] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[180892.721355] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[180892.728391] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[180892.736193] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[180892.742623] [] kthread+0xd1/0xe0
[180892.747615] [] ret_from_fork_nospec_begin+0xe/0x21
[180892.754191] [] 0xffffffffffffffff
[180892.759292] LustreError: dumping log to /tmp/lustre-log.1576167131.39382
[180892.766591] Pid: 39367, comm: mdt03_028 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[180892.776878] Call Trace:
[180892.779429] [] call_rwsem_down_write_failed+0x17/0x30
[180892.786241] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[180892.793687] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[180892.800596] [] lod_declare_instantiate_components+0x9a/0x1d0 [lod]
[180892.808571] [] lod_declare_layout_change+0xb65/0x10f0 [lod]
[180892.815910] [] mdd_declare_layout_change+0x62/0x120 [mdd]
[180892.823094] [] mdd_layout_change+0x882/0x1000 [mdd]
[180892.829741] [] mdt_layout_change+0x337/0x430 [mdt]
[180892.836317] [] mdt_intent_layout+0x7ee/0xcc0 [mdt]
[180892.842889] [] mdt_intent_policy+0x435/0xd80 [mdt]
[180892.849463] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[180892.856306] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[180892.863513] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[180892.869766] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[180892.876820] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[180892.884618] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[180892.891049] [] kthread+0xd1/0xe0
[180892.896041] [] ret_from_fork_nospec_begin+0xe/0x21
[180892.902609] [] 0xffffffffffffffff
[180897.770807] LNet: Service thread pid 39268 completed after 418.39s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[180897.787056] LNet: Skipped 1 previous similar message
[180898.740140] LNet: Service thread pid 38895 was inactive for 411.25s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[180898.753092] LustreError: dumping log to /tmp/lustre-log.1576167137.38895
[180899.764133] LNet: Service thread pid 39258 was inactive for 410.89s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[180899.777082] LustreError: dumping log to /tmp/lustre-log.1576167138.39258
[180906.932180] LNet: Service thread pid 39339 was inactive for 413.23s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[180906.945133] LustreError: dumping log to /tmp/lustre-log.1576167145.39339
[180909.757077] LNet: Service thread pid 39375 completed after 430.17s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[180909.773319] LNet: Skipped 1 previous similar message
[180917.172239] LNet: Service thread pid 39253 was inactive for 412.57s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[180917.185188] LustreError: dumping log to /tmp/lustre-log.1576167155.39253
[180933.365469] LNet: Service thread pid 39367 completed after 453.38s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[180944.683924] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.9.107.9@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[180944.701309] LustreError: Skipped 10 previous similar messages
[180945.844407] LNet: Service thread pid 39416 was inactive for 411.28s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[180945.857358] LNet: Skipped 1 previous similar message
[180945.862421] LustreError: dumping log to /tmp/lustre-log.1576167184.39416
[180957.108465] LNet: Service thread pid 39250 was inactive for 412.75s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[180957.121413] LustreError: dumping log to /tmp/lustre-log.1576167195.39250
[180986.804631] LNet: Service thread pid 39341 was inactive for 411.51s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[180986.817576] LustreError: dumping log to /tmp/lustre-log.1576167225.39341
[180990.161662] LustreError: 39378:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576166928, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff8875b2a03f00/0xc3c20c06c1dc286b lrc: 3/1,0 mode: --/PR res: [0x200039577:0x11b6:0x0].0x0 bits 0x13/0x0 rrc: 36 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39378 timeout: 0 lvb_type: 0
[180993.972672] LustreError: dumping log to /tmp/lustre-log.1576167232.39265
[181003.188724] LustreError: dumping log to /tmp/lustre-log.1576167241.39346
[181005.236734] LustreError: dumping log to /tmp/lustre-log.1576167243.39217
[181025.716854] LNet: Service thread pid 38891 was inactive for 413.14s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[181025.729804] LNet: Skipped 3 previous similar messages
[181025.734951] LustreError: dumping log to /tmp/lustre-log.1576167264.38891
[181026.940858] Lustre: 39215:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576167258/real 1576167258] req@ffff88785e337080 x1652542933315888/t0(0) o104->fir-MDT0000@10.9.115.1@o2ib4:15/16 lens 296/224 e 0 to 1 dl 1576167265 ref 1 fl Rpc:X/0/ffffffff rc 0/-1
[181026.968193] Lustre: 39215:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16 previous similar messages
[181033.365137] LNet: Service thread pid 39258 completed after 544.49s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[181043.728871] Lustre: MGS: Received new LWP connection from 10.9.107.9@o2ib4, removing former export from same NID
[181043.739135] Lustre: Skipped 2496 previous similar messages
[181051.888992] LustreError: 166-1: MGC10.0.10.51@o2ib7: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail
[181051.901861] LustreError: 38884:0:(ldlm_request.c:147:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576166990, 300s ago), entering recovery for MGS@10.0.10.51@o2ib7 ns: MGC10.0.10.51@o2ib7 lock: ffff888b89be4ec0/0xc3c20c06c1dccd0e lrc: 4/1,0 mode: --/CR res: [0x726966:0x2:0x0].0x0 rrc: 2 type: PLN flags: 0x1000000000000 nid: local remote: 0xc3c20c06c1dccd15 expref: -99 pid: 38884 timeout: 0 lvb_type: 0
[181051.939231] LustreError: 97158:0:(ldlm_resource.c:1147:ldlm_resource_complain()) MGC10.0.10.51@o2ib7: namespace resource [0x726966:0x2:0x0].0x0 (ffff888b8aab4240) refcount nonzero (1) after lock cleanup; forcing cleanup.
[181072.726015] Lustre: fir-MDT0000: haven't heard from client d59b4a25-94cd-9118-509c-0144bd0df5bb (at 10.9.109.19@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886be3ac3000, cur 1576167311 expire 1576167161 last 1576167084
[181072.747892] Lustre: Skipped 1 previous similar message
[181082.549183] Lustre: 39330:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff88799bae8480 x1649614765426528/t0(0) o101->5b41e348-8633-a21d-46d9-7918979d9d25@10.9.104.19@o2ib4:635/0 lens 376/1600 e 14 to 0 dl 1576167325 ref 2 fl Interpret:/0/0 rc 0/0
[181082.578362] Lustre: 39330:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 5 previous similar messages
[181088.553120] Lustre: fir-MDT0000: Client 5b41e348-8633-a21d-46d9-7918979d9d25 (at 10.9.104.19@o2ib4) reconnecting
[181088.563384] Lustre: Skipped 3 previous similar messages
[181088.937222] Lustre: 39351:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff885b8bbfad00 x1649498306499712/t0(0) o101->cf0dcba8-ff55-c75d-2ce2-0d11bb83fb82@10.9.102.21@o2ib4:642/0 lens 376/1600 e 14 to 0 dl 1576167332 ref 2 fl Interpret:/0/0 rc 0/0
[181098.965292] Lustre: 39323:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff886becfb9b00 x1648439898262768/t0(0) o101->5e845a5e-9c00-cc58-79df-d4d75fd3c1a1@10.8.27.4@o2ib6:652/0 lens 1792/3288 e 11 to 0 dl 1576167342 ref 2 fl Interpret:/0/0 rc 0/0
[181101.493282] LNet: Service thread pid 39378 was inactive for 411.33s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[181101.506230] LustreError: dumping log to /tmp/lustre-log.1576167339.39378
[181102.281303] LustreError: 39325:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576167040, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff885bd5349b00/0xc3c20c06c1dd425c lrc: 3/1,0 mode: --/PR res: [0x200039577:0x11b6:0x0].0x0 bits 0x13/0x0 rrc: 38 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39325 timeout: 0 lvb_type: 0
[181102.320949] LustreError: 39325:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 1 previous similar message
[181117.877381] LustreError: dumping log to /tmp/lustre-log.1576167356.39371
[181129.653460] Lustre: 39242:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff887b9b6eb600 x1648417756486160/t0(0) o101->4e97c29c-283b-4253-402d-db9d46beedd7@10.9.101.39@o2ib4:682/0 lens 600/3264 e 6 to 0 dl 1576167372 ref 2 fl Interpret:/0/0 rc 0/0
[181129.682555] Lustre: 39242:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message
[181133.365821] LNet: Service thread pid 39339 completed after 639.67s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[181133.382075] LNet: Skipped 11 previous similar messages
[181359.683933] Lustre: MGS: haven't heard from client 742b2cbc-624e-86be-da90-400c9fd59825 (at 10.9.114.2@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bc6bf5000, cur 1576167598 expire 1576167448 last 1576167371
[181359.705053] Lustre: Skipped 67 previous similar messages
[181359.870774] LustreError: 166-1: MGC10.0.10.51@o2ib7: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail
[181359.883647] LustreError: 38884:0:(ldlm_request.c:147:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576167298, 300s ago), entering recovery for MGS@10.0.10.51@o2ib7 ns: MGC10.0.10.51@o2ib7 lock: ffff888b8278a640/0xc3c20c06c1e92ca3 lrc: 4/1,0 mode: --/CR res: [0x726966:0x2:0x0].0x0 rrc: 2 type: PLN flags: 0x1000000000000 nid: local remote: 0xc3c20c06c1e92caa expref: -99 pid: 38884 timeout: 0 lvb_type: 0
[181359.921014] LustreError: 97304:0:(ldlm_resource.c:1147:ldlm_resource_complain()) MGC10.0.10.51@o2ib7: namespace resource [0x726966:0x2:0x0].0x0 (ffff888bca7fb500) refcount nonzero (1) after lock cleanup; forcing cleanup.
[181402.375031] Lustre: 38715:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576167633/real 1576167633] req@ffff887534b33180 x1652542933462384/t0(0) o41->fir-MDT0003-osp-MDT0000@10.0.10.54@o2ib7:24/4 lens 224/368 e 0 to 1 dl 1576167640 ref 1 fl Rpc:X/0/ffffffff rc 0/-1
[181402.403236] Lustre: 38715:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 6 previous similar messages
[181402.413066] Lustre: fir-MDT0003-osp-MDT0000: Connection to fir-MDT0003 (at 10.0.10.54@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[181402.429160] Lustre: Skipped 6 previous similar messages
[181480.777822] Lustre: MGS: Connection restored to 70ca4d0d-57d2-4178-fe01-a31f45306b60 (at 10.9.112.16@o2ib4)
[181480.787655] Lustre: Skipped 2446 previous similar messages
[181508.759645] LNetError: 38662:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Timed out tx: active_txs, 0 seconds
[181508.769907] LNetError: 38662:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Timed out RDMA with 10.0.10.54@o2ib7 (106): c: 4, oc: 0, rc: 8
[181508.782293] LNetError: 38671:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) lpni 10.0.10.54@o2ib7 added to recovery queue. Health = 900
[181509.755954] LNetError: 96587:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[181509.767960] LNetError: 96587:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 12 previous similar messages
[181545.740106] LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.9.104.25@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[181545.757565] LustreError: Skipped 143 previous similar messages
[181550.756203] LNetError: 96587:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[181550.768202] LNetError: 96587:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 4 previous similar messages
[181560.756170] Lustre: 42120:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1576167797/real 1576167799] req@ffff8858ccf1d100 x1652542933542448/t0(0) o105->MGS@10.0.10.54@o2ib7:15/16 lens 304/224 e 0 to 1 dl 1576167804 ref 1 fl Rpc:eX/0/ffffffff rc 0/-1
[181570.760021] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.54@o2ib7: 1 seconds
[181570.770196] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 56 previous similar messages
[181594.760150] LNetError: 38662:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 1 seconds
[181594.770248] LNetError: 38662:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Timed out RDMA with 10.0.10.54@o2ib7 (9): c: 0, oc: 0, rc: 8
[181605.760220] LNetError: 38662:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 0 seconds
[181605.770309] LNetError: 38662:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Timed out RDMA with 10.0.10.54@o2ib7 (10): c: 0, oc: 0, rc: 8
[181614.760274] LNetError: 38662:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 0 seconds
[181614.770359] LNetError: 38662:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Timed out RDMA with 10.0.10.54@o2ib7 (5): c: 0, oc: 0, rc: 8
[181628.760353] LNetError: 38662:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 0 seconds
[181628.770434] LNetError: 38662:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Timed out RDMA with 10.0.10.54@o2ib7 (5): c: 0, oc: 0, rc: 8
[181628.782691] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[181628.794732] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 12 previous similar messages
[181643.760441] LNetError: 38662:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 1 seconds
[181643.770532] LNetError: 38662:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Timed out RDMA with 10.0.10.54@o2ib7 (6): c: 0, oc: 0, rc: 8
[181671.760604] LNetError: 38662:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 1 seconds
[181671.770687] LNetError: 38662:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Skipped 1 previous similar message
[181671.780855] LNetError: 38662:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Timed out RDMA with 10.0.10.54@o2ib7 (6): c: 0, oc: 0, rc: 8
[181671.792838] LNetError: 38662:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Skipped 1 previous similar message
[181672.760615] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.54@o2ib7: 2 seconds
[181672.770784] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 14 previous similar messages
[181708.760823] LNetError: 38662:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 0 seconds
[181708.770906] LNetError: 38662:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Skipped 2 previous similar messages
[181708.781160] LNetError: 38662:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Timed out RDMA with 10.0.10.54@o2ib7 (7): c: 0, oc: 0, rc: 8
[181708.793144] LNetError: 38662:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Skipped 2 previous similar messages
[181777.761238] LNetError: 38662:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 0 seconds
[181777.771322] LNetError: 38662:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Skipped 4 previous similar messages
[181777.781603] LNetError: 38662:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Timed out RDMA with 10.0.10.54@o2ib7 (6): c: 0, oc: 0, rc: 8
[181777.793603] LNetError: 38662:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Skipped 4 previous similar messages
[181792.761577] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[181792.773579] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 12 previous similar messages
[181861.806766] Lustre: 42120:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576168093/real 1576168093] req@ffff88560703b600 x1652542933636560/t0(0) o105->MGS@10.0.10.54@o2ib7:15/16 lens 304/224 e 0 to 1 dl 1576168100 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[181861.833435] Lustre: 42120:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 382 previous similar messages
[181884.342927] Lustre: fir-MDT0000: Received new LWP connection from 10.0.10.54@o2ib7, removing former export from same NID
[181884.353881] Lustre: Skipped 2426 previous similar messages
[181888.689167] Lustre: fir-MDT0000: haven't heard from client a1acf167-afde-6f5a-879d-1a7c0814f282 (at 10.9.117.21@o2ib4) in 227 seconds. I think it's dead, and I am evicting it.
exp ffff887ba9601c00, cur 1576168127 expire 1576167977 last 1576167900 [181888.711064] Lustre: Skipped 11 previous similar messages [182038.632878] LustreError: 42578:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) ### client (nid 10.9.107.9@o2ib4) failed to reply to blocking AST (req@ffff888b75b51f80 x1652542933790336 status 0 rc -110), evict it ns: MGS lock: ffff888bf2303840/0xc3c20c06c206bf5b lrc: 4/0,0 mode: CR/CR res: [0x726966:0x2:0x0].0x0 rrc: 1227 type: PLN flags: 0x40000400000020 nid: 10.9.107.9@o2ib4 remote: 0x1f531de89b55b22a expref: 17 pid: 42501 timeout: 0 lvb_type: 0 [182038.672324] LustreError: 138-a: MGS: A client on nid 10.9.107.9@o2ib4 was evicted due to a lock blocking callback time out: rc -110 [182038.684248] LustreError: 38883:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 1576168276s: evicting client at 10.9.107.9@o2ib4 ns: MGS lock: ffff888bf2303840/0xc3c20c06c206bf5b lrc: 4/0,0 mode: CR/CR res: [0x726966:0x2:0x0].0x0 rrc: 1228 type: PLN flags: 0x40000400000020 nid: 10.9.107.9@o2ib4 remote: 0x1f531de89b55b22a expref: 18 pid: 42501 timeout: 0 lvb_type: 0 [182140.801084] Lustre: MGS: Connection restored to (at 10.9.104.42@o2ib4) [182140.807798] Lustre: Skipped 28 previous similar messages [182191.535716] LustreError: 166-1: MGC10.0.10.51@o2ib7: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [182191.548596] LustreError: 38884:0:(ldlm_request.c:147:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576168129, 300s ago), entering recovery for MGS@10.0.10.51@o2ib7 ns: MGC10.0.10.51@o2ib7 lock: ffff888be0300480/0xc3c20c06c2758fad lrc: 4/1,0 mode: --/CR res: [0x726966:0x2:0x0].0x0 rrc: 2 type: PLN flags: 0x1000000000000 nid: local remote: 0xc3c20c06c2758fb4 expref: -99 pid: 38884 timeout: 0 lvb_type: 0 [182191.586008] LustreError: 97965:0:(ldlm_resource.c:1147:ldlm_resource_complain()) MGC10.0.10.51@o2ib7: namespace resource [0x726966:0x2:0x0].0x0 
(ffff888b813400c0) refcount nonzero (1) after lock cleanup; forcing cleanup. [182338.686566] LustreError: 42578:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576168276, 300s ago); not entering recovery in server code, just going back to sleep ns: MGS lock: ffff888bde6b2ac0/0xc3c20c06c2750c27 lrc: 3/0,1 mode: --/EX res: [0x726966:0x2:0x0].0x0 rrc: 2443 type: PLN flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 42578 timeout: 0 lvb_type: 0 [182461.851303] Lustre: 42120:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576168693/real 1576168693] req@ffff88571b6bbf00 x1652542933546944/t0(0) o105->MGS@10.9.107.9@o2ib4:15/16 lens 304/224 e 0 to 1 dl 1576168700 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [182461.877974] Lustre: 42120:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 551 previous similar messages [182492.694176] Lustre: fir-MDT0000: haven't heard from client dec5062c-f101-0dc5-128b-72e40bd60a5a (at 10.9.112.12@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff885bde256400, cur 1576168731 expire 1576168581 last 1576168504 [182492.716070] Lustre: Skipped 7 previous similar messages [182494.371740] Lustre: MGS: Received new LWP connection from 10.9.105.17@o2ib4, removing former export from same NID [182494.382093] Lustre: Skipped 1218 previous similar messages [182498.687488] LustreError: 166-1: MGC10.0.10.51@o2ib7: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [182498.700357] LustreError: 38884:0:(ldlm_request.c:147:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576168436, 301s ago), entering recovery for MGS@10.0.10.51@o2ib7 ns: MGC10.0.10.51@o2ib7 lock: ffff8863c6899b00/0xc3c20c06c28a681c lrc: 4/1,0 mode: --/CR res: [0x726966:0x2:0x0].0x0 rrc: 2 type: PLN flags: 0x1000000000000 nid: local remote: 0xc3c20c06c28a6823 expref: -99 pid: 38884 timeout: 0 lvb_type: 0 [182498.737752] LustreError: 98064:0:(ldlm_resource.c:1147:ldlm_resource_complain()) MGC10.0.10.51@o2ib7: namespace resource [0x726966:0x2:0x0].0x0 (ffff888bb17a7ec0) refcount nonzero (1) after lock cleanup; forcing cleanup. 
[182747.739349] Lustre: MGS: Connection restored to (at 10.9.117.11@o2ib4)
[182747.746052] Lustre: Skipped 2480 previous similar messages
[182808.559277] LustreError: 166-1: MGC10.0.10.51@o2ib7: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail
[182808.572149] LustreError: 38884:0:(ldlm_request.c:147:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576168746, 300s ago), entering recovery for MGS@10.0.10.51@o2ib7 ns: MGC10.0.10.51@o2ib7 lock: ffff888bbaf9dc40/0xc3c20c06c2adfd45 lrc: 4/1,0 mode: --/CR res: [0x726966:0x2:0x0].0x0 rrc: 2 type: PLN flags: 0x1000000000000 nid: local remote: 0xc3c20c06c2adfd4c expref: -99 pid: 38884 timeout: 0 lvb_type: 0
[182808.609514] LustreError: 98237:0:(ldlm_resource.c:1147:ldlm_resource_complain()) MGC10.0.10.51@o2ib7: namespace resource [0x726966:0x2:0x0].0x0 (ffff888bf8ed7080) refcount nonzero (1) after lock cleanup; forcing cleanup.
[183063.892705] Lustre: 42120:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576169295/real 1576169295] req@ffff88571b6bbf00 x1652542933546944/t0(0) o105->MGS@10.9.107.9@o2ib4:15/16 lens 304/224 e 0 to 1 dl 1576169302 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[183063.919369] Lustre: 42120:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 429 previous similar messages
[183096.405236] Lustre: MGS: Received new LWP connection from 10.9.101.63@o2ib4, removing former export from same NID
[183096.415586] Lustre: Skipped 2462 previous similar messages
[183113.640958] LustreError: 166-1: MGC10.0.10.51@o2ib7: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail
[183113.653830] LustreError: 38884:0:(ldlm_request.c:147:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576169051, 300s ago), entering recovery for MGS@10.0.10.51@o2ib7 ns: MGC10.0.10.51@o2ib7 lock: ffff888becfaf2c0/0xc3c20c06c2c33140 lrc: 4/1,0 mode: --/CR res: [0x726966:0x2:0x0].0x0 rrc: 2 type: PLN flags: 0x1000000000000 nid: local remote: 0xc3c20c06c2c33147 expref: -99 pid: 38884 timeout: 0 lvb_type: 0
[183113.691203] LustreError: 98325:0:(ldlm_resource.c:1147:ldlm_resource_complain()) MGC10.0.10.51@o2ib7: namespace resource [0x726966:0x2:0x0].0x0 (ffff888b86b8e0c0) refcount nonzero (1) after lock cleanup; forcing cleanup.
[183196.929822] LustreError: 42120:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) ### client (nid 10.9.107.9@o2ib4) returned error from completion AST (req@ffff88571b6bb600 x1652542933546976 status -107 rc -107), evict it ns: MGS lock: ffff888bf5dd1680/0xc3c20c06c20386c0 lrc: 3/0,0 mode: CR/CR res: [0x726966:0x2:0x0].0x0 rrc: 6164 type: PLN flags: 0x40000400000020 nid: 10.9.107.9@o2ib4 remote: 0x1f531de89b55b1f9 expref: 17 pid: 42580 timeout: 0 lvb_type: 0
[183196.969728] LustreError: 42120:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) Skipped 4 previous similar messages
[183196.979905] LustreError: 138-a: MGS: A client on nid 10.9.107.9@o2ib4 was evicted due to a lock completion callback time out: rc -107
[183196.992001] LustreError: Skipped 4 previous similar messages
[183196.997774] LustreError: 38883:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 1576169435s: evicting client at 10.9.107.9@o2ib4 ns: MGS lock: ffff888bf5dd1680/0xc3c20c06c20386c0 lrc: 3/0,0 mode: CR/CR res: [0x726966:0x2:0x0].0x0 rrc: 6165 type: PLN flags: 0x40000400000020 nid: 10.9.107.9@o2ib4 remote: 0x1f531de89b55b1f9 expref: 18 pid: 42580 timeout: 0 lvb_type: 0
[183197.032831] LustreError: 38883:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 4 previous similar messages
[183360.256055] Lustre: MGS: Connection restored to (at 10.9.116.11@o2ib4)
[183360.262766] Lustre: Skipped 2531 previous similar messages
[183642.706852] Lustre: fir-MDT0000: haven't heard from client 295209bb-0224-d868-bd7c-cd75c3b19a1c (at 10.8.18.20@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba9359000, cur 1576169881 expire 1576169731 last 1576169654
[183642.728646] Lustre: Skipped 5 previous similar messages
[184050.799745] Lustre: MGS: Connection restored to (at 10.9.115.12@o2ib4)
[184050.806452] Lustre: Skipped 21 previous similar messages
[184663.652062] Lustre: MGS: Connection restored to b9b67222-dc5d-c9e8-945b-377220afc943 (at 10.8.20.25@o2ib6)
[184663.661810] Lustre: Skipped 3 previous similar messages
[185122.706017] Lustre: fir-MDT0000: haven't heard from client 75167b5d-e2d7-d704-ea07-95d8feb377a6 (at 10.9.102.1@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba90a5800, cur 1576171361 expire 1576171211 last 1576171134
[185122.727814] Lustre: Skipped 1 previous similar message
[185277.717688] Lustre: MGS: haven't heard from client 55529a98-2a28-a963-7acd-1b84cd50762d (at 10.8.7.19@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bf5a04400, cur 1576171516 expire 1576171366 last 1576171289
[185277.738716] Lustre: Skipped 1 previous similar message
[185552.714145] Lustre: fir-MDT0000: haven't heard from client 3fa61b7b-3364-0c3e-efb9-55ce1343c799 (at 10.8.23.34@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888c3fc72c00, cur 1576171791 expire 1576171641 last 1576171564
[185552.735955] Lustre: Skipped 1 previous similar message
[185692.653062] Lustre: MGS: Connection restored to 295209bb-0224-d868-bd7c-cd75c3b19a1c (at 10.8.18.20@o2ib6)
[185692.662805] Lustre: Skipped 3 previous similar messages
[186172.713172] Lustre: MGS: haven't heard from client 9c024261-121c-46e9-5dee-41ee02e3e326 (at 10.8.18.18@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885be4453400, cur 1576172411 expire 1576172261 last 1576172184
[186172.734280] Lustre: Skipped 1 previous similar message
[186830.716599] Lustre: fir-MDT0000: haven't heard from client ef78dfe0-80b9-391e-81c2-9236655a36fe (at 10.9.103.59@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bed3da800, cur 1576173069 expire 1576172919 last 1576172842
[186830.738493] Lustre: Skipped 1 previous similar message
[187014.787837] Lustre: MGS: Connection restored to (at 10.9.104.7@o2ib4)
[187014.794456] Lustre: Skipped 1 previous similar message
[187447.333649] Lustre: MGS: Connection restored to 75167b5d-e2d7-d704-ea07-95d8feb377a6 (at 10.9.102.1@o2ib4)
[187447.343393] Lustre: Skipped 5 previous similar messages
[187657.041464] Lustre: MGS: Connection restored to 3fa61b7b-3364-0c3e-efb9-55ce1343c799 (at 10.8.23.34@o2ib6)
[187657.051212] Lustre: Skipped 1 previous similar message
[187754.734177] Lustre: MGS: haven't heard from client d84068f9-facb-1706-cbe7-745525e4a5c1 (at 10.8.27.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bc761c400, cur 1576173993 expire 1576173843 last 1576173766
[187754.755278] Lustre: Skipped 1 previous similar message
[188334.192210] Lustre: MGS: Connection restored to ee8a8d10-65c2-ae96-bc67-9f6bae32e110 (at 10.8.18.18@o2ib6)
[188334.201957] Lustre: Skipped 5 previous similar messages
[188958.733283] Lustre: MGS: haven't heard from client 5708211f-0df6-e95b-8bc0-a86ba2362e40 (at 10.9.101.42@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888be1739400, cur 1576175197 expire 1576175047 last 1576174970
[188958.754471] Lustre: Skipped 5 previous similar messages
[189848.427626] Lustre: MGS: Connection restored to 227d7a25-50be-a469-9b6d-83846499cd76 (at 10.8.27.14@o2ib6)
[189848.437367] Lustre: Skipped 3 previous similar messages
[190340.477618] Lustre: MGS: Connection restored to (at 10.9.116.14@o2ib4)
[190340.484328] Lustre: Skipped 3 previous similar messages
[190384.735976] Lustre: fir-MDT0000: haven't heard from client 5d110741-f52f-a556-c0fd-775bc1eebbda (at 10.9.105.33@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba92e7400, cur 1576176623 expire 1576176473 last 1576176396
[190384.757859] Lustre: Skipped 19 previous similar messages
[190533.780560] LustreError: 39258:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576176472, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff887bd17e18c0/0xc3c20c06c69f66cf lrc: 3/1,0 mode: --/PR res: [0x2000376b8:0x1706e:0x0].0x0 bits 0x13/0x0 rrc: 27 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39258 timeout: 0 lvb_type: 0
[190572.254764] LustreError: 38892:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576176510, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff885b969398c0/0xc3c20c06c6a1e167 lrc: 3/1,0 mode: --/PR res: [0x200037a5a:0xbae0:0x0].0x0 bits 0x13/0x0 rrc: 6 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 38892 timeout: 0 lvb_type: 0
[190573.613280] Lustre: MGS: Connection restored to ce5ee768-37d0-d480-6e14-e3a25f5ac36c (at 10.9.117.30@o2ib4)
[190573.623118] Lustre: Skipped 5 previous similar messages
[190590.950875] LustreError: 88947:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576176529, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff88687b7960c0/0xc3c20c06c6a35857 lrc: 3/0,1 mode: --/CW res: [0x2000376b8:0x1706e:0x0].0x0 bits 0x2/0x0 rrc: 27 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 88947 timeout: 0 lvb_type: 0
[190590.990520] LustreError: 88947:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 2 previous similar messages
[190624.271078] LustreError: 97383:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576176562, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff886bf7e2bcc0/0xc3c20c06c6a59714 lrc: 3/0,1 mode: --/CW res: [0x2000376b8:0x1706e:0x0].0x0 bits 0x2/0x0 rrc: 28 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 97383 timeout: 0 lvb_type: 0
[190632.938115] LNet: Service thread pid 39258 was inactive for 399.15s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[190632.955139] LNet: Skipped 1 previous similar message
[190632.960203] Pid: 39258, comm: mdt02_014 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[190632.970500] Call Trace:
[190632.973062] [] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[190632.980122] [] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[190632.987425] [] mdt_object_local_lock+0x50b/0xb20 [mdt]
[190632.994362] [] mdt_object_lock_internal+0x70/0x360 [mdt]
[190633.001456] [] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[190633.008483] [] mdt_intent_getattr+0x2b5/0x480 [mdt]
[190633.015168] [] mdt_intent_policy+0x435/0xd80 [mdt]
[190633.021776] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[190633.028626] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[190633.035842] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[190633.042103] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[190633.049153] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[190633.056957] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[190633.063394] [] kthread+0xd1/0xe0
[190633.068399] [] ret_from_fork_nospec_begin+0xe/0x21
[190633.074983] [] 0xffffffffffffffff
[190633.080118] LustreError: dumping log to /tmp/lustre-log.1576176871.39258
[190651.370221] LNet: Service thread pid 39336 was inactive for 398.08s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[190651.387265] Pid: 39336, comm: mdt02_024 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[190651.397528] Call Trace:
[190651.400082] [] call_rwsem_down_write_failed+0x17/0x30
[190651.406907] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[190651.414359] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[190651.421269] [] lod_prepare_create+0x215/0x2e0 [lod]
[190651.427930] [] lod_declare_striped_create+0x1ee/0x980 [lod]
[190651.435275] [] lod_declare_create+0x204/0x590 [lod]
[190651.441950] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[190651.449912] [] mdd_declare_create+0x4c/0xcb0 [mdd]
[190651.456497] [] mdd_create+0x847/0x14e0 [mdd]
[190651.462537] [] mdt_reint_open+0x224f/0x3240 [mdt]
[190651.469035] [] mdt_reint_rec+0x83/0x210 [mdt]
[190651.475174] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[190651.481845] [] mdt_intent_open+0x82/0x3a0 [mdt]
[190651.488159] [] mdt_intent_policy+0x435/0xd80 [mdt]
[190651.494744] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[190651.501602] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[190651.508822] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[190651.515077] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[190651.522120] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[190651.529921] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[190651.536350] [] kthread+0xd1/0xe0
[190651.541353] [] ret_from_fork_nospec_begin+0xe/0x21
[190651.547930] [] 0xffffffffffffffff
[190651.553041] LustreError: dumping log to /tmp/lustre-log.1576176889.39336
[190653.255674] LNet: Service thread pid 39336 completed after 399.97s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[190653.271927] LNet: Skipped 1 previous similar message
[190659.166277] LustreError: 98146:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576176597, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff885b977a1200/0xc3c20c06c6a7b6f2 lrc: 3/1,0 mode: --/PR res: [0x2000376b8:0x1706e:0x0].0x0 bits 0x13/0x0 rrc: 30 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 98146 timeout: 0 lvb_type: 0
[190724.074638] LNet: Service thread pid 39232 was inactive for 450.85s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[190724.091666] Pid: 39232, comm: mdt01_014 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[190724.101922] Call Trace:
[190724.104481] [] osp_precreate_reserve+0x2e8/0x800 [osp]
[190724.111399] [] osp_declare_create+0x199/0x5b0 [osp]
[190724.118066] [] lod_sub_declare_create+0xdf/0x210 [lod]
[190724.124983] [] lod_qos_declare_object_on+0xbe/0x3a0 [lod]
[190724.132173] [] lod_alloc_qos.constprop.18+0x10f4/0x1840 [lod]
[190724.139704] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[190724.146632] [] lod_prepare_create+0x215/0x2e0 [lod]
[190724.153290] [] lod_declare_striped_create+0x1ee/0x980 [lod]
[190724.160639] [] lod_declare_create+0x204/0x590 [lod]
[190724.167293] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[190724.175267] [] mdd_declare_create+0x4c/0xcb0 [mdd]
[190724.181854] [] mdd_create+0x847/0x14e0 [mdd]
[190724.187910] [] mdt_reint_open+0x224f/0x3240 [mdt]
[190724.194402] [] mdt_reint_rec+0x83/0x210 [mdt]
[190724.200553] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[190724.207231] [] mdt_intent_open+0x82/0x3a0 [mdt]
[190724.213554] [] mdt_intent_policy+0x435/0xd80 [mdt]
[190724.220125] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[190724.226986] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[190724.234183] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[190724.240448] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[190724.247477] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[190724.255292] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[190724.261709] [] kthread+0xd1/0xe0
[190724.266723] [] ret_from_fork_nospec_begin+0xe/0x21
[190724.273303] [] 0xffffffffffffffff
[190724.278433] LustreError: dumping log to /tmp/lustre-log.1576176962.39232
[190753.257101] LNet: Service thread pid 39232 completed after 480.03s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[191123.258200] Lustre: MGS: Connection restored to (at 10.9.115.12@o2ib4)
[191123.264906] Lustre: Skipped 13 previous similar messages
[191788.748966] Lustre: fir-MDT0000: haven't heard from client e8e18d90-dcac-7195-a7b7-bbaf10be70ce (at 10.9.103.52@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886be3ac0c00, cur 1576178027 expire 1576177877 last 1576177800
[191788.770847] Lustre: Skipped 1 previous similar message
[192216.177004] Lustre: MGS: Connection restored to 5d110741-f52f-a556-c0fd-775bc1eebbda (at 10.9.105.33@o2ib4)
[192216.186835] Lustre: Skipped 1 previous similar message
[193417.364701] Lustre: MGS: Connection restored to e8e18d90-dcac-7195-a7b7-bbaf10be70ce (at 10.9.103.52@o2ib4)
[193417.374529] Lustre: Skipped 1 previous similar message
[193579.756119] Lustre: MGS: haven't heard from client 9a79f7a1-9fac-50b7-c195-aa7bdae4f43f (at 10.8.7.7@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bf8834400, cur 1576179818 expire 1576179668 last 1576179591
[193579.777038] Lustre: Skipped 1 previous similar message
[193739.584837] Lustre: 97348:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576179970/real 1576179970] req@ffff886bcde4ec00 x1652542944106224/t0(0) o104->fir-MDT0000@10.9.101.46@o2ib4:15/16 lens 296/224 e 0 to 1 dl 1576179977 ref 1 fl Rpc:X/0/ffffffff rc 0/-1
[193739.612277] Lustre: 97348:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 99 previous similar messages
[193780.754672] Lustre: MGS: haven't heard from client ebc1ca27-139b-33a6-84f1-99529f6e5ea6 (at 10.9.101.46@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bc6bdf800, cur 1576180019 expire 1576179869 last 1576179792
[193780.775882] Lustre: Skipped 1 previous similar message
[193781.767459] LustreError: 97348:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) ### client (nid 10.9.101.46@o2ib4) failed to reply to blocking AST (req@ffff886bcde4ec00 x1652542944106224 status 0 rc -5), evict it ns: mdt-fir-MDT0000_UUID lock: ffff885d51e0fbc0/0xc3c20c06c74b6efb lrc: 4/0,0 mode: PR/PR res: [0x200000406:0x1b2:0x0].0x0 bits 0x13/0x0 rrc: 19 type: IBT flags: 0x60200400000020 nid: 10.9.101.46@o2ib4 remote: 0x363c394bde5a12c6 expref: 1369 pid: 88948 timeout: 194034 lvb_type: 0
[193781.810399] LustreError: 97348:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) Skipped 4 previous similar messages
[193781.820576] LustreError: 138-a: fir-MDT0000: A client on nid 10.9.101.46@o2ib4 was evicted due to a lock blocking callback time out: rc -5
[193781.833087] LustreError: Skipped 4 previous similar messages
[194028.756572] Lustre: MGS: haven't heard from client ab6cce31-df0e-ae34-d69d-c23500355ff1 (at 10.9.101.26@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885beabcec00, cur 1576180267 expire 1576180117 last 1576180040
[194028.777774] Lustre: Skipped 1 previous similar message
[194880.761885] Lustre: MGS: haven't heard from client acf051dc-6a1e-bbe8-7a61-8e031fd79e86 (at 10.8.7.15@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bf5a01000, cur 1576181119 expire 1576180969 last 1576180892
[194880.782893] Lustre: Skipped 1 previous similar message
[195330.518138] Lustre: MGS: Connection restored to c804f06b-97c0-205b-aa77-e2392ade35bd (at 10.8.7.7@o2ib6)
[195330.527710] Lustre: Skipped 1 previous similar message
[196044.098798] Lustre: MGS: Connection restored to 1d444526-0c94-9229-34be-9d214c0c6bbd (at 10.9.101.46@o2ib4)
[196044.108630] Lustre: Skipped 1 previous similar message
[196392.063661] Lustre: MGS: Connection restored to 7126efc2-9676-1db9-94d0-ae09c1520697 (at 10.9.101.26@o2ib4)
[196392.073495] Lustre: Skipped 1 previous similar message
[196608.151551] Lustre: MGS: Connection restored to 02eb8135-4034-bcb2-8df8-77d00506e76a (at 10.8.7.15@o2ib6)
[196608.161205] Lustre: Skipped 1 previous similar message
[196659.389919] Lustre: MGS: Connection restored to (at 10.8.22.31@o2ib6)
[196659.396538] Lustre: Skipped 1 previous similar message
[196741.253783] Lustre: 97355:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576182972/real 1576182972] req@ffff886bd3fe8d80 x1652542946361344/t0(0) o1000->fir-MDT0001-osp-MDT0000@10.0.10.52@o2ib7:24/4 lens 304/4320 e 0 to 1 dl 1576182979 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
[196741.282268] Lustre: 97355:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 6 previous similar messages
[196741.292099] Lustre: fir-MDT0001-osp-MDT0000: Connection to fir-MDT0001 (at 10.0.10.52@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[196766.349046] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
[196766.365371] LustreError: Skipped 1109 previous similar messages
[196842.278429] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.110.51@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[196842.295887] LustreError: Skipped 120 previous similar messages
[196846.845385] LNetError: 38662:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Timed out tx: active_txs, 0 seconds
[196846.855643] LNetError: 38662:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Skipped 3 previous similar messages
[196846.865898] LNetError: 38662:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Timed out RDMA with 10.0.10.52@o2ib7 (105): c: 4, oc: 0, rc: 8
[196846.878061] LNetError: 38662:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Skipped 3 previous similar messages
[196846.888079] LNetError: 38668:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) lpni 10.0.10.52@o2ib7 added to recovery queue. Health = 900
[196847.146766] LNetError: 101736:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[196847.158844] LNetError: 101736:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 2 previous similar messages
[196891.152961] LNetError: 101736:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[196891.165068] LNetError: 101736:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 5 previous similar messages
[196911.845762] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.52@o2ib7: 0 seconds
[196911.855930] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 2 previous similar messages
[196927.845845] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.52@o2ib7: 1 seconds
[196938.890326] Lustre: MGS: Connection restored to (at 10.9.103.28@o2ib4)
[196938.897041] Lustre: Skipped 1 previous similar message
[196944.852408] Lustre: MGS: haven't heard from client b6936c9e-bc4f-ad29-5bfd-ac26e88c91e0 (at 10.0.10.52@o2ib7) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bd0acd000, cur 1576183183 expire 1576183033 last 1576182956
[196944.873530] Lustre: Skipped 1 previous similar message
[196947.845973] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.52@o2ib7: 0 seconds
[196947.856143] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 1 previous similar message
[196992.872339] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.8.27.18@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[196992.889714] LustreError: Skipped 182 previous similar messages
[197046.846541] LNetError: 38662:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 0 seconds
[197046.856626] LNetError: 38662:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Timed out RDMA with 10.0.10.52@o2ib7 (20): c: 0, oc: 0, rc: 8
[197046.868964] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[197046.880991] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 4 previous similar messages
[197145.615094] LNet: Service thread pid 97355 was inactive for 411.35s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[197145.632114] Pid: 97355, comm: mdt00_051 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[197145.642377] Call Trace:
[197145.644936] [] ptlrpc_set_wait+0x480/0x790 [ptlrpc]
[197145.651629] [] ptlrpc_queue_wait+0x83/0x230 [ptlrpc]
[197145.658407] [] osp_remote_sync+0xd3/0x200 [osp]
[197145.664717] [] osp_attr_get+0x463/0x730 [osp]
[197145.670858] [] osp_object_init+0x16d/0x2d0 [osp]
[197145.677267] [] lu_object_start.isra.35+0x8b/0x120 [obdclass]
[197145.684736] [] lu_object_find_at+0x1e1/0xa60 [obdclass]
[197145.691753] [] lu_object_find_slice+0x1f/0x90 [obdclass]
[197145.698871] [] mdd_object_find+0x10/0x70 [mdd]
[197145.705094] [] obf_lookup+0x2c9/0x350 [mdd]
[197145.711082] [] mdt_getattr_name_lock+0xf7c/0x1c30 [mdt]
[197145.718101] [] mdt_intent_getattr+0x2b5/0x480 [mdt]
[197145.724773] [] mdt_intent_policy+0x435/0xd80 [mdt]
[197145.731339] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[197145.738215] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[197145.745414] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[197145.751693] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[197145.758725] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[197145.766554] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[197145.772975] [] kthread+0xd1/0xe0
[197145.777973] [] ret_from_fork_nospec_begin+0xe/0x21
[197145.784543] [] 0xffffffffffffffff
[197145.789649] LustreError: dumping log to /tmp/lustre-log.1576183384.97355
[197146.847120] LNetError: 38662:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 0 seconds
[197146.857200] LNetError: 38662:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Timed out RDMA with 10.0.10.52@o2ib7 (15): c: 0, oc: 0, rc: 8
[197293.513298] Lustre: MGS: Connection restored to 10.0.10.52@o2ib7 (at 10.0.10.52@o2ib7)
[197293.521307] Lustre: Skipped 1 previous similar message
[197293.930074] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.8.27.18@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[197293.947448] LustreError: Skipped 550 previous similar messages
[197294.031190] Lustre: fir-MDT0000: Received new LWP connection from 10.0.10.52@o2ib7, removing former export from same NID
[197294.042150] Lustre: Skipped 1222 previous similar messages
[197328.912146] Lustre: 39399:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff8852966f9200 x1652162130109376/t0(0) o101->ae1d0080-04fa-5436-e145-ffdf0db9990d@10.0.10.3@o2ib7:272/0 lens 600/3264 e 14 to 0 dl 1576183572 ref 2 fl Interpret:/0/0 rc 0/0
[197335.303077] Lustre: fir-MDT0000: Client ae1d0080-04fa-5436-e145-ffdf0db9990d (at 10.0.10.3@o2ib7) reconnecting
[197335.313206] Lustre: Skipped 2 previous similar messages
[197398.028251] LNet: Service thread pid 97355 completed after 663.77s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[197496.849083] LNet: Service thread pid 39417 was inactive for 200.09s. The thread might be hung, or it might only be slow and will resume later.
Dumping the stack trace for debugging purposes: [197496.866100] Pid: 39417, comm: mdt03_042 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [197496.876361] Call Trace: [197496.878911] [] call_rwsem_down_write_failed+0x17/0x30 [197496.885737] [] lod_qos_statfs_update+0x97/0x2b0 [lod] [197496.892580] [] lod_qos_prep_create+0x16a/0x1890 [lod] [197496.899412] [] lod_prepare_create+0x215/0x2e0 [lod] [197496.906077] [] lod_declare_striped_create+0x1ee/0x980 [lod] [197496.913417] [] lod_declare_create+0x204/0x590 [lod] [197496.920080] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [197496.928030] [] mdd_declare_create+0x4c/0xcb0 [mdd] [197496.934606] [] mdd_create+0x847/0x14e0 [mdd] [197496.940647] [] mdt_reint_open+0x224f/0x3240 [mdt] [197496.947154] [] mdt_reint_rec+0x83/0x210 [mdt] [197496.953292] [] mdt_reint_internal+0x6e3/0xaf0 [mdt] [197496.959963] [] mdt_intent_open+0x82/0x3a0 [mdt] [197496.966276] [] mdt_intent_policy+0x435/0xd80 [mdt] [197496.972860] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [197496.979718] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [197496.986940] [] tgt_enqueue+0x62/0x210 [ptlrpc] [197496.993201] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [197497.000233] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [197497.008046] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [197497.014463] [] kthread+0xd1/0xe0 [197497.019478] [] ret_from_fork_nospec_begin+0xe/0x21 [197497.026042] [] 0xffffffffffffffff [197497.031163] LustreError: dumping log to /tmp/lustre-log.1576183735.39417 [197497.873110] LNet: Service thread pid 39358 was inactive for 200.58s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [197497.890129] Pid: 39358, comm: mdt03_025 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [197497.900389] Call Trace: [197497.902947] [] ldlm_completion_ast+0x430/0x860 [ptlrpc] [197497.909988] [] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [197497.917292] [] mdt_object_local_lock+0x50b/0xb20 [mdt] [197497.924209] [] mdt_object_lock_internal+0x70/0x360 [mdt] [197497.931316] [] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [197497.938319] [] mdt_intent_getattr+0x2b5/0x480 [mdt] [197497.944991] [] mdt_intent_policy+0x435/0xd80 [mdt] [197497.951553] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [197497.958415] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [197497.965625] [] tgt_enqueue+0x62/0x210 [ptlrpc] [197497.971901] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [197497.978931] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [197497.986746] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [197497.993163] [] kthread+0xd1/0xe0 [197497.998177] [] ret_from_fork_nospec_begin+0xe/0x21 [197498.004743] [] 0xffffffffffffffff [197498.009861] LustreError: dumping log to /tmp/lustre-log.1576183736.39358 [197498.017148] Pid: 39275, comm: mdt02_020 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [197498.027451] Call Trace: [197498.030004] [] ldlm_completion_ast+0x430/0x860 [ptlrpc] [197498.037027] [] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [197498.044320] [] mdt_object_local_lock+0x50b/0xb20 [mdt] [197498.051230] [] mdt_object_lock_internal+0x70/0x360 [mdt] [197498.058325] [] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [197498.065331] [] mdt_intent_getattr+0x2b5/0x480 [mdt] [197498.072000] [] mdt_intent_policy+0x435/0xd80 [mdt] [197498.078573] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [197498.085425] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [197498.092630] [] tgt_enqueue+0x62/0x210 [ptlrpc] [197498.098909] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [197498.105935] [] 
ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [197498.113749] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [197498.120166] [] kthread+0xd1/0xe0 [197498.125170] [] ret_from_fork_nospec_begin+0xe/0x21 [197498.131728] [] 0xffffffffffffffff [197498.136832] Pid: 39411, comm: mdt02_042 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [197498.147084] Call Trace: [197498.149630] [] ldlm_completion_ast+0x430/0x860 [ptlrpc] [197498.156643] [] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [197498.163946] [] mdt_object_local_lock+0x50b/0xb20 [mdt] [197498.170867] [] mdt_object_lock_internal+0x70/0x360 [mdt] [197498.177963] [] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [197498.184966] [] mdt_intent_getattr+0x2b5/0x480 [mdt] [197498.191628] [] mdt_intent_policy+0x435/0xd80 [mdt] [197498.198191] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [197498.205044] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [197498.212233] [] tgt_enqueue+0x62/0x210 [ptlrpc] [197498.218485] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [197498.225509] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [197498.233338] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [197498.239757] [] kthread+0xd1/0xe0 [197498.244763] [] ret_from_fork_nospec_begin+0xe/0x21 [197498.251342] [] 0xffffffffffffffff [197498.256440] Pid: 39244, comm: mdt03_008 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [197498.266694] Call Trace: [197498.269239] [] ldlm_completion_ast+0x430/0x860 [ptlrpc] [197498.276267] [] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [197498.283541] [] mdt_object_local_lock+0x50b/0xb20 [mdt] [197498.290472] [] mdt_object_lock_internal+0x70/0x360 [mdt] [197498.297572] [] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [197498.304591] [] mdt_intent_getattr+0x2b5/0x480 [mdt] [197498.311250] [] mdt_intent_policy+0x435/0xd80 [mdt] [197498.317825] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [197498.324666] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [197498.331875] [] tgt_enqueue+0x62/0x210 
[ptlrpc] [197498.338117] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [197498.345151] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [197498.352953] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [197498.359384] [] kthread+0xd1/0xe0 [197498.364391] [] ret_from_fork_nospec_begin+0xe/0x21 [197498.370960] [] 0xffffffffffffffff [197498.376055] LNet: Service thread pid 97455 was inactive for 200.76s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [197498.389015] LNet: Skipped 1 previous similar message [197498.897105] LustreError: dumping log to /tmp/lustre-log.1576183737.97383 [197499.921104] LustreError: dumping log to /tmp/lustre-log.1576183738.38898 [197501.969116] LustreError: dumping log to /tmp/lustre-log.1576183740.39426 [197513.233184] LustreError: dumping log to /tmp/lustre-log.1576183751.39254 [197515.281193] LNet: Service thread pid 38895 was inactive for 200.12s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [197515.294143] LNet: Skipped 35 previous similar messages [197515.299381] LustreError: dumping log to /tmp/lustre-log.1576183753.38895 [197574.047839] LNet: Service thread pid 39417 completed after 277.29s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [197574.064086] LNet: Skipped 39 previous similar messages [197574.673527] LNet: Service thread pid 39232 was inactive for 200.61s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [197574.686478] LustreError: dumping log to /tmp/lustre-log.1576183812.39232 [197594.129645] LustreError: dumping log to /tmp/lustre-log.1576183832.97378 [197603.345696] LustreError: dumping log to /tmp/lustre-log.1576183841.39238 [197674.049579] LNet: Service thread pid 38894 completed after 299.99s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). 
[197674.065827] LNet: Skipped 5 previous similar messages [197717.677124] Lustre: DEBUG MARKER: Thu Dec 12 12:52:35 2019 [198918.782221] Lustre: fir-MDT0000: haven't heard from client f97f048d-b027-4 (at 10.8.9.1@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886a69bac000, cur 1576185157 expire 1576185007 last 1576184930 [200624.802591] Lustre: MGS: haven't heard from client 88e30dc0-2493-6815-27c6-7300a4eebf30 (at 10.8.28.3@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bd0ac2800, cur 1576186863 expire 1576186713 last 1576186636 [200624.823604] Lustre: Skipped 1 previous similar message [200883.748214] LNet: Service thread pid 97407 was inactive for 200.41s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [200883.765241] LNet: Skipped 3 previous similar messages [200883.770389] Pid: 97407, comm: mdt02_055 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [200883.780665] Call Trace: [200883.783221] [] call_rwsem_down_write_failed+0x17/0x30 [200883.790039] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [200883.797489] [] lod_qos_prep_create+0x12d7/0x1890 [lod] [200883.804415] [] lod_prepare_create+0x215/0x2e0 [lod] [200883.811079] [] lod_declare_striped_create+0x1ee/0x980 [lod] [200883.818422] [] lod_declare_create+0x204/0x590 [lod] [200883.825084] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [200883.833035] [] mdd_declare_create+0x4c/0xcb0 [mdd] [200883.839610] [] mdd_create+0x847/0x14e0 [mdd] [200883.845652] [] mdt_reint_open+0x224f/0x3240 [mdt] [200883.852150] [] mdt_reint_rec+0x83/0x210 [mdt] [200883.858279] [] mdt_reint_internal+0x6e3/0xaf0 [mdt] [200883.864944] [] mdt_intent_open+0x82/0x3a0 [mdt] [200883.871261] [] mdt_intent_policy+0x435/0xd80 [mdt] [200883.877825] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [200883.884680] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [200883.891888] [] 
tgt_enqueue+0x62/0x210 [ptlrpc] [200883.898139] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [200883.905176] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [200883.912990] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [200883.919407] [] kthread+0xd1/0xe0 [200883.924408] [] ret_from_fork_nospec_begin+0xe/0x21 [200883.930975] [] 0xffffffffffffffff [200883.936099] LustreError: dumping log to /tmp/lustre-log.1576187122.97407 [200901.156307] LNet: Service thread pid 39408 was inactive for 200.43s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [200901.173350] Pid: 39408, comm: mdt02_041 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [200901.183612] Call Trace: [200901.186170] [] call_rwsem_down_write_failed+0x17/0x30 [200901.192994] [] lod_qos_statfs_update+0x97/0x2b0 [lod] [200901.199842] [] lod_qos_prep_create+0x16a/0x1890 [lod] [200901.206662] [] lod_prepare_create+0x215/0x2e0 [lod] [200901.213324] [] lod_declare_striped_create+0x1ee/0x980 [lod] [200901.220667] [] lod_declare_create+0x204/0x590 [lod] [200901.227330] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [200901.235290] [] mdd_declare_create+0x4c/0xcb0 [mdd] [200901.241862] [] mdd_create+0x847/0x14e0 [mdd] [200901.247912] [] mdt_reint_open+0x224f/0x3240 [mdt] [200901.254419] [] mdt_reint_rec+0x83/0x210 [mdt] [200901.260582] [] mdt_reint_internal+0x6e3/0xaf0 [mdt] [200901.267248] [] mdt_intent_open+0x82/0x3a0 [mdt] [200901.273596] [] mdt_intent_policy+0x435/0xd80 [mdt] [200901.280170] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [200901.287061] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [200901.294262] [] tgt_enqueue+0x62/0x210 [ptlrpc] [200901.300544] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [200901.307581] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [200901.315431] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [200901.321864] [] kthread+0xd1/0xe0 [200901.326902] [] ret_from_fork_nospec_begin+0xe/0x21 
[200901.333470] [] 0xffffffffffffffff [200901.338622] LustreError: dumping log to /tmp/lustre-log.1576187139.39408 [200901.345899] Pid: 39363, comm: mdt02_030 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [200901.356175] Call Trace: [200901.358732] [] call_rwsem_down_write_failed+0x17/0x30 [200901.365557] [] lod_qos_statfs_update+0x97/0x2b0 [lod] [200901.372392] [] lod_qos_prep_create+0x16a/0x1890 [lod] [200901.379215] [] lod_prepare_create+0x215/0x2e0 [lod] [200901.385880] [] lod_declare_striped_create+0x1ee/0x980 [lod] [200901.393221] [] lod_declare_create+0x204/0x590 [lod] [200901.399883] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [200901.407834] [] mdd_declare_create+0x4c/0xcb0 [mdd] [200901.414411] [] mdd_create+0x847/0x14e0 [mdd] [200901.420453] [] mdt_reint_open+0x224f/0x3240 [mdt] [200901.426964] [] mdt_reint_rec+0x83/0x210 [mdt] [200901.433099] [] mdt_reint_internal+0x6e3/0xaf0 [mdt] [200901.439762] [] mdt_intent_open+0x82/0x3a0 [mdt] [200901.446075] [] mdt_intent_policy+0x435/0xd80 [mdt] [200901.452664] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [200901.459519] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [200901.466716] [] tgt_enqueue+0x62/0x210 [ptlrpc] [200901.472970] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [200901.479993] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [200901.487808] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [200901.494238] [] kthread+0xd1/0xe0 [200901.499232] [] ret_from_fork_nospec_begin+0xe/0x21 [200901.505804] [] 0xffffffffffffffff [200930.852466] LNet: Service thread pid 39360 was inactive for 200.45s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [200930.869494] LNet: Skipped 1 previous similar message [200930.874552] Pid: 39360, comm: mdt02_029 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [200930.884826] Call Trace: [200930.887378] [] call_rwsem_down_write_failed+0x17/0x30 [200930.894203] [] lod_qos_statfs_update+0x97/0x2b0 [lod] [200930.901046] [] lod_qos_prep_create+0x16a/0x1890 [lod] [200930.907868] [] lod_prepare_create+0x215/0x2e0 [lod] [200930.914546] [] lod_declare_striped_create+0x1ee/0x980 [lod] [200930.921892] [] lod_declare_create+0x204/0x590 [lod] [200930.928556] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [200930.936514] [] mdd_declare_create+0x4c/0xcb0 [mdd] [200930.943086] [] mdd_create+0x847/0x14e0 [mdd] [200930.949132] [] mdt_reint_open+0x224f/0x3240 [mdt] [200930.955630] [] mdt_reint_rec+0x83/0x210 [mdt] [200930.961767] [] mdt_reint_internal+0x6e3/0xaf0 [mdt] [200930.968438] [] mdt_intent_open+0x82/0x3a0 [mdt] [200930.974761] [] mdt_intent_policy+0x435/0xd80 [mdt] [200930.981337] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [200930.988206] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [200930.995404] [] tgt_enqueue+0x62/0x210 [ptlrpc] [200931.001666] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [200931.008698] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [200931.016513] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [200931.022929] [] kthread+0xd1/0xe0 [200931.027945] [] ret_from_fork_nospec_begin+0xe/0x21 [200931.034507] [] 0xffffffffffffffff [200931.039628] LustreError: dumping log to /tmp/lustre-log.1576187169.39360 [200980.759198] LNet: Service thread pid 39408 completed after 280.03s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [200980.775444] LNet: Skipped 3 previous similar messages [200983.588740] LNet: Service thread pid 38895 was inactive for 200.71s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [200983.605766] Pid: 38895, comm: mdt02_002 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [200983.616028] Call Trace: [200983.618583] [] call_rwsem_down_write_failed+0x17/0x30 [200983.625407] [] lod_qos_statfs_update+0x97/0x2b0 [lod] [200983.632254] [] lod_qos_prep_create+0x16a/0x1890 [lod] [200983.639073] [] lod_prepare_create+0x215/0x2e0 [lod] [200983.645741] [] lod_declare_striped_create+0x1ee/0x980 [lod] [200983.653079] [] lod_declare_create+0x204/0x590 [lod] [200983.659741] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [200983.667700] [] mdd_declare_create+0x4c/0xcb0 [mdd] [200983.674275] [] mdd_create+0x847/0x14e0 [mdd] [200983.680320] [] mdt_reint_open+0x224f/0x3240 [mdt] [200983.686816] [] mdt_reint_rec+0x83/0x210 [mdt] [200983.692954] [] mdt_reint_internal+0x6e3/0xaf0 [mdt] [200983.699617] [] mdt_intent_open+0x82/0x3a0 [mdt] [200983.705939] [] mdt_intent_policy+0x435/0xd80 [mdt] [200983.712514] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [200983.719372] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [200983.726578] [] tgt_enqueue+0x62/0x210 [ptlrpc] [200983.732839] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [200983.739874] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [200983.747677] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [200983.754106] [] kthread+0xd1/0xe0 [200983.759107] [] ret_from_fork_nospec_begin+0xe/0x21 [200983.765669] [] 0xffffffffffffffff [200983.770797] LustreError: dumping log to /tmp/lustre-log.1576187221.38895 [201017.380931] LNet: Service thread pid 39330 was inactive for 200.49s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [201017.393880] LNet: Skipped 5 previous similar messages [201017.399026] LustreError: dumping log to /tmp/lustre-log.1576187255.39330 [201080.760954] LNet: Service thread pid 38895 completed after 297.88s. 
This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [201080.777204] LNet: Skipped 1 previous similar message [201080.869285] LNet: Service thread pid 39329 was inactive for 200.11s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [201080.882234] LustreError: dumping log to /tmp/lustre-log.1576187319.39329 [201180.761118] LNet: Service thread pid 39329 completed after 300.00s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [201180.777367] LNet: Skipped 1 previous similar message [201942.173237] Lustre: MGS: Connection restored to (at 10.9.108.12@o2ib4) [201942.179947] Lustre: Skipped 4 previous similar messages [202216.811542] Lustre: MGS: Connection restored to 4359a6d6-39f4-3744-7f0f-dc517a2bb4c6 (at 10.8.28.3@o2ib6) [202216.821201] Lustre: Skipped 1 previous similar message [205948.822554] Lustre: MGS: haven't heard from client 8b08dc27-d2aa-93f7-25fb-507b587de732 (at 10.9.101.71@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887d4c8b2400, cur 1576192187 expire 1576192037 last 1576191960 [205948.843735] Lustre: Skipped 1 previous similar message [207113.762597] LustreError: 39232:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576193051, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff8869bcfeb180/0xc3c20c06d2f5b395 lrc: 3/1,0 mode: --/PR res: [0x20003ac50:0x7f36:0x0].0x0 bits 0x13/0x0 rrc: 70 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39232 timeout: 0 lvb_type: 0 [207113.802243] LustreError: 39232:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 15 previous similar messages [207127.058687] LustreError: 39336:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576193065, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff8875a4fb6300/0xc3c20c06d2f8d5af lrc: 3/1,0 mode: --/PR res: [0x20003ac50:0x7f36:0x0].0x0 bits 0x13/0x0 rrc: 70 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39336 timeout: 0 lvb_type: 0 [207145.623785] LustreError: 39374:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576193083, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff887bbe51e0c0/0xc3c20c06d2fcef9a lrc: 3/1,0 mode: --/PR res: [0x20003ac50:0x7f36:0x0].0x0 bits 0x13/0x0 rrc: 70 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39374 timeout: 0 lvb_type: 0 [207145.663436] LustreError: 39374:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 5 previous similar messages [207155.946853] LustreError: 39252:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576193094, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: 
ffff888bf5e6ad00/0xc3c20c06d2ff1fee lrc: 3/1,0 mode: --/PR res: [0x2000013a6:0x62d4:0x0].0x0 bits 0x13/0x0 rrc: 4 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39252 timeout: 0 lvb_type: 0 [207178.823984] LNet: Service thread pid 97355 was inactive for 365.05s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [207178.841007] Pid: 97355, comm: mdt00_051 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [207178.851267] Call Trace: [207178.853844] [] call_rwsem_down_write_failed+0x17/0x30 [207178.860664] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [207178.868117] [] lod_qos_prep_create+0x12d7/0x1890 [lod] [207178.875036] [] lod_prepare_create+0x215/0x2e0 [lod] [207178.881697] [] lod_declare_striped_create+0x1ee/0x980 [lod] [207178.889040] [] lod_declare_create+0x204/0x590 [lod] [207178.895702] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [207178.903679] [] mdd_declare_create+0x4c/0xcb0 [mdd] [207178.910253] [] mdd_create+0x847/0x14e0 [mdd] [207178.916296] [] mdt_reint_open+0x224f/0x3240 [mdt] [207178.922803] [] mdt_reint_rec+0x83/0x210 [mdt] [207178.928940] [] mdt_reint_internal+0x6e3/0xaf0 [mdt] [207178.935613] [] mdt_intent_open+0x82/0x3a0 [mdt] [207178.941926] [] mdt_intent_policy+0x435/0xd80 [mdt] [207178.948509] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [207178.955359] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [207178.962564] [] tgt_enqueue+0x62/0x210 [ptlrpc] [207178.968833] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [207178.975876] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [207178.983682] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [207178.990109] [] kthread+0xd1/0xe0 [207178.995112] [] ret_from_fork_nospec_begin+0xe/0x21 [207179.001688] [] 0xffffffffffffffff [207179.006798] LustreError: dumping log to /tmp/lustre-log.1576193417.97355 [207179.014302] Pid: 39428, comm: mdt00_042 3.10.0-957.27.2.el7_lustre.pl2.x86_64 
#1 SMP Thu Nov 7 15:26:16 PST 2019 [207179.024592] Call Trace: [207179.027142] [] call_rwsem_down_write_failed+0x17/0x30 [207179.033954] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [207179.041398] [] lod_qos_prep_create+0x12d7/0x1890 [lod] [207179.048309] [] lod_prepare_create+0x215/0x2e0 [lod] [207179.054974] [] lod_declare_striped_create+0x1ee/0x980 [lod] [207179.062311] [] lod_declare_create+0x204/0x590 [lod] [207179.068977] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [207179.076926] [] mdd_declare_create+0x4c/0xcb0 [mdd] [207179.083501] [] mdd_create+0x847/0x14e0 [mdd] [207179.089560] [] mdt_reint_open+0x224f/0x3240 [mdt] [207179.096058] [] mdt_reint_rec+0x83/0x210 [mdt] [207179.102204] [] mdt_reint_internal+0x6e3/0xaf0 [mdt] [207179.108869] [] mdt_intent_open+0x82/0x3a0 [mdt] [207179.115172] [] mdt_intent_policy+0x435/0xd80 [mdt] [207179.121757] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [207179.128596] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [207179.135794] [] tgt_enqueue+0x62/0x210 [ptlrpc] [207179.142039] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [207179.149074] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [207179.156892] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [207179.163321] [] kthread+0xd1/0xe0 [207179.168329] [] ret_from_fork_nospec_begin+0xe/0x21 [207179.174893] [] 0xffffffffffffffff [207179.180001] Pid: 39432, comm: mdt00_044 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [207179.190261] Call Trace: [207179.192807] [] call_rwsem_down_write_failed+0x17/0x30 [207179.199621] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [207179.207047] [] lod_qos_prep_create+0x12d7/0x1890 [lod] [207179.213977] [] lod_declare_instantiate_components+0x9a/0x1d0 [lod] [207179.221929] [] lod_declare_layout_change+0xb65/0x10f0 [lod] [207179.229285] [] mdd_declare_layout_change+0x62/0x120 [mdd] [207179.236470] [] mdd_layout_change+0x882/0x1000 [mdd] [207179.243132] [] mdt_layout_change+0x337/0x430 [mdt] [207179.249705] [] 
mdt_intent_layout+0x7ee/0xcc0 [mdt] [207179.256281] [] mdt_intent_policy+0x435/0xd80 [mdt] [207179.262843] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [207179.269698] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [207179.276885] [] tgt_enqueue+0x62/0x210 [ptlrpc] [207179.283141] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [207179.290161] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [207179.297982] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [207179.304416] [] kthread+0xd1/0xe0 [207179.309434] [] ret_from_fork_nospec_begin+0xe/0x21 [207179.316013] [] 0xffffffffffffffff [207179.321121] Pid: 97382, comm: mdt01_060 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [207179.331375] Call Trace: [207179.333925] [] call_rwsem_down_write_failed+0x17/0x30 [207179.340743] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [207179.348185] [] lod_qos_prep_create+0x12d7/0x1890 [lod] [207179.355112] [] lod_prepare_create+0x215/0x2e0 [lod] [207179.361770] [] lod_declare_striped_create+0x1ee/0x980 [lod] [207179.369132] [] lod_declare_create+0x204/0x590 [lod] [207179.375781] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [207179.383726] [] mdd_declare_create+0x4c/0xcb0 [mdd] [207179.390303] [] mdd_create+0x847/0x14e0 [mdd] [207179.396345] [] mdt_reint_open+0x224f/0x3240 [mdt] [207179.402827] [] mdt_reint_rec+0x83/0x210 [mdt] [207179.408958] [] mdt_reint_internal+0x6e3/0xaf0 [mdt] [207179.415617] [] mdt_intent_open+0x82/0x3a0 [mdt] [207179.421922] [] mdt_intent_policy+0x435/0xd80 [mdt] [207179.428499] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [207179.435353] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [207179.442555] [] tgt_enqueue+0x62/0x210 [ptlrpc] [207179.448798] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [207179.455833] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [207179.463635] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [207179.470063] [] kthread+0xd1/0xe0 [207179.475057] [] ret_from_fork_nospec_begin+0xe/0x21 [207179.481630] [] 
0xffffffffffffffff [207179.486718] LNet: Service thread pid 97348 was inactive for 365.72s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [207179.503754] LNet: Skipped 3 previous similar messages [207179.508900] Pid: 97348, comm: mdt01_050 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [207179.519157] Call Trace: [207179.521708] [] call_rwsem_down_write_failed+0x17/0x30 [207179.528532] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [207179.535991] [] lod_qos_prep_create+0x12d7/0x1890 [lod] [207179.542901] [] lod_prepare_create+0x215/0x2e0 [lod] [207179.549565] [] lod_declare_striped_create+0x1ee/0x980 [lod] [207179.556908] [] lod_declare_create+0x204/0x590 [lod] [207179.563570] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [207179.571519] [] mdd_declare_create+0x4c/0xcb0 [mdd] [207179.578109] [] mdd_create+0x847/0x14e0 [mdd] [207179.584155] [] mdt_reint_open+0x224f/0x3240 [mdt] [207179.590654] [] mdt_reint_rec+0x83/0x210 [mdt] [207179.596793] [] mdt_reint_internal+0x6e3/0xaf0 [mdt] [207179.603463] [] mdt_intent_open+0x82/0x3a0 [mdt] [207179.609790] [] mdt_intent_policy+0x435/0xd80 [mdt] [207179.616378] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [207179.623249] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [207179.630451] [] tgt_enqueue+0x62/0x210 [ptlrpc] [207179.636703] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [207179.643749] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [207179.651547] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [207179.657983] [] kthread+0xd1/0xe0 [207179.662988] [] ret_from_fork_nospec_begin+0xe/0x21 [207179.669564] [] 0xffffffffffffffff [207179.674658] LNet: Service thread pid 39257 was inactive for 364.96s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
[207179.687621] LNet: Skipped 1 previous similar message [207179.847989] LustreError: dumping log to /tmp/lustre-log.1576193418.97362 [207180.871985] LustreError: dumping log to /tmp/lustre-log.1576193419.39349 [207181.896987] LNet: Service thread pid 97405 was inactive for 366.92s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [207181.909929] LNet: Skipped 19 previous similar messages [207181.915165] LustreError: dumping log to /tmp/lustre-log.1576193420.97405 [207183.944014] LustreError: dumping log to /tmp/lustre-log.1576193422.39394 [207184.968012] LustreError: dumping log to /tmp/lustre-log.1576193423.39258 [207189.064039] LNet: Service thread pid 98146 was inactive for 364.72s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [207189.076986] LNet: Skipped 2 previous similar messages [207189.082134] LustreError: dumping log to /tmp/lustre-log.1576193427.98146 [207191.112056] LustreError: dumping log to /tmp/lustre-log.1576193429.39247 [207192.136060] LustreError: dumping log to /tmp/lustre-log.1576193430.39384 [207197.256105] LNet: Service thread pid 97389 was inactive for 364.98s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [207197.269058] LNet: Skipped 10 previous similar messages [207197.274292] LustreError: dumping log to /tmp/lustre-log.1576193435.97389 [207210.568167] LustreError: dumping log to /tmp/lustre-log.1576193448.39374 [207213.766383] LNet: Service thread pid 39264 completed after 399.18s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [207213.782624] LNet: Skipped 5 previous similar messages [207220.808232] LNet: Service thread pid 98176 was inactive for 364.86s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
[207220.821185] LNet: Skipped 1 previous similar message
[207220.826245] LustreError: dumping log to /tmp/lustre-log.1576193458.98176
[207222.856241] LustreError: dumping log to /tmp/lustre-log.1576193461.39252
[207230.024284] LustreError: dumping log to /tmp/lustre-log.1576193468.39444
[207232.072301] LustreError: dumping log to /tmp/lustre-log.1576193470.39270
[207238.213334] LustreError: 97386:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576193176, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff8855e0f30240/0xc3c20c06d311b8d5 lrc: 3/1,0 mode: --/PR res: [0x20003ac50:0x7f36:0x0].0x0 bits 0x13/0x0 rrc: 70 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 97386 timeout: 0 lvb_type: 0
[207238.252981] LustreError: 97386:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 2 previous similar messages
[207271.201528] LustreError: 39241:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576193209, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff88639b4421c0/0xc3c20c06d31919d2 lrc: 3/0,1 mode: --/CW res: [0x20003ac50:0x7f36:0x0].0x0 bits 0x2/0x0 rrc: 70 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39241 timeout: 0 lvb_type: 0
[207273.577835] Lustre: fir-MDT0000: Client 03dd52b8-a4fc-4 (at 10.9.0.61@o2ib4) reconnecting
[207273.586131] Lustre: fir-MDT0000: Connection restored to 03dd52b8-a4fc-4 (at 10.9.0.61@o2ib4)
[207273.594660] Lustre: Skipped 1 previous similar message
[207278.152561] LNet: Service thread pid 39407 was inactive for 364.71s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[207278.165530] LNet: Skipped 3 previous similar messages
[207278.170672] LustreError: dumping log to /tmp/lustre-log.1576193516.39407
[207313.767010] LNet: Service thread pid 38890 completed after 499.02s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[207313.783257] LNet: Skipped 13 previous similar messages
[207328.328863] LustreError: dumping log to /tmp/lustre-log.1576193566.38891
[207330.376874] LustreError: dumping log to /tmp/lustre-log.1576193568.39233
[207331.400888] LustreError: dumping log to /tmp/lustre-log.1576193569.39214
[207333.448904] LustreError: dumping log to /tmp/lustre-log.1576193571.39266
[207334.472897] LustreError: dumping log to /tmp/lustre-log.1576193572.97386
[207408.129347] Lustre: 39436:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff885bc250cc80 x1649315785578896/t0(0) o101->b4206b2f-67a2-cb01-c899-d99205e22b23@10.9.108.61@o2ib4:536/0 lens 1832/3288 e 13 to 0 dl 1576193651 ref 2 fl Interpret:/0/0 rc 0/0
[207413.766370] LustreError: 38892:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576193351, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff8869cecc6c00/0xc3c20c06d338026b lrc: 3/1,0 mode: --/PR res: [0x20003ac50:0x7f36:0x0].0x0 bits 0x13/0x0 rrc: 72 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 38892 timeout: 0 lvb_type: 0
[207413.766372] LustreError: 97387:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576193351, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff8852f1bac380/0xc3c20c06d338023a lrc: 3/0,1 mode: --/CW res: [0x20003ac50:0x7f36:0x0].0x0 bits 0x2/0x0 rrc: 70 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 97387 timeout: 0 lvb_type: 0
[207413.766376] LustreError: 97387:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 3 previous similar messages
[207413.768863] LNet: Service thread pid 98176 completed after 557.81s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[207413.846582] Lustre: 97348:0:(service.c:2165:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (600:1s); client may timeout. req@ffff886bea4aa400 x1649317747289552/t592806742019(0) o101->3532db27-3550-1319-6c1b-3d6651c2c9af@10.9.108.62@o2ib4:536/0 lens 1840/904 e 13 to 0 dl 1576193651 ref 1 fl Complete:/0/0 rc 0/0
[207413.902506] LustreError: 38892:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 2 previous similar messages
[207713.850235] LustreError: 39241:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576193652, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff888bac6eb3c0/0xc3c20c06d37b7b70 lrc: 3/0,1 mode: --/CW res: [0x20003ac50:0x7f36:0x0].0x0 bits 0x2/0x0 rrc: 82 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39241 timeout: 0 lvb_type: 0
[207713.889799] LustreError: 39241:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 3 previous similar messages
[207713.914499] Lustre: fir-MDT0000: Client 03dd52b8-a4fc-4 (at 10.9.0.61@o2ib4) reconnecting
[207713.922790] Lustre: fir-MDT0000: Connection restored to 03dd52b8-a4fc-4 (at 10.9.0.61@o2ib4)
[208246.941076] Lustre: MGS: Connection restored to (at 10.9.101.71@o2ib4)
[208313.877844] LustreError: 39258:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576194252, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff885853baca40/0xc3c20c06d40fb711 lrc: 3/1,0 mode: --/PR res: [0x20003ac50:0x7f36:0x0].0x0 bits 0x13/0x0 rrc: 84 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39258 timeout: 0 lvb_type: 0
[208313.917575] LustreError: 39258:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 35 previous similar messages
[208313.944205] Lustre: fir-MDT0000: Client 03dd52b8-a4fc-4 (at 10.9.0.61@o2ib4) reconnecting
[208313.952500] Lustre: fir-MDT0000: Connection restored to 03dd52b8-a4fc-4 (at 10.9.0.61@o2ib4)
[208313.961022] Lustre: Skipped 1 previous similar message
[208913.914426] LustreError: 39341:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576194852, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff8852cc2ccec0/0xc3c20c06d467104b lrc: 3/1,0 mode: --/PR res: [0x20003957b:0x1410:0x0].0x0 bits 0x13/0x0 rrc: 9 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39341 timeout: 0 lvb_type: 0
[208913.954005] LustreError: 39341:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 38 previous similar messages
[209017.676556] Lustre: fir-MDT0000: Client 03dd52b8-a4fc-4 (at 10.9.0.61@o2ib4) reconnecting
[209017.684850] Lustre: fir-MDT0000: Connection restored to 03dd52b8-a4fc-4 (at 10.9.0.61@o2ib4)
[209309.696802] Lustre: 39215:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff887405e96300 x1649340861684560/t0(0) o101->d5336f36-1352-ddc7-e966-e696298bb1ae@10.9.106.53@o2ib4:172/0 lens 376/1600 e 5 to 0 dl 1576195552 ref 2 fl Interpret:/0/0 rc 0/0
[209309.725877] Lustre: 39215:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 13 previous similar messages
[209310.676804] Lustre: 38896:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff888bd0e69f80 x1649046576943376/t0(0) o101->f9f503f0-6ff6-698f-9a8d-14bd128a6d42@10.9.101.27@o2ib4:173/0 lens 1792/3288 e 5 to 0 dl 1576195553 ref 2 fl Interpret:/0/0 rc 0/0
[209310.705986] Lustre: 38896:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 12 previous similar messages
[209316.678468] Lustre: fir-MDT0000: Client 78c1e3f1-dd0b-4 (at 10.8.18.18@o2ib6) reconnecting
[209316.686843] Lustre: fir-MDT0000: Connection restored to ee8a8d10-65c2-ae96-bc67-9f6bae32e110 (at 10.8.18.18@o2ib6)
[209316.948829] LNet: Service thread pid 39389 was inactive for 601.16s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[209316.965853] Pid: 39389, comm: mdt03_034 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[209316.976119] Call Trace:
[209316.978678] [] call_rwsem_down_write_failed+0x17/0x30
[209316.985512] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[209316.992984] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[209316.999903] [] lod_declare_instantiate_components+0x9a/0x1d0 [lod]
[209317.007856] [] lod_declare_layout_change+0xb65/0x10f0 [lod]
[209317.015230] [] mdd_declare_layout_change+0x62/0x120 [mdd]
[209317.022412] [] mdd_layout_change+0x882/0x1000 [mdd]
[209317.029087] [] mdt_layout_change+0x337/0x430 [mdt]
[209317.035676] [] mdt_intent_layout+0x7ee/0xcc0 [mdt]
[209317.042255] [] mdt_intent_policy+0x435/0xd80 [mdt]
[209317.048846] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[209317.055704] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[209317.062946] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[209317.069209] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[209317.076262] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[209317.084080] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[209317.090515] [] kthread+0xd1/0xe0
[209317.095528] [] ret_from_fork_nospec_begin+0xe/0x21
[209317.102112] [] 0xffffffffffffffff
[209317.107238] LustreError: dumping log to /tmp/lustre-log.1576195555.39389
[209317.114992] Pid: 98146, comm: mdt00_068 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[209317.125275] Call Trace:
[209317.127831] [] osp_precreate_reserve+0x2e8/0x800 [osp]
[209317.134754] [] osp_declare_create+0x199/0x5b0 [osp]
[209317.141407] [] lod_sub_declare_create+0xdf/0x210 [lod]
[209317.148325] [] lod_qos_declare_object_on+0xbe/0x3a0 [lod]
[209317.155526] [] lod_alloc_qos.constprop.18+0x10f4/0x1840 [lod]
[209317.163051] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[209317.169975] [] lod_declare_instantiate_components+0x9a/0x1d0 [lod]
[209317.177940] [] lod_declare_layout_change+0xb65/0x10f0 [lod]
[209317.185298] [] mdd_declare_layout_change+0x62/0x120 [mdd]
[209317.192497] [] mdd_layout_change+0x882/0x1000 [mdd]
[209317.199147] [] mdt_layout_change+0x337/0x430 [mdt]
[209317.205724] [] mdt_intent_layout+0x7ee/0xcc0 [mdt]
[209317.212302] [] mdt_intent_policy+0x435/0xd80 [mdt]
[209317.218882] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[209317.225750] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[209317.232947] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[209317.239218] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[209317.246253] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[209317.254074] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[209317.260491] [] kthread+0xd1/0xe0
[209317.265529] [] ret_from_fork_nospec_begin+0xe/0x21
[209317.272097] [] 0xffffffffffffffff
[209317.277211] Pid: 39431, comm: mdt00_043 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[209317.287472] Call Trace:
[209317.290025] [] call_rwsem_down_write_failed+0x17/0x30
[209317.296872] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[209317.304317] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[209317.311242] [] lod_declare_instantiate_components+0x9a/0x1d0 [lod]
[209317.319209] [] lod_declare_layout_change+0xb65/0x10f0 [lod]
[209317.326584] [] mdd_declare_layout_change+0x62/0x120 [mdd]
[209317.333780] [] mdd_layout_change+0x882/0x1000 [mdd]
[209317.340441] [] mdt_layout_change+0x337/0x430 [mdt]
[209317.347034] [] mdt_intent_layout+0x7ee/0xcc0 [mdt]
[209317.353606] [] mdt_intent_policy+0x435/0xd80 [mdt]
[209317.360196] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[209317.367047] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[209317.374250] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[209317.380523] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[209317.387561] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[209317.395386] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[209317.401810] [] kthread+0xd1/0xe0
[209317.406825] [] ret_from_fork_nospec_begin+0xe/0x21
[209317.413387] [] 0xffffffffffffffff
[209318.716858] Lustre: 39245:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff8852857e0900 x1649314302985776/t0(0) o101->75af6c9a-e740-8c0d-465f-820e82ef6338@10.9.108.60@o2ib4:181/0 lens 1784/3288 e 4 to 0 dl 1576195561 ref 2 fl Interpret:/0/0 rc 0/0
[209318.746014] Lustre: 39245:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 26 previous similar messages
[209320.750861] Lustre: 106785:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff8873fca03a80 x1649309816130960/t0(0) o101->1431f338-e19b-6337-4b33-ec6ebaff454a@10.8.18.22@o2ib6:183/0 lens 1840/3288 e 5 to 0 dl 1576195563 ref 2 fl Interpret:/0/0 rc 0/0
[209320.780019] Lustre: 106785:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message
[209325.141875] LNet: Service thread pid 39347 was inactive for 601.44s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[209325.158895] LNet: Skipped 2 previous similar messages
[209325.164045] Pid: 39347, comm: mdt00_023 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[209325.174320] Call Trace:
[209325.176870] [] call_rwsem_down_write_failed+0x17/0x30
[209325.183695] [] lod_qos_statfs_update+0x97/0x2b0 [lod]
[209325.190543] [] lod_qos_prep_create+0x16a/0x1890 [lod]
[209325.197364] [] lod_prepare_create+0x215/0x2e0 [lod]
[209325.204031] [] lod_declare_striped_create+0x1ee/0x980 [lod]
[209325.211379] [] lod_declare_create+0x204/0x590 [lod]
[209325.218038] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[209325.225988] [] mdd_declare_create+0x4c/0xcb0 [mdd]
[209325.232549] [] mdd_create+0x847/0x14e0 [mdd]
[209325.238599] [] mdt_reint_open+0x224f/0x3240 [mdt]
[209325.245112] [] mdt_reint_rec+0x83/0x210 [mdt]
[209325.251244] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[209325.257908] [] mdt_intent_open+0x82/0x3a0 [mdt]
[209325.264209] [] mdt_intent_policy+0x435/0xd80 [mdt]
[209325.270784] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[209325.277644] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[209325.284836] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[209325.291086] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[209325.298119] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[209325.305924] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[209325.312366] [] kthread+0xd1/0xe0
[209325.317372] [] ret_from_fork_nospec_begin+0xe/0x21
[209325.323936] [] 0xffffffffffffffff
[209325.329047] LustreError: dumping log to /tmp/lustre-log.1576195563.39347
[209326.728906] Lustre: 39245:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff8852cc790d80 x1652733017378752/t0(0) o101->851b742b-36ee-4@10.9.107.13@o2ib4:189/0 lens 1896/3288 e 4 to 0 dl 1576195569 ref 2 fl Interpret:/0/0 rc 0/0
[209326.756248] Lustre: 39245:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 5 previous similar messages
[209326.903460] Lustre: fir-MDT0000: Client 1431f338-e19b-6337-4b33-ec6ebaff454a (at 10.8.18.22@o2ib6) reconnecting
[209326.913632] Lustre: Skipped 8 previous similar messages
[209331.284917] LNet: Service thread pid 97354 was inactive for 600.38s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[209331.301944] Pid: 97354, comm: mdt00_050 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[209331.312205] Call Trace:
[209331.314763] [] call_rwsem_down_write_failed+0x17/0x30
[209331.321589] [] lod_qos_statfs_update+0x97/0x2b0 [lod]
[209331.328458] [] lod_qos_prep_create+0x16a/0x1890 [lod]
[209331.335290] [] lod_prepare_create+0x215/0x2e0 [lod]
[209331.341968] [] lod_declare_striped_create+0x1ee/0x980 [lod]
[209331.349310] [] lod_declare_create+0x204/0x590 [lod]
[209331.355970] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[209331.363933] [] mdd_declare_create+0x4c/0xcb0 [mdd]
[209331.370500] [] mdd_create+0x847/0x14e0 [mdd]
[209331.376554] [] mdt_reint_open+0x224f/0x3240 [mdt]
[209331.383052] [] mdt_reint_rec+0x83/0x210 [mdt]
[209331.389210] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[209331.395859] [] mdt_intent_open+0x82/0x3a0 [mdt]
[209331.402184] [] mdt_intent_policy+0x435/0xd80 [mdt]
[209331.408752] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[209331.415609] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[209331.422817] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[209331.429091] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[209331.436123] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[209331.443957] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[209331.450378] [] kthread+0xd1/0xe0
[209331.455382] [] ret_from_fork_nospec_begin+0xe/0x21
[209331.461958] [] 0xffffffffffffffff
[209331.467067] LustreError: dumping log to /tmp/lustre-log.1576195569.97354
[209334.748954] Lustre: 97350:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff88528625c380 x1649559265975568/t0(0) o101->a8d84424-9b8a-5525-fab4-b5243bf0dc64@10.9.104.22@o2ib4:197/0 lens 1904/3288 e 4 to 0 dl 1576195577 ref 2 fl Interpret:/0/0 rc 0/0
[209334.778115] Lustre: 97350:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 8 previous similar messages
[209335.380942] LNet: Service thread pid 39436 was inactive for 601.61s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[209335.393891] LNet: Skipped 9 previous similar messages
[209335.399043] LustreError: dumping log to /tmp/lustre-log.1576195573.39436
[209337.197098] Lustre: fir-MDT0000: Connection restored to 5860417b-a563-2455-9c94-86226f905ab9 (at 10.8.27.9@o2ib6)
[209337.207454] Lustre: Skipped 19 previous similar messages
[209337.428958] LustreError: dumping log to /tmp/lustre-log.1576195575.39351
[209341.524980] LustreError: dumping log to /tmp/lustre-log.1576195579.97380
[209349.717029] LNet: Service thread pid 39417 was inactive for 600.03s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[209349.729971] LNet: Skipped 5 previous similar messages
[209349.735114] LustreError: dumping log to /tmp/lustre-log.1576195587.39417
[209350.741076] Lustre: fir-MDT0000: Client d2bd0014-3bea-4 (at 10.9.114.7@o2ib4) reconnecting
[209350.749433] Lustre: Skipped 14 previous similar messages
[209362.837117] Lustre: 38892:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff886978788000 x1652736187566016/t0(0) o101->d9364eb2-511c-4@10.8.27.10@o2ib6:225/0 lens 600/3264 e 2 to 0 dl 1576195605 ref 2 fl Interpret:/0/0 rc 0/0
[209362.864281] Lustre: 38892:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message
[209408.869390] Lustre: 39339:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff8852bf750900 x1649327288667664/t0(0) o101->a7c6c322-7850-feae-097c-a35b332d6e36@10.9.108.67@o2ib4:272/0 lens 376/1600 e 1 to 0 dl 1576195652 ref 2 fl Interpret:/0/0 rc 0/0
[209413.974525] LNet: Service thread pid 98146 completed after 698.30s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[209413.990776] LNet: Skipped 36 previous similar messages
[209414.037877] Lustre: fir-MDT0000: Client 03dd52b8-a4fc-4 (at 10.9.0.61@o2ib4) reconnecting
[209414.046144] Lustre: Skipped 1 previous similar message
[209414.051408] Lustre: fir-MDT0000: Connection restored to 03dd52b8-a4fc-4 (at 10.9.0.61@o2ib4)
[209414.059929] Lustre: Skipped 5 previous similar messages
[209415.253417] LNet: Service thread pid 97405 was inactive for 601.31s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[209415.266371] LustreError: dumping log to /tmp/lustre-log.1576195653.97405
[209425.493474] LustreError: dumping log to /tmp/lustre-log.1576195663.98144
[209427.541481] LustreError: dumping log to /tmp/lustre-log.1576195665.97386
[209429.589501] LustreError: dumping log to /tmp/lustre-log.1576195667.39442
[209490.069870] Lustre: 107131:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff887879306c00 x1649484815686192/t0(0) o101->5627d86f-0964-ad4d-2769-f014ccc68300@10.8.17.16@o2ib6:353/0 lens 600/3264 e 2 to 0 dl 1576195733 ref 2 fl Interpret:/0/0 rc 0/0
[209490.098937] Lustre: 107131:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 18 previous similar messages
[209496.671477] Lustre: fir-MDT0000: Client 5627d86f-0964-ad4d-2769-f014ccc68300 (at 10.8.17.16@o2ib6) reconnecting
[209496.681655] Lustre: Skipped 12 previous similar messages
[209496.687082] Lustre: fir-MDT0000: Connection restored to 5627d86f-0964-ad4d-2769-f014ccc68300 (at 10.8.17.16@o2ib6)
[209496.697547] Lustre: Skipped 12 previous similar messages
[209499.221910] LNet: Service thread pid 39261 was inactive for 600.75s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[209499.234860] LNet: Skipped 12 previous similar messages
[209499.240103] LustreError: dumping log to /tmp/lustre-log.1576195737.39261
[209507.413960] LustreError: dumping log to /tmp/lustre-log.1576195745.39380
[209513.991530] LNet: Service thread pid 98142 completed after 780.32s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[209514.007780] LNet: Skipped 1 previous similar message
[209517.654026] LustreError: dumping log to /tmp/lustre-log.1576195755.39263
[209526.241076] LustreError: 107018:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576195464, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff88643a14f740/0xc3c20c06d4cc5c26 lrc: 3/1,0 mode: --/PR res: [0x200039577:0x11b6:0x0].0x0 bits 0x13/0x0 rrc: 14 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 107018 timeout: 0 lvb_type: 0
[209526.280895] LustreError: 107018:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 51 previous similar messages
[209534.038141] LustreError: dumping log to /tmp/lustre-log.1576195772.39402
[209538.134142] LustreError: dumping log to /tmp/lustre-log.1576195776.106852
[209554.518237] LustreError: dumping log to /tmp/lustre-log.1576195792.39333
[209568.854326] LNet: Service thread pid 106854 was inactive for 801.39s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[209568.867364] LNet: Skipped 9 previous similar messages
[209568.872511] LustreError: dumping log to /tmp/lustre-log.1576195807.106854
[209613.992064] LNet: Service thread pid 39402 completed after 879.96s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[209614.008328] LNet: Skipped 15 previous similar messages
[209615.958607] LustreError: dumping log to /tmp/lustre-log.1576195854.39425
[209622.102648] LNet: Service thread pid 39253 was inactive for 801.20s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[209622.119672] Pid: 39253, comm: mdt01_019 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[209622.129933] Call Trace:
[209622.132490] [] call_rwsem_down_write_failed+0x17/0x30
[209622.139313] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[209622.146764] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[209622.153673] [] lod_prepare_create+0x215/0x2e0 [lod]
[209622.160336] [] lod_declare_striped_create+0x1ee/0x980 [lod]
[209622.167679] [] lod_declare_create+0x204/0x590 [lod]
[209622.174358] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[209622.182318] [] mdd_declare_create+0x4c/0xcb0 [mdd]
[209622.188894] [] mdd_create+0x847/0x14e0 [mdd]
[209622.194937] [] mdt_reint_open+0x224f/0x3240 [mdt]
[209622.201442] [] mdt_reint_rec+0x83/0x210 [mdt]
[209622.207580] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[209622.214243] [] mdt_intent_open+0x82/0x3a0 [mdt]
[209622.220554] [] mdt_intent_policy+0x435/0xd80 [mdt]
[209622.227130] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[209622.233981] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[209622.241201] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[209622.247449] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[209622.254483] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[209622.262285] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[209622.268714] [] kthread+0xd1/0xe0
[209622.273716] [] ret_from_fork_nospec_begin+0xe/0x21
[209622.280293] [] 0xffffffffffffffff
[209622.285403] LustreError: dumping log to /tmp/lustre-log.1576195860.39253
[209624.150655] Pid: 97344, comm: mdt01_046 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[209624.160911] Call Trace:
[209624.163472] [] call_rwsem_down_write_failed+0x17/0x30
[209624.170296] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[209624.177756] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[209624.184667] [] lod_prepare_create+0x215/0x2e0 [lod]
[209624.191334] [] lod_declare_striped_create+0x1ee/0x980 [lod]
[209624.198679] [] lod_declare_create+0x204/0x590 [lod]
[209624.205349] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[209624.213344] [] mdd_declare_create+0x4c/0xcb0 [mdd]
[209624.219959] [] mdd_create+0x847/0x14e0 [mdd]
[209624.226013] [] mdt_reint_open+0x224f/0x3240 [mdt]
[209624.232558] [] mdt_reint_rec+0x83/0x210 [mdt]
[209624.238702] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[209624.245392] [] mdt_intent_open+0x82/0x3a0 [mdt]
[209624.251701] [] mdt_intent_policy+0x435/0xd80 [mdt]
[209624.258276] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[209624.265133] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[209624.272344] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[209624.278595] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[209624.285656] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[209624.293458] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[209624.299886] [] kthread+0xd1/0xe0
[209624.304890] [] ret_from_fork_nospec_begin+0xe/0x21
[209624.311466] [] 0xffffffffffffffff
[209624.316574] LustreError: dumping log to /tmp/lustre-log.1576195862.97344
[209624.323893] Pid: 98236, comm: mdt01_065 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[209624.334166] Call Trace:
[209624.336711] [] call_rwsem_down_write_failed+0x17/0x30
[209624.343534] [] lod_qos_statfs_update+0x97/0x2b0 [lod]
[209624.350370] [] lod_qos_prep_create+0x16a/0x1890 [lod]
[209624.357193] [] lod_prepare_create+0x215/0x2e0 [lod]
[209624.363854] [] lod_declare_striped_create+0x1ee/0x980 [lod]
[209624.371198] [] lod_declare_create+0x204/0x590 [lod]
[209624.377861] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[209624.385810] [] mdd_declare_create+0x4c/0xcb0 [mdd]
[209624.392385] [] mdd_create+0x847/0x14e0 [mdd]
[209624.398428] [] mdt_reint_open+0x224f/0x3240 [mdt]
[209624.404918] [] mdt_reint_rec+0x83/0x210 [mdt]
[209624.411058] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[209624.417741] [] mdt_intent_open+0x82/0x3a0 [mdt]
[209624.424047] [] mdt_intent_policy+0x435/0xd80 [mdt]
[209624.430621] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[209624.437464] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[209624.444661] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[209624.450905] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[209624.457943] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[209624.465742] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[209624.472171] [] kthread+0xd1/0xe0
[209624.477166] [] ret_from_fork_nospec_begin+0xe/0x21
[209624.483747] [] 0xffffffffffffffff
[209624.488834] Pid: 38891, comm: mdt01_001 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[209624.499109] Call Trace:
[209624.501660] [] call_rwsem_down_write_failed+0x17/0x30
[209624.508472] [] lod_qos_statfs_update+0x97/0x2b0 [lod]
[209624.515306] [] lod_qos_prep_create+0x16a/0x1890 [lod]
[209624.522129] [] lod_prepare_create+0x215/0x2e0 [lod]
[209624.528790] [] lod_declare_striped_create+0x1ee/0x980 [lod]
[209624.536135] [] lod_declare_create+0x204/0x590 [lod]
[209624.542798] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[209624.550760] [] mdd_declare_create+0x4c/0xcb0 [mdd]
[209624.557339] [] mdd_create+0x847/0x14e0 [mdd]
[209624.563382] [] mdt_reint_open+0x224f/0x3240 [mdt]
[209624.569872] [] mdt_reint_rec+0x83/0x210 [mdt]
[209624.576003] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[209624.582681] [] mdt_intent_open+0x82/0x3a0 [mdt]
[209624.589002] [] mdt_intent_policy+0x435/0xd80 [mdt]
[209624.595577] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[209624.602417] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[209624.609617] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[209624.615873] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[209624.622912] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[209624.630713] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[209624.637142] [] kthread+0xd1/0xe0
[209624.642146] [] ret_from_fork_nospec_begin+0xe/0x21
[209624.648697] [] 0xffffffffffffffff
[209626.198667] LNet: Service thread pid 39265 was inactive for 801.47s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[209626.215752] LNet: Skipped 3 previous similar messages
[209626.220920] Pid: 39265, comm: mdt02_016 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[209626.231210] Call Trace:
[209626.233762] [] call_rwsem_down_write_failed+0x17/0x30
[209626.240588] [] lod_qos_statfs_update+0x97/0x2b0 [lod]
[209626.247431] [] lod_qos_prep_create+0x16a/0x1890 [lod]
[209626.254256] [] lod_prepare_create+0x215/0x2e0 [lod]
[209626.260917] [] lod_declare_striped_create+0x1ee/0x980 [lod]
[209626.268261] [] lod_declare_create+0x204/0x590 [lod]
[209626.274921] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[209626.282880] [] mdd_declare_create+0x4c/0xcb0 [mdd]
[209626.289456] [] mdd_create+0x847/0x14e0 [mdd]
[209626.295499] [] mdt_reint_open+0x224f/0x3240 [mdt]
[209626.302010] [] mdt_reint_rec+0x83/0x210 [mdt]
[209626.308144] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[209626.314805] [] mdt_intent_open+0x82/0x3a0 [mdt]
[209626.321118] [] mdt_intent_policy+0x435/0xd80 [mdt]
[209626.327706] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[209626.334561] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[209626.341769] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[209626.348020] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[209626.355062] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[209626.362865] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[209626.369309] [] kthread+0xd1/0xe0
[209626.374332] [] ret_from_fork_nospec_begin+0xe/0x21
[209626.380923] [] 0xffffffffffffffff
[209626.386034] LustreError: dumping log to /tmp/lustre-log.1576195864.39265
[209628.246691] LustreError: dumping log to /tmp/lustre-log.1576195866.97376
[209645.142802] Lustre: 38887:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-31), not sending early reply req@ffff8852d22f6c00 x1649494866337200/t0(0) o101->a39c942a-14d0-8a42-662a-6515c9201963@10.9.102.5@o2ib4:508/0 lens 576/3264 e 0 to 0 dl 1576195888 ref 2 fl Interpret:/0/0 rc 0/0
[209645.171963] Lustre: 38887:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 17 previous similar messages
[209651.018086] Lustre: fir-MDT0000: Client 97102c2b-e0e2-553a-c933-88dc912145da (at 10.9.115.11@o2ib4) reconnecting
[209651.028351] Lustre: Skipped 10 previous similar messages
[209651.033778] Lustre: fir-MDT0000: Connection restored to 97102c2b-e0e2-553a-c933-88dc912145da (at 10.9.115.11@o2ib4)
[209651.044296] Lustre: Skipped 11 previous similar messages
[209699.927107] LNet: Service thread pid 106898 was inactive for 801.94s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[209699.940148] LNet: Skipped 11 previous similar messages
[209699.945383] LustreError: dumping log to /tmp/lustre-log.1576195938.106898
[209710.167167] LustreError: dumping log to /tmp/lustre-log.1576195948.106917
[209713.992316] LNet: Service thread pid 97355 completed after 900.06s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[209714.008599] LNet: Skipped 7 previous similar messages
[209722.455239] LustreError: dumping log to /tmp/lustre-log.1576195960.39438
[209726.551264] LustreError: dumping log to /tmp/lustre-log.1576195964.98340
[209734.743315] LustreError: dumping log to /tmp/lustre-log.1576195972.97377
[209738.839335] LustreError: dumping log to /tmp/lustre-log.1576195976.39246
[209744.983374] LustreError: dumping log to /tmp/lustre-log.1576195983.39360
[209747.031381] LustreError: dumping log to /tmp/lustre-log.1576195985.39346
[209783.895603] LustreError: dumping log to /tmp/lustre-log.1576196022.39241
[209785.943613] LustreError: dumping log to /tmp/lustre-log.1576196024.39257
[209787.853274] Lustre: fir-MDT0000: haven't heard from client af8d5000-0c68-4 (at 10.8.0.67@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba9684400, cur 1576196026 expire 1576195876 last 1576195799
[209787.873157] Lustre: Skipped 1 previous similar message
[209787.991624] LustreError: dumping log to /tmp/lustre-log.1576196026.97383
[209794.135665] LustreError: dumping log to /tmp/lustre-log.1576196032.39232
[209796.183687] LustreError: dumping log to /tmp/lustre-log.1576196034.39407
[209800.279705] LustreError: dumping log to /tmp/lustre-log.1576196038.106857
[209813.992990] LNet: Service thread pid 39426 completed after 998.83s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[209814.009252] LNet: Skipped 4 previous similar messages
[209820.760824] LustreError: dumping log to /tmp/lustre-log.1576196058.39324
[209878.104161] LustreError: dumping log to /tmp/lustre-log.1576196116.39404
[209913.993659] LNet: Service thread pid 39265 completed after 1089.26s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[209914.009990] LNet: Skipped 7 previous similar messages
[209921.240437] Lustre: 107132:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-398), not sending early reply req@ffff886928d1ba80 x1649340861729072/t0(0) o101->d5336f36-1352-ddc7-e966-e696298bb1ae@10.9.106.53@o2ib4:29/0 lens 1784/3288 e 0 to 0 dl 1576196164 ref 2 fl Interpret:/0/0 rc 0/0
[209921.269860] Lustre: 107132:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 45 previous similar messages
[209953.758011] Lustre: fir-MDT0000: Client 469c9fed-7a4e-a33d-2f08-51ca338b69fb (at 10.9.108.68@o2ib4) reconnecting
[209953.768272] Lustre: Skipped 35 previous similar messages
[209953.773698] Lustre: fir-MDT0000: Connection restored to 469c9fed-7a4e-a33d-2f08-51ca338b69fb (at 10.9.108.68@o2ib4)
[209953.784234] Lustre: Skipped 36 previous similar messages
[209964.120677] LNet: Service thread pid 106909 was inactive for 960.55s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[209964.137826] Pid: 106909, comm: mdt02_068 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[209964.148174] Call Trace:
[209964.150733] [] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[209964.157789] [] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[209964.165072] [] mdt_object_local_lock+0x50b/0xb20 [mdt]
[209964.172001] [] mdt_object_lock_internal+0x70/0x360 [mdt]
[209964.179093] [] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[209964.186101] [] mdt_intent_getattr+0x2b5/0x480 [mdt]
[209964.192763] [] mdt_intent_policy+0x435/0xd80 [mdt]
[209964.199351] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[209964.206194] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[209964.213421] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[209964.219671] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[209964.226715] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[209964.234516] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[209964.240944] [] kthread+0xd1/0xe0
[209964.245947] [] ret_from_fork_nospec_begin+0xe/0x21
[209964.252525] [] 0xffffffffffffffff
[209964.257633] LustreError: dumping log to /tmp/lustre-log.1576196202.106909
[209968.216704] Pid: 38898, comm: mdt03_002 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[209968.226963] Call Trace:
[209968.229513] [] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[209968.236546] [] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[209968.243860] [] mdt_object_local_lock+0x50b/0xb20 [mdt]
[209968.250777] [] mdt_object_lock_internal+0x70/0x360 [mdt]
[209968.257874] [] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[209968.264871] [] mdt_intent_getattr+0x2b5/0x480 [mdt]
[209968.271541] [] mdt_intent_policy+0x435/0xd80 [mdt]
[209968.278103] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[209968.284956] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[209968.292154] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[209968.298430] []
tgt_request_handle+0xaea/0x1580 [ptlrpc] [209968.305447] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [209968.313262] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [209968.319679] [] kthread+0xd1/0xe0 [209968.324679] [] ret_from_fork_nospec_begin+0xe/0x21 [209968.331247] [] 0xffffffffffffffff [209968.336365] LustreError: dumping log to /tmp/lustre-log.1576196206.38898 [210013.994268] LNet: Service thread pid 39325 completed after 1187.57s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [210014.010617] LNet: Skipped 14 previous similar messages [210025.561067] LNet: Service thread pid 39414 was inactive for 1011.58s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [210025.578201] LNet: Skipped 1 previous similar message [210025.583264] Pid: 39414, comm: mdt00_040 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [210025.593519] Call Trace: [210025.596073] [] call_rwsem_down_write_failed+0x17/0x30 [210025.602895] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [210025.610348] [] lod_qos_prep_create+0x12d7/0x1890 [lod] [210025.617256] [] lod_prepare_create+0x215/0x2e0 [lod] [210025.623917] [] lod_declare_striped_create+0x1ee/0x980 [lod] [210025.631273] [] lod_declare_create+0x204/0x590 [lod] [210025.637934] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [210025.645894] [] mdd_declare_create+0x4c/0xcb0 [mdd] [210025.652468] [] mdd_create+0x847/0x14e0 [mdd] [210025.658529] [] mdt_reint_open+0x224f/0x3240 [mdt] [210025.665037] [] mdt_reint_rec+0x83/0x210 [mdt] [210025.671173] [] mdt_reint_internal+0x6e3/0xaf0 [mdt] [210025.677841] [] mdt_intent_open+0x82/0x3a0 [mdt] [210025.684157] [] mdt_intent_policy+0x435/0xd80 [mdt] [210025.690742] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [210025.697591] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [210025.704794] [] tgt_enqueue+0x62/0x210 [ptlrpc] 
[210025.711049] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [210025.718083] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [210025.725902] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [210025.732331] [] kthread+0xd1/0xe0 [210025.737333] [] ret_from_fork_nospec_begin+0xe/0x21 [210025.743911] [] 0xffffffffffffffff [210025.749029] LustreError: dumping log to /tmp/lustre-log.1576196263.39414 [210025.756495] Pid: 88947, comm: mdt01_042 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [210025.766766] Call Trace: [210025.769313] [] call_rwsem_down_write_failed+0x17/0x30 [210025.776136] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [210025.783587] [] lod_qos_prep_create+0x12d7/0x1890 [lod] [210025.790511] [] lod_prepare_create+0x215/0x2e0 [lod] [210025.797173] [] lod_declare_striped_create+0x1ee/0x980 [lod] [210025.804517] [] lod_declare_create+0x204/0x590 [lod] [210025.811179] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [210025.819132] [] mdd_declare_create+0x4c/0xcb0 [mdd] [210025.825706] [] mdd_create+0x847/0x14e0 [mdd] [210025.831750] [] mdt_reint_open+0x224f/0x3240 [mdt] [210025.838247] [] mdt_reint_rec+0x83/0x210 [mdt] [210025.844386] [] mdt_reint_internal+0x6e3/0xaf0 [mdt] [210025.851057] [] mdt_intent_open+0x82/0x3a0 [mdt] [210025.857382] [] mdt_intent_policy+0x435/0xd80 [mdt] [210025.863959] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [210025.870801] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [210025.878010] [] tgt_enqueue+0x62/0x210 [ptlrpc] [210025.884261] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [210025.891295] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [210025.899099] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [210025.905525] [] kthread+0xd1/0xe0 [210025.910523] [] ret_from_fork_nospec_begin+0xe/0x21 [210025.917097] [] 0xffffffffffffffff [210025.922203] Pid: 106931, comm: mdt01_093 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [210025.932559] Call Trace: [210025.935108] [] 
call_rwsem_down_write_failed+0x17/0x30 [210025.941922] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [210025.949367] [] lod_qos_prep_create+0x12d7/0x1890 [lod] [210025.956273] [] lod_prepare_create+0x215/0x2e0 [lod] [210025.962935] [] lod_declare_striped_create+0x1ee/0x980 [lod] [210025.970280] [] lod_declare_create+0x204/0x590 [lod] [210025.976941] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [210025.984891] [] mdd_declare_create+0x4c/0xcb0 [mdd] [210025.991481] [] mdd_create+0x847/0x14e0 [mdd] [210025.997526] [] mdt_reint_open+0x224f/0x3240 [mdt] [210026.004025] [] mdt_reint_rec+0x83/0x210 [mdt] [210026.010162] [] mdt_reint_internal+0x6e3/0xaf0 [mdt] [210026.016825] [] mdt_intent_open+0x82/0x3a0 [mdt] [210026.023136] [] mdt_intent_policy+0x435/0xd80 [mdt] [210026.029711] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [210026.036552] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [210026.043761] [] tgt_enqueue+0x62/0x210 [ptlrpc] [210026.050005] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [210026.057053] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [210026.064850] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [210026.071278] [] kthread+0xd1/0xe0 [210026.076275] [] ret_from_fork_nospec_begin+0xe/0x21 [210026.082862] [] 0xffffffffffffffff [210026.087950] LNet: Service thread pid 98332 was inactive for 1012.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
[210026.101000] LNet: Skipped 42 previous similar messages
[210029.657082] LustreError: dumping log to /tmp/lustre-log.1576196267.39385
[210033.753098] LustreError: dumping log to /tmp/lustre-log.1576196271.39273
[210035.801109] LustreError: dumping log to /tmp/lustre-log.1576196273.39348
[210037.849126] LustreError: dumping log to /tmp/lustre-log.1576196276.97459
[210041.945147] LustreError: dumping log to /tmp/lustre-log.1576196280.39255
[210046.041175] LustreError: dumping log to /tmp/lustre-log.1576196284.39369
[210050.137196] LustreError: dumping log to /tmp/lustre-log.1576196288.98176
[210066.521295] LustreError: dumping log to /tmp/lustre-log.1576196304.39416
[210070.617321] LustreError: dumping log to /tmp/lustre-log.1576196308.38941
[210082.905410] LustreError: dumping log to /tmp/lustre-log.1576196321.97359
[210099.289489] LustreError: dumping log to /tmp/lustre-log.1576196337.39236
[210107.481540] LustreError: dumping log to /tmp/lustre-log.1576196345.106855
[210111.577561] LustreError: dumping log to /tmp/lustre-log.1576196349.39349
[210114.015918] LNet: Service thread pid 39336 completed after 1178.37s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[210114.032248] LNet: Skipped 10 previous similar messages
[210125.913653] LustreError: dumping log to /tmp/lustre-log.1576196364.97401
[210127.961655] LustreError: dumping log to /tmp/lustre-log.1576196366.39227
[210132.057691] LustreError: dumping log to /tmp/lustre-log.1576196370.39247
[210134.105697] LustreError: dumping log to /tmp/lustre-log.1576196372.97408
[210136.153719] LustreError: dumping log to /tmp/lustre-log.1576196374.39421
[210164.825887] LustreError: dumping log to /tmp/lustre-log.1576196402.107023
[210177.113960] LustreError: dumping log to /tmp/lustre-log.1576196415.97434
[210182.075993] LustreError: 39340:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576196120, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff888b8d7eaf40/0xc3c20c06d5315c19 lrc: 3/1,0 mode: --/PR res: [0x2000016f2:0x7:0x0].0x0 bits 0x13/0x0 rrc: 7 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39340 timeout: 0 lvb_type: 0
[210182.115288] LustreError: 39340:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 32 previous similar messages
[210209.883157] LustreError: dumping log to /tmp/lustre-log.1576196448.39411
[210214.039724] LNet: Service thread pid 39377 completed after 1100.06s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[210214.056070] LNet: Skipped 37 previous similar messages
[210226.266274] LustreError: dumping log to /tmp/lustre-log.1576196464.39403
[210287.706640] LNet: Service thread pid 97345 was inactive for 1063.42s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[210287.723753] LNet: Skipped 2 previous similar messages
[210287.728901] Pid: 97345, comm: mdt01_047 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[210287.739180] Call Trace:
[210287.741735] [] call_rwsem_down_write_failed+0x17/0x30
[210287.748561] [] lod_qos_statfs_update+0x97/0x2b0 [lod]
[210287.755409] [] lod_qos_prep_create+0x16a/0x1890 [lod]
[210287.762228] [] lod_prepare_create+0x215/0x2e0 [lod]
[210287.768899] [] lod_declare_striped_create+0x1ee/0x980 [lod]
[210287.776247] [] lod_declare_create+0x204/0x590 [lod]
[210287.782923] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[210287.790884] [] mdd_declare_create+0x4c/0xcb0 [mdd]
[210287.797480] [] mdd_create+0x847/0x14e0 [mdd]
[210287.803524] [] mdt_reint_open+0x224f/0x3240 [mdt]
[210287.810031] [] mdt_reint_rec+0x83/0x210 [mdt]
[210287.816169] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[210287.822841] [] mdt_intent_open+0x82/0x3a0 [mdt]
[210287.829153] [] mdt_intent_policy+0x435/0xd80 [mdt]
[210287.835730] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[210287.842584] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[210287.849795] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[210287.856046] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[210287.863097] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[210287.870918] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[210287.877349] [] kthread+0xd1/0xe0
[210287.882353] [] ret_from_fork_nospec_begin+0xe/0x21
[210287.888918] [] 0xffffffffffffffff
[210287.894040] LustreError: dumping log to /tmp/lustre-log.1576196526.97345
[210287.901508] Pid: 39323, comm: mdt00_016 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[210287.911801] Call Trace:
[210287.914361] [] call_rwsem_down_write_failed+0x17/0x30
[210287.921187] [] lod_qos_statfs_update+0x97/0x2b0 [lod]
[210287.928033] [] lod_qos_prep_create+0x16a/0x1890 [lod]
[210287.934863] [] lod_prepare_create+0x215/0x2e0 [lod]
[210287.941519] [] lod_declare_striped_create+0x1ee/0x980 [lod]
[210287.948894] [] lod_declare_create+0x204/0x590 [lod]
[210287.955549] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[210287.963519] [] mdd_declare_create+0x4c/0xcb0 [mdd]
[210287.970086] [] mdd_create+0x847/0x14e0 [mdd]
[210287.976147] [] mdt_reint_open+0x224f/0x3240 [mdt]
[210287.982632] [] mdt_reint_rec+0x83/0x210 [mdt]
[210287.988784] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[210287.995456] [] mdt_intent_open+0x82/0x3a0 [mdt]
[210288.001781] [] mdt_intent_policy+0x435/0xd80 [mdt]
[210288.008362] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[210288.015227] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[210288.022419] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[210288.028670] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[210288.035717] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[210288.043515] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[210288.049930] [] kthread+0xd1/0xe0
[210288.054937] [] ret_from_fork_nospec_begin+0xe/0x21
[210288.061512] [] 0xffffffffffffffff
[210314.041582] Lustre: 39411:0:(service.c:2165:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (636:481s); client may timeout. req@ffff887775c93180 x1652555110615360/t0(0) o101->cfe93466-ba97-4@10.9.0.62@o2ib4:691/0 lens 584/536 e 0 to 0 dl 1576196071 ref 1 fl Complete:/0/0 rc 0/0
[210314.068326] Lustre: 39411:0:(service.c:2165:ptlrpc_server_handle_request()) Skipped 3 previous similar messages
[210459.483701] Lustre: 106831:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-845), not sending early reply req@ffff8864fd63f980 x1649313082320112/t0(0) o101->7aa12007-79f9-a9cc-9090-a11975521a91@10.9.108.63@o2ib4:567/0 lens 1832/3288 e 0 to 0 dl 1576196702 ref 2 fl Interpret:/0/0 rc 0/0
[210459.513204] Lustre: 106831:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 84 previous similar messages
[210508.891987] LNet: Service thread pid 39192 was inactive for 1204.38s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[210508.909089] LNet: Skipped 1 previous similar message
[210508.914145] Pid: 39192, comm: mdt03_003 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[210508.924437] Call Trace:
[210508.926987] [] call_rwsem_down_write_failed+0x17/0x30
[210508.933807] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[210508.941256] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[210508.948175] [] lod_declare_instantiate_components+0x9a/0x1d0 [lod]
[210508.956137] [] lod_declare_layout_change+0xb65/0x10f0 [lod]
[210508.963481] [] mdd_declare_layout_change+0x62/0x120 [mdd]
[210508.970662] [] mdd_layout_change+0x882/0x1000 [mdd]
[210508.977312] [] mdt_layout_change+0x337/0x430 [mdt]
[210508.983910] [] mdt_intent_layout+0x7ee/0xcc0 [mdt]
[210508.990478] [] mdt_intent_policy+0x435/0xd80 [mdt]
[210508.997060] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[210509.003921] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[210509.011127] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[210509.017379] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[210509.024431] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[210509.032233] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[210509.038662] [] kthread+0xd1/0xe0
[210509.043665] [] ret_from_fork_nospec_begin+0xe/0x21
[210509.050266] [] 0xffffffffffffffff
[210509.055375] LustreError: dumping log to /tmp/lustre-log.1576196747.39192
[210525.276093] Pid: 97382, comm: mdt01_060 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[210525.286357] Call Trace:
[210525.288910] [] call_rwsem_down_write_failed+0x17/0x30
[210525.295733] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[210525.303187] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[210525.310092] [] lod_prepare_create+0x215/0x2e0 [lod]
[210525.316757] [] lod_declare_striped_create+0x1ee/0x980 [lod]
[210525.324100] [] lod_declare_create+0x204/0x590 [lod]
[210525.330761] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[210525.338728] [] mdd_declare_create+0x4c/0xcb0 [mdd]
[210525.345289] [] mdd_create+0x847/0x14e0 [mdd]
[210525.351348] [] mdt_reint_open+0x224f/0x3240 [mdt]
[210525.357838] [] mdt_reint_rec+0x83/0x210 [mdt]
[210525.363988] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[210525.370639] [] mdt_intent_open+0x82/0x3a0 [mdt]
[210525.376964] [] mdt_intent_policy+0x435/0xd80 [mdt]
[210525.383536] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[210525.390406] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[210525.397601] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[210525.403880] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[210525.410906] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[210525.418722] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[210525.425136] [] kthread+0xd1/0xe0
[210525.430143] [] ret_from_fork_nospec_begin+0xe/0x21
[210525.436698] [] 0xffffffffffffffff
[210525.441820] LustreError: dumping log to /tmp/lustre-log.1576196763.97382
[210525.449223] Pid: 39221, comm: mdt01_007 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[210525.459504] Call Trace:
[210525.462056] [] call_rwsem_down_write_failed+0x17/0x30
[210525.468877] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[210525.476344] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[210525.483257] [] lod_prepare_create+0x215/0x2e0 [lod]
[210525.489917] [] lod_declare_striped_create+0x1ee/0x980 [lod]
[210525.497260] [] lod_declare_create+0x204/0x590 [lod]
[210525.503923] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[210525.511873] [] mdd_declare_create+0x4c/0xcb0 [mdd]
[210525.518447] [] mdd_create+0x847/0x14e0 [mdd]
[210525.524490] [] mdt_reint_open+0x224f/0x3240 [mdt]
[210525.530990] [] mdt_reint_rec+0x83/0x210 [mdt]
[210525.537143] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[210525.543808] [] mdt_intent_open+0x82/0x3a0 [mdt]
[210525.550111] [] mdt_intent_policy+0x435/0xd80 [mdt]
[210525.556684] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[210525.563525] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[210525.570734] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[210525.576986] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[210525.584019] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[210525.591822] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[210525.598252] [] kthread+0xd1/0xe0
[210525.603262] [] ret_from_fork_nospec_begin+0xe/0x21
[210525.609830] [] 0xffffffffffffffff
[210573.636704] Lustre: fir-MDT0000: Client 55c89a19-c2de-4 (at 10.8.0.82@o2ib6) reconnecting
[210573.644968] Lustre: Skipped 43 previous similar messages
[210573.650404] Lustre: fir-MDT0000: Connection restored to 55c89a19-c2de-4 (at 10.8.0.82@o2ib6)
[210573.658929] Lustre: Skipped 43 previous similar messages
[210615.388625] Pid: 106834, comm: mdt01_078 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[210615.398977] Call Trace:
[210615.401534] [] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[210615.408577] [] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[210615.415856] [] mdt_object_local_lock+0x50b/0xb20 [mdt]
[210615.422771] [] mdt_object_lock_internal+0x70/0x360 [mdt]
[210615.429880] [] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[210615.436881] [] mdt_intent_getattr+0x2b5/0x480 [mdt]
[210615.443553] [] mdt_intent_policy+0x435/0xd80 [mdt]
[210615.450125] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[210615.456986] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[210615.464182] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[210615.470449] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[210615.477476] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[210615.485291] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[210615.491707] [] kthread+0xd1/0xe0
[210615.496722] [] ret_from_fork_nospec_begin+0xe/0x21
[210615.503288] [] 0xffffffffffffffff
[210615.508416] LustreError: dumping log to /tmp/lustre-log.1576196853.106834
[210615.515955] Pid: 39384, comm: mdt01_032 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[210615.526238] Call Trace:
[210615.528789] [] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[210615.535810] [] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[210615.543120] [] mdt_object_local_lock+0x50b/0xb20 [mdt]
[210615.550041] [] mdt_object_lock_internal+0x70/0x360 [mdt]
[210615.557130] [] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[210615.564133] [] mdt_intent_getattr+0x2b5/0x480 [mdt]
[210615.570796] [] mdt_intent_policy+0x435/0xd80 [mdt]
[210615.577359] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[210615.584219] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[210615.591415] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[210615.597681] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[210615.604726] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[210615.612545] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[210615.618958] [] kthread+0xd1/0xe0
[210615.623966] [] ret_from_fork_nospec_begin+0xe/0x21
[210615.630520] [] 0xffffffffffffffff
[210615.635633] Pid: 39334, comm: mdt00_020 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[210615.645891] Call Trace:
[210615.648443] [] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[210615.655463] [] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[210615.662752] [] mdt_object_local_lock+0x50b/0xb20 [mdt]
[210615.669660] [] mdt_object_lock_internal+0x70/0x360 [mdt]
[210615.676754] [] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[210615.683752] [] mdt_intent_getattr+0x2b5/0x480 [mdt]
[210615.690412] [] mdt_intent_policy+0x435/0xd80 [mdt]
[210615.696976] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[210615.703831] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[210615.711025] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[210615.717289] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[210615.724310] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[210615.732113] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[210615.738554] [] kthread+0xd1/0xe0
[210615.743544] [] ret_from_fork_nospec_begin+0xe/0x21
[210615.750110] [] 0xffffffffffffffff
[210615.755204] Pid: 97442, comm: mdt02_059 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[210615.765487] Call Trace:
[210615.768034] [] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[210615.775056] [] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[210615.782345] [] mdt_object_local_lock+0x50b/0xb20 [mdt]
[210615.789251] [] mdt_object_lock_internal+0x70/0x360 [mdt]
[210615.796347] [] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[210615.803358] [] mdt_intent_getattr+0x2b5/0x480 [mdt]
[210615.810032] [] mdt_intent_policy+0x435/0xd80 [mdt]
[210615.816595] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[210615.823447] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[210615.830644] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[210615.836899] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[210615.843921] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[210615.851737] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[210615.858153] [] kthread+0xd1/0xe0
[210615.863160] [] ret_from_fork_nospec_begin+0xe/0x21
[210615.869736] [] 0xffffffffffffffff
[210814.083842] LustreError: 97347:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576196752, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff8869c46f8900/0xc3c20c06d597b463 lrc: 3/1,0 mode: --/PR res: [0x200000406:0x1b2:0x0].0x0 bits 0x13/0x0 rrc: 44 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 97347 timeout: 0 lvb_type: 0
[210814.123407] LustreError: 97347:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 52 previous similar messages
[210814.126330] LNet: Service thread pid 39428 completed after 1700.14s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[210814.126333] LNet: Skipped 21 previous similar messages
[211214.206227] Lustre: fir-MDT0000: Client 03dd52b8-a4fc-4 (at 10.9.0.61@o2ib4) reconnecting
[211214.214499] Lustre: Skipped 16 previous similar messages
[211214.219940] Lustre: fir-MDT0000: Connection restored to 03dd52b8-a4fc-4 (at 10.9.0.61@o2ib4)
[211214.228471] Lustre: Skipped 16 previous similar messages
[211473.781975] LustreError: 39268:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576197411, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff8878de2818c0/0xc3c20c06d6189ee6 lrc: 3/1,0 mode: --/PR res: [0x200039577:0x1b1a:0x0].0x0 bits 0x13/0x0 rrc: 5 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39268 timeout: 0 lvb_type: 0
[211473.821535] LustreError: 39268:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 13 previous similar messages
[211810.148056] Lustre: 39429:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff888bafbcc800 x1649050425102112/t0(0) o101->4c1f7414-081e-38fa-7245-fdc2400de56e@10.9.101.49@o2ib4:408/0 lens 1792/3288 e 0 to 0 dl 1576198053 ref 2 fl Interpret:/0/0 rc 0/0
[211810.177478] Lustre: 39429:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 10 previous similar messages
[211816.069991] Lustre: fir-MDT0000: Client 4c1f7414-081e-38fa-7245-fdc2400de56e (at 10.9.101.49@o2ib4) reconnecting
[211816.080249] Lustre: Skipped 3 previous similar messages
[211816.085588] Lustre: fir-MDT0000: Connection restored to (at 10.9.101.49@o2ib4)
[211816.092993] Lustre: Skipped 3 previous similar messages
[212100.837802] LustreError: 39229:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576198038, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff88643a3a6780/0xc3c20c06d676e9be lrc: 3/0,1 mode: --/CW res: [0x2000376b8:0x1706e:0x0].0x0 bits 0x2/0x0 rrc: 28 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39229 timeout: 0 lvb_type: 0
[212100.877452] LustreError: 39229:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 38 previous similar messages
[212164.710193] Lustre: 39370:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff8874d5eb5e80 x1650958631802240/t0(0) o101->bdc6a669-f745-2944-1b74-3762ff7d0bf8@10.9.101.36@o2ib4:7/0 lens 584/3264 e 0 to 0 dl 1576198407 ref 2 fl Interpret:/0/0 rc 0/0
[212164.739351] Lustre: 39370:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 4 previous similar messages
[212464.743881] Lustre: 39389:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff888b6fa8b600 x1649420627536176/t0(0) o101->970bc850-7648-f96d-fc2b-8b8c64ce0bd4@10.9.101.52@o2ib4:307/0 lens 584/3264 e 0 to 0 dl 1576198707 ref 2 fl Interpret:/0/0 rc 0/0
[212464.773213] Lustre: 39389:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 4 previous similar messages
[212471.043623] Lustre: fir-MDT0000: Client 970bc850-7648-f96d-fc2b-8b8c64ce0bd4 (at 10.9.101.52@o2ib4) reconnecting
[212471.053883] Lustre: Skipped 8 previous similar messages
[212471.059227] Lustre: fir-MDT0000: Connection restored to (at 10.9.101.52@o2ib4)
[212471.066624] Lustre: Skipped 8 previous similar messages
[213014.196019] LustreError: 39357:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576198952, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff888be543d100/0xc3c20c06d6b8f03a lrc: 3/0,1 mode: --/CW res: [0x20003ac50:0x7f36:0x0].0x0 bits 0x2/0x0 rrc: 39 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39357 timeout: 0 lvb_type: 0
[213014.235608] LustreError: 39357:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 45 previous similar messages
[213614.210464] LustreError: 106832:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576199552, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0000_UUID lock: ffff886be6e7a1c0/0xc3c20c06d6ee5d4c lrc: 3/0,1 mode: --/CW res: [0x20003ac50:0x7f36:0x0].0x0 bits 0x2/0x0 rrc: 37 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 106832 timeout: 0 lvb_type: 0
[213614.250192] LustreError: 106832:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 10 previous similar messages
[213740.886807] Lustre: fir-MDT0000: haven't heard from client 62f117dd-237d-c074-d679-5244422357ce (at 10.9.103.27@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba9557c00, cur 1576199979 expire 1576199829 last 1576199752
[213740.908681] Lustre: Skipped 1 previous similar message
[213908.912270] Lustre: 106857:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff8865386b2d00 x1649340862596688/t0(0) o101->d5336f36-1352-ddc7-e966-e696298bb1ae@10.9.106.53@o2ib4:242/0 lens 584/3264 e 3 to 0 dl 1576200152 ref 2 fl Interpret:/0/0 rc 0/0
[213908.941428] Lustre: 106857:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message
[213914.736285] LNet: Service thread pid 39349 was inactive for 600.52s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[213914.753354] LNet: Skipped 6 previous similar messages
[213914.758506] Pid: 39349, comm: mdt03_022 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[213914.768831] Call Trace:
[213914.771395] [] osp_precreate_reserve+0x2e8/0x800 [osp]
[213914.778324] [] osp_declare_create+0x199/0x5b0 [osp]
[213914.785000] [] lod_sub_declare_create+0xdf/0x210 [lod]
[213914.791937] [] lod_qos_declare_object_on+0xbe/0x3a0 [lod]
[213914.799140] [] lod_alloc_qos.constprop.18+0x10f4/0x1840 [lod]
[213914.806696] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[213914.813640] [] lod_prepare_create+0x215/0x2e0 [lod]
[213914.820287] [] lod_declare_striped_create+0x1ee/0x980 [lod]
[213914.827651] [] lod_declare_create+0x204/0x590 [lod]
[213914.834301] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[213914.842273] [] mdd_declare_create+0x4c/0xcb0 [mdd]
[213914.848837] [] mdd_create+0x847/0x14e0 [mdd]
[213914.854890] [] mdt_reint_open+0x224f/0x3240 [mdt]
[213914.861387] [] mdt_reint_rec+0x83/0x210 [mdt]
[213914.867537] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[213914.874201] [] mdt_intent_open+0x82/0x3a0 [mdt]
[213914.880527] [] mdt_intent_policy+0x435/0xd80 [mdt]
[213914.887101] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[213914.893969] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[213914.901165] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[213914.907436] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[213914.914469] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[213914.922284] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[213914.928700] [] kthread+0xd1/0xe0
[213914.933718] [] ret_from_fork_nospec_begin+0xe/0x21
[213914.940293] [] 0xffffffffffffffff
[213914.945417] LustreError: dumping log to /tmp/lustre-log.1576200153.39349
[213914.953072] Pid: 39385, comm: mdt03_033 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[213914.963356] Call Trace:
[213914.965903] [] call_rwsem_down_write_failed+0x17/0x30
[213914.972728] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[213914.980172] [] lod_qos_prep_create+0x12d7/0x1890 [lod]
[213914.987096] [] lod_prepare_create+0x215/0x2e0 [lod]
[213914.993759] [] lod_declare_striped_create+0x1ee/0x980 [lod]
[213915.001102] [] lod_declare_create+0x204/0x590 [lod]
[213915.007777] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[213915.015724] [] mdd_declare_create+0x4c/0xcb0 [mdd]
[213915.022299] [] mdd_create+0x847/0x14e0 [mdd]
[213915.028342] [] mdt_reint_open+0x224f/0x3240 [mdt]
[213915.034839] [] mdt_reint_rec+0x83/0x210 [mdt]
[213915.040969] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[213915.047639] [] mdt_intent_open+0x82/0x3a0 [mdt]
[213915.053943] [] mdt_intent_policy+0x435/0xd80 [mdt]
[213915.060528] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[213915.067368] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[213915.074580] [] tgt_enqueue+0x62/0x210 [ptlrpc]
[213915.080828] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[213915.087861] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[213915.095664] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[213915.102093] [] kthread+0xd1/0xe0
[213915.107097] [] ret_from_fork_nospec_begin+0xe/0x21
[213915.113664] [] 0xffffffffffffffff
[213915.118757] Pid: 106832, comm: mdt01_076 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[213915.129118] Call Trace:
[213915.131664] [] call_rwsem_down_write_failed+0x17/0x30
[213915.138479] [] lod_qos_statfs_update+0x97/0x2b0 [lod]
[213915.145315] [] lod_qos_prep_create+0x16a/0x1890 [lod]
[213915.152138] [] lod_prepare_create+0x215/0x2e0 [lod]
[213915.158802] [] lod_declare_striped_create+0x1ee/0x980 [lod]
[213915.166143] [] lod_declare_create+0x204/0x590 [lod]
[213915.172805] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[213915.180755] [] mdd_declare_create+0x4c/0xcb0 [mdd]
[213915.187330] [] mdd_create+0x847/0x14e0 [mdd]
[213915.193373] [] mdt_reint_open+0x224f/0x3240 [mdt]
[213915.199873] [] mdt_reint_rec+0x83/0x210 [mdt]
[213915.206025] [] mdt_reint_internal+0x6e3/0xaf0 [mdt] [213915.212691] [] mdt_intent_open+0x82/0x3a0 [mdt] [213915.219001] [] mdt_intent_policy+0x435/0xd80 [mdt] [213915.225587] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [213915.229214] Lustre: fir-MDT0000: Client be4565a9-8448-ebff-ec7a-065a9a83593c (at 10.8.18.19@o2ib6) reconnecting [213915.229215] Lustre: Skipped 3 previous similar messages [213915.229236] Lustre: fir-MDT0000: Connection restored to (at 10.8.18.19@o2ib6) [213915.229237] Lustre: Skipped 3 previous similar messages [213915.260543] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [213915.267735] [] tgt_enqueue+0x62/0x210 [ptlrpc] [213915.273991] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [213915.281012] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [213915.288829] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [213915.295243] [] kthread+0xd1/0xe0 [213915.300235] [] ret_from_fork_nospec_begin+0xe/0x21 [213915.306811] [] 0xffffffffffffffff [213915.311902] Pid: 39324, comm: mdt03_015 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [213915.322171] Call Trace: [213915.324716] [] call_rwsem_down_write_failed+0x17/0x30 [213915.331540] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [213915.338998] [] lod_qos_prep_create+0x12d7/0x1890 [lod] [213915.345909] [] lod_prepare_create+0x215/0x2e0 [lod] [213915.352571] [] lod_declare_striped_create+0x1ee/0x980 [lod] [213915.359917] [] lod_declare_create+0x204/0x590 [lod] [213915.366586] [] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [213915.374536] [] mdd_declare_create+0x4c/0xcb0 [mdd] [213915.381110] [] mdd_create+0x847/0x14e0 [mdd] [213915.387153] [] mdt_reint_open+0x224f/0x3240 [mdt] [213915.393651] [] mdt_reint_rec+0x83/0x210 [mdt] [213915.399790] [] mdt_reint_internal+0x6e3/0xaf0 [mdt] [213915.406452] [] mdt_intent_open+0x82/0x3a0 [mdt] [213915.412754] [] mdt_intent_policy+0x435/0xd80 [mdt] [213915.419333] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [213915.426172] [] 
ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [213915.433380] [] tgt_enqueue+0x62/0x210 [ptlrpc] [213915.439624] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [213915.446657] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [213915.454459] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [213915.460889] [] kthread+0xd1/0xe0 [213915.465882] [] ret_from_fork_nospec_begin+0xe/0x21 [213915.472464] [] 0xffffffffffffffff [213916.784304] Pid: 39429, comm: mdt03_046 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [213916.794563] Call Trace: [213916.797105] [] call_rwsem_down_write_failed+0x17/0x30 [213916.803929] [] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [213916.811376] [] lod_qos_prep_create+0x12d7/0x1890 [lod] [213916.818284] [] lod_declare_instantiate_components+0x9a/0x1d0 [lod] [213916.826255] [] lod_declare_layout_change+0xb65/0x10f0 [lod] [213916.833598] [] mdd_declare_layout_change+0x62/0x120 [mdd] [213916.840781] [] mdd_layout_change+0x882/0x1000 [mdd] [213916.847462] [] mdt_layout_change+0x337/0x430 [mdt] [213916.854047] [] mdt_intent_layout+0x7ee/0xcc0 [mdt] [213916.860613] [] mdt_intent_policy+0x435/0xd80 [mdt] [213916.867187] [] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [213916.874028] [] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [213916.881228] [] tgt_enqueue+0x62/0x210 [ptlrpc] [213916.887471] [] tgt_request_handle+0xaea/0x1580 [ptlrpc] [213916.894507] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [213916.902308] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [213916.908737] [] kthread+0xd1/0xe0 [213916.913755] [] ret_from_fork_nospec_begin+0xe/0x21 [213916.920310] [] 0xffffffffffffffff [213916.925397] LustreError: dumping log to /tmp/lustre-log.1576200155.39429 [213927.024360] LNet: Service thread pid 39352 was inactive for 600.96s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
[213927.037305] LNet: Skipped 86 previous similar messages [213927.042537] LustreError: dumping log to /tmp/lustre-log.1576200165.39352 [213937.264428] LustreError: dumping log to /tmp/lustre-log.1576200175.39438 [213955.696551] LustreError: dumping log to /tmp/lustre-log.1576200193.97455 [213957.072566] Lustre: 39428:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-23), not sending early reply req@ffff88527f7c0480 x1649314426856784/t0(0) o101->1f72d546-482b-ba22-9634-964c4dc9701a@10.9.108.56@o2ib4:290/0 lens 1888/3288 e 0 to 0 dl 1576200200 ref 2 fl Interpret:/0/0 rc 0/0 [213957.101905] Lustre: 39428:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 14 previous similar messages [214014.215259] LNet: Service thread pid 39349 completed after 700.00s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [214014.231503] LNet: Skipped 22 previous similar messages [214015.088913] LNet: Service thread pid 39368 was inactive for 600.78s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
[214015.101858] LNet: Skipped 2 previous similar messages [214015.107004] LustreError: dumping log to /tmp/lustre-log.1576200253.39368 [214015.217110] Lustre: fir-MDT0000: Client 0c302cf4-1147-d945-dfa2-e9bc796b3175 (at 10.9.101.32@o2ib4) reconnecting [214015.227378] Lustre: Skipped 11 previous similar messages [214015.232821] Lustre: fir-MDT0000: Connection restored to (at 10.9.101.32@o2ib4) [214015.240226] Lustre: Skipped 11 previous similar messages [214037.233062] Lustre: 107149:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-23), not sending early reply req@ffff88637305b600 x1649050425908336/t0(0) o101->4c1f7414-081e-38fa-7245-fdc2400de56e@10.9.101.49@o2ib4:370/0 lens 584/3264 e 0 to 0 dl 1576200280 ref 2 fl Interpret:/0/0 rc 0/0 [214037.262386] Lustre: 107149:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 15 previous similar messages [214047.857107] LustreError: dumping log to /tmp/lustre-log.1576200285.39229 [214072.433263] LustreError: dumping log to /tmp/lustre-log.1576200310.107141 [214090.865396] LustreError: dumping log to /tmp/lustre-log.1576200328.39440 [214097.009428] LustreError: dumping log to /tmp/lustre-log.1576200335.39273 [214114.217910] LNet: Service thread pid 106930 completed after 700.00s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [214114.234244] LNet: Skipped 4 previous similar messages [214115.441550] LustreError: dumping log to /tmp/lustre-log.1576200353.39244 [214119.537573] LustreError: dumping log to /tmp/lustre-log.1576200357.39358 [214121.585584] LustreError: dumping log to /tmp/lustre-log.1576200359.39406 [214123.633600] LustreError: dumping log to /tmp/lustre-log.1576200361.39217 [214135.921674] LustreError: dumping log to /tmp/lustre-log.1576200374.106870 [214137.969684] LustreError: dumping log to /tmp/lustre-log.1576200376.39247 [214146.161737] LNet: Service thread pid 39437 was inactive for 601.00s. 
Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [214146.174681] LNet: Skipped 28 previous similar messages [214146.179917] LustreError: dumping log to /tmp/lustre-log.1576200384.39437 [214152.091706] Lustre: Failing over fir-MDT0000 [214152.176235] Lustre: fir-MDT0000: Not available for connect from 10.8.23.22@o2ib6 (stopping) [214152.184676] Lustre: Skipped 45 previous similar messages [214152.416894] LustreError: 97383:0:(ldlm_lockd.c:1348:ldlm_handle_enqueue0()) ### lock on destroyed export ffff886be3ac0800 ns: mdt-fir-MDT0000_UUID lock: ffff885dc50b0900/0xc3c20c06d71114b3 lrc: 3/0,0 mode: --/PR res: [0x20003ac50:0x7f36:0x0].0x0 bits 0x13/0x0 rrc: 32 type: IBT flags: 0x50306400000020 nid: 10.9.106.15@o2ib4 remote: 0x50fb205d05874a29 expref: 7 pid: 97383 timeout: 0 lvb_type: 0 [214152.681571] Lustre: fir-MDT0000: Not available for connect from 10.9.106.66@o2ib4 (stopping) [214152.690114] Lustre: Skipped 38 previous similar messages [214152.695547] LustreError: 43237:0:(ldlm_lockd.c:2324:ldlm_cancel_handler()) ldlm_cancel from 10.9.101.8@o2ib4 arrived at 1576200390 with bad export cookie 14105850204140236714 [214152.711189] LustreError: 43237:0:(ldlm_lockd.c:2324:ldlm_cancel_handler()) Skipped 1 previous similar message [214152.721216] LustreError: 43237:0:(ldlm_lock.c:2710:ldlm_lock_dump_handle()) ### ### ns: mdt-fir-MDT0000_UUID lock: ffff887bbdb13cc0/0xc3c20c06d566926d lrc: 3/0,0 mode: PR/PR res: [0x200000406:0xb3:0x0].0x0 bits 0x13/0x0 rrc: 378 type: IBT flags: 0x40200000000000 nid: 10.9.101.8@o2ib4 remote: 0x61759e98ea0e2f24 expref: 2 pid: 97398 timeout: 0 lvb_type: 0 [214153.010273] LustreError: 39405:0:(ldlm_lockd.c:1348:ldlm_handle_enqueue0()) ### lock on destroyed export ffff887ba9550c00 ns: mdt-fir-MDT0000_UUID lock: ffff88688aa10fc0/0xc3c20c06d71700f0 lrc: 3/0,0 mode: --/PR res: [0x200039577:0x11b6:0x0].0x0 bits 0x13/0x0 rrc: 4 type: IBT flags: 0x50306400000000 nid: 10.8.27.8@o2ib6 remote: 0x335e15af6b6351f4 expref: 16 pid: 
39405 timeout: 0 lvb_type: 0 [214153.044904] LustreError: 39405:0:(ldlm_lockd.c:1348:ldlm_handle_enqueue0()) Skipped 9 previous similar messages [214153.681576] Lustre: fir-MDT0000: Not available for connect from 10.9.108.30@o2ib4 (stopping) [214153.690112] Lustre: Skipped 72 previous similar messages [214153.720043] LustreError: 45692:0:(ldlm_lockd.c:2324:ldlm_cancel_handler()) ldlm_cancel from 10.0.10.112@o2ib7 arrived at 1576200391 with bad export cookie 14105850204140815068 [214153.735767] LustreError: 45692:0:(ldlm_lockd.c:2324:ldlm_cancel_handler()) Skipped 8 previous similar messages [214153.799330] LustreError: 108863:0:(ldlm_resource.c:1147:ldlm_resource_complain()) mdt-fir-MDT0000_UUID: namespace resource [0x200038534:0x1ace:0x0].0x0 (ffff8856ccabbbc0) refcount nonzero (2) after lock cleanup; forcing cleanup. [214155.505086] LustreError: 43239:0:(ldlm_lockd.c:2324:ldlm_cancel_handler()) ldlm_cancel from 10.0.10.105@o2ib7 arrived at 1576200393 with bad export cookie 14105850204140656658 [214155.520819] LustreError: 43239:0:(ldlm_lockd.c:2324:ldlm_cancel_handler()) Skipped 22 previous similar messages [214155.711507] Lustre: fir-MDT0000: Not available for connect from 10.8.22.24@o2ib6 (stopping) [214155.719953] Lustre: Skipped 96 previous similar messages [214158.449811] LustreError: dumping log to /tmp/lustre-log.1576200396.97457 [214158.630539] LustreError: 43239:0:(ldlm_lockd.c:2324:ldlm_cancel_handler()) ldlm_cancel from 10.0.10.110@o2ib7 arrived at 1576200396 with bad export cookie 14105850204140426456 [214158.646271] LustreError: 43239:0:(ldlm_lockd.c:2324:ldlm_cancel_handler()) Skipped 42 previous similar messages [214159.764822] Lustre: fir-MDT0000: Not available for connect from 10.9.108.61@o2ib4 (stopping) [214159.773351] Lustre: Skipped 90 previous similar messages [214163.989274] LustreError: 38879:0:(ldlm_lockd.c:2324:ldlm_cancel_handler()) ldlm_cancel from 10.0.10.111@o2ib7 arrived at 1576200402 with bad export cookie 14105850204141199347 
[214164.005042] LustreError: 38879:0:(ldlm_lockd.c:2324:ldlm_cancel_handler()) Skipped 21 previous similar messages [214167.773868] Lustre: fir-MDT0000: Not available for connect from 10.9.105.38@o2ib4 (stopping) [214167.782396] Lustre: Skipped 194 previous similar messages [214175.646196] LustreError: 38875:0:(ldlm_lockd.c:2324:ldlm_cancel_handler()) ldlm_cancel from 10.0.10.109@o2ib7 arrived at 1576200413 with bad export cookie 14105850204140880259 [214175.661947] LustreError: 38875:0:(ldlm_lockd.c:2324:ldlm_cancel_handler()) Skipped 25 previous similar messages [214178.867929] LustreError: 0-0: Forced cleanup waiting for mdt-fir-MDT0000_UUID namespace with 103 resources in use, (rc=-110) [214178.929930] LustreError: dumping log to /tmp/lustre-log.1576200417.39266 [214184.851608] Lustre: fir-MDT0000: Not available for connect from 10.8.0.65@o2ib6 (stopping) [214184.859996] Lustre: Skipped 860 previous similar messages [214199.410050] LustreError: dumping log to /tmp/lustre-log.1576200437.107183 [214201.202105] Lustre: 39344:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-23), not sending early reply req@ffff888bc7722400 x1652733444556736/t0(0) o101->d2bd0014-3bea-4@10.9.114.7@o2ib4:534/0 lens 1792/3288 e 0 to 0 dl 1576200444 ref 2 fl Interpret:/0/0 rc 0/0 [214201.229529] Lustre: 39344:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 15 previous similar messages [214203.879083] LustreError: 0-0: Forced cleanup waiting for mdt-fir-MDT0000_UUID namespace with 103 resources in use, (rc=-110) [214214.173625] LustreError: 108933:0:(qsd_reint.c:56:qsd_reint_completion()) fir-MDT0000: failed to enqueue global quota lock, glb fid:[0x200000006:0x20000:0x0], rc:-108 [214214.188570] LustreError: 108933:0:(qsd_reint.c:56:qsd_reint_completion()) Skipped 2 previous similar messages [214214.249472] LNet: Service thread pid 39358 completed after 695.19s. 
This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [214214.249540] Lustre: 39266:0:(service.c:2165:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (628:8s); client may timeout. req@ffff888bc7722400 x1652733444556736/t0(0) o101->d2bd0014-3bea-4@10.9.114.7@o2ib4:534/0 lens 1792/560 e 0 to 0 dl 1576200444 ref 1 fl Complete:/0/0 rc -19/-19 [214214.292790] LNet: Skipped 27 previous similar messages [214249.098091] Lustre: fir-MDT0000: Not available for connect from 10.0.10.3@o2ib7 (stopping) [214249.106448] Lustre: Skipped 2 previous similar messages [214261.492179] LustreError: 108956:0:(qsd_reint.c:56:qsd_reint_completion()) fir-MDT0000: failed to enqueue global quota lock, glb fid:[0x200000006:0x10000:0x0], rc:-108 [214261.507178] LustreError: 108956:0:(qsd_reint.c:56:qsd_reint_completion()) Skipped 2 previous similar messages [214274.173947] LustreError: 108960:0:(qsd_reint.c:56:qsd_reint_completion()) fir-MDT0000: failed to enqueue global quota lock, glb fid:[0x200000006:0x20000:0x0], rc:-108 [214274.188911] LustreError: 108960:0:(qsd_reint.c:56:qsd_reint_completion()) Skipped 2 previous similar messages [214301.057672] Lustre: server umount fir-MDT0000 complete [214307.481016] sched: RT throttling activated [214341.502033] LDISKFS-fs (dm-0): file extents enabled, maximum tree depth=5 [214341.589863] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. 
Opts: user_xattr,errors=remount-ro,acl,no_mbcache,nodelalloc [214343.917340] Lustre: fir-MDT0000: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900 [214344.071898] Lustre: fir-MDD0000: changelog on [214344.081243] Lustre: fir-MDT0000: in recovery but waiting for the first client to connect [214348.929775] Lustre: fir-MDT0000: Will be in recovery for at least 2:30, or until 1271 clients reconnect [214348.932768] Lustre: fir-MDT0000: Connection restored to 704e8622-7442-8eb3-b4e3-c86a69ef45af (at 10.8.20.21@o2ib6) [214348.932770] Lustre: Skipped 24 previous similar messages [214355.353206] LustreError: 109508:0:(tgt_handler.c:525:tgt_filter_recovery_request()) @@@ not permitted during recovery req@ffff885d5d2e5e80 x1652747909508464/t0(0) o601->fir-MDT0000-lwp-MDT0001_UUID@10.0.10.52@o2ib7:683/0 lens 336/0 e 0 to 0 dl 1576201348 ref 1 fl Interpret:/0/ffffffff rc 0/-1 [214355.379267] LustreError: 109508:0:(tgt_handler.c:525:tgt_filter_recovery_request()) Skipped 18 previous similar messages [214357.365980] Lustre: 109440:0:(ldlm_lib.c:1765:extend_recovery_timer()) fir-MDT0000: extended recovery timer reaching hard limit: 900, extend: 1 [214357.378930] Lustre: 109440:0:(ldlm_lib.c:1765:extend_recovery_timer()) Skipped 36 previous similar messages [214357.416716] Lustre: fir-MDT0000: Recovery over after 0:08, of 1271 clients 1271 recovered and 0 were evicted. [214775.885439] Lustre: MGS: haven't heard from client 46ad863f-f227-deee-59d2-4b6842f8fe21 (at 10.9.102.53@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff888bfa311c00, cur 1576201014 expire 1576200864 last 1576200787 [214775.906623] Lustre: Skipped 1 previous similar message [215659.534872] Lustre: MGS: Connection restored to 62f117dd-237d-c074-d679-5244422357ce (at 10.9.103.27@o2ib4) [215659.544698] Lustre: Skipped 1369 previous similar messages [217112.123261] Lustre: MGS: Connection restored to (at 10.9.102.53@o2ib4) [217112.129972] Lustre: Skipped 1 previous similar message [217457.896409] Lustre: MGS: haven't heard from client 6784b256-083b-783f-ab9f-d610fc101c63 (at 10.9.104.28@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bd0a2a400, cur 1576203696 expire 1576203546 last 1576203469 [217457.917594] Lustre: Skipped 1 previous similar message [218323.912309] Lustre: MGS: haven't heard from client e6faa273-d68e-7054-d60c-905379aaf1ac (at 10.9.101.51@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bc4e8b000, cur 1576204562 expire 1576204412 last 1576204335 [218323.933491] Lustre: Skipped 1 previous similar message [219819.475502] Lustre: MGS: Connection restored to c463879e-71d6-cfb3-b583-923d4925c479 (at 10.9.104.28@o2ib4) [219819.485383] Lustre: Skipped 1 previous similar message [220517.030635] Lustre: MGS: Connection restored to 7de2709b-434b-c2b2-ee11-fe99c3a9d16f (at 10.9.101.51@o2ib4) [220517.040466] Lustre: Skipped 1 previous similar message [226006.954933] Lustre: fir-MDT0000: haven't heard from client 935b75df-613a-c7ad-95b7-8cbfb8326a67 (at 10.9.101.28@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887bbf04fc00, cur 1576212245 expire 1576212095 last 1576212018 [226006.976812] Lustre: Skipped 1 previous similar message [226018.958011] Lustre: MGS: haven't heard from client 62818541-2f9e-3fbf-37a6-6cd1b5c2b596 (at 10.9.101.28@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff888be0888000, cur 1576212257 expire 1576212107 last 1576212030 [228024.091456] Lustre: MGS: Connection restored to fe46e801-2d86-9439-0b24-b78514ed5486 (at 10.9.109.8@o2ib4) [228024.101200] Lustre: Skipped 1 previous similar message [228246.317728] Lustre: MGS: Connection restored to 935b75df-613a-c7ad-95b7-8cbfb8326a67 (at 10.9.101.28@o2ib4) [228246.327561] Lustre: Skipped 1 previous similar message [236658.010764] Lustre: fir-MDT0000: haven't heard from client 4684778f-c6ca-8992-003a-f2d67b1c6b5d (at 10.9.105.39@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8879d2776400, cur 1576222896 expire 1576222746 last 1576222669 [239013.213918] Lustre: MGS: Connection restored to (at 10.9.105.39@o2ib4) [239013.220623] Lustre: Skipped 1 previous similar message [240008.839013] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [240008.847284] Lustre: Skipped 24 previous similar messages [240008.852717] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [240008.859939] Lustre: Skipped 1 previous similar message [240031.871134] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [240031.888421] LustreError: Skipped 10 previous similar messages [243015.091736] Lustre: MGS: haven't heard from client d77c7f7a-9a96-f21e-841a-6d14d4ec395f (at 10.9.101.23@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff885bcde28c00, cur 1576229253 expire 1576229103 last 1576229026 [243015.112927] Lustre: Skipped 1 previous similar message [245300.852407] Lustre: MGS: Connection restored to eeedb4d1-a88f-91d4-b517-9794013a9735 (at 10.9.101.23@o2ib4) [245398.768527] Lustre: MGS: Connection restored to 74bb7759-5b69-188d-1d68-d42ea52dd73e (at 10.8.22.9@o2ib6) [245398.778208] Lustre: Skipped 1 previous similar message [245421.695844] Lustre: MGS: Connection restored to f2522b1b-c543-fb0a-5671-8b71578820aa (at 10.8.23.5@o2ib6) [245421.705499] Lustre: Skipped 1 previous similar message [245434.313654] Lustre: MGS: Connection restored to a330e9da-f898-eba1-da4d-e66f263c946d (at 10.8.23.6@o2ib6) [245434.323307] Lustre: Skipped 1 previous similar message [245442.310078] Lustre: MGS: Connection restored to b917c939-9df9-0eb7-4a6b-af8df30cce5c (at 10.8.23.2@o2ib6) [245442.319733] Lustre: Skipped 1 previous similar message [245737.003526] Lustre: MGS: Connection restored to 609336f4-1458-3d02-65ef-5e8312905d2d (at 10.9.103.35@o2ib4) [245737.013371] Lustre: Skipped 1 previous similar message [250558.152919] Lustre: fir-MDT0000: haven't heard from client c69da9eb-2a67-e47a-288a-b07cd55ef6e7 (at 10.9.102.22@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8879d2774000, cur 1576236796 expire 1576236646 last 1576236569 [250558.174797] Lustre: Skipped 1 previous similar message [251979.105403] Lustre: MGS: haven't heard from client 2d37a8cd-2c9e-a1f5-ecd0-228eabddb956 (at 10.9.101.40@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff885bdaaba400, cur 1576238217 expire 1576238067 last 1576237990 [251979.126592] Lustre: Skipped 1 previous similar message [252861.290310] Lustre: MGS: Connection restored to (at 10.9.102.22@o2ib4) [252861.297021] Lustre: Skipped 1 previous similar message [253355.137455] Lustre: MGS: Connection restored to (at 10.9.101.40@o2ib4) [253355.144159] Lustre: Skipped 1 previous similar message [260132.473890] Lustre: MGS: Connection restored to (at 10.9.103.8@o2ib4) [260132.480519] Lustre: Skipped 1 previous similar message [264602.506433] Lustre: MGS: Connection restored to (at 10.9.102.58@o2ib4) [264602.513147] Lustre: Skipped 1 previous similar message [266608.192073] Lustre: MGS: haven't heard from client 268f9cff-5f59-279f-eccf-0350e45c9030 (at 10.8.27.22@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bdde1dc00, cur 1576252846 expire 1576252696 last 1576252619 [266608.213165] Lustre: Skipped 1 previous similar message [268734.115027] Lustre: MGS: Connection restored to (at 10.8.27.22@o2ib6) [268734.121650] Lustre: Skipped 1 previous similar message [269833.210920] Lustre: fir-MDT0000: haven't heard from client 612b8514-2fed-4 (at 10.9.115.12@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887ab2e98000, cur 1576256071 expire 1576255921 last 1576255844 [269833.230973] Lustre: Skipped 1 previous similar message [271006.237052] Lustre: MGS: Connection restored to 27dd63c4-0630-b8af-eb2d-2f38c1747230 (at 10.8.19.5@o2ib6) [271006.246712] Lustre: Skipped 1 previous similar message [271008.791676] Lustre: MGS: Connection restored to (at 10.8.19.1@o2ib6) [271008.798214] Lustre: Skipped 1 previous similar message [271281.871892] Lustre: MGS: Connection restored to 4c497e0b-ea41-4 (at 10.8.9.1@o2ib6) [271281.879647] Lustre: Skipped 1 previous similar message [271488.840695] Lustre: MGS: Connection restored to (at 10.8.7.12@o2ib6) [271488.847231] Lustre: Skipped 1 previous similar message [271708.815430] Lustre: MGS: Connection restored to a5fa5de0-cdab-4a43-0304-bf3577f942f3 (at 10.9.103.40@o2ib4) [271708.825265] Lustre: Skipped 1 previous similar message [271860.326330] Lustre: MGS: Connection restored to (at 10.8.27.2@o2ib6) [271860.332868] Lustre: Skipped 1 previous similar message [271894.841280] Lustre: MGS: Connection restored to c681c8c8-a3bd-4f09-2cf4-358a58ae71d2 (at 10.9.117.22@o2ib4) [271894.851111] Lustre: Skipped 1 previous similar message [271937.258302] Lustre: MGS: Connection restored to 75c6d6d0-df4c-7543-716f-77a06d0b577a (at 10.9.103.68@o2ib4) [271937.268166] Lustre: Skipped 7 previous similar messages [272067.166104] Lustre: MGS: Connection restored to 7e6b1bcc-06cc-6146-e31c-86eefaf425fd (at 10.9.101.53@o2ib4) [272067.175931] Lustre: Skipped 7 previous similar messages [272441.199200] Lustre: MGS: Connection restored to (at 10.9.115.12@o2ib4) [272441.205916] Lustre: Skipped 15 previous similar messages [273201.314335] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [273201.322260] Lustre: Skipped 3 previous similar messages [273605.255290] Lustre: fir-MDT0000: haven't heard from client c31f9bc3-7f63-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff885e4ea88c00, cur 1576259843 expire 1576259693 last 1576259616 [273605.275263] Lustre: Skipped 9 previous similar messages [273963.237000] Lustre: MGS: haven't heard from client f6ff0a39-a2e7-4cc1-5e90-2db4d23ec000 (at 10.9.109.54@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bcafef800, cur 1576260201 expire 1576260051 last 1576259974 [273963.258191] Lustre: Skipped 1 previous similar message [275965.302065] Lustre: MGS: Connection restored to 926d1d3f-c00b-0a14-c2c2-d31d7ccaab04 (at 10.9.109.54@o2ib4) [275965.311891] Lustre: Skipped 3 previous similar messages [278945.753404] Lustre: MGS: Connection restored to (at 10.9.115.12@o2ib4) [278945.760109] Lustre: Skipped 1 previous similar message [281708.472578] Lustre: MGS: Connection restored to (at 10.9.117.19@o2ib4) [281708.479291] Lustre: Skipped 1 previous similar message [281710.602059] Lustre: MGS: Connection restored to 75c6d6d0-df4c-7543-716f-77a06d0b577a (at 10.9.103.68@o2ib4) [281710.611884] Lustre: Skipped 4 previous similar messages [281713.405484] Lustre: MGS: Connection restored to (at 10.9.110.16@o2ib4) [281713.412194] Lustre: Skipped 3 previous similar messages [281740.266223] Lustre: fir-MDT0000: haven't heard from client 6c64d62f-48f9-4 (at 10.9.117.19@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff885d5b37f400, cur 1576267978 expire 1576267828 last 1576267751 [281740.286280] Lustre: Skipped 1 previous similar message [281746.806957] Lustre: MGS: Connection restored to b0b26ee9-c32f-6532-44f3-71f6594472dc (at 10.9.107.38@o2ib4) [281746.816788] Lustre: Skipped 1 previous similar message [281761.200194] Lustre: MGS: Connection restored to (at 10.9.103.29@o2ib4) [281761.206905] Lustre: Skipped 1 previous similar message [281967.755399] Lustre: MGS: Connection restored to (at 10.9.115.12@o2ib4) [281967.762111] Lustre: Skipped 1 previous similar message [282007.906193] Lustre: MGS: Connection restored to a5fa5de0-cdab-4a43-0304-bf3577f942f3 (at 10.9.103.40@o2ib4) [282007.916021] Lustre: Skipped 1 previous similar message [282431.983199] Lustre: MGS: Connection restored to (at 10.9.116.4@o2ib4) [282431.989820] Lustre: Skipped 3 previous similar messages [282625.940779] Lustre: MGS: Connection restored to ffd29e47-c156-7a5d-e13c-47520fdf8012 (at 10.9.107.8@o2ib4) [282625.950530] Lustre: Skipped 1 previous similar message [283096.272807] Lustre: fir-MDT0000: haven't heard from client d0d4132c-29bd-a096-d75c-9d230454f241 (at 10.9.109.67@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba97b7400, cur 1576269334 expire 1576269184 last 1576269107 [283096.294689] Lustre: Skipped 23 previous similar messages [285310.155945] Lustre: MGS: Connection restored to (at 10.9.109.67@o2ib4) [285310.162667] Lustre: Skipped 1 previous similar message [286913.323492] Lustre: fir-MDT0000: haven't heard from client 2c477d6e-fd13-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887ac9f1f800, cur 1576273151 expire 1576273001 last 1576272924 [286913.343461] Lustre: Skipped 1 previous similar message [286937.781051] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [286937.788982] Lustre: Skipped 1 previous similar message [287150.680434] Lustre: MGS: Connection restored to 16834498-f082-b8d6-0fed-822dab1a074a (at 10.8.26.35@o2ib6) [287150.690174] Lustre: Skipped 1 previous similar message [287190.299398] Lustre: fir-MDT0000: haven't heard from client 1c607bab-82d8-4 (at 10.8.26.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88794e3f5800, cur 1576273428 expire 1576273278 last 1576273201 [287190.319401] Lustre: Skipped 1 previous similar message [287252.173352] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [287252.181297] Lustre: Skipped 1 previous similar message [287490.775470] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [287490.783393] Lustre: Skipped 1 previous similar message [287555.301763] Lustre: MGS: haven't heard from client e8b8eb1f-1d31-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8852a535b400, cur 1576273793 expire 1576273643 last 1576273566 [287555.321042] Lustre: Skipped 3 previous similar messages [288045.336170] Lustre: MGS: haven't heard from client 8391c5c8-fdfe-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff886137b52c00, cur 1576274283 expire 1576274133 last 1576274056 [288045.355449] Lustre: Skipped 1 previous similar message [288058.910991] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [288058.918909] Lustre: Skipped 1 previous similar message [288348.533053] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [288348.540972] Lustre: Skipped 1 previous similar message [288387.327168] Lustre: MGS: haven't heard from client 22d32650-8ccc-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88576bb17000, cur 1576274625 expire 1576274475 last 1576274398 [288387.346446] Lustre: Skipped 1 previous similar message [288611.830973] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [288611.838889] Lustre: Skipped 1 previous similar message [288676.303973] Lustre: fir-MDT0000: haven't heard from client 24f7595d-69f0-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886bea748c00, cur 1576274914 expire 1576274764 last 1576274687 [288676.323940] Lustre: Skipped 1 previous similar message [288890.309276] Lustre: fir-MDT0000: haven't heard from client d86e0d94-a8a4-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8868f93f7800, cur 1576275128 expire 1576274978 last 1576274901 [288890.329244] Lustre: Skipped 1 previous similar message [289460.308419] Lustre: fir-MDT0000: haven't heard from client fbb85c61-fcc5-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8857da1c1400, cur 1576275698 expire 1576275548 last 1576275471 [289460.328392] Lustre: Skipped 1 previous similar message [289507.359965] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [289507.367891] Lustre: Skipped 3 previous similar messages [289735.971794] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [289735.979714] Lustre: Skipped 1 previous similar message [289786.314849] Lustre: MGS: haven't heard from client ee164e66-3d55-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886ae2296800, cur 1576276024 expire 1576275874 last 1576275797 [289786.334128] Lustre: Skipped 1 previous similar message [289967.727499] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [289967.735431] Lustre: Skipped 1 previous similar message [289973.312414] Lustre: fir-MDT0000: haven't heard from client a444a098-ee47-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8877152b5c00, cur 1576276211 expire 1576276061 last 1576275984 [289973.332389] Lustre: Skipped 1 previous similar message [290321.315187] Lustre: fir-MDT0000: haven't heard from client 82452b2d-c1d7-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8857dd406000, cur 1576276559 expire 1576276409 last 1576276332 [290321.335178] Lustre: Skipped 1 previous similar message [290324.446066] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [290324.453987] Lustre: Skipped 1 previous similar message [294086.742111] perf: interrupt took too long (3129 > 3126), lowering kernel.perf_event_max_sample_rate to 63000 [294475.344503] Lustre: MGS: haven't heard from client 756b9e6b-bde2-5455-3bcc-6e042d48fa50 (at 10.9.110.43@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff888bea725c00, cur 1576280713 expire 1576280563 last 1576280486 [294475.365685] Lustre: Skipped 1 previous similar message [294778.341921] Lustre: MGS: haven't heard from client c97fa3e1-0063-4d17-d31a-967ab2f0f2f2 (at 10.9.108.48@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bf3c86400, cur 1576281016 expire 1576280866 last 1576280789 [294778.363104] Lustre: Skipped 1 previous similar message [296605.781051] Lustre: MGS: Connection restored to 8a803f91-236e-6a0c-6c70-125991cec704 (at 10.9.110.43@o2ib4) [296605.790883] Lustre: Skipped 1 previous similar message [297132.428963] Lustre: MGS: Connection restored to (at 10.8.20.4@o2ib6) [297132.435495] Lustre: Skipped 1 previous similar message [297201.605358] Lustre: MGS: Connection restored to (at 10.8.23.11@o2ib6) [297201.611981] Lustre: Skipped 3 previous similar messages [298137.421048] Lustre: fir-MDT0000: haven't heard from client 1c655101-6947-4 (at 10.9.109.37@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88784209f400, cur 1576284375 expire 1576284225 last 1576284148 [298137.441121] Lustre: Skipped 1 previous similar message [298164.233327] Lustre: MGS: Connection restored to (at 10.9.109.37@o2ib4) [298164.240037] Lustre: Skipped 9 previous similar messages [298954.361913] Lustre: fir-MDT0000: haven't heard from client bd57f75b-13e5-e3a1-9b0f-6e32b1e20e30 (at 10.9.110.42@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887402e67800, cur 1576285192 expire 1576285042 last 1576284965 [298954.383790] Lustre: Skipped 1 previous similar message [300153.659132] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [300153.667422] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [300153.674648] Lustre: Skipped 1 previous similar message [300241.899215] Lustre: MGS: Received new LWP connection from 10.9.0.64@o2ib4, removing former export from same NID [300241.909422] Lustre: Skipped 1 previous similar message [300241.914678] Lustre: MGS: Connection restored to (at 10.9.0.64@o2ib4) [300241.914806] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [300271.957719] Lustre: MGS: Received new LWP connection from 10.9.0.64@o2ib4, removing former export from same NID [300271.957798] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [300271.985205] Lustre: MGS: Connection restored to (at 10.9.0.64@o2ib4) [300297.045505] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [300297.045630] LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [300297.071074] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [301050.979017] Lustre: MGS: Connection restored to (at 10.9.110.42@o2ib4) [301243.397904] Lustre: fir-MDT0000: haven't heard from client c1df1d70-b0ee-e5cb-6b7f-82f38506f9a4 (at 10.8.23.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887bbe9c6000, cur 1576287481 expire 1576287331 last 1576287254 [301243.419643] Lustre: Skipped 1 previous similar message [303274.929670] Lustre: MGS: Connection restored to 434330d3-8def-ac3b-ef45-a9ad34a86cc5 (at 10.8.23.9@o2ib6) [303274.939330] Lustre: Skipped 1 previous similar message [303327.287760] Lustre: MGS: Connection restored to 9cc4136f-bf17-6652-57d0-d2da120c520f (at 10.8.22.36@o2ib6) [303327.297505] Lustre: Skipped 1 previous similar message [303328.610258] Lustre: MGS: Connection restored to (at 10.8.23.7@o2ib6) [303328.616791] Lustre: Skipped 1 previous similar message [303335.965634] Lustre: MGS: Connection restored to (at 10.8.22.35@o2ib6) [303335.972258] Lustre: Skipped 1 previous similar message [303342.067143] Lustre: MGS: Connection restored to 9c1ce0f7-d74e-89cc-501f-2f65de889131 (at 10.8.22.34@o2ib6) [303342.076883] Lustre: Skipped 1 previous similar message [303450.325883] Lustre: MGS: Connection restored to (at 10.8.23.4@o2ib6) [303450.332417] Lustre: Skipped 3 previous similar messages [303522.662475] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [303522.670395] Lustre: Skipped 1 previous similar message [303523.075544] LustreError: 109593:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) ### client (nid 10.8.18.35@o2ib6) returned error from blocking AST (req@ffff885bbde33f00 x1652543205970656 status -107 rc -107), evict it ns: mdt-fir-MDT0000_UUID lock: ffff8857cd58c140/0xc3c20c07049fed20 lrc: 4/0,0 mode: EX/EX res: [0x20003c212:0xa:0x0].0x0 bits 0x8/0x0 rrc: 6 type: IBT flags: 0x60000400000020 nid: 10.8.18.35@o2ib6 remote: 0xb5fa43053486a2e2 expref: 1068 pid: 109568 timeout: 303670 lvb_type: 3 [303523.075878] LustreError: 138-a: fir-MDT0000: A client on nid 10.8.18.35@o2ib6 was evicted due to a lock blocking callback time out: rc -107 [303523.075902] LustreError: 38883:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 0s: evicting client at 10.8.18.35@o2ib6 ns: 
mdt-fir-MDT0000_UUID lock: ffff8855de08f980/0xc3c20c07049fd6fa lrc: 3/0,0 mode: EX/EX res: [0x20003c212:0xb:0x0].0x0 bits 0x8/0x0 rrc: 5 type: IBT flags: 0x60000400000020 nid: 10.8.18.35@o2ib6 remote: 0xb5fa430534869e66 expref: 1069 pid: 109695 timeout: 0 lvb_type: 3 [303523.168397] LustreError: 109593:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) Skipped 1 previous similar message [303548.417493] Lustre: MGS: haven't heard from client 780a55b3-86f7-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886375958c00, cur 1576289786 expire 1576289636 last 1576289559 [303548.436771] Lustre: Skipped 1 previous similar message [304025.752535] Lustre: 109695:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576290256/real 1576290256] req@ffff88680b70f980 x1652543206991392/t0(0) o104->fir-MDT0000@10.8.18.35@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1576290263 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [304025.779955] Lustre: 109695:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 1 previous similar message [304032.789588] Lustre: 109695:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576290263/real 1576290263] req@ffff88680b70f980 x1652543206991392/t0(0) o104->fir-MDT0000@10.8.18.35@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1576290270 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [304039.816626] Lustre: 109695:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576290270/real 1576290270] req@ffff88680b70f980 x1652543206991392/t0(0) o104->fir-MDT0000@10.8.18.35@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1576290277 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [304046.843666] Lustre: 109695:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576290277/real 1576290277] req@ffff88680b70f980 x1652543206991392/t0(0) o104->fir-MDT0000@10.8.18.35@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1576290284 
ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [304060.871750] Lustre: 109695:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576290291/real 1576290291] req@ffff88680b70f980 x1652543206991392/t0(0) o104->fir-MDT0000@10.8.18.35@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1576290298 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [304060.899169] Lustre: 109695:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 1 previous similar message [304081.908879] Lustre: 109695:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576290312/real 1576290312] req@ffff88680b70f980 x1652543206991392/t0(0) o104->fir-MDT0000@10.8.18.35@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1576290319 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [304081.936320] Lustre: 109695:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 2 previous similar messages [304123.946139] Lustre: 109695:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576290354/real 1576290354] req@ffff88680b70f980 x1652543206991392/t0(0) o104->fir-MDT0000@10.8.18.35@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1576290361 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [304123.973561] Lustre: 109695:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 13 previous similar messages [304172.983449] LustreError: 109695:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) ### client (nid 10.8.18.35@o2ib6) failed to reply to blocking AST (req@ffff88680b70f980 x1652543206991392 status 0 rc -110), evict it ns: mdt-fir-MDT0000_UUID lock: ffff8859d0fa3180/0xc3c20c0704f6beeb lrc: 4/0,0 mode: PR/PR res: [0x20003ac3f:0x18d9:0x0].0x0 bits 0x13/0x0 rrc: 6 type: IBT flags: 0x60200400000020 nid: 10.8.18.35@o2ib6 remote: 0x40e9ff1e41556e4e expref: 316 pid: 109560 timeout: 304313 lvb_type: 0 [304173.026472] LustreError: 138-a: fir-MDT0000: A client on nid 10.8.18.35@o2ib6 was evicted due to a lock blocking callback time out: rc -110 [304173.039090] LustreError: Skipped 1 
previous similar message [304173.044775] LustreError: 38883:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 154s: evicting client at 10.8.18.35@o2ib6 ns: mdt-fir-MDT0000_UUID lock: ffff8859d0fa3180/0xc3c20c0704f6beeb lrc: 3/0,0 mode: PR/PR res: [0x20003ac3f:0x18d9:0x0].0x0 bits 0x13/0x0 rrc: 6 type: IBT flags: 0x60200400000020 nid: 10.8.18.35@o2ib6 remote: 0x40e9ff1e41556e4e expref: 317 pid: 109560 timeout: 0 lvb_type: 0 [304184.717412] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [304184.725340] Lustre: Skipped 3 previous similar messages [304227.447329] Lustre: MGS: haven't heard from client 9c579da5-4d50-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888ae5361800, cur 1576290465 expire 1576290315 last 1576290238 [304563.442782] Lustre: fir-MDT0000: haven't heard from client d5c27389-d809-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8857e1f65800, cur 1576290801 expire 1576290651 last 1576290574 [304620.525112] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [304620.533034] Lustre: Skipped 1 previous similar message [304785.957073] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [304785.964999] Lustre: Skipped 1 previous similar message [304847.400256] Lustre: MGS: haven't heard from client fc60ed45-96c5-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8855ea148400, cur 1576291085 expire 1576290935 last 1576290858 [304847.419549] Lustre: Skipped 1 previous similar message [316019.482751] Lustre: MGS: haven't heard from client 39fca784-af65-fa11-ce1e-d955b280495f (at 10.9.101.41@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff885bbc77d800, cur 1576302257 expire 1576302107 last 1576302030 [316019.503937] Lustre: Skipped 1 previous similar message [317168.912966] Lustre: MGS: Connection restored to 16834498-f082-b8d6-0fed-822dab1a074a (at 10.8.26.35@o2ib6) [317168.922712] Lustre: Skipped 1 previous similar message [317208.475056] Lustre: MGS: haven't heard from client 8c5cf0e6-7ea3-4 (at 10.8.26.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88594aa6e000, cur 1576303446 expire 1576303296 last 1576303219 [317208.494338] Lustre: Skipped 7 previous similar messages [317814.301095] Lustre: MGS: Connection restored to (at 10.9.101.41@o2ib4) [317814.307803] Lustre: Skipped 1 previous similar message [317936.779725] Lustre: MGS: Connection restored to 99026c83-56e0-4dd3-fc33-0054b763a3cf (at 10.9.108.44@o2ib4) [317936.789565] Lustre: Skipped 1 previous similar message [318379.353554] Lustre: MGS: Connection restored to (at 10.9.108.38@o2ib4) [318379.360267] Lustre: Skipped 1 previous similar message [318409.751560] Lustre: MGS: Connection restored to ac489d2b-cfb7-24fc-423e-e2c0425db7fc (at 10.9.108.53@o2ib4) [318409.761385] Lustre: Skipped 1 previous similar message [323760.815940] Lustre: MGS: Connection restored to (at 10.9.102.57@o2ib4) [323760.822648] Lustre: Skipped 1 previous similar message [325632.745789] Lustre: MGS: Connection restored to (at 10.8.26.24@o2ib6) [325632.752416] Lustre: Skipped 1 previous similar message [325887.794203] Lustre: 109712:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576312118/real 1576312118] req@ffff887bbf1e3600 x1652543233472176/t0(0) o104->fir-MDT0000@10.9.0.63@o2ib4:15/16 lens 296/224 e 0 to 1 dl 1576312125 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [325887.821540] Lustre: 109712:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 21 previous similar messages [325890.413491] LNetError: 38666:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't 
perform health checking (-125, 0) [325897.153589] Lustre: fir-MDT0000: Client ab702dbb-acc1-454f-76f0-dd27e10f6c0e (at 10.9.110.3@o2ib4) reconnecting [325897.163830] Lustre: fir-MDT0000: Connection restored to (at 10.9.110.3@o2ib4) [325897.171162] Lustre: Skipped 1 previous similar message [328179.739661] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [328223.542396] Lustre: fir-MDT0000: haven't heard from client b28ac1d3-bc02-a84b-c4e3-8b0da2026a7b (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8879d1e80c00, cur 1576314461 expire 1576314311 last 1576314234 [328223.564188] Lustre: Skipped 1 previous similar message [329191.888353] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [329191.896645] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [329191.903874] Lustre: Skipped 1 previous similar message [329281.851134] Lustre: MGS: Received new LWP connection from 10.9.0.64@o2ib4, removing former export from same NID [329281.861340] Lustre: MGS: Connection restored to (at 10.9.0.64@o2ib4) [329375.155769] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [329415.433188] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [329415.433255] LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[329415.458762] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [329440.522532] Lustre: MGS: Received new LWP connection from 10.9.0.64@o2ib4, removing former export from same NID [329440.532722] Lustre: MGS: Connection restored to (at 10.9.0.64@o2ib4) [329456.147382] Lustre: MGS: Received new LWP connection from 10.9.0.64@o2ib4, removing former export from same NID [329456.157570] Lustre: MGS: Connection restored to (at 10.9.0.64@o2ib4) [339295.562220] Lustre: MGS: Connection restored to 16834498-f082-b8d6-0fed-822dab1a074a (at 10.8.26.35@o2ib6) [339324.624151] Lustre: fir-MDT0000: haven't heard from client 10addcc5-009e-4 (at 10.8.26.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8859e54ba000, cur 1576325562 expire 1576325412 last 1576325335 [339324.644120] Lustre: Skipped 1 previous similar message [346668.661121] Lustre: MGS: haven't heard from client a0f9c285-07ec-e1c6-04bb-b2f1495b91ef (at 10.9.106.54@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bd0ace000, cur 1576332906 expire 1576332756 last 1576332679 [346668.682327] Lustre: Skipped 1 previous similar message [346782.738902] Lustre: MGS: Connection restored to (at 10.9.106.54@o2ib4) [346782.745618] Lustre: Skipped 1 previous similar message [352508.036630] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [352508.044563] Lustre: Skipped 1 previous similar message [352530.697309] Lustre: MGS: haven't heard from client f9a8e4da-2689-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885af5e63000, cur 1576338768 expire 1576338618 last 1576338541 [352530.716589] Lustre: Skipped 1 previous similar message [352548.694749] Lustre: fir-MDT0000: haven't heard from client 13008421-123d-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff885e8838dc00, cur 1576338786 expire 1576338636 last 1576338559 [352986.702784] Lustre: fir-MDT0000: haven't heard from client 827cbbc8-78ef-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885ac864c800, cur 1576339224 expire 1576339074 last 1576338997 [353031.010793] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [353031.018722] Lustre: Skipped 1 previous similar message [358869.105925] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [358869.113849] Lustre: Skipped 1 previous similar message [358903.733713] Lustre: fir-MDT0000: haven't heard from client 9d0458af-3fba-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88664eac7400, cur 1576345141 expire 1576344991 last 1576344914 [358903.753691] Lustre: Skipped 1 previous similar message [359088.235392] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [359088.243312] Lustre: Skipped 1 previous similar message [359147.734998] Lustre: fir-MDT0000: haven't heard from client 4a0f83ff-1fd6-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88687dff5800, cur 1576345385 expire 1576345235 last 1576345158 [359147.754969] Lustre: Skipped 1 previous similar message [360842.253165] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [360842.261463] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [360842.268690] Lustre: Skipped 1 previous similar message [360844.381429] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[360940.490966] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [360940.499265] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [364512.564672] Lustre: MGS: Connection restored to (at 10.9.106.58@o2ib4) [365308.768894] Lustre: fir-MDT0000: haven't heard from client 1a15ac0b-33c5-1284-d3c3-4f9b39c6e145 (at 10.8.23.31@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887978e5c800, cur 1576351546 expire 1576351396 last 1576351319 [365308.790686] Lustre: Skipped 1 previous similar message [367383.882514] Lustre: MGS: Connection restored to 1a15ac0b-33c5-1284-d3c3-4f9b39c6e145 (at 10.8.23.31@o2ib6) [367383.892259] Lustre: Skipped 1 previous similar message [369036.301285] Lustre: MGS: Connection restored to 8dd84d2f-366f-ac1d-06b8-51ead500d18d (at 10.8.20.28@o2ib6) [369036.311029] Lustre: Skipped 1 previous similar message [369049.287686] Lustre: MGS: Connection restored to (at 10.8.23.35@o2ib6) [369049.294314] Lustre: Skipped 1 previous similar message [369134.018883] Lustre: MGS: Connection restored to (at 10.8.20.6@o2ib6) [369134.025439] Lustre: Skipped 1 previous similar message [369415.485618] Lustre: MGS: Connection restored to (at 10.8.23.36@o2ib6) [369415.492244] Lustre: Skipped 1 previous similar message [370235.083928] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [370235.092225] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [370235.099455] Lustre: Skipped 1 previous similar message [370244.675228] Lustre: MGS: Connection restored to 6e842e42-1583-b71d-7e0f-7411de939ee3 (at 10.8.22.23@o2ib6) [370263.046825] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [370263.055120] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [370263.062349] Lustre: Skipped 1 previous similar message [375644.477225] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [375644.485519] Lustre: 
fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [375749.732639] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [375749.740931] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [376633.834499] Lustre: fir-MDT0000: haven't heard from client 96174370-7b3f-46c9-7ad4-d5b5f4fb562b (at 10.9.104.61@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887729189400, cur 1576362871 expire 1576362721 last 1576362644 [376633.856414] Lustre: Skipped 1 previous similar message [378908.564035] Lustre: MGS: Connection restored to 96174370-7b3f-46c9-7ad4-d5b5f4fb562b (at 10.9.104.61@o2ib4) [380286.877335] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [380286.885638] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [380286.892867] Lustre: Skipped 1 previous similar message [380306.413178] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [380564.491198] Lustre: MGS: Received new LWP connection from 10.9.0.64@o2ib4, removing former export from same NID [380564.501387] Lustre: MGS: Connection restored to (at 10.9.0.64@o2ib4) [380566.399698] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [380566.408003] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [381401.751038] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [381404.786334] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [381404.794632] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [381630.869065] Lustre: fir-MDT0000: haven't heard from client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887bf9da0000, cur 1576367868 expire 1576367718 last 1576367641 [381630.888969] Lustre: Skipped 1 previous similar message [381660.871577] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [385123.259036] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [385123.267335] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [385213.535133] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [385213.543420] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [388199.464618] Lustre: MGS: Connection restored to cce2fc1a-d500-0fbb-5491-2d32b40f4df2 (at 10.8.20.10@o2ib6) [389689.391241] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [389689.399538] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [389689.406767] Lustre: Skipped 1 previous similar message [389713.454613] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [389713.462933] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [398723.851272] LNetError: 38665:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5) [404588.656489] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [404588.664796] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [407143.039156] Lustre: MGS: haven't heard from client bc2bce77-0ba4-4 (at 10.9.0.64@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886bc52cf800, cur 1576393380 expire 1576393230 last 1576393153 [407249.356884] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[407306.557355] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [407306.557370] Lustre: MGS: Connection restored to (at 10.9.0.64@o2ib4) [407349.603952] Lustre: MGS: Received new LWP connection from 10.9.0.64@o2ib4, removing former export from same NID [407349.604025] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [407349.631488] Lustre: MGS: Connection restored to (at 10.9.0.64@o2ib4) [407349.638061] Lustre: Skipped 1 previous similar message [407428.439611] Lustre: MGS: Received new LWP connection from 10.9.0.64@o2ib4, removing former export from same NID [407428.449804] Lustre: MGS: Connection restored to (at 10.9.0.64@o2ib4) [407428.952226] LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [407433.199233] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [407433.207529] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [407444.739993] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[407469.796873] Lustre: MGS: Received new LWP connection from 10.9.0.64@o2ib4, removing former export from same NID [407469.807064] Lustre: MGS: Connection restored to (at 10.9.0.64@o2ib4) [407501.564685] Lustre: MGS: Received new LWP connection from 10.9.0.64@o2ib4, removing former export from same NID [407501.574878] Lustre: MGS: Connection restored to (at 10.9.0.64@o2ib4) [407541.477571] Lustre: MGS: Received new LWP connection from 10.9.0.64@o2ib4, removing former export from same NID [407541.487765] Lustre: MGS: Connection restored to (at 10.9.0.64@o2ib4) [413525.599003] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [413525.607298] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [415778.214546] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [415814.073236] Lustre: fir-MDT0000: haven't heard from client 770f56b7-7337-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887b91e49400, cur 1576402051 expire 1576401901 last 1576401824 [416099.435730] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [416099.443674] Lustre: Skipped 1 previous similar message [416157.075240] Lustre: fir-MDT0000: haven't heard from client 74faeebe-fc7c-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88691b635800, cur 1576402394 expire 1576402244 last 1576402167 [416157.095221] Lustre: Skipped 1 previous similar message [416337.842543] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [416337.850476] Lustre: Skipped 1 previous similar message [416403.076789] Lustre: fir-MDT0000: haven't heard from client 3f4b180c-b74c-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff885631ecf000, cur 1576402640 expire 1576402490 last 1576402413 [416403.096764] Lustre: Skipped 1 previous similar message [416557.972376] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [416557.980302] Lustre: Skipped 1 previous similar message [416575.077793] Lustre: fir-MDT0000: haven't heard from client 718f897a-79aa-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8876c9a59400, cur 1576402812 expire 1576402662 last 1576402585 [416575.097820] Lustre: Skipped 1 previous similar message [416591.123380] Lustre: MGS: haven't heard from client e2f52f1f-9ea8-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885884311400, cur 1576402828 expire 1576402678 last 1576402601 [425305.869247] Lustre: MGS: Connection restored to 4c53748e-746c-128b-a760-b7a4f9c1d7e9 (at 10.9.106.7@o2ib4) [425305.878989] Lustre: Skipped 1 previous similar message [429162.158339] Lustre: fir-MDT0000: haven't heard from client 46f97fff-9baa-c4ae-3237-3a3e8ddc9ddd (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887a577dfc00, cur 1576415399 expire 1576415249 last 1576415172 [429210.286188] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [429210.292809] Lustre: Skipped 1 previous similar message [432057.476636] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [432057.484928] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [432057.492158] Lustre: Skipped 1 previous similar message [432195.611349] Lustre: MGS: Received new LWP connection from 10.9.0.64@o2ib4, removing former export from same NID [432195.621540] Lustre: MGS: Connection restored to (at 10.9.0.64@o2ib4) [432284.177600] Lustre: fir-MDT0000: haven't heard from client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff88792c309000, cur 1576418521 expire 1576418371 last 1576418294 [432284.197502] Lustre: Skipped 1 previous similar message [432330.368837] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [432352.635107] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [432352.635954] LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [432379.033642] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [432379.041945] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [432397.435312] Lustre: MGS: Received new LWP connection from 10.9.0.64@o2ib4, removing former export from same NID [432397.445507] Lustre: MGS: Connection restored to (at 10.9.0.64@o2ib4) [432472.700633] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [432472.708926] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [434046.049667] Lustre: MGS: Connection restored to 360200e4-9bb2-dc52-b96e-5f48834c2e13 (at 10.8.27.21@o2ib6) [434267.443333] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting [434267.451624] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4) [434267.458849] Lustre: Skipped 1 previous similar message [434295.366792] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [434320.391731] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[434385.544023] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting
[434385.552318] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4)
[434483.204790] Lustre: MGS: haven't heard from client 9c7cfe37-aec9-4 (at 10.9.102.17@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888be8235800, cur 1576420720 expire 1576420570 last 1576420493
[437589.064858] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[437634.209288] Lustre: fir-MDT0000: haven't heard from client 239816b2-796b-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff884c0c1ee400, cur 1576423871 expire 1576423721 last 1576423644
[437634.229277] Lustre: Skipped 1 previous similar message
[437835.380159] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[437835.388087] Lustre: Skipped 1 previous similar message
[437892.222770] Lustre: MGS: haven't heard from client 9fc50044-0984-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88616e2cbc00, cur 1576424129 expire 1576423979 last 1576423902
[437892.242047] Lustre: Skipped 1 previous similar message
[441613.290438] Lustre: MGS: Connection restored to 88fa221d-7176-1083-8a20-d837893f0e22 (at 10.9.106.8@o2ib4)
[441613.300182] Lustre: Skipped 1 previous similar message
[442363.829008] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting
[442363.829175] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[442363.830124] Lustre: MGS: Received new LWP connection from 10.9.0.64@o2ib4, removing former export from same NID
[442363.830135] Lustre: MGS: Connection restored to (at 10.9.0.64@o2ib4)
[442363.830136] Lustre: Skipped 1 previous similar message
[442412.849198] Lustre: MGS: Received new LWP connection from 10.9.0.64@o2ib4, removing former export from same NID
[442412.849293] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[442412.876665] Lustre: MGS: Connection restored to (at 10.9.0.64@o2ib4)
[442412.883198] Lustre: Skipped 1 previous similar message
[442451.847582] Lustre: MGS: Received new LWP connection from 10.9.0.64@o2ib4, removing former export from same NID
[442451.847893] LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[442451.875043] Lustre: MGS: Connection restored to (at 10.9.0.64@o2ib4)
[442476.917797] Lustre: MGS: Received new LWP connection from 10.9.0.64@o2ib4, removing former export from same NID
[442476.917832] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting
[442476.917859] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4)
[446899.814377] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting
[446899.822670] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4)
[446899.829902] Lustre: Skipped 1 previous similar message
[447435.257069] Lustre: fir-MDT0000: haven't heard from client 8b3ab50d-6611-e9b8-ac15-a25011950745 (at 10.9.109.13@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887be3519400, cur 1576433672 expire 1576433522 last 1576433445
[447435.278975] Lustre: Skipped 1 previous similar message
[449725.727323] Lustre: MGS: Connection restored to 8b3ab50d-6611-e9b8-ac15-a25011950745 (at 10.9.109.13@o2ib4)
[452906.292937] Lustre: MGS: haven't heard from client 18e9c93e-c0d0-064b-adf9-b831eed38b5d (at 10.9.101.50@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bde255400, cur 1576439143 expire 1576438993 last 1576438916
[452906.314125] Lustre: Skipped 1 previous similar message
[454828.948614] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting
[454828.956911] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4)
[454828.964140] Lustre: Skipped 1 previous similar message
[454919.675402] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting
[454919.683698] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4)
[455169.993518] Lustre: MGS: Connection restored to 6abb3660-2eda-f822-d10c-5cf9743df13e (at 10.9.101.50@o2ib4)
[457358.728489] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting
[457358.736784] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4)
[457358.744014] Lustre: Skipped 1 previous similar message
[457378.825833] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[457454.090207] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting
[457454.098501] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4)
[461420.346777] Lustre: fir-MDT0000: haven't heard from client 84f1d1ef-d7b7-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887af9289800, cur 1576447657 expire 1576447507 last 1576447430
[461420.366748] Lustre: Skipped 1 previous similar message
[461489.588292] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[461684.584724] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[461684.592649] Lustre: Skipped 1 previous similar message
[461727.344735] Lustre: fir-MDT0000: haven't heard from client 6af314f7-0317-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8879a333a000, cur 1576447964 expire 1576447814 last 1576447737
[461727.364706] Lustre: Skipped 1 previous similar message
[461743.349668] Lustre: MGS: haven't heard from client 26695f69-06a9-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8853e3a18400, cur 1576447980 expire 1576447830 last 1576447753
[461988.347382] Lustre: fir-MDT0000: haven't heard from client 1e44a584-9778-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8868b5d78000, cur 1576448225 expire 1576448075 last 1576447998
[462048.786484] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[462048.794403] Lustre: Skipped 1 previous similar message
[462275.084303] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[462275.092232] Lustre: Skipped 1 previous similar message
[462275.351065] Lustre: fir-MDT0000: haven't heard from client f4d4243d-d327-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887a3a366000, cur 1576448512 expire 1576448362 last 1576448285
[462275.371038] Lustre: Skipped 1 previous similar message
[462470.029993] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[462470.037914] Lustre: Skipped 1 previous similar message
[462513.349845] Lustre: fir-MDT0000: haven't heard from client 85f58f1f-0696-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88510de1b000, cur 1576448750 expire 1576448600 last 1576448523
[462513.369846] Lustre: Skipped 1 previous similar message
[463926.804319] Lustre: 109742:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576450156/real 1576450156] req@ffff887962a0e780 x1652543454381728/t0(0) o104->fir-MDT0000@10.8.18.35@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1576450163 ref 1 fl Rpc:X/0/ffffffff rc 0/-1
[463933.831365] Lustre: 109742:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576450163/real 1576450163] req@ffff887962a0e780 x1652543454381728/t0(0) o104->fir-MDT0000@10.8.18.35@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1576450170 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[463940.858412] Lustre: 109742:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576450170/real 1576450170] req@ffff887962a0e780 x1652543454381728/t0(0) o104->fir-MDT0000@10.8.18.35@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1576450177 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[463947.885459] Lustre: 109742:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576450177/real 1576450177] req@ffff887962a0e780 x1652543454381728/t0(0) o104->fir-MDT0000@10.8.18.35@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1576450184 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[463961.912557] Lustre: 109742:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576450191/real 1576450191] req@ffff887962a0e780 x1652543454381728/t0(0) o104->fir-MDT0000@10.8.18.35@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1576450198 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[463961.939980] Lustre: 109742:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 1 previous similar message
[463981.719713] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[463981.727630] Lustre: Skipped 1 previous similar message
[463982.949699] Lustre: 109742:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576450212/real 1576450212] req@ffff887962a0e780 x1652543454381728/t0(0) o104->fir-MDT0000@10.8.18.35@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1576450219 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[463982.977125] Lustre: 109742:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
[463982.987305] LustreError: 109742:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) ### client (nid 10.8.18.35@o2ib6) returned error from blocking AST (req@ffff887962a0e780 x1652543454381728 status -107 rc -107), evict it ns: mdt-fir-MDT0000_UUID lock: ffff8869e3ad9f80/0xc3c20c0b35b96a05 lrc: 4/0,0 mode: PR/PR res: [0x20003c264:0x8:0x0].0x0 bits 0x13/0x0 rrc: 6 type: IBT flags: 0x60200400000020 nid: 10.8.18.35@o2ib6 remote: 0x93588e488463ea17 expref: 830 pid: 109567 timeout: 464129 lvb_type: 0
[463983.030455] LustreError: 109742:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) Skipped 2 previous similar messages
[463983.040750] LustreError: 138-a: fir-MDT0000: A client on nid 10.8.18.35@o2ib6 was evicted due to a lock blocking callback time out: rc -107
[463983.053377] LustreError: Skipped 2 previous similar messages
[463983.059157] LustreError: 38883:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 63s: evicting client at 10.8.18.35@o2ib6 ns: mdt-fir-MDT0000_UUID lock: ffff8869e3ad9f80/0xc3c20c0b35b96a05 lrc: 3/0,0 mode: PR/PR res: [0x20003c264:0x8:0x0].0x0 bits 0x13/0x0 rrc: 6 type: IBT flags: 0x60200400000020 nid: 10.8.18.35@o2ib6 remote: 0x93588e488463ea17 expref: 831 pid: 109567 timeout: 0 lvb_type: 0
[464028.385090] Lustre: MGS: haven't heard from client d629febe-2cc9-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88796be41400, cur 1576450265 expire 1576450115 last 1576450038
[464028.404390] Lustre: Skipped 1 previous similar message
[468246.345187] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting
[468246.353485] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4)
[468246.360714] Lustre: Skipped 1 previous similar message
[477845.366081] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting
[477845.374381] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4)
[477890.016327] Lustre: MGS: Received new LWP connection from 10.9.0.64@o2ib4, removing former export from same NID
[477890.016342] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting
[477890.016370] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4)
[477929.612856] Lustre: MGS: Received new LWP connection from 10.9.0.64@o2ib4, removing former export from same NID
[477929.612917] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[477929.640358] Lustre: MGS: Connection restored to (at 10.9.0.64@o2ib4)
[477929.646902] Lustre: Skipped 1 previous similar message
[477954.701246] LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[479252.519189] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[479287.461548] Lustre: fir-MDT0000: haven't heard from client bd1bb6a4-e78c-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8877ae08b000, cur 1576465524 expire 1576465374 last 1576465297
[483280.984404] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[483280.992332] Lustre: Skipped 1 previous similar message
[483319.484076] Lustre: fir-MDT0000: haven't heard from client 3deccf9e-b058-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887585386c00, cur 1576469556 expire 1576469406 last 1576469329
[483319.504055] Lustre: Skipped 1 previous similar message
[483507.838326] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[483507.846249] Lustre: Skipped 1 previous similar message
[483559.488281] Lustre: fir-MDT0000: haven't heard from client deac98c2-ec30-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8854f1eb5400, cur 1576469796 expire 1576469646 last 1576469569
[483559.508266] Lustre: Skipped 1 previous similar message
[485365.495371] Lustre: fir-MDT0000: haven't heard from client 1f72d546-482b-ba22-9634-964c4dc9701a (at 10.9.108.56@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88744e766400, cur 1576471602 expire 1576471452 last 1576471375
[485365.517251] Lustre: Skipped 1 previous similar message
[485367.522260] Lustre: MGS: haven't heard from client e142f2f9-210e-0805-da84-a286b3ea8ca0 (at 10.9.108.56@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bdaf7a000, cur 1576471604 expire 1576471454 last 1576471377
[485764.713756] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[485764.721676] Lustre: Skipped 1 previous similar message
[485793.498494] Lustre: fir-MDT0000: haven't heard from client 6da958c9-a85d-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8876d8d08800, cur 1576472030 expire 1576471880 last 1576471803
[485928.581120] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[485928.589055] Lustre: Skipped 1 previous similar message
[485991.499201] Lustre: fir-MDT0000: haven't heard from client e73464dd-627a-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8852fb4c9400, cur 1576472228 expire 1576472078 last 1576472001
[485991.519177] Lustre: Skipped 1 previous similar message
[486841.540978] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[486841.548902] Lustre: Skipped 1 previous similar message
[486859.514635] Lustre: MGS: haven't heard from client 14e6ad35-eaa8-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8852f7cdd000, cur 1576473096 expire 1576472946 last 1576472869
[486859.533914] Lustre: Skipped 1 previous similar message
[487079.519527] Lustre: fir-MDT0000: haven't heard from client 1a6bcd40-b93c-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886269d6d400, cur 1576473316 expire 1576473166 last 1576473089
[487079.539505] Lustre: Skipped 1 previous similar message
[487094.523619] Lustre: MGS: haven't heard from client a2150e62-37aa-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8857e6630400, cur 1576473331 expire 1576473181 last 1576473104
[487155.716536] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[487155.724456] Lustre: Skipped 1 previous similar message
[487341.409198] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[487341.417121] Lustre: Skipped 1 previous similar message
[487382.509488] Lustre: MGS: haven't heard from client e93d4140-8248-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8857dcaa3000, cur 1576473619 expire 1576473469 last 1576473392
[487391.513742] Lustre: fir-MDT0000: haven't heard from client fd0fb932-f791-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88648965b000, cur 1576473628 expire 1576473478 last 1576473401
[487705.774680] Lustre: MGS: Connection restored to (at 10.9.108.56@o2ib4)
[487705.781429] Lustre: Skipped 1 previous similar message
[491564.066331] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[491564.074253] Lustre: Skipped 1 previous similar message
[491608.556169] Lustre: MGS: haven't heard from client 27f14c35-36b1-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8857cc2bfc00, cur 1576477845 expire 1576477695 last 1576477618
[491726.924165] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[491726.932091] Lustre: Skipped 1 previous similar message
[491790.536741] Lustre: fir-MDT0000: haven't heard from client e29ec0f5-8cfb-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8863123f5800, cur 1576478027 expire 1576477877 last 1576477800
[491790.556735] Lustre: Skipped 1 previous similar message
[491898.281937] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[491898.289861] Lustre: Skipped 1 previous similar message
[491953.550187] Lustre: fir-MDT0000: haven't heard from client a2f71a98-e538-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887bb4ad5400, cur 1576478190 expire 1576478040 last 1576477963
[491953.570165] Lustre: Skipped 1 previous similar message
[492135.548633] Lustre: fir-MDT0000: haven't heard from client 132f6338-2751-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886796bac800, cur 1576478372 expire 1576478222 last 1576478145
[492135.568605] Lustre: Skipped 1 previous similar message
[492151.576310] Lustre: MGS: haven't heard from client 61739e8c-1e31-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885ade644000, cur 1576478388 expire 1576478238 last 1576478161
[492169.126210] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[492169.134136] Lustre: Skipped 1 previous similar message
[494028.544931] Lustre: fir-MDT0000: haven't heard from client 37e94551-6b5f-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88587bbca000, cur 1576480265 expire 1576480115 last 1576480038
[494322.468487] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[494322.476408] Lustre: Skipped 1 previous similar message
[494511.232320] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[494511.240260] Lustre: Skipped 1 previous similar message
[494560.547345] Lustre: fir-MDT0000: haven't heard from client cd77b93f-6db7-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885105cb9400, cur 1576480797 expire 1576480647 last 1576480570
[494560.567324] Lustre: Skipped 1 previous similar message
[494575.559507] Lustre: MGS: haven't heard from client 5998af6f-3c68-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885b1ff1e400, cur 1576480812 expire 1576480662 last 1576480585
[523625.496613] Lustre: 109656:0:(mdd_device.c:1807:mdd_changelog_clear()) fir-MDD0000: Failure to clear the changelog for user 3: -22
[523625.997599] Lustre: 109766:0:(mdd_device.c:1807:mdd_changelog_clear()) fir-MDD0000: Failure to clear the changelog for user 3: -22
[523626.009427] Lustre: 109766:0:(mdd_device.c:1807:mdd_changelog_clear()) Skipped 1496 previous similar messages
[523627.232021] Lustre: 109667:0:(mdd_device.c:1807:mdd_changelog_clear()) fir-MDD0000: Failure to clear the changelog for user 3: -22
[523627.243847] Lustre: 109667:0:(mdd_device.c:1807:mdd_changelog_clear()) Skipped 741 previous similar messages
[540521.828360] Lustre: fir-MDT0000: haven't heard from client 3a7e0e42-33db-67bb-9fc0-74e80f2686d6 (at 10.9.110.28@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887b9f70a000, cur 1576526758 expire 1576526608 last 1576526531
[544713.853169] Lustre: fir-MDT0000: haven't heard from client 2788318e-2aa2-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8862bffec800, cur 1576530950 expire 1576530800 last 1576530723
[544713.873145] Lustre: Skipped 1 previous similar message
[544788.781831] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[544788.789755] Lustre: Skipped 1 previous similar message
[544957.918678] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[544957.926605] Lustre: Skipped 1 previous similar message
[545014.859653] Lustre: MGS: haven't heard from client 71903702-0eeb-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88593da14c00, cur 1576531251 expire 1576531101 last 1576531024
[545014.878928] Lustre: Skipped 1 previous similar message
[545123.763914] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[545123.771862] Lustre: Skipped 1 previous similar message
[545184.856089] Lustre: fir-MDT0000: haven't heard from client a2221c31-08cd-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886a6e374400, cur 1576531421 expire 1576531271 last 1576531194
[545184.876063] Lustre: Skipped 1 previous similar message
[545349.900986] Lustre: MGS: haven't heard from client fade8ab1-b59a-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888b7fe83400, cur 1576531586 expire 1576531436 last 1576531359
[545349.920262] Lustre: Skipped 1 previous similar message
[545626.143778] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[545626.151703] Lustre: Skipped 1 previous similar message
[546982.871106] Lustre: fir-MDT0000: haven't heard from client 5b14590a-8d88-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886a7aa0ac00, cur 1576533219 expire 1576533069 last 1576532992
[546982.891076] Lustre: Skipped 1 previous similar message
[547457.396953] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[547457.404875] Lustre: Skipped 1 previous similar message
[547621.845863] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[547621.853786] Lustre: Skipped 1 previous similar message
[547683.871489] Lustre: fir-MDT0000: haven't heard from client 4a4abc2a-aa1d-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886b95687800, cur 1576533920 expire 1576533770 last 1576533693
[547683.891456] Lustre: Skipped 1 previous similar message
[547806.024274] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[547806.032195] Lustre: Skipped 1 previous similar message
[547847.876623] Lustre: MGS: haven't heard from client ed255e1f-cf38-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887761335c00, cur 1576534084 expire 1576533934 last 1576533857
[547847.895904] Lustre: Skipped 1 previous similar message
[547848.880642] Lustre: fir-MDT0000: haven't heard from client e59ea459-f3a7-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8864b2ad8800, cur 1576534085 expire 1576533935 last 1576533858
[553894.909700] Lustre: fir-MDT0000: haven't heard from client 6d6a43ea-d453-4 (at 10.8.9.1@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886bf4fa1c00, cur 1576540131 expire 1576539981 last 1576539904
[565218.990423] Lustre: fir-MDT0000: haven't heard from client b798f616-b936-c19e-02ba-1cd915f838bb (at 10.9.106.54@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888ab0a17000, cur 1576551455 expire 1576551305 last 1576551228
[565219.012295] Lustre: Skipped 1 previous similar message
[565350.549724] Lustre: MGS: Connection restored to (at 10.9.106.54@o2ib4)
[565350.556433] Lustre: Skipped 1 previous similar message
[574408.122399] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting
[574408.130688] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4)
[574408.137913] Lustre: Skipped 1 previous similar message
[574408.343444] Lustre: MGS: Received new LWP connection from 10.9.0.64@o2ib4, removing former export from same NID
[574418.091542] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[574443.118184] Lustre: MGS: Received new LWP connection from 10.9.0.64@o2ib4, removing former export from same NID
[574443.128377] Lustre: MGS: Connection restored to (at 10.9.0.64@o2ib4)
[574443.134911] Lustre: Skipped 1 previous similar message
[574468.199325] LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[574493.289966] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting
[574493.298260] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4)
[574910.135967] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting
[574910.144266] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4)
[574944.874649] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[574969.963910] LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.9.0.64@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[575005.675362] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting
[575005.683661] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4)
[602232.239518] Lustre: fir-MDT0000: haven't heard from client 5e126667-9256-3aa0-695a-dcb9788d852c (at 10.9.106.54@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88610eac2c00, cur 1576588468 expire 1576588318 last 1576588241
[602232.261395] Lustre: Skipped 1 previous similar message
[602343.109831] Lustre: MGS: Connection restored to (at 10.9.106.54@o2ib4)
[603273.295260] Lustre: fir-MDT0000: haven't heard from client 2cceba1b-cedf-ae3f-48bb-79486f997f62 (at 10.9.106.54@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887152b0f000, cur 1576589509 expire 1576589359 last 1576589282
[603273.317134] Lustre: Skipped 1 previous similar message
[603399.226136] Lustre: MGS: Connection restored to (at 10.9.106.54@o2ib4)
[603399.232850] Lustre: Skipped 1 previous similar message
[614340.320395] Lustre: fir-MDT0000: haven't heard from client 596d1c2a-ecf1-cbca-e695-1db418efffef (at 10.9.106.54@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885d73e6d400, cur 1576600576 expire 1576600426 last 1576600349
[614340.342273] Lustre: Skipped 1 previous similar message
[614471.038559] Lustre: MGS: Connection restored to (at 10.9.106.54@o2ib4)
[614471.045270] Lustre: Skipped 1 previous similar message
[615396.326895] Lustre: fir-MDT0000: haven't heard from client 041fb2f5-3a44-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885470057000, cur 1576601632 expire 1576601482 last 1576601405
[615396.346868] Lustre: Skipped 7 previous similar messages
[615864.499934] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[615864.507857] Lustre: Skipped 1 previous similar message
[615961.602134] Lustre: MGS: Connection restored to 88fa221d-7176-1083-8a20-d837893f0e22 (at 10.9.106.8@o2ib4)
[615961.611876] Lustre: Skipped 1 previous similar message
[616160.617034] Lustre: MGS: Connection restored to 99026c83-56e0-4dd3-fc33-0054b763a3cf (at 10.9.108.44@o2ib4)
[616160.626866] Lustre: Skipped 1 previous similar message
[616366.268755] Lustre: MGS: Connection restored to 714da8dd-1047-4 (at 10.9.107.20@o2ib4)
[616366.276793] Lustre: Skipped 1 previous similar message
[617974.395282] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[617974.403201] Lustre: Skipped 1 previous similar message
[617999.342556] Lustre: fir-MDT0000: haven't heard from client 8cb46322-8f12-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887626f0a000, cur 1576604235 expire 1576604085 last 1576604008
[617999.362526] Lustre: Skipped 1 previous similar message
[618302.414714] LNetError: 4728:0:(o2iblnd_cb.c:2961:kiblnd_rejected()) 10.0.10.201@o2ib7 rejected: o2iblnd fatal error
[618302.425252] LNetError: 4728:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[618302.437156] LNetError: 4728:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 1 previous similar message
[618304.562968] Lustre: MGS: Received new LWP connection from 10.8.27.33@o2ib6, removing former export from same NID
[618304.573241] Lustre: MGS: Connection restored to (at 10.8.27.33@o2ib6)
[618304.579862] Lustre: Skipped 1 previous similar message
[618305.649144] Lustre: MGS: Received new LWP connection from 10.8.26.3@o2ib6, removing former export from same NID
[618305.659329] Lustre: MGS: Connection restored to (at 10.8.26.3@o2ib6)
[618306.608586] Lustre: fir-MDT0000: Client cdcaaed8-2a18-c127-5b80-537d6c8b1075 (at 10.8.27.7@o2ib6) reconnecting
[618307.213458] Lustre: fir-MDT0000: Client 7b29094a-6a20-4 (at 10.8.22.1@o2ib6) reconnecting
[618307.221724] Lustre: Skipped 1 previous similar message
[618308.021370] Lustre: MGS: Received new LWP connection from 10.8.17.25@o2ib6, removing former export from same NID
[618308.031646] Lustre: MGS: Connection restored to b44d8559-b6fa-c6ac-9733-9495841decff (at 10.8.17.25@o2ib6)
[618308.041381] Lustre: Skipped 4 previous similar messages
[618308.374911] Lustre: fir-MDT0000: Client 840378ea-a823-7fff-6a74-a55368c8e575 (at 10.8.24.23@o2ib6) reconnecting
[618308.385090] Lustre: Skipped 2 previous similar messages
[618310.439046] Lustre: MGS: Received new LWP connection from 10.8.21.17@o2ib6, removing former export from same NID
[618310.449307] Lustre: Skipped 5 previous similar messages
[618310.976828] Lustre: fir-MDT0000: Client fb83de3b-fb5e-6930-0db9-4bed36c7d2d5 (at 10.8.25.7@o2ib6) reconnecting
[618310.986914] Lustre: Skipped 7 previous similar messages
[618312.109797] Lustre: MGS: Connection restored to (at 10.8.18.13@o2ib6)
[618312.116412] Lustre: Skipped 19 previous similar messages
[618314.673380] Lustre: MGS: Received new LWP connection from 10.8.26.21@o2ib6, removing former export from same NID
[618314.683642] Lustre: Skipped 5 previous similar messages
[618315.490285] Lustre: fir-MDT0000: Client 2c36e76b-4e4b-7c7a-24ea-21141443e402 (at 10.8.8.25@o2ib6) reconnecting
[618315.500370] Lustre: Skipped 7 previous similar messages
[618320.440045] Lustre: fir-MDT0000: Connection restored to 6fc393d4-6943-9dea-aaa0-9424272390c8 (at 10.8.26.5@o2ib6)
[618320.450400] Lustre: Skipped 35 previous similar messages
[618322.907371] Lustre: MGS: Received new LWP connection from 10.8.19.8@o2ib6, removing former export from same NID
[618322.917550] Lustre: Skipped 21 previous similar messages
[618324.453492] Lustre: fir-MDT0000: Client 36eab9cd-d6e6-272d-4424-f253a884fec5 (at 10.8.8.33@o2ib6) reconnecting
[618324.463588] Lustre: Skipped 22 previous similar messages
[618329.715094] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.8.23.2@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[618336.425248] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.8.22.25@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[618341.398911] Lustre: MGS: Received new LWP connection from 10.8.24.17@o2ib6, removing former export from same NID
[618341.409184] Lustre: Skipped 10 previous similar messages
[618341.414611] Lustre: MGS: Connection restored to 04b6fb6b-3bcb-7865-aada-b02224ee3504 (at 10.8.24.17@o2ib6)
[618341.424354] Lustre: Skipped 39 previous similar messages
[618343.166685] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.8.24.22@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[618350.090286] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.8.25.30@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[618350.107662] LustreError: Skipped 1 previous similar message
[618359.471846] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.8.30.9@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[618370.511102] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.8.25.4@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[618387.454577] LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.8.18.33@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[618387.471943] LustreError: Skipped 2 previous similar messages
[618410.321324] Lustre: fir-MDT0000: Client f2898033-6a23-2537-8cf9-46709394f401 (at 10.8.18.28@o2ib6) reconnecting
[618410.331496] Lustre: Skipped 10 previous similar messages
[618410.336942] Lustre: fir-MDT0000: Connection restored to (at 10.8.18.28@o2ib6)
[618410.344252] Lustre: Skipped 10 previous similar messages
[618594.420778] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.201@o2ib7: 0 seconds
[618594.431052] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[618599.716802] LNetError: 38679:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) lpni 10.0.10.201@o2ib7 added to recovery queue. Health = 900
[618600.420810] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.201@o2ib7: 0 seconds
[618600.729781] LNetError: 38679:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.201@o2ib7: -125
[618615.420895] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.201@o2ib7: 0 seconds
[618615.431156] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 1 previous similar message
[618641.421037] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.201@o2ib7: 1 seconds
[618641.431292] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 1 previous similar message
[618641.440603] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[618641.452610] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 4 previous similar messages
[618680.421252] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.201@o2ib7: 0 seconds
[618680.431503] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 2 previous similar messages
[618725.421504] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[618725.433516] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 6 previous similar messages
[618760.421698] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.201@o2ib7: 0 seconds
[618760.431955] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 6 previous similar messages
[618885.422398] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[618885.434393] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 13 previous similar messages
[618902.778464] LNetError: 38679:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.201@o2ib7: -125
[618916.422552] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.201@o2ib7: 1 seconds
[618916.432807] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 15 previous similar messages
[619185.425053] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[619185.437053] LNetError: 38662:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 26 previous similar messages
[619209.826151] LNetError: 38679:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.201@o2ib7: -125
[619225.424263] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.201@o2ib7: 0 seconds
[619225.434523] LNet: 38662:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 29 previous similar messages
[619510.878751] LNetError: 38679:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.201@o2ib7: -125
[619792.924788] LNetError: 104951:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[619792.936892] LNetError: 104951:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 59 previous similar messages
[619891.593370] Lustre: MGS: Received new LWP connection from 10.8.21.10@o2ib6, removing former export from same NID
[619891.593409] Lustre: MGS: Connection restored to (at 10.8.21.23@o2ib6)
[619891.593411] Lustre: Skipped 7 previous similar messages
[619891.615575] Lustre: Skipped 11 previous similar messages
[619891.657574] Lustre: fir-MDT0000: Client eafe2fd4-ef02-4 (at 10.8.26.34@o2ib6) reconnecting
[619891.665931] Lustre: Skipped 7 previous similar messages
[619896.680345] Lustre: MGS: Received new LWP connection from 10.8.18.19@o2ib6, removing former export from same NID
[619896.690609] Lustre: Skipped 10 previous similar messages
[619896.779801] Lustre: fir-MDT0000: Client ba0eb903-fe09-9718-35fe-05d5281d3f3c (at 10.8.19.3@o2ib6) reconnecting
[619896.789897] Lustre: Skipped 15 previous similar messages
[619899.743451] Lustre: fir-MDT0000: Connection restored to 1ca289da-9e03-e714-36aa-e7d2048ea910 (at 10.8.22.15@o2ib6)
[619899.753890] Lustre: Skipped 41 previous similar messages
[619904.960737] Lustre: fir-MDT0000: Client db40d8c1-049a-4 (at 10.8.27.21@o2ib6) reconnecting
[619904.969096] Lustre: Skipped 18 previous similar messages
[619905.602244] Lustre: MGS: Received new LWP connection from 10.8.22.4@o2ib6, removing former export from same NID
[619905.612424] Lustre: Skipped 16 previous similar messages
[619910.669261] LustreError: 53927:0:(ldlm_lib.c:3256:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff888bd9ab1850 x1650113192601440/t0(0) o256->cfe7c5dd-3218-5629-ec13-2301548f532e@10.8.13.26@o2ib6:69/0 lens 304/240 e 1 to 0 dl 1576606169 ref 1 fl Interpret:/0/0 rc 0/0
[619910.693394] LustreError: 53927:0:(ldlm_lib.c:3256:target_bulk_io()) Skipped 2 previous similar messages
[619915.945647] Lustre: MGS: Connection restored to 5dffb121-cc8a-d916-c96d-21b375fa8f4e (at 10.8.31.9@o2ib6)
[619915.955309] Lustre: Skipped 68 previous similar messages
[619921.787334] Lustre: fir-MDT0000: Client 3cf7b01d-62ec-c5c8-882c-2b9bee30f26f (at 10.8.22.8@o2ib6) reconnecting
[619921.797429] Lustre: Skipped 18 previous similar messages
[619926.634580] Lustre: MGS: Received new LWP connection from 10.8.23.18@o2ib6, removing former export from same NID
[619926.644860] Lustre: Skipped 31 previous similar messages
[619931.697326] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.8.20.35@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[619931.714696] LustreError: Skipped 3 previous similar messages
[619936.218468] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.8.27.17@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[619936.235836] LustreError: Skipped 3 previous similar messages
[619946.148986] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.8.7.5@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[619946.166179] LustreError: Skipped 4 previous similar messages
[619949.074438] Lustre: MGS: Connection restored to eb60645d-f744-528a-c943-6dfa4e724df5 (at 10.8.18.27@o2ib6)
[619949.084175] Lustre: Skipped 14 previous similar messages
[619959.616775] Lustre: MGS: Received new LWP connection from 10.8.24.26@o2ib6, removing former export from same NID
[619959.627036] Lustre: Skipped 20 previous similar messages
[619962.905388] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.8.25.22@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[619962.922766] LustreError: Skipped 7 previous similar messages
[619993.335091] Lustre: fir-MDT0000: Client 3af2f28e-938b-b5d7-947a-7269d32abf9c (at 10.8.27.33@o2ib6) reconnecting
[619993.345269] Lustre: Skipped 3 previous similar messages
[620019.342053] Lustre: fir-MDT0000: Connection restored to (at 10.8.24.33@o2ib6)
[620019.349367] Lustre: Skipped 22 previous similar messages
[621102.751836] LNetError: 4737:0:(o2iblnd_cb.c:2961:kiblnd_rejected()) 10.0.10.201@o2ib7 rejected: o2iblnd fatal error
[621102.762370] LNetError: 4737:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[621102.774280] LNetError: 4737:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 9 previous similar messages
[621103.770929] Lustre: fir-MDT0000: Client 9e2ed5f4-f2c0-aa2c-fbfb-597128c0f47c (at 10.8.27.35@o2ib6) reconnecting
[621103.781122] Lustre: Skipped 6 previous similar messages
[621103.786481] Lustre: fir-MDT0000: Connection restored to 9e2ed5f4-f2c0-aa2c-fbfb-597128c0f47c (at 10.8.27.35@o2ib6)
[621103.796933] Lustre: Skipped 2 previous similar messages
[621104.506308] Lustre: MGS: Received new LWP connection from 10.8.28.1@o2ib6, removing former export from same NID
[621104.516484] Lustre: Skipped 3 previous similar messages
[621111.895323] Lustre: fir-MDT0000: Client 10446bf2-1279-4 (at 10.8.20.21@o2ib6) reconnecting
[621111.903685] Lustre: Skipped 26 previous similar messages
[621113.008340] Lustre: MGS: Received new LWP connection from 10.8.21.23@o2ib6, removing former export from same NID
[621113.018604] Lustre: Skipped 34 previous similar messages
[621116.368024] Lustre: 109723:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576607345/real 1576607345] req@ffff886999bdba80 x1652543787786592/t0(0) o104->fir-MDT0000@10.8.0.68@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1576607352 ref 1 fl Rpc:X/0/ffffffff rc 0/-1
[621120.415513] Lustre: fir-MDT0000: Connection restored to 9d35cbf5-bd07-39da-607a-73ee382afdf1 (at 10.8.17.17@o2ib6)
[621120.425954] Lustre: Skipped 87 previous similar messages
[621127.923377] Lustre: fir-MDT0000: Client d4980c72-b569-ae0a-9b3e-8188518bca1c (at 10.8.18.31@o2ib6) reconnecting
[621127.933558] Lustre: Skipped 34 previous similar messages
[621129.885324] Lustre: MGS: Received new LWP connection from 10.8.26.7@o2ib6, removing former export from same NID
[621129.895498] Lustre: Skipped 26 previous similar messages
[621130.771838] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.8.23.22@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[621130.789214] LustreError: Skipped 7 previous similar messages
[621135.468132] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.8.25.9@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[621140.047183] LNetError: 38679:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) lpni 10.0.10.203@o2ib7 added to recovery queue. Health = 900
[621148.041491] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.8.24.16@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[621148.058861] LustreError: Skipped 3 previous similar messages
[621152.683681] Lustre: MGS: Connection restored to (at 10.8.24.31@o2ib6)
[621152.690310] Lustre: Skipped 61 previous similar messages
[621162.726123] Lustre: MGS: Received new LWP connection from 10.8.26.29@o2ib6, removing former export from same NID
[621162.736389] Lustre: Skipped 26 previous similar messages
[621165.922368] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.8.27.17@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[621165.939743] LustreError: Skipped 3 previous similar messages
[621170.064381] LNetError: 38679:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) lpni 10.0.10.203@o2ib7 added to recovery queue. Health = 900
[621206.018670] Lustre: fir-MDT0000: Client 51315fc7-c4b3-f078-d969-3ad7a610223a (at 10.8.8.32@o2ib6) reconnecting
[621206.028760] Lustre: Skipped 7 previous similar messages
[621216.749524] Lustre: fir-MDT0000: Connection restored to 1cea57ff-1d0f-b4f4-8769-95e365d89cff (at 10.8.25.26@o2ib6)
[621216.759961] Lustre: Skipped 23 previous similar messages
[621245.383257] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.8.25.2@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[621245.400537] LustreError: Skipped 6 previous similar messages
[621345.100760] LNetError: 4737:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.201@o2ib7 rejected: consumer defined fatal error
[621345.112070] LNetError: 4737:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[621346.100738] LNetError: 4737:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.201@o2ib7 rejected: consumer defined fatal error
[621347.100734] LNetError: 4737:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.201@o2ib7 rejected: consumer defined fatal error
[621646.124723] LNetError: 105816:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.51@o2ib7 added to recovery queue. Health = 900
[621646.136810] LNetError: 105816:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 3 previous similar messages
[621854.613691] Lustre: MGS: Received new LWP connection from 10.8.18.19@o2ib6, removing former export from same NID
[621854.623962] Lustre: Skipped 8 previous similar messages
[621854.629302] Lustre: MGS: Connection restored to (at 10.8.18.19@o2ib6)
[621854.635923] Lustre: Skipped 5 previous similar messages
[621854.640459] Lustre: fir-MDT0000: Client 8d7a01c3-e0f4-61f3-17f5-cd0947f68d29 (at 10.8.26.10@o2ib6) reconnecting
[621854.640461] Lustre: Skipped 11 previous similar messages
[625555.388859] Lustre: fir-MDT0000: haven't heard from client 65640447-d392-3a43-02f3-34e3f7e53218 (at 10.9.107.30@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887bbf049c00, cur 1576611791 expire 1576611641 last 1576611564
[625555.410739] Lustre: Skipped 1 previous similar message
[626187.694284] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[626187.702207] Lustre: Skipped 2 previous similar messages
[626225.392883] Lustre: fir-MDT0000: haven't heard from client a878c8ef-7747-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8858e2f01000, cur 1576612461 expire 1576612311 last 1576612234
[626225.412852] Lustre: Skipped 1 previous similar message
[627409.749911] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting
[627409.758187] Lustre: Skipped 1 previous similar message
[627409.763445] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4)
[627409.770672] Lustre: Skipped 1 previous similar message
[627508.931606] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting
[627508.939914] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4)
[630862.394605] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting
[630862.402896] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4)
[630875.659573] Lustre: MGS: Received new LWP connection from 10.9.0.64@o2ib4, removing former export from same NID
[630875.669754] Lustre: Skipped 1 previous similar message
[630875.675005] Lustre: MGS: Connection restored to (at 10.9.0.64@o2ib4)
[630950.362251] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting
[630950.370550] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4)
[631960.444621] Lustre: MGS: haven't heard from client a41d0d9d-a539-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88888c686400, cur 1576618196 expire 1576618046 last 1576617969
[631960.463898] Lustre: Skipped 1 previous similar message
[632190.392098] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[638438.565205] Lustre: fir-MDT0000: Client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) reconnecting
[638438.573498] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.64@o2ib4)
[638438.580763] Lustre: Skipped 1 previous similar message
[638602.490735] Lustre: MGS: haven't heard from client bc2bce77-0ba4-4 (at 10.9.0.64@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8874bc3a6000, cur 1576624838 expire 1576624688 last 1576624611
[638602.509949] Lustre: Skipped 1 previous similar message
[638625.862627] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.108.36@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[638665.470760] Lustre: fir-MDT0000: haven't heard from client f58fa07b-04e0-4 (at 10.9.0.64@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887794850400, cur 1576624901 expire 1576624751 last 1576624674
[638680.865976] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.117.22@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[638680.883433] LustreError: Skipped 1 previous similar message
[638704.262118] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.108.32@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[638776.514914] Lustre: MGS: Connection restored to (at 10.9.0.64@o2ib4)
[639003.515116] Lustre: MGS: haven't heard from client bc2bce77-0ba4-4 (at 10.9.0.64@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8862d5177000, cur 1576625239 expire 1576625089 last 1576625012
[639425.190637] Lustre: MGS: Connection restored to (at 10.9.0.64@o2ib4)
[642278.493910] Lustre: fir-MDT0000: haven't heard from client 653ce015-8bf3-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88681b6fbc00, cur 1576628514 expire 1576628364 last 1576628287
[642425.332786] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[642425.340707] Lustre: Skipped 1 previous similar message
[642651.497685] Lustre: MGS: haven't heard from client 1d073fc5-1295-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8889cd224c00, cur 1576628887 expire 1576628737 last 1576628660
[642651.516957] Lustre: Skipped 1 previous similar message
[642829.949094] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[642829.957022] Lustre: Skipped 1 previous similar message
[644661.314593] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[644661.322521] Lustre: Skipped 1 previous similar message
[644687.509141] Lustre: fir-MDT0000: haven't heard from client 0d7a04f5-f5a0-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8861516fd400, cur 1576630923 expire 1576630773 last 1576630696
[644687.529111] Lustre: Skipped 1 previous similar message
[656659.171293] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[656659.179211] Lustre: Skipped 1 previous similar message
[656680.590742] Lustre: fir-MDT0000: haven't heard from client 88c5922b-f65a-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88796aef3800, cur 1576642916 expire 1576642766 last 1576642689
[656680.610713] Lustre: Skipped 1 previous similar message
[668431.659776] Lustre: fir-MDT0000: haven't heard from client fe5cd664-56d9-e528-59e2-96d905f86590 (at 10.9.106.54@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bd77fd800, cur 1576654667 expire 1576654517 last 1576654440
[668431.681651] Lustre: Skipped 1 previous similar message
[668561.612025] Lustre: MGS: Connection restored to (at 10.9.106.54@o2ib4)
[668561.618732] Lustre: Skipped 1 previous similar message
[668889.299563] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[668889.307486] Lustre: Skipped 1 previous similar message
[668929.662853] Lustre: fir-MDT0000: haven't heard from client ec27cecd-a691-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8859723d0400, cur 1576655165 expire 1576655015 last 1576654938
[668929.682820] Lustre: Skipped 1 previous similar message
[671060.893142] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[671060.899761] Lustre: Skipped 1 previous similar message
[671066.678033] Lustre: fir-MDT0000: haven't heard from client 93380483-86b2-258a-e516-dbc1a03c5266 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8877af489000, cur 1576657302 expire 1576657152 last 1576657075
[671066.699825] Lustre: Skipped 1 previous similar message
[671084.697162] Lustre: MGS: haven't heard from client b06fdd01-abed-a073-0cd4-d7fdd3d859e9 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888be1ab9c00, cur 1576657320 expire 1576657170 last 1576657093
[671505.496180] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[671505.502801] Lustre: Skipped 1 previous similar message
[671539.680833] Lustre: fir-MDT0000: haven't heard from client b29c34c9-93cb-1419-cf77-288c752f013e (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bca653c00, cur 1576657775 expire 1576657625 last 1576657548
[671847.908447] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[671847.915070] Lustre: Skipped 1 previous similar message
[671909.682863] Lustre: fir-MDT0000: haven't heard from client 6aa3d15f-2906-5e17-48f5-b50af253cad2 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bc23a4000, cur 1576658145 expire 1576657995 last 1576657918
[671909.704652] Lustre: Skipped 1 previous similar message
[672177.862765] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[672177.869388] Lustre: Skipped 1 previous similar message
[672226.691428] Lustre: MGS: haven't heard from client 99b8fdaa-c4fc-6181-2c7f-2f838d02a6b2 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886be7746000, cur 1576658462 expire 1576658312 last 1576658235
[672226.712522] Lustre: Skipped 1 previous similar message
[672451.509805] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[672451.516424] Lustre: Skipped 1 previous similar message
[672458.568721] Lustre: MGS: Received new LWP connection from 10.8.23.12@o2ib6, removing former export from same NID
[672481.685968] Lustre: fir-MDT0000: haven't heard from client ed897197-26c4-93f5-0721-7f42dd179858 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885aa9f84400, cur 1576658717 expire 1576658567 last 1576658490
[672481.707760] Lustre: Skipped 1 previous similar message
[672756.115967] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[672756.122592] Lustre: Skipped 2 previous similar messages
[672762.697402] Lustre: fir-MDT0000: haven't heard from client a9ffdedb-49ec-8c77-7f6c-a0ac833189d9 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887182335000, cur 1576658998 expire 1576658848 last 1576658771
[672762.719193] Lustre: Skipped 1 previous similar message
[672971.082125] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[672971.088749] Lustre: Skipped 1 previous similar message
[673027.687920] Lustre: fir-MDT0000: haven't heard from client c0939750-adcb-aaa1-0527-8cfbe6be2bd7 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886843a60c00, cur 1576659263 expire 1576659113 last 1576659036
[673027.709712] Lustre: Skipped 1 previous similar message
[673374.691908] Lustre: fir-MDT0000: haven't heard from client 864a9eac-f739-5ad3-0fe9-32d4aa05793e (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886be3b56400, cur 1576659610 expire 1576659460 last 1576659383
[673374.713716] Lustre: Skipped 1 previous similar message
[673485.789414] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[673485.796044] Lustre: Skipped 1 previous similar message
[673788.966310] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[673788.972928] Lustre: Skipped 1 previous similar message
[673826.713936] Lustre: fir-MDT0000: haven't heard from client 2387de52-2ee0-bc9c-c905-e414ef33241f (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887b2e22c400, cur 1576660062 expire 1576659912 last 1576659835
[673826.735721] Lustre: Skipped 1 previous similar message
[674117.695546] Lustre: fir-MDT0000: haven't heard from client 489f7d4c-26c6-79d4-d7dd-604d71b66f03 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886be3b50800, cur 1576660353 expire 1576660203 last 1576660126
[674117.717334] Lustre: Skipped 1 previous similar message
[674174.854732] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[674174.861352] Lustre: Skipped 1 previous similar message
[674765.970375] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[674765.977001] Lustre: Skipped 3 previous similar messages
[674802.705174] Lustre: fir-MDT0000: haven't heard from client e9b5fcc4-09ef-855d-0837-6c2594706fa6 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886959410c00, cur 1576661038 expire 1576660888 last 1576660811
[674802.726963] Lustre: Skipped 3 previous similar messages
[675348.760765] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[675348.767412] Lustre: Skipped 3 previous similar messages
[675403.702110] Lustre: fir-MDT0000: haven't heard from client d1de590e-4dea-0fd6-ce47-c151b1599b9d (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8875fbf39c00, cur 1576661639 expire 1576661489 last 1576661412
[675403.723899] Lustre: Skipped 3 previous similar messages
[676271.708704] Lustre: fir-MDT0000: haven't heard from client 705b95a0-d00a-7278-f5b7-c1f05d738e0b (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885b5a696c00, cur 1576662507 expire 1576662357 last 1576662280
[676271.730495] Lustre: Skipped 3 previous similar messages
[676289.527086] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[676289.533704] Lustre: Skipped 3 previous similar messages
[677446.767603] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[677446.774223] Lustre: Skipped 3 previous similar messages
[677482.716420] Lustre: fir-MDT0000: haven't heard from client f05f114a-65a7-b562-ff7f-239154392501 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886be7743800, cur 1576663718 expire 1576663568 last 1576663491
[677482.738235] Lustre: Skipped 3 previous similar messages
[678111.993270] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[678111.999896] Lustre: Skipped 1 previous similar message
[678126.718877] Lustre: fir-MDT0000: haven't heard from client 6fe77124-1fcd-50a0-711e-f97c40a83dda (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886843a65c00, cur 1576664362 expire 1576664212 last 1576664135
[678126.740662] Lustre: Skipped 1 previous similar message
[678745.033828] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[678745.040448] Lustre: Skipped 1 previous similar message
[678766.736474] Lustre: fir-MDT0000: haven't heard from client 2331bb8b-7d97-51ba-e90f-a2fe68e714e2 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886a96f89c00, cur 1576665002 expire 1576664852 last 1576664775
[678766.758267] Lustre: Skipped 1 previous similar message
[679349.738950] Lustre: fir-MDT0000: haven't heard from client e7436832-f19a-c881-ec4c-1805f620820a (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886a96f8cc00, cur 1576665585 expire 1576665435 last 1576665358
[679349.760737] Lustre: Skipped 1 previous similar message
[679565.140430] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[679565.147053] Lustre: Skipped 3 previous similar messages
[679567.224397] Lustre: 109680:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576665795/real 1576665795] req@ffff886132e97080 x1652543838447536/t0(0) o104->fir-MDT0000@10.8.23.12@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1576665802 ref 1 fl Rpc:X/0/ffffffff rc 0/-1
[679567.252115] LustreError: 109680:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) ### client (nid 10.8.23.12@o2ib6) returned error from blocking AST (req@ffff886132e97080 x1652543838447536 status -107 rc -107), evict it ns: mdt-fir-MDT0000_UUID lock: ffff888a647f1b00/0xc3c20c0ec64a5cf5 lrc: 4/0,0 mode: PR/PR res: [0x20003c290:0x5:0x0].0x0 bits 0x13/0x0 rrc: 5 type: IBT flags: 0x60200400000020 nid: 10.8.23.12@o2ib6 remote: 0x364353974bdce609 expref: 17 pid: 109718 timeout: 679712 lvb_type: 0
[679567.295177] LustreError: 138-a: fir-MDT0000: A client on nid 10.8.23.12@o2ib6 was evicted due to a lock blocking callback time out: rc -107
[679567.307815] LustreError: 38883:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 7s: evicting client at 10.8.23.12@o2ib6 ns: mdt-fir-MDT0000_UUID lock: ffff888a647f1b00/0xc3c20c0ec64a5cf5 lrc: 3/0,0 mode: PR/PR res: [0x20003c290:0x5:0x0].0x0 bits 0x13/0x0 rrc: 5 type: IBT flags: 0x60200400000020 nid: 10.8.23.12@o2ib6 remote: 0x364353974bdce609 expref: 18 pid: 109718 timeout: 0 lvb_type: 0
[680119.732144] Lustre: fir-MDT0000: haven't heard from client c38b160d-6763-df94-54e9-8b1f2db3f0ce (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885d5a709800, cur 1576666355 expire 1576666205 last 1576666128
[680119.753936] Lustre: Skipped 2 previous similar messages
[680351.211425] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[680351.218045] Lustre: Skipped 3 previous similar messages
[681033.334987] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[681033.341614] Lustre: Skipped 3 previous similar messages
[681069.737491] Lustre: fir-MDT0000: haven't heard from client 0ea389b2-de13-6136-a0ba-55dec401e714 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ae27be000, cur 1576667305 expire 1576667155 last 1576667078
[681069.759282] Lustre: Skipped 5 previous similar messages
[681823.580437] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[681823.587060] Lustre: Skipped 3 previous similar messages
[681853.743351] Lustre: fir-MDT0000: haven't heard from client e55c1258-adc7-267c-2f5f-e757495b18f3 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8858eab6e800, cur 1576668089 expire 1576667939 last 1576667862
[681853.765147] Lustre: Skipped 3 previous similar messages
[682862.491068] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[682862.497694] Lustre: Skipped 3 previous similar messages
[682893.752655] Lustre: fir-MDT0000: haven't heard from client 87d84b14-46a9-9b8f-6c7b-3d4928597b8c (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886959415400, cur 1576669129 expire 1576668979 last 1576668902
[682893.774443] Lustre: Skipped 3 previous similar messages
[683865.441138] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[683865.447767] Lustre: Skipped 3 previous similar messages
[683925.756642] Lustre: fir-MDT0000: haven't heard from client dbf0a08d-6284-f4a3-9946-570efb85b733 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885b7ddf4400, cur 1576670161 expire 1576670011 last 1576669934
[683925.778434] Lustre: Skipped 3 previous similar messages
[684590.974307] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[684590.980934] Lustre: Skipped 3 previous similar messages
[684646.761032] Lustre: fir-MDT0000: haven't heard from client 4d4f2f5d-4ea3-757b-957f-10a47053b3d7 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8864c61bb800, cur 1576670882 expire 1576670732 last 1576670655
[684646.782818] Lustre: Skipped 3 previous similar messages
[685269.765057] Lustre: fir-MDT0000: haven't heard from client 454f6aed-add7-6976-5747-8c2c8c896c7c (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886959413000, cur 1576671505 expire 1576671355 last 1576671278
[685269.786844] Lustre: Skipped 5 previous similar messages
[685378.226422] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[685378.233049] Lustre: Skipped 5 previous similar messages
[686018.768133] Lustre: fir-MDT0000: haven't heard from client 659061a8-277f-ce03-0fa2-11c03bdab36f (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885b82b43000, cur 1576672254 expire 1576672104 last 1576672027
[686018.789920] Lustre: Skipped 3 previous similar messages
[686248.768713] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[686248.775336] Lustre: Skipped 3 previous similar messages
[686752.773784] Lustre: fir-MDT0000: haven't heard from client 07cbdc7f-4902-010e-eddb-b7047ad9f233 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8859a4e3f800, cur 1576672988 expire 1576672838 last 1576672761
[686752.795569] Lustre: Skipped 1 previous similar message
[687300.506053] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[687300.512679] Lustre: Skipped 3 previous similar messages
[687879.781234] Lustre: fir-MDT0000: haven't heard from client fefa84d0-2f63-1e28-718a-1cd11050a928 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886959412000, cur 1576674115 expire 1576673965 last 1576673888
[687879.803034] Lustre: Skipped 3 previous similar messages
[687913.215073] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[687913.221700] Lustre: Skipped 1 previous similar message
[689076.789259] Lustre: fir-MDT0000: haven't heard from client 2eeeb8d4-0a03-2b83-8b6a-6a2253c3a0bf (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887bb2982400, cur 1576675312 expire 1576675162 last 1576675085
[689076.811050] Lustre: Skipped 3 previous similar messages
[689105.750680] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[689105.757306] Lustre: Skipped 3 previous similar messages
[689551.762504] Lustre: MGS: Received new LWP connection from 10.8.23.12@o2ib6, removing former export from same NID
[689558.832585] Lustre: MGS: Received new LWP connection from 10.8.23.12@o2ib6, removing former export from same NID
[690570.100943] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[690570.107570] Lustre: Skipped 6 previous similar messages
[690570.798599] Lustre: fir-MDT0000: haven't heard from client 8df10202-61a2-c031-d4ad-86aad511590e (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885035b01400, cur 1576676806 expire 1576676656 last 1576676579
[690570.820391] Lustre: Skipped 3 previous similar messages
[690869.732746] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[690869.739367] Lustre: Skipped 1 previous similar message
[690923.801231] Lustre: MGS: haven't heard from client dfd86e8e-c905-e086-6687-d251cbb9f88f (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8852dd546000, cur 1576677159 expire 1576677009 last 1576676932
[690923.822326] Lustre: Skipped 1 previous similar message
[691072.039479] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[691072.046099] Lustre: Skipped 1 previous similar message
[691147.802453] Lustre: fir-MDT0000: haven't heard from client b24a4f1e-9368-94cb-9430-16e608bc906e (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it.
exp ffff886b413c0000, cur 1576677383 expire 1576677233 last 1576677156 [691147.824351] Lustre: Skipped 1 previous similar message [691455.031661] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [691455.038291] Lustre: Skipped 1 previous similar message [691500.804919] Lustre: fir-MDT0000: haven't heard from client 73e6129c-8f95-32f1-1a81-6a0ef0b59591 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8872b460a400, cur 1576677736 expire 1576677586 last 1576677509 [691500.826709] Lustre: Skipped 1 previous similar message [706089.587437] Lustre: MGS: Connection restored to (at 10.8.27.17@o2ib6) [706089.594058] Lustre: Skipped 3 previous similar messages [710398.136397] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [710398.144318] Lustre: Skipped 1 previous similar message [710437.922412] Lustre: fir-MDT0000: haven't heard from client 6f16a2d8-fd3e-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885b2a3a7400, cur 1576696673 expire 1576696523 last 1576696446 [710437.942409] Lustre: Skipped 3 previous similar messages [716716.957626] Lustre: fir-MDT0000: haven't heard from client 5ffdc310-a01d-df74-cc3e-4a3b4238a4fb (at 10.9.107.28@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8879a8f26c00, cur 1576702952 expire 1576702802 last 1576702725 [716716.979502] Lustre: Skipped 1 previous similar message [718239.287022] Lustre: MGS: Connection restored to 4c497e0b-ea41-4 (at 10.8.9.1@o2ib6) [718239.294774] Lustre: Skipped 1 previous similar message [718815.155375] Lustre: MGS: Connection restored to 5ffdc310-a01d-df74-cc3e-4a3b4238a4fb (at 10.9.107.28@o2ib4) [718815.165205] Lustre: Skipped 1 previous similar message [718869.035593] Lustre: MGS: Connection restored to (at 10.9.110.28@o2ib4) [718869.042304] Lustre: Skipped 1 previous similar message [719019.668662] Lustre: MGS: Connection restored to 65640447-d392-3a43-02f3-34e3f7e53218 (at 10.9.107.30@o2ib4) [719019.678489] Lustre: Skipped 1 previous similar message [719179.427733] Lustre: MGS: Connection restored to (at 10.9.108.24@o2ib4) [719179.434447] Lustre: Skipped 3 previous similar messages [720071.977344] Lustre: fir-MDT0000: haven't heard from client 5149fe88-a20b-4 (at 10.9.107.28@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887b6cf20000, cur 1576706307 expire 1576706157 last 1576706080 [720071.997440] Lustre: Skipped 5 previous similar messages [720089.090169] Lustre: MGS: Connection restored to adf9d391-23a2-4 (at 10.9.102.17@o2ib4) [720089.098178] Lustre: Skipped 1 previous similar message [720235.691074] LNetError: 38663:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5) [722062.491888] Lustre: MGS: Connection restored to 5ffdc310-a01d-df74-cc3e-4a3b4238a4fb (at 10.9.107.28@o2ib4) [722062.501717] Lustre: Skipped 1 previous similar message [722969.993828] Lustre: fir-MDT0000: haven't heard from client 1063f5ae-12fd-4 (at 10.9.109.37@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff886834ad2c00, cur 1576709205 expire 1576709055 last 1576708978 [722970.013882] Lustre: Skipped 1 previous similar message [723010.298870] Lustre: MGS: Connection restored to (at 10.9.109.37@o2ib4) [723010.305585] Lustre: Skipped 1 previous similar message [726458.198126] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [726458.206052] Lustre: Skipped 1 previous similar message [726500.022471] Lustre: fir-MDT0000: haven't heard from client f3a0b844-8cbe-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8863ff2d2000, cur 1576712735 expire 1576712585 last 1576712508 [726500.042440] Lustre: Skipped 1 previous similar message [726695.014991] Lustre: fir-MDT0000: haven't heard from client 59ea0915-118d-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887402b32c00, cur 1576712930 expire 1576712780 last 1576712703 [726695.034982] Lustre: Skipped 1 previous similar message [726711.020989] Lustre: MGS: haven't heard from client d7887ac9-5fcd-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88799cf64400, cur 1576712946 expire 1576712796 last 1576712719 [726719.468557] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [726719.476478] Lustre: Skipped 1 previous similar message [726941.482353] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [726941.490281] Lustre: Skipped 1 previous similar message [726998.015644] Lustre: fir-MDT0000: haven't heard from client 8a9e4634-3fbc-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885b81bb3c00, cur 1576713233 expire 1576713083 last 1576713006 [727179.017561] Lustre: fir-MDT0000: haven't heard from client 4ba14603-8822-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff886aebb92400, cur 1576713414 expire 1576713264 last 1576713187 [727179.037530] Lustre: Skipped 1 previous similar message [727195.019356] Lustre: MGS: haven't heard from client 9c959227-d5ae-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887402b30c00, cur 1576713430 expire 1576713280 last 1576713203 [727197.723324] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [727197.731244] Lustre: Skipped 1 previous similar message [728813.394729] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [728813.402661] Lustre: Skipped 1 previous similar message [728831.027136] Lustre: fir-MDT0000: haven't heard from client a13e4d86-f705-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88799cf60c00, cur 1576715066 expire 1576714916 last 1576714839 [729051.028323] Lustre: fir-MDT0000: haven't heard from client f686eefc-c0ae-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885dd51cc400, cur 1576715286 expire 1576715136 last 1576715059 [729051.048290] Lustre: Skipped 1 previous similar message [729067.035812] Lustre: MGS: haven't heard from client ed064400-97f6-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887be3afec00, cur 1576715302 expire 1576715152 last 1576715075 [729273.379377] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [729273.387299] Lustre: Skipped 1 previous similar message [750267.642058] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [750267.648679] Lustre: Skipped 1 previous similar message [750325.149839] Lustre: fir-MDT0000: haven't heard from client f72045bf-1617-e0bb-0624-d123eac98487 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887402b36800, cur 1576736560 expire 1576736410 last 1576736333 [759222.277691] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [759222.284311] Lustre: Skipped 1 previous similar message [759275.200777] Lustre: fir-MDT0000: haven't heard from client 7927add0-52c1-a3ff-0d0e-fdf417784046 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff884deea03000, cur 1576745510 expire 1576745360 last 1576745283 [759275.222572] Lustre: Skipped 1 previous similar message [763027.222947] Lustre: MGS: haven't heard from client 82e3aa28-2e97-4 (at 10.8.25.17@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8869736e0c00, cur 1576749262 expire 1576749112 last 1576749035 [763027.242222] Lustre: Skipped 1 previous similar message [765083.327261] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [765083.333884] Lustre: Skipped 1 previous similar message [765145.236450] Lustre: fir-MDT0000: haven't heard from client 46c0ec76-2bf3-e5cf-22f0-c61996638e05 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886781b84800, cur 1576751380 expire 1576751230 last 1576751153 [765145.258241] Lustre: Skipped 1 previous similar message [773349.600201] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [773349.606829] Lustre: Skipped 1 previous similar message [773365.288560] Lustre: fir-MDT0000: haven't heard from client 54f0e2f1-dc1e-4343-adb6-a91d24596d2f (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8863b9fe7800, cur 1576759600 expire 1576759450 last 1576759373 [773365.310357] Lustre: Skipped 1 previous similar message [777391.807354] Lustre: MGS: Received new LWP connection from 10.9.0.62@o2ib4, removing former export from same NID [777391.817556] Lustre: MGS: Connection restored to (at 10.9.0.62@o2ib4) [777391.824091] Lustre: Skipped 1 previous similar message [777398.840811] Lustre: fir-MDT0000: Client cfe93466-ba97-4 (at 10.9.0.62@o2ib4) reconnecting [777398.849105] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.62@o2ib4) [777425.063166] Lustre: MGS: Received new LWP connection from 10.9.0.62@o2ib4, removing former export from same NID [777425.063226] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.9.0.62@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [777425.063229] LustreError: Skipped 2 previous similar messages [777425.096390] Lustre: MGS: Connection restored to (at 10.9.0.62@o2ib4) [777445.525703] Lustre: MGS: Received new LWP connection from 10.9.0.62@o2ib4, removing former export from same NID [777445.535905] Lustre: MGS: Connection restored to (at 10.9.0.62@o2ib4) [777475.111459] Lustre: MGS: Received new LWP connection from 10.9.0.62@o2ib4, removing former export from same NID [777475.111461] Lustre: fir-MDT0000: Client cfe93466-ba97-4 (at 10.9.0.62@o2ib4) reconnecting [777475.111491] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.62@o2ib4) [777475.111667] LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.9.0.62@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[777475.111669] LustreError: Skipped 1 previous similar message [784615.704161] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [784615.710784] Lustre: Skipped 1 previous similar message [784641.356439] Lustre: fir-MDT0000: haven't heard from client bee6a8da-2dd5-4c96-ed32-d30b9cff656b (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8879ed2afc00, cur 1576770876 expire 1576770726 last 1576770649 [784641.378234] Lustre: Skipped 1 previous similar message [790234.524984] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [790234.531611] Lustre: Skipped 1 previous similar message [790238.389224] Lustre: fir-MDT0000: haven't heard from client e8cd63ba-aa3f-9f66-e5a0-8dfe3b012ff9 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8860de990400, cur 1576776473 expire 1576776323 last 1576776246 [790238.411015] Lustre: Skipped 1 previous similar message [791265.394565] Lustre: fir-MDT0000: haven't heard from client f0ea0151-65a4-e3bf-59dc-9776b66a57c8 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88576a7d9800, cur 1576777500 expire 1576777350 last 1576777273 [791265.416351] Lustre: Skipped 1 previous similar message [791271.425011] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [791271.431630] Lustre: Skipped 1 previous similar message [791591.879495] Lustre: MGS: Connection restored to cd53d895-c9de-7824-7c50-8d47ae41d0c4 (at 10.9.112.15@o2ib4) [791591.889326] Lustre: Skipped 1 previous similar message [791608.405158] Lustre: MGS: haven't heard from client b69866e6-5348-b248-5281-f381907c96a5 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff888bd0ac2c00, cur 1576777843 expire 1576777693 last 1576777616 [791608.426343] Lustre: Skipped 1 previous similar message [792232.452086] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [792232.458710] Lustre: Skipped 1 previous similar message [792252.407605] Lustre: MGS: haven't heard from client 03020c72-3a9b-e580-d96c-a6c2846e7534 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8850fe1f5800, cur 1576778487 expire 1576778337 last 1576778260 [792252.428702] Lustre: Skipped 1 previous similar message [793508.978803] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [793508.985426] Lustre: Skipped 1 previous similar message [793564.406217] Lustre: fir-MDT0000: haven't heard from client 117f7c13-7b12-7ab1-4c14-5882e0255356 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88639d2b5c00, cur 1576779799 expire 1576779649 last 1576779572 [793564.428012] Lustre: Skipped 1 previous similar message [794030.605286] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [794030.611908] Lustre: Skipped 1 previous similar message [794088.409059] Lustre: MGS: haven't heard from client fab393dc-f95c-2a66-e288-8a751d625162 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886939a70c00, cur 1576780323 expire 1576780173 last 1576780096 [794088.430160] Lustre: Skipped 1 previous similar message [795139.438863] Lustre: fir-MDT0000: haven't heard from client 201ed33e-45ac-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8868f93f0800, cur 1576781374 expire 1576781224 last 1576781147 [795139.458834] Lustre: Skipped 1 previous similar message [795540.735086] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [795540.743007] Lustre: Skipped 1 previous similar message [795767.429183] Lustre: MGS: haven't heard from client a4272150-56e1-4 (at 10.8.18.35@o2ib6) in 227 seconds. 
I think it's dead, and I am evicting it. exp ffff888b75e06800, cur 1576782002 expire 1576781852 last 1576781775 [795767.448457] Lustre: Skipped 1 previous similar message [795865.433257] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [795865.441182] Lustre: Skipped 1 previous similar message [796065.999271] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [796066.007196] Lustre: Skipped 1 previous similar message [796092.431401] Lustre: MGS: haven't heard from client c8bccde4-56f8-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885cf1a5ec00, cur 1576782327 expire 1576782177 last 1576782100 [796092.450683] Lustre: Skipped 1 previous similar message [796102.426368] Lustre: fir-MDT0000: haven't heard from client 764b5a25-04c2-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8851f973a000, cur 1576782337 expire 1576782187 last 1576782110 [796287.015741] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [796287.023667] Lustre: Skipped 1 previous similar message [796344.422267] Lustre: MGS: haven't heard from client f5b551a6-9571-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887991740000, cur 1576782579 expire 1576782429 last 1576782352 [796665.440338] Lustre: fir-MDT0000: haven't heard from client 90c942ca-d4fe-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885d11604000, cur 1576782900 expire 1576782750 last 1576782673 [796665.460308] Lustre: Skipped 1 previous similar message [796737.901924] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [796737.909840] Lustre: Skipped 1 previous similar message [798619.446179] Lustre: MGS: haven't heard from client de2b3c8a-5ac0-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff886e0d70fc00, cur 1576784854 expire 1576784704 last 1576784627 [798619.465451] Lustre: Skipped 1 previous similar message [798967.316579] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [798967.324499] Lustre: Skipped 1 previous similar message [799131.406200] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [799131.414121] Lustre: Skipped 1 previous similar message [799193.458779] Lustre: MGS: haven't heard from client b35757f9-6ff5-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885a53bf7000, cur 1576785428 expire 1576785278 last 1576785201 [799193.478055] Lustre: Skipped 1 previous similar message [799295.702030] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [799295.709953] Lustre: Skipped 1 previous similar message [799357.439969] Lustre: MGS: haven't heard from client ee7058b8-bbea-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8886a6d5c800, cur 1576785592 expire 1576785442 last 1576785365 [799357.459249] Lustre: Skipped 1 previous similar message [799379.771051] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [799379.777686] Lustre: Skipped 1 previous similar message [799464.528680] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [799464.536604] Lustre: Skipped 1 previous similar message [799522.441538] Lustre: MGS: haven't heard from client b847114e-4bf0-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88586d68a400, cur 1576785757 expire 1576785607 last 1576785530 [799522.460815] Lustre: Skipped 3 previous similar messages [799652.689982] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [799652.697898] Lustre: Skipped 1 previous similar message [799691.442307] Lustre: fir-MDT0000: haven't heard from client 30fd70ba-6a12-4 (at 10.8.18.35@o2ib6) in 227 seconds. 
I think it's dead, and I am evicting it. exp ffff886af76ab400, cur 1576785926 expire 1576785776 last 1576785699 [799691.462317] Lustre: Skipped 1 previous similar message [799828.951011] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [799828.958942] Lustre: Skipped 1 previous similar message [799879.452533] Lustre: MGS: haven't heard from client 8dd20d48-eebd-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885415ab5800, cur 1576786114 expire 1576785964 last 1576785887 [799879.471810] Lustre: Skipped 1 previous similar message [800157.444727] Lustre: fir-MDT0000: haven't heard from client da29cc5c-bfd4-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff884eb7390800, cur 1576786392 expire 1576786242 last 1576786165 [800157.464697] Lustre: Skipped 1 previous similar message [800320.780443] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [800320.788367] Lustre: Skipped 1 previous similar message [800494.771782] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [800494.779705] Lustre: Skipped 1 previous similar message [800547.459415] Lustre: MGS: haven't heard from client 2ff09f2f-ae13-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885b1b66e400, cur 1576786782 expire 1576786632 last 1576786555 [800547.478695] Lustre: Skipped 1 previous similar message [800721.458429] Lustre: MGS: haven't heard from client 97825372-b33d-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff88547923b400, cur 1576786956 expire 1576786806 last 1576786729 [800721.477711] Lustre: Skipped 1 previous similar message [800870.809900] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [800870.817822] Lustre: Skipped 3 previous similar messages [804004.467462] Lustre: fir-MDT0000: haven't heard from client 96d80704-f62c-4 (at 10.9.106.8@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887416f00000, cur 1576790239 expire 1576790089 last 1576790012 [804004.487439] Lustre: Skipped 3 previous similar messages [805920.827056] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [805920.834982] Lustre: Skipped 1 previous similar message [805966.485191] Lustre: fir-MDT0000: haven't heard from client bbe9648b-c8ea-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886acbfa7800, cur 1576792201 expire 1576792051 last 1576791974 [805966.505161] Lustre: Skipped 1 previous similar message [806100.525169] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [806100.533096] Lustre: Skipped 3 previous similar messages [806147.480773] Lustre: MGS: haven't heard from client e8b7e985-79f3-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887691b06400, cur 1576792382 expire 1576792232 last 1576792155 [806147.500052] Lustre: Skipped 1 previous similar message [806157.479731] Lustre: fir-MDT0000: haven't heard from client 1c6ebb13-afaf-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88770cb5e400, cur 1576792392 expire 1576792242 last 1576792165 [806318.678965] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [806318.686894] Lustre: Skipped 1 previous similar message [806338.491367] Lustre: fir-MDT0000: haven't heard from client 4c686703-0121-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887a66a67800, cur 1576792573 expire 1576792423 last 1576792346 [806556.491551] Lustre: fir-MDT0000: haven't heard from client 7a6c08ba-07ea-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88787772f000, cur 1576792791 expire 1576792641 last 1576792564 [806556.511526] Lustre: Skipped 1 previous similar message [806732.118196] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [806732.126115] Lustre: Skipped 3 previous similar messages [806796.482360] Lustre: fir-MDT0000: haven't heard from client fab49826-afd1-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88630005cc00, cur 1576793031 expire 1576792881 last 1576792804 [806796.502325] Lustre: Skipped 1 previous similar message [809501.292585] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [809501.299206] Lustre: Skipped 1 previous similar message [809547.502149] Lustre: fir-MDT0000: haven't heard from client d2c6d8c7-45ac-8479-c28c-25f90e88a1fa (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885dedfabc00, cur 1576795782 expire 1576795632 last 1576795555 [809547.523942] Lustre: Skipped 1 previous similar message [809986.916243] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [809986.924161] Lustre: Skipped 1 previous similar message [810021.504689] Lustre: fir-MDT0000: haven't heard from client 20ebae6c-3709-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885cfff6a800, cur 1576796256 expire 1576796106 last 1576796029 [810021.524663] Lustre: Skipped 1 previous similar message [810225.587169] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [810225.595094] Lustre: Skipped 1 previous similar message [810290.513024] Lustre: MGS: haven't heard from client ea14910f-14b1-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887040314800, cur 1576796525 expire 1576796375 last 1576796298 [810290.532303] Lustre: Skipped 1 previous similar message [810452.504353] Lustre: fir-MDT0000: haven't heard from client 791fc2bc-5ce4-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885d013bc000, cur 1576796687 expire 1576796537 last 1576796460 [810452.524327] Lustre: Skipped 1 previous similar message [810611.898337] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [810611.906263] Lustre: Skipped 3 previous similar messages [810676.505509] Lustre: fir-MDT0000: haven't heard from client 79f076d1-4dc9-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886bc077b400, cur 1576796911 expire 1576796761 last 1576796684 [810676.525482] Lustre: Skipped 1 previous similar message [812490.532054] Lustre: fir-MDT0000: haven't heard from client f2d99824-ba6a-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8869ac78d000, cur 1576798725 expire 1576798575 last 1576798498 [812490.552022] Lustre: Skipped 3 previous similar messages [812523.015738] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [812523.023661] Lustre: Skipped 3 previous similar messages [814181.526797] Lustre: fir-MDT0000: haven't heard from client 477632c3-3c0c-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8859e6e4e400, cur 1576800416 expire 1576800266 last 1576800189 [814181.546768] Lustre: Skipped 1 previous similar message [814219.716338] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [814219.724261] Lustre: Skipped 1 previous similar message [814435.229897] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [814435.237818] Lustre: Skipped 1 previous similar message [814446.540272] Lustre: fir-MDT0000: haven't heard from client 72e86c91-af8a-4 (at 10.8.18.35@o2ib6) in 227 seconds. 
I think it's dead, and I am evicting it. exp ffff885b22356000, cur 1576800681 expire 1576800531 last 1576800454 [814446.560241] Lustre: Skipped 1 previous similar message [814603.568275] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [814603.576203] Lustre: Skipped 1 previous similar message [814661.529697] Lustre: fir-MDT0000: haven't heard from client 16210ec7-fc21-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8854ad757400, cur 1576800896 expire 1576800746 last 1576800669 [814661.549689] Lustre: Skipped 1 previous similar message [814773.512529] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [814773.520465] Lustre: Skipped 1 previous similar message [814830.558125] Lustre: fir-MDT0000: haven't heard from client 6c9ffc1f-8be6-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885b06b27c00, cur 1576801065 expire 1576800915 last 1576800838 [814830.578098] Lustre: Skipped 1 previous similar message [816881.924483] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [816881.932408] Lustre: Skipped 1 previous similar message [816908.548838] Lustre: fir-MDT0000: haven't heard from client b9e3060a-a60d-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886bc077e000, cur 1576803143 expire 1576802993 last 1576802916 [816908.568811] Lustre: Skipped 1 previous similar message [818456.017121] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6) [818456.025049] Lustre: Skipped 1 previous similar message [818490.553790] Lustre: fir-MDT0000: haven't heard from client 89a3805e-af95-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff884be1dd4000, cur 1576804725 expire 1576804575 last 1576804498
[818490.573760] Lustre: Skipped 1 previous similar message
[820766.568031] Lustre: MGS: haven't heard from client 15cf8886-b8a1-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887b58693000, cur 1576807001 expire 1576806851 last 1576806774
[820766.587306] Lustre: Skipped 1 previous similar message
[820774.411566] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[820774.419490] Lustre: Skipped 1 previous similar message
[820940.378235] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[820940.386157] Lustre: Skipped 1 previous similar message
[821000.595371] Lustre: MGS: haven't heard from client 4a4741a5-c4f7-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8874dfff0400, cur 1576807235 expire 1576807085 last 1576807008
[821000.614649] Lustre: Skipped 1 previous similar message
[821102.588676] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[821102.596594] Lustre: Skipped 1 previous similar message
[821166.631221] Lustre: MGS: haven't heard from client 5c8646f0-0b83-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8874dfff2400, cur 1576807401 expire 1576807251 last 1576807174
[821166.650490] Lustre: Skipped 1 previous similar message
[821285.808961] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[821285.816897] Lustre: Skipped 1 previous similar message
[821329.571313] Lustre: fir-MDT0000: haven't heard from client d2b135ef-b84e-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8863377ff400, cur 1576807564 expire 1576807414 last 1576807337
[821329.591287] Lustre: Skipped 1 previous similar message
[821468.084002] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[821468.091919] Lustre: Skipped 1 previous similar message
[821512.602564] Lustre: MGS: haven't heard from client 7c9495a2-f68f-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8851d567d400, cur 1576807747 expire 1576807597 last 1576807520
[821512.621843] Lustre: Skipped 1 previous similar message
[821522.575086] Lustre: fir-MDT0000: haven't heard from client 45c228e8-2d14-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885de43eb800, cur 1576807757 expire 1576807607 last 1576807530
[824154.588504] Lustre: fir-MDT0000: haven't heard from client 5442e2d6-71a8-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8863d3f69400, cur 1576810389 expire 1576810239 last 1576810162
[824619.805697] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[824619.813623] Lustre: Skipped 1 previous similar message
[824898.594108] Lustre: fir-MDT0000: haven't heard from client 56cb74a7-69d5-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8864787a9000, cur 1576811133 expire 1576810983 last 1576810906
[824898.614081] Lustre: Skipped 1 previous similar message
[824953.226692] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[824953.234617] Lustre: Skipped 1 previous similar message
[825141.899418] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[825141.907344] Lustre: Skipped 1 previous similar message
[825179.620148] Lustre: MGS: haven't heard from client 26b31086-fbcf-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8863d3f6dc00, cur 1576811414 expire 1576811264 last 1576811187
[825179.639426] Lustre: Skipped 1 previous similar message
[825190.602493] Lustre: fir-MDT0000: haven't heard from client 801c62bd-f9d8-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887541a11c00, cur 1576811425 expire 1576811275 last 1576811198
[825347.697999] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[825347.705924] Lustre: Skipped 1 previous similar message
[825379.596191] Lustre: fir-MDT0000: haven't heard from client d77982a6-9493-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88741a648400, cur 1576811614 expire 1576811464 last 1576811387
[825395.598271] Lustre: MGS: haven't heard from client 2647b95e-c3f2-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885de43eb800, cur 1576811630 expire 1576811480 last 1576811403
[831019.712050] Lustre: MGS: Connection restored to e15078c5-8209-4 (at 10.8.25.17@o2ib6)
[831019.719972] Lustre: Skipped 1 previous similar message
[831883.837300] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[831883.843926] Lustre: Skipped 1 previous similar message
[831932.654946] Lustre: fir-MDT0000: haven't heard from client a4ee766f-590e-d003-5a63-f2123be6d840 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8868af6ddc00, cur 1576818167 expire 1576818017 last 1576817940
[878283.927573] Lustre: fir-MDT0000: haven't heard from client 764d8bd0-b1e2-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88670764b000, cur 1576864518 expire 1576864368 last 1576864291
[878283.947560] Lustre: Skipped 1 previous similar message
[878413.573357] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[878413.581279] Lustre: Skipped 1 previous similar message
[880949.944550] Lustre: fir-MDT0000: haven't heard from client a326b420-cfe9-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887a0828b800, cur 1576867184 expire 1576867034 last 1576866957
[880949.964522] Lustre: Skipped 1 previous similar message
[881253.301867] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[881253.309794] Lustre: Skipped 1 previous similar message
[881490.003876] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[881490.011798] Lustre: Skipped 1 previous similar message
[881531.959271] Lustre: fir-MDT0000: haven't heard from client 910bbd1b-ba34-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886a5eb2c800, cur 1576867766 expire 1576867616 last 1576867539
[881531.979234] Lustre: Skipped 1 previous similar message
[887463.007952] Lustre: MGS: haven't heard from client 9d109c58-f17e-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88869d7b3400, cur 1576873697 expire 1576873547 last 1576873470
[887463.027228] Lustre: Skipped 1 previous similar message
[887599.131483] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[887599.139402] Lustre: Skipped 1 previous similar message
[887760.995545] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[887761.003470] Lustre: Skipped 1 previous similar message
[887825.986174] Lustre: MGS: haven't heard from client 32831367-50d9-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888ab869b000, cur 1576874060 expire 1576873910 last 1576873833
[887826.005465] Lustre: Skipped 1 previous similar message
[887924.476127] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[887924.484051] Lustre: Skipped 1 previous similar message
[887987.985407] Lustre: MGS: haven't heard from client 76815669-c8c0-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88678a762800, cur 1576874222 expire 1576874072 last 1576873995
[887988.004686] Lustre: Skipped 1 previous similar message
[888100.262493] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[888100.270415] Lustre: Skipped 1 previous similar message
[888150.987976] Lustre: MGS: haven't heard from client 9f7fed8f-244f-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8857abdc5400, cur 1576874385 expire 1576874235 last 1576874158
[888151.007248] Lustre: Skipped 1 previous similar message
[888161.994649] Lustre: fir-MDT0000: haven't heard from client f9277a0d-f76e-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88689c761c00, cur 1576874396 expire 1576874246 last 1576874169
[893254.957957] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[893254.965877] Lustre: Skipped 1 previous similar message
[893280.015215] Lustre: fir-MDT0000: haven't heard from client 2bfa8734-5475-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885fb1ec7c00, cur 1576879514 expire 1576879364 last 1576879287
[893435.283560] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[893435.291481] Lustre: Skipped 1 previous similar message
[893481.022350] Lustre: MGS: haven't heard from client 9a791d5c-037f-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ad16fe800, cur 1576879715 expire 1576879565 last 1576879488
[893481.041627] Lustre: Skipped 1 previous similar message
[893941.067419] Lustre: fir-MDT0000: haven't heard from client 2ed5eef4-d480-4 (at 10.8.9.1@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886ab9bba400, cur 1576880175 expire 1576880025 last 1576879948
[893941.087228] Lustre: Skipped 1 previous similar message
[895389.570331] Lustre: MGS: Connection restored to 4c497e0b-ea41-4 (at 10.8.9.1@o2ib6)
[895389.578081] Lustre: Skipped 1 previous similar message
[895599.364088] Lustre: MGS: Connection restored to b34be8aa-32d9-4 (at 10.9.113.13@o2ib4)
[895599.372102] Lustre: Skipped 1 previous similar message
[895626.590328] Lustre: MGS: Connection restored to da9f6e55-12b4-4 (at 10.9.112.5@o2ib4)
[895626.598253] Lustre: Skipped 1 previous similar message
[895655.265553] Lustre: MGS: Connection restored to a83208a9-361d-4 (at 10.9.112.4@o2ib4)
[895655.273473] Lustre: Skipped 1 previous similar message
[896446.394289] Lustre: MGS: Connection restored to 72ec26e6-8490-9625-4bfc-aa584f79f189 (at 10.9.102.25@o2ib4)
[896446.404122] Lustre: Skipped 1 previous similar message
[896751.041333] Lustre: MGS: haven't heard from client 6b9ef098-3dbe-2cdb-d8d2-1d6c4008fd4c (at 10.9.104.68@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bdde7d800, cur 1576882985 expire 1576882835 last 1576882758
[896751.062516] Lustre: Skipped 9 previous similar messages
[896761.040188] Lustre: fir-MDT0000: haven't heard from client 77d4b49a-e018-77d9-e8fb-a5e24931e768 (at 10.9.104.68@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887b9e688400, cur 1576882995 expire 1576882845 last 1576882768
[896827.038938] Lustre: MGS: haven't heard from client 4dd85c08-8e6f-4 (at 10.8.18.35@o2ib6) in 204 seconds. I think it's dead, and I am evicting it. exp ffff88755be59000, cur 1576883061 expire 1576882911 last 1576882857
[896837.128533] Lustre: fir-MDT0000: haven't heard from client 00e8c3c0-75cb-4 (at 10.8.18.35@o2ib6) in 214 seconds. I think it's dead, and I am evicting it. exp ffff886bb8b72400, cur 1576883071 expire 1576882921 last 1576882857
[897271.052705] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[897271.060632] Lustre: Skipped 1 previous similar message
[897453.446948] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[897453.454875] Lustre: Skipped 1 previous similar message
[897498.050598] Lustre: MGS: haven't heard from client fc0d4817-aec9-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8854165f8400, cur 1576883732 expire 1576883582 last 1576883505
[897619.003322] Lustre: MGS: Connection restored to fb32334e-cf98-4 (at 10.8.18.35@o2ib6)
[897619.011250] Lustre: Skipped 1 previous similar message
[897680.044330] Lustre: fir-MDT0000: haven't heard from client 7d16b284-1f65-4 (at 10.8.18.35@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8856ef186c00, cur 1576883914 expire 1576883764 last 1576883687
[897680.064303] Lustre: Skipped 1 previous similar message
[899085.296549] Lustre: MGS: Connection restored to (at 10.9.104.68@o2ib4)
[899085.303267] Lustre: Skipped 1 previous similar message
[926513.406443] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.9.106.16@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[926520.571345] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.9.102.53@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[926520.588804] LustreError: Skipped 6 previous similar messages
[926638.254053] Lustre: fir-MDT0000: haven't heard from client b638064f-3349-4 (at 10.9.116.6@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88759c29d400, cur 1576912872 expire 1576912722 last 1576912645
[926638.274028] Lustre: Skipped 1 previous similar message
[926640.237234] Lustre: MGS: haven't heard from client 8c270511-3581-b3aa-cdb1-5246e8311531 (at 10.9.112.6@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bdde7c800, cur 1576912874 expire 1576912724 last 1576912647
[926640.258331] Lustre: Skipped 79 previous similar messages
[927691.316078] Lustre: MGS: Connection restored to 5676a5d3-f6a9-abf8-ffc6-780f63514d69 (at 10.9.113.15@o2ib4)
[927691.325952] Lustre: Skipped 1 previous similar message
[927693.523500] Lustre: MGS: Connection restored to 6cb89bf6-13ce-cea6-8417-28c40d21b367 (at 10.9.115.7@o2ib4)
[927693.533242] Lustre: Skipped 1 previous similar message
[927695.713701] Lustre: MGS: Connection restored to b8587914-2fe0-7db6-2f01-05088ddf0089 (at 10.9.115.8@o2ib4)
[927695.723441] Lustre: Skipped 1 previous similar message
[927709.211210] Lustre: MGS: Connection restored to f01080a0-cc7c-da9c-568d-51eacd84f956 (at 10.9.114.8@o2ib4)
[927709.220955] Lustre: Skipped 1 previous similar message
[927713.884630] Lustre: MGS: Connection restored to 4c40e4ea-8b50-bea0-9b29-11aa2adbf170 (at 10.9.114.7@o2ib4)
[927713.894383] Lustre: Skipped 1 previous similar message
[927723.022219] Lustre: MGS: Connection restored to (at 10.9.114.9@o2ib4)
[927723.028842] Lustre: Skipped 1 previous similar message
[927742.064667] Lustre: MGS: Connection restored to 97102c2b-e0e2-553a-c933-88dc912145da (at 10.9.115.11@o2ib4)
[927742.074498] Lustre: Skipped 7 previous similar messages
[927774.245626] Lustre: MGS: Connection restored to (at 10.9.114.14@o2ib4)
[927774.252337] Lustre: Skipped 27 previous similar messages
[927839.365566] Lustre: MGS: Connection restored to (at 10.8.19.6@o2ib6)
[927839.372103] Lustre: Skipped 31 previous similar messages
[928302.591704] Lustre: MGS: Connection restored to (at 10.8.28.1@o2ib6)
[928302.598239] Lustre: Skipped 35 previous similar messages
[928803.780968] Lustre: MGS: Connection restored to c81bc9f4-d2a2-4 (at 10.9.115.9@o2ib4)
[928803.788889] Lustre: Skipped 23 previous similar messages
[929318.485324] Lustre: MGS: Connection restored to 19c70918-a172-38a5-2512-02b987cb686f (at 10.9.116.8@o2ib4)
[929318.495065] Lustre: Skipped 21 previous similar messages
[930367.256954] Lustre: fir-MDT0000: haven't heard from client f2437a5f-d623-4 (at 10.9.116.5@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bd326b800, cur 1576916601 expire 1576916451 last 1576916374
[930367.276922] Lustre: Skipped 79 previous similar messages
[932415.225494] Lustre: MGS: Connection restored to (at 10.9.116.5@o2ib4)
[932415.232117] Lustre: Skipped 1 previous similar message
[938194.306249] Lustre: fir-MDT0000: haven't heard from client c2c89e86-5bb4-0b8e-5ef9-80616e1dd326 (at 10.9.108.50@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887adaad8c00, cur 1576924428 expire 1576924278 last 1576924201
[938194.328130] Lustre: Skipped 3 previous similar messages
[940400.047507] Lustre: MGS: Connection restored to c2c89e86-5bb4-0b8e-5ef9-80616e1dd326 (at 10.9.108.50@o2ib4)
[940400.057339] Lustre: Skipped 3 previous similar messages
[986225.035664] Lustre: 109624:0:(mdd_device.c:1807:mdd_changelog_clear()) fir-MDD0000: Failure to clear the changelog for user 3: -22
[986225.047498] Lustre: 109624:0:(mdd_device.c:1807:mdd_changelog_clear()) Skipped 5 previous similar messages
[986225.536642] Lustre: 109761:0:(mdd_device.c:1807:mdd_changelog_clear()) fir-MDD0000: Failure to clear the changelog for user 3: -22
[986225.548463] Lustre: 109761:0:(mdd_device.c:1807:mdd_changelog_clear()) Skipped 593 previous similar messages
[987260.601299] Lustre: MGS: haven't heard from client fd10d02c-0024-f855-df44-6bc13830c184 (at 10.9.108.37@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bf4fdc400, cur 1576973494 expire 1576973344 last 1576973267
[987260.622487] Lustre: Skipped 1 previous similar message
[989622.687431] Lustre: MGS: Connection restored to (at 10.9.108.37@o2ib4)
[989622.694139] Lustre: Skipped 1 previous similar message
[1000654.674761] Lustre: fir-MDT0000: haven't heard from client fe8a57dd-079f-3b53-1c56-cba001145a5a (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88670764c000, cur 1576986888 expire 1576986738 last 1576986661
[1000654.696639] Lustre: Skipped 1 previous similar message
[1046470.929307] Lustre: fir-MDT0000: haven't heard from client c3a3f4f5-6097-7b12-c1c6-cee0812f09a4 (at 10.9.102.7@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88757d3ee000, cur 1577032704 expire 1577032554 last 1577032477
[1046470.951180] Lustre: Skipped 1 previous similar message
[1048922.250849] Lustre: MGS: Connection restored to (at 10.9.102.7@o2ib4)
[1048922.257563] Lustre: Skipped 1 previous similar message
[1063404.033267] Lustre: MGS: haven't heard from client 99de6814-36af-3466-c3c8-4440ebae3032 (at 10.9.106.54@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886a11e25000, cur 1577049637 expire 1577049487 last 1577049410
[1063404.054537] Lustre: Skipped 1 previous similar message
[1111918.329166] Lustre: fir-MDT0000: haven't heard from client 0fb4e4ea-bb6b-4 (at 10.9.0.1@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88745d343c00, cur 1577098151 expire 1577098001 last 1577097924
[1111918.349051] Lustre: Skipped 1 previous similar message
[1112918.642221] Lustre: MGS: Connection restored to (at 10.9.0.1@o2ib4)
[1112918.648753] Lustre: Skipped 1 previous similar message
[1143007.517887] Lustre: fir-MDT0000: haven't heard from client 8d53094b-786e-854a-949f-904eb0728008 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887bbf04a000, cur 1577129240 expire 1577129090 last 1577129013
[1143007.539678] Lustre: Skipped 1 previous similar message
[1147310.532922] Lustre: fir-MDT0000: haven't heard from client fd17c8c5-717a-4 (at 10.9.109.37@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8876a4201800, cur 1577133543 expire 1577133393 last 1577133316
[1147310.553065] Lustre: Skipped 1 previous similar message
[1147355.146620] Lustre: MGS: Connection restored to (at 10.9.109.37@o2ib4)
[1147355.153412] Lustre: Skipped 1 previous similar message
[1175459.706421] Lustre: fir-MDT0000: haven't heard from client f6b74536-e382-0d38-0b9f-da7271f4bee2 (at 10.9.110.30@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887762092000, cur 1577161692 expire 1577161542 last 1577161465
[1175459.728382] Lustre: Skipped 1 previous similar message
[1187515.784990] Lustre: fir-MDT0000: haven't heard from client 241e5cad-393f-9fce-24f5-a5aee4585bc1 (at 10.9.110.37@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88794e3f2c00, cur 1577173748 expire 1577173598 last 1577173521
[1187515.806973] Lustre: Skipped 1 previous similar message
[1189234.803013] Lustre: fir-MDT0000: haven't heard from client 4ac52c87-3d27-43a0-c9f5-795b92a5f10b (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887827adf400, cur 1577175467 expire 1577175317 last 1577175240
[1189234.824893] Lustre: Skipped 1 previous similar message
[1196148.134804] Lustre: MGS: Received new LWP connection from 10.8.29.2@o2ib6, removing former export from same NID
[1196148.145083] Lustre: MGS: Connection restored to 4dc81e9a-575e-74c4-4147-8daa3386264b (at 10.8.29.2@o2ib6)
[1196148.154823] Lustre: Skipped 1 previous similar message
[1196767.428865] Lustre: fir-MDT0000: Client ad99fc3b-1c4e-4 (at 10.8.29.2@o2ib6) reconnecting
[1196767.437263] Lustre: fir-MDT0000: Connection restored to 4dc81e9a-575e-74c4-4147-8daa3386264b (at 10.8.29.2@o2ib6)
[1196767.437348] Lustre: MGS: Received new LWP connection from 10.8.29.2@o2ib6, removing former export from same NID
[1196767.457980] Lustre: Skipped 1 previous similar message
[1196797.477555] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.8.29.2@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[1196797.494924] LustreError: Skipped 1 previous similar message
[1196822.565630] Lustre: MGS: Received new LWP connection from 10.8.29.2@o2ib6, removing former export from same NID
[1196822.575913] Lustre: MGS: Connection restored to 4dc81e9a-575e-74c4-4147-8daa3386264b (at 10.8.29.2@o2ib6)
[1196847.653785] Lustre: fir-MDT0000: Client ad99fc3b-1c4e-4 (at 10.8.29.2@o2ib6) reconnecting
[1196847.653878] LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.8.29.2@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[1196847.679526] Lustre: fir-MDT0000: Connection restored to 4dc81e9a-575e-74c4-4147-8daa3386264b (at 10.8.29.2@o2ib6)
[1197852.925339] LNetError: 38667:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (-125, 0)
[1197897.249183] Lustre: fir-MDT0000: Client a55c1ae0-0aa8-4 (at 10.8.27.2@o2ib6) reconnecting
[1197897.257562] Lustre: fir-MDT0000: Connection restored to (at 10.8.27.2@o2ib6)
[1201074.871671] Lustre: fir-MDT0000: haven't heard from client 3c222422-1505-df45-a734-88e013dbd97d (at 10.9.102.41@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8879d5675800, cur 1577187307 expire 1577187157 last 1577187080
[1201074.893638] Lustre: Skipped 1 previous similar message
[1201085.874409] Lustre: MGS: haven't heard from client 26b178f1-6b8a-7226-a644-e8db93597769 (at 10.9.102.28@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885be4457c00, cur 1577187318 expire 1577187168 last 1577187091
[1201085.895681] Lustre: Skipped 53 previous similar messages
[1201555.873098] Lustre: fir-MDT0000: haven't heard from client 20b87095-1671-09fc-c086-a305451161cc (at 10.9.102.10@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88755c6f5c00, cur 1577187788 expire 1577187638 last 1577187561
[1201555.895063] Lustre: Skipped 54 previous similar messages
[1202480.494281] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6)
[1202650.526591] Lustre: MGS: Connection restored to f6b74536-e382-0d38-0b9f-da7271f4bee2 (at 10.9.110.30@o2ib4)
[1202650.536510] Lustre: Skipped 1 previous similar message
[1202670.156491] Lustre: MGS: Connection restored to 241e5cad-393f-9fce-24f5-a5aee4585bc1 (at 10.9.110.37@o2ib4)
[1202670.166407] Lustre: Skipped 1 previous similar message
[1202857.205487] Lustre: MGS: Connection restored to 6bdde767-e980-edfb-a0f5-b03ac49e6985 (at 10.9.102.42@o2ib4)
[1202857.215401] Lustre: Skipped 1 previous similar message
[1202863.250895] Lustre: MGS: Connection restored to 646a0999-7ce6-1722-6e20-c11e5811e740 (at 10.9.101.72@o2ib4)
[1202863.260814] Lustre: Skipped 1 previous similar message
[1202876.958352] Lustre: MGS: Connection restored to a4a78f8a-7688-1de0-b8e9-42f4deb3c330 (at 10.9.104.5@o2ib4)
[1202876.968188] Lustre: Skipped 1 previous similar message
[1202894.028584] Lustre: MGS: Connection restored to 8bcf3341-45b8-1479-f9bc-f63e5cdf0bdf (at 10.9.102.18@o2ib4)
[1202894.038964] Lustre: Skipped 3 previous similar messages
[1202920.564766] LustreError: 109727:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) ### client (nid 10.8.23.14@o2ib6) returned error from blocking AST (req@ffff885bc250a400 x1652549380241168 status -107 rc -107), evict it ns: mdt-fir-MDT0000_UUID lock: ffff8867e0b518c0/0xc3c20c12ebdc6afa lrc: 4/0,0 mode: EX/EX res: [0x20003c497:0x11:0x0].0x0 bits 0x8/0x0 rrc: 6 type: IBT flags: 0x60000400000020 nid: 10.8.23.14@o2ib6 remote: 0x1e4ab76bbb43b6c9 expref: 172 pid: 109560 timeout: 1203062 lvb_type: 3
[1202920.564905] LustreError: 138-a: fir-MDT0000: A client on nid 10.8.23.14@o2ib6 was evicted due to a lock blocking callback time out: rc -107
[1202920.564929] LustreError: 38883:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 0s: evicting client at 10.8.23.14@o2ib6 ns: mdt-fir-MDT0000_UUID lock: ffff8870fa209440/0xc3c20c12ebdc6e49 lrc: 3/0,0 mode: EX/EX res: [0x20003c497:0x12:0x0].0x0 bits 0x8/0x0 rrc: 6 type: IBT flags: 0x60000400000020 nid: 10.8.23.14@o2ib6 remote: 0x1e4ab76bbb43b84a expref: 173 pid: 109703 timeout: 0 lvb_type: 3
[1202920.657944] LustreError: 109727:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) Skipped 1 previous similar message
[1202934.887960] Lustre: MGS: haven't heard from client aa193901-0512-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887f64b9e000, cur 1577189167 expire 1577189017 last 1577188940
[1202934.907325] Lustre: Skipped 1 previous similar message
[1202957.478931] Lustre: MGS: Connection restored to fb2b31ba-9f80-8b46-f913-00d36178e70c (at 10.9.103.56@o2ib4)
[1202957.488847] Lustre: Skipped 1 previous similar message
[1203047.611444] Lustre: MGS: Connection restored to 8d53094b-786e-854a-949f-904eb0728008 (at 10.8.26.4@o2ib6)
[1203047.621189] Lustre: Skipped 7 previous similar messages
[1203207.108734] Lustre: MGS: Connection restored to (at 10.9.106.54@o2ib4)
[1203207.115534] Lustre: Skipped 3 previous similar messages
[1203324.315000] LustreError: 109539:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) ### client (nid 10.8.23.14@o2ib6) returned error from blocking AST (req@ffff885bc7a51680 x1652549380811488 status -107 rc -107), evict it ns: mdt-fir-MDT0000_UUID lock: ffff88514d462f40/0xc3c20c12ec12d015 lrc: 4/0,0 mode: EX/EX res: [0x20003c498:0x11:0x0].0x0 bits 0x8/0x0 rrc: 6 type: IBT flags: 0x60000400000020 nid: 10.8.23.14@o2ib6 remote: 0x9aab88cb2246d017 expref: 172 pid: 109619 timeout: 1203466 lvb_type: 3
[1203324.315281] LustreError: 138-a: fir-MDT0000: A client on nid 10.8.23.14@o2ib6 was evicted due to a lock blocking callback time out: rc -107
[1203324.315283] LustreError: Skipped 1 previous similar message
[1203324.315307] LustreError: 38883:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 0s: evicting client at 10.8.23.14@o2ib6 ns: mdt-fir-MDT0000_UUID lock: ffff885d61d61680/0xc3c20c12ec12cf58 lrc: 3/0,0 mode: EX/EX res: [0x20003c498:0x16:0x0].0x0 bits 0x8/0x0 rrc: 6 type: IBT flags: 0x60000400000020 nid: 10.8.23.14@o2ib6 remote: 0x9aab88cb2246cf53 expref: 173 pid: 109620 timeout: 0 lvb_type: 3
[1203324.413924] LustreError: 109539:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) Skipped 2 previous similar messages
[1203336.908673] Lustre: MGS: haven't heard from client b8bdd668-3442-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888be3eb3400, cur 1577189569 expire 1577189419 last 1577189342
[1203464.745306] Lustre: MGS: Connection restored to (at 10.9.102.19@o2ib4)
[1203464.752106] Lustre: Skipped 107 previous similar messages
[1203601.889129] Lustre: MGS: haven't heard from client 6fd1b377-6647-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88799639c400, cur 1577189834 expire 1577189684 last 1577189607
[1203699.600661] Lustre: fir-MDT0000: Received new LWP connection from 10.0.10.54@o2ib7, removing former export from same NID
[1203934.890434] Lustre: fir-MDT0000: haven't heard from client d6897cbe-5b5b-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff884e24124c00, cur 1577190167 expire 1577190017 last 1577189940
[1203934.910403] Lustre: Skipped 1 previous similar message
[1203950.902639] Lustre: MGS: haven't heard from client 518b6b7d-0354-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888964b55800, cur 1577190183 expire 1577190033 last 1577189956
[1204190.372753] Lustre: MGS: Connection restored to (at 10.9.103.3@o2ib4)
[1204190.379464] Lustre: Skipped 30 previous similar messages
[1204492.910699] Lustre: MGS: haven't heard from client 1f7d6914-a8ee-0542-5408-27b30e994bf2 (at 10.9.102.65@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bdd581400, cur 1577190725 expire 1577190575 last 1577190498
[1205346.940328] Lustre: MGS: haven't heard from client 6e98f767-1155-52ac-adde-0793bf9baad6 (at 10.9.104.50@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bc32a1400, cur 1577191579 expire 1577191429 last 1577191352
[1205346.961606] Lustre: Skipped 1 previous similar message
[1205350.901937] Lustre: fir-MDT0000: haven't heard from client 03f74746-4ba8-94bf-e7c4-c7631d0831ce (at 10.9.104.50@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8874047cc000, cur 1577191583 expire 1577191433 last 1577191356
[1205674.930480] Lustre: MGS: Connection restored to 42a2168f-4c1a-2014-bc77-91f101bbad4d (at 10.9.106.25@o2ib4)
[1205674.940396] Lustre: Skipped 1 previous similar message
[1206760.480731] Lustre: MGS: Connection restored to 8d53094b-786e-854a-949f-904eb0728008 (at 10.8.26.4@o2ib6)
[1206760.490477] Lustre: Skipped 1 previous similar message
[1206765.099981] LustreError: 109789:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) ### client (nid 10.8.26.4@o2ib6) returned error from blocking AST (req@ffff88568b2ecc80 x1652549385398176 status -107 rc -107), evict it ns: mdt-fir-MDT0000_UUID lock: ffff886cfd384380/0xc3c20c12ed8d694a lrc: 4/0,0 mode: EX/EX res: [0x20003c49b:0x19d:0x0].0x0 bits 0x8/0x0 rrc: 5 type: IBT flags: 0x60000400000020 nid: 10.8.26.4@o2ib6 remote: 0xd86db4fc2073d58d expref: 740 pid: 109586 timeout: 1206856 lvb_type: 3
[1206765.143200] LustreError: 138-a: fir-MDT0000: A client on nid 10.8.26.4@o2ib6 was evicted due to a lock blocking callback time out: rc -107
[1206765.155802] LustreError: Skipped 2 previous similar messages
[1206765.161676] LustreError: 38883:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 0s: evicting client at 10.8.26.4@o2ib6 ns: mdt-fir-MDT0000_UUID lock: ffff886cfd384380/0xc3c20c12ed8d694a lrc: 3/0,0 mode: EX/EX res: [0x20003c49b:0x19d:0x0].0x0 bits 0x8/0x0 rrc: 5 type: IBT flags: 0x60000400000020 nid: 10.8.26.4@o2ib6 remote: 0xd86db4fc2073d58d expref: 741 pid: 109586 timeout: 0 lvb_type: 3
[1206765.198809] LustreError: 38883:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 1 previous similar message
[1206821.916947] Lustre: MGS: haven't heard from client 57a10c84-2072-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887abdf96800, cur 1577193054 expire 1577192904 last 1577192827
[1206858.083596] Lustre: MGS: Connection restored to 44d89838-5792-aa9d-9651-aaf15ed74ef0 (at 10.9.102.65@o2ib4)
[1206858.093519] Lustre: Skipped 1 previous similar message
[1206922.535146] Lustre: MGS: Connection restored to 8d53094b-786e-854a-949f-904eb0728008 (at 10.8.26.4@o2ib6)
[1206922.544883] Lustre: Skipped 1 previous similar message
[1206986.910068] Lustre: MGS: haven't heard from client 9003eff5-48f3-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bf9bdac00, cur 1577193219 expire 1577193069 last 1577192992
[1206998.909781] Lustre: fir-MDT0000: haven't heard from client 89a9983f-b54a-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8879b1f12000, cur 1577193231 expire 1577193081 last 1577193004
[1207183.918811] Lustre: MGS: haven't heard from client d19b0227-f3e1-0722-64a6-515188b4951b (at 10.9.102.14@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bdde1c800, cur 1577193416 expire 1577193266 last 1577193189
[1207268.419387] Lustre: MGS: Connection restored to 8d53094b-786e-854a-949f-904eb0728008 (at 10.8.26.4@o2ib6)
[1207268.429133] Lustre: Skipped 1 previous similar message
[1207326.924172] Lustre: MGS: haven't heard from client ed50e25e-7018-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88877b9f4800, cur 1577193559 expire 1577193409 last 1577193332
[1207326.943442] Lustre: Skipped 1 previous similar message
[1207517.792647] Lustre: MGS: Connection restored to 8d53094b-786e-854a-949f-904eb0728008 (at 10.8.26.4@o2ib6)
[1207517.802393] Lustre: Skipped 1 previous similar message
[1207596.951159] Lustre: MGS: haven't heard from client 526ecb6a-a985-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887a2a3f4000, cur 1577193829 expire 1577193679 last 1577193602
[1207596.970438] Lustre: Skipped 1 previous similar message
[1207672.915214] Lustre: MGS: haven't heard from client f3fc9cf3-4deb-4 (at 10.8.26.4@o2ib6) in 156 seconds. I think it's dead, and I am evicting it. exp ffff88868da3d800, cur 1577193905 expire 1577193755 last 1577193749
[1207672.934512] Lustre: Skipped 1 previous similar message
[1207675.642486] Lustre: MGS: Connection restored to 8d53094b-786e-854a-949f-904eb0728008 (at 10.8.26.4@o2ib6)
[1207675.652230] Lustre: Skipped 1 previous similar message
[1207754.934326] Lustre: fir-MDT0000: haven't heard from client d66b8a9f-bc92-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887a2a3f5000, cur 1577193987 expire 1577193837 last 1577193760
[1208159.312549] Lustre: MGS: Connection restored to 8d53094b-786e-854a-949f-904eb0728008 (at 10.8.26.4@o2ib6)
[1208159.322297] Lustre: Skipped 3 previous similar messages
[1208179.922059] Lustre: fir-MDT0000: haven't heard from client e7be3a7e-3cff-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887921e73c00, cur 1577194412 expire 1577194262 last 1577194185
[1208385.948227] Lustre: MGS: haven't heard from client e9cdc75c-39a1-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885277facc00, cur 1577194618 expire 1577194468 last 1577194391
[1208385.967499] Lustre: Skipped 3 previous similar messages
[1208899.964354] Lustre: MGS: haven't heard from client 7f1caee2-cad2-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887050b1b800, cur 1577195132 expire 1577194982 last 1577194905
[1208899.983718] Lustre: Skipped 1 previous similar message
[1209207.236152] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6)
[1209207.242860] Lustre: Skipped 3 previous similar messages
[1209443.931384] Lustre: fir-MDT0000: haven't heard from client 13c01cd7-bb86-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886760ec4000, cur 1577195676 expire 1577195526 last 1577195449
[1209443.951444] Lustre: Skipped 3 previous similar messages
[1210376.520416] Lustre: MGS: Connection restored to 585ca3ae-3e0f-4794-8d90-9c7aec7df87d (at 10.9.106.33@o2ib4)
[1210376.530332] Lustre: Skipped 5 previous similar messages
[1211418.346229] Lustre: MGS: Connection restored to 7fc3ef05-0495-25a3-7cdb-c6f981dcc2b9 (at 10.9.102.68@o2ib4)
[1211418.356145] Lustre: Skipped 1 previous similar message
[1214460.070821] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6)
[1214460.077538] Lustre: Skipped 1 previous similar message
[1214498.967221] Lustre: MGS: haven't heard from client 8211f34e-713f-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8857ba260400, cur 1577200731 expire 1577200581 last 1577200504
[1214498.986587] Lustre: Skipped 1 previous similar message
[1215095.247992] Lustre: MGS: Connection restored to 63592f62-314e-a8bf-54fc-9a4f9669489a (at 10.9.103.33@o2ib4)
[1215095.257914] Lustre: Skipped 1 previous similar message
[1215294.867489] Lustre: MGS: Connection restored to (at 10.9.103.25@o2ib4)
[1215294.874293] Lustre: Skipped 3 previous similar messages
[1216754.469538] Lustre: MGS: Received new LWP connection from 10.9.0.63@o2ib4, removing former export from same NID
[1216754.469541] Lustre: fir-MDT0000: Client e930c269-2a9e-4 (at 10.9.0.63@o2ib4) reconnecting
[1216754.469565] Lustre: fir-MDT0000: Connection restored to (at 10.9.0.63@o2ib4)
[1216754.469567] Lustre: Skipped 1 previous similar message
[1216758.878029] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.0.63@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[1216783.966606] Lustre: MGS: Received new LWP connection from 10.9.0.63@o2ib4, removing former export from same NID
[1216783.966672] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.9.0.63@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[1216840.777581] LNet: Service thread pid 109710 was inactive for 200.04s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[1216840.794772] LNet: Skipped 4 previous similar messages
[1216840.800001] Pid: 109710, comm: mdt01_050 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[1216840.810454] Call Trace:
[1216840.813098] [] ldlm_completion_ast+0x430/0x860 [ptlrpc]
[1216840.820217] [] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[1216840.827612] [] mdt_rename_lock+0x24b/0x4b0 [mdt]
[1216840.834110] [] mdt_reint_rename+0x2c5/0x2b90 [mdt]
[1216840.840781] [] mdt_reint_rec+0x83/0x210 [mdt]
[1216840.847006] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[1216840.853762] [] mdt_reint+0x67/0x140 [mdt]
[1216840.859641] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[1216840.866771] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[1216840.874661] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[1216840.881174] [] kthread+0xd1/0xe0
[1216840.886264] [] ret_from_fork_nospec_begin+0xe/0x21
[1216840.892939] [] 0xffffffffffffffff
[1216840.898134] LustreError: dumping log to /tmp/lustre-log.1577203072.109710
[1216854.936678] Lustre: MGS: Received new LWP connection from 10.9.0.63@o2ib4, removing former export from same NID
[1216854.946954] Lustre: MGS: Connection restored to (at 10.9.0.63@o2ib4)
[1216854.953573] Lustre: Skipped 2 previous similar messages
[1216855.335665] LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.9.0.63@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[1216869.408561] Lustre: fir-MDT0000: Client e930c269-2a9e-4 (at 10.9.0.63@o2ib4) reconnecting
[1216891.953549] Lustre: fir-MDT0000: Received new LWP connection from 10.0.10.54@o2ib7, removing former export from same NID
[1216896.329485] Lustre: fir-MDT0000: Received new LWP connection from 10.0.10.52@o2ib7, removing former export from same NID
[1216896.340528] Lustre: Skipped 1 previous similar message
[1216926.912948] LNet: Service thread pid 109710 completed after 286.17s.
This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [1220784.009255] Lustre: fir-MDT0000: haven't heard from client b9f0c572-9626-5946-123b-eeefc91c9fb7 (at 10.9.102.70@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ace3db000, cur 1577207016 expire 1577206866 last 1577206789 [1220784.031219] Lustre: Skipped 1 previous similar message [1221271.023457] Lustre: fir-MDT0000: haven't heard from client b9e179b7-e1e7-ea8d-40b3-fed6805562da (at 10.9.102.63@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88755c6f0400, cur 1577207503 expire 1577207353 last 1577207276 [1221271.045424] Lustre: Skipped 1 previous similar message [1222260.006984] Lustre: MGS: haven't heard from client f996ba03-0054-0c44-5bbe-f1bc21d329b7 (at 10.9.102.72@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bc6a6c800, cur 1577208492 expire 1577208342 last 1577208265 [1222260.028250] Lustre: Skipped 1 previous similar message [1222266.008288] Lustre: fir-MDT0000: haven't heard from client 18451477-28e4-0a0c-1cd6-33dceecfc81f (at 10.9.102.72@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887827adbc00, cur 1577208498 expire 1577208348 last 1577208271 [1223129.577150] Lustre: MGS: Connection restored to b9f0c572-9626-5946-123b-eeefc91c9fb7 (at 10.9.102.70@o2ib4) [1223129.587065] Lustre: Skipped 4 previous similar messages [1223615.601828] Lustre: MGS: Connection restored to b9e179b7-e1e7-ea8d-40b3-fed6805562da (at 10.9.102.63@o2ib4) [1223615.611742] Lustre: Skipped 1 previous similar message [1224018.570968] Lustre: MGS: Connection restored to (at 10.9.102.69@o2ib4) [1224018.577768] Lustre: Skipped 1 previous similar message [1224241.445954] Lustre: MGS: Connection restored to (at 10.9.102.64@o2ib4) [1224241.452756] Lustre: Skipped 1 previous similar message [1224570.005748] Lustre: MGS: Connection restored to 18451477-28e4-0a0c-1cd6-33dceecfc81f (at 10.9.102.72@o2ib4) [1224570.015665] Lustre: Skipped 1 previous similar message [1228894.275438] Lustre: MGS: Connection restored to (at 10.9.103.22@o2ib4) [1228894.282230] Lustre: Skipped 1 previous similar message [1230456.064375] Lustre: fir-MDT0000: haven't heard from client 283cc85a-44ee-a3f0-faaf-27fc939a252a (at 10.9.103.64@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88755c6f4000, cur 1577216688 expire 1577216538 last 1577216461 [1230532.059867] Lustre: MGS: haven't heard from client b5785c28-9696-a8de-004f-15a3f45570b0 (at 10.9.103.46@o2ib4) in 222 seconds. I think it's dead, and I am evicting it. 
exp ffff888bf578f000, cur 1577216764 expire 1577216614 last 1577216542 [1230532.081141] Lustre: Skipped 27 previous similar messages [1232301.376762] Lustre: MGS: Connection restored to (at 10.9.103.51@o2ib4) [1232301.383559] Lustre: Skipped 1 previous similar message [1232328.806804] Lustre: MGS: Connection restored to 54a4e18f-2dbf-9330-244f-d38b0011d1d4 (at 10.9.103.65@o2ib4) [1232328.816721] Lustre: Skipped 1 previous similar message [1232342.243543] Lustre: MGS: Connection restored to 9622ebd9-08dd-84f5-187b-b07758b1dd55 (at 10.9.103.48@o2ib4) [1232342.253457] Lustre: Skipped 1 previous similar message [1232433.053732] Lustre: MGS: Connection restored to (at 10.9.103.45@o2ib4) [1232433.060525] Lustre: Skipped 1 previous similar message [1232556.133815] Lustre: MGS: Connection restored to d3de90ea-b754-24e7-0993-a0caab86ec9f (at 10.9.103.63@o2ib4) [1232556.143732] Lustre: Skipped 3 previous similar messages [1232600.902938] Lustre: MGS: Connection restored to 7cd5f0b9-0428-d674-fbab-e1a1ccb7ff51 (at 10.9.103.60@o2ib4) [1232600.912861] Lustre: Skipped 9 previous similar messages [1232698.803324] Lustre: MGS: Connection restored to (at 10.9.103.61@o2ib4) [1232698.810127] Lustre: Skipped 15 previous similar messages [1235404.850427] Lustre: MGS: Connection restored to (at 10.9.104.24@o2ib4) [1235404.857225] Lustre: Skipped 1 previous similar message [1235924.776496] Lustre: MGS: Connection restored to (at 10.9.103.36@o2ib4) [1235924.783298] Lustre: Skipped 1 previous similar message [1235931.128203] Lustre: fir-MDT0000: haven't heard from client 825943a9-81d4-e363-123f-83bd023307a5 (at 10.9.101.19@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887546e8f400, cur 1577222163 expire 1577222013 last 1577221936 [1235931.150189] Lustre: Skipped 5 previous similar messages [1235933.152765] Lustre: MGS: haven't heard from client 077e611c-f08c-99bd-e5a4-5dff230450b3 (at 10.9.101.19@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887d4c7e4000, cur 1577222165 expire 1577222015 last 1577221938 [1237679.055803] Lustre: MGS: Connection restored to (at 10.9.101.19@o2ib4) [1237679.062599] Lustre: Skipped 1 previous similar message [1240782.221511] Lustre: MGS: Connection restored to (at 10.9.102.66@o2ib4) [1240782.228311] Lustre: Skipped 1 previous similar message [1240837.729031] Lustre: MGS: Connection restored to 40eaf3ed-6fe0-6c38-626f-ccfadd82bb44 (at 10.9.102.50@o2ib4) [1240837.738944] Lustre: Skipped 1 previous similar message [1241155.128865] Lustre: fir-MDT0000: haven't heard from client b9394a54-7284-d9ef-c866-e3b9139d82b3 (at 10.9.104.55@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887bf7afc800, cur 1577227387 expire 1577227237 last 1577227160 [1241902.131496] Lustre: fir-MDT0000: haven't heard from client 883aec95-b2fd-29e7-da4a-09f196622c49 (at 10.9.104.51@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88744e767000, cur 1577228134 expire 1577227984 last 1577227907 [1241902.153472] Lustre: Skipped 9 previous similar messages [1242231.793383] Lustre: 109619:0:(mdd_device.c:1807:mdd_changelog_clear()) fir-MDD0000: Failure to clear the changelog for user 3: -22 [1242231.805302] Lustre: 109619:0:(mdd_device.c:1807:mdd_changelog_clear()) Skipped 67 previous similar messages [1242232.294391] Lustre: 109727:0:(mdd_device.c:1807:mdd_changelog_clear()) fir-MDD0000: Failure to clear the changelog for user 3: -22 [1242232.306302] Lustre: 109727:0:(mdd_device.c:1807:mdd_changelog_clear()) Skipped 1212 previous similar messages [1243495.908779] Lustre: MGS: Connection restored to (at 10.9.104.53@o2ib4) [1243495.915606] Lustre: Skipped 1 previous similar message [1243503.485494] Lustre: MGS: Connection restored to (at 10.9.104.64@o2ib4) [1243503.492294] Lustre: Skipped 1 previous similar message [1243509.700449] Lustre: MGS: Connection restored to (at 10.9.104.49@o2ib4) [1243509.707245] Lustre: Skipped 1 previous similar message 
[1243518.537887] Lustre: MGS: Connection restored to 6e65c674-5fe1-d96e-ba39-083f6cfd9fc1 (at 10.9.104.56@o2ib4) [1243518.547811] Lustre: Skipped 1 previous similar message [1243530.660541] Lustre: MGS: Connection restored to b9394a54-7284-d9ef-c866-e3b9139d82b3 (at 10.9.104.55@o2ib4) [1243530.670456] Lustre: Skipped 1 previous similar message [1244181.250105] Lustre: MGS: Connection restored to 883aec95-b2fd-29e7-da4a-09f196622c49 (at 10.9.104.51@o2ib4) [1244181.260023] Lustre: Skipped 1 previous similar message [1250761.071381] Lustre: MGS: Connection restored to 12e2c9b6-7b56-574c-526a-a98d62b67a85 (at 10.9.103.31@o2ib4) [1250761.081300] Lustre: Skipped 1 previous similar message [1256607.674445] Lustre: MGS: Connection restored to 36347076-5d13-e5ed-fd1b-950e092d49ee (at 10.9.103.11@o2ib4) [1256607.684365] Lustre: Skipped 1 previous similar message [1256890.262300] Lustre: fir-MDT0000: haven't heard from client c319d260-6432-b651-7aeb-47c9eb331ac0 (at 10.9.104.66@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8879a9729800, cur 1577243122 expire 1577242972 last 1577242895 [1256890.284265] Lustre: Skipped 1 previous similar message [1256895.238990] Lustre: MGS: haven't heard from client 93211e3e-5198-30d2-e1b7-a18f42d16f4f (at 10.9.104.66@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bdddedc00, cur 1577243127 expire 1577242977 last 1577242900 [1259299.556303] Lustre: MGS: Connection restored to c319d260-6432-b651-7aeb-47c9eb331ac0 (at 10.9.104.66@o2ib4) [1259299.566243] Lustre: Skipped 1 previous similar message [1260704.677008] Lustre: MGS: Connection restored to bac37de3-7e0e-eaaf-19ed-94dc03528d96 (at 10.9.103.32@o2ib4) [1260704.686929] Lustre: Skipped 1 previous similar message [1270287.337322] Lustre: MGS: haven't heard from client 46df3464-220e-aa82-180c-02c18cc4a543 (at 10.9.101.52@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff885bc733c000, cur 1577256519 expire 1577256369 last 1577256292 [1270303.316937] Lustre: fir-MDT0000: haven't heard from client 970bc850-7648-f96d-fc2b-8b8c64ce0bd4 (at 10.9.101.52@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887bf7afb000, cur 1577256535 expire 1577256385 last 1577256308 [1272590.543651] Lustre: MGS: Connection restored to (at 10.9.101.52@o2ib4) [1272590.550444] Lustre: Skipped 1 previous similar message [1272702.333995] Lustre: MGS: haven't heard from client ad0124cd-6523-7842-c6f8-7c21220f251e (at 10.9.108.36@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bdf4a8000, cur 1577258934 expire 1577258784 last 1577258707 [1274828.849313] Lustre: MGS: Connection restored to (at 10.9.108.36@o2ib4) [1274828.856119] Lustre: Skipped 1 previous similar message [1278183.373715] Lustre: fir-MDT0000: haven't heard from client 1e65407a-2cff-cc06-15ab-12345c3da5eb (at 10.9.101.17@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88757d3e8000, cur 1577264415 expire 1577264265 last 1577264188 [1278183.395676] Lustre: Skipped 1 previous similar message [1280407.068890] Lustre: MGS: Connection restored to 1e65407a-2cff-cc06-15ab-12345c3da5eb (at 10.9.101.17@o2ib4) [1280407.078805] Lustre: Skipped 1 previous similar message [1293069.047274] Lustre: MGS: Connection restored to 1c788dd6-50a3-98a7-1472-efee78e1e538 (at 10.9.103.21@o2ib4) [1293069.057194] Lustre: Skipped 1 previous similar message [1294462.488207] Lustre: fir-MDT0000: haven't heard from client bdc6a669-f745-2944-1b74-3762ff7d0bf8 (at 10.9.101.36@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887ab2e9b800, cur 1577280694 expire 1577280544 last 1577280467 [1294462.510169] Lustre: Skipped 1 previous similar message [1296206.500276] Lustre: MGS: Connection restored to bdc6a669-f745-2944-1b74-3762ff7d0bf8 (at 10.9.101.36@o2ib4) [1296206.510196] Lustre: Skipped 1 previous similar message [1297839.500756] Lustre: fir-MDT0000: haven't heard from client ecf68baa-f439-d45d-3f26-508c1e1183d5 (at 10.9.108.45@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887862c53400, cur 1577284071 expire 1577283921 last 1577283844 [1297839.522719] Lustre: Skipped 1 previous similar message [1299798.508273] Lustre: fir-MDT0000: haven't heard from client 8d7a6799-65e3-6ed3-228b-334c3bfba9f6 (at 10.9.101.30@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887546e8fc00, cur 1577286030 expire 1577285880 last 1577285803 [1299798.530262] Lustre: Skipped 1 previous similar message [1299807.527339] Lustre: MGS: haven't heard from client 1faea559-9460-dd2b-959f-e2f71f1b39f3 (at 10.9.101.30@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff888bf43f8000, cur 1577286039 expire 1577285889 last 1577285812 [1300155.605794] Lustre: MGS: Connection restored to ecf68baa-f439-d45d-3f26-508c1e1183d5 (at 10.9.108.45@o2ib4) [1300155.615713] Lustre: Skipped 1 previous similar message [1301502.754285] Lustre: MGS: Connection restored to d2753b93-4aae-c52d-984c-eba7125fbc12 (at 10.9.102.62@o2ib4) [1301502.764200] Lustre: Skipped 1 previous similar message [1301526.047359] Lustre: MGS: Connection restored to 45a5c9f9-e983-39e8-83d8-f8de0d279a9a (at 10.9.102.52@o2ib4) [1301526.057279] Lustre: Skipped 1 previous similar message [1301557.565214] Lustre: MGS: Connection restored to 8d7a6799-65e3-6ed3-228b-334c3bfba9f6 (at 10.9.101.30@o2ib4) [1301557.575132] Lustre: Skipped 3 previous similar messages [1302064.465977] Lustre: MGS: Connection restored to a12f7628-778e-06eb-a6c6-d5f0dcc4d267 (at 10.9.102.59@o2ib4) [1302064.475901] Lustre: Skipped 1 previous similar message [1302069.811442] Lustre: MGS: Connection restored to (at 10.9.102.55@o2ib4) [1302069.818263] Lustre: Skipped 1 previous similar message [1302163.353098] Lustre: MGS: Connection restored to 60afd8d9-7763-30ce-53b8-7015b2e23e2a (at 10.9.102.61@o2ib4) [1302163.363014] Lustre: Skipped 1 previous similar message [1303381.556375] Lustre: MGS: haven't heard from client 2dd13db6-d967-ee41-b75d-c9b93da7b26e (at 10.9.101.5@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bf8830c00, cur 1577289613 expire 1577289463 last 1577289386 [1305099.302400] Lustre: MGS: Connection restored to (at 10.9.101.5@o2ib4) [1305099.309105] Lustre: Skipped 1 previous similar message [1307483.556630] Lustre: fir-MDT0000: haven't heard from client 0ead65cc-d077-2ad5-92ad-5a8b2a04355f (at 10.9.106.18@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887ba97b0c00, cur 1577293715 expire 1577293565 last 1577293488 [1307483.578610] Lustre: Skipped 1 previous similar message [1308209.561696] Lustre: fir-MDT0000: haven't heard from client fdca5c4a-6cf3-51e3-c2ce-f648bf33defc (at 10.9.106.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887a03828800, cur 1577294441 expire 1577294291 last 1577294214 [1308209.583666] Lustre: Skipped 1 previous similar message [1308285.561877] Lustre: fir-MDT0000: haven't heard from client d9d9948e-f319-d109-813a-4d3dfb4ce61e (at 10.9.106.22@o2ib4) in 175 seconds. I think it's dead, and I am evicting it. exp ffff887978e5e800, cur 1577294517 expire 1577294367 last 1577294342 [1308285.583841] Lustre: Skipped 1 previous similar message [1308337.600218] Lustre: MGS: haven't heard from client 1a26baf1-c5ad-e0fc-9f0a-660b7c1218ac (at 10.9.106.22@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bd0a29000, cur 1577294569 expire 1577294419 last 1577294342 [1308900.925195] Lustre: MGS: Connection restored to (at 10.9.104.11@o2ib4) [1308900.931988] Lustre: Skipped 1 previous similar message [1308947.565278] Lustre: fir-MDT0000: haven't heard from client 1752b868-bd26-4 (at 10.9.104.11@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886a1cbdf800, cur 1577295179 expire 1577295029 last 1577294952 [1309691.256261] Lustre: MGS: Connection restored to 0ead65cc-d077-2ad5-92ad-5a8b2a04355f (at 10.9.106.18@o2ib4) [1309691.266176] Lustre: Skipped 1 previous similar message [1310027.572998] Lustre: fir-MDT0000: haven't heard from client fa37c25e-caf3-90b2-4eae-e4b21a551f9d (at 10.9.106.19@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88744e764400, cur 1577296259 expire 1577296109 last 1577296032 [1310027.594978] Lustre: Skipped 1 previous similar message [1310216.589964] Lustre: fir-MDT0000: haven't heard from client 31722a42-53ae-b678-363c-dc0a8c0b6d11 (at 10.9.109.72@o2ib4) in 227 seconds. 
I think it's dead, and I am evicting it. exp ffff887762096800, cur 1577296448 expire 1577296298 last 1577296221 [1310216.611930] Lustre: Skipped 7 previous similar messages [1310538.581502] Lustre: MGS: Connection restored to fdca5c4a-6cf3-51e3-c2ce-f648bf33defc (at 10.9.106.15@o2ib4) [1310538.591414] Lustre: Skipped 1 previous similar message [1310782.459491] Lustre: MGS: Connection restored to (at 10.9.106.22@o2ib4) [1310782.466285] Lustre: Skipped 1 previous similar message [1312342.159467] Lustre: MGS: Connection restored to 0713f3a9-f297-cd73-69ad-d70a0f44846f (at 10.9.104.62@o2ib4) [1312342.169384] Lustre: Skipped 1 previous similar message [1312350.562405] Lustre: MGS: Connection restored to 3b3ac53a-66b9-51a2-95e7-14e76e01e7c4 (at 10.9.104.67@o2ib4) [1312350.572318] Lustre: Skipped 1 previous similar message [1312463.136207] Lustre: MGS: Connection restored to (at 10.9.106.19@o2ib4) [1312463.143008] Lustre: Skipped 3 previous similar messages [1314531.600542] Lustre: fir-MDT0000: haven't heard from client c8611503-00c9-0601-ab3e-6ab5502d2910 (at 10.9.109.52@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887a0382d400, cur 1577300763 expire 1577300613 last 1577300536 [1314531.622504] Lustre: Skipped 1 previous similar message [1318140.370733] Lustre: MGS: Connection restored to 2d8d99e6-5958-67b2-f1ca-1cf5a5ad035d (at 10.9.103.18@o2ib4) [1318140.380649] Lustre: Skipped 1 previous similar message [1319030.562640] Lustre: MGS: Connection restored to 486a3b4c-834f-5f21-2c64-5468c4e367b9 (at 10.9.103.72@o2ib4) [1319030.572556] Lustre: Skipped 1 previous similar message [1319969.720315] Lustre: MGS: Connection restored to 0d761ed3-a7f2-30d5-2cfe-7894b05d973f (at 10.9.103.12@o2ib4) [1319969.730234] Lustre: Skipped 1 previous similar message [1320003.460844] Lustre: MGS: Connection restored to fef97b82-0880-8519-cff3-12bb701148de (at 10.9.103.26@o2ib4) [1320003.470760] Lustre: Skipped 1 previous similar message [1320047.144524] Lustre: MGS: Connection restored to bd97ce4d-66f2-ec39-b853-fbba04365905 (at 10.9.103.7@o2ib4) [1320047.154355] Lustre: Skipped 1 previous similar message [1320103.463808] Lustre: MGS: Connection restored to fa814a9f-a2be-b208-11b5-070b51b2ad41 (at 10.9.103.34@o2ib4) [1320103.473720] Lustre: Skipped 1 previous similar message [1320520.642666] Lustre: MGS: haven't heard from client f6bd0fcb-234a-c033-c1d1-49454299d257 (at 10.9.102.51@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bdfd65400, cur 1577306752 expire 1577306602 last 1577306525 [1320520.663935] Lustre: Skipped 1 previous similar message [1321960.674328] Lustre: MGS: haven't heard from client bce03646-11e3-57c4-431f-6ff9f8cbb78a (at 10.9.108.30@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff888bf43fe400, cur 1577308192 expire 1577308042 last 1577307965 [1321960.695613] Lustre: Skipped 3 previous similar messages [1322324.828561] Lustre: MGS: Connection restored to 9f1f289d-8652-b8f8-8177-a765b83508cd (at 10.9.102.60@o2ib4) [1322324.838473] Lustre: Skipped 1 previous similar message [1322837.274183] Lustre: MGS: Connection restored to (at 10.9.102.56@o2ib4) [1322837.280977] Lustre: Skipped 1 previous similar message [1322861.990087] Lustre: MGS: Connection restored to (at 10.9.102.54@o2ib4) [1322861.996886] Lustre: Skipped 1 previous similar message [1322872.293746] Lustre: MGS: Connection restored to 2f22f856-9471-7d8e-6ea1-8cae6b236407 (at 10.9.102.51@o2ib4) [1322872.303664] Lustre: Skipped 1 previous similar message [1322898.214424] Lustre: MGS: Connection restored to 58d93a7b-6fad-be0a-ff05-c633c769f421 (at 10.9.102.71@o2ib4) [1322898.224344] Lustre: Skipped 1 previous similar message [1324202.367047] Lustre: MGS: Connection restored to 9ce69080-df65-cc92-5d4c-44dfee293be0 (at 10.9.108.30@o2ib4) [1324202.376963] Lustre: Skipped 1 previous similar message [1324337.661151] Lustre: fir-MDT0000: haven't heard from client dffc1cc0-26ab-9b78-f3a0-8d9b8d410b62 (at 10.9.108.46@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88755c6f3800, cur 1577310569 expire 1577310419 last 1577310342 [1324337.683112] Lustre: Skipped 1 previous similar message [1324338.680021] Lustre: MGS: haven't heard from client 4fe179a5-f39c-efe3-11cf-0c8bdd85ac67 (at 10.9.108.46@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bdaabf000, cur 1577310570 expire 1577310420 last 1577310343 [1324843.667461] Lustre: fir-MDT0000: haven't heard from client 867e4ecd-c140-b9a9-d814-894b02f72825 (at 10.9.101.12@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff88744e767800, cur 1577311075 expire 1577310925 last 1577310848 [1325191.666007] Lustre: fir-MDT0000: haven't heard from client 17dd7332-a264-ab61-e8db-937ca8a76ffc (at 10.9.108.47@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88744e761c00, cur 1577311423 expire 1577311273 last 1577311196 [1325191.687971] Lustre: Skipped 1 previous similar message [1325620.282274] Lustre: MGS: Connection restored to (at 10.9.106.57@o2ib4) [1325620.289070] Lustre: Skipped 1 previous similar message [1326530.569727] Lustre: MGS: Connection restored to (at 10.9.108.46@o2ib4) [1326530.576528] Lustre: Skipped 1 previous similar message [1326788.688112] Lustre: MGS: haven't heard from client b0caeb58-988c-1bae-b2d8-da4a39871bb2 (at 10.9.101.32@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bf5a04c00, cur 1577313020 expire 1577312870 last 1577312793 [1326788.709384] Lustre: Skipped 1 previous similar message [1326798.671017] Lustre: fir-MDT0000: haven't heard from client 5f2cddb3-ff93-797d-26c0-ba79e6a92a32 (at 10.9.101.33@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887546e8b800, cur 1577313030 expire 1577312880 last 1577312803 [1326798.692984] Lustre: Skipped 1 previous similar message [1327046.681864] Lustre: fir-MDT0000: haven't heard from client b1cd239e-0cd9-fb44-5983-4f2f3e01fa3f (at 10.9.108.43@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff88745d346800, cur 1577313278 expire 1577313128 last 1577313051 [1327046.703848] Lustre: Skipped 1 previous similar message [1327089.083867] Lustre: MGS: Connection restored to (at 10.9.101.12@o2ib4) [1327089.090664] Lustre: Skipped 1 previous similar message [1327551.797734] Lustre: MGS: Connection restored to 17dd7332-a264-ab61-e8db-937ca8a76ffc (at 10.9.108.47@o2ib4) [1327551.807653] Lustre: Skipped 1 previous similar message [1327583.458370] Lustre: MGS: Connection restored to (at 10.9.104.27@o2ib4) [1327583.465165] Lustre: Skipped 1 previous similar message [1328581.391921] Lustre: MGS: Connection restored to 5f2cddb3-ff93-797d-26c0-ba79e6a92a32 (at 10.9.101.33@o2ib4) [1328581.401836] Lustre: Skipped 1 previous similar message [1328597.381923] Lustre: MGS: Connection restored to (at 10.9.101.32@o2ib4) [1328597.388719] Lustre: Skipped 1 previous similar message [1329446.385996] Lustre: MGS: Connection restored to (at 10.9.108.43@o2ib4) [1329446.392797] Lustre: Skipped 1 previous similar message [1332933.664252] Lustre: MGS: Connection restored to d60e4b40-a54e-bcc3-d895-4008b2e1a092 (at 10.9.103.15@o2ib4) [1332933.674168] Lustre: Skipped 1 previous similar message [1332998.904449] Lustre: MGS: Connection restored to (at 10.9.103.14@o2ib4) [1332998.911250] Lustre: Skipped 1 previous similar message [1333104.216134] Lustre: MGS: Connection restored to (at 10.9.103.1@o2ib4) [1333104.222838] Lustre: Skipped 1 previous similar message [1334371.205060] Lustre: MGS: Connection restored to 3c8d6e9e-a50e-0a1b-c656-8992c6066eb7 (at 10.9.103.17@o2ib4) [1334371.214976] Lustre: Skipped 1 previous similar message [1334760.729405] Lustre: fir-MDT0000: haven't heard from client d34d73a7-4c96-cff2-e3e6-67931f3d61e0 (at 10.9.101.43@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff88794e3f4800, cur 1577320992 expire 1577320842 last 1577320765 [1334760.751411] Lustre: Skipped 1 previous similar message [1334836.718919] Lustre: fir-MDT0000: haven't heard from client 5df41e4d-7dff-c151-5918-6d3c9eb8fbfd (at 10.9.108.31@o2ib4) in 188 seconds. I think it's dead, and I am evicting it. exp ffff887b9e68a000, cur 1577321068 expire 1577320918 last 1577320880 [1334836.741000] Lustre: Skipped 5 previous similar messages [1334860.040537] Lustre: MGS: Connection restored to (at 10.9.103.6@o2ib4) [1334860.047245] Lustre: Skipped 1 previous similar message [1334875.719778] Lustre: MGS: haven't heard from client 3052c673-19cd-6bcd-f3e7-4b40e84e4532 (at 10.9.108.31@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bf852c400, cur 1577321107 expire 1577320957 last 1577320880 [1334876.652899] Lustre: MGS: Connection restored to (at 10.9.103.10@o2ib4) [1334876.659703] Lustre: Skipped 1 previous similar message [1334912.755379] Lustre: fir-MDT0000: haven't heard from client 89b87f28-f0ea-d134-3bc3-c977132a58d4 (at 10.9.101.15@o2ib4) in 187 seconds. I think it's dead, and I am evicting it. exp ffff887be351a000, cur 1577321144 expire 1577320994 last 1577320957 [1334929.438339] Lustre: MGS: Connection restored to 478b4f20-cf53-3362-4d34-3a72671691c3 (at 10.9.103.4@o2ib4) [1334929.448166] Lustre: Skipped 1 previous similar message [1334951.729330] Lustre: MGS: haven't heard from client 1aafe80c-1b41-bd79-96ef-8a11bbeb9345 (at 10.9.101.15@o2ib4) in 226 seconds. I think it's dead, and I am evicting it. 
exp ffff888bf3ccdc00, cur 1577321183 expire 1577321033 last 1577320957 [1336475.545107] Lustre: MGS: Connection restored to (at 10.9.101.31@o2ib4) [1336475.551904] Lustre: Skipped 1 previous similar message [1336692.381414] Lustre: MGS: Connection restored to (at 10.9.101.15@o2ib4) [1336692.388213] Lustre: Skipped 1 previous similar message [1337044.660693] Lustre: MGS: Connection restored to (at 10.9.101.43@o2ib4) [1337044.667494] Lustre: Skipped 1 previous similar message [1337126.344545] Lustre: MGS: Connection restored to 1dc3f5bd-361f-e898-bb5a-7e27c765d63b (at 10.9.108.52@o2ib4) [1337126.354458] Lustre: Skipped 1 previous similar message [1337211.257695] Lustre: MGS: Connection restored to 5df41e4d-7dff-c151-5918-6d3c9eb8fbfd (at 10.9.108.31@o2ib4) [1337211.267606] Lustre: Skipped 1 previous similar message [1340988.212834] Lustre: MGS: Connection restored to (at 10.9.103.5@o2ib4) [1340988.219543] Lustre: Skipped 1 previous similar message [1341004.664131] Lustre: MGS: Connection restored to 714bb32a-6958-a8c3-cee0-4539b6d9d551 (at 10.9.103.23@o2ib4) [1341004.674049] Lustre: Skipped 1 previous similar message [1347215.818913] Lustre: MGS: haven't heard from client 2be78582-ac89-cc99-eac2-7d91a5283a6f (at 10.9.106.28@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887d4b8a6c00, cur 1577333447 expire 1577333297 last 1577333220 [1348112.801835] Lustre: fir-MDT0000: haven't heard from client e41f8f6c-0135-3698-3693-dca49342e6d3 (at 10.9.108.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887b9f70e000, cur 1577334344 expire 1577334194 last 1577334117 [1348112.823797] Lustre: Skipped 1 previous similar message [1349680.309450] Lustre: MGS: Connection restored to 9028721a-18e4-1b54-91e5-1a1728004be1 (at 10.9.106.28@o2ib4) [1349680.319397] Lustre: Skipped 1 previous similar message [1350168.861369] Lustre: MGS: Connection restored to (at 10.9.108.10@o2ib4) [1350168.868164] Lustre: Skipped 1 previous similar message [1350314.563743] Lustre: MGS: Connection restored to (at 10.9.108.3@o2ib4) [1350314.570448] Lustre: Skipped 1 previous similar message [1350371.310249] Lustre: MGS: Connection restored to 7dc77806-c779-f6f7-b102-8e88c090719f (at 10.9.108.2@o2ib4) [1350371.320076] Lustre: Skipped 1 previous similar message [1350517.135543] Lustre: MGS: Connection restored to e41f8f6c-0135-3698-3693-dca49342e6d3 (at 10.9.108.15@o2ib4) [1350517.145512] Lustre: Skipped 1 previous similar message [1378583.987504] Lustre: fir-MDT0000: haven't heard from client 6fa74e3f-9938-15c7-138c-f99832f361f8 (at 10.9.108.5@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887a0382e000, cur 1577364815 expire 1577364665 last 1577364588 [1378584.009400] Lustre: Skipped 7 previous similar messages [1379436.608445] Lustre: MGS: Connection restored to f6ba9aa6-4fc2-730c-edfa-2a78e78265f6 (at 10.9.103.13@o2ib4) [1379436.618361] Lustre: Skipped 1 previous similar message [1380752.014996] Lustre: MGS: Connection restored to 6fa74e3f-9938-15c7-138c-f99832f361f8 (at 10.9.108.5@o2ib4) [1380752.024840] Lustre: Skipped 1 previous similar message [1402289.232880] LNetError: 38663:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5) [1419300.230997] Lustre: MGS: Connection restored to (at 10.9.103.19@o2ib4) [1419300.237801] Lustre: Skipped 1 previous similar message [1422710.680965] Lustre: MGS: Connection restored to c782e41a-2c79-9684-50af-81e0df79157a (at 10.9.104.72@o2ib4) [1422710.690881] Lustre: Skipped 1 previous similar message [1422810.630217] Lustre: MGS: Connection restored to (at 10.9.104.71@o2ib4) [1422810.637012] Lustre: Skipped 1 previous similar message [1427513.313931] Lustre: MGS: haven't heard from client a7009069-c675-53b5-33f1-1365d976f3d3 (at 10.9.104.65@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888be4608800, cur 1577413744 expire 1577413594 last 1577413517 [1427513.335203] Lustre: Skipped 1 previous similar message [1427523.282271] Lustre: fir-MDT0000: haven't heard from client 642f74bd-3e14-2ada-9b96-4a08331f88e5 (at 10.9.104.65@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88750cbb3400, cur 1577413754 expire 1577413604 last 1577413527 [1429819.453796] Lustre: MGS: Connection restored to (at 10.9.104.65@o2ib4) [1429819.460612] Lustre: Skipped 1 previous similar message [1432498.324045] Lustre: fir-MDT0000: haven't heard from client 5ec89db4-70f6-e901-f379-b0c0069f29c2 (at 10.9.106.24@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887ad06e6000, cur 1577418729 expire 1577418579 last 1577418502 [1434466.335733] Lustre: MGS: Connection restored to 071864c2-9183-bebb-c178-92a495b48267 (at 10.9.106.21@o2ib4) [1434466.345648] Lustre: Skipped 1 previous similar message [1434512.607522] Lustre: MGS: Connection restored to 1bb11754-7387-d7b6-e20b-128b72ce3c4b (at 10.9.106.34@o2ib4) [1434512.617442] Lustre: Skipped 1 previous similar message [1434699.780301] Lustre: MGS: Connection restored to 00b96260-32d1-1021-deb2-b528cf350aa4 (at 10.9.106.29@o2ib4) [1434699.790222] Lustre: Skipped 1 previous similar message [1434718.747128] Lustre: MGS: Connection restored to 5ec89db4-70f6-e901-f379-b0c0069f29c2 (at 10.9.106.24@o2ib4) [1434718.757067] Lustre: Skipped 1 previous similar message [1434740.195263] Lustre: MGS: Connection restored to (at 10.9.106.31@o2ib4) [1434740.202064] Lustre: Skipped 1 previous similar message [1434760.699595] Lustre: MGS: Connection restored to (at 10.9.106.35@o2ib4) [1434760.706399] Lustre: Skipped 1 previous similar message [1437487.350917] Lustre: fir-MDT0000: haven't heard from client b66c4d4c-fe0f-8ae2-31c8-85f7dd71a3ff (at 10.9.106.27@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8879bd2f2800, cur 1577423718 expire 1577423568 last 1577423491 [1437487.372880] Lustre: Skipped 11 previous similar messages [1437682.349914] Lustre: fir-MDT0000: haven't heard from client 4d878457-c7d5-447e-2d23-6e8e7d59fb65 (at 10.9.102.30@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887bf9da7c00, cur 1577423913 expire 1577423763 last 1577423686 [1437682.371897] Lustre: Skipped 3 previous similar messages [1439716.176314] Lustre: MGS: Connection restored to b66c4d4c-fe0f-8ae2-31c8-85f7dd71a3ff (at 10.9.106.27@o2ib4) [1439716.186227] Lustre: Skipped 1 previous similar message [1439802.261551] Lustre: MGS: Connection restored to eb320cf7-cbf1-7af4-abe8-824de46a382c (at 10.9.106.14@o2ib4) [1439802.271469] Lustre: Skipped 1 previous similar message [1440042.492796] Lustre: MGS: Connection restored to 4d878457-c7d5-447e-2d23-6e8e7d59fb65 (at 10.9.102.30@o2ib4) [1440042.502713] Lustre: Skipped 1 previous similar message [1453014.445104] Lustre: fir-MDT0000: haven't heard from client acb19754-d1f1-4 (at 10.9.103.50@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8869b8f38c00, cur 1577439245 expire 1577439095 last 1577439018 [1453014.465252] Lustre: Skipped 1 previous similar message [1453444.447799] Lustre: fir-MDT0000: haven't heard from client 508a6e6d-3a6d-4 (at 10.9.103.48@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885fff6cd800, cur 1577439675 expire 1577439525 last 1577439448 [1453444.467942] Lustre: Skipped 25 previous similar messages [1453520.448303] Lustre: fir-MDT0000: haven't heard from client 1fa8ddc9-e0b5-8fe7-ac59-6b121a619269 (at 10.8.26.33@o2ib6) in 181 seconds. I think it's dead, and I am evicting it. 
exp ffff88757d3eb800, cur 1577439751 expire 1577439601 last 1577439570 [1453520.470188] Lustre: Skipped 1 previous similar message [1453550.970468] Lustre: 109778:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1577439774/real 1577439774] req@ffff886aca637980 x1652550211567504/t0(0) o104->fir-MDT0000@10.9.116.2@o2ib4:15/16 lens 296/224 e 0 to 1 dl 1577439781 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [1453557.997514] Lustre: 109778:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1577439781/real 1577439781] req@ffff886aca637980 x1652550211567504/t0(0) o104->fir-MDT0000@10.9.116.2@o2ib4:15/16 lens 296/224 e 0 to 1 dl 1577439788 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1453565.024563] Lustre: 109778:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1577439788/real 1577439788] req@ffff886aca637980 x1652550211567504/t0(0) o104->fir-MDT0000@10.9.116.2@o2ib4:15/16 lens 296/224 e 0 to 1 dl 1577439795 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1453566.469383] Lustre: MGS: haven't heard from client d8bf52f5-e071-16ba-8a37-2d5729998cb8 (at 10.8.26.33@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887d4c7e7400, cur 1577439797 expire 1577439647 last 1577439570 [1453566.490571] Lustre: Skipped 19 previous similar messages [1453572.051607] Lustre: 109778:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1577439795/real 1577439795] req@ffff886aca637980 x1652550211567504/t0(0) o104->fir-MDT0000@10.9.116.2@o2ib4:15/16 lens 296/224 e 0 to 1 dl 1577439802 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1453579.078656] Lustre: 109778:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1577439802/real 1577439802] req@ffff886aca637980 x1652550211567504/t0(0) o104->fir-MDT0000@10.9.116.2@o2ib4:15/16 lens 296/224 e 0 to 1 dl 1577439809 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1453593.105733] Lustre: 109778:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1577439816/real 1577439816] req@ffff886aca637980 x1652550211567504/t0(0) o104->fir-MDT0000@10.9.116.2@o2ib4:15/16 lens 296/224 e 0 to 1 dl 1577439823 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1453593.133259] Lustre: 109778:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 1 previous similar message [1453596.448773] Lustre: fir-MDT0000: haven't heard from client 63de368d-cd37-4b74-b25f-9126c563c069 (at 10.9.105.10@o2ib4) in 226 seconds. I think it's dead, and I am evicting it. 
exp ffff887ab57fcc00, cur 1577439827 expire 1577439677 last 1577439601 [1453596.470739] Lustre: Skipped 22 previous similar messages [1454737.320726] Lustre: MGS: Connection restored to (at 10.9.116.2@o2ib4) [1454737.327437] Lustre: Skipped 1 previous similar message [1454827.243827] Lustre: MGS: Connection restored to fb2b31ba-9f80-8b46-f913-00d36178e70c (at 10.9.103.56@o2ib4) [1454827.253742] Lustre: Skipped 1 previous similar message [1454832.832038] Lustre: MGS: Connection restored to ef2d362e-15f9-19b6-7dd8-07207d8adffe (at 10.9.103.50@o2ib4) [1454832.841952] Lustre: Skipped 1 previous similar message [1454850.466847] Lustre: MGS: Connection restored to 9622ebd9-08dd-84f5-187b-b07758b1dd55 (at 10.9.103.48@o2ib4) [1454850.476764] Lustre: Skipped 1 previous similar message [1454941.063783] Lustre: MGS: Connection restored to 1bb11754-7387-d7b6-e20b-128b72ce3c4b (at 10.9.106.34@o2ib4) [1454941.073715] Lustre: Skipped 1 previous similar message [1455045.649171] Lustre: MGS: Connection restored to (at 10.9.108.10@o2ib4) [1455045.655977] Lustre: Skipped 1 previous similar message [1455093.832271] Lustre: MGS: Connection restored to 54a4e18f-2dbf-9330-244f-d38b0011d1d4 (at 10.9.103.65@o2ib4) [1455093.842194] Lustre: Skipped 3 previous similar messages [1455169.756768] Lustre: MGS: Connection restored to (at 10.8.26.33@o2ib6) [1455169.763479] Lustre: Skipped 7 previous similar messages [1455316.572397] Lustre: MGS: Connection restored to 4e2702d7-8aa3-c82c-cda8-d58ec71db6f4 (at 10.9.105.11@o2ib4) [1455316.582322] Lustre: Skipped 9 previous similar messages [1455660.327857] Lustre: MGS: Connection restored to 969f84d2-0d0f-26d6-74a9-f803adc15ecf (at 10.8.13.27@o2ib6) [1455660.337696] Lustre: Skipped 15 previous similar messages [1456278.465429] Lustre: fir-MDT0000: haven't heard from client a1e19779-6049-1282-3793-3d963dedc017 (at 10.9.105.29@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff88741a351800, cur 1577442509 expire 1577442359 last 1577442282 [1456278.487397] Lustre: Skipped 2 previous similar messages [1457480.471377] Lustre: fir-MDT0000: haven't heard from client 45359108-42d4-eec1-42df-ba80fda15025 (at 10.9.105.4@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887b9f709400, cur 1577443711 expire 1577443561 last 1577443484 [1457480.493260] Lustre: Skipped 286 previous similar messages [1458146.933840] Lustre: MGS: Connection restored to abb5ac98-d884-746d-5bad-e1a980f92130 (at 10.9.110.22@o2ib4) [1458146.943753] Lustre: Skipped 53 previous similar messages [1458222.789748] Lustre: MGS: Connection restored to (at 10.9.107.27@o2ib4) [1458222.796551] Lustre: Skipped 15 previous similar messages [1458352.925252] Lustre: MGS: Connection restored to 882378af-0b41-73ee-5c10-5cc51464645c (at 10.9.108.22@o2ib4) [1458352.935190] Lustre: Skipped 17 previous similar messages [1458609.269788] Lustre: MGS: Connection restored to (at 10.9.109.10@o2ib4) [1458609.276592] Lustre: Skipped 251 previous similar messages [1459766.055021] Lustre: MGS: Connection restored to 45359108-42d4-eec1-42df-ba80fda15025 (at 10.9.105.4@o2ib4) [1459766.064862] Lustre: Skipped 145 previous similar messages [1459961.299645] Lustre: MGS: Connection restored to (at 10.9.107.67@o2ib4) [1459961.306450] Lustre: Skipped 1 previous similar message [1461805.496090] Lustre: fir-MDT0000: haven't heard from client b5fd2bd8-98a0-6d39-58e1-4052c3f501ed (at 10.8.28.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88794e3f1400, cur 1577448036 expire 1577447886 last 1577447809 [1461805.517891] Lustre: Skipped 1 previous similar message [1462443.499396] Lustre: fir-MDT0000: haven't heard from client c3ac0abe-4001-a9cd-b0c3-8ae511f6ef2d (at 10.9.112.3@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887bbe9c4c00, cur 1577448674 expire 1577448524 last 1577448447 [1462443.521283] Lustre: Skipped 1 previous similar message [1463277.045260] Lustre: MGS: Connection restored to (at 10.8.28.4@o2ib6) [1463277.051891] Lustre: Skipped 1 previous similar message [1464763.745218] Lustre: MGS: Connection restored to c3ac0abe-4001-a9cd-b0c3-8ae511f6ef2d (at 10.9.112.3@o2ib4) [1464763.755043] Lustre: Skipped 1 previous similar message [1465806.517755] Lustre: fir-MDT0000: haven't heard from client 09403296-99cb-0352-a342-f41333f5025e (at 10.9.107.69@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8874d5e32c00, cur 1577452037 expire 1577451887 last 1577451810 [1465806.539722] Lustre: Skipped 1 previous similar message [1466607.522146] Lustre: fir-MDT0000: haven't heard from client e28be1e7-a280-e0e2-d404-71ed26b45978 (at 10.9.105.49@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887827ade800, cur 1577452838 expire 1577452688 last 1577452611 [1466607.544113] Lustre: Skipped 1 previous similar message [1467798.175433] Lustre: MGS: Connection restored to (at 10.9.107.69@o2ib4) [1467798.182232] Lustre: Skipped 1 previous similar message [1469047.399255] Lustre: MGS: Connection restored to e28be1e7-a280-e0e2-d404-71ed26b45978 (at 10.9.105.49@o2ib4) [1469047.409179] Lustre: Skipped 1 previous similar message [1469553.538696] Lustre: fir-MDT0000: haven't heard from client ae9bd656-5a6f-05f5-a9fa-237fb2f346f5 (at 10.9.107.71@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8879d1e87800, cur 1577455784 expire 1577455634 last 1577455557 [1469553.560667] Lustre: Skipped 1 previous similar message [1470354.021814] Lustre: MGS: Connection restored to (at 10.9.109.5@o2ib4) [1470354.028524] Lustre: Skipped 1 previous similar message [1470959.546509] Lustre: fir-MDT0000: haven't heard from client 7a6f55ae-87a4-2987-c144-04b6c1dac1db (at 10.9.106.23@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887762090c00, cur 1577457190 expire 1577457040 last 1577456963 [1470959.568474] Lustre: Skipped 5 previous similar messages [1470976.554550] Lustre: MGS: haven't heard from client c1c0bbff-7c22-83d5-d03b-b588a0bf9077 (at 10.9.106.23@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bf3c81800, cur 1577457207 expire 1577457057 last 1577456980 [1471617.850598] Lustre: MGS: Connection restored to (at 10.9.107.72@o2ib4) [1471617.857397] Lustre: Skipped 1 previous similar message [1471909.550991] Lustre: fir-MDT0000: haven't heard from client 41380b2a-9baf-4 (at 10.8.28.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887baa6fbc00, cur 1577458140 expire 1577457990 last 1577457913 [1471973.013423] Lustre: MGS: Connection restored to 770ff19b-cee5-0b67-8359-a1437bc75e2e (at 10.9.107.70@o2ib4) [1471973.023346] Lustre: Skipped 1 previous similar message [1471994.665505] Lustre: MGS: Connection restored to (at 10.9.107.71@o2ib4) [1471994.672315] Lustre: Skipped 1 previous similar message [1473383.262946] Lustre: MGS: Connection restored to c3415e6e-dda3-8602-28df-a932f656881d (at 10.9.112.17@o2ib4) [1473383.272863] Lustre: Skipped 1 previous similar message [1473412.216651] Lustre: MGS: Connection restored to 7a6f55ae-87a4-2987-c144-04b6c1dac1db (at 10.9.106.23@o2ib4) [1473412.226574] Lustre: Skipped 1 previous similar message [1474018.504209] Lustre: MGS: Connection restored to (at 10.8.28.4@o2ib6) [1474018.510832] Lustre: Skipped 1 previous similar message [1477167.226900] Lustre: 109632:0:(mdd_device.c:1807:mdd_changelog_clear()) fir-MDD0000: Failure to clear the changelog for user 3: -22 [1477167.238811] Lustre: 109632:0:(mdd_device.c:1807:mdd_changelog_clear()) Skipped 311 previous similar messages [1477168.826033] Lustre: 109632:0:(mdd_device.c:1807:mdd_changelog_clear()) fir-MDD0000: Failure to clear the changelog for user 3: -22 [1477168.837956] Lustre: 109632:0:(mdd_device.c:1807:mdd_changelog_clear()) Skipped 
868 previous similar messages [1477175.505071] Lustre: 109661:0:(mdd_device.c:1807:mdd_changelog_clear()) fir-MDD0000: Failure to clear the changelog for user 3: -22 [1477175.517007] Lustre: 109661:0:(mdd_device.c:1807:mdd_changelog_clear()) Skipped 19 previous similar messages [1478463.099186] Lustre: MGS: Connection restored to 9622ebd9-08dd-84f5-187b-b07758b1dd55 (at 10.9.103.48@o2ib4) [1478463.109100] Lustre: Skipped 1 previous similar message [1478463.800696] Lustre: MGS: Connection restored to 79da3557-0dd8-94ba-46e3-b3332f203b06 (at 10.9.110.26@o2ib4) [1478463.810614] Lustre: Skipped 1 previous similar message [1478465.651012] Lustre: MGS: Connection restored to ef2d362e-15f9-19b6-7dd8-07207d8adffe (at 10.9.103.50@o2ib4) [1478465.660926] Lustre: Skipped 5 previous similar messages [1478472.411125] Lustre: MGS: Connection restored to (at 10.9.110.14@o2ib4) [1478472.417924] Lustre: Skipped 1 previous similar message [1478482.627146] Lustre: MGS: Connection restored to 628ef2b2-a10c-b5aa-c89a-d9e59a7dce2e (at 10.9.107.25@o2ib4) [1478482.637062] Lustre: Skipped 1 previous similar message [1478492.597099] Lustre: MGS: Connection restored to 6df4ef0e-2a74-e6b2-3bce-21cb74d17257 (at 10.9.107.10@o2ib4) [1478492.607015] Lustre: Skipped 3 previous similar messages [1478507.590630] Lustre: fir-MDT0000: haven't heard from client f56be651-dc31-4 (at 10.9.110.54@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886a70f0b000, cur 1577464738 expire 1577464588 last 1577464511 [1478507.610784] Lustre: Skipped 1 previous similar message [1478511.019789] Lustre: MGS: Connection restored to 882378af-0b41-73ee-5c10-5cc51464645c (at 10.9.108.22@o2ib4) [1478511.029707] Lustre: Skipped 15 previous similar messages [1478771.592093] Lustre: fir-MDT0000: haven't heard from client 56ec1425-5381-0678-b174-d4693bd27d63 (at 10.9.106.30@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887bbd83d400, cur 1577465002 expire 1577464852 last 1577464775 [1478771.614059] Lustre: Skipped 41 previous similar messages [1478859.030938] Lustre: MGS: Connection restored to (at 10.9.107.72@o2ib4) [1478859.037732] Lustre: Skipped 7 previous similar messages [1480754.193975] Lustre: MGS: Connection restored to (at 10.9.109.72@o2ib4) [1480754.200772] Lustre: Skipped 1 previous similar message [1480767.827078] Lustre: MGS: Connection restored to c8611503-00c9-0601-ab3e-6ab5502d2910 (at 10.9.109.52@o2ib4) [1480767.836995] Lustre: Skipped 1 previous similar message [1480847.882196] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [1480847.888917] Lustre: Skipped 1 previous similar message [1481186.579084] Lustre: MGS: Connection restored to (at 10.9.106.30@o2ib4) [1481186.585878] Lustre: Skipped 1 previous similar message [1481251.605811] Lustre: fir-MDT0000: haven't heard from client 68a841f6-0091-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8879ed2a8400, cur 1577467482 expire 1577467332 last 1577467255 [1481251.625866] Lustre: Skipped 1 previous similar message [1481436.932060] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [1481436.938771] Lustre: Skipped 3 previous similar messages [1481494.608455] Lustre: fir-MDT0000: haven't heard from client 9a42c635-8d69-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8868a5f61c00, cur 1577467725 expire 1577467575 last 1577467498 [1481494.628517] Lustre: Skipped 1 previous similar message [1481851.610841] Lustre: fir-MDT0000: haven't heard from client 9e4f3fc0-bbec-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8879ed2ac800, cur 1577468082 expire 1577467932 last 1577467855 [1481851.630897] Lustre: Skipped 1 previous similar message [1481865.621881] Lustre: MGS: haven't heard from client 139d34d2-9796-4 (at 10.8.23.12@o2ib6) in 227 seconds. 
I think it's dead, and I am evicting it. exp ffff8879a03c0c00, cur 1577468096 expire 1577467946 last 1577467869 [1482111.569448] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [1482111.576153] Lustre: Skipped 1 previous similar message [1482507.615117] Lustre: fir-MDT0000: haven't heard from client cce6ad10-0df7-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8869252cf800, cur 1577468738 expire 1577468588 last 1577468511 [1482515.643278] Lustre: MGS: haven't heard from client eeb0ee61-1f85-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885c561ec400, cur 1577468746 expire 1577468596 last 1577468519 [1482678.276003] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [1482678.282716] Lustre: Skipped 1 previous similar message [1483051.618733] Lustre: fir-MDT0000: haven't heard from client 766e63ea-d82f-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88560523c400, cur 1577469282 expire 1577469132 last 1577469055 [1483056.626170] Lustre: MGS: haven't heard from client ab7cd1cf-3091-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8867524a6c00, cur 1577469287 expire 1577469137 last 1577469060 [1483308.390493] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [1483308.397204] Lustre: Skipped 3 previous similar messages [1483737.632804] Lustre: MGS: haven't heard from client 02289f5e-6d32-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88687fbb3c00, cur 1577469968 expire 1577469818 last 1577469741 [1483739.621820] Lustre: fir-MDT0000: haven't heard from client 9e729b9b-a781-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887659273800, cur 1577469970 expire 1577469820 last 1577469743 [1484367.626953] Lustre: fir-MDT0000: haven't heard from client cf50bd41-e64a-4 (at 10.8.23.12@o2ib6) in 227 seconds. 
I think it's dead, and I am evicting it. exp ffff8865f2e86400, cur 1577470598 expire 1577470448 last 1577470371 [1484600.446627] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [1484600.453334] Lustre: Skipped 5 previous similar messages [1484954.630318] Lustre: fir-MDT0000: haven't heard from client 58dcb6b5-1cba-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88504ee27000, cur 1577471185 expire 1577471035 last 1577470958 [1484954.650371] Lustre: Skipped 1 previous similar message [1485019.713885] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [1485019.720593] Lustre: Skipped 1 previous similar message [1485133.638415] Lustre: fir-MDT0000: haven't heard from client 4c6caea5-4af0-4 (at 10.9.109.45@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885f4bad9400, cur 1577471364 expire 1577471214 last 1577471137 [1485133.658558] Lustre: Skipped 1 previous similar message [1485546.619087] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [1485546.625802] Lustre: Skipped 1 previous similar message [1485573.634177] Lustre: fir-MDT0000: haven't heard from client ac4b1589-d416-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88601ba7d800, cur 1577471804 expire 1577471654 last 1577471577 [1485573.654238] Lustre: Skipped 1 previous similar message [1486000.636653] Lustre: fir-MDT0000: haven't heard from client 4d46ae94-e7ec-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887418284c00, cur 1577472231 expire 1577472081 last 1577472004 [1486000.656736] Lustre: Skipped 1 previous similar message [1486036.316347] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [1486036.323052] Lustre: Skipped 1 previous similar message [1486440.639282] Lustre: fir-MDT0000: haven't heard from client 2238cd8b-09e8-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8851b4300c00, cur 1577472671 expire 1577472521 last 1577472444 [1486440.659340] Lustre: Skipped 1 previous similar message [1486637.354720] Lustre: MGS: Connection restored to (at 10.9.109.45@o2ib4) [1486637.361519] Lustre: Skipped 3 previous similar messages [1486678.640896] Lustre: fir-MDT0000: haven't heard from client 7650cc1d-e09f-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88601ba7f800, cur 1577472909 expire 1577472759 last 1577472682 [1486678.660954] Lustre: Skipped 1 previous similar message [1486980.642697] Lustre: fir-MDT0000: haven't heard from client 27108ea1-064e-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bd1a9cc00, cur 1577473211 expire 1577473061 last 1577472984 [1486980.662754] Lustre: Skipped 1 previous similar message [1487431.361066] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [1487431.367782] Lustre: Skipped 5 previous similar messages [1487481.645436] Lustre: fir-MDT0000: haven't heard from client 408b93e5-57d3-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ae86fcc00, cur 1577473712 expire 1577473562 last 1577473485 [1487481.665533] Lustre: Skipped 1 previous similar message [1487709.646745] Lustre: fir-MDT0000: haven't heard from client 6175307b-3702-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887a2ba4d400, cur 1577473940 expire 1577473790 last 1577473713 [1487709.666799] Lustre: Skipped 1 previous similar message [1487973.655333] Lustre: MGS: haven't heard from client 931ed0a8-59b2-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887a6ff09000, cur 1577474204 expire 1577474054 last 1577473977 [1487973.674697] Lustre: Skipped 1 previous similar message [1488146.171211] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [1488146.177921] Lustre: Skipped 5 previous similar messages [1488621.652671] Lustre: fir-MDT0000: haven't heard from client bff639f8-9e03-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ae86fc000, cur 1577474852 expire 1577474702 last 1577474625 [1488621.672731] Lustre: Skipped 5 previous similar messages [1488947.456143] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [1488947.462849] Lustre: Skipped 5 previous similar messages [1489225.656007] Lustre: fir-MDT0000: haven't heard from client 1a9305f0-78f7-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886a451d9c00, cur 1577475456 expire 1577475306 last 1577475229 [1489225.676068] Lustre: Skipped 5 previous similar messages [1490587.779335] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [1490587.786045] Lustre: Skipped 5 previous similar messages [1490999.665912] Lustre: fir-MDT0000: haven't heard from client 60624d86-718a-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff886fe3fd7c00, cur 1577477230 expire 1577477080 last 1577477003 [1490999.685972] Lustre: Skipped 7 previous similar messages [1491371.784727] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [1491371.791442] Lustre: Skipped 1 previous similar message [1491389.347660] Lustre: MGS: Connection restored to e14056e8-a850-2c49-c269-dc2b6853832b (at 10.9.108.69@o2ib4) [1491389.357574] Lustre: Skipped 1 previous similar message [1491410.851983] Lustre: MGS: Connection restored to 24aa28b1-552f-aa63-3edd-077b217a77da (at 10.9.108.70@o2ib4) [1491410.861901] Lustre: Skipped 1 previous similar message [1491733.671242] Lustre: fir-MDT0000: haven't heard from client c1db379b-f117-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887a75ab4c00, cur 1577477964 expire 1577477814 last 1577477737 [1491733.691294] Lustre: Skipped 1 previous similar message [1491750.679506] Lustre: MGS: haven't heard from client bfbb4ba2-eb9a-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8872d8ef7c00, cur 1577477981 expire 1577477831 last 1577477754 [1491808.667652] Lustre: MGS: Connection restored to 884abc27-a74d-8041-6bee-9a3356dc7069 (at 10.9.107.68@o2ib4) [1491808.677569] Lustre: Skipped 3 previous similar messages [1491883.864464] Lustre: fir-MDT0000: Connection restored to (at 10.8.23.12@o2ib6) [1491883.871870] Lustre: Skipped 2 previous similar messages [1492161.673495] Lustre: fir-MDT0000: haven't heard from client 2725365a-832c-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886a451d8400, cur 1577478392 expire 1577478242 last 1577478165 [1492574.088388] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [1492920.677897] Lustre: fir-MDT0000: haven't heard from client 875b5aa9-d7ae-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff884c2535b400, cur 1577479151 expire 1577479001 last 1577478924 [1492920.698000] Lustre: Skipped 1 previous similar message [1492927.688103] Lustre: MGS: haven't heard from client e03fcc0f-bcb2-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886944741c00, cur 1577479158 expire 1577479008 last 1577478931 [1493167.680575] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [1493167.687284] Lustre: Skipped 1 previous similar message [1493225.679098] Lustre: MGS: haven't heard from client 5a9dd631-8a16-e221-c7ee-215bf477f808 (at 10.9.110.45@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887d4c813800, cur 1577479456 expire 1577479306 last 1577479229 [1493301.684974] Lustre: fir-MDT0000: haven't heard from client 755cfe5d-3893-170b-7d25-0214edf35779 (at 10.9.105.68@o2ib4) in 179 seconds. I think it's dead, and I am evicting it. exp ffff887a51f72c00, cur 1577479532 expire 1577479382 last 1577479353 [1493301.706937] Lustre: Skipped 1 previous similar message [1493352.719360] Lustre: MGS: haven't heard from client 78d776f4-1236-74bc-07a4-4cf323951618 (at 10.9.105.68@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bdde7fc00, cur 1577479583 expire 1577479433 last 1577479356 [1493911.042950] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [1493911.049660] Lustre: Skipped 1 previous similar message [1494247.686983] Lustre: fir-MDT0000: haven't heard from client 8051cb6f-a336-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8862b22d7800, cur 1577480478 expire 1577480328 last 1577480251 [1494247.707040] Lustre: Skipped 2 previous similar messages [1494264.698816] Lustre: MGS: haven't heard from client eefd3cce-324a-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887b01e7c400, cur 1577480495 expire 1577480345 last 1577480268 [1494897.669380] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [1494897.676095] Lustre: Skipped 1 previous similar message [1495135.695876] Lustre: fir-MDT0000: haven't heard from client 30fc7389-64a1-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8875a130c800, cur 1577481366 expire 1577481216 last 1577481139 [1495151.700318] Lustre: MGS: haven't heard from client a7304b6b-a6b3-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8856ada55c00, cur 1577481382 expire 1577481232 last 1577481155 [1495691.293459] Lustre: MGS: Connection restored to (at 10.9.105.68@o2ib4) [1495691.300252] Lustre: Skipped 5 previous similar messages [1495910.695503] Lustre: fir-MDT0000: haven't heard from client 80de7570-e807-8787-7716-ace1800f6695 (at 10.9.108.17@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887a51f77800, cur 1577482141 expire 1577481991 last 1577481914 [1498005.708278] Lustre: MGS: haven't heard from client 1b2caf8a-0240-8ec4-0b8c-e8cbb5a5fc81 (at 10.9.110.46@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bbc462000, cur 1577484236 expire 1577484086 last 1577484009 [1498005.729556] Lustre: Skipped 1 previous similar message [1498117.968535] Lustre: MGS: Connection restored to 80de7570-e807-8787-7716-ace1800f6695 (at 10.9.108.17@o2ib4) [1498117.978453] Lustre: Skipped 1 previous similar message [1498270.709468] Lustre: fir-MDT0000: haven't heard from client a3ce419d-e689-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8857082ea000, cur 1577484501 expire 1577484351 last 1577484274
[1498270.729525] Lustre: Skipped 1 previous similar message
[1498634.486979] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[1498634.493692] Lustre: Skipped 1 previous similar message
[1499513.722126] Lustre: fir-MDT0000: haven't heard from client 7794b003-7ee1-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff884e70ba3800, cur 1577485744 expire 1577485594 last 1577485517
[1499513.742182] Lustre: Skipped 1 previous similar message
[1499514.728984] Lustre: MGS: haven't heard from client 030cfae4-3799-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8859f4b37000, cur 1577485745 expire 1577485595 last 1577485518
[1499627.123897] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[1499627.130609] Lustre: Skipped 1 previous similar message
[1499752.718459] Lustre: fir-MDT0000: haven't heard from client 1e345d12-444b-48ed-89b3-9136a08e6c23 (at 10.9.107.2@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887bf9da0800, cur 1577485983 expire 1577485833 last 1577485756
[1499828.717721] Lustre: fir-MDT0000: haven't heard from client b0f1f394-eab1-7c6a-14aa-cdbd7cff86a5 (at 10.8.31.10@o2ib6) in 158 seconds. I think it's dead, and I am evicting it. exp ffff887ba97b1800, cur 1577486059 expire 1577485909 last 1577485901
[1499828.739600] Lustre: Skipped 1 previous similar message
[1499891.737784] Lustre: MGS: haven't heard from client 9573f4e4-5755-451a-7da6-e05e7d5d3ae8 (at 10.8.31.10@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bcafec800, cur 1577486122 expire 1577485972 last 1577485895
[1500003.719762] Lustre: fir-MDT0000: haven't heard from client 3d22d837-342b-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885fbee11800, cur 1577486234 expire 1577486084 last 1577486007
[1500067.783615] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[1500067.790334] Lustre: Skipped 1 previous similar message
[1500112.627497] Lustre: MGS: Connection restored to (at 10.9.110.46@o2ib4)
[1500112.634291] Lustre: Skipped 1 previous similar message
[1500578.723103] Lustre: fir-MDT0000: haven't heard from client 3e0f566d-5522-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885ebe01ac00, cur 1577486809 expire 1577486659 last 1577486582
[1500578.743183] Lustre: Skipped 1 previous similar message
[1500612.580184] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[1500612.586896] Lustre: Skipped 1 previous similar message
[1500949.738902] Lustre: fir-MDT0000: haven't heard from client f3956eb9-9be5-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8857082eb400, cur 1577487180 expire 1577487030 last 1577486953
[1500949.758964] Lustre: Skipped 1 previous similar message
[1501148.726715] Lustre: fir-MDT0000: haven't heard from client 4380447f-ca0d-2b3f-165d-69d1c6bd4d88 (at 10.9.108.8@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88794e3f5400, cur 1577487379 expire 1577487229 last 1577487152
[1501148.748587] Lustre: Skipped 1 previous similar message
[1501578.729607] Lustre: fir-MDT0000: haven't heard from client 3532db27-3550-1319-6c1b-3d6651c2c9af (at 10.9.108.62@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88794e3f7400, cur 1577487809 expire 1577487659 last 1577487582
[1501578.751584] Lustre: Skipped 1 previous similar message
[1501897.321963] Lustre: MGS: Connection restored to (at 10.9.107.2@o2ib4)
[1501897.328669] Lustre: Skipped 1 previous similar message
[1501988.760680] Lustre: MGS: Connection restored to 554dcc06-8c06-f49d-eac2-beeb59276b64 (at 10.9.109.6@o2ib4)
[1501988.770511] Lustre: Skipped 1 previous similar message
[1502460.849891] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[1502460.856601] Lustre: Skipped 3 previous similar messages
[1502687.737506] Lustre: fir-MDT0000: haven't heard from client 21c77817-b642-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885d8d66c000, cur 1577488918 expire 1577488768 last 1577488691
[1502687.757579] Lustre: Skipped 1 previous similar message
[1502856.738938] Lustre: fir-MDT0000: haven't heard from client 51d0c022-098e-f3a8-e504-1672a4918d33 (at 10.9.108.55@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887862c50c00, cur 1577489087 expire 1577488937 last 1577488860
[1502856.760898] Lustre: Skipped 1 previous similar message
[1503011.402260] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[1503011.408972] Lustre: Skipped 1 previous similar message
[1503419.775776] Lustre: MGS: Connection restored to (at 10.8.24.30@o2ib6)
[1503419.782492] Lustre: Skipped 1 previous similar message
[1503464.742747] Lustre: fir-MDT0000: haven't heard from client acb498ac-e0af-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8879eeb29c00, cur 1577489695 expire 1577489545 last 1577489468
[1503464.762804] Lustre: Skipped 1 previous similar message
[1503741.844364] Lustre: MGS: Connection restored to 6fc393d4-6943-9dea-aaa0-9424272390c8 (at 10.8.26.5@o2ib6)
[1503741.854120] Lustre: Skipped 5 previous similar messages
[1504056.746921] Lustre: fir-MDT0000: haven't heard from client b6bfc9f5-2cd6-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8855dbeca800, cur 1577490287 expire 1577490137 last 1577490060
[1504056.766986] Lustre: Skipped 1 previous similar message
[1504418.623323] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[1504418.630029] Lustre: Skipped 7 previous similar messages
[1504497.750070] Lustre: fir-MDT0000: haven't heard from client 946b0205-e2ab-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8879f62db400, cur 1577490728 expire 1577490578 last 1577490501
[1504497.770130] Lustre: Skipped 3 previous similar messages
[1505148.039053] Lustre: MGS: Connection restored to 51d0c022-098e-f3a8-e504-1672a4918d33 (at 10.9.108.55@o2ib4)
[1505148.048972] Lustre: Skipped 1 previous similar message
[1506262.762934] Lustre: fir-MDT0000: haven't heard from client 34e4932f-aa61-0617-6986-8c885c948b7a (at 10.9.106.32@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887546e89400, cur 1577492493 expire 1577492343 last 1577492266
[1506262.784896] Lustre: Skipped 3 previous similar messages
[1506838.630497] Lustre: MGS: Connection restored to f7e00986-8544-51b5-5c7e-5b48cb50b80d (at 10.9.108.65@o2ib4)
[1506838.640411] Lustre: Skipped 1 previous similar message
[1507121.769534] Lustre: MGS: haven't heard from client d8a48766-273b-f61a-d45c-9581a264c1e4 (at 10.9.108.54@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888be2aefc00, cur 1577493352 expire 1577493202 last 1577493125
[1507121.790805] Lustre: Skipped 1 previous similar message
[1507195.751338] Lustre: MGS: Connection restored to 788c6bef-0d82-e2c1-b3af-3e03bdb6233c (at 10.9.103.20@o2ib4)
[1507195.761256] Lustre: Skipped 1 previous similar message
[1508086.776864] Lustre: fir-MDT0000: haven't heard from client 35ba350a-bccc-3fd9-39f0-a94eca80785d (at 10.9.107.33@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887a51f75400, cur 1577494317 expire 1577494167 last 1577494090
[1508086.798844] Lustre: Skipped 1 previous similar message
[1508470.633730] Lustre: MGS: Connection restored to (at 10.9.106.32@o2ib4)
[1508470.640528] Lustre: Skipped 1 previous similar message
[1509530.185970] Lustre: MGS: Connection restored to 72873d34-8e81-fac8-81c3-235a686f2b2f (at 10.9.108.54@o2ib4)
[1509530.195890] Lustre: Skipped 1 previous similar message
[1510119.846782] Lustre: MGS: Connection restored to 35ba350a-bccc-3fd9-39f0-a94eca80785d (at 10.9.107.33@o2ib4)
[1510119.856700] Lustre: Skipped 1 previous similar message
[1510137.791666] Lustre: fir-MDT0000: haven't heard from client 901b72e8-5a4a-171a-6f23-9a8a3f1d6ac1 (at 10.9.108.40@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887978e5cc00, cur 1577496368 expire 1577496218 last 1577496141
[1510137.813627] Lustre: Skipped 1 previous similar message
[1512342.930668] Lustre: MGS: Connection restored to (at 10.9.108.40@o2ib4)
[1512342.937474] Lustre: Skipped 1 previous similar message
[1513101.813478] Lustre: fir-MDT0000: haven't heard from client 44c6f171-46e6-59fb-2b19-a9c23808714f (at 10.9.108.49@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8879bd2f6000, cur 1577499332 expire 1577499182 last 1577499105
[1513101.835440] Lustre: Skipped 1 previous similar message
[1514137.820521] Lustre: fir-MDT0000: haven't heard from client b76f5425-cba8-6476-49e4-36be4bb23e72 (at 10.9.105.1@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887a51f76400, cur 1577500368 expire 1577500218 last 1577500141
[1514137.842421] Lustre: Skipped 1 previous similar message
[1515122.454238] Lustre: MGS: Connection restored to 44c6f171-46e6-59fb-2b19-a9c23808714f (at 10.9.108.49@o2ib4)
[1515122.464154] Lustre: Skipped 1 previous similar message
[1515527.979473] Lustre: MGS: Connection restored to (at 10.9.112.13@o2ib4)
[1515527.986268] Lustre: Skipped 1 previous similar message
[1515561.833911] Lustre: MGS: haven't heard from client a02762ba-8c7a-4 (at 10.9.112.13@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8887d12d9800, cur 1577501792 expire 1577501642 last 1577501565
[1515561.853359] Lustre: Skipped 1 previous similar message
[1515637.857644] Lustre: fir-MDT0000: haven't heard from client 6f180733-5160-b093-3d65-dc65c8fbdb8a (at 10.9.108.13@o2ib4) in 185 seconds. I think it's dead, and I am evicting it. exp ffff8874d5e37c00, cur 1577501868 expire 1577501718 last 1577501683
[1515637.879628] Lustre: Skipped 1 previous similar message
[1515679.842679] Lustre: MGS: haven't heard from client 36bfcc74-a312-b0b4-3f9c-faa2022aab09 (at 10.9.108.13@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bf4fd9400, cur 1577501910 expire 1577501760 last 1577501683
[1516440.119295] Lustre: MGS: Connection restored to (at 10.9.105.1@o2ib4)
[1516440.126003] Lustre: Skipped 1 previous similar message
[1518039.523687] Lustre: MGS: Connection restored to 6f180733-5160-b093-3d65-dc65c8fbdb8a (at 10.9.108.13@o2ib4)
[1518039.533604] Lustre: Skipped 1 previous similar message
[1521645.870473] Lustre: fir-MDT0000: haven't heard from client 86a11234-25d1-a0be-a716-6c664a1d3a62 (at 10.9.109.38@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8879d2772000, cur 1577507876 expire 1577507726 last 1577507649
[1522658.877116] Lustre: fir-MDT0000: haven't heard from client 6b4dbc71-e415-4c59-fe43-096878d83a1b (at 10.8.27.15@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887862c53c00, cur 1577508889 expire 1577508739 last 1577508662
[1522658.899011] Lustre: Skipped 1 previous similar message
[1523905.068057] Lustre: MGS: Connection restored to 86a11234-25d1-a0be-a716-6c664a1d3a62 (at 10.9.109.38@o2ib4)
[1523905.077970] Lustre: Skipped 1 previous similar message
[1524338.146884] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[1524338.153595] Lustre: Skipped 1 previous similar message
[1524390.893123] Lustre: fir-MDT0000: haven't heard from client 49451555-2bdf-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886daaeddc00, cur 1577510621 expire 1577510471 last 1577510394
[1524390.913216] Lustre: Skipped 1 previous similar message
[1524510.193649] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[1524510.200355] Lustre: Skipped 1 previous similar message
[1524564.892508] Lustre: fir-MDT0000: haven't heard from client a6de7702-59ae-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88791c761000, cur 1577510795 expire 1577510645 last 1577510568
[1524564.912570] Lustre: Skipped 1 previous similar message
[1524673.651731] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[1524673.658440] Lustre: Skipped 1 previous similar message
[1524736.892886] Lustre: MGS: haven't heard from client ec1d7468-2dc1-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88619f678800, cur 1577510967 expire 1577510817 last 1577510740
[1524736.912247] Lustre: Skipped 1 previous similar message
[1524746.902263] Lustre: fir-MDT0000: haven't heard from client b4bac62a-a162-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff884c29603000, cur 1577510977 expire 1577510827 last 1577510750
[1524759.178522] Lustre: MGS: Connection restored to (at 10.8.27.15@o2ib6)
[1524759.185235] Lustre: Skipped 1 previous similar message
[1524951.909599] Lustre: fir-MDT0000: haven't heard from client 4ef044ad-cb93-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886bc0b47c00, cur 1577511182 expire 1577511032 last 1577510955
[1525220.995119] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[1525221.001825] Lustre: Skipped 1 previous similar message
[1525635.657336] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[1525635.664050] Lustre: Skipped 1 previous similar message
[1525674.901816] Lustre: fir-MDT0000: haven't heard from client e179f00f-99d3-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff884c0f680400, cur 1577511905 expire 1577511755 last 1577511678
[1525674.921881] Lustre: Skipped 1 previous similar message
[1526089.835883] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[1526089.842590] Lustre: Skipped 1 previous similar message
[1526164.901705] Lustre: fir-MDT0000: haven't heard from client 7eff80bc-fe00-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885f9415e800, cur 1577512395 expire 1577512245 last 1577512168
[1526164.921764] Lustre: Skipped 1 previous similar message
[1526744.904906] Lustre: fir-MDT0000: haven't heard from client ab74125f-f9c3-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886d872a6400, cur 1577512975 expire 1577512825 last 1577512748
[1526744.924964] Lustre: Skipped 1 previous similar message
[1526900.189616] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[1526900.196361] Lustre: Skipped 1 previous similar message
[1535581.968705] Lustre: fir-MDT0000: haven't heard from client c6b36f94-67b7-7ffb-733c-d0e83ca0d57f (at 10.9.106.17@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88757d3e8c00, cur 1577521812 expire 1577521662 last 1577521585
[1535581.990686] Lustre: Skipped 1 previous similar message
[1535590.986901] Lustre: MGS: haven't heard from client 1cf694c2-2ec1-ed70-4a09-60c479c0648c (at 10.9.106.17@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bbc704000, cur 1577521821 expire 1577521671 last 1577521594
[1537830.014538] Lustre: MGS: Connection restored to (at 10.9.106.17@o2ib4)
[1537830.021336] Lustre: Skipped 1 previous similar message
[1538761.765961] Lustre: MGS: Connection restored to 1cc29b5c-8c91-e918-da08-fd0e6933bc8c (at 10.8.26.28@o2ib6)
[1538761.775789] Lustre: Skipped 1 previous similar message
[1539349.685800] Lustre: MGS: Connection restored to 29e1aebf-1670-cd17-c1cf-79c8ab624d16 (at 10.8.31.6@o2ib6)
[1539349.695547] Lustre: Skipped 1 previous similar message
[1540488.573286] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[1540488.579990] Lustre: Skipped 1 previous similar message
[1540525.997702] Lustre: fir-MDT0000: haven't heard from client 1c8ff0c5-69a3-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887396709c00, cur 1577526756 expire 1577526606 last 1577526529
[1540690.085701] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[1540690.092416] Lustre: Skipped 1 previous similar message
[1540725.998909] Lustre: fir-MDT0000: haven't heard from client 0c82a016-9f6d-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885d276bb800, cur 1577526956 expire 1577526806 last 1577526729
[1540726.018966] Lustre: Skipped 1 previous similar message
[1540742.080336] Lustre: MGS: haven't heard from client 47975ee6-092f-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885088e81000, cur 1577526972 expire 1577526822 last 1577526745
[1541094.036842] Lustre: fir-MDT0000: haven't heard from client 6de19752-bac3-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885d276bc000, cur 1577527324 expire 1577527174 last 1577527097
[1541759.921128] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[1541759.927837] Lustre: Skipped 1 previous similar message
[1544161.027030] Lustre: fir-MDT0000: haven't heard from client 8352cb37-7608-ba50-57aa-697129031d38 (at 10.9.104.60@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba97b5800, cur 1577530391 expire 1577530241 last 1577530164
[1544161.048990] Lustre: Skipped 1 previous similar message
[1544979.031906] Lustre: MGS: haven't heard from client 9d9b4656-2a4f-59ae-37cb-6c69fadf6b22 (at 10.9.108.42@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bf7b46000, cur 1577531209 expire 1577531059 last 1577530982
[1544979.053180] Lustre: Skipped 11 previous similar messages
[1544992.028673] Lustre: fir-MDT0000: haven't heard from client cada385f-07ac-4f31-693e-b06b0f8d5686 (at 10.9.108.42@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ace3da800, cur 1577531222 expire 1577531072 last 1577530995
[1544992.050645] Lustre: Skipped 1 previous similar message
[1546443.343942] Lustre: MGS: Connection restored to 8660dc7a-172c-047f-9f20-55fab5f17314 (at 10.9.104.57@o2ib4)
[1546443.353855] Lustre: Skipped 1 previous similar message
[1546468.987310] Lustre: MGS: Connection restored to (at 10.9.104.69@o2ib4)
[1546468.994104] Lustre: Skipped 1 previous similar message
[1546470.188664] Lustre: fir-MDT0000: Connection restored to 563424bc-ee7d-9050-ca9d-8df1f4bae750 (at 10.9.104.63@o2ib4)
[1546470.199276] Lustre: Skipped 2 previous similar messages
[1546472.780352] Lustre: MGS: Connection restored to (at 10.9.104.60@o2ib4)
[1546478.300627] Lustre: MGS: Connection restored to c4f678a2-1dd2-d62e-4ea7-95949b75ef2e (at 10.9.104.58@o2ib4)
[1546478.310538] Lustre: Skipped 1 previous similar message
[1546486.783378] Lustre: MGS: Connection restored to (at 10.9.104.59@o2ib4)
[1546486.790173] Lustre: Skipped 1 previous similar message
[1546505.828564] Lustre: MGS: Connection restored to 42e53ca1-9e61-f63a-3561-4c5cbd63896a (at 10.9.104.52@o2ib4)
[1546505.838481] Lustre: Skipped 1 previous similar message
[1547287.612476] Lustre: MGS: Connection restored to (at 10.9.109.1@o2ib4)
[1547287.619188] Lustre: Skipped 3 previous similar messages
[1547321.713929] Lustre: MGS: Connection restored to cada385f-07ac-4f31-693e-b06b0f8d5686 (at 10.9.108.42@o2ib4)
[1547321.723855] Lustre: Skipped 1 previous similar message
[1550142.072642] Lustre: MGS: haven't heard from client 09a12266-a819-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887402e75400, cur 1577536372 expire 1577536222 last 1577536145
[1550142.092016] Lustre: Skipped 1 previous similar message
[1550259.209515] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[1550259.216225] Lustre: Skipped 1 previous similar message
[1551609.988483] Lustre: MGS: Connection restored to 135a9cfe-4817-5326-7c6d-f991149294ef (at 10.8.23.28@o2ib6)
[1551609.998315] Lustre: Skipped 1 previous similar message
[1551619.831374] Lustre: MGS: Connection restored to (at 10.8.23.30@o2ib6)
[1551619.838079] Lustre: Skipped 1 previous similar message
[1551640.477041] Lustre: MGS: Connection restored to 34a24f1a-b283-f294-9e57-b82db6a433ab (at 10.8.30.8@o2ib6)
[1551640.486786] Lustre: Skipped 1 previous similar message
[1552122.058671] Lustre: MGS: Connection restored to (at 10.8.30.30@o2ib6)
[1552122.065382] Lustre: Skipped 1 previous similar message
[1554217.107925] Lustre: MGS: haven't heard from client a673fd8d-d8aa-cf8d-1254-cef5d68d5fdf (at 10.9.109.56@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bcaf8ec00, cur 1577540447 expire 1577540297 last 1577540220
[1554217.129198] Lustre: Skipped 1 previous similar message
[1555203.101982] Lustre: MGS: haven't heard from client d70012f5-9913-0f25-3077-96e51adec4dc (at 10.9.110.47@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885be7149800, cur 1577541433 expire 1577541283 last 1577541206
[1555203.123249] Lustre: Skipped 1 previous similar message
[1556323.110864] Lustre: MGS: haven't heard from client 9461c31c-60ee-dc79-dc05-429f950c40cb (at 10.9.107.66@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885beab9d800, cur 1577542553 expire 1577542403 last 1577542326
[1556323.132137] Lustre: Skipped 1 previous similar message
[1556339.112113] Lustre: fir-MDT0000: haven't heard from client 56ef913c-ebdd-0a4e-50bc-9edd16b9742f (at 10.9.107.66@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88750cbb5400, cur 1577542569 expire 1577542419 last 1577542342
[1556469.923478] Lustre: MGS: Connection restored to 2554762e-83d1-1b7e-74ee-ddb72b35235a (at 10.9.109.56@o2ib4)
[1556469.933394] Lustre: Skipped 1 previous similar message
[1556748.112127] Lustre: fir-MDT0000: haven't heard from client 24e9e2b8-c701-7886-1991-a2238348e3e1 (at 10.9.108.11@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8879d1e84000, cur 1577542978 expire 1577542828 last 1577542751
[1556751.149370] Lustre: MGS: haven't heard from client b9a7d2a4-77d7-1136-800f-a2753e72f14a (at 10.9.108.11@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bf4fdbc00, cur 1577542981 expire 1577542831 last 1577542754
[1556751.170638] Lustre: Skipped 1 previous similar message
[1557058.257318] Lustre: MGS: Connection restored to df232092-858e-a632-396d-0cfff0b9daea (at 10.9.110.47@o2ib4)
[1557058.267237] Lustre: Skipped 1 previous similar message
[1558574.559723] Lustre: MGS: Connection restored to 56ef913c-ebdd-0a4e-50bc-9edd16b9742f (at 10.9.107.66@o2ib4)
[1558574.569638] Lustre: Skipped 1 previous similar message
[1558965.233017] Lustre: MGS: Connection restored to (at 10.9.108.9@o2ib4)
[1558965.239728] Lustre: Skipped 1 previous similar message
[1559132.420091] Lustre: MGS: Connection restored to 24e9e2b8-c701-7886-1991-a2238348e3e1 (at 10.9.108.11@o2ib4)
[1559132.430007] Lustre: Skipped 1 previous similar message
[1561924.147668] Lustre: fir-MDT0000: haven't heard from client e19e1947-897d-03aa-f267-2edb615db310 (at 10.9.110.41@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ace3dcc00, cur 1577548154 expire 1577548004 last 1577547927
[1561924.169637] Lustre: Skipped 1 previous similar message
[1562279.150050] Lustre: fir-MDT0000: haven't heard from client 08c110e5-7506-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88689c763400, cur 1577548509 expire 1577548359 last 1577548282
[1562279.170107] Lustre: Skipped 1 previous similar message
[1562422.091443] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[1562422.098154] Lustre: Skipped 1 previous similar message
[1562617.551093] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6)
[1562617.557807] Lustre: Skipped 1 previous similar message
[1562700.152960] Lustre: fir-MDT0000: haven't heard from client a48ca926-0fa0-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff884bc3c81000, cur 1577548930 expire 1577548780 last 1577548703
[1562700.173021] Lustre: Skipped 1 previous similar message
[1563925.161201] Lustre: fir-MDT0000: haven't heard from client 808399ce-03a4-cdf4-bafc-aeb40e7dfde4 (at 10.9.107.50@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88784209fc00, cur 1577550155 expire 1577550005 last 1577549928
[1563925.183175] Lustre: Skipped 1 previous similar message
[1564031.396360] Lustre: MGS: Connection restored to (at 10.9.110.41@o2ib4)
[1564031.403160] Lustre: Skipped 1 previous similar message
[1566211.146274] Lustre: MGS: Connection restored to 808399ce-03a4-cdf4-bafc-aeb40e7dfde4 (at 10.9.107.50@o2ib4)
[1566211.156199] Lustre: Skipped 1 previous similar message
[1571686.076897] Lustre: MGS: Connection restored to 840378ea-a823-7fff-6a74-a55368c8e575 (at 10.8.24.23@o2ib6)
[1571686.086739] Lustre: Skipped 1 previous similar message
[1573723.220686] LNetError: 38663:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[1580617.749439] Lustre: MGS: Connection restored to 5dffb121-cc8a-d916-c96d-21b375fa8f4e (at 10.8.31.9@o2ib6)
[1580617.759190] Lustre: Skipped 1 previous similar message
[1585038.275019] Lustre: fir-MDT0000: haven't heard from client 83bdb7e7-1550-dd14-c26d-076e24050573 (at 10.9.101.20@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ab57fd400, cur 1577571268 expire 1577571118 last 1577571041
[1585038.297008] Lustre: Skipped 1 previous similar message
[1586591.304551] Lustre: MGS: haven't heard from client 3cdd987d-3ab8-94ae-4f26-a2bd12fc6d50 (at 10.9.107.59@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bdc619c00, cur 1577572821 expire 1577572671 last 1577572594
[1586591.325826] Lustre: Skipped 1 previous similar message
[1586610.284339] Lustre: fir-MDT0000: haven't heard from client 9e147519-9132-6153-cb42-0517beb227da (at 10.9.107.61@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8874047ca000, cur 1577572840 expire 1577572690 last 1577572613
[1586610.306309] Lustre: Skipped 2 previous similar messages
[1586775.948725] Lustre: MGS: Connection restored to 83bdb7e7-1550-dd14-c26d-076e24050573 (at 10.9.101.20@o2ib4)
[1586775.958684] Lustre: Skipped 1 previous similar message
[1588784.467675] Lustre: MGS: Connection restored to (at 10.9.107.61@o2ib4)
[1588784.474507] Lustre: Skipped 1 previous similar message
[1588786.452351] Lustre: MGS: Connection restored to (at 10.9.107.60@o2ib4)
[1588786.459161] Lustre: Skipped 1 previous similar message
[1591999.929613] Lustre: MGS: Connection restored to e6dbb046-ac72-ee9d-92ee-f9690d53c3a4 (at 10.8.25.6@o2ib6)
[1591999.939354] Lustre: Skipped 3 previous similar messages
[1592002.875871] Lustre: MGS: Connection restored to 718ff980-4577-e880-bf4e-a1496ebce1be (at 10.8.24.6@o2ib6)
[1592002.885613] Lustre: Skipped 1 previous similar message
[1592005.499170] Lustre: MGS: Connection restored to 39ba4a85-ac21-2f62-35cd-76e4d8eb6548 (at 10.8.24.36@o2ib6)
[1592005.508999] Lustre: Skipped 1 previous similar message
[1592009.719501] Lustre: MGS: Connection restored to b9f97646-4f54-942e-8dc3-edbc1a6b5b4d (at 10.8.25.22@o2ib6)
[1592009.729338] Lustre: Skipped 1 previous similar message
[1592015.413626] Lustre: MGS: Connection restored to (at 10.8.30.25@o2ib6)
[1592015.420338] Lustre: Skipped 1 previous similar message
[1592520.386674] Lustre: MGS: Connection restored to d30d2da1-5a39-6a53-6def-eb7c150e8cb6 (at 10.8.31.1@o2ib6)
[1592520.396426] Lustre: Skipped 5 previous similar messages
[1592536.480791] Lustre: fir-MDT0000: Connection restored to eefbe848-d2e0-7e89-29f0-96f7490c6623 (at 10.8.30.15@o2ib6)
[1592536.491328] Lustre: Skipped 2 previous similar messages
[1592539.320018] Lustre: fir-MDT0000: haven't heard from client e6122ed7-ba94-2cf0-06a2-d452873eaed0 (at 10.9.105.3@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887762093800, cur 1577578769 expire 1577578619 last 1577578542
[1592539.341899] Lustre: Skipped 2 previous similar messages
[1592573.834943] Lustre: MGS: Connection restored to (at 10.8.30.17@o2ib6)
[1592573.841649] Lustre: Skipped 6 previous similar messages
[1593582.857966] Lustre: MGS: Connection restored to (at 10.9.114.1@o2ib4)
[1593582.864684] Lustre: Skipped 5 previous similar messages
[1593647.327306] Lustre: fir-MDT0000: haven't heard from client 1a183d06-fa1a-4 (at 10.9.114.1@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8862393e8800, cur 1577579877 expire 1577579727 last 1577579650
[1593647.347402] Lustre: Skipped 1 previous similar message
[1594869.058106] Lustre: MGS: Connection restored to e6122ed7-ba94-2cf0-06a2-d452873eaed0 (at 10.9.105.3@o2ib4)
[1594869.067936] Lustre: Skipped 1 previous similar message
[1595909.343562] Lustre: fir-MDT0000: haven't heard from client d8a07f77-ab6a-3cfc-cae0-aecee82b5ebd (at 10.9.108.41@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8879bd2f3400, cur 1577582139 expire 1577581989 last 1577581912
[1595909.365531] Lustre: Skipped 1 previous similar message
[1597702.550987] Lustre: MGS: Connection restored to 72b66a84-eb6d-8862-b24a-97d6ffec93b7 (at 10.8.24.22@o2ib6)
[1597702.560824] Lustre: Skipped 1 previous similar message
[1597899.178183] Lustre: MGS: Connection restored to 72b66a84-eb6d-8862-b24a-97d6ffec93b7 (at 10.8.24.22@o2ib6)
[1597899.188014] Lustre: Skipped 1 previous similar message
[1597921.006456] Lustre: MGS: Connection restored to (at 10.8.20.2@o2ib6)
[1597921.013076] Lustre: Skipped 1 previous similar message
[1598245.777465] Lustre: MGS: Connection restored to a04806b0-197f-6813-7c37-ce6427bd9a29 (at 10.8.24.3@o2ib6)
[1598245.787206] Lustre: Skipped 1 previous similar message
[1598276.487895] Lustre: MGS: Connection restored to d8a07f77-ab6a-3cfc-cae0-aecee82b5ebd (at 10.9.108.41@o2ib4)
[1598276.497817] Lustre: Skipped 1 previous similar message
[1598323.215667] Lustre: MGS: Connection restored to f9a312c9-9b7f-c1d3-3d88-73bb3684b0e1 (at 10.8.30.18@o2ib6)
[1598323.225499] Lustre: Skipped 1 previous similar message
[1598413.360918] Lustre: fir-MDT0000: haven't heard from client 0bc6830e-7223-d847-5a33-34e33258d412 (at 10.9.110.48@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887b9f709000, cur 1577584643 expire 1577584493 last 1577584416
[1598413.382881] Lustre: Skipped 1 previous similar message
[1600558.688251] Lustre: MGS: Connection restored to 0bc6830e-7223-d847-5a33-34e33258d412 (at 10.9.110.48@o2ib4)
[1600558.698167] Lustre: Skipped 1 previous similar message
[1601067.378861] Lustre: fir-MDT0000: haven't heard from client 8265f765-14f0-9a2d-df96-329ae652d148 (at 10.9.108.35@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ab2e9a800, cur 1577587297 expire 1577587147 last 1577587070
[1601067.400828] Lustre: Skipped 1 previous similar message
[1601646.382621] Lustre: fir-MDT0000: haven't heard from client 83be24ed-ef36-c298-4c93-73347c93a212 (at 10.9.106.26@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ace3da000, cur 1577587876 expire 1577587726 last 1577587649
[1601646.404587] Lustre: Skipped 1 previous similar message
[1603243.831972] Lustre: MGS: Connection restored to 8265f765-14f0-9a2d-df96-329ae652d148 (at 10.9.108.35@o2ib4)
[1603243.841885] Lustre: Skipped 1 previous similar message
[1603832.008352] Lustre: MGS: Connection restored to (at 10.9.106.26@o2ib4)
[1603832.015148] Lustre: Skipped 1 previous similar message
[1603930.663156] Lustre: MGS: Connection restored to d8aa6f82-54cf-b7b6-c533-7761ac172b8e (at 10.9.108.6@o2ib4)
[1603930.672990] Lustre: Skipped 1 previous similar message
[1604061.073233] Lustre: MGS: Connection restored to (at 10.9.108.4@o2ib4)
[1604061.079945] Lustre: Skipped 1 previous similar message
[1605251.406462] Lustre: fir-MDT0000: haven't heard from client 5958f30d-ed92-2ec9-851c-e5cf88e35f18 (at 10.8.27.20@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887be3519000, cur 1577591481 expire 1577591331 last 1577591254
[1605251.428341] Lustre: Skipped 5 previous similar messages
[1606797.140581] Lustre: MGS: Connection restored to 5958f30d-ed92-2ec9-851c-e5cf88e35f18 (at 10.8.27.20@o2ib6)
[1606797.150418] Lustre: Skipped 1 previous similar message
[1611009.449255] Lustre: fir-MDT0000: haven't heard from client cbb61929-f899-d935-0c8b-4bd0be9bb816 (at 10.9.109.7@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88741a353c00, cur 1577597239 expire 1577597089 last 1577597012
[1611009.471152] Lustre: Skipped 1 previous similar message
[1612983.834262] Lustre: MGS: Connection restored to 054634c1-3550-4d84-f85e-c51d51cde98e (at 10.8.31.2@o2ib6)
[1612983.844002] Lustre: Skipped 1 previous similar message
[1613407.672831] Lustre: MGS: Connection restored to cbb61929-f899-d935-0c8b-4bd0be9bb816 (at 10.9.109.7@o2ib4)
[1613407.682663] Lustre: Skipped 1 previous similar message
[1613541.485245] Lustre: fir-MDT0000: haven't heard from client f07c5f6c-7b37-2e06-010b-5b9cd07d02b4 (at 10.9.110.36@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88794e3f6000, cur 1577599771 expire 1577599621 last 1577599544
[1613541.507203] Lustre: Skipped 1 previous similar message
[1613796.507233] Lustre: MGS: Connection restored to 60dc91c7-b103-c782-f03e-90ce05aeafeb (at 10.8.13.26@o2ib6)
[1613796.517059] Lustre: Skipped 1 previous similar message
[1615610.407438] Lustre: MGS: Connection restored to f07c5f6c-7b37-2e06-010b-5b9cd07d02b4 (at 10.9.110.36@o2ib4)
[1615610.417349] Lustre: Skipped 1 previous similar message
[1615635.951939] Lustre: MGS: Connection restored to 948062a4-b619-897a-6d1c-0608082320be (at 10.9.110.35@o2ib4)
[1615635.961852] Lustre: Skipped 1 previous similar message
[1615878.092177] Lustre: MGS: Connection restored to 219ce4b2-a1b0-9f16-3eb9-b36c8cbbf77b (at 10.9.108.14@o2ib4)
[1615878.102094] Lustre: Skipped 1 previous similar message
[1615934.459970] Lustre: MGS: Connection restored to (at 10.9.108.16@o2ib4)
[1615934.466775] Lustre: Skipped 1 previous similar message
[1618827.498220] Lustre: fir-MDT0000: haven't heard from client d6522fa0-c114-4 (at 10.9.109.37@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8854fe7e9000, cur 1577605057 expire 1577604907 last 1577604830
[1618827.518372] Lustre: Skipped 7 previous similar messages
[1618863.112081] Lustre: MGS: Connection restored to (at 10.9.109.37@o2ib4)
[1618863.118875] Lustre: Skipped 1 previous similar message
[1620778.514761] Lustre: fir-MDT0000: haven't heard from client 23d7531b-4942-4 (at 10.9.110.47@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888b8af70800, cur 1577607008 expire 1577606858 last 1577606781
[1620778.534903] Lustre: Skipped 1 previous similar message
[1621863.517399] Lustre: fir-MDT0000: haven't heard from client 45f5bb90-2004-eb20-ac18-a9c17e461d42 (at 10.9.109.55@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88744ee71000, cur 1577608093 expire 1577607943 last 1577607866
[1621863.539364] Lustre: Skipped 45 previous similar messages
[1622198.397762] Lustre: MGS: Connection restored to (at 10.9.110.14@o2ib4)
[1622198.404558] Lustre: Skipped 1 previous similar message
[1622223.193842] Lustre: MGS: Connection restored to 628ef2b2-a10c-b5aa-c89a-d9e59a7dce2e (at 10.9.107.25@o2ib4)
[1622223.203759] Lustre: Skipped 1 previous similar message
[1622273.122173] Lustre: MGS: Connection restored to ba6d6b80-84f0-aa6a-2ba3-b0d9fb94a304 (at 10.9.109.40@o2ib4)
[1622273.132092] Lustre: Skipped 1 previous similar message
[1622289.984026] Lustre: MGS: Connection restored to 1311b8bc-210d-a85c-a936-de2aa4639560 (at 10.9.109.30@o2ib4)
[1622289.993980] Lustre: Skipped 1 previous similar message
[1622331.096358] Lustre: MGS: Connection restored to 882378af-0b41-73ee-5c10-5cc51464645c (at 10.9.108.22@o2ib4)
[1622331.106275] Lustre: Skipped 1 previous similar message
[1622451.419666] Lustre: MGS: Connection restored to 9622ebd9-08dd-84f5-187b-b07758b1dd55 (at 10.9.103.48@o2ib4)
[1622451.429587] Lustre: Skipped 1 previous similar message
[1622470.165992] Lustre: MGS: Connection restored to (at 10.9.110.54@o2ib4)
[1622470.172785] Lustre: Skipped 7 previous similar messages
[1622569.521252] Lustre: MGS: Connection restored to (at 10.9.107.27@o2ib4)
[1622569.528043] Lustre: Skipped 5 previous similar messages
[1622638.560400] Lustre: MGS: Connection restored to b28c253e-3041-3544-86e5-3ee759d202d3 (at 10.9.109.24@o2ib4)
[1622638.570316] Lustre: Skipped 5 previous similar messages
[1624109.835769] Lustre: MGS: Connection restored to 45f5bb90-2004-eb20-ac18-a9c17e461d42 (at 10.9.109.55@o2ib4)
[1624109.845686] Lustre: Skipped 15 previous similar messages
[1646868.668860] Lustre: fir-MDT0000: haven't heard from client 515141cc-68cc-d451-ee11-13fe464f05cb (at 10.9.106.36@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887a51f76c00, cur 1577633098 expire 1577632948 last 1577632871
[1646868.690828] Lustre: Skipped 1 previous similar message
[1649097.954139] Lustre: MGS: Connection restored to 515141cc-68cc-d451-ee11-13fe464f05cb (at 10.9.106.36@o2ib4)
[1649097.964062] Lustre: Skipped 1 previous similar message
[1650287.476842] LNetError: 38670:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[1652732.705414] Lustre: fir-MDT0000: haven't heard from client 125e518d-eeb8-4 (at 10.9.109.40@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888993f52c00, cur 1577638962 expire 1577638812 last 1577638735
[1652732.725562] Lustre: Skipped 1 previous similar message
[1652808.704851] Lustre: fir-MDT0000: haven't heard from client 9361ac61-78b9-6a42-6f92-71cba3d614d3 (at 10.8.21.6@o2ib6) in 187 seconds. I think it's dead, and I am evicting it. exp ffff887b9f70a800, cur 1577639038 expire 1577638888 last 1577638851
[1652808.726653] Lustre: Skipped 9 previous similar messages
[1652848.755457] Lustre: MGS: haven't heard from client d0f67b07-12f7-c2a9-5b86-d5fe8115c0bc (at 10.8.21.6@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888be56dd800, cur 1577639078 expire 1577638928 last 1577638851
[1652848.776569] Lustre: Skipped 17 previous similar messages
[1654157.937376] Lustre: MGS: Connection restored to (at 10.9.110.14@o2ib4)
[1654157.944183] Lustre: Skipped 1 previous similar message
[1654242.943720] Lustre: MGS: Connection restored to ba6d6b80-84f0-aa6a-2ba3-b0d9fb94a304 (at 10.9.109.40@o2ib4)
[1654242.953636] Lustre: Skipped 1 previous similar message
[1654389.975005] Lustre: MGS: Connection restored to 3cf7b01d-62ec-c5c8-882c-2b9bee30f26f (at 10.8.22.8@o2ib6)
[1654389.984761] Lustre: Skipped 1 previous similar message
[1654394.932964] Lustre: MGS: Connection restored to ff8445d1-f99d-03b2-7c66-3abfa27fa6d1 (at 10.8.27.23@o2ib6)
[1654394.942825] Lustre: Skipped 1 previous similar message
[1654407.254417] Lustre: MGS: Connection restored to 9361ac61-78b9-6a42-6f92-71cba3d614d3 (at 10.8.21.6@o2ib6)
[1654407.264177] Lustre: Skipped 3 previous similar messages
[1654430.159246] Lustre: MGS: Connection restored to 0c6eefbc-9819-190e-86e1-f92a98686467 (at 10.8.23.19@o2ib6)
[1654430.169089] Lustre: Skipped 3 previous similar messages
[1654455.692860] Lustre: MGS: Connection restored to 628ef2b2-a10c-b5aa-c89a-d9e59a7dce2e (at 10.9.107.25@o2ib4)
[1654455.702785] Lustre: Skipped 3 previous similar messages
[1654543.184027] Lustre: MGS: Connection restored to 1311b8bc-210d-a85c-a936-de2aa4639560 (at 10.9.109.30@o2ib4)
[1654543.193949] Lustre: Skipped 1 previous similar message
[1654638.654939] Lustre: MGS: Connection restored to 882378af-0b41-73ee-5c10-5cc51464645c (at 10.9.108.22@o2ib4)
[1654638.664861] Lustre: Skipped 1 previous similar message
[1654915.455209] Lustre: MGS: Connection restored to 3a889078-7918-0917-5d6a-834ec1407eee (at 10.8.13.28@o2ib6)
[1654915.465051] Lustre: Skipped 1 previous similar message
[1655221.153630] Lustre: MGS: Connection restored to ac2f8831-eb52-3a99-b876-6ffdc2f892de (at 10.9.106.16@o2ib4)
[1655221.163554] Lustre: Skipped 19 previous similar messages
[1655324.719677] Lustre: fir-MDT0000: haven't heard from client 6daf0e91-8ab6-6a75-03a8-6b93f26e37d2 (at 10.8.7.8@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887bf9da7400, cur 1577641554 expire 1577641404 last 1577641327
[1655324.741379] Lustre: Skipped 17 previous similar messages
[1656410.770310] Lustre: MGS: Connection restored to (at 10.9.104.11@o2ib4)
[1656410.777110] Lustre: Skipped 1 previous similar message
[1675222.337336] Lustre: 109713:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1675222.880962] Lustre: 109616:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1675222.892447] Lustre: 109616:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 1 previous similar message
[1675223.997308] Lustre: 109616:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1675224.008786] Lustre: 109616:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 4 previous similar messages
[1675226.085478] Lustre: 109564:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1675226.096958] Lustre: 109564:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 9 previous similar messages
[1675230.271938] Lustre: 109691:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1675230.283424] Lustre: 109691:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 20 previous similar messages
[1675238.374011] Lustre: 109761:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1675238.385486] Lustre: 109761:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 36 previous similar messages
[1675254.483811] Lustre: 109691:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1675254.495296] Lustre: 109691:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 89 previous similar messages
[1675286.574116] Lustre: 109671:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1675286.585593] Lustre: 109671:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 164 previous similar messages
[1675350.680762] Lustre: 109604:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1675350.692249] Lustre: 109604:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 330 previous similar messages
[1675478.820127] Lustre: 109671:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1675478.831616] Lustre: 109671:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 1050 previous similar messages
[1675734.994896] Lustre: 109540:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1675735.006377] Lustre: 109540:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 1604 previous similar messages
[1676247.087365] Lustre: 109572:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1676247.098841] Lustre: 109572:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 3621 previous similar messages
[1676847.109961] Lustre: 109572:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1676847.121450] Lustre: 109572:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 3508 previous similar messages
[1677447.136978] Lustre: 109604:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1677447.148474] Lustre: 109604:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 3373 previous similar messages
[1678047.374654] Lustre: 109572:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1678047.386134] Lustre: 109572:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 3310 previous similar messages
[1678647.532230] Lustre: 109146:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1678647.543710] Lustre: 109146:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 3224 previous similar messages
[1679247.681763] Lustre: 109626:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1679247.693246] Lustre: 109626:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 3123 previous similar messages
[1679847.826645] Lustre: 109619:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1679847.838160] Lustre: 109619:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 3102 previous similar messages
[1680447.901346] Lustre: 109616:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1680447.912826] Lustre: 109616:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 3445 previous similar messages
[1681047.941843] Lustre: 109761:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1681047.953331] Lustre: 109761:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 3528 previous similar messages
[1681647.984394] Lustre: 109616:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1681647.995872] Lustre: 109616:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 5579 previous similar messages
[1682248.170890] Lustre: 109572:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1682248.182375] Lustre: 109572:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 3674 previous similar messages
[1682848.522443] Lustre: 109572:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1682848.533927] Lustre: 109572:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 5433 previous similar messages
[1683206.820588] Lustre: MGS: Connection restored to 1fad028f-9e2f-e029-5b31-8f7d8f724daa (at 10.9.105.15@o2ib4)
[1683206.830509] Lustre: Skipped 1 previous similar message
[1683448.555891] Lustre: 109619:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1683448.567374] Lustre: 109619:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 3807 previous similar messages
[1684048.687190] Lustre: 109713:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1684048.698667] Lustre: 109713:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 5897 previous similar messages
[1684648.764478] Lustre: 109540:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1684648.775962] Lustre: 109540:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 6005 previous similar messages
[1685249.344817] Lustre: 109540:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1685249.356350] Lustre: 109540:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 5740 previous similar messages
[1685849.380849] Lustre: 109761:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1685849.392332] Lustre: 109761:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 4349 previous similar messages
[1686449.446901] Lustre: 109540:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1686449.458386] Lustre: 109540:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 4686 previous similar messages
[1687049.631840] Lustre: 109613:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1687049.643313] Lustre: 109613:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 9200 previous similar messages
[1687649.652399] Lustre: 109616:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1687649.663874] Lustre: 109616:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 7716 previous similar messages
[1688249.677592] Lustre: 109540:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[1688249.689068] Lustre: 109540:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 9372 previous similar messages
[1688651.575617] Lustre: MGS: Connection restored to (at 10.9.112.13@o2ib4)
[1688651.582421] Lustre: Skipped 1 previous similar message
[1701441.998302] Lustre: fir-MDT0000: haven't heard from client 9986fccb-33ca-bc62-53bc-352ccf41dbda (at 10.9.109.44@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887bf7af8c00, cur 1577687671 expire 1577687521 last 1577687444
[1701442.020270] Lustre: Skipped 1 previous similar message
[1703740.096103] Lustre: MGS: Connection restored to 9986fccb-33ca-bc62-53bc-352ccf41dbda (at 10.9.109.44@o2ib4)
[1703740.106017] Lustre: Skipped 1 previous similar message
[1727206.455655] LNetError: 38675:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[1733127.056038] Lustre: MGS: Connection restored to 8d53094b-786e-854a-949f-904eb0728008 (at 10.8.26.4@o2ib6)
[1733127.065791] Lustre: Skipped 1 previous similar message
[1733135.189318] Lustre: fir-MDT0000: haven't heard from client 87bfa51f-8184-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88759f71b000, cur 1577719364 expire 1577719214 last 1577719137
[1733135.209302] Lustre: Skipped 1 previous similar message
[1733138.214787] Lustre: MGS: haven't heard from client b7c88612-21f4-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8882a36d5800, cur 1577719367 expire 1577719217 last 1577719140
[1733489.979070] Lustre: MGS: Connection restored to 8d53094b-786e-854a-949f-904eb0728008 (at 10.8.26.4@o2ib6)
[1733489.988817] Lustre: Skipped 1 previous similar message
[1733556.191968] Lustre: fir-MDT0000: haven't heard from client 2b4c2ee2-8aed-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8879b7637400, cur 1577719785 expire 1577719635 last 1577719558
[1733826.193590] Lustre: MGS: Connection restored to 8d53094b-786e-854a-949f-904eb0728008 (at 10.8.26.4@o2ib6)
[1733826.203335] Lustre: Skipped 1 previous similar message
[1733850.193787] Lustre: fir-MDT0000: haven't heard from client 1b05706b-6351-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88637536b000, cur 1577720079 expire 1577719929 last 1577719852
[1733850.213778] Lustre: Skipped 1 previous similar message
[1733869.198760] Lustre: MGS: haven't heard from client 7f711c4a-1dee-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8858bdf6dc00, cur 1577720098 expire 1577719948 last 1577719871
[1734184.461205] Lustre: MGS: Connection restored to 8d53094b-786e-854a-949f-904eb0728008 (at 10.8.26.4@o2ib6)
[1734184.470953] Lustre: Skipped 1 previous similar message
[1734230.196150] Lustre: fir-MDT0000: haven't heard from client 44f76646-2789-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885cc6fa1400, cur 1577720459 expire 1577720309 last 1577720232
[1734562.477866] Lustre: MGS: Connection restored to 8d53094b-786e-854a-949f-904eb0728008 (at 10.8.26.4@o2ib6)
[1734562.487607] Lustre: Skipped 1 previous similar message
[1734588.197291] Lustre: fir-MDT0000: haven't heard from client 4b1ed493-8a84-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886bf5c60c00, cur 1577720817 expire 1577720667 last 1577720590
[1734588.217258] Lustre: Skipped 1 previous similar message
[1734897.513764] Lustre: MGS: Connection restored to 8d53094b-786e-854a-949f-904eb0728008 (at 10.8.26.4@o2ib6)
[1734897.523511] Lustre: Skipped 1 previous similar message
[1734922.200221] Lustre: fir-MDT0000: haven't heard from client fb952db6-4d3e-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88694d11e000, cur 1577721151 expire 1577721001 last 1577720924
[1734922.220204] Lustre: Skipped 1 previous similar message
[1735296.501942] Lustre: MGS: Connection restored to 8d53094b-786e-854a-949f-904eb0728008 (at 10.8.26.4@o2ib6)
[1735296.511683] Lustre: Skipped 1 previous similar message
[1735351.202890] Lustre: fir-MDT0000: haven't heard from client 8fb2828c-3841-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885b81a99400, cur 1577721580 expire 1577721430 last 1577721353
[1735351.222862] Lustre: Skipped 1 previous similar message
[1735756.988294] Lustre: MGS: Connection restored to 8d53094b-786e-854a-949f-904eb0728008 (at 10.8.26.4@o2ib6)
[1735756.998040] Lustre: Skipped 1 previous similar message
[1735801.205448] Lustre: fir-MDT0000: haven't heard from client 9445be38-7cfd-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ab7aeac00, cur 1577722030 expire 1577721880 last 1577721803
[1735801.225423] Lustre: Skipped 1 previous similar message
[1736424.133498] Lustre: MGS: Connection restored to 8d53094b-786e-854a-949f-904eb0728008 (at 10.8.26.4@o2ib6)
[1736424.143251] Lustre: Skipped 1 previous similar message
[1736459.209493] Lustre: fir-MDT0000: haven't heard from client 05d24e72-8b2d-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8869bd7bfc00, cur 1577722688 expire 1577722538 last 1577722461
[1736459.229479] Lustre: Skipped 1 previous similar message
[1736995.326354] Lustre: MGS: Connection restored to 8d53094b-786e-854a-949f-904eb0728008 (at 10.8.26.4@o2ib6)
[1736995.336102] Lustre: Skipped 1 previous similar message
[1737028.211868] Lustre: fir-MDT0000: haven't heard from client 78531409-6513-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff884c25aa4c00, cur 1577723257 expire 1577723107 last 1577723030
[1737028.231842] Lustre: Skipped 1 previous similar message
[1737495.045664] Lustre: MGS: Connection restored to 8d53094b-786e-854a-949f-904eb0728008 (at 10.8.26.4@o2ib6)
[1737495.055410] Lustre: Skipped 1 previous similar message
[1738049.235406] Lustre: MGS: haven't heard from client 15002a18-b9ed-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff884e7fcb4c00, cur 1577724278 expire 1577724128 last 1577724051
[1738049.254687] Lustre: Skipped 3 previous similar messages
[1738405.784910] Lustre: MGS: Connection restored to 8d53094b-786e-854a-949f-904eb0728008 (at 10.8.26.4@o2ib6)
[1738405.794659] Lustre: Skipped 3 previous similar messages
[1738910.222888] Lustre: fir-MDT0000: haven't heard from client 37e508ac-424c-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887abdf97000, cur 1577725139 expire 1577724989 last 1577724912
[1738910.242880] Lustre: Skipped 3 previous similar messages
[1739434.689590] Lustre: MGS: Connection restored to 8d53094b-786e-854a-949f-904eb0728008 (at 10.8.26.4@o2ib6)
[1739434.699332] Lustre: Skipped 3 previous similar messages
[1744022.305444] Lustre: MGS: Connection restored to 8d53094b-786e-854a-949f-904eb0728008 (at 10.8.26.4@o2ib6)
[1744022.315194] Lustre: Skipped 1 previous similar message
[1744053.252961] Lustre: fir-MDT0000: haven't heard from client 8e41cec0-f44d-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886553396c00, cur 1577730282 expire 1577730132 last 1577730055
[1744053.272931] Lustre: Skipped 3 previous similar messages
[1754433.546337] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6)
[1754433.553051] Lustre: Skipped 1 previous similar message
[1754450.314729] Lustre: fir-MDT0000: haven't heard from client b10b89f4-5a48-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8864cc65a400, cur 1577740679 expire 1577740529 last 1577740452
[1754450.334789] Lustre: Skipped 1 previous similar message
[1754854.270475] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6)
[1754854.277188] Lustre: Skipped 1 previous similar message
[1754912.317369] Lustre: fir-MDT0000: haven't heard from client 243d01cb-8e07-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8868c77e9c00, cur 1577741141 expire 1577740991 last 1577740914
[1754912.337427] Lustre: Skipped 1 previous similar message
[1755268.288978] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6)
[1755268.295687] Lustre: Skipped 1 previous similar message
[1755308.319699] Lustre: fir-MDT0000: haven't heard from client 6c00da3b-b839-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8889dd3f9000, cur 1577741537 expire 1577741387 last 1577741310
[1755308.339772] Lustre: Skipped 1 previous similar message
[1755715.882213] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6)
[1755715.888923] Lustre: Skipped 1 previous similar message
[1755797.322730] Lustre: fir-MDT0000: haven't heard from client 16386755-e7ae-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887596aa1000, cur 1577742026 expire 1577741876 last 1577741799
[1755797.342783] Lustre: Skipped 1 previous similar message
[1756797.328687] Lustre: fir-MDT0000: haven't heard from client 3207a7bc-316b-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886bbe2af400, cur 1577743026 expire 1577742876 last 1577742799
[1756797.348741] Lustre: Skipped 1 previous similar message
[1756826.979756] LNet: Service thread pid 12703 was inactive for 200.16s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[1756826.996865] Pid: 12703, comm: mdt03_053 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[1756827.007210] Call Trace:
[1756827.009851] [] ldlm_completion_ast+0x430/0x860 [ptlrpc]
[1756827.016972] [] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[1756827.024341] [] mdt_rename_lock+0x24b/0x4b0 [mdt]
[1756827.030823] [] mdt_reint_rename+0x2c5/0x2b90 [mdt]
[1756827.037478] [] mdt_reint_rec+0x83/0x210 [mdt]
[1756827.043722] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[1756827.050472] [] mdt_reint+0x67/0x140 [mdt]
[1756827.056372] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[1756827.063507] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[1756827.071425] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[1756827.077936] [] kthread+0xd1/0xe0
[1756827.083039] [] ret_from_fork_nospec_begin+0xe/0x21
[1756827.089687] [] 0xffffffffffffffff
[1756827.094891] LustreError: dumping log to /tmp/lustre-log.1577743055.12703
[1756827.491762] LNet: Service thread pid 109647 was inactive for 200.38s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[1756827.508955] Pid: 109647, comm: mdt02_027 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[1756827.519388] Call Trace:
[1756827.522029] [] ldlm_completion_ast+0x430/0x860 [ptlrpc]
[1756827.529171] [] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[1756827.536544] [] mdt_rename_lock+0x24b/0x4b0 [mdt]
[1756827.543028] [] mdt_reint_rename+0x2c5/0x2b90 [mdt]
[1756827.549700] [] mdt_reint_rec+0x83/0x210 [mdt]
[1756827.555925] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[1756827.562674] [] mdt_reint+0x67/0x140 [mdt]
[1756827.568566] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[1756827.575703] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[1756827.583622] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[1756827.590131] [] kthread+0xd1/0xe0
[1756827.595219] [] ret_from_fork_nospec_begin+0xe/0x21
[1756827.601866] [] 0xffffffffffffffff
[1756827.607058] LustreError: dumping log to /tmp/lustre-log.1577743056.109647
[1756830.051776] LNet: Service thread pid 109600 was inactive for 200.50s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[1756830.068973] Pid: 109600, comm: mdt02_017 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[1756830.079407] Call Trace:
[1756830.082045] [] ldlm_completion_ast+0x430/0x860 [ptlrpc]
[1756830.089158] [] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[1756830.096542] [] mdt_rename_lock+0x24b/0x4b0 [mdt]
[1756830.103026] [] mdt_reint_rename+0x2c5/0x2b90 [mdt]
[1756830.109696] [] mdt_reint_rec+0x83/0x210 [mdt]
[1756830.115920] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[1756830.122665] [] mdt_reint+0x67/0x140 [mdt]
[1756830.128533] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[1756830.135654] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[1756830.143543] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[1756830.150042] [] kthread+0xd1/0xe0
[1756830.155128] [] ret_from_fork_nospec_begin+0xe/0x21
[1756830.161778] [] 0xffffffffffffffff
[1756830.166985] LustreError: dumping log to /tmp/lustre-log.1577743058.109600
[1756830.174579] Pid: 109718, comm: mdt00_047 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[1756830.185012] Call Trace:
[1756830.187646] [] ldlm_completion_ast+0x430/0x860 [ptlrpc]
[1756830.194753] [] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[1756830.202130] [] mdt_rename_lock+0x24b/0x4b0 [mdt]
[1756830.208605] [] mdt_reint_rename+0x2c5/0x2b90 [mdt]
[1756830.215261] [] mdt_reint_rec+0x83/0x210 [mdt]
[1756830.221474] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[1756830.228207] [] mdt_reint+0x67/0x140 [mdt]
[1756830.234089] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[1756830.241222] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[1756830.249113] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[1756830.255626] [] kthread+0xd1/0xe0
[1756830.260718] [] ret_from_fork_nospec_begin+0xe/0x21
[1756830.267365] [] 0xffffffffffffffff
[1756830.272551] Pid: 109482, comm: mdt02_003 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[1756830.282982] Call Trace:
[1756830.285612] [] ldlm_completion_ast+0x430/0x860 [ptlrpc]
[1756830.292724] [] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[1756830.300099] [] mdt_rename_lock+0x24b/0x4b0 [mdt]
[1756830.306581] [] mdt_reint_rename+0x2c5/0x2b90 [mdt]
[1756830.313253] [] mdt_reint_rec+0x83/0x210 [mdt]
[1756830.319471] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[1756830.326206] [] mdt_reint+0x67/0x140 [mdt]
[1756830.332078] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[1756830.339208] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[1756830.347099] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[1756830.353598] [] kthread+0xd1/0xe0
[1756830.358685] [] ret_from_fork_nospec_begin+0xe/0x21
[1756830.365348] [] 0xffffffffffffffff
[1756832.100784] LNet: Service thread pid 109584 was inactive for 200.15s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[1756832.113905] LNet: Skipped 4 previous similar messages
[1756832.119141] LustreError: dumping log to /tmp/lustre-log.1577743060.109584
[1756833.123798] LustreError: dumping log to /tmp/lustre-log.1577743061.109743
[1756833.635797] LustreError: dumping log to /tmp/lustre-log.1577743062.12697
[1756860.259945] LustreError: dumping log to /tmp/lustre-log.1577743088.109617
[1756891.920387] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6)
[1756891.927097] Lustre: Skipped 1 previous similar message
[1756893.467051] Lustre: fir-MDT0000: Received new LWP connection from 10.0.10.54@o2ib7, removing former export from same NID
[1756898.640695] LNet: Service thread pid 12703 completed after 271.82s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[1756898.657022] LNet: Skipped 6 previous similar messages
[1757806.036837] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6)
[1757806.043558] Lustre: Skipped 3 previous similar messages
[1757847.334459] Lustre: fir-MDT0000: haven't heard from client 6f4b8d90-2a27-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887227b3d400, cur 1577744076 expire 1577743926 last 1577743849
[1757847.354517] Lustre: Skipped 1 previous similar message
[1757967.593958] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6)
[1757967.600667] Lustre: Skipped 1 previous similar message
[1758032.362580] Lustre: MGS: haven't heard from client d1f3dc9b-913d-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88508ef36000, cur 1577744261 expire 1577744111 last 1577744034
[1758032.381941] Lustre: Skipped 1 previous similar message
[1758043.335476] Lustre: fir-MDT0000: haven't heard from client 2b6a226f-726d-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886f8880d400, cur 1577744272 expire 1577744122 last 1577744045
[1758577.186427] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6)
[1758577.193140] Lustre: Skipped 1 previous similar message
[1758586.465000] LustreError: 109713:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) ### client (nid 10.8.23.14@o2ib6) returned error from blocking AST (req@ffff885042adde80 x1652561539353776 status -107 rc -107), evict it ns: mdt-fir-MDT0000_UUID lock: ffff885664da4ec0/0xc3c20c2acc1d421b lrc: 4/0,0 mode: EX/EX res: [0x20003cb1e:0x2:0x0].0x0 bits 0x8/0x0 rrc: 6 type: IBT flags: 0x60000400000020 nid: 10.8.23.14@o2ib6 remote: 0x7492e551ecdaa2db expref: 172 pid: 109642 timeout: 1758724 lvb_type: 3
[1758586.508206] LustreError: 138-a: fir-MDT0000: A client on nid 10.8.23.14@o2ib6 was evicted due to a lock blocking callback time out: rc -107
[1758586.520954] LustreError: 38883:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 0s: evicting client at 10.8.23.14@o2ib6 ns: mdt-fir-MDT0000_UUID lock: ffff885664da4ec0/0xc3c20c2acc1d421b lrc: 3/0,0 mode: EX/EX res: [0x20003cb1e:0x2:0x0].0x0 bits 0x8/0x0 rrc: 6 type: IBT flags: 0x60000400000020 nid: 10.8.23.14@o2ib6 remote: 0x7492e551ecdaa2db expref: 173 pid: 109642 timeout: 0 lvb_type: 3
[1758622.356673] Lustre: MGS: haven't heard from client 10e3a892-a884-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8873d221e400, cur 1577744851 expire 1577744701 last 1577744624
[1759142.074840] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6)
[1759142.081549] Lustre: Skipped 1 previous similar message
[1759167.244307] LustreError: 109564:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) ### client (nid 10.8.23.14@o2ib6) returned error from blocking AST (req@ffff8871ca625580 x1652561547519760 status -107 rc -107), evict it ns: mdt-fir-MDT0000_UUID lock: ffff884cf9ca06c0/0xc3c20c2acf80da25 lrc: 4/0,0 mode: EX/EX res: [0x20003cb21:0x11:0x0].0x0 bits 0x8/0x0 rrc: 6 type: IBT flags: 0x60000400000020 nid: 10.8.23.14@o2ib6 remote: 0xd9f049351f2ed539 expref: 172 pid: 109679 timeout: 1759305 lvb_type: 3
[1759167.244546] LustreError: 138-a: fir-MDT0000: A client on nid 10.8.23.14@o2ib6 was evicted due to a lock blocking callback time out: rc -107
[1759167.244570] LustreError: 38883:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 0s: evicting client at 10.8.23.14@o2ib6 ns: mdt-fir-MDT0000_UUID lock: ffff88651d5e0fc0/0xc3c20c2acf80da8e lrc: 3/0,0 mode: EX/EX res: [0x20003cb21:0x12:0x0].0x0 bits 0x8/0x0 rrc: 6 type: IBT flags: 0x60000400000020 nid: 10.8.23.14@o2ib6 remote: 0xd9f049351f2ed55c expref: 173 pid: 109716 timeout: 0 lvb_type: 3
[1759167.337501] LustreError: 109564:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) Skipped 1 previous similar message
[1759181.362331] Lustre: MGS: haven't heard from client eeb4ebd4-d766-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888be0a48800, cur 1577745410 expire 1577745260 last 1577745183
[1760620.093977] Lustre: 109671:0:(mdd_device.c:1807:mdd_changelog_clear()) fir-MDD0000: Failure to clear the changelog for user 3: -22
[1760620.105911] Lustre: 109671:0:(mdd_device.c:1807:mdd_changelog_clear()) Skipped 54 previous similar messages
[1760620.594632] Lustre: 109626:0:(mdd_device.c:1807:mdd_changelog_clear()) fir-MDD0000: Failure to clear the changelog for user 3: -22
[1760620.606543] Lustre: 109626:0:(mdd_device.c:1807:mdd_changelog_clear()) Skipped 1292 previous similar messages
[1760621.594446] Lustre: 109619:0:(mdd_device.c:1807:mdd_changelog_clear()) fir-MDD0000: Failure to clear the changelog for user 3: -22
[1760621.606357] Lustre: 109619:0:(mdd_device.c:1807:mdd_changelog_clear()) Skipped 2923 previous similar messages
[1760623.594531] Lustre: 109613:0:(mdd_device.c:1807:mdd_changelog_clear()) fir-MDD0000: Failure to clear the changelog for user 3: -22
[1760623.606444] Lustre: 109613:0:(mdd_device.c:1807:mdd_changelog_clear()) Skipped 5438 previous similar messages
[1760627.594466] Lustre: 109556:0:(mdd_device.c:1807:mdd_changelog_clear()) fir-MDD0000: Failure to clear the changelog for user 3: -22
[1760627.606380] Lustre: 109556:0:(mdd_device.c:1807:mdd_changelog_clear()) Skipped 4592 previous similar messages
[1760636.344188] Lustre: 109626:0:(mdd_device.c:1807:mdd_changelog_clear()) fir-MDD0000: Failure to clear the changelog for user 3: -22
[1760636.356105] Lustre: 109626:0:(mdd_device.c:1807:mdd_changelog_clear()) Skipped 3889 previous similar messages
[1760652.771169] Lustre: 109714:0:(mdd_device.c:1807:mdd_changelog_clear()) fir-MDD0000: Failure to clear the changelog for user 3: -22
[1760652.783088] Lustre: 109714:0:(mdd_device.c:1807:mdd_changelog_clear()) Skipped 6361 previous similar messages
[1760684.921974] Lustre: 109679:0:(mdd_device.c:1807:mdd_changelog_clear()) fir-MDD0000: Failure to clear the changelog for user 3: -22
[1760684.933905] Lustre: 109679:0:(mdd_device.c:1807:mdd_changelog_clear()) Skipped 13986 previous similar messages
[1760748.923394] Lustre: 109626:0:(mdd_device.c:1807:mdd_changelog_clear()) fir-MDD0000: Failure to clear the changelog for user 3: -22
[1760748.935313] Lustre: 109626:0:(mdd_device.c:1807:mdd_changelog_clear()) Skipped 28639 previous similar messages
[1762348.401721] Lustre: MGS: Connection restored to (at 10.9.112.14@o2ib4)
[1762348.408517] Lustre: Skipped 1 previous similar message
[1762387.364845] Lustre: fir-MDT0000: haven't heard from client 387bc7a8-980a-4 (at 10.9.112.14@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8859f6d42c00, cur 1577748616 expire 1577748466 last 1577748389
[1771563.434437] Lustre: MGS: haven't heard from client 4a4b1457-991a-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885a55aae400, cur 1577757792 expire 1577757642 last 1577757565
[1771563.453803] Lustre: Skipped 1 previous similar message
[1771574.433049] Lustre: fir-MDT0000: haven't heard from client eb6d2c93-2bb0-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885046659c00, cur 1577757803 expire 1577757653 last 1577757576
[1771774.428791] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6)
[1771774.435496] Lustre: Skipped 1 previous similar message
[1772421.410321] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6)
[1772421.417038] Lustre: Skipped 1 previous similar message
[1772429.435169] Lustre: fir-MDT0000: haven't heard from client 2cf24200-c11c-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886aea3ef000, cur 1577758658 expire 1577758508 last 1577758431
[1772852.317727] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6)
[1772852.324438] Lustre: Skipped 1 previous similar message
[1772900.439295] Lustre: fir-MDT0000: haven't heard from client af04c5c4-921c-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8867c7154400, cur 1577759129 expire 1577758979 last 1577758902
[1772900.459361] Lustre: Skipped 1 previous similar message
[1773373.552929] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6)
[1773373.559642] Lustre: Skipped 1 previous similar message
[1773431.441948] Lustre: fir-MDT0000: haven't heard from client bae6dbe7-d3b0-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886bea0db000, cur 1577759660 expire 1577759510 last 1577759433
[1773431.462010] Lustre: Skipped 1 previous similar message
[1773853.397962] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6)
[1773853.404670] Lustre: Skipped 1 previous similar message
[1773902.445107] Lustre: fir-MDT0000: haven't heard from client 5f6c5ce6-dd4c-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bfa1d5400, cur 1577760131 expire 1577759981 last 1577759904
[1773902.465167] Lustre: Skipped 1 previous similar message
[1774083.166624] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6)
[1774083.173340] Lustre: Skipped 1 previous similar message
[1774156.446854] Lustre: fir-MDT0000: haven't heard from client bd0a9a59-8af6-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88628e2cd800, cur 1577760385 expire 1577760235 last 1577760158
[1774156.466926] Lustre: Skipped 1 previous similar message
[1789504.181685] LNetError: 38673:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[1790863.068049] LNetError: 38670:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[1797890.598415] Lustre: fir-MDT0000: haven't heard from client b1177c82-65c9-4 (at 10.9.110.14@o2ib4) in 227 seconds. I think it's dead, and I am evicting it.
exp ffff884bf73acc00, cur 1577784119 expire 1577783969 last 1577783892 [1797890.618564] Lustre: Skipped 1 previous similar message [1797996.512328] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6) [1797996.519043] Lustre: Skipped 1 previous similar message [1798044.599496] Lustre: fir-MDT0000: haven't heard from client 0b9577d7-365a-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8863e7ae0800, cur 1577784273 expire 1577784123 last 1577784046 [1798044.619548] Lustre: Skipped 5 previous similar messages [1798399.334965] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6) [1798399.341676] Lustre: Skipped 1 previous similar message [1798450.602274] Lustre: fir-MDT0000: haven't heard from client 8b268de3-04ed-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8866f262b800, cur 1577784679 expire 1577784529 last 1577784452 [1798450.622338] Lustre: Skipped 1 previous similar message [1798776.427336] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6) [1798776.434052] Lustre: Skipped 1 previous similar message [1798853.605105] Lustre: fir-MDT0000: haven't heard from client f6f120a8-5d69-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886157f98c00, cur 1577785082 expire 1577784932 last 1577784855 [1798853.625188] Lustre: Skipped 1 previous similar message [1799246.469200] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6) [1799246.475913] Lustre: Skipped 1 previous similar message [1799305.606975] Lustre: fir-MDT0000: haven't heard from client e45743e0-a65d-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8871d1761c00, cur 1577785534 expire 1577785384 last 1577785307 [1799305.627040] Lustre: Skipped 1 previous similar message [1799501.053283] Lustre: MGS: Connection restored to (at 10.8.28.4@o2ib6) [1799501.059908] Lustre: Skipped 1 previous similar message [1799556.537093] Lustre: MGS: Connection restored to (at 10.9.110.14@o2ib4) [1799556.543897] Lustre: Skipped 1 previous similar message [1799657.437866] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6) [1799657.444579] Lustre: Skipped 1 previous similar message [1799700.610631] Lustre: fir-MDT0000: haven't heard from client 29bee2c9-84e5-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bd0a56800, cur 1577785929 expire 1577785779 last 1577785702 [1799700.630692] Lustre: Skipped 1 previous similar message [1800079.398482] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6) [1800079.405194] Lustre: Skipped 3 previous similar messages [1800136.612829] Lustre: fir-MDT0000: haven't heard from client 51a1e39b-c27b-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8867c6827000, cur 1577786365 expire 1577786215 last 1577786138 [1800136.632884] Lustre: Skipped 1 previous similar message [1800474.082355] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6) [1800474.089065] Lustre: Skipped 1 previous similar message [1800533.616442] Lustre: fir-MDT0000: haven't heard from client adca1ee8-28e4-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8856e7b8c800, cur 1577786762 expire 1577786612 last 1577786535 [1800533.636505] Lustre: Skipped 1 previous similar message [1800864.935235] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6) [1800864.941948] Lustre: Skipped 1 previous similar message [1800902.618964] Lustre: fir-MDT0000: haven't heard from client d0c64b55-142c-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff886bc2b26000, cur 1577787131 expire 1577786981 last 1577786904 [1800902.639025] Lustre: Skipped 1 previous similar message [1801293.621667] Lustre: fir-MDT0000: haven't heard from client 362014aa-8f6a-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff884c29baa800, cur 1577787522 expire 1577787372 last 1577787295 [1801293.641729] Lustre: Skipped 1 previous similar message [1801632.167196] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6) [1801632.173905] Lustre: Skipped 3 previous similar messages [1801869.624583] Lustre: fir-MDT0000: haven't heard from client 7f058457-dc07-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8865634f6800, cur 1577788098 expire 1577787948 last 1577787871 [1801869.644637] Lustre: Skipped 3 previous similar messages [1802394.584588] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6) [1802394.591299] Lustre: Skipped 3 previous similar messages [1802898.631924] Lustre: fir-MDT0000: haven't heard from client 9b38ca26-f38c-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8868a5f66800, cur 1577789127 expire 1577788977 last 1577788900 [1802898.651988] Lustre: Skipped 3 previous similar messages [1803426.014606] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6) [1803426.021312] Lustre: Skipped 3 previous similar messages [1803652.636645] Lustre: fir-MDT0000: haven't heard from client ab0acce3-d559-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff88641e6d8c00, cur 1577789881 expire 1577789731 last 1577789654 [1803652.656699] Lustre: Skipped 3 previous similar messages [1804052.714375] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6) [1804052.721088] Lustre: Skipped 3 previous similar messages [1804725.677828] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6) [1804725.684542] Lustre: Skipped 1 previous similar message [1804783.648956] Lustre: fir-MDT0000: haven't heard from client ea0ef2ab-d6b0-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887670afdc00, cur 1577791012 expire 1577790862 last 1577790785 [1804783.669049] Lustre: Skipped 3 previous similar messages [1805409.457083] LNetError: 38663:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5) [1805700.413955] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6) [1805700.420672] Lustre: Skipped 3 previous similar messages [1805702.650168] Lustre: fir-MDT0000: haven't heard from client 7e51e408-0f7c-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8873e0a6b800, cur 1577791931 expire 1577791781 last 1577791704 [1805702.670228] Lustre: Skipped 3 previous similar messages [1806207.367706] Lustre: 109591:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1577792428/real 1577792428] req@ffff8866d04d8480 x1652565217042432/t0(0) o104->fir-MDT0000@10.8.27.7@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1577792435 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [1806214.394743] Lustre: 109591:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1577792435/real 1577792435] req@ffff8866d04d8480 x1652565217042432/t0(0) o104->fir-MDT0000@10.8.27.7@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1577792442 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1806214.422172] Lustre: 109591:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 3 previous similar messages [1806221.411788] Lustre: 109625:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1577792442/real 1577792442] req@ffff887772374800 x1652565217072208/t0(0) o104->fir-MDT0000@10.8.27.8@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1577792449 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1806221.439210] Lustre: 109625:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 4 previous similar messages [1806232.084859] Lustre: 109632:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1577792453/real 1577792453] req@ffff88502e6e7980 x1652565217275584/t0(0) o104->fir-MDT0000@10.8.27.8@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1577792460 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1806232.112283] Lustre: 109632:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 5 previous similar messages [1806249.431965] Lustre: 109591:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1577792470/real 1577792470] req@ffff8866d04d8480 x1652565217042432/t0(0) o104->fir-MDT0000@10.8.27.7@o2ib6:15/16 lens 296/224 e 0 to 1 dl 
1577792477 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1806249.459394] Lustre: 109591:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 12 previous similar messages [1806279.665359] LustreError: 109682:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) ### client (nid 10.8.27.8@o2ib6) failed to reply to blocking AST (req@ffff8856d11abf00 x1652565217275808 status 0 rc -5), evict it ns: mdt-fir-MDT0000_UUID lock: ffff8866017c8480/0xc3c20c2d1912d56f lrc: 4/0,0 mode: PR/PR res: [0x20003aaab:0x1feb9:0x0].0x0 bits 0x1b/0x0 rrc: 5 type: IBT flags: 0x60200400000020 nid: 10.8.27.8@o2ib6 remote: 0x335e15af77cd943c expref: 55 pid: 109716 timeout: 1806362 lvb_type: 0 [1806279.665368] LustreError: 138-a: fir-MDT0000: A client on nid 10.8.27.8@o2ib6 was evicted due to a lock blocking callback time out: rc -5 [1806279.665370] LustreError: Skipped 1 previous similar message [1806279.726380] LustreError: 109682:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) Skipped 2 previous similar messages [1806439.665667] Lustre: MGS: haven't heard from client 018ffde3-fd38-8514-e8f3-12c7445597dc (at 10.9.101.64@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bf8833c00, cur 1577792668 expire 1577792518 last 1577792441 [1806439.686940] Lustre: Skipped 79 previous similar messages [1807234.664148] Lustre: MGS: haven't heard from client f4e8a55c-0e23-04a1-5608-d92917ce0ea9 (at 10.8.24.21@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bdde1e800, cur 1577793463 expire 1577793313 last 1577793236 [1807234.685338] Lustre: Skipped 1 previous similar message [1808086.664785] Lustre: fir-MDT0000: haven't heard from client 32ab1a50-74ca-44b4-212a-e2a408110c8d (at 10.9.101.13@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8879d5676c00, cur 1577794315 expire 1577794165 last 1577794088 [1808086.686746] Lustre: Skipped 13 previous similar messages [1808106.468332] Lustre: MGS: Connection restored to 34b263e7-c235-6737-be01-1bc0ec67d622 (at 10.9.117.33@o2ib4) [1808106.478252] Lustre: Skipped 3 previous similar messages [1808337.659966] Lustre: MGS: Connection restored to (at 10.9.117.15@o2ib4) [1808337.666758] Lustre: Skipped 11 previous similar messages [1808500.767870] Lustre: MGS: Connection restored to 1c5cd270-678c-23a0-aed8-7e00f7448deb (at 10.8.26.32@o2ib6) [1808500.777705] Lustre: Skipped 65 previous similar messages [1809277.935676] Lustre: MGS: Connection restored to (at 10.8.24.20@o2ib6) [1809277.942433] Lustre: Skipped 11 previous similar messages [1809657.674580] Lustre: fir-MDT0000: haven't heard from client 44785c16-a13a-0ac3-068e-f8ceba05a374 (at 10.9.101.55@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88745d341400, cur 1577795886 expire 1577795736 last 1577795659 [1809657.696543] Lustre: Skipped 1 previous similar message [1810347.112285] Lustre: MGS: Connection restored to (at 10.9.101.13@o2ib4) [1810347.119081] Lustre: Skipped 21 previous similar messages [1810577.679701] Lustre: fir-MDT0000: haven't heard from client e5398439-eb81-be12-8ff9-341a7621376c (at 10.9.101.47@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ad06e2c00, cur 1577796806 expire 1577796656 last 1577796579 [1810577.701664] Lustre: Skipped 1 previous similar message [1810866.688154] Lustre: fir-MDT0000: haven't heard from client ece0b089-18cd-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88658691f800, cur 1577797095 expire 1577796945 last 1577796868 [1810866.708210] Lustre: Skipped 3 previous similar messages [1811137.685592] Lustre: fir-MDT0000: haven't heard from client 2155528b-412d-a244-0490-c5ce581f37f8 (at 10.9.101.11@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887bbd83e800, cur 1577797366 expire 1577797216 last 1577797139 [1811137.707564] Lustre: Skipped 1 previous similar message [1811316.908639] Lustre: MGS: Connection restored to d0d1f9e4-5e40-b568-3ab1-413970b0d653 (at 10.8.30.23@o2ib6) [1811316.918476] Lustre: Skipped 5 previous similar messages [1811830.688246] Lustre: fir-MDT0000: haven't heard from client c896c43c-8a77-9e8e-7227-b2c8f6de4845 (at 10.8.25.20@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887546e89800, cur 1577798059 expire 1577797909 last 1577797832 [1811830.710173] Lustre: Skipped 1 previous similar message [1811838.692186] Lustre: MGS: haven't heard from client b3990e78-1e82-ecdf-a105-e2689e8928da (at 10.8.25.20@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bc23a6400, cur 1577798067 expire 1577797917 last 1577797840 [1811918.731363] Lustre: MGS: Connection restored to 44785c16-a13a-0ac3-068e-f8ceba05a374 (at 10.9.101.55@o2ib4) [1811918.741290] Lustre: Skipped 1 previous similar message [1812844.742078] Lustre: MGS: Connection restored to (at 10.9.101.47@o2ib4) [1812844.748876] Lustre: Skipped 1 previous similar message [1813888.839935] Lustre: MGS: Connection restored to c896c43c-8a77-9e8e-7227-b2c8f6de4845 (at 10.8.25.20@o2ib6) [1813888.849760] Lustre: Skipped 5 previous similar messages [1817708.984492] Lustre: MGS: Connection restored to 1cea57ff-1d0f-b4f4-8769-95e365d89cff (at 10.8.25.26@o2ib6) [1817708.994325] Lustre: Skipped 1 previous similar message [1818771.736448] Lustre: MGS: haven't heard from client 96c128fd-09c8-a068-0de9-622799980d31 (at 10.9.101.39@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff885bbc703000, cur 1577805000 expire 1577804850 last 1577804773 [1821000.722734] Lustre: MGS: Connection restored to (at 10.9.101.39@o2ib4) [1821000.729529] Lustre: Skipped 1 previous similar message [1821905.754328] Lustre: fir-MDT0000: haven't heard from client 9ea720cf-881f-7649-fd64-8cd07128a90c (at 10.9.101.35@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887bc8735c00, cur 1577808134 expire 1577807984 last 1577807907 [1821905.776295] Lustre: Skipped 1 previous similar message [1822516.763031] Lustre: fir-MDT0000: haven't heard from client 9ff24344-feb6-8c0e-cb07-a92244c00aa4 (at 10.9.101.21@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887adaad9400, cur 1577808745 expire 1577808595 last 1577808518 [1822516.785016] Lustre: Skipped 1 previous similar message [1822780.761758] Lustre: MGS: haven't heard from client a880a175-a18f-f4d8-1b2c-38fe16da75f5 (at 10.8.24.16@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bc7f8b800, cur 1577809009 expire 1577808859 last 1577808782 [1822780.782941] Lustre: Skipped 1 previous similar message [1822790.759510] Lustre: fir-MDT0000: haven't heard from client 8e30f81d-d3a3-33d2-6958-ecc784e62a3a (at 10.8.24.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887aeae37000, cur 1577809019 expire 1577808869 last 1577808792 [1822790.781392] Lustre: Skipped 1 previous similar message [1823320.623523] Lustre: MGS: Connection restored to (at 10.8.30.34@o2ib6) [1823320.630233] Lustre: Skipped 1 previous similar message [1823512.073053] Lustre: MGS: Connection restored to 34004248-b9f7-fa76-67ab-9379f67ee678 (at 10.9.117.45@o2ib4) [1823512.082967] Lustre: Skipped 1 previous similar message [1824147.646292] Lustre: MGS: Connection restored to 9ea720cf-881f-7649-fd64-8cd07128a90c (at 10.9.101.35@o2ib4) [1824147.656210] Lustre: Skipped 1 previous similar message [1824716.137935] Lustre: MGS: Connection restored to 9ff24344-feb6-8c0e-cb07-a92244c00aa4 (at 10.9.101.21@o2ib4) [1824716.147854] Lustre: Skipped 1 previous similar message [1824861.630516] Lustre: MGS: Connection restored to 750c84e1-eadd-f494-4b05-c3838a481685 (at 10.8.24.18@o2ib6) [1824861.640352] Lustre: Skipped 1 previous similar message [1824870.025545] Lustre: MGS: Connection restored to (at 10.8.24.19@o2ib6) [1824870.032260] Lustre: Skipped 1 previous similar message [1824885.017583] Lustre: MGS: Connection restored to 04b6fb6b-3bcb-7865-aada-b02224ee3504 (at 10.8.24.17@o2ib6) [1824885.027408] Lustre: Skipped 3 previous similar messages [1824933.274028] Lustre: MGS: Connection restored to (at 10.8.24.13@o2ib6) [1824933.280751] Lustre: Skipped 3 previous similar messages [1827087.789660] Lustre: MGS: haven't heard from client b5e5896d-d7d4-7b33-3c78-ead8f0372573 (at 10.9.101.9@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bfa766400, cur 1577813316 expire 1577813166 last 1577813089 [1827087.810844] Lustre: Skipped 1 previous similar message [1827100.785760] Lustre: fir-MDT0000: haven't heard from client 138212f0-46c8-0092-07d2-847316236d53 (at 10.9.101.9@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887ab2e9e000, cur 1577813329 expire 1577813179 last 1577813102 [1827492.539426] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6) [1827492.546134] Lustre: Skipped 1 previous similar message [1827549.787667] Lustre: fir-MDT0000: haven't heard from client 6117d851-f584-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888be3fc4c00, cur 1577813778 expire 1577813628 last 1577813551 [1827880.808314] Lustre: MGS: haven't heard from client 4e2ae07d-8bab-ff63-1af4-46cee7b35c1d (at 10.8.30.36@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bdc61b000, cur 1577814109 expire 1577813959 last 1577813882 [1827880.829495] Lustre: Skipped 1 previous similar message [1827893.789712] Lustre: fir-MDT0000: haven't heard from client fb83de3b-fb5e-6930-0db9-4bed36c7d2d5 (at 10.8.25.7@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8879bd2f7c00, cur 1577814122 expire 1577813972 last 1577813895 [1827893.811506] Lustre: Skipped 5 previous similar messages [1828001.678700] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [1828001.685410] Lustre: Skipped 1 previous similar message [1828080.233611] Lustre: MGS: Connection restored to (at 10.8.23.14@o2ib6) [1828080.240328] Lustre: Skipped 1 previous similar message [1828089.790958] LustreError: 109568:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) ### client (nid 10.8.23.14@o2ib6) returned error from blocking AST (req@ffff885bdeb6f500 x1652565586493648 status -107 rc -107), evict it ns: mdt-fir-MDT0000_UUID lock: ffff8861dc2be780/0xc3c20c2d5cc119d7 lrc: 4/0,0 mode: EX/EX res: [0x20003ccd4:0xd:0x0].0x0 bits 0x8/0x0 rrc: 6 type: IBT flags: 0x60000400000020 nid: 10.8.23.14@o2ib6 remote: 0x36291ca29e91bb7b expref: 132 pid: 109731 timeout: 1828227 lvb_type: 3 [1828089.791288] LustreError: 138-a: fir-MDT0000: A client on nid 10.8.23.14@o2ib6 was evicted due to a lock blocking callback time out: rc -107 [1828089.791290] LustreError: Skipped 
2 previous similar messages [1828089.791315] LustreError: 38883:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 0s: evicting client at 10.8.23.14@o2ib6 ns: mdt-fir-MDT0000_UUID lock: ffff884cc90d0b40/0xc3c20c2d5cc131ee lrc: 3/0,0 mode: EX/EX res: [0x20003ccd4:0xe:0x0].0x0 bits 0x8/0x0 rrc: 6 type: IBT flags: 0x60000400000020 nid: 10.8.23.14@o2ib6 remote: 0x36291ca29e91bca1 expref: 133 pid: 109731 timeout: 0 lvb_type: 3 [1828089.889803] LustreError: 109568:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) Skipped 1 previous similar message [1828121.796159] Lustre: MGS: haven't heard from client 7841fdd9-4d6e-4 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff884bd4840c00, cur 1577814350 expire 1577814200 last 1577814123 [1828121.815528] Lustre: Skipped 5 previous similar messages [1828157.255434] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [1828157.262145] Lustre: Skipped 1 previous similar message [1828227.815313] Lustre: MGS: haven't heard from client 10e66b70-992f-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8851734e7c00, cur 1577814456 expire 1577814306 last 1577814229 [1828354.991059] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [1828354.997770] Lustre: Skipped 1 previous similar message [1828435.794457] Lustre: fir-MDT0000: haven't heard from client bd5dbce3-b525-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8879b037c400, cur 1577814664 expire 1577814514 last 1577814437 [1828435.814520] Lustre: Skipped 1 previous similar message [1829289.545638] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [1829289.552357] Lustre: Skipped 1 previous similar message [1829310.797527] Lustre: fir-MDT0000: haven't heard from client c90b43e1-52d8-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8862fb213000, cur 1577815539 expire 1577815389 last 1577815312 [1829310.817582] Lustre: Skipped 1 previous similar message [1829354.661103] Lustre: MGS: Connection restored to (at 10.9.101.9@o2ib4) [1829354.667818] Lustre: Skipped 1 previous similar message [1829592.800038] Lustre: fir-MDT0000: haven't heard from client b77e39b1-ed31-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8875a5373800, cur 1577815821 expire 1577815671 last 1577815594 [1829592.820099] Lustre: Skipped 1 previous similar message [1829687.121489] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [1829687.128195] Lustre: Skipped 1 previous similar message [1829976.497715] Lustre: MGS: Connection restored to 6ff3adf1-91c6-b215-6967-bff54d6f1325 (at 10.8.26.30@o2ib6) [1829976.507546] Lustre: Skipped 1 previous similar message [1830011.928111] Lustre: MGS: Connection restored to (at 10.8.26.7@o2ib6) [1830011.934736] Lustre: Skipped 7 previous similar messages [1830107.168360] Lustre: MGS: Connection restored to 2af19058-d342-84a2-b421-bcc9f649d6db (at 10.8.30.13@o2ib6) [1830107.178186] Lustre: Skipped 15 previous similar messages [1830258.544991] Lustre: MGS: Connection restored to (at 10.8.23.12@o2ib6) [1830258.551708] Lustre: Skipped 3 previous similar messages [1830306.804004] Lustre: fir-MDT0000: haven't heard from client 5bafd09f-ad5b-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8875c2c5a400, cur 1577816535 expire 1577816385 last 1577816308 [1830306.824058] Lustre: Skipped 1 previous similar message [1830316.829712] Lustre: MGS: haven't heard from client 5692f571-3993-4 (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88679b820c00, cur 1577816545 expire 1577816395 last 1577816318 [1831219.812319] Lustre: MGS: haven't heard from client 680df13c-39c2-2084-5be4-fd19b73a7a7f (at 10.9.101.22@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff885beabcb400, cur 1577817448 expire 1577817298 last 1577817221 [1831222.811847] Lustre: fir-MDT0000: haven't heard from client 6e6e0ee4-2ca1-ba66-f23e-af10663132a4 (at 10.9.101.22@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887adaad8400, cur 1577817451 expire 1577817301 last 1577817224 [1833472.198247] Lustre: MGS: Connection restored to 6e6e0ee4-2ca1-ba66-f23e-af10663132a4 (at 10.9.101.22@o2ib4) [1833472.208166] Lustre: Skipped 1 previous similar message [1833781.824934] Lustre: fir-MDT0000: haven't heard from client 06192bd5-124f-c4f8-5878-c3b2d7e910f9 (at 10.8.24.1@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ad06e5000, cur 1577820010 expire 1577819860 last 1577819783 [1835402.498214] Lustre: MGS: Connection restored to 2b11540e-8a44-d8df-08be-2cf4d0e7e7cc (at 10.8.25.25@o2ib6) [1835402.508045] Lustre: Skipped 1 previous similar message [1835488.836230] Lustre: fir-MDT0000: haven't heard from client f27082f8-7761-5c5d-b196-67b78beb0e67 (at 10.9.101.66@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8879a972a400, cur 1577821717 expire 1577821567 last 1577821490 [1835488.858194] Lustre: Skipped 1 previous similar message [1835701.866190] Lustre: MGS: haven't heard from client 591cbf91-37a1-9058-34ab-fb4353516343 (at 10.8.27.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bdc61a800, cur 1577821930 expire 1577821780 last 1577821703 [1835701.887373] Lustre: Skipped 1 previous similar message [1835711.837322] Lustre: fir-MDT0000: haven't heard from client 24fab89a-6f6a-550a-7225-4734c7f7b849 (at 10.8.27.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887978e5e000, cur 1577821940 expire 1577821790 last 1577821713 [1835899.352719] Lustre: MGS: Connection restored to 2113211d-82da-e1fb-d2b3-5e0121d5e442 (at 10.8.24.5@o2ib6) [1835899.362464] Lustre: Skipped 1 previous similar message [1835916.744409] Lustre: MGS: Connection restored to (at 10.8.24.1@o2ib6) [1835916.751030] Lustre: Skipped 1 previous similar message [1835975.293813] Lustre: MGS: Connection restored to b9a4701f-5383-be45-4634-5a5273218afb (at 10.8.24.4@o2ib6) [1835975.303560] Lustre: Skipped 1 previous similar message [1836784.756748] Lustre: MGS: Connection restored to be905cb3-e9eb-7ff5-d486-568f13f98db0 (at 10.8.25.13@o2ib6) [1836784.766576] Lustre: Skipped 1 previous similar message [1837771.710893] Lustre: MGS: Connection restored to f27082f8-7761-5c5d-b196-67b78beb0e67 (at 10.9.101.66@o2ib4) [1837771.720814] Lustre: Skipped 1 previous similar message [1837837.887117] Lustre: MGS: Connection restored to (at 10.8.27.12@o2ib6) [1837837.893830] Lustre: Skipped 1 previous similar message [1838221.520278] Lustre: MGS: Connection restored to (at 10.8.25.29@o2ib6) [1838221.526984] Lustre: Skipped 1 previous similar message [1843067.922225] Lustre: MGS: haven't heard from client 0aaa7fc8-dcda-40b6-d029-3fc55e50a07b (at 10.8.30.9@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885be4450000, cur 1577829296 expire 1577829146 last 1577829069 [1843079.892859] Lustre: fir-MDT0000: haven't heard from client 6c8c6de8-60d9-5ab1-5f74-dc1e64ab5212 (at 10.8.30.9@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887bf9da6800, cur 1577829308 expire 1577829158 last 1577829081 [1845150.632112] Lustre: MGS: Connection restored to 6c8c6de8-60d9-5ab1-5f74-dc1e64ab5212 (at 10.8.30.9@o2ib6) [1845150.641852] Lustre: Skipped 1 previous similar message [1845562.906763] Lustre: fir-MDT0000: haven't heard from client 7635e623-5abf-6ad3-276e-e26791b3d5cc (at 10.9.101.18@o2ib4) in 227 seconds. 
I think it's dead, and I am evicting it. exp ffff887402e65000, cur 1577831791 expire 1577831641 last 1577831564 [1845563.976792] Lustre: MGS: haven't heard from client 3ab1c38b-8159-183d-ff09-f29b39a43c1a (at 10.9.101.18@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bc6bdd000, cur 1577831792 expire 1577831642 last 1577831565 [1845783.733384] Lustre: MGS: Connection restored to 8d7a01c3-e0f4-61f3-17f5-cd0947f68d29 (at 10.8.26.10@o2ib6) [1845783.743211] Lustre: Skipped 1 previous similar message [1845788.064153] Lustre: MGS: Connection restored to (at 10.9.105.16@o2ib4) [1845788.070947] Lustre: Skipped 1 previous similar message [1847795.748885] Lustre: MGS: Connection restored to (at 10.9.101.18@o2ib4) [1847795.755676] Lustre: Skipped 1 previous similar message [1848833.509200] Lustre: MGS: Connection restored to (at 10.8.25.11@o2ib6) [1848833.515905] Lustre: Skipped 1 previous similar message [1849416.129781] Lustre: MGS: Connection restored to f5cbc3f1-0994-cb9b-0b11-7fa3db8cbaa9 (at 10.8.26.16@o2ib6) [1849416.139609] Lustre: Skipped 1 previous similar message [1850543.932357] Lustre: fir-MDT0000: haven't heard from client 4c1f7414-081e-38fa-7245-fdc2400de56e (at 10.9.101.49@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887762093400, cur 1577836772 expire 1577836622 last 1577836545 [1852898.538180] Lustre: MGS: Connection restored to (at 10.9.101.49@o2ib4) [1852898.544974] Lustre: Skipped 1 previous similar message [1853537.953316] Lustre: fir-MDT0000: haven't heard from client 9ae19243-f7e2-d3bd-e779-08d880a1a19f (at 10.9.101.56@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff88750cbb6000, cur 1577839766 expire 1577839616 last 1577839539
[1853537.975300] Lustre: Skipped 1 previous similar message
[1855769.990391] Lustre: MGS: Connection restored to 9ae19243-f7e2-d3bd-e779-08d880a1a19f (at 10.9.101.56@o2ib4)
[1855770.000309] Lustre: Skipped 1 previous similar message
[1856764.978544] Lustre: MGS: haven't heard from client 65694433-52cc-6208-c30e-04942c32da8b (at 10.9.101.44@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bbc77b000, cur 1577842993 expire 1577842843 last 1577842766
[1856764.999838] Lustre: Skipped 1 previous similar message
[1858884.902375] Lustre: MGS: Connection restored to (at 10.8.27.18@o2ib6)
[1858884.909085] Lustre: Skipped 1 previous similar message
[1859067.165056] Lustre: MGS: Connection restored to (at 10.9.101.44@o2ib4)
[1859067.171852] Lustre: Skipped 1 previous similar message
[1859123.711083] Lustre: MGS: Connection restored to (at 10.9.101.24@o2ib4)
[1859123.717879] Lustre: Skipped 1 previous similar message
[1866656.566460] Lustre: MGS: Connection restored to 1c30f97b-7d47-8c9d-c1e8-e8bf522ea702 (at 10.8.24.11@o2ib6)
[1866656.576292] Lustre: Skipped 1 previous similar message
[1866723.156958] Lustre: MGS: Connection restored to (at 10.8.25.12@o2ib6)
[1866723.163665] Lustre: Skipped 1 previous similar message
[1868120.062411] Lustre: fir-MDT0000: haven't heard from client 003d0909-3f9c-f126-c358-ac2fc02533ea (at 10.9.101.10@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba97b5400, cur 1577854348 expire 1577854198 last 1577854121
[1868120.084380] Lustre: Skipped 5 previous similar messages
[1868121.054927] Lustre: MGS: haven't heard from client 0610d5c0-19fe-711b-5c8b-fefab88b354a (at 10.9.101.10@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff888bea720800, cur 1577854349 expire 1577854199 last 1577854122
[1868816.059310] Lustre: MGS: haven't heard from client 6a5e3857-6afb-239e-1c4d-63acd40f37c3 (at 10.8.26.6@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888be0afc400, cur 1577855044 expire 1577854894 last 1577854817
[1870077.065435] Lustre: fir-MDT0000: haven't heard from client 2d3b1474-a233-197a-adba-8ab5eddbb0d5 (at 10.8.27.13@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887bbe9c4800, cur 1577856305 expire 1577856155 last 1577856078
[1870077.087309] Lustre: Skipped 3 previous similar messages
[1870079.064976] Lustre: MGS: haven't heard from client 092616f8-a43a-209c-1b0a-9e4f55b60e44 (at 10.8.27.13@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bea723c00, cur 1577856307 expire 1577856157 last 1577856080
[1870328.226566] Lustre: MGS: Connection restored to 003d0909-3f9c-f126-c358-ac2fc02533ea (at 10.9.101.10@o2ib4)
[1870328.236485] Lustre: Skipped 1 previous similar message
[1870914.640280] Lustre: MGS: Connection restored to 9e6019b2-e72a-be9a-07e3-b4bb84e4d17c (at 10.8.30.29@o2ib6)
[1870914.650134] Lustre: Skipped 1 previous similar message
[1870919.800674] Lustre: MGS: Connection restored to (at 10.8.26.3@o2ib6)
[1870919.807299] Lustre: Skipped 1 previous similar message
[1870929.886429] Lustre: MGS: Connection restored to (at 10.8.26.18@o2ib6)
[1870929.893141] Lustre: Skipped 1 previous similar message
[1870936.819489] Lustre: MGS: Connection restored to (at 10.8.26.12@o2ib6)
[1870936.826199] Lustre: Skipped 1 previous similar message
[1871240.331580] Lustre: MGS: Connection restored to (at 10.8.26.6@o2ib6)
[1871240.338218] Lustre: Skipped 1 previous similar message
[1872157.082832] Lustre: MGS: Connection restored to 2d3b1474-a233-197a-adba-8ab5eddbb0d5 (at 10.8.27.13@o2ib6)
[1872157.092668] Lustre: Skipped 1 previous similar message
[1872296.077533] Lustre: fir-MDT0000: haven't heard from client 
82ec6a6c-fd4e-e9b4-f57a-07042023ce08 (at 10.9.101.16@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ad06e1800, cur 1577858524 expire 1577858374 last 1577858297
[1872300.082282] Lustre: MGS: haven't heard from client 433200fd-58fe-6669-ee73-909fa6eb8f48 (at 10.9.101.16@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bf6b1a000, cur 1577858528 expire 1577858378 last 1577858301
[1872735.086339] Lustre: MGS: haven't heard from client b741c411-c7be-8dec-2d9d-116ffba64204 (at 10.9.101.37@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888be97e7800, cur 1577858963 expire 1577858813 last 1577858736
[1872749.127213] Lustre: fir-MDT0000: haven't heard from client c10ed93a-c9db-76f9-833b-adc60f0f324a (at 10.9.101.37@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88741a356400, cur 1577858977 expire 1577858827 last 1577858750
[1874594.757247] Lustre: MGS: Connection restored to 82ec6a6c-fd4e-e9b4-f57a-07042023ce08 (at 10.9.101.16@o2ib4)
[1874594.767163] Lustre: Skipped 1 previous similar message
[1875027.888052] Lustre: MGS: Connection restored to (at 10.9.101.37@o2ib4)
[1875027.894862] Lustre: Skipped 1 previous similar message
[1876845.110239] Lustre: MGS: haven't heard from client 120396da-6d98-c9d6-3dab-d4beb694fca8 (at 10.9.101.34@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888be8234800, cur 1577863073 expire 1577862923 last 1577862846
[1876854.105668] Lustre: fir-MDT0000: haven't heard from client b4c9913c-f59e-b8ac-70a9-c2d8d6c39257 (at 10.9.101.34@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8879bd2f1c00, cur 1577863082 expire 1577862932 last 1577862855
[1878419.606992] LNetError: 38672:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[1879061.578793] Lustre: MGS: Connection restored to b4c9913c-f59e-b8ac-70a9-c2d8d6c39257 (at 10.9.101.34@o2ib4)
[1879061.588735] Lustre: Skipped 1 previous similar message
[1888798.224715] LNetError: 38663:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[1890736.106768] LNetError: 38665:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[1891895.188195] Lustre: MGS: haven't heard from client c22a4b99-92e6-9a04-6abd-0402018bb769 (at 10.8.30.32@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bf3c82400, cur 1577878123 expire 1577877973 last 1577877896
[1891901.187029] Lustre: fir-MDT0000: haven't heard from client 016cfe19-2250-799b-d8ad-887e11d25409 (at 10.8.30.32@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887aeae30000, cur 1577878129 expire 1577877979 last 1577877902
[1894000.774735] Lustre: MGS: Connection restored to 016cfe19-2250-799b-d8ad-887e11d25409 (at 10.8.30.32@o2ib6)
[1894000.784571] Lustre: Skipped 1 previous similar message
[1897906.938594] LNetError: 38663:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[1899026.225249] Lustre: fir-MDT0000: haven't heard from client 4e1cc11b-70d3-525f-ee38-2a7467cc154b (at 10.8.30.3@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff88784209d000, cur 1577885254 expire 1577885104 last 1577885027
[1901105.834524] Lustre: MGS: Connection restored to (at 10.8.30.12@o2ib6)
[1901105.841238] Lustre: Skipped 1 previous similar message
[1901108.561050] Lustre: MGS: Connection restored to fafd56e0-21c9-04db-6a1d-480587a30573 (at 10.8.24.28@o2ib6)
[1901108.570896] Lustre: Skipped 1 previous similar message
[1901134.457347] Lustre: MGS: Connection restored to 35625223-7b16-ecd5-7152-f328376026d8 (at 10.8.24.14@o2ib6)
[1901134.467176] Lustre: Skipped 1 previous similar message
[1901137.861845] Lustre: MGS: Connection restored to 4e1cc11b-70d3-525f-ee38-2a7467cc154b (at 10.8.30.3@o2ib6)
[1901137.871620] Lustre: Skipped 3 previous similar messages
[1901147.702581] Lustre: MGS: Connection restored to 70dfb233-fcbd-c1c9-7b9c-ef34a1fa40ac (at 10.8.30.26@o2ib6)
[1901147.712426] Lustre: Skipped 3 previous similar messages
[1901156.147321] Lustre: MGS: Connection restored to (at 10.8.24.35@o2ib6)
[1901156.154034] Lustre: Skipped 3 previous similar messages
[1901205.344218] Lustre: MGS: Connection restored to (at 10.8.26.1@o2ib6)
[1901205.350846] Lustre: Skipped 13 previous similar messages
[1901248.370906] Lustre: MGS: Connection restored to cbe8ab8a-c160-9941-915d-4ea03e0c10d7 (at 10.8.25.3@o2ib6)
[1901248.380685] Lustre: Skipped 3 previous similar messages
[1901400.405884] Lustre: MGS: Connection restored to cbcc93d8-fa41-da9c-2b2c-68a3b5a3055f (at 10.8.26.23@o2ib6)
[1901400.415718] Lustre: Skipped 1 previous similar message
[1908166.274609] Lustre: fir-MDT0000: haven't heard from client 7128bbc5-55d8-ff02-9d63-ba25c68604fa (at 10.9.101.7@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887a51f75c00, cur 1577894394 expire 1577894244 last 1577894167
[1908166.296493] Lustre: Skipped 15 previous similar messages
[1910430.835982] Lustre: MGS: Connection restored to 7128bbc5-55d8-ff02-9d63-ba25c68604fa (at 10.9.101.7@o2ib4)
[1910430.845808] Lustre: Skipped 3 previous similar messages
[1922218.353164] Lustre: fir-MDT0000: haven't heard from client 6754d552-207d-43c9-a59c-3b5a2cb4043d (at 10.9.101.14@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887546e8a400, cur 1577908446 expire 1577908296 last 1577908219
[1922218.375135] Lustre: Skipped 1 previous similar message
[1923374.361163] Lustre: fir-MDT0000: haven't heard from client efed7301-8e74-ea49-ce8b-6858f68d4266 (at 10.9.117.48@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8874047cc400, cur 1577909602 expire 1577909452 last 1577909375
[1923374.383125] Lustre: Skipped 3 previous similar messages
[1924183.826813] Lustre: MGS: Connection restored to (at 10.9.108.51@o2ib4)
[1924183.833606] Lustre: Skipped 1 previous similar message
[1924461.734404] Lustre: MGS: Connection restored to 6754d552-207d-43c9-a59c-3b5a2cb4043d (at 10.9.101.14@o2ib4)
[1924461.744326] Lustre: Skipped 1 previous similar message
[1925116.371834] Lustre: fir-MDT0000: haven't heard from client 77ec2033-19bc-3d84-2583-8f3d17caa043 (at 10.9.108.29@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba97b3800, cur 1577911344 expire 1577911194 last 1577911117
[1925116.393793] Lustre: Skipped 1 previous similar message
[1925506.872931] Lustre: MGS: Connection restored to (at 10.9.117.48@o2ib4)
[1925506.879729] Lustre: Skipped 1 previous similar message
[1926912.402418] Lustre: MGS: haven't heard from client c4fa048b-1b87-c26d-f13e-18c2682afa07 (at 10.9.101.29@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff888bfa1d0c00, cur 1577913140 expire 1577912990 last 1577912913
[1926912.423687] Lustre: Skipped 1 previous similar message
[1926919.382638] Lustre: fir-MDT0000: haven't heard from client 20463417-fb32-2f92-5aae-59bfa8e287e3 (at 10.9.101.29@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887bf7afd400, cur 1577913147 expire 1577912997 last 1577912920
[1927482.942452] Lustre: MGS: Connection restored to 77ec2033-19bc-3d84-2583-8f3d17caa043 (at 10.9.108.29@o2ib4)
[1927482.952364] Lustre: Skipped 1 previous similar message
[1929152.267536] Lustre: MGS: Connection restored to 20463417-fb32-2f92-5aae-59bfa8e287e3 (at 10.9.101.29@o2ib4)
[1929152.277455] Lustre: Skipped 1 previous similar message
[1933097.422223] Lustre: MGS: haven't heard from client 48ac361a-d705-e3f0-8c4d-d0f9011866de (at 10.9.101.25@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bdde65400, cur 1577919325 expire 1577919175 last 1577919098
[1933100.431302] Lustre: fir-MDT0000: haven't heard from client 4e9738be-9725-7d91-37d9-ab21f60c9eb3 (at 10.9.101.25@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88744e765400, cur 1577919328 expire 1577919178 last 1577919101
[1933713.437476] Lustre: MGS: haven't heard from client 7fdd0c56-0c0b-3b2f-5a0d-a741c3f8dbb0 (at 10.9.101.27@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bcde2dc00, cur 1577919941 expire 1577919791 last 1577919714
[1933717.424384] Lustre: fir-MDT0000: haven't heard from client f9f503f0-6ff6-698f-9a8d-14bd128a6d42 (at 10.9.101.27@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887ace3db400, cur 1577919945 expire 1577919795 last 1577919718
[1935360.012890] Lustre: MGS: Connection restored to (at 10.9.101.25@o2ib4)
[1935360.019686] Lustre: Skipped 1 previous similar message
[1935927.972484] Lustre: MGS: Connection restored to f9f503f0-6ff6-698f-9a8d-14bd128a6d42 (at 10.9.101.27@o2ib4)
[1935927.982401] Lustre: Skipped 1 previous similar message
[1946625.029614] LNetError: 38672:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[1947615.907376] LNetError: 38673:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[1949572.703865] LNetError: 38674:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[1951539.692913] LNetError: 38678:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[1952466.008857] LNetError: 38671:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (-125, 0)
[1952472.916875] Lustre: fir-MDT0000: Client ef7dc48c-ab4d-4 (at 10.9.101.51@o2ib4) reconnecting
[1952472.925435] Lustre: fir-MDT0000: Connection restored to 7de2709b-434b-c2b2-ee11-fe99c3a9d16f (at 10.9.101.51@o2ib4)
[1952472.936049] Lustre: Skipped 1 previous similar message
[1952972.544254] Lustre: fir-MDT0000: haven't heard from client 1e18de08-a3eb-42b1-8ff1-a5d78030cd54 (at 10.9.108.7@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887b9f70bc00, cur 1577939200 expire 1577939050 last 1577938973
[1953105.363351] LNetError: 38675:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[1955388.559387] Lustre: MGS: Connection restored to (at 10.9.108.7@o2ib4)
[1962432.113348] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.8.8.30@o2ib6 (no target). 
If you are running an HA pair check that the target is mounted on the other server.
[1962432.640350] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.8.27.10@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[1962432.657821] LustreError: Skipped 14 previous similar messages
[1962433.696435] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.8.18.14@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[1962433.713895] LustreError: Skipped 18 previous similar messages
[1962435.904038] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.8.31.3@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[1962435.921420] LustreError: Skipped 24 previous similar messages
[1962440.147410] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.8.30.1@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[1962440.164823] LustreError: Skipped 42 previous similar messages
[1962448.201013] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.8.26.28@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[1962448.218478] LustreError: Skipped 87 previous similar messages
[1962473.225391] Lustre: 109646:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1577948693/real 1577948693] req@ffff888b5b610d80 x1652573272367648/t0(0) o104->fir-MDT0000@10.8.18.22@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1577948700 ref 1 fl Rpc:X/0/ffffffff rc 0/-1
[1962473.252940] Lustre: 109646:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 23 previous similar messages
[1962477.754042] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.8.21.8@o2ib6 (no target). 
If you are running an HA pair check that the target is mounted on the other server.
[1962477.771438] LustreError: Skipped 126 previous similar messages
[1962480.229402] Lustre: 116206:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1577948700/real 1577948700] req@ffff888a38c00480 x1652573272367664/t0(0) o104->fir-MDT0000@10.8.18.22@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1577948707 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[1962480.256925] Lustre: 116206:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 8 previous similar messages
[1962488.251451] Lustre: 109538:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1577948708/real 1577948708] req@ffff88682d2c8000 x1652573272425344/t0(0) o104->fir-MDT0000@10.8.18.29@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1577948715 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[1962488.278970] Lustre: 109538:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15 previous similar messages
[1962506.819556] Lustre: 12702:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1577948727/real 1577948727] req@ffff887d65f7d100 x1652573274124576/t0(0) o104->fir-MDT0000@10.8.7.16@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1577948734 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[1962506.846892] Lustre: 12702:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 28 previous similar messages
[1962527.595694] Lustre: fir-MDT0000: haven't heard from client 4c24f803-ac39-f2e9-62fc-ef86388a1d21 (at 10.9.110.2@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff88741a355000, cur 1577948755 expire 1577948605 last 1577948528
[1962527.617578] Lustre: Skipped 1 previous similar message
[1962527.624965] LustreError: 109745:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) ### client (nid 10.8.18.15@o2ib6) failed to reply to blocking AST (req@ffff88833a6d9680 x1652573279994320 status 0 rc -5), evict it ns: mdt-fir-MDT0000_UUID lock: ffff8854546d9440/0xc3c20c3b8e817190 lrc: 4/0,0 mode: PR/PR res: [0x20003c4f1:0x34d8:0x0].0x0 bits 0x1b/0x0 rrc: 5 type: IBT flags: 0x60200400000020 nid: 10.8.18.15@o2ib6 remote: 0xcf18d394a65f6919 expref: 26 pid: 109674 timeout: 1962609 lvb_type: 0
[1962527.624973] LustreError: 138-a: fir-MDT0000: A client on nid 10.8.18.15@o2ib6 was evicted due to a lock blocking callback time out: rc -5
[1962527.624976] LustreError: Skipped 1 previous similar message
[1962527.625003] LustreError: 38883:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 6s: evicting client at 10.8.18.15@o2ib6 ns: mdt-fir-MDT0000_UUID lock: ffff884c052e45c0/0xc3c20c3b8e8192b4 lrc: 3/0,0 mode: PR/PR res: [0x2000375d6:0x56d9:0x0].0x0 bits 0x1b/0x0 rrc: 5 type: IBT flags: 0x60200400000020 nid: 10.8.18.15@o2ib6 remote: 0xcf18d394a65f6a31 expref: 26 pid: 109720 timeout: 0 lvb_type: 0
[1962527.723562] LustreError: 109745:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) Skipped 20 previous similar messages
[1962532.726452] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.8.18.21@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[1962532.743921] LustreError: Skipped 3 previous similar messages
[1962603.596119] Lustre: fir-MDT0000: haven't heard from client a0c85913-dcf2-8cea-738a-a56479eb8d1b (at 10.9.116.7@o2ib4) in 221 seconds. I think it's dead, and I am evicting it. 
exp ffff887aeae36c00, cur 1577948831 expire 1577948681 last 1577948610
[1962603.618000] Lustre: Skipped 72 previous similar messages
[1962970.598265] Lustre: fir-MDT0000: haven't heard from client c87b5971-11d8-4 (at 10.9.117.10@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887bf1122800, cur 1577949198 expire 1577949048 last 1577948971
[1962970.618431] Lustre: Skipped 1 previous similar message
[1964177.604879] Lustre: fir-MDT0000: haven't heard from client de1440cd-d1d6-487a-c020-781b5f81b0dd (at 10.9.116.13@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88784209f800, cur 1577950405 expire 1577950255 last 1577950178
[1964177.626855] Lustre: Skipped 9 previous similar messages
[1964228.087139] Lustre: MGS: Connection restored to 5c728e66-c987-c368-217a-c8a52a905b8e (at 10.8.7.14@o2ib6)
[1964228.096889] Lustre: Skipped 1 previous similar message
[1964240.180570] Lustre: MGS: Connection restored to (at 10.8.7.10@o2ib6)
[1964240.187190] Lustre: Skipped 1 previous similar message
[1964246.524322] Lustre: MGS: Connection restored to (at 10.8.7.6@o2ib6)
[1964246.530865] Lustre: Skipped 1 previous similar message
[1964263.504291] Lustre: MGS: Connection restored to e73512f9-41d4-4cb4-0f7e-6f267e429adc (at 10.8.7.18@o2ib6)
[1964263.514052] Lustre: Skipped 1 previous similar message
[1964268.656802] Lustre: MGS: Connection restored to 51d039f0-180b-c2f2-39da-443d9476c206 (at 10.8.7.16@o2ib6)
[1964268.666557] Lustre: Skipped 1 previous similar message
[1964276.974665] Lustre: MGS: Connection restored to (at 10.8.7.11@o2ib6)
[1964276.981299] Lustre: Skipped 1 previous similar message
[1964319.080477] Lustre: MGS: Connection restored to (at 10.8.7.17@o2ib6)
[1964319.087126] Lustre: Skipped 1 previous similar message
[1964411.033631] Lustre: MGS: Connection restored to 11b495c3-9933-d035-2b9f-d17c2b7523c4 (at 10.9.117.9@o2ib4)
[1964411.043470] Lustre: Skipped 5 previous similar messages
[1964527.835180] Lustre: MGS: Connection 
restored to (at 10.9.108.51@o2ib4)
[1964527.841978] Lustre: Skipped 1 previous similar message
[1964659.051559] Lustre: MGS: Connection restored to eb60645d-f744-528a-c943-6dfa4e724df5 (at 10.8.18.27@o2ib6)
[1964659.061436] Lustre: Skipped 19 previous similar messages
[1965148.769963] Lustre: MGS: Connection restored to a0c85913-dcf2-8cea-738a-a56479eb8d1b (at 10.9.116.7@o2ib4)
[1965148.779788] Lustre: Skipped 51 previous similar messages
[1966101.025645] perf: interrupt took too long (3915 > 3911), lowering kernel.perf_event_max_sample_rate to 51000
[1966285.476707] Lustre: MGS: Connection restored to (at 10.9.116.13@o2ib4)
[1966285.483532] Lustre: Skipped 1 previous similar message
[1967987.617500] LNetError: 38676:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[1968836.887200] LNetError: 38678:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[1970092.638034] Lustre: MGS: haven't heard from client fc6cbb71-3778-4 (at 10.9.109.37@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8856ea0e7c00, cur 1577956320 expire 1577956170 last 1577956093
[1970092.657526] Lustre: Skipped 1 previous similar message
[1970093.694125] LNet: Service thread pid 109634 was inactive for 200.61s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes:
[1970093.711351] LNet: Skipped 2 previous similar messages
[1970093.716620] Pid: 109634, comm: mdt03_025 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[1970093.727083] Call Trace:
[1970093.729766] [] ldlm_completion_ast+0x430/0x860 [ptlrpc]
[1970093.736933] [] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[1970093.744337] [] mdt_rename_lock+0x24b/0x4b0 [mdt]
[1970093.750838] [] mdt_reint_rename+0x2c5/0x2b90 [mdt]
[1970093.757516] [] mdt_reint_rec+0x83/0x210 [mdt]
[1970093.763775] [] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[1970093.770555] [] mdt_reint+0x67/0x140 [mdt]
[1970093.776463] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[1970093.783660] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[1970093.791615] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[1970093.798168] [] kthread+0xd1/0xe0
[1970093.803287] [] ret_from_fork_nospec_begin+0xe/0x21
[1970093.809967] [] 0xffffffffffffffff
[1970093.815182] LustreError: dumping log to /tmp/lustre-log.1577956321.109634
[1970097.036823] LNet: Service thread pid 109634 completed after 203.95s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[1970097.053251] LNet: Skipped 1 previous similar message
[1970105.635201] Lustre: fir-MDT0000: haven't heard from client 4443a8af-fd6e-4 (at 10.9.109.37@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885869b4a800, cur 1577956333 expire 1577956183 last 1577956106
[1970145.944334] Lustre: MGS: Connection restored to (at 10.9.109.37@o2ib4)
[1970145.951136] Lustre: Skipped 1 previous similar message
[1976995.671492] Lustre: fir-MDT0000: haven't heard from client 5aa7d6de-875b-7f1e-fa2e-01fb0b68841a (at 10.9.108.1@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff88744ee76c00, cur 1577963223 expire 1577963073 last 1577962996
[1979219.725352] Lustre: MGS: Connection restored to 5aa7d6de-875b-7f1e-fa2e-01fb0b68841a (at 10.9.108.1@o2ib4)
[1979219.735201] Lustre: Skipped 1 previous similar message
[1979454.739404] Lustre: MGS: haven't heard from client c9ce823b-673b-72c6-8c5f-3466b29b180c (at 10.8.26.17@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff888bd0a29c00, cur 1577965682 expire 1577965532 last 1577965455
[1979454.760609] Lustre: Skipped 1 previous similar message
[1979459.685091] Lustre: fir-MDT0000: haven't heard from client c79437d5-51cd-a97d-bdf2-13c5e6d8a02b (at 10.8.26.17@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88744e765800, cur 1577965687 expire 1577965537 last 1577965460
[1981042.831316] LNetError: 38664:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[1981536.299378] Lustre: MGS: Connection restored to (at 10.8.25.18@o2ib6)
[1981536.306126] Lustre: Skipped 1 previous similar message
[1981564.048097] Lustre: MGS: Connection restored to (at 10.8.30.21@o2ib6)
[1981564.054815] Lustre: Skipped 1 previous similar message
[1981573.146086] Lustre: MGS: Connection restored to 2b1c79f0-7e80-9fb6-9652-d84b00c6c331 (at 10.8.30.28@o2ib6)
[1981573.155921] Lustre: Skipped 2 previous similar messages
[1981576.573057] Lustre: MGS: Connection restored to (at 10.8.26.25@o2ib6)
[1981576.579789] Lustre: Skipped 2 previous similar messages
[1981592.295458] Lustre: MGS: Connection restored to ea43cad7-8e30-4e17-f067-dc042f6e8696 (at 10.8.30.24@o2ib6)
[1981592.305289] Lustre: Skipped 5 previous similar messages
[1998063.792226] Lustre: fir-MDT0000: haven't heard from client f443f4ba-3277-14e5-03e7-2ce1168d8345 (at 10.8.18.25@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887ab2e9ac00, cur 1577984291 expire 1577984141 last 1577984064
[1999837.799403] Lustre: fir-MDT0000: haven't heard from client 35096782-0357-53c6-1580-bcb8a28045a3 (at 10.8.24.29@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8879d2777c00, cur 1577986065 expire 1577985915 last 1577985838
[1999837.821284] Lustre: Skipped 1 previous similar message
[2000168.552434] Lustre: MGS: Connection restored to (at 10.8.18.25@o2ib6)
[2000168.559141] Lustre: Skipped 1 previous similar message
[2001992.628292] Lustre: MGS: Connection restored to 35096782-0357-53c6-1580-bcb8a28045a3 (at 10.8.24.29@o2ib6)
[2001992.638132] Lustre: Skipped 1 previous similar message
[2012267.871277] Lustre: fir-MDT0000: haven't heard from client 469c9fed-7a4e-a33d-2f08-51ca338b69fb (at 10.9.108.68@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88741a354000, cur 1577998495 expire 1577998345 last 1577998268
[2012267.893246] Lustre: Skipped 1 previous similar message
[2013109.876254] Lustre: fir-MDT0000: haven't heard from client 3a18a690-f6fb-7d4d-c179-697da5c59619 (at 10.9.116.10@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887bf7afec00, cur 1577999337 expire 1577999187 last 1577999110
[2013109.898217] Lustre: Skipped 1 previous similar message
[2014635.648660] Lustre: MGS: Connection restored to 469c9fed-7a4e-a33d-2f08-51ca338b69fb (at 10.9.108.68@o2ib4)
[2014635.658590] Lustre: Skipped 1 previous similar message
[2016268.680438] Lustre: MGS: Connection restored to (at 10.9.116.10@o2ib4)
[2016268.687239] Lustre: Skipped 1 previous similar message
[2028293.732801] Lustre: 109605:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1578014513/real 1578014513] req@ffff888a106b7980 x1652576997731920/t0(0) o104->fir-MDT0000@10.8.27.19@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1578014520 ref 1 fl Rpc:X/0/ffffffff rc 0/-1
[2028293.760313] Lustre: 109605:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 50 previous similar messages
[2028300.769857] Lustre: 109605:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1578014520/real 1578014520] req@ffff888a106b7980 x1652576997731920/t0(0) o104->fir-MDT0000@10.8.27.19@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1578014527 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[2028314.796944] Lustre: 109605:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1578014534/real 1578014534] req@ffff888a106b7980 x1652576997731920/t0(0) o104->fir-MDT0000@10.8.27.19@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1578014541 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[2028314.824456] Lustre: 109605:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 1 previous similar message
[2028335.834086] Lustre: 109605:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1578014555/real 1578014555] req@ffff888a106b7980 x1652576997731920/t0(0) o104->fir-MDT0000@10.8.27.19@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1578014562 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[2028335.861601] Lustre: 
109605:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
[2028370.872331] Lustre: 109605:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1578014590/real 1578014590] req@ffff888a106b7980 x1652576997731920/t0(0) o104->fir-MDT0000@10.8.27.19@o2ib6:15/16 lens 296/224 e 0 to 1 dl 1578014597 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[2028370.899841] Lustre: 109605:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 4 previous similar messages
[2028391.911500] LustreError: 109605:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) ### client (nid 10.8.27.19@o2ib6) failed to reply to blocking AST (req@ffff888a106b7980 x1652576997731920 status 0 rc -110), evict it ns: mdt-fir-MDT0000_UUID lock: ffff886460be6300/0xc3c20c3f13427abb lrc: 4/0,0 mode: PR/PR res: [0x200029e25:0x9445:0x0].0x0 bits 0x13/0x0 rrc: 18 type: IBT flags: 0x60200400000020 nid: 10.8.27.19@o2ib6 remote: 0x82db2efa076fe991 expref: 15 pid: 109557 timeout: 2028471 lvb_type: 0
[2028391.954708] LustreError: 138-a: fir-MDT0000: A client on nid 10.8.27.19@o2ib6 was evicted due to a lock blocking callback time out: rc -110
[2028391.967409] LustreError: Skipped 20 previous similar messages
[2028391.973372] LustreError: 38883:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 105s: evicting client at 10.8.27.19@o2ib6 ns: mdt-fir-MDT0000_UUID lock: ffff886460be6300/0xc3c20c3f13427abb lrc: 3/0,0 mode: PR/PR res: [0x200029e25:0x9445:0x0].0x0 bits 0x13/0x0 rrc: 18 type: IBT flags: 0x60200400000020 nid: 10.8.27.19@o2ib6 remote: 0x82db2efa076fe991 expref: 16 pid: 109557 timeout: 0 lvb_type: 0
[2028392.011031] LustreError: 38883:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 9 previous similar messages
[2028442.983948] Lustre: MGS: haven't heard from client 79e170f9-ef30-cb8c-f7c1-4344671005fc (at 10.8.27.19@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff888bdde7e400, cur 1578014670 expire 1578014520 last 1578014443
[2028443.005134] Lustre: Skipped 1 previous similar message
[2030546.520167] Lustre: MGS: Connection restored to (at 10.8.27.19@o2ib6)
[2030546.526880] Lustre: Skipped 1 previous similar message
[2037877.528475] Lustre: MGS: Connection restored to (at 10.8.26.2@o2ib6)
[2037877.535101] Lustre: Skipped 1 previous similar message
[2039209.726636] Lustre: MGS: Connection restored to (at 10.9.103.64@o2ib4)
[2039209.733442] Lustre: Skipped 1 previous similar message
[2040197.056247] Lustre: fir-MDT0000: haven't heard from client d269b7b3-c7ee-1895-0bbf-8293c505cff2 (at 10.9.110.44@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887be351b800, cur 1578026424 expire 1578026274 last 1578026197
[2042356.157607] Lustre: MGS: Connection restored to d269b7b3-c7ee-1895-0bbf-8293c505cff2 (at 10.9.110.44@o2ib4)
[2042356.167532] Lustre: Skipped 1 previous similar message
[2045331.090151] Lustre: fir-MDT0000: haven't heard from client 55a6debd-0988-594c-efa3-0f5a697e0e77 (at 10.8.17.24@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887762092400, cur 1578031558 expire 1578031408 last 1578031331
[2045331.112027] Lustre: Skipped 1 previous similar message
[2045628.091868] Lustre: fir-MDT0000: haven't heard from client 09d35619-6b74-febd-1dd5-6d4a61665424 (at 10.8.17.15@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88794e3f0c00, cur 1578031855 expire 1578031705 last 1578031628
[2045628.113781] Lustre: Skipped 1 previous similar message
[2046786.098733] Lustre: fir-MDT0000: haven't heard from client f7c79d26-1484-ddfe-29a2-0f5c91a2bbfc (at 10.8.8.21@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff887762095c00, cur 1578033013 expire 1578032863 last 1578032786 [2046786.120531] Lustre: Skipped 1 previous similar message [2047513.942017] Lustre: MGS: Connection restored to (at 10.8.17.24@o2ib6) [2047513.948732] Lustre: Skipped 1 previous similar message [2047715.771007] Lustre: MGS: Connection restored to 09d35619-6b74-febd-1dd5-6d4a61665424 (at 10.8.17.15@o2ib6) [2047715.780851] Lustre: Skipped 1 previous similar message [2048451.986458] Lustre: MGS: Connection restored to a31c4d05-c2c1-d128-e70d-4b9b8b78ea7d (at 10.8.8.22@o2ib6) [2048451.996204] Lustre: Skipped 1 previous similar message [2048571.359237] Lustre: MGS: Connection restored to 2c36e76b-4e4b-7c7a-24ea-21141443e402 (at 10.8.8.25@o2ib6) [2048571.369008] Lustre: Skipped 1 previous similar message [2048585.647895] Lustre: MGS: Connection restored to f7c79d26-1484-ddfe-29a2-0f5c91a2bbfc (at 10.8.8.21@o2ib6) [2048585.657658] Lustre: Skipped 1 previous similar message [2048659.847183] Lustre: MGS: Connection restored to (at 10.8.8.19@o2ib6) [2048659.853844] Lustre: Skipped 1 previous similar message [2070761.224779] Lustre: fir-MDT0000: haven't heard from client 835e05fe-e19a-f367-bb77-6f70ae428a7c (at 10.8.8.18@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8879d1e86400, cur 1578056988 expire 1578056838 last 1578056761 [2070761.246576] Lustre: Skipped 7 previous similar messages [2072261.234968] Lustre: fir-MDT0000: haven't heard from client 7b82293a-73f8-138a-13e4-d48833d3398a (at 10.9.101.38@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff88745d345c00, cur 1578058488 expire 1578058338 last 1578058261 [2072261.256971] Lustre: Skipped 5 previous similar messages [2072513.566837] Lustre: MGS: Connection restored to 0d7bbbba-7854-32d1-13f9-67b03c7acccc (at 10.8.8.17@o2ib6) [2072513.576580] Lustre: Skipped 1 previous similar message [2072542.420343] Lustre: MGS: Connection restored to 835e05fe-e19a-f367-bb77-6f70ae428a7c (at 10.8.8.18@o2ib6) [2072542.430094] Lustre: Skipped 1 previous similar message [2072647.740714] Lustre: MGS: Connection restored to (at 10.8.8.24@o2ib6) [2072647.747361] Lustre: Skipped 1 previous similar message [2073190.241035] Lustre: fir-MDT0000: haven't heard from client cff3451c-e996-8c2a-4369-5e4a58059d6b (at 10.9.101.48@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ab57fe000, cur 1578059417 expire 1578059267 last 1578059190 [2073190.263039] Lustre: Skipped 1 previous similar message [2074528.505953] Lustre: MGS: Connection restored to 7b82293a-73f8-138a-13e4-d48833d3398a (at 10.9.101.38@o2ib4) [2074528.515882] Lustre: Skipped 1 previous similar message [2075453.755401] Lustre: MGS: Connection restored to cff3451c-e996-8c2a-4369-5e4a58059d6b (at 10.9.101.48@o2ib4) [2075453.765375] Lustre: Skipped 1 previous similar message [2076452.873887] perf: interrupt took too long (4894 > 4893), lowering kernel.perf_event_max_sample_rate to 40000 [2079121.301444] Lustre: MGS: Connection restored to (at 10.9.103.2@o2ib4) [2079121.308148] Lustre: Skipped 1 previous similar message [2085524.314930] Lustre: fir-MDT0000: haven't heard from client 52008b8a-1aae-c71d-80d5-aeea34862c6c (at 10.9.101.8@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8879d2772800, cur 1578071751 expire 1578071601 last 1578071524 [2085524.336846] Lustre: Skipped 1 previous similar message [2087710.933502] Lustre: MGS: Connection restored to 52008b8a-1aae-c71d-80d5-aeea34862c6c (at 10.9.101.8@o2ib4) [2087710.943348] Lustre: Skipped 1 previous similar message [2091182.277950] Lustre: MGS: Connection restored to (at 10.8.25.23@o2ib6) [2091182.284673] Lustre: Skipped 1 previous similar message [2109602.448767] Lustre: fir-MDT0000: haven't heard from client 4d27158e-a6f1-1b8d-88a1-b3b4ecc1ab60 (at 10.9.105.2@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887b9f70c800, cur 1578095829 expire 1578095679 last 1578095602 [2109602.470678] Lustre: Skipped 1 previous similar message [2110570.454969] Lustre: fir-MDT0000: haven't heard from client 19f0f52b-d6a6-0ecb-bf08-1ece503c917d (at 10.9.101.54@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ba5f43400, cur 1578096797 expire 1578096647 last 1578096570 [2110570.476981] Lustre: Skipped 1 previous similar message [2110868.457154] Lustre: MGS: haven't heard from client 3b041863-9699-dfe6-b83e-62bda6e53561 (at 10.8.30.10@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bbe909800, cur 1578097095 expire 1578096945 last 1578096868 [2110868.478405] Lustre: Skipped 1 previous similar message [2110872.457421] Lustre: fir-MDT0000: haven't heard from client b31f1471-61ea-c666-036d-9a02a027ccfb (at 10.8.30.6@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88759c299c00, cur 1578097099 expire 1578096949 last 1578096872 [2110872.479245] Lustre: Skipped 1 previous similar message [2111404.321297] Lustre: MGS: Connection restored to 8d53094b-786e-854a-949f-904eb0728008 (at 10.8.26.4@o2ib6) [2111404.331059] Lustre: Skipped 1 previous similar message [2111448.461052] Lustre: fir-MDT0000: haven't heard from client be0d3707-5988-4 (at 10.8.26.4@o2ib6) in 227 seconds. 
I think it's dead, and I am evicting it. exp ffff8871e764e000, cur 1578097675 expire 1578097525 last 1578097448 [2111448.481044] Lustre: Skipped 1 previous similar message [2111581.776017] Lustre: MGS: Connection restored to 8d53094b-786e-854a-949f-904eb0728008 (at 10.8.26.4@o2ib6) [2111581.785764] Lustre: Skipped 1 previous similar message [2111641.464208] Lustre: fir-MDT0000: haven't heard from client 283cdd89-e34e-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff886107240c00, cur 1578097868 expire 1578097718 last 1578097641 [2111641.484180] Lustre: Skipped 1 previous similar message [2111657.477143] Lustre: MGS: haven't heard from client e1d22710-d46b-4 (at 10.8.26.4@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885df7b89000, cur 1578097884 expire 1578097734 last 1578097657 [2111928.296969] Lustre: MGS: Connection restored to 4d27158e-a6f1-1b8d-88a1-b3b4ecc1ab60 (at 10.9.105.2@o2ib4) [2111928.306792] Lustre: Skipped 1 previous similar message [2112845.419910] Lustre: MGS: Connection restored to 19f0f52b-d6a6-0ecb-bf08-1ece503c917d (at 10.9.101.54@o2ib4) [2112845.429832] Lustre: Skipped 1 previous similar message [2112953.814947] Lustre: MGS: Connection restored to (at 10.8.26.11@o2ib6) [2112953.821676] Lustre: Skipped 1 previous similar message [2112955.711304] Lustre: MGS: Connection restored to a76cca61-9f31-8203-f0f0-b5ac7feacee3 (at 10.8.24.34@o2ib6) [2112955.721140] Lustre: Skipped 1 previous similar message [2112967.864174] Lustre: MGS: Connection restored to (at 10.8.25.9@o2ib6) [2112967.870829] Lustre: Skipped 3 previous similar messages [2112990.520148] Lustre: MGS: Connection restored to (at 10.8.26.9@o2ib6) [2112990.526770] Lustre: Skipped 5 previous similar messages [2113000.366392] Lustre: MGS: Connection restored to 7616ae34-a3a2-7bba-3b2e-5f7ec765a7fd (at 10.8.26.20@o2ib6) [2113000.376222] Lustre: Skipped 1 previous similar message [2113025.710910] Lustre: MGS: Connection restored to 
(at 10.8.30.6@o2ib6) [2113025.717564] Lustre: Skipped 5 previous similar messages [2113058.768778] Lustre: MGS: Connection restored to (at 10.8.24.33@o2ib6) [2113058.775488] Lustre: Skipped 9 previous similar messages [2114740.485404] Lustre: MGS: haven't heard from client 193fde05-a0a1-4 (at 10.8.25.17@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8873a6f4bc00, cur 1578100967 expire 1578100817 last 1578100740 [2121917.529068] Lustre: MGS: haven't heard from client d986ddc6-2da6-5974-6d11-93258b76ea5d (at 10.9.113.9@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bdaf7d400, cur 1578108144 expire 1578107994 last 1578107917 [2121917.550309] Lustre: Skipped 1 previous similar message [2124042.404272] Lustre: MGS: Connection restored to (at 10.9.113.9@o2ib4) [2124042.410991] Lustre: Skipped 5 previous similar messages [2126886.637364] Lustre: MGS: Connection restored to baa131a5-88f3-96c5-0b26-4e7d621f24f3 (at 10.8.26.22@o2ib6) [2126886.647192] Lustre: Skipped 1 previous similar message [2136479.905356] LNetError: 38675:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5) [2137957.486044] LNetError: 38676:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5) [2138699.292284] LNetError: 38675:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (-125, 0) [2138742.909940] Lustre: fir-MDT0000: Client e0e5609a-3df9-4 (at 10.9.117.20@o2ib4) reconnecting [2138742.918500] Lustre: fir-MDT0000: Connection restored to a3673fdd-c091-ae6f-4781-d627da6f4e17 (at 10.9.117.20@o2ib4) [2138742.929115] Lustre: Skipped 1 previous similar message [2139984.460323] LNetError: 38678:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (-125, 0) [2140028.283418] Lustre: fir-MDT0000: Client e86388a2-4f31-4 (at 10.9.110.44@o2ib4) reconnecting 
[2140028.293044] Lustre: fir-MDT0000: Connection restored to d269b7b3-c7ee-1895-0bbf-8293c505cff2 (at 10.9.110.44@o2ib4)
[2140975.740720] LNetError: 38673:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[2142199.926339] LNetError: 38665:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[2147213.241940] LNetError: 38670:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[2161552.736164] Lustre: fir-MDT0000: haven't heard from client 5c94f021-adf9-fa1f-72f9-194d183e38e1 (at 10.8.17.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff887ace3df400, cur 1578147779 expire 1578147629 last 1578147552
[2161552.758050] Lustre: Skipped 1 previous similar message
[2188201.883196] Lustre: MGS: haven't heard from client 99e279e3-9429-a107-5503-dbdcf071034c (at 10.8.18.24@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff885bbb452400, cur 1578174428 expire 1578174278 last 1578174201
[2188201.904410] Lustre: Skipped 1 previous similar message
[2190161.616203] Lustre: MGS: Connection restored to 8ed76d02-8e05-3e88-fa0d-1c2cd38448b9 (at 10.8.18.24@o2ib6)
[2204264.434391] Lustre: 109766:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[2204264.446003] Lustre: 109766:0:(llog_cat.c:894:llog_cat_process_or_fork()) Skipped 5978 previous similar messages
[2204423.393739] Lustre: 43092:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[2204623.802129] LNet: Service thread pid 43092 was inactive for 200.40s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[2204623.819237] Pid: 43092, comm: mdt00_080 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[2204623.829697] Call Trace:
[2204623.832418] [] __wait_on_buffer+0x2a/0x30
[2204623.838337] [] ldiskfs_bread+0x7c/0xc0 [ldiskfs]
[2204623.844870] [] osd_ldiskfs_read+0xf4/0x2d0 [osd_ldiskfs]
[2204623.852097] [] osd_read+0x95/0xc0 [osd_ldiskfs]
[2204623.858528] [] dt_read+0x1a/0x50 [obdclass]
[2204623.864648] [] llog_osd_next_block+0x36a/0xbc0 [obdclass]
[2204623.871969] [] llog_process_thread+0x330/0x18e0 [obdclass]
[2204623.879345] [] llog_process_or_fork+0xbc/0x450 [obdclass]
[2204623.886643] [] llog_cat_process_cb+0x239/0x250 [obdclass]
[2204623.893949] [] llog_process_thread+0x82f/0x18e0 [obdclass]
[2204623.901326] [] llog_process_or_fork+0xbc/0x450 [obdclass]
[2204623.908622] [] llog_cat_process_or_fork+0x17e/0x360 [obdclass]
[2204623.916345] [] llog_cat_process+0x2e/0x30 [obdclass]
[2204623.923172] [] llog_changelog_cancel.isra.16+0x54/0x1c0 [mdd]
[2204623.930778] [] mdd_changelog_llog_cancel+0xd0/0x270 [mdd]
[2204623.938038] [] mdd_changelog_clear+0x503/0x690 [mdd]
[2204623.944858] [] mdd_iocontrol+0x163/0x540 [mdd]
[2204623.951165] [] mdt_iocontrol+0x5ec/0xb00 [mdt]
[2204623.957485] [] mdt_set_info+0x484/0x490 [mdt]
[2204623.963707] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[2204623.970850] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[2204623.978734] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[2204623.985244] [] kthread+0xd1/0xe0
[2204623.990332] [] ret_from_fork_nospec_begin+0xe/0x21
[2204623.996977] [] 0xffffffffffffffff
[2204624.002191] LustreError: dumping log to /tmp/lustre-log.1578190850.43092
[2204886.051817] LNetError: 38662:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Timed out tx: active_txs, 0 seconds
[2204886.062173] LNetError: 38662:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Timed out RDMA with 10.0.10.3@o2ib7 (105): c: 7, oc: 0, rc: 8
[2206073.170762] Lustre: MGS: Connection restored to ae1d0080-04fa-5436-e145-ffdf0db9990d (at 10.0.10.3@o2ib7)
[2206073.180531] Lustre: Skipped 1 previous similar message
[2206088.009705] Lustre: 109638:0:(llog_cat.c:894:llog_cat_process_or_fork()) fir-MDD0000: catlog [0x5:0xa:0x0] crosses index zero
[2206091.272923] general protection fault: 0000 [#1] SMP
[2206091.278116] Modules linked in: osp(OE) mdd(OE) mdt(OE) lustre(OE) mdc(OE) mgs(OE) lod(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) ldiskfs(OE) lmv(OE) osc(OE) lov(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) dell_rbu sunrpc vfat fat dm_round_robin amd64_edac_mod edac_mce_amd kvm_amd kvm irqbypass crc32_pclmul ghash_clmulni_intel dcdbas ipmi_si ses ipmi_devintf aesni_intel lrw gf128mul glue_helper enclosure ablk_helper pcspkr cryptd sg dm_multipath ipmi_msghandler ccp acpi_power_meter dm_mod i2c_piix4 k10temp ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic mlx5_ib(OE)
[2206091.351292] ib_uverbs(OE) ib_core(OE) i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ahci ttm mlx5_core(OE) libahci mlxfw(OE) devlink crct10dif_pclmul mpt3sas(OE) drm tg3 crct10dif_common mlx_compat(OE) libata raid_class crc32c_intel megaraid_sas ptp scsi_transport_sas drm_panel_orientation_quirks pps_core [last unloaded: osp]
[2206091.382061] CPU: 36 PID: 109638 Comm: mdt00_030 Kdump: loaded Tainted: G OE ------------ 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[2206091.395007] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[2206091.402832] task: ffff8865c0ae5140 ti: ffff886229cd4000 task.ti: ffff886229cd4000
[2206091.410484] RIP: 0010:[] [] llog_osd_next_block+0x964/0xbc0 [obdclass]
[2206091.420443] RSP: 0018:ffff886229cd7668 EFLAGS: 00010246
[2206091.425928] RAX: 5a5a5a5a5a5a5a5a RBX: 0000000000002000 RCX: 0000000000063c58
[2206091.433235] RDX: 0000000000000000 RSI: ffffffffc0c87f20 RDI: ffffffffc0cbd320
[2206091.440539] RBP: ffff886229cd7730 R08: 0000000000000001 R09: 0000000000000000
[2206091.447845] R10: 0000000000000000 R11: 0000000000000000 R12: ffff886229cd77e0
[2206091.455152] R13: ffff888080b9b000 R14: ffff8867bb808c00 R15: 000000000000fc8d
[2206091.462458] FS: 00007fa4a0ec4700(0000) GS:ffff885bff040000(0000) knlGS:0000000000000000
[2206091.470717] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2206091.476636] CR2: 00007fa4ae229000 CR3: 000000318c610000 CR4: 00000000003407e0
[2206091.483944] Call Trace:
[2206091.486596] [] ? llog_cat_cancel_records+0x1e7/0x3c0 [obdclass]
[2206091.494348] [] ? osd_trunc_unlock_all+0xf4/0x160 [osd_ldiskfs]
[2206091.502011] [] llog_process_thread+0x330/0x18e0 [obdclass]
[2206091.509329] [] ? mdd_obd_set_info_async+0x440/0x440 [mdd]
[2206091.516564] [] llog_process_or_fork+0xbc/0x450 [obdclass]
[2206091.523800] [] llog_cat_process_cb+0x239/0x250 [obdclass]
[2206091.531038] [] llog_process_thread+0x82f/0x18e0 [obdclass]
[2206091.538361] [] ? llog_cat_cancel_records+0x3c0/0x3c0 [obdclass]
[2206091.546116] [] llog_process_or_fork+0xbc/0x450 [obdclass]
[2206091.553353] [] ? llog_cat_cancel_records+0x3c0/0x3c0 [obdclass]
[2206091.561111] [] llog_cat_process_or_fork+0x17e/0x360 [obdclass]
[2206091.568786] [] ? lprocfs_counter_sub+0xc1/0x130 [obdclass]
[2206091.576093] [] ? mdd_obd_set_info_async+0x440/0x440 [mdd]
[2206091.583322] [] llog_cat_process+0x2e/0x30 [obdclass]
[2206091.590116] [] llog_changelog_cancel.isra.16+0x54/0x1c0 [mdd]
[2206091.597682] [] ? ktime_get+0x52/0xe0
[2206091.603089] [] mdd_changelog_llog_cancel+0xd0/0x270 [mdd]
[2206091.610310] [] mdd_changelog_clear+0x503/0x690 [mdd]
[2206091.617105] [] mdd_iocontrol+0x163/0x540 [mdd]
[2206091.623396] [] ? lu_context_init+0xd3/0x1f0 [obdclass]
[2206091.630367] [] mdt_iocontrol+0x5ec/0xb00 [mdt]
[2206091.636643] [] mdt_set_info+0x484/0x490 [mdt]
[2206091.642883] [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[2206091.649965] [] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[2206091.657712] [] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[2206091.664964] [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[2206091.672824] [] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[2206091.679791] [] ? __wake_up+0x44/0x50
[2206091.685217] [] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[2206091.691689] [] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[2206091.699255] [] kthread+0xd1/0xe0
[2206091.704303] [] ? insert_kthread_work+0x40/0x40
[2206091.710573] [] ret_from_fork_nospec_begin+0xe/0x21
[2206091.717183] [] ? insert_kthread_work+0x40/0x40
[2206091.723447] Code: 05 d6 ab 0a 00 00 00 02 00 48 c7 c7 20 d3 cb c0 48 8b 40 08 48 c7 05 a8 ab 0a 00 b0 76 c8 c0 45 8b 4e 54 45 8b 46 50 49 8b 4e 48 <48> 8b 50 28 49 8b 04 24 89 5c 24 10 48 89 44 24 08 41 8b 46 58
[2206091.744144] RIP [] llog_osd_next_block+0x964/0xbc0 [obdclass]
[2206091.751763] RSP