[ 0.000000] Initializing cgroup subsys cpuset [ 0.000000] Initializing cgroup subsys cpu [ 0.000000] Initializing cgroup subsys cpuacct [ 0.000000] Linux version 3.10.0-957.27.2.el7_lustre.pl2.x86_64 (sthiell@oak-rbh01) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-39) (GCC) ) #1 SMP Thu Nov 7 15:26:16 PST 2019 [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.10.0-957.27.2.el7_lustre.pl2.x86_64 root=UUID=d849e912-a315-42cb-87d5-f5cdb3f9be1f ro crashkernel=auto nomodeset console=ttyS0,115200 LANG=en_US.UTF-8 [ 0.000000] e820: BIOS-provided physical RAM map: [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000008efff] usable [ 0.000000] BIOS-e820: [mem 0x000000000008f000-0x000000000008ffff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x0000000000090000-0x000000000009ffff] usable [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000004f780fff] usable [ 0.000000] BIOS-e820: [mem 0x000000004f781000-0x0000000057789fff] reserved [ 0.000000] BIOS-e820: [mem 0x000000005778a000-0x000000006cacefff] usable [ 0.000000] BIOS-e820: [mem 0x000000006cacf000-0x000000006efcefff] reserved [ 0.000000] BIOS-e820: [mem 0x000000006efcf000-0x000000006fdfefff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x000000006fdff000-0x000000006fffefff] ACPI data [ 0.000000] BIOS-e820: [mem 0x000000006ffff000-0x000000006fffffff] usable [ 0.000000] BIOS-e820: [mem 0x0000000070000000-0x000000008fffffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fec10000-0x00000000fec10fff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fed80000-0x00000000fed80fff] reserved [ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000107f37ffff] usable [ 0.000000] BIOS-e820: [mem 0x000000107f380000-0x000000107fffffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000001080000000-0x000000207ff7ffff] usable [ 0.000000] BIOS-e820: [mem 0x000000207ff80000-0x000000207fffffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000002080000000-0x000000307ff7ffff] usable [ 0.000000] BIOS-e820: [mem 0x000000307ff80000-0x000000307fffffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000003080000000-0x000000407ff7ffff] usable [ 0.000000] BIOS-e820: [mem 0x000000407ff80000-0x000000407fffffff] reserved [ 0.000000] NX (Execute Disable) protection: active [ 0.000000] e820: update [mem 0x3793a020-0x379dbc5f] usable ==> usable [ 0.000000] e820: update [mem 0x37908020-0x37939c5f] usable ==> usable [ 0.000000] e820: update [mem 0x378d6020-0x37907c5f] usable ==> usable [ 0.000000] e820: update [mem 0x378cd020-0x378d505f] usable ==> usable [ 0.000000] e820: update [mem 0x378a7020-0x378ccc5f] usable ==> usable [ 0.000000] e820: update [mem 0x3788e020-0x378a665f] usable ==> usable [ 0.000000] extended physical RAM map: [ 0.000000] reserve setup_data: [mem 0x0000000000000000-0x000000000008efff] usable [ 0.000000] reserve setup_data: [mem 0x000000000008f000-0x000000000008ffff] ACPI NVS [ 0.000000] reserve setup_data: [mem 0x0000000000090000-0x000000000009ffff] usable [ 0.000000] reserve setup_data: [mem 0x0000000000100000-0x000000003788e01f] usable [ 0.000000] reserve setup_data: [mem 0x000000003788e020-0x00000000378a665f] usable [ 0.000000] reserve setup_data: [mem 0x00000000378a6660-0x00000000378a701f] usable [ 0.000000] reserve setup_data: [mem 0x00000000378a7020-0x00000000378ccc5f] usable [ 0.000000] reserve setup_data: [mem 0x00000000378ccc60-0x00000000378cd01f] usable [ 0.000000] reserve setup_data: [mem 0x00000000378cd020-0x00000000378d505f] usable [ 0.000000] reserve setup_data: [mem 0x00000000378d5060-0x00000000378d601f] usable [ 0.000000] reserve setup_data: [mem 
0x00000000378d6020-0x0000000037907c5f] usable [ 0.000000] reserve setup_data: [mem 0x0000000037907c60-0x000000003790801f] usable [ 0.000000] reserve setup_data: [mem 0x0000000037908020-0x0000000037939c5f] usable [ 0.000000] reserve setup_data: [mem 0x0000000037939c60-0x000000003793a01f] usable [ 0.000000] reserve setup_data: [mem 0x000000003793a020-0x00000000379dbc5f] usable [ 0.000000] reserve setup_data: [mem 0x00000000379dbc60-0x000000004f780fff] usable [ 0.000000] reserve setup_data: [mem 0x000000004f781000-0x0000000057789fff] reserved [ 0.000000] reserve setup_data: [mem 0x000000005778a000-0x000000006cacefff] usable [ 0.000000] reserve setup_data: [mem 0x000000006cacf000-0x000000006efcefff] reserved [ 0.000000] reserve setup_data: [mem 0x000000006efcf000-0x000000006fdfefff] ACPI NVS [ 0.000000] reserve setup_data: [mem 0x000000006fdff000-0x000000006fffefff] ACPI data [ 0.000000] reserve setup_data: [mem 0x000000006ffff000-0x000000006fffffff] usable [ 0.000000] reserve setup_data: [mem 0x0000000070000000-0x000000008fffffff] reserved [ 0.000000] reserve setup_data: [mem 0x00000000fec10000-0x00000000fec10fff] reserved [ 0.000000] reserve setup_data: [mem 0x00000000fed80000-0x00000000fed80fff] reserved [ 0.000000] reserve setup_data: [mem 0x0000000100000000-0x000000107f37ffff] usable [ 0.000000] reserve setup_data: [mem 0x000000107f380000-0x000000107fffffff] reserved [ 0.000000] reserve setup_data: [mem 0x0000001080000000-0x000000207ff7ffff] usable [ 0.000000] reserve setup_data: [mem 0x000000207ff80000-0x000000207fffffff] reserved [ 0.000000] reserve setup_data: [mem 0x0000002080000000-0x000000307ff7ffff] usable [ 0.000000] reserve setup_data: [mem 0x000000307ff80000-0x000000307fffffff] reserved [ 0.000000] reserve setup_data: [mem 0x0000003080000000-0x000000407ff7ffff] usable [ 0.000000] reserve setup_data: [mem 0x000000407ff80000-0x000000407fffffff] reserved [ 0.000000] efi: EFI v2.50 by Dell Inc. 
[ 0.000000] efi: ACPI=0x6fffe000 ACPI 2.0=0x6fffe014 SMBIOS=0x6eab5000 SMBIOS 3.0=0x6eab3000 [ 0.000000] efi: mem00: type=3, attr=0xf, range=[0x0000000000000000-0x0000000000001000) (0MB) [ 0.000000] efi: mem01: type=2, attr=0xf, range=[0x0000000000001000-0x0000000000002000) (0MB) [ 0.000000] efi: mem02: type=7, attr=0xf, range=[0x0000000000002000-0x0000000000010000) (0MB) [ 0.000000] efi: mem03: type=3, attr=0xf, range=[0x0000000000010000-0x0000000000014000) (0MB) [ 0.000000] efi: mem04: type=7, attr=0xf, range=[0x0000000000014000-0x0000000000063000) (0MB) [ 0.000000] efi: mem05: type=3, attr=0xf, range=[0x0000000000063000-0x000000000008f000) (0MB) [ 0.000000] efi: mem06: type=10, attr=0xf, range=[0x000000000008f000-0x0000000000090000) (0MB) [ 0.000000] efi: mem07: type=3, attr=0xf, range=[0x0000000000090000-0x00000000000a0000) (0MB) [ 0.000000] efi: mem08: type=4, attr=0xf, range=[0x0000000000100000-0x0000000000120000) (0MB) [ 0.000000] efi: mem09: type=7, attr=0xf, range=[0x0000000000120000-0x0000000000c00000) (10MB) [ 0.000000] efi: mem10: type=3, attr=0xf, range=[0x0000000000c00000-0x0000000001000000) (4MB) [ 0.000000] efi: mem11: type=2, attr=0xf, range=[0x0000000001000000-0x000000000267b000) (22MB) [ 0.000000] efi: mem12: type=7, attr=0xf, range=[0x000000000267b000-0x0000000004000000) (25MB) [ 0.000000] efi: mem13: type=4, attr=0xf, range=[0x0000000004000000-0x000000000403b000) (0MB) [ 0.000000] efi: mem14: type=7, attr=0xf, range=[0x000000000403b000-0x000000003788e000) (824MB) [ 0.000000] efi: mem15: type=2, attr=0xf, range=[0x000000003788e000-0x000000004ede4000) (373MB) [ 0.000000] efi: mem16: type=7, attr=0xf, range=[0x000000004ede4000-0x000000004ede8000) (0MB) [ 0.000000] efi: mem17: type=2, attr=0xf, range=[0x000000004ede8000-0x000000004edea000) (0MB) [ 0.000000] efi: mem18: type=1, attr=0xf, range=[0x000000004edea000-0x000000004ef07000) (1MB) [ 0.000000] efi: mem19: type=2, attr=0xf, range=[0x000000004ef07000-0x000000004f026000) (1MB) [ 0.000000] efi: mem20: type=1, attr=0xf, range=[0x000000004f026000-0x000000004f135000) (1MB) [ 0.000000] efi: mem21: type=3, attr=0xf, range=[0x000000004f135000-0x000000004f781000) (6MB) [ 0.000000] efi: mem22: type=0, attr=0xf, range=[0x000000004f781000-0x000000005778a000) (128MB) [ 0.000000] efi: mem23: type=3, attr=0xf, range=[0x000000005778a000-0x000000005796e000) (1MB) [ 0.000000] efi: mem24: type=4, attr=0xf, range=[0x000000005796e000-0x000000005b4cf000) (59MB) [ 0.000000] efi: mem25: type=3, attr=0xf, range=[0x000000005b4cf000-0x000000005b8cf000) (4MB) [ 0.000000] efi: mem26: type=7, attr=0xf, range=[0x000000005b8cf000-0x000000006531c000) (154MB) [ 0.000000] efi: mem27: type=4, attr=0xf, range=[0x000000006531c000-0x0000000065329000) (0MB) [ 0.000000] efi: mem28: type=7, attr=0xf, range=[0x0000000065329000-0x000000006532d000) (0MB) [ 0.000000] efi: mem29: type=4, attr=0xf, range=[0x000000006532d000-0x0000000065957000) (6MB) [ 0.000000] efi: mem30: type=7, attr=0xf, range=[0x0000000065957000-0x0000000065958000) (0MB) [ 0.000000] efi: mem31: type=4, attr=0xf, range=[0x0000000065958000-0x0000000065961000) (0MB) [ 0.000000] efi: mem32: type=7, attr=0xf, range=[0x0000000065961000-0x0000000065962000) (0MB) [ 0.000000] efi: mem33: type=4, attr=0xf, range=[0x0000000065962000-0x0000000065974000) (0MB) [ 0.000000] efi: mem34: type=7, attr=0xf, range=[0x0000000065974000-0x0000000065975000) (0MB) [ 0.000000] efi: mem35: type=4, attr=0xf, range=[0x0000000065975000-0x0000000065976000) (0MB) [ 0.000000] efi: mem36: type=7, attr=0xf, 
range=[0x0000000065976000-0x0000000065977000) (0MB) [ 0.000000] efi: mem37: type=4, attr=0xf, range=[0x0000000065977000-0x000000006597c000) (0MB) [ 0.000000] efi: mem38: type=7, attr=0xf, range=[0x000000006597c000-0x000000006597f000) (0MB) [ 0.000000] efi: mem39: type=4, attr=0xf, range=[0x000000006597f000-0x000000006599b000) (0MB) [ 0.000000] efi: mem40: type=7, attr=0xf, range=[0x000000006599b000-0x000000006599d000) (0MB) [ 0.000000] efi: mem41: type=4, attr=0xf, range=[0x000000006599d000-0x00000000659a1000) (0MB) [ 0.000000] efi: mem42: type=7, attr=0xf, range=[0x00000000659a1000-0x00000000659a2000) (0MB) [ 0.000000] efi: mem43: type=4, attr=0xf, range=[0x00000000659a2000-0x00000000659a6000) (0MB) [ 0.000000] efi: mem44: type=7, attr=0xf, range=[0x00000000659a6000-0x00000000659a7000) (0MB) [ 0.000000] efi: mem45: type=4, attr=0xf, range=[0x00000000659a7000-0x00000000659ac000) (0MB) [ 0.000000] efi: mem46: type=7, attr=0xf, range=[0x00000000659ac000-0x00000000659ad000) (0MB) [ 0.000000] efi: mem47: type=4, attr=0xf, range=[0x00000000659ad000-0x00000000659b0000) (0MB) [ 0.000000] efi: mem48: type=7, attr=0xf, range=[0x00000000659b0000-0x00000000659b2000) (0MB) [ 0.000000] efi: mem49: type=4, attr=0xf, range=[0x00000000659b2000-0x00000000659bc000) (0MB) [ 0.000000] efi: mem50: type=7, attr=0xf, range=[0x00000000659bc000-0x00000000659bd000) (0MB) [ 0.000000] efi: mem51: type=4, attr=0xf, range=[0x00000000659bd000-0x00000000659c0000) (0MB) [ 0.000000] efi: mem52: type=7, attr=0xf, range=[0x00000000659c0000-0x00000000659c1000) (0MB) [ 0.000000] efi: mem53: type=4, attr=0xf, range=[0x00000000659c1000-0x00000000659d0000) (0MB) [ 0.000000] efi: mem54: type=7, attr=0xf, range=[0x00000000659d0000-0x00000000659d1000) (0MB) [ 0.000000] efi: mem55: type=4, attr=0xf, range=[0x00000000659d1000-0x0000000065a58000) (0MB) [ 0.000000] efi: mem56: type=7, attr=0xf, range=[0x0000000065a58000-0x0000000065a59000) (0MB) [ 0.000000] efi: mem57: type=4, attr=0xf, range=[0x0000000065a59000-0x0000000065ceb000) (2MB) [ 0.000000] efi: mem58: type=7, attr=0xf, range=[0x0000000065ceb000-0x0000000065cec000) (0MB) [ 0.000000] efi: mem59: type=4, attr=0xf, range=[0x0000000065cec000-0x0000000065d1c000) (0MB) [ 0.000000] efi: mem60: type=7, attr=0xf, range=[0x0000000065d1c000-0x0000000065d1d000) (0MB) [ 0.000000] efi: mem61: type=4, attr=0xf, range=[0x0000000065d1d000-0x0000000065d30000) (0MB) [ 0.000000] efi: mem62: type=7, attr=0xf, range=[0x0000000065d30000-0x0000000065d31000) (0MB) [ 0.000000] efi: mem63: type=4, attr=0xf, range=[0x0000000065d31000-0x0000000065d73000) (0MB) [ 0.000000] efi: mem64: type=7, attr=0xf, range=[0x0000000065d73000-0x0000000065d74000) (0MB) [ 0.000000] efi: mem65: type=4, attr=0xf, range=[0x0000000065d74000-0x0000000065da8000) (0MB) [ 0.000000] efi: mem66: type=7, attr=0xf, range=[0x0000000065da8000-0x0000000065da9000) (0MB) [ 0.000000] efi: mem67: type=4, attr=0xf, range=[0x0000000065da9000-0x0000000065dc5000) (0MB) [ 0.000000] efi: mem68: type=7, attr=0xf, range=[0x0000000065dc5000-0x0000000065dc6000) (0MB) [ 0.000000] efi: mem69: type=4, attr=0xf, range=[0x0000000065dc6000-0x0000000065dd4000) (0MB) [ 0.000000] efi: mem70: type=7, attr=0xf, range=[0x0000000065dd4000-0x0000000065dd5000) (0MB) [ 0.000000] efi: mem71: type=4, attr=0xf, range=[0x0000000065dd5000-0x0000000065df4000) (0MB) [ 0.000000] efi: mem72: type=7, attr=0xf, range=[0x0000000065df4000-0x0000000065df5000) (0MB) [ 0.000000] efi: mem73: type=4, attr=0xf, range=[0x0000000065df5000-0x0000000065e01000) (0MB) [ 0.000000] efi: mem74: 
type=7, attr=0xf, range=[0x0000000065e01000-0x0000000065e02000) (0MB) [ 0.000000] efi: mem75: type=4, attr=0xf, range=[0x0000000065e02000-0x0000000065e08000) (0MB) [ 0.000000] efi: mem76: type=7, attr=0xf, range=[0x0000000065e08000-0x0000000065e09000) (0MB) [ 0.000000] efi: mem77: type=4, attr=0xf, range=[0x0000000065e09000-0x0000000065e64000) (0MB) [ 0.000000] efi: mem78: type=7, attr=0xf, range=[0x0000000065e64000-0x0000000065e66000) (0MB) [ 0.000000] efi: mem79: type=4, attr=0xf, range=[0x0000000065e66000-0x0000000065e84000) (0MB) [ 0.000000] efi: mem80: type=7, attr=0xf, range=[0x0000000065e84000-0x0000000065e85000) (0MB) [ 0.000000] efi: mem81: type=4, attr=0xf, range=[0x0000000065e85000-0x0000000065e95000) (0MB) [ 0.000000] efi: mem82: type=7, attr=0xf, range=[0x0000000065e95000-0x0000000065e96000) (0MB) [ 0.000000] efi: mem83: type=4, attr=0xf, range=[0x0000000065e96000-0x0000000065eb1000) (0MB) [ 0.000000] efi: mem84: type=7, attr=0xf, range=[0x0000000065eb1000-0x0000000065eb2000) (0MB) [ 0.000000] efi: mem85: type=4, attr=0xf, range=[0x0000000065eb2000-0x0000000065ec1000) (0MB) [ 0.000000] efi: mem86: type=7, attr=0xf, range=[0x0000000065ec1000-0x0000000065ec2000) (0MB) [ 0.000000] efi: mem87: type=4, attr=0xf, range=[0x0000000065ec2000-0x0000000065eca000) (0MB) [ 0.000000] efi: mem88: type=7, attr=0xf, range=[0x0000000065eca000-0x0000000065ecb000) (0MB) [ 0.000000] efi: mem89: type=4, attr=0xf, range=[0x0000000065ecb000-0x000000006b8cf000) (90MB) [ 0.000000] efi: mem90: type=7, attr=0xf, range=[0x000000006b8cf000-0x000000006b8d0000) (0MB) [ 0.000000] efi: mem91: type=3, attr=0xf, range=[0x000000006b8d0000-0x000000006cacf000) (17MB) [ 0.000000] efi: mem92: type=6, attr=0x800000000000000f, range=[0x000000006cacf000-0x000000006cbcf000) (1MB) [ 0.000000] efi: mem93: type=5, attr=0x800000000000000f, range=[0x000000006cbcf000-0x000000006cdcf000) (2MB) [ 0.000000] efi: mem94: type=0, attr=0xf, range=[0x000000006cdcf000-0x000000006efcf000) (34MB) [ 0.000000] efi: mem95: type=10, attr=0xf, range=[0x000000006efcf000-0x000000006fdff000) (14MB) [ 0.000000] efi: mem96: type=9, attr=0xf, range=[0x000000006fdff000-0x000000006ffff000) (2MB) [ 0.000000] efi: mem97: type=4, attr=0xf, range=[0x000000006ffff000-0x0000000070000000) (0MB) [ 0.000000] efi: mem98: type=7, attr=0xf, range=[0x0000000100000000-0x000000107f380000) (63475MB) [ 0.000000] efi: mem99: type=7, attr=0xf, range=[0x0000001080000000-0x000000207ff80000) (65535MB) [ 0.000000] efi: mem100: type=7, attr=0xf, range=[0x0000002080000000-0x000000307ff80000) (65535MB) [ 0.000000] efi: mem101: type=7, attr=0xf, range=[0x0000003080000000-0x000000407ff80000) (65535MB) [ 0.000000] efi: mem102: type=0, attr=0x9, range=[0x0000000070000000-0x0000000080000000) (256MB) [ 0.000000] efi: mem103: type=11, attr=0x800000000000000f, range=[0x0000000080000000-0x0000000090000000) (256MB) [ 0.000000] efi: mem104: type=11, attr=0x800000000000000f, range=[0x00000000fec10000-0x00000000fec11000) (0MB) [ 0.000000] efi: mem105: type=11, attr=0x800000000000000f, range=[0x00000000fed80000-0x00000000fed81000) (0MB) [ 0.000000] efi: mem106: type=0, attr=0x0, range=[0x000000107f380000-0x0000001080000000) (12MB) [ 0.000000] efi: mem107: type=0, attr=0x0, range=[0x000000207ff80000-0x0000002080000000) (0MB) [ 0.000000] efi: mem108: type=0, attr=0x0, range=[0x000000307ff80000-0x0000003080000000) (0MB) [ 0.000000] efi: mem109: type=0, attr=0x0, range=[0x000000407ff80000-0x0000004080000000) (0MB) [ 0.000000] SMBIOS 3.2.0 present. [ 0.000000] DMI: Dell Inc. 
PowerEdge R6415/07YXFK, BIOS 1.10.6 08/15/2019 [ 0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved [ 0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable [ 0.000000] e820: last_pfn = 0x407ff80 max_arch_pfn = 0x400000000 [ 0.000000] MTRR default type: uncachable [ 0.000000] MTRR fixed ranges enabled: [ 0.000000] 00000-9FFFF write-back [ 0.000000] A0000-FFFFF uncachable [ 0.000000] MTRR variable ranges enabled: [ 0.000000] 0 base 0000FF000000 mask FFFFFF000000 write-protect [ 0.000000] 1 base 000000000000 mask FFFF80000000 write-back [ 0.000000] 2 base 000070000000 mask FFFFF0000000 uncachable [ 0.000000] 3 disabled [ 0.000000] 4 disabled [ 0.000000] 5 disabled [ 0.000000] 6 disabled [ 0.000000] 7 disabled [ 0.000000] TOM2: 0000004080000000 aka 264192M [ 0.000000] PAT configuration [0-7]: WB WC UC- UC WB WP UC- UC [ 0.000000] e820: last_pfn = 0x70000 max_arch_pfn = 0x400000000 [ 0.000000] Base memory trampoline at [ffff94f380099000] 99000 size 24576 [ 0.000000] Using GB pages for direct mapping [ 0.000000] BRK [0x3a8fe53000, 0x3a8fe53fff] PGTABLE [ 0.000000] BRK [0x3a8fe54000, 0x3a8fe54fff] PGTABLE [ 0.000000] BRK [0x3a8fe55000, 0x3a8fe55fff] PGTABLE [ 0.000000] BRK [0x3a8fe56000, 0x3a8fe56fff] PGTABLE [ 0.000000] BRK [0x3a8fe57000, 0x3a8fe57fff] PGTABLE [ 0.000000] BRK [0x3a8fe58000, 0x3a8fe58fff] PGTABLE [ 0.000000] BRK [0x3a8fe59000, 0x3a8fe59fff] PGTABLE [ 0.000000] BRK [0x3a8fe5a000, 0x3a8fe5afff] PGTABLE [ 0.000000] BRK [0x3a8fe5b000, 0x3a8fe5bfff] PGTABLE [ 0.000000] BRK [0x3a8fe5c000, 0x3a8fe5cfff] PGTABLE [ 0.000000] BRK [0x3a8fe5d000, 0x3a8fe5dfff] PGTABLE [ 0.000000] BRK [0x3a8fe5e000, 0x3a8fe5efff] PGTABLE [ 0.000000] RAMDISK: [mem 0x379dc000-0x38d22fff] [ 0.000000] Early table checksum verification disabled [ 0.000000] ACPI: RSDP 000000006fffe014 00024 (v02 DELL ) [ 0.000000] ACPI: XSDT 000000006fffd0e8 000AC (v01 DELL PE_SC3 00000002 DELL 00000001) [ 0.000000] ACPI: FACP 000000006fff0000 00114 (v06 DELL PE_SC3 00000002 DELL 00000001) [ 0.000000] ACPI: DSDT 000000006ffdc000 1038C (v02 DELL PE_SC3 00000002 DELL 00000001) [ 0.000000] ACPI: FACS 000000006fdd3000 00040 [ 0.000000] ACPI: SSDT 000000006fffc000 000D2 (v02 DELL PE_SC3 00000002 MSFT 04000000) [ 0.000000] ACPI: BERT 000000006fffb000 00030 (v01 DELL BERT 00000001 DELL 00000001) [ 0.000000] ACPI: HEST 000000006fffa000 006DC (v01 DELL HEST 00000001 DELL 00000001) [ 0.000000] ACPI: SSDT 000000006fff9000 00294 (v01 DELL PE_SC3 00000001 AMD 00000001) [ 0.000000] ACPI: SRAT 000000006fff8000 00420 (v03 DELL PE_SC3 00000001 AMD 00000001) [ 0.000000] ACPI: MSCT 000000006fff7000 0004E (v01 DELL PE_SC3 00000000 AMD 00000001) [ 0.000000] ACPI: SLIT 000000006fff6000 0003C (v01 DELL PE_SC3 00000001 AMD 00000001) [ 0.000000] ACPI: CRAT 000000006fff3000 02DC0 (v01 DELL PE_SC3 00000001 AMD 00000001) [ 0.000000] ACPI: EINJ 000000006fff2000 00150 (v01 DELL PE_SC3 00000001 AMD 00000001) [ 0.000000] ACPI: SLIC 000000006fff1000 00024 (v01 DELL PE_SC3 00000002 DELL 00000001) [ 0.000000] ACPI: HPET 000000006ffef000 00038 (v01 DELL PE_SC3 00000002 DELL 00000001) [ 0.000000] ACPI: APIC 000000006ffee000 004B2 (v03 DELL PE_SC3 00000002 DELL 00000001) [ 0.000000] ACPI: MCFG 000000006ffed000 0003C (v01 DELL PE_SC3 00000002 DELL 00000001) [ 0.000000] ACPI: SSDT 000000006ffdb000 00629 (v02 DELL xhc_port 00000001 INTL 20170119) [ 0.000000] ACPI: IVRS 000000006ffda000 00210 (v02 DELL PE_SC3 00000001 AMD 00000000) [ 0.000000] ACPI: SSDT 000000006ffd8000 01658 (v01 AMD CPMCMN 00000001 INTL 20170119) [ 0.000000] ACPI: Local APIC 
address 0xfee00000 [ 0.000000] SRAT: PXM 0 -> APIC 0x00 -> Node 0 [ 0.000000] SRAT: PXM 0 -> APIC 0x01 -> Node 0 [ 0.000000] SRAT: PXM 0 -> APIC 0x02 -> Node 0 [ 0.000000] SRAT: PXM 0 -> APIC 0x03 -> Node 0 [ 0.000000] SRAT: PXM 0 -> APIC 0x04 -> Node 0 [ 0.000000] SRAT: PXM 0 -> APIC 0x05 -> Node 0 [ 0.000000] SRAT: PXM 0 -> APIC 0x08 -> Node 0 [ 0.000000] SRAT: PXM 0 -> APIC 0x09 -> Node 0 [ 0.000000] SRAT: PXM 0 -> APIC 0x0a -> Node 0 [ 0.000000] SRAT: PXM 0 -> APIC 0x0b -> Node 0 [ 0.000000] SRAT: PXM 0 -> APIC 0x0c -> Node 0 [ 0.000000] SRAT: PXM 0 -> APIC 0x0d -> Node 0 [ 0.000000] SRAT: PXM 1 -> APIC 0x10 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 0x11 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 0x12 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 0x13 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 0x14 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 0x15 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 0x18 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 0x19 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 0x1a -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 0x1b -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 0x1c -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 0x1d -> Node 1 [ 0.000000] SRAT: PXM 2 -> APIC 0x20 -> Node 2 [ 0.000000] SRAT: PXM 2 -> APIC 0x21 -> Node 2 [ 0.000000] SRAT: PXM 2 -> APIC 0x22 -> Node 2 [ 0.000000] SRAT: PXM 2 -> APIC 0x23 -> Node 2 [ 0.000000] SRAT: PXM 2 -> APIC 0x24 -> Node 2 [ 0.000000] SRAT: PXM 2 -> APIC 0x25 -> Node 2 [ 0.000000] SRAT: PXM 2 -> APIC 0x28 -> Node 2 [ 0.000000] SRAT: PXM 2 -> APIC 0x29 -> Node 2 [ 0.000000] SRAT: PXM 2 -> APIC 0x2a -> Node 2 [ 0.000000] SRAT: PXM 2 -> APIC 0x2b -> Node 2 [ 0.000000] SRAT: PXM 2 -> APIC 0x2c -> Node 2 [ 0.000000] SRAT: PXM 2 -> APIC 0x2d -> Node 2 [ 0.000000] SRAT: PXM 3 -> APIC 0x30 -> Node 3 [ 0.000000] SRAT: PXM 3 -> APIC 0x31 -> Node 3 [ 0.000000] SRAT: PXM 3 -> APIC 0x32 -> Node 3 [ 0.000000] SRAT: PXM 3 -> APIC 0x33 -> Node 3 [ 0.000000] SRAT: PXM 3 -> APIC 0x34 -> Node 3 [ 0.000000] SRAT: PXM 3 -> APIC 0x35 -> Node 3 [ 0.000000] SRAT: PXM 3 -> APIC 0x38 -> Node 3 [ 0.000000] SRAT: PXM 3 -> APIC 0x39 -> Node 3 [ 0.000000] SRAT: PXM 3 -> APIC 0x3a -> Node 3 [ 0.000000] SRAT: PXM 3 -> APIC 0x3b -> Node 3 [ 0.000000] SRAT: PXM 3 -> APIC 0x3c -> Node 3 [ 0.000000] SRAT: PXM 3 -> APIC 0x3d -> Node 3 [ 0.000000] SRAT: Node 0 PXM 0 [mem 0x00000000-0x0009ffff] [ 0.000000] SRAT: Node 0 PXM 0 [mem 0x00100000-0x7fffffff] [ 0.000000] SRAT: Node 0 PXM 0 [mem 0x100000000-0x107fffffff] [ 0.000000] SRAT: Node 1 PXM 1 [mem 0x1080000000-0x207fffffff] [ 0.000000] SRAT: Node 2 PXM 2 [mem 0x2080000000-0x307fffffff] [ 0.000000] SRAT: Node 3 PXM 3 [mem 0x3080000000-0x407fffffff] [ 0.000000] NUMA: Initialized distance table, cnt=4 [ 0.000000] NUMA: Node 0 [mem 0x00000000-0x0009ffff] + [mem 0x00100000-0x7fffffff] -> [mem 0x00000000-0x7fffffff] [ 0.000000] NUMA: Node 0 [mem 0x00000000-0x7fffffff] + [mem 0x100000000-0x107fffffff] -> [mem 0x00000000-0x107fffffff] [ 0.000000] NODE_DATA(0) allocated [mem 0x107f359000-0x107f37ffff] [ 0.000000] NODE_DATA(1) allocated [mem 0x207ff59000-0x207ff7ffff] [ 0.000000] NODE_DATA(2) allocated [mem 0x307ff59000-0x307ff7ffff] [ 0.000000] NODE_DATA(3) allocated [mem 0x407ff58000-0x407ff7efff] [ 0.000000] Reserving 176MB of memory at 704MB for crashkernel (System RAM: 261692MB) [ 0.000000] Zone ranges: [ 0.000000] DMA [mem 0x00001000-0x00ffffff] [ 0.000000] DMA32 [mem 0x01000000-0xffffffff] [ 0.000000] Normal [mem 0x100000000-0x407ff7ffff] [ 0.000000] Movable zone start for each node [ 0.000000] Early memory node ranges [ 0.000000] node 0: [mem 
0x00001000-0x0008efff] [ 0.000000] node 0: [mem 0x00090000-0x0009ffff] [ 0.000000] node 0: [mem 0x00100000-0x4f780fff] [ 0.000000] node 0: [mem 0x5778a000-0x6cacefff] [ 0.000000] node 0: [mem 0x6ffff000-0x6fffffff] [ 0.000000] node 0: [mem 0x100000000-0x107f37ffff] [ 0.000000] node 1: [mem 0x1080000000-0x207ff7ffff] [ 0.000000] node 2: [mem 0x2080000000-0x307ff7ffff] [ 0.000000] node 3: [mem 0x3080000000-0x407ff7ffff] [ 0.000000] Initmem setup node 0 [mem 0x00001000-0x107f37ffff] [ 0.000000] On node 0 totalpages: 16661989 [ 0.000000] DMA zone: 64 pages used for memmap [ 0.000000] DMA zone: 1126 pages reserved [ 0.000000] DMA zone: 3998 pages, LIFO batch:0 [ 0.000000] DMA32 zone: 6380 pages used for memmap [ 0.000000] DMA32 zone: 408263 pages, LIFO batch:31 [ 0.000000] Normal zone: 253902 pages used for memmap [ 0.000000] Normal zone: 16249728 pages, LIFO batch:31 [ 0.000000] Initmem setup node 1 [mem 0x1080000000-0x207ff7ffff] [ 0.000000] On node 1 totalpages: 16777088 [ 0.000000] Normal zone: 262142 pages used for memmap [ 0.000000] Normal zone: 16777088 pages, LIFO batch:31 [ 0.000000] Initmem setup node 2 [mem 0x2080000000-0x307ff7ffff] [ 0.000000] On node 2 totalpages: 16777088 [ 0.000000] Normal zone: 262142 pages used for memmap [ 0.000000] Normal zone: 16777088 pages, LIFO batch:31 [ 0.000000] Initmem setup node 3 [mem 0x3080000000-0x407ff7ffff] [ 0.000000] On node 3 totalpages: 16777088 [ 0.000000] Normal zone: 262142 pages used for memmap [ 0.000000] Normal zone: 16777088 pages, LIFO batch:31 [ 0.000000] ACPI: PM-Timer IO Port: 0x408 [ 0.000000] ACPI: Local APIC address 0xfee00000 [ 0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x10] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x20] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x30] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x04] lapic_id[0x08] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x05] lapic_id[0x18] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x06] lapic_id[0x28] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x07] lapic_id[0x38] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x08] lapic_id[0x02] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x09] lapic_id[0x12] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x0a] lapic_id[0x22] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x0b] lapic_id[0x32] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x0c] lapic_id[0x0a] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x0d] lapic_id[0x1a] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x0e] lapic_id[0x2a] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x0f] lapic_id[0x3a] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x10] lapic_id[0x04] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x11] lapic_id[0x14] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x12] lapic_id[0x24] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x13] lapic_id[0x34] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x14] lapic_id[0x0c] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x15] lapic_id[0x1c] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x16] lapic_id[0x2c] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x17] lapic_id[0x3c] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x18] lapic_id[0x01] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x19] lapic_id[0x11] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x1a] lapic_id[0x21] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x1b] lapic_id[0x31] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x1c] lapic_id[0x09] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x1d] lapic_id[0x19] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x1e] 
lapic_id[0x29] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x1f] lapic_id[0x39] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x20] lapic_id[0x03] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x21] lapic_id[0x13] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x22] lapic_id[0x23] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x23] lapic_id[0x33] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x24] lapic_id[0x0b] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x25] lapic_id[0x1b] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x26] lapic_id[0x2b] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x27] lapic_id[0x3b] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x28] lapic_id[0x05] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x29] lapic_id[0x15] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x2a] lapic_id[0x25] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x2b] lapic_id[0x35] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x2c] lapic_id[0x0d] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x2d] lapic_id[0x1d] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x2e] lapic_id[0x2d] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x2f] lapic_id[0x3d] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x30] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x31] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x32] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x33] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x34] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x35] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x36] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x37] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x38] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x39] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x3a] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x3b] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x3c] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x3d] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x3e] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x3f] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x40] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x41] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x42] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x43] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x44] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x45] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x46] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x47] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x48] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x49] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x4a] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x4b] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x4c] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x4d] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x4e] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x4f] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x50] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x51] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x52] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x53] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x54] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x55] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC 
(acpi_id[0x56] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x57] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x58] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x59] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x5a] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x5b] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x5c] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x5d] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x5e] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x5f] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x60] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x61] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x62] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x63] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x64] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x65] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x66] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x67] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x68] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x69] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x6a] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x6b] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x6c] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x6d] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x6e] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x6f] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x70] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x71] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x72] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x73] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x74] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x75] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x76] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x77] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x78] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x79] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x7a] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x7b] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x7c] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x7d] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x7e] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x7f] lapic_id[0x00] disabled) [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0xff] high edge lint[0x1]) [ 0.000000] ACPI: IOAPIC (id[0x80] address[0xfec00000] gsi_base[0]) [ 0.000000] IOAPIC[0]: apic_id 128, version 33, address 0xfec00000, GSI 0-23 [ 0.000000] ACPI: IOAPIC (id[0x81] address[0xfd880000] gsi_base[24]) [ 0.000000] IOAPIC[1]: apic_id 129, version 33, address 0xfd880000, GSI 24-55 [ 0.000000] ACPI: IOAPIC (id[0x82] address[0xe0900000] gsi_base[56]) [ 0.000000] IOAPIC[2]: apic_id 130, version 33, address 0xe0900000, GSI 56-87 [ 0.000000] ACPI: IOAPIC (id[0x83] address[0xc5900000] gsi_base[88]) [ 0.000000] IOAPIC[3]: apic_id 131, version 33, address 0xc5900000, GSI 88-119 [ 0.000000] ACPI: IOAPIC (id[0x84] address[0xaa900000] gsi_base[120]) [ 0.000000] IOAPIC[4]: apic_id 132, version 33, address 0xaa900000, GSI 120-151 [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) [ 0.000000] ACPI: 
INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level) [ 0.000000] ACPI: IRQ0 used by override. [ 0.000000] ACPI: IRQ9 used by override. [ 0.000000] Using ACPI (MADT) for SMP configuration information [ 0.000000] ACPI: HPET id: 0x10228201 base: 0xfed00000 [ 0.000000] smpboot: Allowing 128 CPUs, 80 hotplug CPUs [ 0.000000] PM: Registered nosave memory: [mem 0x0008f000-0x0008ffff] [ 0.000000] PM: Registered nosave memory: [mem 0x000a0000-0x000fffff] [ 0.000000] PM: Registered nosave memory: [mem 0x3788e000-0x3788efff] [ 0.000000] PM: Registered nosave memory: [mem 0x378a6000-0x378a6fff] [ 0.000000] PM: Registered nosave memory: [mem 0x378a7000-0x378a7fff] [ 0.000000] PM: Registered nosave memory: [mem 0x378cc000-0x378ccfff] [ 0.000000] PM: Registered nosave memory: [mem 0x378cd000-0x378cdfff] [ 0.000000] PM: Registered nosave memory: [mem 0x378d5000-0x378d5fff] [ 0.000000] PM: Registered nosave memory: [mem 0x378d6000-0x378d6fff] [ 0.000000] PM: Registered nosave memory: [mem 0x37907000-0x37907fff] [ 0.000000] PM: Registered nosave memory: [mem 0x37908000-0x37908fff] [ 0.000000] PM: Registered nosave memory: [mem 0x37939000-0x37939fff] [ 0.000000] PM: Registered nosave memory: [mem 0x3793a000-0x3793afff] [ 0.000000] PM: Registered nosave memory: [mem 0x379db000-0x379dbfff] [ 0.000000] PM: Registered nosave memory: [mem 0x4f781000-0x57789fff] [ 0.000000] PM: Registered nosave memory: [mem 0x6cacf000-0x6efcefff] [ 0.000000] PM: Registered nosave memory: [mem 0x6efcf000-0x6fdfefff] [ 0.000000] PM: Registered nosave memory: [mem 0x6fdff000-0x6fffefff] [ 0.000000] PM: Registered nosave memory: [mem 0x70000000-0x8fffffff] [ 0.000000] PM: Registered nosave memory: [mem 0x90000000-0xfec0ffff] [ 0.000000] PM: Registered nosave memory: [mem 0xfec10000-0xfec10fff] [ 0.000000] PM: Registered nosave memory: [mem 0xfec11000-0xfed7ffff] [ 0.000000] PM: Registered nosave memory: [mem 0xfed80000-0xfed80fff] [ 0.000000] PM: Registered nosave memory: [mem 0xfed81000-0xffffffff] [ 0.000000] PM: Registered nosave memory: [mem 0x107f380000-0x107fffffff] [ 0.000000] PM: Registered nosave memory: [mem 0x207ff80000-0x207fffffff] [ 0.000000] PM: Registered nosave memory: [mem 0x307ff80000-0x307fffffff] [ 0.000000] e820: [mem 0x90000000-0xfec0ffff] available for PCI devices [ 0.000000] Booting paravirtualized kernel on bare hardware [ 0.000000] setup_percpu: NR_CPUS:5120 nr_cpumask_bits:128 nr_cpu_ids:128 nr_node_ids:4 [ 0.000000] PERCPU: Embedded 38 pages/cpu @ffff9503bee00000 s118784 r8192 d28672 u262144 [ 0.000000] pcpu-alloc: s118784 r8192 d28672 u262144 alloc=1*2097152 [ 0.000000] pcpu-alloc: [0] 000 004 008 012 016 020 024 028 [ 0.000000] pcpu-alloc: [0] 032 036 040 044 048 052 056 060 [ 0.000000] pcpu-alloc: [0] 064 068 072 076 080 084 088 092 [ 0.000000] pcpu-alloc: [0] 096 100 104 108 112 116 120 124 [ 0.000000] pcpu-alloc: [1] 001 005 009 013 017 021 025 029 [ 0.000000] pcpu-alloc: [1] 033 037 041 045 049 053 057 061 [ 0.000000] pcpu-alloc: [1] 065 069 073 077 081 085 089 093 [ 0.000000] pcpu-alloc: [1] 097 101 105 109 113 117 121 125 [ 0.000000] pcpu-alloc: [2] 002 006 010 014 018 022 026 030 [ 0.000000] pcpu-alloc: [2] 034 038 042 046 050 054 058 062 [ 0.000000] pcpu-alloc: [2] 066 070 074 078 082 086 090 094 [ 0.000000] pcpu-alloc: [2] 098 102 106 110 114 118 122 126 [ 0.000000] pcpu-alloc: [3] 003 007 011 015 019 023 027 031 [ 0.000000] pcpu-alloc: [3] 035 039 043 047 051 055 059 063 [ 0.000000] pcpu-alloc: [3] 067 071 075 079 083 087 091 095 [ 0.000000] pcpu-alloc: [3] 099 103 107 111 115 119 123 127 [ 
0.000000] Built 4 zonelists in Zone order, mobility grouping on. Total pages: 65945355 [ 0.000000] Policy zone: Normal [ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.10.0-957.27.2.el7_lustre.pl2.x86_64 root=UUID=d849e912-a315-42cb-87d5-f5cdb3f9be1f ro crashkernel=auto nomodeset console=ttyS0,115200 LANG=en_US.UTF-8 [ 0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes) [ 0.000000] x86/fpu: xstate_offset[2]: 0240, xstate_sizes[2]: 0100 [ 0.000000] xsave: enabled xstate_bv 0x7, cntxt size 0x340 using standard form [ 0.000000] Memory: 9570336k/270532096k available (7676k kernel code, 2559084k absent, 4697628k reserved, 6045k data, 1876k init) [ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=128, Nodes=4 [ 0.000000] Hierarchical RCU implementation. [ 0.000000] RCU restricting CPUs from NR_CPUS=5120 to nr_cpu_ids=128. [ 0.000000] NR_IRQS:327936 nr_irqs:3624 0 [ 0.000000] Console: colour dummy device 80x25 [ 0.000000] console [ttyS0] enabled [ 0.000000] allocated 1072693248 bytes of page_cgroup [ 0.000000] please try 'cgroup_disable=memory' option if you don't want memory cgroups [ 0.000000] Enabling automatic NUMA balancing. Configure with numa_balancing= or the kernel.numa_balancing sysctl [ 0.000000] hpet clockevent registered [ 0.000000] tsc: Fast TSC calibration using PIT [ 0.000000] tsc: Detected 1996.233 MHz processor [ 0.000056] Calibrating delay loop (skipped), value calculated using timer frequency.. 3992.46 BogoMIPS (lpj=1996233) [ 0.010704] pid_max: default: 131072 minimum: 1024 [ 0.016308] Security Framework initialized [ 0.020425] SELinux: Initializing. [ 0.023985] SELinux: Starting in permissive mode [ 0.023986] Yama: becoming mindful. [ 0.044087] Dentry cache hash table entries: 33554432 (order: 16, 268435456 bytes) [ 0.099821] Inode-cache hash table entries: 16777216 (order: 15, 134217728 bytes) [ 0.127532] Mount-cache hash table entries: 524288 (order: 10, 4194304 bytes) [ 0.134928] Mountpoint-cache hash table entries: 524288 (order: 10, 4194304 bytes) [ 0.144058] Initializing cgroup subsys memory [ 0.148455] Initializing cgroup subsys devices [ 0.152917] Initializing cgroup subsys freezer [ 0.157371] Initializing cgroup subsys net_cls [ 0.161827] Initializing cgroup subsys blkio [ 0.166106] Initializing cgroup subsys perf_event [ 0.170831] Initializing cgroup subsys hugetlb [ 0.175286] Initializing cgroup subsys pids [ 0.179480] Initializing cgroup subsys net_prio [ 0.184094] tseg: 0070000000 [ 0.189722] LVT offset 2 assigned for vector 0xf4 [ 0.194450] Last level iTLB entries: 4KB 1024, 2MB 1024, 4MB 512 [ 0.200470] Last level dTLB entries: 4KB 1536, 2MB 1536, 4MB 768 [ 0.206484] tlb_flushall_shift: 6 [ 0.209834] Speculative Store Bypass: Mitigation: Speculative Store Bypass disabled via prctl and seccomp [ 0.219408] FEATURE SPEC_CTRL Not Present [ 0.223428] FEATURE IBPB_SUPPORT Present [ 0.227364] Spectre V2 : Enabling Indirect Branch Prediction Barrier [ 0.233800] Spectre V2 : Mitigation: Full retpoline [ 0.239269] Freeing SMP alternatives: 28k freed [ 0.245706] ACPI: Core revision 20130517 [ 0.254393] ACPI: All ACPI Tables successfully acquired [ 0.265991] ftrace: allocating 29216 entries in 115 pages [ 0.606235] Switched APIC routing to physical flat. 
[ 0.613165] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1 [ 0.629173] smpboot: CPU0: AMD EPYC 7401P 24-Core Processor (fam: 17, model: 01, stepping: 02) [ 0.711613] random: fast init done [ 0.741612] APIC calibration not consistent with PM-Timer: 101ms instead of 100ms [ 0.749095] APIC delta adjusted to PM-Timer: 623827 (636297) [ 0.754786] Performance Events: Fam17h core perfctr, AMD PMU driver. [ 0.761221] ... version: 0 [ 0.765231] ... bit width: 48 [ 0.769332] ... generic registers: 6 [ 0.773346] ... value mask: 0000ffffffffffff [ 0.778656] ... max period: 00007fffffffffff [ 0.783969] ... fixed-purpose events: 0 [ 0.787983] ... event mask: 000000000000003f [ 0.796320] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter. [ 0.804403] smpboot: Booting Node 1, Processors #1 OK [ 0.817606] smpboot: Booting Node 2, Processors #2 OK [ 0.830809] smpboot: Booting Node 3, Processors #3 OK [ 0.844016] smpboot: Booting Node 0, Processors #4 OK [ 0.857197] smpboot: Booting Node 1, Processors #5 OK [ 0.870378] smpboot: Booting Node 2, Processors #6 OK [ 0.883550] smpboot: Booting Node 3, Processors #7 OK [ 0.896734] smpboot: Booting Node 0, Processors #8 OK [ 0.910134] smpboot: Booting Node 1, Processors #9 OK [ 0.923332] smpboot: Booting Node 2, Processors #10 OK [ 0.936609] smpboot: Booting Node 3, Processors #11 OK [ 0.949882] smpboot: Booting Node 0, Processors #12 OK [ 0.963151] smpboot: Booting Node 1, Processors #13 OK [ 0.976424] smpboot: Booting Node 2, Processors #14 OK [ 0.989694] smpboot: Booting Node 3, Processors #15 OK [ 1.002965] smpboot: Booting Node 0, Processors #16 OK [ 1.016352] smpboot: Booting Node 1, Processors #17 OK [ 1.029630] smpboot: Booting Node 2, Processors #18 OK [ 1.042910] smpboot: Booting Node 3, Processors #19 OK [ 1.056181] smpboot: Booting Node 0, Processors #20 OK [ 1.069446] smpboot: Booting Node 1, Processors #21 OK [ 1.082715] smpboot: Booting Node 2, Processors #22 OK [ 1.095994] smpboot: Booting Node 3, Processors #23 OK [ 1.109270] smpboot: Booting Node 0, Processors #24 OK [ 1.123009] smpboot: Booting Node 1, Processors #25 OK [ 1.136257] smpboot: Booting Node 2, Processors #26 OK [ 1.149500] smpboot: Booting Node 3, Processors #27 OK [ 1.162734] smpboot: Booting Node 0, Processors #28 OK [ 1.175963] smpboot: Booting Node 1, Processors #29 OK [ 1.189197] smpboot: Booting Node 2, Processors #30 OK [ 1.202430] smpboot: Booting Node 3, Processors #31 OK [ 1.215655] smpboot: Booting Node 0, Processors #32 OK [ 1.228995] smpboot: Booting Node 1, Processors #33 OK [ 1.242229] smpboot: Booting Node 2, Processors #34 OK [ 1.255471] smpboot: Booting Node 3, Processors #35 OK [ 1.268696] smpboot: Booting Node 0, Processors #36 OK [ 1.281933] smpboot: Booting Node 1, Processors #37 OK [ 1.295277] smpboot: Booting Node 2, Processors #38 OK [ 1.308520] smpboot: Booting Node 3, Processors #39 OK [ 1.321744] smpboot: Booting Node 0, Processors #40 OK [ 1.335086] smpboot: Booting Node 1, Processors #41 OK [ 1.348422] smpboot: Booting Node 2, Processors #42 OK [ 1.361655] smpboot: Booting Node 3, Processors #43 OK [ 1.374893] smpboot: Booting Node 0, Processors #44 OK [ 1.388125] smpboot: Booting Node 1, Processors #45 OK [ 1.401463] smpboot: Booting Node 2, Processors #46 OK [ 1.414705] smpboot: Booting Node 3, Processors #47 [ 1.427406] Brought up 48 CPUs [ 1.430663] smpboot: Max logical packages: 3 [ 1.434939] smpboot: Total of 48 processors activated (191638.36 BogoMIPS) [ 1.723794] node 0 initialised, 15462980 pages in 274ms [ 
1.732440] node 2 initialised, 15989367 pages in 278ms [ 1.736111] node 3 initialised, 15984546 pages in 282ms [ 1.739762] node 1 initialised, 15989367 pages in 285ms [ 1.748672] devtmpfs: initialized [ 1.774461] EVM: security.selinux [ 1.777777] EVM: security.ima [ 1.780751] EVM: security.capability [ 1.784427] PM: Registering ACPI NVS region [mem 0x0008f000-0x0008ffff] (4096 bytes) [ 1.792167] PM: Registering ACPI NVS region [mem 0x6efcf000-0x6fdfefff] (14876672 bytes) [ 1.801801] atomic64 test passed for x86-64 platform with CX8 and with SSE [ 1.808676] pinctrl core: initialized pinctrl subsystem [ 1.814010] RTC time: 14:44:32, date: 12/10/19 [ 1.818613] NET: Registered protocol family 16 [ 1.823422] ACPI FADT declares the system doesn't support PCIe ASPM, so disable it [ 1.830993] ACPI: bus type PCI registered [ 1.835006] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5 [ 1.841589] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000) [ 1.850894] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820 [ 1.857685] PCI: Using configuration type 1 for base access [ 1.863269] PCI: Dell System detected, enabling pci=bfsort. [ 1.878999] ACPI: Added _OSI(Module Device) [ 1.883185] ACPI: Added _OSI(Processor Device) [ 1.887629] ACPI: Added _OSI(3.0 _SCP Extensions) [ 1.892335] ACPI: Added _OSI(Processor Aggregator Device) [ 1.897735] ACPI: Added _OSI(Linux-Dell-Video) [ 1.902999] ACPI: EC: Look up EC in DSDT [ 1.903981] ACPI: Executed 2 blocks of module-level executable AML code [ 1.916042] ACPI: Interpreter enabled [ 1.919713] ACPI: (supports S0 S5) [ 1.923120] ACPI: Using IOAPIC for interrupt routing [ 1.928298] HEST: Table parsing has been initialized. [ 1.933349] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug [ 1.942498] ACPI: Enabled 1 GPEs in block 00 to 1F [ 1.954162] ACPI: PCI Interrupt Link [LNKA] (IRQs 4 5 7 10 11 14 15) *0 [ 1.961073] ACPI: PCI Interrupt Link [LNKB] (IRQs 4 5 7 10 11 14 15) *0 [ 1.967977] ACPI: PCI Interrupt Link [LNKC] (IRQs 4 5 7 10 11 14 15) *0 [ 1.974887] ACPI: PCI Interrupt Link [LNKD] (IRQs 4 5 7 10 11 14 15) *0 [ 1.981794] ACPI: PCI Interrupt Link [LNKE] (IRQs 4 5 7 10 11 14 15) *0 [ 1.988702] ACPI: PCI Interrupt Link [LNKF] (IRQs 4 5 7 10 11 14 15) *0 [ 1.995608] ACPI: PCI Interrupt Link [LNKG] (IRQs 4 5 7 10 11 14 15) *0 [ 2.002514] ACPI: PCI Interrupt Link [LNKH] (IRQs 4 5 7 10 11 14 15) *0 [ 2.009564] ACPI: PCI Root Bridge [PC00] (domain 0000 [bus 00-3f]) [ 2.015748] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI] [ 2.023967] acpi PNP0A08:00: PCIe AER handled by firmware [ 2.029408] acpi PNP0A08:00: _OSC: platform does not support [SHPCHotplug] [ 2.036355] acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug PME PCIeCapability] [ 2.044004] acpi PNP0A08:00: FADT indicates ASPM is unsupported, using BIOS configuration [ 2.052460] PCI host bridge to bus 0000:00 [ 2.056564] pci_bus 0000:00: root bus resource [io 0x0000-0x03af window] [ 2.063351] pci_bus 0000:00: root bus resource [io 0x03e0-0x0cf7 window] [ 2.070135] pci_bus 0000:00: root bus resource [mem 0x000c0000-0x000c3fff window] [ 2.077615] pci_bus 0000:00: root bus resource [mem 0x000c4000-0x000c7fff window] [ 2.085093] pci_bus 0000:00: root bus resource [mem 0x000c8000-0x000cbfff window] [ 2.092573] pci_bus 0000:00: root bus resource [mem 0x000cc000-0x000cffff window] [ 2.100053] pci_bus 0000:00: root bus resource [mem 0x000d0000-0x000d3fff window] [ 2.107533] pci_bus 0000:00: 
root bus resource [mem 0x000d4000-0x000d7fff window] [ 2.115012] pci_bus 0000:00: root bus resource [mem 0x000d8000-0x000dbfff window] [ 2.122492] pci_bus 0000:00: root bus resource [mem 0x000dc000-0x000dffff window] [ 2.129973] pci_bus 0000:00: root bus resource [mem 0x000e0000-0x000e3fff window] [ 2.137451] pci_bus 0000:00: root bus resource [mem 0x000e4000-0x000e7fff window] [ 2.144931] pci_bus 0000:00: root bus resource [mem 0x000e8000-0x000ebfff window] [ 2.152410] pci_bus 0000:00: root bus resource [mem 0x000ec000-0x000effff window] [ 2.159889] pci_bus 0000:00: root bus resource [mem 0x000f0000-0x000fffff window] [ 2.167369] pci_bus 0000:00: root bus resource [io 0x0d00-0x3fff window] [ 2.174155] pci_bus 0000:00: root bus resource [mem 0xe1000000-0xfebfffff window] [ 2.181633] pci_bus 0000:00: root bus resource [mem 0x10000000000-0x2bf3fffffff window] [ 2.189634] pci_bus 0000:00: root bus resource [bus 00-3f] [ 2.195129] pci 0000:00:00.0: [1022:1450] type 00 class 0x060000 [ 2.195212] pci 0000:00:00.2: [1022:1451] type 00 class 0x080600 [ 2.195300] pci 0000:00:01.0: [1022:1452] type 00 class 0x060000 [ 2.195377] pci 0000:00:02.0: [1022:1452] type 00 class 0x060000 [ 2.195450] pci 0000:00:03.0: [1022:1452] type 00 class 0x060000 [ 2.195514] pci 0000:00:03.1: [1022:1453] type 01 class 0x060400 [ 2.195936] pci 0000:00:03.1: PME# supported from D0 D3hot D3cold [ 2.196035] pci 0000:00:04.0: [1022:1452] type 00 class 0x060000 [ 2.196117] pci 0000:00:07.0: [1022:1452] type 00 class 0x060000 [ 2.196176] pci 0000:00:07.1: [1022:1454] type 01 class 0x060400 [ 2.196933] pci 0000:00:07.1: PME# supported from D0 D3hot D3cold [ 2.197013] pci 0000:00:08.0: [1022:1452] type 00 class 0x060000 [ 2.197074] pci 0000:00:08.1: [1022:1454] type 01 class 0x060400 [ 2.197915] pci 0000:00:08.1: PME# supported from D0 D3hot D3cold [ 2.198029] pci 0000:00:14.0: [1022:790b] type 00 class 0x0c0500 [ 2.198230] pci 0000:00:14.3: [1022:790e] type 00 class 0x060100 [ 2.198433] pci 0000:00:18.0: [1022:1460] type 00 class 0x060000 [ 2.198485] pci 0000:00:18.1: [1022:1461] type 00 class 0x060000 [ 2.198536] pci 0000:00:18.2: [1022:1462] type 00 class 0x060000 [ 2.198587] pci 0000:00:18.3: [1022:1463] type 00 class 0x060000 [ 2.198638] pci 0000:00:18.4: [1022:1464] type 00 class 0x060000 [ 2.198689] pci 0000:00:18.5: [1022:1465] type 00 class 0x060000 [ 2.198739] pci 0000:00:18.6: [1022:1466] type 00 class 0x060000 [ 2.198790] pci 0000:00:18.7: [1022:1467] type 00 class 0x060000 [ 2.198840] pci 0000:00:19.0: [1022:1460] type 00 class 0x060000 [ 2.198895] pci 0000:00:19.1: [1022:1461] type 00 class 0x060000 [ 2.198949] pci 0000:00:19.2: [1022:1462] type 00 class 0x060000 [ 2.199002] pci 0000:00:19.3: [1022:1463] type 00 class 0x060000 [ 2.199055] pci 0000:00:19.4: [1022:1464] type 00 class 0x060000 [ 2.199109] pci 0000:00:19.5: [1022:1465] type 00 class 0x060000 [ 2.199163] pci 0000:00:19.6: [1022:1466] type 00 class 0x060000 [ 2.199218] pci 0000:00:19.7: [1022:1467] type 00 class 0x060000 [ 2.199271] pci 0000:00:1a.0: [1022:1460] type 00 class 0x060000 [ 2.199325] pci 0000:00:1a.1: [1022:1461] type 00 class 0x060000 [ 2.199378] pci 0000:00:1a.2: [1022:1462] type 00 class 0x060000 [ 2.199432] pci 0000:00:1a.3: [1022:1463] type 00 class 0x060000 [ 2.199485] pci 0000:00:1a.4: [1022:1464] type 00 class 0x060000 [ 2.199541] pci 0000:00:1a.5: [1022:1465] type 00 class 0x060000 [ 2.199595] pci 0000:00:1a.6: [1022:1466] type 00 class 0x060000 [ 2.199651] pci 0000:00:1a.7: [1022:1467] type 00 class 0x060000 [ 2.199703] pci 
0000:00:1b.0: [1022:1460] type 00 class 0x060000 [ 2.199757] pci 0000:00:1b.1: [1022:1461] type 00 class 0x060000 [ 2.199810] pci 0000:00:1b.2: [1022:1462] type 00 class 0x060000 [ 2.199864] pci 0000:00:1b.3: [1022:1463] type 00 class 0x060000 [ 2.199917] pci 0000:00:1b.4: [1022:1464] type 00 class 0x060000 [ 2.199970] pci 0000:00:1b.5: [1022:1465] type 00 class 0x060000 [ 2.200024] pci 0000:00:1b.6: [1022:1466] type 00 class 0x060000 [ 2.200078] pci 0000:00:1b.7: [1022:1467] type 00 class 0x060000 [ 2.200961] pci 0000:01:00.0: [15b3:101b] type 00 class 0x020700 [ 2.201108] pci 0000:01:00.0: reg 0x10: [mem 0xe2000000-0xe3ffffff 64bit pref] [ 2.201343] pci 0000:01:00.0: reg 0x30: [mem 0xfff00000-0xffffffff pref] [ 2.201750] pci 0000:01:00.0: PME# supported from D3cold [ 2.202027] pci 0000:00:03.1: PCI bridge to [bus 01] [ 2.206998] pci 0000:00:03.1: bridge window [mem 0xe2000000-0xe3ffffff 64bit pref] [ 2.207073] pci 0000:02:00.0: [1022:145a] type 00 class 0x130000 [ 2.207171] pci 0000:02:00.2: [1022:1456] type 00 class 0x108000 [ 2.207188] pci 0000:02:00.2: reg 0x18: [mem 0xf7300000-0xf73fffff] [ 2.207200] pci 0000:02:00.2: reg 0x24: [mem 0xf7400000-0xf7401fff] [ 2.207277] pci 0000:02:00.3: [1022:145f] type 00 class 0x0c0330 [ 2.207289] pci 0000:02:00.3: reg 0x10: [mem 0xf7200000-0xf72fffff 64bit] [ 2.207336] pci 0000:02:00.3: PME# supported from D0 D3hot D3cold [ 2.207394] pci 0000:00:07.1: PCI bridge to [bus 02] [ 2.212361] pci 0000:00:07.1: bridge window [mem 0xf7200000-0xf74fffff] [ 2.212955] pci 0000:03:00.0: [1022:1455] type 00 class 0x130000 [ 2.213064] pci 0000:03:00.1: [1022:1468] type 00 class 0x108000 [ 2.213082] pci 0000:03:00.1: reg 0x18: [mem 0xf7000000-0xf70fffff] [ 2.213095] pci 0000:03:00.1: reg 0x24: [mem 0xf7100000-0xf7101fff] [ 2.213185] pci 0000:00:08.1: PCI bridge to [bus 03] [ 2.218161] pci 0000:00:08.1: bridge window [mem 0xf7000000-0xf71fffff] [ 2.218176] pci_bus 0000:00: on NUMA node 0 [ 2.218552] ACPI: PCI Root Bridge [PC01] (domain 0000 [bus 40-7f]) [ 2.224738] acpi PNP0A08:01: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI] [ 2.232953] acpi PNP0A08:01: PCIe AER handled by firmware [ 2.238389] acpi PNP0A08:01: _OSC: platform does not support [SHPCHotplug] [ 2.245337] acpi PNP0A08:01: _OSC: OS now controls [PCIeHotplug PME PCIeCapability] [ 2.252989] acpi PNP0A08:01: FADT indicates ASPM is unsupported, using BIOS configuration [ 2.261402] PCI host bridge to bus 0000:40 [ 2.265505] pci_bus 0000:40: root bus resource [io 0x4000-0x7fff window] [ 2.272290] pci_bus 0000:40: root bus resource [mem 0xc6000000-0xe0ffffff window] [ 2.279767] pci_bus 0000:40: root bus resource [mem 0x2bf40000000-0x47e7fffffff window] [ 2.287767] pci_bus 0000:40: root bus resource [bus 40-7f] [ 2.293258] pci 0000:40:00.0: [1022:1450] type 00 class 0x060000 [ 2.293329] pci 0000:40:00.2: [1022:1451] type 00 class 0x080600 [ 2.293419] pci 0000:40:01.0: [1022:1452] type 00 class 0x060000 [ 2.293493] pci 0000:40:02.0: [1022:1452] type 00 class 0x060000 [ 2.293570] pci 0000:40:03.0: [1022:1452] type 00 class 0x060000 [ 2.293645] pci 0000:40:04.0: [1022:1452] type 00 class 0x060000 [ 2.293724] pci 0000:40:07.0: [1022:1452] type 00 class 0x060000 [ 2.293784] pci 0000:40:07.1: [1022:1454] type 01 class 0x060400 [ 2.293954] pci 0000:40:07.1: PME# supported from D0 D3hot D3cold [ 2.294035] pci 0000:40:08.0: [1022:1452] type 00 class 0x060000 [ 2.294098] pci 0000:40:08.1: [1022:1454] type 01 class 0x060400 [ 2.294209] pci 0000:40:08.1: PME# supported from D0 D3hot D3cold [ 2.294878] pci 
0000:41:00.0: [1022:145a] type 00 class 0x130000 [ 2.294983] pci 0000:41:00.2: [1022:1456] type 00 class 0x108000 [ 2.295002] pci 0000:41:00.2: reg 0x18: [mem 0xdb300000-0xdb3fffff] [ 2.295015] pci 0000:41:00.2: reg 0x24: [mem 0xdb400000-0xdb401fff] [ 2.295099] pci 0000:41:00.3: [1022:145f] type 00 class 0x0c0330 [ 2.295112] pci 0000:41:00.3: reg 0x10: [mem 0xdb200000-0xdb2fffff 64bit] [ 2.295165] pci 0000:41:00.3: PME# supported from D0 D3hot D3cold [ 2.295226] pci 0000:40:07.1: PCI bridge to [bus 41] [ 2.300199] pci 0000:40:07.1: bridge window [mem 0xdb200000-0xdb4fffff] [ 2.300293] pci 0000:42:00.0: [1022:1455] type 00 class 0x130000 [ 2.300409] pci 0000:42:00.1: [1022:1468] type 00 class 0x108000 [ 2.300429] pci 0000:42:00.1: reg 0x18: [mem 0xdb000000-0xdb0fffff] [ 2.300443] pci 0000:42:00.1: reg 0x24: [mem 0xdb100000-0xdb101fff] [ 2.300542] pci 0000:40:08.1: PCI bridge to [bus 42] [ 2.305513] pci 0000:40:08.1: bridge window [mem 0xdb000000-0xdb1fffff] [ 2.305525] pci_bus 0000:40: on NUMA node 1 [ 2.305706] ACPI: PCI Root Bridge [PC02] (domain 0000 [bus 80-bf]) [ 2.311891] acpi PNP0A08:02: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI] [ 2.320101] acpi PNP0A08:02: PCIe AER handled by firmware [ 2.325542] acpi PNP0A08:02: _OSC: platform does not support [SHPCHotplug] [ 2.332488] acpi PNP0A08:02: _OSC: OS now controls [PCIeHotplug PME PCIeCapability] [ 2.340141] acpi PNP0A08:02: FADT indicates ASPM is unsupported, using BIOS configuration [ 2.348577] PCI host bridge to bus 0000:80 [ 2.352674] pci_bus 0000:80: root bus resource [io 0x03b0-0x03df window] [ 2.359459] pci_bus 0000:80: root bus resource [mem 0x000a0000-0x000bffff window] [ 2.366940] pci_bus 0000:80: root bus resource [io 0x8000-0xbfff window] [ 2.373726] pci_bus 0000:80: root bus resource [mem 0xab000000-0xc5ffffff window] [ 2.381206] pci_bus 0000:80: root bus resource [mem 0x47e80000000-0x63dbfffffff window] [ 2.389204] pci_bus 0000:80: root bus resource [bus 80-bf] [ 2.394696] pci 0000:80:00.0: [1022:1450] type 00 class 0x060000 [ 2.394768] pci 0000:80:00.2: [1022:1451] type 00 class 0x080600 [ 2.394856] pci 0000:80:01.0: [1022:1452] type 00 class 0x060000 [ 2.394920] pci 0000:80:01.1: [1022:1453] type 01 class 0x060400 [ 2.395047] pci 0000:80:01.1: PME# supported from D0 D3hot D3cold [ 2.395119] pci 0000:80:01.2: [1022:1453] type 01 class 0x060400 [ 2.395239] pci 0000:80:01.2: PME# supported from D0 D3hot D3cold [ 2.395320] pci 0000:80:02.0: [1022:1452] type 00 class 0x060000 [ 2.395396] pci 0000:80:03.0: [1022:1452] type 00 class 0x060000 [ 2.395456] pci 0000:80:03.1: [1022:1453] type 01 class 0x060400 [ 2.395976] pci 0000:80:03.1: PME# supported from D0 D3hot D3cold [ 2.396073] pci 0000:80:04.0: [1022:1452] type 00 class 0x060000 [ 2.396155] pci 0000:80:07.0: [1022:1452] type 00 class 0x060000 [ 2.396218] pci 0000:80:07.1: [1022:1454] type 01 class 0x060400 [ 2.396326] pci 0000:80:07.1: PME# supported from D0 D3hot D3cold [ 2.396403] pci 0000:80:08.0: [1022:1452] type 00 class 0x060000 [ 2.396465] pci 0000:80:08.1: [1022:1454] type 01 class 0x060400 [ 2.396978] pci 0000:80:08.1: PME# supported from D0 D3hot D3cold [ 2.397190] pci 0000:81:00.0: [14e4:165f] type 00 class 0x020000 [ 2.397216] pci 0000:81:00.0: reg 0x10: [mem 0xac230000-0xac23ffff 64bit pref] [ 2.397230] pci 0000:81:00.0: reg 0x18: [mem 0xac240000-0xac24ffff 64bit pref] [ 2.397245] pci 0000:81:00.0: reg 0x20: [mem 0xac250000-0xac25ffff 64bit pref] [ 2.397255] pci 0000:81:00.0: reg 0x30: [mem 0xfffc0000-0xffffffff pref] [ 2.397330] pci 
[ 2.397425] pci 0000:81:00.1: [14e4:165f] type 00 class 0x020000
[ 2.397450] pci 0000:81:00.1: reg 0x10: [mem 0xac200000-0xac20ffff 64bit pref]
[ 2.397465] pci 0000:81:00.1: reg 0x18: [mem 0xac210000-0xac21ffff 64bit pref]
[ 2.397479] pci 0000:81:00.1: reg 0x20: [mem 0xac220000-0xac22ffff 64bit pref]
[ 2.397490] pci 0000:81:00.1: reg 0x30: [mem 0xfffc0000-0xffffffff pref]
[ 2.397566] pci 0000:81:00.1: PME# supported from D0 D3hot D3cold
[ 2.397654] pci 0000:80:01.1: PCI bridge to [bus 81]
[ 2.402628] pci 0000:80:01.1: bridge window [mem 0xac200000-0xac2fffff 64bit pref]
[ 2.402959] pci 0000:82:00.0: [1556:be00] type 01 class 0x060400
[ 2.405632] pci 0000:80:01.2: PCI bridge to [bus 82-83]
[ 2.410867] pci 0000:80:01.2: bridge window [mem 0xc0000000-0xc08fffff]
[ 2.410871] pci 0000:80:01.2: bridge window [mem 0xab000000-0xabffffff 64bit pref]
[ 2.410919] pci 0000:83:00.0: [102b:0536] type 00 class 0x030000
[ 2.410938] pci 0000:83:00.0: reg 0x10: [mem 0xab000000-0xabffffff pref]
[ 2.410949] pci 0000:83:00.0: reg 0x14: [mem 0xc0808000-0xc080bfff]
[ 2.410960] pci 0000:83:00.0: reg 0x18: [mem 0xc0000000-0xc07fffff]
[ 2.411100] pci 0000:82:00.0: PCI bridge to [bus 83]
[ 2.416071] pci 0000:82:00.0: bridge window [mem 0xc0000000-0xc08fffff]
[ 2.416077] pci 0000:82:00.0: bridge window [mem 0xab000000-0xabffffff 64bit pref]
[ 2.416160] pci 0000:84:00.0: [1000:00d1] type 00 class 0x010700
[ 2.416182] pci 0000:84:00.0: reg 0x10: [mem 0xac000000-0xac0fffff 64bit pref]
[ 2.416193] pci 0000:84:00.0: reg 0x18: [mem 0xac100000-0xac1fffff 64bit pref]
[ 2.416200] pci 0000:84:00.0: reg 0x20: [mem 0xc0d00000-0xc0dfffff]
[ 2.416207] pci 0000:84:00.0: reg 0x24: [io 0x8000-0x80ff]
[ 2.416216] pci 0000:84:00.0: reg 0x30: [mem 0x00000000-0x0003ffff pref]
[ 2.416267] pci 0000:84:00.0: supports D1 D2
[ 2.418631] pci 0000:80:03.1: PCI bridge to [bus 84]
[ 2.423597] pci 0000:80:03.1: bridge window [io 0x8000-0x8fff]
[ 2.423600] pci 0000:80:03.1: bridge window [mem 0xc0d00000-0xc0dfffff]
[ 2.423603] pci 0000:80:03.1: bridge window [mem 0xac000000-0xac1fffff 64bit pref]
[ 2.423995] pci 0000:85:00.0: [1022:145a] type 00 class 0x130000
[ 2.424100] pci 0000:85:00.2: [1022:1456] type 00 class 0x108000
[ 2.424118] pci 0000:85:00.2: reg 0x18: [mem 0xc0b00000-0xc0bfffff]
[ 2.424131] pci 0000:85:00.2: reg 0x24: [mem 0xc0c00000-0xc0c01fff]
[ 2.424223] pci 0000:80:07.1: PCI bridge to [bus 85]
[ 2.429195] pci 0000:80:07.1: bridge window [mem 0xc0b00000-0xc0cfffff]
[ 2.429288] pci 0000:86:00.0: [1022:1455] type 00 class 0x130000
[ 2.429406] pci 0000:86:00.1: [1022:1468] type 00 class 0x108000
[ 2.429425] pci 0000:86:00.1: reg 0x18: [mem 0xc0900000-0xc09fffff]
[ 2.429439] pci 0000:86:00.1: reg 0x24: [mem 0xc0a00000-0xc0a01fff]
[ 2.429528] pci 0000:86:00.2: [1022:7901] type 00 class 0x010601
[ 2.429560] pci 0000:86:00.2: reg 0x24: [mem 0xc0a02000-0xc0a02fff]
[ 2.429598] pci 0000:86:00.2: PME# supported from D3hot D3cold
[ 2.429666] pci 0000:80:08.1: PCI bridge to [bus 86]
[ 2.434638] pci 0000:80:08.1: bridge window [mem 0xc0900000-0xc0afffff]
[ 2.434663] pci_bus 0000:80: on NUMA node 2
[ 2.434832] ACPI: PCI Root Bridge [PC03] (domain 0000 [bus c0-ff])
[ 2.441017] acpi PNP0A08:03: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI]
[ 2.449226] acpi PNP0A08:03: PCIe AER handled by firmware
[ 2.454670] acpi PNP0A08:03: _OSC: platform does not support [SHPCHotplug]
[ 2.461615] acpi PNP0A08:03: _OSC: OS now controls [PCIeHotplug PME PCIeCapability]
[ 2.469268] acpi PNP0A08:03: FADT indicates ASPM is unsupported, using BIOS configuration
[ 2.477594] acpi PNP0A08:03: host bridge window [mem 0x63dc0000000-0xffffffffffff window] ([0x80000000000-0xffffffffffff] ignored, not CPU addressable)
[ 2.491232] PCI host bridge to bus 0000:c0
[ 2.495331] pci_bus 0000:c0: root bus resource [io 0xc000-0xffff window]
[ 2.502117] pci_bus 0000:c0: root bus resource [mem 0x90000000-0xaaffffff window]
[ 2.509596] pci_bus 0000:c0: root bus resource [mem 0x63dc0000000-0x7ffffffffff window]
[ 2.517595] pci_bus 0000:c0: root bus resource [bus c0-ff]
[ 2.523085] pci 0000:c0:00.0: [1022:1450] type 00 class 0x060000
[ 2.523154] pci 0000:c0:00.2: [1022:1451] type 00 class 0x080600
[ 2.523244] pci 0000:c0:01.0: [1022:1452] type 00 class 0x060000
[ 2.523305] pci 0000:c0:01.1: [1022:1453] type 01 class 0x060400
[ 2.523432] pci 0000:c0:01.1: PME# supported from D0 D3hot D3cold
[ 2.523529] pci 0000:c0:02.0: [1022:1452] type 00 class 0x060000
[ 2.523604] pci 0000:c0:03.0: [1022:1452] type 00 class 0x060000
[ 2.523680] pci 0000:c0:04.0: [1022:1452] type 00 class 0x060000
[ 2.523758] pci 0000:c0:07.0: [1022:1452] type 00 class 0x060000
[ 2.523819] pci 0000:c0:07.1: [1022:1454] type 01 class 0x060400
[ 2.524234] pci 0000:c0:07.1: PME# supported from D0 D3hot D3cold
[ 2.524311] pci 0000:c0:08.0: [1022:1452] type 00 class 0x060000
[ 2.524374] pci 0000:c0:08.1: [1022:1454] type 01 class 0x060400
[ 2.524485] pci 0000:c0:08.1: PME# supported from D0 D3hot D3cold
[ 2.525162] pci 0000:c1:00.0: [1000:005f] type 00 class 0x010400
[ 2.525176] pci 0000:c1:00.0: reg 0x10: [io 0xc000-0xc0ff]
[ 2.525186] pci 0000:c1:00.0: reg 0x14: [mem 0xa5500000-0xa550ffff 64bit]
[ 2.525196] pci 0000:c1:00.0: reg 0x1c: [mem 0xa5400000-0xa54fffff 64bit]
[ 2.525208] pci 0000:c1:00.0: reg 0x30: [mem 0xfff00000-0xffffffff pref]
[ 2.525257] pci 0000:c1:00.0: supports D1 D2
[ 2.525307] pci 0000:c0:01.1: PCI bridge to [bus c1]
[ 2.530275] pci 0000:c0:01.1: bridge window [io 0xc000-0xcfff]
[ 2.530278] pci 0000:c0:01.1: bridge window [mem 0xa5400000-0xa55fffff]
[ 2.530360] pci 0000:c2:00.0: [1022:145a] type 00 class 0x130000
[ 2.530465] pci 0000:c2:00.2: [1022:1456] type 00 class 0x108000
[ 2.530484] pci 0000:c2:00.2: reg 0x18: [mem 0xa5200000-0xa52fffff]
[ 2.530497] pci 0000:c2:00.2: reg 0x24: [mem 0xa5300000-0xa5301fff]
[ 2.530590] pci 0000:c0:07.1: PCI bridge to [bus c2]
[ 2.535554] pci 0000:c0:07.1: bridge window [mem 0xa5200000-0xa53fffff]
[ 2.535649] pci 0000:c3:00.0: [1022:1455] type 00 class 0x130000
[ 2.535764] pci 0000:c3:00.1: [1022:1468] type 00 class 0x108000
[ 2.535783] pci 0000:c3:00.1: reg 0x18: [mem 0xa5000000-0xa50fffff]
[ 2.535797] pci 0000:c3:00.1: reg 0x24: [mem 0xa5100000-0xa5101fff]
[ 2.535895] pci 0000:c0:08.1: PCI bridge to [bus c3]
[ 2.540868] pci 0000:c0:08.1: bridge window [mem 0xa5000000-0xa51fffff]
[ 2.540884] pci_bus 0000:c0: on NUMA node 3
[ 2.543022] vgaarb: device added: PCI:0000:83:00.0,decodes=io+mem,owns=io+mem,locks=none
[ 2.551114] vgaarb: loaded
[ 2.553830] vgaarb: bridge control possible 0000:83:00.0
[ 2.559256] SCSI subsystem initialized
[ 2.563033] ACPI: bus type USB registered
[ 2.567061] usbcore: registered new interface driver usbfs
[ 2.572557] usbcore: registered new interface driver hub
[ 2.578073] usbcore: registered new device driver usb
[ 2.583443] EDAC MC: Ver: 3.0.0
[ 2.586840] PCI: Using ACPI for IRQ routing
[ 2.609999] PCI: pci_cache_line_size set to 64 bytes
[ 2.610146] e820: reserve RAM buffer [mem 0x0008f000-0x0008ffff]
[ 2.610148] e820: reserve RAM buffer [mem 0x3788e020-0x37ffffff]
[ 2.610150] e820: reserve RAM buffer [mem 0x378a7020-0x37ffffff]
[ 2.610152] e820: reserve RAM buffer [mem 0x378cd020-0x37ffffff]
[ 2.610154] e820: reserve RAM buffer [mem 0x378d6020-0x37ffffff]
[ 2.610155] e820: reserve RAM buffer [mem 0x37908020-0x37ffffff]
[ 2.610157] e820: reserve RAM buffer [mem 0x3793a020-0x37ffffff]
[ 2.610158] e820: reserve RAM buffer [mem 0x4f781000-0x4fffffff]
[ 2.610159] e820: reserve RAM buffer [mem 0x6cacf000-0x6fffffff]
[ 2.610161] e820: reserve RAM buffer [mem 0x107f380000-0x107fffffff]
[ 2.610162] e820: reserve RAM buffer [mem 0x207ff80000-0x207fffffff]
[ 2.610163] e820: reserve RAM buffer [mem 0x307ff80000-0x307fffffff]
[ 2.610164] e820: reserve RAM buffer [mem 0x407ff80000-0x407fffffff]
[ 2.610418] NetLabel: Initializing
[ 2.613821] NetLabel: domain hash size = 128
[ 2.618180] NetLabel: protocols = UNLABELED CIPSOv4
[ 2.623162] NetLabel: unlabeled traffic allowed by default
[ 2.628933] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
[ 2.633918] hpet0: 3 comparators, 32-bit 14.318180 MHz counter
[ 2.641948] Switched to clocksource hpet
[ 2.650558] pnp: PnP ACPI init
[ 2.653638] ACPI: bus type PNP registered
[ 2.657830] system 00:00: [mem 0x80000000-0x8fffffff] has been reserved
[ 2.664457] system 00:00: Plug and Play ACPI device, IDs PNP0c01 (active)
[ 2.664512] pnp 00:01: Plug and Play ACPI device, IDs PNP0b00 (active)
[ 2.664709] pnp 00:02: Plug and Play ACPI device, IDs PNP0501 (active)
[ 2.664885] pnp 00:03: Plug and Play ACPI device, IDs PNP0501 (active)
[ 2.665037] pnp: PnP ACPI: found 4 devices
[ 2.669142] ACPI: bus type PNP unregistered
[ 2.680599] pci 0000:01:00.0: can't claim BAR 6 [mem 0xfff00000-0xffffffff pref]: no compatible bridge window
[ 2.690516] pci 0000:81:00.0: can't claim BAR 6 [mem 0xfffc0000-0xffffffff pref]: no compatible bridge window
[ 2.700430] pci 0000:81:00.1: can't claim BAR 6 [mem 0xfffc0000-0xffffffff pref]: no compatible bridge window
[ 2.710347] pci 0000:c1:00.0: can't claim BAR 6 [mem 0xfff00000-0xffffffff pref]: no compatible bridge window
[ 2.720284] pci 0000:00:03.1: BAR 14: assigned [mem 0xe1000000-0xe10fffff]
[ 2.727167] pci 0000:01:00.0: BAR 6: assigned [mem 0xe1000000-0xe10fffff pref]
[ 2.734395] pci 0000:00:03.1: PCI bridge to [bus 01]
[ 2.739373] pci 0000:00:03.1: bridge window [mem 0xe1000000-0xe10fffff]
[ 2.746165] pci 0000:00:03.1: bridge window [mem 0xe2000000-0xe3ffffff 64bit pref]
[ 2.753925] pci 0000:00:07.1: PCI bridge to [bus 02]
[ 2.758896] pci 0000:00:07.1: bridge window [mem 0xf7200000-0xf74fffff]
[ 2.765692] pci 0000:00:08.1: PCI bridge to [bus 03]
[ 2.770666] pci 0000:00:08.1: bridge window [mem 0xf7000000-0xf71fffff]
[ 2.777465] pci_bus 0000:00: resource 4 [io 0x0000-0x03af window]
[ 2.777467] pci_bus 0000:00: resource 5 [io 0x03e0-0x0cf7 window]
[ 2.777469] pci_bus 0000:00: resource 6 [mem 0x000c0000-0x000c3fff window]
[ 2.777471] pci_bus 0000:00: resource 7 [mem 0x000c4000-0x000c7fff window]
[ 2.777473] pci_bus 0000:00: resource 8 [mem 0x000c8000-0x000cbfff window]
[ 2.777475] pci_bus 0000:00: resource 9 [mem 0x000cc000-0x000cffff window]
[ 2.777476] pci_bus 0000:00: resource 10 [mem 0x000d0000-0x000d3fff window]
[ 2.777478] pci_bus 0000:00: resource 11 [mem 0x000d4000-0x000d7fff window]
[ 2.777480] pci_bus 0000:00: resource 12 [mem 0x000d8000-0x000dbfff window]
[ 2.777481] pci_bus 0000:00: resource 13 [mem 0x000dc000-0x000dffff window]
[ 2.777483] pci_bus 0000:00: resource 14 [mem 0x000e0000-0x000e3fff window]
[ 2.777485] pci_bus 0000:00: resource 15 [mem 0x000e4000-0x000e7fff window]
[ 2.777486] pci_bus 0000:00: resource 16 [mem 0x000e8000-0x000ebfff window]
[ 2.777488] pci_bus 0000:00: resource 17 [mem 0x000ec000-0x000effff window]
[ 2.777490] pci_bus 0000:00: resource 18 [mem 0x000f0000-0x000fffff window]
[ 2.777491] pci_bus 0000:00: resource 19 [io 0x0d00-0x3fff window]
[ 2.777493] pci_bus 0000:00: resource 20 [mem 0xe1000000-0xfebfffff window]
[ 2.777495] pci_bus 0000:00: resource 21 [mem 0x10000000000-0x2bf3fffffff window]
[ 2.777497] pci_bus 0000:01: resource 1 [mem 0xe1000000-0xe10fffff]
[ 2.777499] pci_bus 0000:01: resource 2 [mem 0xe2000000-0xe3ffffff 64bit pref]
[ 2.777500] pci_bus 0000:02: resource 1 [mem 0xf7200000-0xf74fffff]
[ 2.777502] pci_bus 0000:03: resource 1 [mem 0xf7000000-0xf71fffff]
[ 2.777514] pci 0000:40:07.1: PCI bridge to [bus 41]
[ 2.782490] pci 0000:40:07.1: bridge window [mem 0xdb200000-0xdb4fffff]
[ 2.789285] pci 0000:40:08.1: PCI bridge to [bus 42]
[ 2.794258] pci 0000:40:08.1: bridge window [mem 0xdb000000-0xdb1fffff]
[ 2.801057] pci_bus 0000:40: resource 4 [io 0x4000-0x7fff window]
[ 2.801058] pci_bus 0000:40: resource 5 [mem 0xc6000000-0xe0ffffff window]
[ 2.801060] pci_bus 0000:40: resource 6 [mem 0x2bf40000000-0x47e7fffffff window]
[ 2.801062] pci_bus 0000:41: resource 1 [mem 0xdb200000-0xdb4fffff]
[ 2.801064] pci_bus 0000:42: resource 1 [mem 0xdb000000-0xdb1fffff]
[ 2.801095] pci 0000:80:01.1: BAR 14: assigned [mem 0xac300000-0xac3fffff]
[ 2.807978] pci 0000:81:00.0: BAR 6: assigned [mem 0xac300000-0xac33ffff pref]
[ 2.815204] pci 0000:81:00.1: BAR 6: assigned [mem 0xac340000-0xac37ffff pref]
[ 2.822432] pci 0000:80:01.1: PCI bridge to [bus 81]
[ 2.827408] pci 0000:80:01.1: bridge window [mem 0xac300000-0xac3fffff]
[ 2.834203] pci 0000:80:01.1: bridge window [mem 0xac200000-0xac2fffff 64bit pref]
[ 2.841953] pci 0000:82:00.0: PCI bridge to [bus 83]
[ 2.846928] pci 0000:82:00.0: bridge window [mem 0xc0000000-0xc08fffff]
[ 2.853721] pci 0000:82:00.0: bridge window [mem 0xab000000-0xabffffff 64bit pref]
[ 2.861471] pci 0000:80:01.2: PCI bridge to [bus 82-83]
[ 2.866705] pci 0000:80:01.2: bridge window [mem 0xc0000000-0xc08fffff]
[ 2.873499] pci 0000:80:01.2: bridge window [mem 0xab000000-0xabffffff 64bit pref]
[ 2.881248] pci 0000:84:00.0: BAR 6: no space for [mem size 0x00040000 pref]
[ 2.888301] pci 0000:84:00.0: BAR 6: failed to assign [mem size 0x00040000 pref]
[ 2.895702] pci 0000:80:03.1: PCI bridge to [bus 84]
[ 2.900677] pci 0000:80:03.1: bridge window [io 0x8000-0x8fff]
[ 2.906778] pci 0000:80:03.1: bridge window [mem 0xc0d00000-0xc0dfffff]
[ 2.913574] pci 0000:80:03.1: bridge window [mem 0xac000000-0xac1fffff 64bit pref]
[ 2.921323] pci 0000:80:07.1: PCI bridge to [bus 85]
[ 2.926297] pci 0000:80:07.1: bridge window [mem 0xc0b00000-0xc0cfffff]
[ 2.933093] pci 0000:80:08.1: PCI bridge to [bus 86]
[ 2.938067] pci 0000:80:08.1: bridge window [mem 0xc0900000-0xc0afffff]
[ 2.944864] pci_bus 0000:80: resource 4 [io 0x03b0-0x03df window]
[ 2.944865] pci_bus 0000:80: resource 5 [mem 0x000a0000-0x000bffff window]
[ 2.944867] pci_bus 0000:80: resource 6 [io 0x8000-0xbfff window]
[ 2.944869] pci_bus 0000:80: resource 7 [mem 0xab000000-0xc5ffffff window]
[ 2.944871] pci_bus 0000:80: resource 8 [mem 0x47e80000000-0x63dbfffffff window]
[ 2.944873] pci_bus 0000:81: resource 1 [mem 0xac300000-0xac3fffff]
[ 2.944874] pci_bus 0000:81: resource 2 [mem 0xac200000-0xac2fffff 64bit pref]
[ 2.944876] pci_bus 0000:82: resource 1 [mem 0xc0000000-0xc08fffff]
[ 2.944878] pci_bus 0000:82: resource 2 [mem 0xab000000-0xabffffff 64bit pref]
[ 2.944879] pci_bus 0000:83: resource 1 [mem 0xc0000000-0xc08fffff]
[ 2.944881] pci_bus 0000:83: resource 2 [mem 0xab000000-0xabffffff 64bit pref]
[ 2.944883] pci_bus 0000:84: resource 0 [io 0x8000-0x8fff]
[ 2.944885] pci_bus 0000:84: resource 1 [mem 0xc0d00000-0xc0dfffff]
[ 2.944886] pci_bus 0000:84: resource 2 [mem 0xac000000-0xac1fffff 64bit pref]
[ 2.944888] pci_bus 0000:85: resource 1 [mem 0xc0b00000-0xc0cfffff]
[ 2.944890] pci_bus 0000:86: resource 1 [mem 0xc0900000-0xc0afffff]
[ 2.944905] pci 0000:c1:00.0: BAR 6: no space for [mem size 0x00100000 pref]
[ 2.951958] pci 0000:c1:00.0: BAR 6: failed to assign [mem size 0x00100000 pref]
[ 2.959360] pci 0000:c0:01.1: PCI bridge to [bus c1]
[ 2.964335] pci 0000:c0:01.1: bridge window [io 0xc000-0xcfff]
[ 2.970436] pci 0000:c0:01.1: bridge window [mem 0xa5400000-0xa55fffff]
[ 2.977233] pci 0000:c0:07.1: PCI bridge to [bus c2]
[ 2.982207] pci 0000:c0:07.1: bridge window [mem 0xa5200000-0xa53fffff]
[ 2.989002] pci 0000:c0:08.1: PCI bridge to [bus c3]
[ 2.993976] pci 0000:c0:08.1: bridge window [mem 0xa5000000-0xa51fffff]
[ 3.000773] pci_bus 0000:c0: resource 4 [io 0xc000-0xffff window]
[ 3.000775] pci_bus 0000:c0: resource 5 [mem 0x90000000-0xaaffffff window]
[ 3.000777] pci_bus 0000:c0: resource 6 [mem 0x63dc0000000-0x7ffffffffff window]
[ 3.000778] pci_bus 0000:c1: resource 0 [io 0xc000-0xcfff]
[ 3.000780] pci_bus 0000:c1: resource 1 [mem 0xa5400000-0xa55fffff]
[ 3.000782] pci_bus 0000:c2: resource 1 [mem 0xa5200000-0xa53fffff]
[ 3.000783] pci_bus 0000:c3: resource 1 [mem 0xa5000000-0xa51fffff]
[ 3.000867] NET: Registered protocol family 2
[ 3.005905] TCP established hash table entries: 524288 (order: 10, 4194304 bytes)
[ 3.014055] TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
[ 3.020879] TCP: Hash tables configured (established 524288 bind 65536)
[ 3.027514] TCP: reno registered
[ 3.030859] UDP hash table entries: 65536 (order: 9, 2097152 bytes)
[ 3.037466] UDP-Lite hash table entries: 65536 (order: 9, 2097152 bytes)
[ 3.044680] NET: Registered protocol family 1
[ 3.049483] pci 0000:83:00.0: Boot video device
[ 3.049520] PCI: CLS 64 bytes, default 64
[ 3.049569] Unpacking initramfs...
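The hash-table lines above encode a simple size calculation: the kernel prints the entry count, the page-allocation order (doublings of a 4 KiB page), and the resulting byte total. A minimal sketch of that arithmetic in Python, assuming the per-entry sizes implied by the printed totals (8, 16, and 32 bytes; these are inferred from the log, not taken from kernel headers):

    # Reproduce the hash-table sizing printed above. Per-entry
    # sizes are inferred from the printed totals (an assumption
    # for illustration, not read from kernel sources).
    PAGE = 4096

    def order(total_bytes):
        # allocation order = log2(number of 4 KiB pages)
        pages = total_bytes // PAGE
        return pages.bit_length() - 1

    for name, entries, entry_size in [
        ("TCP established", 524288, 8),
        ("TCP bind", 65536, 16),
        ("UDP", 65536, 32),
    ]:
        total = entries * entry_size
        print(f"{name}: {entries} entries (order: {order(total)}, {total} bytes)")

Running this reproduces the (order, bytes) pairs in the three log lines: (10, 4194304), (8, 1048576), and (9, 2097152).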
[ 3.320198] Freeing initrd memory: 19740k freed
[ 3.326932] AMD-Vi: IOMMU performance counters supported
[ 3.332317] AMD-Vi: IOMMU performance counters supported
[ 3.337672] AMD-Vi: IOMMU performance counters supported
[ 3.343029] AMD-Vi: IOMMU performance counters supported
[ 3.349656] iommu: Adding device 0000:00:01.0 to group 0
[ 3.355653] iommu: Adding device 0000:00:02.0 to group 1
[ 3.361635] iommu: Adding device 0000:00:03.0 to group 2
[ 3.367748] iommu: Adding device 0000:00:03.1 to group 3
[ 3.373750] iommu: Adding device 0000:00:04.0 to group 4
[ 3.379759] iommu: Adding device 0000:00:07.0 to group 5
[ 3.385738] iommu: Adding device 0000:00:07.1 to group 6
[ 3.391731] iommu: Adding device 0000:00:08.0 to group 7
[ 3.397711] iommu: Adding device 0000:00:08.1 to group 8
[ 3.403712] iommu: Adding device 0000:00:14.0 to group 9
[ 3.409053] iommu: Adding device 0000:00:14.3 to group 9
[ 3.415152] iommu: Adding device 0000:00:18.0 to group 10
[ 3.420581] iommu: Adding device 0000:00:18.1 to group 10
[ 3.426004] iommu: Adding device 0000:00:18.2 to group 10
[ 3.431430] iommu: Adding device 0000:00:18.3 to group 10
[ 3.436854] iommu: Adding device 0000:00:18.4 to group 10
[ 3.442279] iommu: Adding device 0000:00:18.5 to group 10
[ 3.447710] iommu: Adding device 0000:00:18.6 to group 10
[ 3.453137] iommu: Adding device 0000:00:18.7 to group 10
[ 3.459310] iommu: Adding device 0000:00:19.0 to group 11
[ 3.464739] iommu: Adding device 0000:00:19.1 to group 11
[ 3.470165] iommu: Adding device 0000:00:19.2 to group 11
[ 3.475590] iommu: Adding device 0000:00:19.3 to group 11
[ 3.481013] iommu: Adding device 0000:00:19.4 to group 11
[ 3.486442] iommu: Adding device 0000:00:19.5 to group 11
[ 3.491866] iommu: Adding device 0000:00:19.6 to group 11
[ 3.497291] iommu: Adding device 0000:00:19.7 to group 11
[ 3.503448] iommu: Adding device 0000:00:1a.0 to group 12
[ 3.508871] iommu: Adding device 0000:00:1a.1 to group 12
[ 3.514293] iommu: Adding device 0000:00:1a.2 to group 12
[ 3.519721] iommu: Adding device 0000:00:1a.3 to group 12
[ 3.525148] iommu: Adding device 0000:00:1a.4 to group 12
[ 3.530572] iommu: Adding device 0000:00:1a.5 to group 12
[ 3.536000] iommu: Adding device 0000:00:1a.6 to group 12
[ 3.541423] iommu: Adding device 0000:00:1a.7 to group 12
[ 3.547609] iommu: Adding device 0000:00:1b.0 to group 13
[ 3.553034] iommu: Adding device 0000:00:1b.1 to group 13
[ 3.558463] iommu: Adding device 0000:00:1b.2 to group 13
[ 3.563890] iommu: Adding device 0000:00:1b.3 to group 13
[ 3.569316] iommu: Adding device 0000:00:1b.4 to group 13
[ 3.574736] iommu: Adding device 0000:00:1b.5 to group 13
[ 3.580163] iommu: Adding device 0000:00:1b.6 to group 13
[ 3.585592] iommu: Adding device 0000:00:1b.7 to group 13
[ 3.591726] iommu: Adding device 0000:01:00.0 to group 14
[ 3.597795] iommu: Adding device 0000:02:00.0 to group 15
[ 3.603903] iommu: Adding device 0000:02:00.2 to group 16
[ 3.610025] iommu: Adding device 0000:02:00.3 to group 17
[ 3.616160] iommu: Adding device 0000:03:00.0 to group 18
[ 3.622225] iommu: Adding device 0000:03:00.1 to group 19
[ 3.628313] iommu: Adding device 0000:40:01.0 to group 20
[ 3.634429] iommu: Adding device 0000:40:02.0 to group 21
[ 3.640514] iommu: Adding device 0000:40:03.0 to group 22
[ 3.646600] iommu: Adding device 0000:40:04.0 to group 23
[ 3.652694] iommu: Adding device 0000:40:07.0 to group 24
[ 3.658747] iommu: Adding device 0000:40:07.1 to group 25
[ 3.664778] iommu: Adding device 0000:40:08.0 to group 26
[ 3.670830] iommu: Adding device 0000:40:08.1 to group 27
[ 3.676800] iommu: Adding device 0000:41:00.0 to group 28
[ 3.682808] iommu: Adding device 0000:41:00.2 to group 29
[ 3.688835] iommu: Adding device 0000:41:00.3 to group 30
[ 3.694884] iommu: Adding device 0000:42:00.0 to group 31
[ 3.700912] iommu: Adding device 0000:42:00.1 to group 32
[ 3.706983] iommu: Adding device 0000:80:01.0 to group 33
[ 3.713064] iommu: Adding device 0000:80:01.1 to group 34
[ 3.719189] iommu: Adding device 0000:80:01.2 to group 35
[ 3.725242] iommu: Adding device 0000:80:02.0 to group 36
[ 3.731271] iommu: Adding device 0000:80:03.0 to group 37
[ 3.737265] iommu: Adding device 0000:80:03.1 to group 38
[ 3.743337] iommu: Adding device 0000:80:04.0 to group 39
[ 3.749363] iommu: Adding device 0000:80:07.0 to group 40
[ 3.755417] iommu: Adding device 0000:80:07.1 to group 41
[ 3.761449] iommu: Adding device 0000:80:08.0 to group 42
[ 3.767461] iommu: Adding device 0000:80:08.1 to group 43
[ 3.773535] iommu: Adding device 0000:81:00.0 to group 44
[ 3.778984] iommu: Adding device 0000:81:00.1 to group 44
[ 3.785008] iommu: Adding device 0000:82:00.0 to group 45
[ 3.790427] iommu: Adding device 0000:83:00.0 to group 45
[ 3.796497] iommu: Adding device 0000:84:00.0 to group 46
[ 3.802563] iommu: Adding device 0000:85:00.0 to group 47
[ 3.808579] iommu: Adding device 0000:85:00.2 to group 48
[ 3.814606] iommu: Adding device 0000:86:00.0 to group 49
[ 3.820659] iommu: Adding device 0000:86:00.1 to group 50
[ 3.826730] iommu: Adding device 0000:86:00.2 to group 51
[ 3.832765] iommu: Adding device 0000:c0:01.0 to group 52
[ 3.838756] iommu: Adding device 0000:c0:01.1 to group 53
[ 3.844779] iommu: Adding device 0000:c0:02.0 to group 54
[ 3.850820] iommu: Adding device 0000:c0:03.0 to group 55
[ 3.856874] iommu: Adding device 0000:c0:04.0 to group 56
[ 3.862896] iommu: Adding device 0000:c0:07.0 to group 57
[ 3.868890] iommu: Adding device 0000:c0:07.1 to group 58
[ 3.874889] iommu: Adding device 0000:c0:08.0 to group 59
[ 3.880956] iommu: Adding device 0000:c0:08.1 to group 60
[ 3.889413] iommu: Adding device 0000:c1:00.0 to group 61
[ 3.895467] iommu: Adding device 0000:c2:00.0 to group 62
[ 3.901542] iommu: Adding device 0000:c2:00.2 to group 63
[ 3.907545] iommu: Adding device 0000:c3:00.0 to group 64
[ 3.913603] iommu: Adding device 0000:c3:00.1 to group 65
[ 3.919217] AMD-Vi: Found IOMMU at 0000:00:00.2 cap 0x40
[ 3.924540] AMD-Vi: Extended features (0xf77ef22294ada):
[ 3.929858] PPR NX GT IA GA PC GA_vAPIC
[ 3.934003] AMD-Vi: Found IOMMU at 0000:40:00.2 cap 0x40
[ 3.939324] AMD-Vi: Extended features (0xf77ef22294ada):
[ 3.944642] PPR NX GT IA GA PC GA_vAPIC
[ 3.948780] AMD-Vi: Found IOMMU at 0000:80:00.2 cap 0x40
[ 3.954103] AMD-Vi: Extended features (0xf77ef22294ada):
[ 3.959421] PPR NX GT IA GA PC GA_vAPIC
[ 3.963556] AMD-Vi: Found IOMMU at 0000:c0:00.2 cap 0x40
[ 3.968875] AMD-Vi: Extended features (0xf77ef22294ada):
[ 3.974197] PPR NX GT IA GA PC GA_vAPIC
[ 3.978340] AMD-Vi: Interrupt remapping enabled
[ 3.982884] AMD-Vi: virtual APIC enabled
[ 3.986872] pci 0000:00:00.2: irq 26 for MSI/MSI-X
[ 3.986979] pci 0000:40:00.2: irq 27 for MSI/MSI-X
[ 3.987062] pci 0000:80:00.2: irq 28 for MSI/MSI-X
[ 3.987146] pci 0000:c0:00.2: irq 29 for MSI/MSI-X
[ 3.987202] AMD-Vi: Lazy IO/TLB flushing enabled
[ 3.993531] perf: AMD NB counters detected
[ 3.997682] perf: AMD LLC counters detected
[ 4.007939] sha1_ssse3: Using SHA-NI optimized SHA-1 implementation
[ 4.014293] sha256_ssse3: Using SHA-256-NI optimized SHA-256 implementation
[ 4.022894] futex hash table entries: 32768 (order: 9, 2097152 bytes)
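The group numbering in the "iommu: Adding device ... to group N" lines above is reproducible after boot: the IOMMU layer exposes one directory per group under /sys/kernel/iommu_groups, each with symlinks to its member devices. A small sketch using that standard sysfs layout (run on the booted system):

    import os

    # Walk /sys/kernel/iommu_groups and print each group's member
    # devices, mirroring the "Adding device ... to group N" lines.
    base = "/sys/kernel/iommu_groups"
    for group in sorted(os.listdir(base), key=int):
        for dev in sorted(os.listdir(os.path.join(base, group, "devices"))):
            print(f"group {group}: {dev}")

Devices that share a group (for example 0000:81:00.0 and 0000:81:00.1 in group 44 above) cannot be split between VFIO containers, which is why the grouping matters for device passthrough.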
[ 4.029525] Initialise system trusted keyring
[ 4.033941] audit: initializing netlink socket (disabled)
[ 4.039367] type=2000 audit(1575989070.190:1): initialized
[ 4.070265] HugeTLB registered 1 GB page size, pre-allocated 0 pages
[ 4.076623] HugeTLB registered 2 MB page size, pre-allocated 0 pages
[ 4.084259] zpool: loaded
[ 4.086893] zbud: loaded
[ 4.089797] VFS: Disk quotas dquot_6.6.0
[ 4.093832] Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[ 4.100648] msgmni has been set to 32768
[ 4.104673] Key type big_key registered
[ 4.108518] SELinux: Registering netfilter hooks
[ 4.110971] NET: Registered protocol family 38
[ 4.115433] Key type asymmetric registered
[ 4.119542] Asymmetric key parser 'x509' registered
[ 4.124481] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 248)
[ 4.132059] io scheduler noop registered
[ 4.135991] io scheduler deadline registered (default)
[ 4.141175] io scheduler cfq registered
[ 4.145022] io scheduler mq-deadline registered
[ 4.149563] io scheduler kyber registered
[ 4.154655] pcieport 0000:00:03.1: irq 30 for MSI/MSI-X
[ 4.155602] pcieport 0000:00:07.1: irq 31 for MSI/MSI-X
[ 4.156596] pcieport 0000:00:08.1: irq 33 for MSI/MSI-X
[ 4.156878] pcieport 0000:40:07.1: irq 34 for MSI/MSI-X
[ 4.157663] pcieport 0000:40:08.1: irq 36 for MSI/MSI-X
[ 4.158330] pcieport 0000:80:01.1: irq 37 for MSI/MSI-X
[ 4.158578] pcieport 0000:80:01.2: irq 38 for MSI/MSI-X
[ 4.159271] pcieport 0000:80:03.1: irq 39 for MSI/MSI-X
[ 4.159564] pcieport 0000:80:07.1: irq 41 for MSI/MSI-X
[ 4.160299] pcieport 0000:80:08.1: irq 43 for MSI/MSI-X
[ 4.160548] pcieport 0000:c0:01.1: irq 44 for MSI/MSI-X
[ 4.161293] pcieport 0000:c0:07.1: irq 46 for MSI/MSI-X
[ 4.161508] pcieport 0000:c0:08.1: irq 48 for MSI/MSI-X
[ 4.161605] pcieport 0000:00:03.1: Signaling PME through PCIe PME interrupt
[ 4.168574] pci 0000:01:00.0: Signaling PME through PCIe PME interrupt
[ 4.175104] pcie_pme 0000:00:03.1:pcie001: service driver pcie_pme loaded
[ 4.175119] pcieport 0000:00:07.1: Signaling PME through PCIe PME interrupt
[ 4.182092] pci 0000:02:00.0: Signaling PME through PCIe PME interrupt
[ 4.188626] pci 0000:02:00.2: Signaling PME through PCIe PME interrupt
[ 4.195160] pci 0000:02:00.3: Signaling PME through PCIe PME interrupt
[ 4.201696] pcie_pme 0000:00:07.1:pcie001: service driver pcie_pme loaded
[ 4.201710] pcieport 0000:00:08.1: Signaling PME through PCIe PME interrupt
[ 4.208677] pci 0000:03:00.0: Signaling PME through PCIe PME interrupt
[ 4.215215] pci 0000:03:00.1: Signaling PME through PCIe PME interrupt
[ 4.221751] pcie_pme 0000:00:08.1:pcie001: service driver pcie_pme loaded
[ 4.221770] pcieport 0000:40:07.1: Signaling PME through PCIe PME interrupt
[ 4.228737] pci 0000:41:00.0: Signaling PME through PCIe PME interrupt
[ 4.235269] pci 0000:41:00.2: Signaling PME through PCIe PME interrupt
[ 4.241802] pci 0000:41:00.3: Signaling PME through PCIe PME interrupt
[ 4.248340] pcie_pme 0000:40:07.1:pcie001: service driver pcie_pme loaded
[ 4.248356] pcieport 0000:40:08.1: Signaling PME through PCIe PME interrupt
[ 4.255326] pci 0000:42:00.0: Signaling PME through PCIe PME interrupt
[ 4.261860] pci 0000:42:00.1: Signaling PME through PCIe PME interrupt
[ 4.268393] pcie_pme 0000:40:08.1:pcie001: service driver pcie_pme loaded
[ 4.268412] pcieport 0000:80:01.1: Signaling PME through PCIe PME interrupt
[ 4.275380] pci 0000:81:00.0: Signaling PME through PCIe PME interrupt
[ 4.281914] pci 0000:81:00.1: Signaling PME through PCIe PME interrupt
[ 4.288453] pcie_pme 0000:80:01.1:pcie001: service driver pcie_pme loaded
[ 4.288469] pcieport 0000:80:01.2: Signaling PME through PCIe PME interrupt
[ 4.295434] pci 0000:82:00.0: Signaling PME through PCIe PME interrupt
[ 4.301970] pci 0000:83:00.0: Signaling PME through PCIe PME interrupt
[ 4.308507] pcie_pme 0000:80:01.2:pcie001: service driver pcie_pme loaded
[ 4.308523] pcieport 0000:80:03.1: Signaling PME through PCIe PME interrupt
[ 4.315490] pci 0000:84:00.0: Signaling PME through PCIe PME interrupt
[ 4.322024] pcie_pme 0000:80:03.1:pcie001: service driver pcie_pme loaded
[ 4.322039] pcieport 0000:80:07.1: Signaling PME through PCIe PME interrupt
[ 4.329011] pci 0000:85:00.0: Signaling PME through PCIe PME interrupt
[ 4.335545] pci 0000:85:00.2: Signaling PME through PCIe PME interrupt
[ 4.342082] pcie_pme 0000:80:07.1:pcie001: service driver pcie_pme loaded
[ 4.342096] pcieport 0000:80:08.1: Signaling PME through PCIe PME interrupt
[ 4.349063] pci 0000:86:00.0: Signaling PME through PCIe PME interrupt
[ 4.355597] pci 0000:86:00.1: Signaling PME through PCIe PME interrupt
[ 4.362136] pci 0000:86:00.2: Signaling PME through PCIe PME interrupt
[ 4.368671] pcie_pme 0000:80:08.1:pcie001: service driver pcie_pme loaded
[ 4.368685] pcieport 0000:c0:01.1: Signaling PME through PCIe PME interrupt
[ 4.375657] pci 0000:c1:00.0: Signaling PME through PCIe PME interrupt
[ 4.382189] pcie_pme 0000:c0:01.1:pcie001: service driver pcie_pme loaded
[ 4.382203] pcieport 0000:c0:07.1: Signaling PME through PCIe PME interrupt
[ 4.389168] pci 0000:c2:00.0: Signaling PME through PCIe PME interrupt
[ 4.395701] pci 0000:c2:00.2: Signaling PME through PCIe PME interrupt
[ 4.402237] pcie_pme 0000:c0:07.1:pcie001: service driver pcie_pme loaded
[ 4.402249] pcieport 0000:c0:08.1: Signaling PME through PCIe PME interrupt
[ 4.409221] pci 0000:c3:00.0: Signaling PME through PCIe PME interrupt
[ 4.415753] pci 0000:c3:00.1: Signaling PME through PCIe PME interrupt
[ 4.422293] pcie_pme 0000:c0:08.1:pcie001: service driver pcie_pme loaded
[ 4.422312] pci_hotplug: PCI Hot Plug PCI Core version: 0.5
[ 4.427893] pciehp: PCI Express Hot Plug Controller Driver version: 0.4
[ 4.434563] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
[ 4.441375] efifb: probing for efifb
[ 4.444976] efifb: framebuffer at 0xab000000, mapped to 0xffffb7fd99800000, using 3072k, total 3072k
[ 4.454107] efifb: mode is 1024x768x32, linelength=4096, pages=1
[ 4.460121] efifb: scrolling: redraw
[ 4.463712] efifb: Truecolor: size=8:8:8:8, shift=24:16:8:0
[ 4.485120] Console: switching to colour frame buffer device 128x48
[ 4.506967] fb0: EFI VGA frame buffer device
[ 4.511346] input: Power Button as /devices/LNXSYSTM:00/device:00/PNP0C0C:00/input/input0
[ 4.519528] ACPI: Power Button [PWRB]
[ 4.523249] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input1
[ 4.530654] ACPI: Power Button [PWRF]
[ 4.535501] GHES: APEI firmware first mode is enabled by APEI bit and WHEA _OSC.
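The efifb numbers above are internally consistent: a 1024x768 mode at 32 bits per pixel needs 4096 bytes per scanline and 3072 KiB in total, exactly what the driver reports. A quick check in Python (the 128x48 text console size additionally assumes the standard 8x16 console font, which the log does not state):

    # Verify the efifb geometry printed above.
    width, height, bpp = 1024, 768, 32
    linelength = width * bpp // 8          # bytes per scanline
    total = linelength * height            # whole framebuffer
    print(linelength)                      # 4096 -> "linelength=4096"
    print(total // 1024, "KiB")            # 3072 -> "using 3072k, total 3072k"
    # Assuming an 8x16 glyph cell for the text console:
    print(width // 8, "x", height // 16)   # 128 x 48 -> "device 128x48"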
[ 4.542990] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[ 4.570194] 00:02: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
[ 4.596739] 00:03: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
[ 4.602814] Non-volatile memory driver v1.3
[ 4.607042] Linux agpgart interface v0.103
[ 4.613664] crash memory driver: version 1.1
[ 4.618163] rdac: device handler registered
[ 4.622410] hp_sw: device handler registered
[ 4.626690] emc: device handler registered
[ 4.630969] alua: device handler registered
[ 4.635197] libphy: Fixed MDIO Bus: probed
[ 4.639360] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[ 4.645899] ehci-pci: EHCI PCI platform driver
[ 4.650368] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[ 4.656559] ohci-pci: OHCI PCI platform driver
[ 4.661022] uhci_hcd: USB Universal Host Controller Interface driver
[ 4.667525] xhci_hcd 0000:02:00.3: xHCI Host Controller
[ 4.672833] xhci_hcd 0000:02:00.3: new USB bus registered, assigned bus number 1
[ 4.680336] xhci_hcd 0000:02:00.3: hcc params 0x0270f665 hci version 0x100 quirks 0x00000410
[ 4.688819] xhci_hcd 0000:02:00.3: irq 50 for MSI/MSI-X
[ 4.688840] xhci_hcd 0000:02:00.3: irq 51 for MSI/MSI-X
[ 4.688860] xhci_hcd 0000:02:00.3: irq 52 for MSI/MSI-X
[ 4.688880] xhci_hcd 0000:02:00.3: irq 53 for MSI/MSI-X
[ 4.688900] xhci_hcd 0000:02:00.3: irq 54 for MSI/MSI-X
[ 4.688919] xhci_hcd 0000:02:00.3: irq 55 for MSI/MSI-X
[ 4.688947] xhci_hcd 0000:02:00.3: irq 56 for MSI/MSI-X
[ 4.688966] xhci_hcd 0000:02:00.3: irq 57 for MSI/MSI-X
[ 4.689106] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
[ 4.695902] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 4.703130] usb usb1: Product: xHCI Host Controller
[ 4.708019] usb usb1: Manufacturer: Linux 3.10.0-957.27.2.el7_lustre.pl2.x86_64 xhci-hcd
[ 4.716114] usb usb1: SerialNumber: 0000:02:00.3
[ 4.720865] hub 1-0:1.0: USB hub found
[ 4.724630] hub 1-0:1.0: 2 ports detected
[ 4.728887] xhci_hcd 0000:02:00.3: xHCI Host Controller
[ 4.734177] xhci_hcd 0000:02:00.3: new USB bus registered, assigned bus number 2
[ 4.741606] usb usb2: We don't know the algorithms for LPM for this host, disabling LPM.
[ 4.749712] usb usb2: New USB device found, idVendor=1d6b, idProduct=0003
[ 4.756510] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 4.763738] usb usb2: Product: xHCI Host Controller
[ 4.768626] usb usb2: Manufacturer: Linux 3.10.0-957.27.2.el7_lustre.pl2.x86_64 xhci-hcd
[ 4.776722] usb usb2: SerialNumber: 0000:02:00.3
[ 4.781439] hub 2-0:1.0: USB hub found
[ 4.785202] hub 2-0:1.0: 2 ports detected
[ 4.789543] xhci_hcd 0000:41:00.3: xHCI Host Controller
[ 4.794853] xhci_hcd 0000:41:00.3: new USB bus registered, assigned bus number 3
[ 4.802362] xhci_hcd 0000:41:00.3: hcc params 0x0270f665 hci version 0x100 quirks 0x00000410
[ 4.810843] xhci_hcd 0000:41:00.3: irq 59 for MSI/MSI-X
[ 4.810863] xhci_hcd 0000:41:00.3: irq 60 for MSI/MSI-X
[ 4.810883] xhci_hcd 0000:41:00.3: irq 61 for MSI/MSI-X
[ 4.810902] xhci_hcd 0000:41:00.3: irq 62 for MSI/MSI-X
[ 4.810920] xhci_hcd 0000:41:00.3: irq 63 for MSI/MSI-X
[ 4.810957] xhci_hcd 0000:41:00.3: irq 64 for MSI/MSI-X
[ 4.810975] xhci_hcd 0000:41:00.3: irq 65 for MSI/MSI-X
[ 4.810993] xhci_hcd 0000:41:00.3: irq 66 for MSI/MSI-X
[ 4.811145] usb usb3: New USB device found, idVendor=1d6b, idProduct=0002
[ 4.817942] usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 4.825170] usb usb3: Product: xHCI Host Controller
[ 4.830058] usb usb3: Manufacturer: Linux 3.10.0-957.27.2.el7_lustre.pl2.x86_64 xhci-hcd
[ 4.838153] usb usb3: SerialNumber: 0000:41:00.3
[ 4.842886] hub 3-0:1.0: USB hub found
[ 4.846649] hub 3-0:1.0: 2 ports detected
[ 4.850906] xhci_hcd 0000:41:00.3: xHCI Host Controller
[ 4.856184] xhci_hcd 0000:41:00.3: new USB bus registered, assigned bus number 4
[ 4.863625] usb usb4: We don't know the algorithms for LPM for this host, disabling LPM.
[ 4.871734] usb usb4: New USB device found, idVendor=1d6b, idProduct=0003
[ 4.878531] usb usb4: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 4.885759] usb usb4: Product: xHCI Host Controller
[ 4.890647] usb usb4: Manufacturer: Linux 3.10.0-957.27.2.el7_lustre.pl2.x86_64 xhci-hcd
[ 4.898741] usb usb4: SerialNumber: 0000:41:00.3
[ 4.903454] hub 4-0:1.0: USB hub found
[ 4.907219] hub 4-0:1.0: 2 ports detected
[ 4.911479] usbcore: registered new interface driver usbserial_generic
[ 4.918023] usbserial: USB Serial support registered for generic
[ 4.924078] i8042: PNP: No PS/2 controller found. Probing ports directly.
[ 5.039951] usb 1-1: new high-speed USB device number 2 using xhci_hcd
[ 5.161952] usb 3-1: new high-speed USB device number 2 using xhci_hcd
[ 5.171856] usb 1-1: New USB device found, idVendor=0424, idProduct=2744
[ 5.178571] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[ 5.185718] usb 1-1: Product: USB2734
[ 5.189390] usb 1-1: Manufacturer: Microchip Tech
[ 5.221754] hub 1-1:1.0: USB hub found
[ 5.225730] hub 1-1:1.0: 4 ports detected
[ 5.282979] usb 2-1: new SuperSpeed USB device number 2 using xhci_hcd
[ 5.291879] usb 3-1: New USB device found, idVendor=1604, idProduct=10c0
[ 5.298583] usb 3-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[ 5.304231] usb 2-1: New USB device found, idVendor=0424, idProduct=5744
[ 5.304233] usb 2-1: New USB device strings: Mfr=2, Product=3, SerialNumber=0
[ 5.304234] usb 2-1: Product: USB5734
[ 5.304235] usb 2-1: Manufacturer: Microchip Tech
[ 5.317748] hub 2-1:1.0: USB hub found
[ 5.318103] hub 2-1:1.0: 4 ports detected
[ 5.319156] usb: port power management may be unreliable
[ 5.343773] hub 3-1:1.0: USB hub found
[ 5.347753] hub 3-1:1.0: 4 ports detected
[ 5.964315] i8042: No controller found
[ 5.968097] tsc: Refined TSC clocksource calibration: 1996.249 MHz
[ 5.968164] mousedev: PS/2 mouse device common for all mice
[ 5.968366] rtc_cmos 00:01: RTC can wake from S4
[ 5.968714] rtc_cmos 00:01: rtc core: registered rtc_cmos as rtc0
[ 5.968812] rtc_cmos 00:01: alarms up to one month, y3k, 114 bytes nvram, hpet irqs
[ 5.968867] cpuidle: using governor menu
[ 5.969134] EFI Variables Facility v0.08 2004-May-17
[ 5.990106] hidraw: raw HID events driver (C) Jiri Kosina
[ 5.990204] usbcore: registered new interface driver usbhid
[ 5.990205] usbhid: USB HID core driver
[ 5.990331] drop_monitor: Initializing network drop monitor service
[ 5.990485] TCP: cubic registered
[ 5.990490] Initializing XFRM netlink socket
[ 5.990699] NET: Registered protocol family 10
[ 5.991241] NET: Registered protocol family 17
[ 5.991245] mpls_gso: MPLS GSO support
[ 5.992310] mce: Using 23 MCE banks
[ 5.992359] microcode: CPU0: patch_level=0x08001250
[ 5.992375] microcode: CPU1: patch_level=0x08001250
[ 5.992387] microcode: CPU2: patch_level=0x08001250
[ 5.992398] microcode: CPU3: patch_level=0x08001250
[ 5.992411] microcode: CPU4: patch_level=0x08001250
[ 5.992426] microcode: CPU5: patch_level=0x08001250
[ 5.992442] microcode: CPU6: patch_level=0x08001250
[ 5.992611] microcode: CPU7: patch_level=0x08001250
[ 5.992622] microcode: CPU8: patch_level=0x08001250
[ 5.992632] microcode: CPU9: patch_level=0x08001250
[ 5.992642] microcode: CPU10: patch_level=0x08001250
[ 5.992652] microcode: CPU11: patch_level=0x08001250
[ 5.992663] microcode: CPU12: patch_level=0x08001250
[ 5.992675] microcode: CPU13: patch_level=0x08001250
[ 5.992686] microcode: CPU14: patch_level=0x08001250
[ 5.992696] microcode: CPU15: patch_level=0x08001250
[ 5.992707] microcode: CPU16: patch_level=0x08001250
[ 5.992719] microcode: CPU17: patch_level=0x08001250
[ 5.992729] microcode: CPU18: patch_level=0x08001250
[ 5.992740] microcode: CPU19: patch_level=0x08001250
[ 5.992750] microcode: CPU20: patch_level=0x08001250
[ 5.992761] microcode: CPU21: patch_level=0x08001250
[ 5.992773] microcode: CPU22: patch_level=0x08001250
[ 5.992783] microcode: CPU23: patch_level=0x08001250
[ 5.992793] microcode: CPU24: patch_level=0x08001250
[ 5.992801] microcode: CPU25: patch_level=0x08001250
[ 5.992812] microcode: CPU26: patch_level=0x08001250
[ 5.992823] microcode: CPU27: patch_level=0x08001250
[ 5.992831] microcode: CPU28: patch_level=0x08001250
[ 5.992839] microcode: CPU29: patch_level=0x08001250
[ 5.992850] microcode: CPU30: patch_level=0x08001250
[ 5.992861] microcode: CPU31: patch_level=0x08001250
[ 5.992869] microcode: CPU32: patch_level=0x08001250
[ 5.992877] microcode: CPU33: patch_level=0x08001250
[ 5.992886] microcode: CPU34: patch_level=0x08001250
[ 5.992896] microcode: CPU35: patch_level=0x08001250
[ 5.992904] microcode: CPU36: patch_level=0x08001250
[ 5.992913] microcode: CPU37: patch_level=0x08001250
[ 5.992923] microcode: CPU38: patch_level=0x08001250
[ 5.992934] microcode: CPU39: patch_level=0x08001250
[ 5.992954] microcode: CPU40: patch_level=0x08001250
[ 5.992965] microcode: CPU41: patch_level=0x08001250
[ 5.992976] microcode: CPU42: patch_level=0x08001250
[ 5.992986] microcode: CPU43: patch_level=0x08001250
[ 5.992997] microcode: CPU44: patch_level=0x08001250
[ 5.993005] microcode: CPU45: patch_level=0x08001250
[ 5.993016] microcode: CPU46: patch_level=0x08001250
[ 5.993027] microcode: CPU47: patch_level=0x08001250
[ 5.993071] microcode: Microcode Update Driver: v2.01 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
[ 5.993206] PM: Hibernation image not present or could not be loaded.
[ 5.993209] Loading compiled-in X.509 certificates
[ 5.993234] Loaded X.509 cert 'CentOS Linux kpatch signing key: ea0413152cde1d98ebdca3fe6f0230904c9ef717'
[ 5.993250] Loaded X.509 cert 'CentOS Linux Driver update signing key: 7f421ee0ab69461574bb358861dbe77762a4201b'
[ 5.993627] Loaded X.509 cert 'CentOS Linux kernel signing key: 468656045a39b52ff2152c315f6198c3e658f24d'
[ 5.993642] registered taskstats version 1
[ 5.995842] Key type trusted registered
[ 5.997393] Key type encrypted registered
[ 5.997441] IMA: No TPM chip found, activating TPM-bypass! (rc=-19)
[ 6.000011] Magic number: 15:955:737
[ 6.000221] memory memory451: hash matches
[ 6.006619] rtc_cmos 00:01: setting system clock to 2019-12-10 14:44:37 UTC (1575989077)
[ 6.375507] usb 3-1.1: new high-speed USB device number 3 using xhci_hcd
[ 6.375518] Switched to clocksource tsc
[ 6.387060] Freeing unused kernel memory: 1876k freed
[ 6.392380] Write protecting the kernel read-only data: 12288k
[ 6.399607] Freeing unused kernel memory: 504k freed
[ 6.405967] Freeing unused kernel memory: 596k freed
[ 6.458334] systemd[1]: systemd 219 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 -SECCOMP +BLKID +ELFUTILS +KMOD +IDN)
[ 6.459885] usb 3-1.1: New USB device found, idVendor=1604, idProduct=10c0
[ 6.459886] usb 3-1.1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[ 6.463776] hub 3-1.1:1.0: USB hub found
[ 6.464134] hub 3-1.1:1.0: 4 ports detected
[ 6.499755] systemd[1]: Detected architecture x86-64.
[ 6.504808] systemd[1]: Running in initial RAM disk.
[ 6.518094] systemd[1]: Set hostname to <fir-md1-s4>.
[ 6.527953] usb 3-1.4: new high-speed USB device number 4 using xhci_hcd
[ 6.552758] systemd[1]: Reached target Swap.
[ 6.561032] systemd[1]: Reached target Local File Systems.
[ 6.572216] systemd[1]: Created slice Root Slice.
[ 6.583066] systemd[1]: Listening on Journal Socket.
[ 6.594008] systemd[1]: Reached target Timers.
[ 6.603022] systemd[1]: Listening on udev Kernel Socket.
[ 6.608888] usb 3-1.4: New USB device found, idVendor=1604, idProduct=10c0
[ 6.616724] usb 3-1.4: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[ 6.628032] systemd[1]: Listening on udev Control Socket.
[ 6.639006] systemd[1]: Reached target Sockets.
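The rtc_cmos line above prints the wall-clock time it set alongside the equivalent Unix timestamp in parentheses, and the two agree. A one-liner to confirm the conversion:

    from datetime import datetime, timezone

    # 2019-12-10 14:44:37 UTC should equal the epoch value shown
    # in parentheses in the rtc_cmos message above.
    ts = datetime(2019, 12, 10, 14, 44, 37, tzinfo=timezone.utc).timestamp()
    print(int(ts))  # 1575989077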
[ 6.648063] systemd[1]: Created slice System Slice.
[ 6.655785] hub 3-1.4:1.0: USB hub found
[ 6.661131] systemd[1]: Reached target Slices.
[ 6.661261] hub 3-1.4:1.0: 4 ports detected
[ 6.674578] systemd[1]: Starting Apply Kernel Variables...
[ 6.685426] systemd[1]: Starting Create list of required static device nodes for the current kernel...
[ 6.703473] systemd[1]: Starting dracut cmdline hook...
[ 6.713378] systemd[1]: Starting Setup Virtual Console...
[ 6.723392] systemd[1]: Starting Journal Service...
[ 6.746386] systemd[1]: Started Apply Kernel Variables.
[ 6.757433] systemd[1]: Started Create list of required static device nodes for the current kernel.
[ 6.775245] systemd[1]: Started dracut cmdline hook.
[ 6.786193] systemd[1]: Started Setup Virtual Console.
[ 6.797133] systemd[1]: Started Journal Service.
[ 6.911073] pps_core: LinuxPPS API ver. 1 registered
[ 6.916044] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
[ 6.928296] PTP clock support registered
[ 6.935537] megasas: 07.705.02.00-rh1
[ 6.935542] mlx_compat: loading out-of-tree module taints kernel.
[ 6.946409] mlx_compat: module verification failed: signature and/or required key missing - tainting kernel
[ 6.946573] megaraid_sas 0000:c1:00.0: FW now in Ready state
[ 6.946577] megaraid_sas 0000:c1:00.0: 64 bit DMA mask and 32 bit consistent mask
[ 6.946975] megaraid_sas 0000:c1:00.0: irq 68 for MSI/MSI-X
[ 6.947004] megaraid_sas 0000:c1:00.0: irq 69 for MSI/MSI-X
[ 6.947034] megaraid_sas 0000:c1:00.0: irq 70 for MSI/MSI-X
[ 6.947061] megaraid_sas 0000:c1:00.0: irq 71 for MSI/MSI-X
[ 6.947087] megaraid_sas 0000:c1:00.0: irq 72 for MSI/MSI-X
[ 6.947113] megaraid_sas 0000:c1:00.0: irq 73 for MSI/MSI-X
[ 6.947139] megaraid_sas 0000:c1:00.0: irq 74 for MSI/MSI-X
[ 6.947167] megaraid_sas 0000:c1:00.0: irq 75 for MSI/MSI-X
[ 6.947192] megaraid_sas 0000:c1:00.0: irq 76 for MSI/MSI-X
[ 6.947218] megaraid_sas 0000:c1:00.0: irq 77 for MSI/MSI-X
[ 6.947243] megaraid_sas 0000:c1:00.0: irq 78 for MSI/MSI-X
[ 6.947269] megaraid_sas 0000:c1:00.0: irq 79 for MSI/MSI-X
[ 6.947300] megaraid_sas 0000:c1:00.0: irq 80 for MSI/MSI-X
[ 6.947325] megaraid_sas 0000:c1:00.0: irq 81 for MSI/MSI-X
[ 6.947351] megaraid_sas 0000:c1:00.0: irq 82 for MSI/MSI-X
[ 6.947376] megaraid_sas 0000:c1:00.0: irq 83 for MSI/MSI-X
[ 6.947402] megaraid_sas 0000:c1:00.0: irq 84 for MSI/MSI-X
[ 6.947428] megaraid_sas 0000:c1:00.0: irq 85 for MSI/MSI-X
[ 6.947453] megaraid_sas 0000:c1:00.0: irq 86 for MSI/MSI-X
[ 6.947480] megaraid_sas 0000:c1:00.0: irq 87 for MSI/MSI-X
[ 6.947501] megaraid_sas 0000:c1:00.0: irq 88 for MSI/MSI-X
[ 6.947523] megaraid_sas 0000:c1:00.0: irq 89 for MSI/MSI-X
[ 6.947545] megaraid_sas 0000:c1:00.0: irq 90 for MSI/MSI-X
[ 6.947571] megaraid_sas 0000:c1:00.0: irq 91 for MSI/MSI-X
[ 6.947609] megaraid_sas 0000:c1:00.0: irq 92 for MSI/MSI-X
[ 6.947636] megaraid_sas 0000:c1:00.0: irq 93 for MSI/MSI-X
[ 6.947661] megaraid_sas 0000:c1:00.0: irq 94 for MSI/MSI-X
[ 6.947685] megaraid_sas 0000:c1:00.0: irq 95 for MSI/MSI-X
[ 6.947711] megaraid_sas 0000:c1:00.0: irq 96 for MSI/MSI-X
[ 6.947736] megaraid_sas 0000:c1:00.0: irq 97 for MSI/MSI-X
[ 6.947761] megaraid_sas 0000:c1:00.0: irq 98 for MSI/MSI-X
[ 6.947786] megaraid_sas 0000:c1:00.0: irq 99 for MSI/MSI-X
[ 6.947811] megaraid_sas 0000:c1:00.0: irq 100 for MSI/MSI-X
[ 6.947836] megaraid_sas 0000:c1:00.0: irq 101 for MSI/MSI-X
[ 6.947859] megaraid_sas 0000:c1:00.0: irq 102 for MSI/MSI-X
[ 6.947884] megaraid_sas 0000:c1:00.0: irq 103 for MSI/MSI-X
[ 6.947912] megaraid_sas 0000:c1:00.0: irq 104 for MSI/MSI-X
[ 6.947938] megaraid_sas 0000:c1:00.0: irq 105 for MSI/MSI-X
[ 6.947969] megaraid_sas 0000:c1:00.0: irq 106 for MSI/MSI-X
[ 6.947994] megaraid_sas 0000:c1:00.0: irq 107 for MSI/MSI-X
[ 6.948018] megaraid_sas 0000:c1:00.0: irq 108 for MSI/MSI-X
[ 6.948043] megaraid_sas 0000:c1:00.0: irq 109 for MSI/MSI-X
[ 6.948068] megaraid_sas 0000:c1:00.0: irq 110 for MSI/MSI-X
[ 6.948090] megaraid_sas 0000:c1:00.0: irq 111 for MSI/MSI-X
[ 6.948115] megaraid_sas 0000:c1:00.0: irq 112 for MSI/MSI-X
[ 6.948140] megaraid_sas 0000:c1:00.0: irq 113 for MSI/MSI-X
[ 6.948163] megaraid_sas 0000:c1:00.0: irq 114 for MSI/MSI-X
[ 6.948187] megaraid_sas 0000:c1:00.0: irq 115 for MSI/MSI-X
[ 6.948306] megaraid_sas 0000:c1:00.0: firmware supports msix : (96)
[ 6.948307] megaraid_sas 0000:c1:00.0: current msix/online cpus : (48/48)
[ 6.948309] megaraid_sas 0000:c1:00.0: RDPQ mode : (disabled)
[ 6.948311] megaraid_sas 0000:c1:00.0: Current firmware supports maximum commands: 928 LDIO threshold: 237
[ 6.948652] megaraid_sas 0000:c1:00.0: Configured max firmware commands: 927
[ 6.950880] megaraid_sas 0000:c1:00.0: FW supports sync cache : No
[ 7.012450] libata version 3.00 loaded.
[ 7.013263] Compat-mlnx-ofed backport release: 1c4bf42
[ 7.019369] Backport based on mlnx_ofed/mlnx-ofa_kernel-4.0.git 1c4bf42
[ 7.027371] compat.git: mlnx_ofed/mlnx-ofa_kernel-4.0.git
[ 7.039114] tg3.c:v3.137 (May 11, 2014)
[ 7.048568] mpt3sas version 31.00.00.00 loaded
[ 7.054051] mpt3sas_cm0: 63 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (263564432 kB)
[ 7.054593] tg3 0000:81:00.0 eth0: Tigon3 [partno(BCM95720) rev 5720000] (PCI Express) MAC address 4c:d9:8f:48:5a:c7
[ 7.054597] tg3 0000:81:00.0 eth0: attached PHY is 5720C (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[1])
[ 7.054599] tg3 0000:81:00.0 eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1]
[ 7.054601] tg3 0000:81:00.0 eth0: dma_rwctrl[00000001] dma_mask[64-bit]
[ 7.054669] ahci 0000:86:00.2: version 3.0
[ 7.055168] ahci 0000:86:00.2: irq 119 for MSI/MSI-X
[ 7.055172] ahci 0000:86:00.2: irq 120 for MSI/MSI-X
[ 7.055175] ahci 0000:86:00.2: irq 121 for MSI/MSI-X
[ 7.055179] ahci 0000:86:00.2: irq 122 for MSI/MSI-X
[ 7.055182] ahci 0000:86:00.2: irq 123 for MSI/MSI-X
[ 7.055185] ahci 0000:86:00.2: irq 124 for MSI/MSI-X
[ 7.055188] ahci 0000:86:00.2: irq 125 for MSI/MSI-X
[ 7.055191] ahci 0000:86:00.2: irq 126 for MSI/MSI-X
[ 7.055194] ahci 0000:86:00.2: irq 127 for MSI/MSI-X
[ 7.055198] ahci 0000:86:00.2: irq 128 for MSI/MSI-X
[ 7.055201] ahci 0000:86:00.2: irq 129 for MSI/MSI-X
[ 7.055204] ahci 0000:86:00.2: irq 130 for MSI/MSI-X
[ 7.055207] ahci 0000:86:00.2: irq 131 for MSI/MSI-X
[ 7.055210] ahci 0000:86:00.2: irq 132 for MSI/MSI-X
[ 7.055214] ahci 0000:86:00.2: irq 133 for MSI/MSI-X
[ 7.055217] ahci 0000:86:00.2: irq 134 for MSI/MSI-X
[ 7.055277] ahci 0000:86:00.2: AHCI 0001.0301 32 slots 1 ports 6 Gbps 0x1 impl SATA mode
[ 7.055279] ahci 0000:86:00.2: flags: 64bit ncq sntf ilck pm led clo only pmp fbs pio slum part
[ 7.057168] scsi host2: ahci
[ 7.057408] ata1: SATA max UDMA/133 abar m4096@0xc0a02000 port 0xc0a02100 irq 119
[ 7.081092] tg3 0000:81:00.1 eth1: Tigon3 [partno(BCM95720) rev 5720000] (PCI Express) MAC address 4c:d9:8f:48:5a:c8
[ 7.081096] tg3 0000:81:00.1 eth1: attached PHY is 5720C (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[1])
[ 7.081098] tg3 0000:81:00.1 eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1]
[ 7.081099] tg3 0000:81:00.1 eth1: dma_rwctrl[00000001] dma_mask[64-bit]
[ 7.167286] mpt3sas_cm0: IOC Number : 0
[ 7.171244] mpt3sas 0000:84:00.0: irq 136 for MSI/MSI-X
[ 7.171270] mpt3sas 0000:84:00.0: irq 137 for MSI/MSI-X
[ 7.171295] mpt3sas 0000:84:00.0: irq 138 for MSI/MSI-X
[ 7.171319] mpt3sas 0000:84:00.0: irq 139 for MSI/MSI-X
[ 7.171343] mpt3sas 0000:84:00.0: irq 140 for MSI/MSI-X
[ 7.171367] mpt3sas 0000:84:00.0: irq 141 for MSI/MSI-X
[ 7.171394] mpt3sas 0000:84:00.0: irq 142 for MSI/MSI-X
[ 7.171417] mpt3sas 0000:84:00.0: irq 143 for MSI/MSI-X
[ 7.171440] mpt3sas 0000:84:00.0: irq 144 for MSI/MSI-X
[ 7.171470] mpt3sas 0000:84:00.0: irq 145 for MSI/MSI-X
[ 7.171494] mpt3sas 0000:84:00.0: irq 146 for MSI/MSI-X
[ 7.171517] mpt3sas 0000:84:00.0: irq 147 for MSI/MSI-X
[ 7.171541] mpt3sas 0000:84:00.0: irq 148 for MSI/MSI-X
[ 7.171563] mpt3sas 0000:84:00.0: irq 149 for MSI/MSI-X
[ 7.171595] mpt3sas 0000:84:00.0: irq 150 for MSI/MSI-X
[ 7.171624] mpt3sas 0000:84:00.0: irq 151 for MSI/MSI-X
[ 7.171648] mpt3sas 0000:84:00.0: irq 152 for MSI/MSI-X
[ 7.171674] mpt3sas 0000:84:00.0: irq 153 for MSI/MSI-X
[ 7.171695] mpt3sas 0000:84:00.0: irq 154 for MSI/MSI-X
[ 7.171722] mpt3sas 0000:84:00.0: irq 155 for MSI/MSI-X
[ 7.171743] mpt3sas 0000:84:00.0: irq 156 for MSI/MSI-X
[ 7.171763] mpt3sas 0000:84:00.0: irq 157 for MSI/MSI-X
[ 7.171786] mpt3sas 0000:84:00.0: irq 158 for MSI/MSI-X
[ 7.171805] mpt3sas 0000:84:00.0: irq 159 for MSI/MSI-X
[ 7.171825] mpt3sas 0000:84:00.0: irq 160 for MSI/MSI-X
[ 7.171843] mpt3sas 0000:84:00.0: irq 161 for MSI/MSI-X
[ 7.171867] mpt3sas 0000:84:00.0: irq 162 for MSI/MSI-X
[ 7.171887] mpt3sas 0000:84:00.0: irq 163 for MSI/MSI-X
[ 7.171908] mpt3sas 0000:84:00.0: irq 164 for MSI/MSI-X
[ 7.171929] mpt3sas 0000:84:00.0: irq 165 for MSI/MSI-X
[ 7.171955] mpt3sas 0000:84:00.0: irq 166 for MSI/MSI-X
[ 7.171976] mpt3sas 0000:84:00.0: irq 167 for MSI/MSI-X
[ 7.171994] mpt3sas 0000:84:00.0: irq 168 for MSI/MSI-X
[ 7.172016] mpt3sas 0000:84:00.0: irq 169 for MSI/MSI-X
[ 7.172039] mpt3sas 0000:84:00.0: irq 170 for MSI/MSI-X
[ 7.172069] mpt3sas 0000:84:00.0: irq 171 for MSI/MSI-X
[ 7.172094] mpt3sas 0000:84:00.0: irq 172 for MSI/MSI-X
[ 7.172115] mpt3sas 0000:84:00.0: irq 173 for MSI/MSI-X
[ 7.172139] mpt3sas 0000:84:00.0: irq 174 for MSI/MSI-X
[ 7.172162] mpt3sas 0000:84:00.0: irq 175 for MSI/MSI-X
[ 7.172182] mpt3sas 0000:84:00.0: irq 176 for MSI/MSI-X
[ 7.172205] mpt3sas 0000:84:00.0: irq 177 for MSI/MSI-X
[ 7.172230] mpt3sas 0000:84:00.0: irq 178 for MSI/MSI-X
[ 7.172252] mpt3sas 0000:84:00.0: irq 179 for MSI/MSI-X
[ 7.172276] mpt3sas 0000:84:00.0: irq 180 for MSI/MSI-X
[ 7.172299] mpt3sas 0000:84:00.0: irq 181 for MSI/MSI-X
[ 7.172325] mpt3sas 0000:84:00.0: irq 182 for MSI/MSI-X
[ 7.172348] mpt3sas 0000:84:00.0: irq 183 for MSI/MSI-X
[ 7.173138] mpt3sas0-msix0: PCI-MSI-X enabled: IRQ 136
[ 7.175499] mlx5_core 0000:01:00.0: firmware version: 20.26.1040
[ 7.175532] mlx5_core 0000:01:00.0: 126.016 Gb/s available PCIe bandwidth, limited by 8 GT/s x16 link at 0000:00:03.1 (capable of 252.048 Gb/s with 16 GT/s x16 link)
[ 7.199191] mpt3sas0-msix1: PCI-MSI-X enabled: IRQ 137
[ 7.205738] mpt3sas0-msix2: PCI-MSI-X enabled: IRQ 138
[ 7.205739] mpt3sas0-msix3: PCI-MSI-X enabled: IRQ 139
[ 7.205740] mpt3sas0-msix4: PCI-MSI-X enabled: IRQ 140
[ 7.205741] mpt3sas0-msix5: PCI-MSI-X enabled: IRQ 141
[ 7.205746] mpt3sas0-msix6: PCI-MSI-X enabled: IRQ 142
[ 7.205751] mpt3sas0-msix7: PCI-MSI-X enabled: IRQ 143
[ 7.205753] mpt3sas0-msix8: PCI-MSI-X enabled: IRQ 144
[ 7.205754] mpt3sas0-msix9: PCI-MSI-X enabled: IRQ 145
[ 7.205755] mpt3sas0-msix10: PCI-MSI-X enabled: IRQ 146
[ 7.205756] mpt3sas0-msix11: PCI-MSI-X enabled: IRQ 147
[ 7.205756] mpt3sas0-msix12: PCI-MSI-X enabled: IRQ 148
[ 7.205758] mpt3sas0-msix13: PCI-MSI-X enabled: IRQ 149
[ 7.205759] mpt3sas0-msix14: PCI-MSI-X enabled: IRQ 150
[ 7.205760] mpt3sas0-msix15: PCI-MSI-X enabled: IRQ 151
[ 7.205761] mpt3sas0-msix16: PCI-MSI-X enabled: IRQ 152
[ 7.205762] mpt3sas0-msix17: PCI-MSI-X enabled: IRQ 153
[ 7.205762] mpt3sas0-msix18: PCI-MSI-X enabled: IRQ 154
[ 7.205763] mpt3sas0-msix19: PCI-MSI-X enabled: IRQ 155
[ 7.205764] mpt3sas0-msix20: PCI-MSI-X enabled: IRQ 156
[ 7.205764] mpt3sas0-msix21: PCI-MSI-X enabled: IRQ 157
[ 7.205765] mpt3sas0-msix22: PCI-MSI-X enabled: IRQ 158
[ 7.205766] mpt3sas0-msix23: PCI-MSI-X enabled: IRQ 159
[ 7.205766] mpt3sas0-msix24: PCI-MSI-X enabled: IRQ 160
[ 7.205767] mpt3sas0-msix25: PCI-MSI-X enabled: IRQ 161
[ 7.205768] mpt3sas0-msix26: PCI-MSI-X enabled: IRQ 162
[ 7.205769] mpt3sas0-msix27: PCI-MSI-X enabled: IRQ 163
[ 7.205769] mpt3sas0-msix28: PCI-MSI-X enabled: IRQ 164
[ 7.205770] mpt3sas0-msix29: PCI-MSI-X enabled: IRQ 165
[ 7.205771] mpt3sas0-msix30: PCI-MSI-X enabled: IRQ 166
[ 7.205772] mpt3sas0-msix31: PCI-MSI-X enabled: IRQ 167
[ 7.205772] mpt3sas0-msix32: PCI-MSI-X enabled: IRQ 168
[ 7.205773] mpt3sas0-msix33: PCI-MSI-X enabled: IRQ 169
[ 7.205773] mpt3sas0-msix34: PCI-MSI-X enabled: IRQ 170
[ 7.205774] mpt3sas0-msix35: PCI-MSI-X enabled: IRQ 171
[ 7.205775] mpt3sas0-msix36: PCI-MSI-X enabled: IRQ 172
[ 7.205775] mpt3sas0-msix37: PCI-MSI-X enabled: IRQ 173
[ 7.205776] mpt3sas0-msix38: PCI-MSI-X enabled: IRQ 174
[ 7.205777] mpt3sas0-msix39: PCI-MSI-X enabled: IRQ 175
[ 7.205777] mpt3sas0-msix40: PCI-MSI-X enabled: IRQ 176
[ 7.205778] mpt3sas0-msix41: PCI-MSI-X enabled: IRQ 177
[ 7.205779] mpt3sas0-msix42: PCI-MSI-X enabled: IRQ 178
[ 7.205780] mpt3sas0-msix43: PCI-MSI-X enabled: IRQ 179
[ 7.205780] mpt3sas0-msix44: PCI-MSI-X enabled: IRQ 180
[ 7.205781] mpt3sas0-msix45: PCI-MSI-X enabled: IRQ 181
[ 7.205781] mpt3sas0-msix46: PCI-MSI-X enabled: IRQ 182
[ 7.205782] mpt3sas0-msix47: PCI-MSI-X enabled: IRQ 183
[ 7.205785] mpt3sas_cm0: iomem(0x00000000ac000000), mapped(0xffffb7fd9a200000), size(1048576)
[ 7.205786] mpt3sas_cm0: ioport(0x0000000000008000), size(256)
[ 7.285955] mpt3sas_cm0: IOC Number : 0
[ 7.285959] mpt3sas_cm0: sending message unit reset !!
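The mlx5_core bandwidth message above is straightforward link arithmetic: a PCIe Gen3 x16 link signals 8 GT/s per lane with 128b/130b encoding, and the adapter is capable of Gen4 (16 GT/s). A rough reconstruction of the two figures (the driver's per-lane constants round slightly differently, hence 126.016 and 252.048 in the log versus ~126.03 and ~252.06 here):

    # Approximate the "available PCIe bandwidth" figures printed
    # by mlx5_core above. 128/130 is the Gen3/Gen4 line encoding
    # overhead; the exact driver constants may differ slightly.
    def link_gbps(gt_per_s, lanes):
        return gt_per_s * lanes * 128 / 130

    print(round(link_gbps(8, 16), 2))   # ~126.03 Gb/s; log prints 126.016
    print(round(link_gbps(16, 16), 2))  # ~252.06 Gb/s; log prints 252.048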
[ 7.288958] mpt3sas_cm0: message unit reset: SUCCESS
[ 7.306959] megaraid_sas 0000:c1:00.0: Init cmd return status SUCCESS for SCSI host 0
[ 7.327958] megaraid_sas 0000:c1:00.0: firmware type : Legacy(64 VD) firmware
[ 7.327960] megaraid_sas 0000:c1:00.0: controller type : iMR(0MB)
[ 7.327961] megaraid_sas 0000:c1:00.0: Online Controller Reset(OCR) : Enabled
[ 7.327962] megaraid_sas 0000:c1:00.0: Secure JBOD support : No
[ 7.327963] megaraid_sas 0000:c1:00.0: NVMe passthru support : No
[ 7.349484] megaraid_sas 0000:c1:00.0: INIT adapter done
[ 7.349486] megaraid_sas 0000:c1:00.0: Jbod map is not supported megasas_setup_jbod_map 5146
[ 7.363966] ata1: SATA link down (SStatus 0 SControl 300)
[ 7.375578] megaraid_sas 0000:c1:00.0: pci id : (0x1000)/(0x005f)/(0x1028)/(0x1f4b)
[ 7.375580] megaraid_sas 0000:c1:00.0: unevenspan support : yes
[ 7.375581] megaraid_sas 0000:c1:00.0: firmware crash dump : no
[ 7.375582] megaraid_sas 0000:c1:00.0: jbod sync map : no
[ 7.375587] scsi host0: Avago SAS based MegaRAID driver
[ 7.395126] scsi 0:2:0:0: Direct-Access DELL PERC H330 Mini 4.30 PQ: 0 ANSI: 5
[ 7.433968] mlx5_core 0000:01:00.0: irq 185 for MSI/MSI-X
[ 7.433989] mlx5_core 0000:01:00.0: irq 186 for MSI/MSI-X
[ 7.434009] mlx5_core 0000:01:00.0: irq 187 for MSI/MSI-X
[ 7.434029] mlx5_core 0000:01:00.0: irq 188 for MSI/MSI-X
[ 7.434049] mlx5_core 0000:01:00.0: irq 189 for MSI/MSI-X
[ 7.434067] mlx5_core 0000:01:00.0: irq 190 for MSI/MSI-X
[ 7.434088] mlx5_core 0000:01:00.0: irq 191 for MSI/MSI-X
[ 7.434108] mlx5_core 0000:01:00.0: irq 192 for MSI/MSI-X
[ 7.434126] mlx5_core 0000:01:00.0: irq 193 for MSI/MSI-X
[ 7.434146] mlx5_core 0000:01:00.0: irq 194 for MSI/MSI-X
[ 7.434165] mlx5_core 0000:01:00.0: irq 195 for MSI/MSI-X
[ 7.434184] mlx5_core 0000:01:00.0: irq 196 for MSI/MSI-X
[ 7.434202] mlx5_core 0000:01:00.0: irq 197 for MSI/MSI-X
[ 7.434222] mlx5_core 0000:01:00.0: irq 198 for MSI/MSI-X
[ 7.434241] mlx5_core 0000:01:00.0: irq 199 for MSI/MSI-X
[ 7.434261] mlx5_core 0000:01:00.0: irq 200 for MSI/MSI-X
[ 7.434280] mlx5_core 0000:01:00.0: irq 201 for MSI/MSI-X
[ 7.434299] mlx5_core 0000:01:00.0: irq 202 for MSI/MSI-X
[ 7.434317] mlx5_core 0000:01:00.0: irq 203 for MSI/MSI-X
[ 7.434335] mlx5_core 0000:01:00.0: irq 204 for MSI/MSI-X
[ 7.434353] mlx5_core 0000:01:00.0: irq 205 for MSI/MSI-X
[ 7.434372] mlx5_core 0000:01:00.0: irq 206 for MSI/MSI-X
[ 7.434392] mlx5_core 0000:01:00.0: irq 207 for MSI/MSI-X
[ 7.434411] mlx5_core 0000:01:00.0: irq 208 for MSI/MSI-X
[ 7.434432] mlx5_core 0000:01:00.0: irq 209 for MSI/MSI-X
[ 7.434451] mlx5_core 0000:01:00.0: irq 210 for MSI/MSI-X
[ 7.434470] mlx5_core 0000:01:00.0: irq 211 for MSI/MSI-X
[ 7.434489] mlx5_core 0000:01:00.0: irq 212 for MSI/MSI-X
[ 7.434508] mlx5_core 0000:01:00.0: irq 213 for MSI/MSI-X
[ 7.434528] mlx5_core 0000:01:00.0: irq 214 for MSI/MSI-X
[ 7.434548] mlx5_core 0000:01:00.0: irq 215 for MSI/MSI-X
[ 7.434565] mlx5_core 0000:01:00.0: irq 216 for MSI/MSI-X
[ 7.434586] mlx5_core 0000:01:00.0: irq 217 for MSI/MSI-X
[ 7.434604] mlx5_core 0000:01:00.0: irq 218 for MSI/MSI-X
[ 7.434626] mlx5_core 0000:01:00.0: irq 219 for MSI/MSI-X
[ 7.434645] mlx5_core 0000:01:00.0: irq 220 for MSI/MSI-X
[ 7.434665] mlx5_core 0000:01:00.0: irq 221 for MSI/MSI-X
[ 7.434684] mlx5_core 0000:01:00.0: irq 222 for MSI/MSI-X
[ 7.434703] mlx5_core 0000:01:00.0: irq 223 for MSI/MSI-X
[ 7.434721] mlx5_core 0000:01:00.0: irq 224 for MSI/MSI-X
[ 7.434740] mlx5_core 0000:01:00.0: irq 225 for MSI/MSI-X
[ 7.434758] mlx5_core 0000:01:00.0: irq 226 for MSI/MSI-X
[ 7.434777] mlx5_core 0000:01:00.0: irq 227 for MSI/MSI-X
[ 7.434795] mlx5_core 0000:01:00.0: irq 228 for MSI/MSI-X
[ 7.434813] mlx5_core 0000:01:00.0: irq 229 for MSI/MSI-X
[ 7.434831] mlx5_core 0000:01:00.0: irq 230 for MSI/MSI-X
[ 7.434850] mlx5_core 0000:01:00.0: irq 231 for MSI/MSI-X
[ 7.434867] mlx5_core 0000:01:00.0: irq 232 for MSI/MSI-X
[ 7.434887] mlx5_core 0000:01:00.0: irq 233 for MSI/MSI-X
[ 7.435893] mlx5_core 0000:01:00.0: Port module event: module 0, Cable plugged
[ 7.436139] mlx5_core 0000:01:00.0: mlx5_pcie_event:303:(pid 317): PCIe slot advertised sufficient power (27W).
[ 7.443555] mlx5_core 0000:01:00.0: mlx5_fw_tracer_start:776:(pid 298): FWTracer: Ownership granted and active
[ 7.454201] mpt3sas_cm0: Allocated physical memory: size(38831 kB)
[ 7.454202] mpt3sas_cm0: Current Controller Queue Depth(7564), Max Controller Queue Depth(7680)
[ 7.454203] mpt3sas_cm0: Scatter Gather Elements per IO(128)
[ 7.603571] mpt3sas_cm0: FW Package Version(12.00.00.00)
[ 7.603840] mpt3sas_cm0: SAS3616: FWVersion(12.00.00.00), ChipRevision(0x02), BiosVersion(00.00.00.00)
[ 7.603844] mpt3sas_cm0: Protocol=(Initiator,Target,NVMe), Capabilities=(TLR,EEDP,Diag Trace Buffer,Task Set Full,NCQ)
[ 7.603913] mpt3sas 0000:84:00.0: Enabled Extended Tags as Controller Supports
[ 7.603928] mpt3sas_cm0: : host protection capabilities enabled DIF1 DIF2 DIF3
[ 7.603937] scsi host1: Fusion MPT SAS Host
[ 7.604185] mpt3sas_cm0: registering trace buffer support
[ 7.608433] mpt3sas_cm0: Trace buffer memory 2048 KB allocated
[ 7.608434] mpt3sas_cm0: sending port enable !!
[ 7.608722] mpt3sas_cm0: hba_port entry: ffff94f4e90a8680, port: 255 is added to hba_port list
[ 7.611194] mpt3sas_cm0: host_add: handle(0x0001), sas_addr(0x500605b00deb4820), phys(21)
[ 7.611899] mpt3sas_cm0: detecting: handle(0x0018), sas_address(0x500a0984dfa20c24), phy(0)
[ 7.611906] mpt3sas_cm0: REPORT_LUNS: handle(0x0018), retries(0)
[ 7.612671] mpt3sas_cm0: TEST_UNIT_READY: handle(0x0018), lun(0)
[ 7.613220] scsi 1:0:0:0: Direct-Access DELL MD34xx 0825 PQ: 0 ANSI: 5
[ 7.613299] scsi 1:0:0:0: SSP: handle(0x0018), sas_addr(0x500a0984dfa20c24), phy(0), device_name(0x500a0984dfa20c24)
[ 7.613301] scsi 1:0:0:0: enclosure logical id(0x300605b00d114820), slot(13)
[ 7.613302] scsi 1:0:0:0: enclosure level(0x0000), connector name( C3 )
[ 7.613303] scsi 1:0:0:0: serial_number(021825001558 )
[ 7.613305] scsi 1:0:0:0: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1)
[ 7.811107] mlx5_ib: Mellanox Connect-IB Infiniband driver v4.7-1.0.0
[ 7.818867] scsi 1:0:0:1: Direct-Access DELL MD34xx 0825 PQ: 0 ANSI: 5
[ 7.827053] scsi 1:0:0:1: SSP: handle(0x0018), sas_addr(0x500a0984dfa20c24), phy(0), device_name(0x500a0984dfa20c24)
[ 7.837563] scsi 1:0:0:1: enclosure logical id(0x300605b00d114820), slot(13)
[ 7.844700] scsi 1:0:0:1: enclosure level(0x0000), connector name( C3 )
[ 7.844729] scsi 1:0:0:1: serial_number(021825001558 )
[ 7.844732] scsi 1:0:0:1: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1)
[ 7.883191] scsi 1:0:0:31: Direct-Access DELL Universal Xport 0825 PQ: 0 ANSI: 5
[ 7.891453] scsi 1:0:0:31: SSP: handle(0x0018), sas_addr(0x500a0984dfa20c24), phy(0), device_name(0x500a0984dfa20c24)
[ 7.902052] scsi 1:0:0:31: enclosure logical id(0x300605b00d114820), slot(13)
[ 7.909271] scsi 1:0:0:31: enclosure level(0x0000), connector name( C3 )
[ 7.916079] scsi 1:0:0:31: serial_number(021825001558 )
[ 7.921573] scsi 1:0:0:31: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1)
[ 7.947231] mpt3sas_cm0: detecting: handle(0x0019), sas_address(0x500a0984dfa1fa10), phy(4)
[ 7.955588] mpt3sas_cm0: REPORT_LUNS: handle(0x0019), retries(0)
[ 7.962497] mpt3sas_cm0: TEST_UNIT_READY: handle(0x0019), lun(0)
[ 7.969063] scsi 1:0:1:0: Direct-Access DELL MD34xx 0825 PQ: 0 ANSI: 5
[ 7.977246] scsi 1:0:1:0: SSP: handle(0x0019), sas_addr(0x500a0984dfa1fa10), phy(4), device_name(0x500a0984dfa1fa10)
[ 7.987759] scsi 1:0:1:0: enclosure logical id(0x300605b00d114820), slot(9)
[ 7.994806] scsi 1:0:1:0: enclosure level(0x0000), connector name( C2 )
[ 8.001524] scsi 1:0:1:0: serial_number(021825001369 )
[ 8.006924] scsi 1:0:1:0: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1)
[ 8.009115] random: crng init done
[ 8.030939] scsi 1:0:1:1: Direct-Access DELL MD34xx 0825 PQ: 0 ANSI: 5
[ 8.039102] scsi 1:0:1:1: SSP: handle(0x0019), sas_addr(0x500a0984dfa1fa10), phy(4), device_name(0x500a0984dfa1fa10)
[ 8.049614] scsi 1:0:1:1: enclosure logical id(0x300605b00d114820), slot(9)
[ 8.056659] scsi 1:0:1:1: enclosure level(0x0000), connector name( C2 )
[ 8.063381] scsi 1:0:1:1: serial_number(021825001369 )
[ 8.068780] scsi 1:0:1:1: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1)
[ 8.077981] scsi 1:0:1:1: Mode parameters changed
[ 8.090732] sd 0:2:0:0: [sda] 467664896 512-byte logical blocks: (239 GB/223 GiB)
[ 8.098397] sd 0:2:0:0: [sda] Write Protect is off
[ 8.101193] scsi 1:0:1:31: Direct-Access DELL Universal Xport 0825 PQ: 0 ANSI: 5
[ 8.101276] scsi 1:0:1:31: SSP: handle(0x0019), sas_addr(0x500a0984dfa1fa10), phy(4), device_name(0x500a0984dfa1fa10)
[ 8.101277] scsi 1:0:1:31: enclosure logical id(0x300605b00d114820), slot(9)
[ 8.101278] scsi 1:0:1:31: enclosure level(0x0000), connector name( C2 )
[ 8.101280] scsi 1:0:1:31: serial_number(021825001369 )
[ 8.101282] scsi 1:0:1:31: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1)
[ 8.150460] sd 0:2:0:0: [sda] Mode Sense: 1f 00 10 08
[ 8.150493] sd 0:2:0:0: [sda] Write cache: disabled, read cache: disabled, supports DPO and FUA
[ 8.161306] sda: sda1 sda2 sda3
[ 8.164918] sd 0:2:0:0: [sda] Attached SCSI disk
[ 8.171244] mpt3sas_cm0: detecting: handle(0x0017), sas_address(0x500a0984da0f9b24), phy(8)
[ 8.179598] mpt3sas_cm0: REPORT_LUNS: handle(0x0017), retries(0)
[ 8.186434] mpt3sas_cm0: TEST_UNIT_READY: handle(0x0017), lun(0)
[ 8.192977] scsi 1:0:2:0: Direct-Access DELL MD34xx 0825 PQ: 0 ANSI: 5
[ 8.201150] scsi 1:0:2:0: SSP: handle(0x0017), sas_addr(0x500a0984da0f9b24), phy(8), device_name(0x500a0984da0f9b24)
[ 8.211665] scsi 1:0:2:0: enclosure logical id(0x300605b00d114820), slot(5)
[ 8.218711] scsi 1:0:2:0: enclosure level(0x0000), connector name( C1 )
[ 8.225431] scsi 1:0:2:0: serial_number(021812047179 )
[ 8.230830] scsi 1:0:2:0: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1)
[ 8.253872] scsi 1:0:2:1: Direct-Access DELL MD34xx 0825 PQ: 0 ANSI: 5
[ 8.262041] scsi 1:0:2:1: SSP: handle(0x0017), sas_addr(0x500a0984da0f9b24), phy(8), device_name(0x500a0984da0f9b24)
[ 8.272549] scsi 1:0:2:1: enclosure logical id(0x300605b00d114820), slot(5)
[ 8.279596] scsi 1:0:2:1: enclosure level(0x0000), connector name( C1 )
[ 8.286316] scsi 1:0:2:1: serial_number(021812047179 )
[ 8.291715] scsi 1:0:2:1: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1)
[ 8.314165] scsi 1:0:2:2: Direct-Access DELL MD34xx 0825 PQ: 0 ANSI: 5
[ 8.322330] scsi 1:0:2:2: SSP: handle(0x0017), sas_addr(0x500a0984da0f9b24), phy(8), device_name(0x500a0984da0f9b24)
[ 8.332845] scsi 1:0:2:2: enclosure logical id(0x300605b00d114820), slot(5)
[ 8.339891] scsi 1:0:2:2: enclosure level(0x0000), connector name( C1 )
[ 8.346611] scsi 1:0:2:2: serial_number(021812047179 )
[ 8.352012] scsi 1:0:2:2: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1)
[ 8.375158] scsi 1:0:2:31: Direct-Access DELL Universal Xport 0825 PQ: 0 ANSI: 5
[ 8.383407] scsi 1:0:2:31: SSP: handle(0x0017), sas_addr(0x500a0984da0f9b24), phy(8), device_name(0x500a0984da0f9b24)
[ 8.394008] scsi 1:0:2:31: enclosure logical id(0x300605b00d114820), slot(5)
[ 8.401139] scsi 1:0:2:31: enclosure level(0x0000), connector name( C1 )
[ 8.407944] scsi 1:0:2:31: serial_number(021812047179 )
[ 8.413434] scsi 1:0:2:31: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1)
[ 8.436227] mpt3sas_cm0: detecting: handle(0x001a), sas_address(0x500a0984db2fa910), phy(12)
[ 8.444663] mpt3sas_cm0: REPORT_LUNS: handle(0x001a), retries(0)
[ 8.451558] mpt3sas_cm0: TEST_UNIT_READY: handle(0x001a), lun(0)
[ 8.458076] scsi 1:0:3:0: Direct-Access DELL MD34xx 0825 PQ: 0 ANSI: 5
[ 8.466252] scsi 1:0:3:0: SSP: handle(0x001a), sas_addr(0x500a0984db2fa910), phy(12), device_name(0x500a0984db2fa910)
[ 8.476852] scsi 1:0:3:0: enclosure logical id(0x300605b00d114820), slot(1)
[ 8.483899] scsi 1:0:3:0: enclosure level(0x0000), connector name( C0 )
[ 8.490617] scsi 1:0:3:0: serial_number(021815000354 )
[ 8.496017] scsi 1:0:3:0: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1)
[ 8.516936] scsi 1:0:3:1: Direct-Access DELL MD34xx 0825 PQ: 0 ANSI: 5
[ 8.525104] scsi 1:0:3:1: SSP: handle(0x001a), sas_addr(0x500a0984db2fa910), phy(12), device_name(0x500a0984db2fa910)
[ 8.535701] scsi 1:0:3:1: enclosure logical id(0x300605b00d114820), slot(1)
[ 8.542746] scsi 1:0:3:1: enclosure level(0x0000), connector name( C0 )
[ 8.549467] scsi 1:0:3:1: serial_number(021815000354 )
[ 8.554868] scsi 1:0:3:1: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1)
[ 8.564079] scsi 1:0:3:1: Mode parameters changed
[ 8.580169] scsi 1:0:3:2: Direct-Access DELL MD34xx 0825 PQ: 0 ANSI: 5
[ 8.588349] scsi 1:0:3:2: SSP: handle(0x001a), sas_addr(0x500a0984db2fa910), phy(12), device_name(0x500a0984db2fa910)
[ 8.598951] scsi 1:0:3:2: enclosure logical id(0x300605b00d114820), slot(1)
[ 8.605999] scsi 1:0:3:2: enclosure level(0x0000), connector name( C0 )
[ 8.612714] scsi 1:0:3:2: serial_number(021815000354 )
[ 8.618115] scsi 1:0:3:2: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1)
[ 8.627318] scsi 1:0:3:2: Mode parameters changed
[ 8.643177] scsi 1:0:3:31: Direct-Access DELL Universal Xport 0825 PQ: 0 ANSI: 5
[ 8.651444] scsi 1:0:3:31: SSP: handle(0x001a), sas_addr(0x500a0984db2fa910), phy(12), device_name(0x500a0984db2fa910)
[ 8.662132] scsi 1:0:3:31: enclosure logical id(0x300605b00d114820), slot(1)
[ 8.669265] scsi 1:0:3:31: enclosure level(0x0000), connector name( C0 )
[ 8.676069] scsi 1:0:3:31: serial_number(021815000354 )
[ 8.681558] scsi 1:0:3:31: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1)
[ 8.704818] mpt3sas_cm0: detecting: handle(0x0011), sas_address(0x300705b00deb4820), phy(16)
[ 8.713255] mpt3sas_cm0: REPORT_LUNS: handle(0x0011), retries(0)
[ 8.719289] mpt3sas_cm0: TEST_UNIT_READY: handle(0x0011), lun(0)
[ 8.725635] scsi 1:0:4:0: Enclosure LSI VirtualSES 03 PQ: 0 ANSI: 7
[ 8.733764] scsi 1:0:4:0: set ignore_delay_remove for handle(0x0011)
[ 8.740118] scsi 1:0:4:0: SES: handle(0x0011), sas_addr(0x300705b00deb4820), phy(16), device_name(0x300705b00deb4820)
[ 8.750715] scsi 1:0:4:0: enclosure logical id(0x300605b00d114820), slot(16)
[ 8.757849] scsi 1:0:4:0: enclosure level(0x0000), connector name( C3 )
[ 8.764567] scsi 1:0:4:0: serial_number(300605B00D114820)
[ 8.769967] scsi 1:0:4:0: qdepth(1), tagged(0), simple(0), ordered(0), scsi_level(8), cmd_que(0)
[ 8.778773] mpt3sas_cm0: log_info(0x31200206): originator(PL), code(0x20), sub_code(0x0206)
[ 8.807963] mpt3sas_cm0: port enable: SUCCESS
[ 8.812806] scsi 1:0:0:0: rdac: LUN 0 (IOSHIP) (unowned)
[ 8.818381] sd 1:0:0:0: [sdb] 37449707520 512-byte logical blocks: (19.1 TB/17.4 TiB)
[ 8.826446] scsi 1:0:0:1: rdac: LUN 1 (IOSHIP) (owned)
[ 8.826664] sd 1:0:0:0: [sdb] Write Protect is off
[ 8.826666] sd 1:0:0:0: [sdb] Mode Sense: 83 00 10 08
[ 8.826796] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 8.829441] sd 1:0:0:0: [sdb] Attached SCSI disk
[ 8.849681] sd 1:0:0:1: [sdc] 37449707520 512-byte logical blocks: (19.1 TB/17.4 TiB)
[ 8.857777] scsi 1:0:1:0: rdac: LUN 0 (IOSHIP) (owned)
[ 8.863008] sd 1:0:0:1: [sdc] Write Protect is off
[ 8.863149] sd 1:0:1:0: [sdd] 37449707520 512-byte logical blocks: (19.1 TB/17.4 TiB)
[ 8.863527] scsi 1:0:1:1: rdac: LUN 1 (IOSHIP) (unowned)
[ 8.863714] sd 1:0:1:0: [sdd] Write Protect is off
[ 8.863715] sd 1:0:1:0: [sdd] Mode Sense: 83 00 10 08
[ 8.863771] sd 1:0:1:1: [sde] 37449707520 512-byte logical blocks: (19.1 TB/17.4 TiB)
[ 8.863908] sd 1:0:1:0: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 8.864106] scsi 1:0:2:0: rdac: LUN 0 (IOSHIP) (unowned)
[ 8.864338] sd 1:0:2:0: [sdf] 926167040 512-byte logical blocks: (474 GB/441 GiB)
[ 8.864339] sd 1:0:2:0: [sdf] 4096-byte physical blocks
[ 8.864361] sd 1:0:1:1: [sde] Write Protect is off
[ 8.864362] sd 1:0:1:1: [sde] Mode Sense: 83 00 10 08
[ 8.864543] sd 1:0:1:1: [sde] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 8.864617] scsi 1:0:2:1: rdac: LUN 1 (IOSHIP) (owned)
[ 8.864890] sd 1:0:2:0: [sdf] Write Protect is off
[ 8.864892] sd 1:0:2:0: [sdf] Mode Sense: 83 00 10 08
[ 8.864913] sd 1:0:2:1: [sdg] 37449707520 512-byte logical blocks: (19.1 TB/17.4 TiB)
[ 8.865160] sd 1:0:2:0: [sdf] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 8.865269] scsi 1:0:2:2: rdac: LUN 2 (IOSHIP) (unowned)
[ 8.865549] sd 1:0:2:2: [sdh] 37449707520 512-byte logical blocks: (19.1 TB/17.4 TiB)
[ 8.865695] sd 1:0:2:1: [sdg] Write Protect is off
[ 8.865696] sd 1:0:2:1: [sdg] Mode Sense: 83 00 10 08
[ 8.865866] scsi 1:0:3:0: rdac: LUN 0 (IOSHIP) (owned)
[ 8.865935] sd 1:0:2:1: [sdg] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 8.866100] sd 1:0:3:0: [sdi] 926167040 512-byte logical blocks: (474 GB/441 GiB)
[ 8.866101] sd 1:0:3:0: [sdi] 4096-byte physical blocks
[ 8.866227] sd 1:0:2:2: [sdh] Write Protect is off
[ 8.866229] sd 1:0:2:2: [sdh] Mode Sense: 83 00 10 08
[ 8.866399] scsi 1:0:3:1: rdac: LUN 1 (IOSHIP) (unowned)
[ 8.866455] sd 1:0:2:2: [sdh] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 8.866677] sd 1:0:3:1: [sdj] 37449707520 512-byte logical blocks: (19.1 TB/17.4 TiB)
[ 8.866711] sd 1:0:3:0: [sdi] Write Protect is off
[ 8.866712] sd 1:0:3:0: [sdi] Mode Sense: 83 00 10 08
[ 8.867003] sd 1:0:3:0: [sdi] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 8.867283] scsi 1:0:3:2: rdac: LUN 2 (IOSHIP) (owned)
[ 8.867723] sd 1:0:1:0: [sdd] Attached SCSI disk
[ 8.867724] sd 1:0:3:1: [sdj] Write Protect is off
[ 8.867725] sd 1:0:3:1: [sdj] Mode Sense: 83 00 10 08
[ 8.867741] sd 1:0:3:2: [sdk] 37449707520 512-byte logical blocks: (19.1 TB/17.4 TiB)
[ 8.867980] sd 1:0:3:1: [sdj] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 8.868081] sd 1:0:1:1: [sde] Attached SCSI disk
[ 8.869014] sd 1:0:3:2: [sdk] Write Protect is off
[ 8.869015] sd 1:0:3:2: [sdk] Mode Sense: 83 00 10 08
[ 8.869298] sd 1:0:2:1: [sdg] Attached SCSI disk
[ 8.869408] sd 1:0:3:2: [sdk] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 8.869684] sd 1:0:2:0: [sdf] Attached SCSI disk
[ 8.871295] sd 1:0:2:2: [sdh] Attached SCSI disk
[ 8.872186] sd 1:0:3:0: [sdi] Attached SCSI disk
[ 8.873140] sd 1:0:3:2: [sdk] Attached SCSI disk
[ 8.873207] sd 1:0:3:1: [sdj] Attached SCSI disk
[ 9.120326] sd 1:0:0:1: [sdc] Mode Sense: 83 00 10 08
[ 9.120492] sd 1:0:0:1: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 9.131026] sd 1:0:0:1: [sdc] Attached SCSI disk
[ 9.242623] EXT4-fs (sda2): mounted filesystem with ordered data mode. Opts: (null)
[ 9.466994] systemd-journald[377]: Received SIGTERM from PID 1 (systemd).
[ 9.505164] SELinux: Disabled at runtime.
[ 9.510083] SELinux: Unregistering netfilter hooks
[ 9.559978] type=1404 audit(1575989081.059:2): selinux=0 auid=4294967295 ses=4294967295
[ 9.588605] ip_tables: (C) 2000-2006 Netfilter Core Team
[ 9.594684] systemd[1]: Inserted module 'ip_tables'
[ 9.687108] EXT4-fs (sda2): re-mounted. Opts: (null)
[ 9.706135] systemd-journald[4901]: Received request to flush runtime journal from PID 1
[ 9.765057] ACPI Error: No handler for Region [SYSI] (ffff951469e7aa68) [IPMI] (20130517/evregion-162)
[ 9.777519] ACPI Error: Region IPMI (ID=7) has no handler (20130517/exfldio-305)
[ 9.787868] ACPI Error: Method parse/execution failed [\_SB_.PMI0._GHL] (Node ffff951469e775a0), AE_NOT_EXIST (20130517/psparse-536)
[ 9.806678] ACPI Error: Method parse/execution failed [\_SB_.PMI0._PMC] (Node ffff951469e77500), AE_NOT_EXIST (20130517/psparse-536)
[ 9.819416] ACPI Exception: AE_NOT_EXIST,
[ 9.819418] Evaluating _PMC (20130517/power_meter-753)
[ 9.836737] piix4_smbus 0000:00:14.0: SMBus Host Controller at 0xb00, revision 0
[ 9.844967] piix4_smbus 0000:00:14.0: Using register 0x2e for SMBus port selection
[ 9.854957] ipmi message handler version 39.2
[ 9.861529] device-mapper: uevent: version 1.0.3
[ 9.868266] device-mapper: ioctl: 4.37.1-ioctl (2018-04-03) initialised: dm-devel@redhat.com
[ 9.877338] ipmi device interface
[ 9.886155] ccp 0000:02:00.2: 3 command queues available
[ 9.892621] ccp 0000:02:00.2: irq 235 for MSI/MSI-X
[ 9.892648] ccp 0000:02:00.2: irq 236 for MSI/MSI-X
[ 9.892697] ccp 0000:02:00.2: Queue 2 can access 4 LSB regions
[ 9.899707] ccp 0000:02:00.2: Queue 3 can access 4 LSB regions
[ 9.906928] ccp 0000:02:00.2: Queue 4 can access 4 LSB regions
[ 9.914148] ccp 0000:02:00.2: Queue 0 gets LSB 4
[ 9.914149] ccp 0000:02:00.2: Queue 1 gets LSB 5
[ 9.914150] ccp 0000:02:00.2: Queue 2 gets LSB 6
[ 9.915782] ccp 0000:02:00.2: enabled
[ 9.916045] ccp 0000:03:00.1: 5 command queues available
[ 9.916097] ccp 0000:03:00.1: irq 238 for MSI/MSI-X
[ 9.916143] ccp 0000:03:00.1: Queue 0 can access 7 LSB regions
[ 9.916146] ccp 0000:03:00.1: Queue 1 can access 7 LSB regions
[ 9.916148] ccp 0000:03:00.1: Queue 2 can access 7 LSB regions
[ 9.916150] ccp 0000:03:00.1: Queue 3 can access 7 LSB regions
[ 9.916152] ccp 0000:03:00.1: Queue 4 can access 7 LSB regions
[ 9.916153] ccp 0000:03:00.1: Queue 0 gets LSB 1
[ 9.916155] ccp 0000:03:00.1: Queue 1 gets LSB 2
[ 9.916156] ccp 0000:03:00.1: Queue 2 gets LSB 3
[ 9.916157] ccp 0000:03:00.1: Queue 3 gets LSB 4
[ 9.916159] ccp 0000:03:00.1: Queue 4 gets LSB 5
[ 9.917203] ccp 0000:03:00.1: enabled
[ 9.917416] ccp 0000:41:00.2: 3 command queues available
[ 9.917464] ccp 0000:41:00.2: irq 240 for MSI/MSI-X
[ 9.917490] ccp 0000:41:00.2: irq 241 for MSI/MSI-X
[ 9.917558] ccp 0000:41:00.2: Queue 2 can access 4 LSB regions
[ 9.917561] ccp 0000:41:00.2: Queue 3 can access 4 LSB regions
[ 9.917563] ccp 0000:41:00.2: Queue 4 can access 4 LSB regions
[ 9.917564] ccp 0000:41:00.2: Queue 0 gets LSB 4
[ 9.917566] ccp 0000:41:00.2: Queue 1 gets LSB 5
[ 9.917567] ccp 0000:41:00.2: Queue 2 gets LSB 6
[ 9.918568] ccp 0000:41:00.2: enabled
[ 9.918698] ccp 0000:42:00.1: 5 command queues available
[ 9.918744] ccp 0000:42:00.1: irq 243 for MSI/MSI-X
[ 9.918772] ccp 0000:42:00.1: Queue 0 can access 7 LSB regions
[ 9.918774] ccp 0000:42:00.1: Queue 1 can access 7 LSB regions
[ 9.918776] ccp 0000:42:00.1: Queue 2 can access 7 LSB regions
[ 9.918778] ccp 0000:42:00.1: Queue 3 can access 7 LSB regions
[ 9.918780] ccp 0000:42:00.1: Queue 4 can access 7 LSB regions
[ 9.918781] ccp 0000:42:00.1: Queue 0 gets LSB 1
[ 9.918782] ccp 0000:42:00.1: Queue 1 gets LSB 2
[ 9.918784] ccp 0000:42:00.1: Queue 2 gets LSB 3
[ 9.918785] ccp 0000:42:00.1: Queue 3 gets LSB 4
[ 9.918786] ccp 0000:42:00.1: Queue 4 gets LSB 5
[ 9.920704] input: PC Speaker as /devices/platform/pcspkr/input/input2
[ 9.921067] ccp 0000:42:00.1: enabled
[ 9.921312] ccp 0000:85:00.2: 3 command queues available
[ 9.921360] ccp 0000:85:00.2: irq 245 for MSI/MSI-X
[ 9.921391] ccp 0000:85:00.2: irq 246 for MSI/MSI-X
[ 9.921443] ccp 0000:85:00.2: Queue 2 can access 4 LSB regions
[ 9.921445] ccp 0000:85:00.2: Queue 3 can access 4 LSB regions
[ 9.921447] ccp 0000:85:00.2: Queue 4 can access 4 LSB regions
[ 9.921449] ccp 0000:85:00.2: Queue 0 gets LSB 4
[ 9.921450] ccp 0000:85:00.2: Queue 1 gets LSB 5
[ 9.921452] ccp 0000:85:00.2: Queue 2 gets LSB 6
[ 9.921838] ccp 0000:85:00.2: enabled
[ 9.921962] ccp 0000:86:00.1: 5 command queues available
[ 9.922013] ccp 0000:86:00.1: irq 248 for MSI/MSI-X
[ 9.922039] ccp 0000:86:00.1: Queue 0 can access 7 LSB regions
[ 9.922041] ccp 0000:86:00.1: Queue 1 can access 7 LSB regions
[ 9.922044] ccp 0000:86:00.1: Queue 2 can access 7 LSB regions
[ 9.922046] ccp 0000:86:00.1: Queue 3 can access 7 LSB regions
[ 9.922048] ccp 0000:86:00.1: Queue 4 can access 7 LSB regions
[ 9.922050] ccp 0000:86:00.1: Queue 0 gets LSB 1
[ 9.922052] ccp 0000:86:00.1: Queue 1 gets LSB 2
[ 9.922053] ccp 0000:86:00.1: Queue 2 gets LSB 3
[ 9.922054] ccp 0000:86:00.1: Queue 3 gets LSB 4
[ 9.922055] ccp 0000:86:00.1: Queue 4 gets LSB 5
[ 9.922555] ccp 0000:86:00.1: enabled
[ 10.256588] ccp 0000:c2:00.2: 3 command queues available
[ 10.263528] ccp 0000:c2:00.2: irq 250 for MSI/MSI-X
[ 10.263555] ccp 0000:c2:00.2: irq 251 for MSI/MSI-X
[ 10.263668] ccp 0000:c2:00.2: Queue 2 can access 4 LSB regions
[ 10.270664] ccp 0000:c2:00.2: Queue 3 can access 4 LSB regions
[ 10.277892] ccp 0000:c2:00.2: Queue 4 can access 4 LSB regions
[ 10.278097] cryptd: max_cpu_qlen set to 1000
[ 10.290759] ccp 0000:c2:00.2: Queue 0 gets LSB 4
[ 10.296770] ccp 0000:c2:00.2: Queue 1 gets LSB 5
[ 10.302779] ccp 0000:c2:00.2: Queue 2 gets LSB 6
[ 10.309647] ccp 0000:c2:00.2: enabled
[ 10.314142] ccp 0000:c3:00.1: 5 command queues available
[ 10.320593] ccp 0000:c3:00.1: irq 253 for MSI/MSI-X
[ 10.320657] ccp 0000:c3:00.1: Queue 0 can access 7 LSB regions
[ 10.327761] ccp 0000:c3:00.1: Queue 1 can access 7 LSB regions
[ 10.334981] ccp 0000:c3:00.1: Queue 2 can access 7 LSB regions
[ 10.342192] ccp 0000:c3:00.1: Queue 3 can access 7 LSB regions
[ 10.349412] ccp 0000:c3:00.1: Queue 4 can access 7 LSB regions
[ 10.356631] ccp 0000:c3:00.1: Queue 0 gets LSB 1
[ 10.362632] ccp 0000:c3:00.1: Queue 1 gets LSB 2
[ 10.362927] sd 0:2:0:0: Attached scsi generic sg0 type 0
[ 10.363087] sd 1:0:0:0: Attached scsi generic sg1 type 0
[ 10.363244] sd 1:0:0:1: Attached scsi generic sg2 type 0
[ 10.363367] scsi 1:0:0:31: Attached scsi generic sg3 type 0
[ 10.363868] sd 1:0:1:0: Attached scsi generic sg4 type 0
[ 10.364050] sd 1:0:1:1: Attached scsi generic sg5 type 0
[ 10.364517] scsi 1:0:1:31: Attached scsi generic sg6 type 0
[ 10.364968] sd 1:0:2:0: Attached scsi generic sg7 type 0
[ 10.365034] sd 1:0:2:1: Attached scsi generic sg8 type 0
[ 10.365099] sd 1:0:2:2: Attached scsi generic sg9 type 0
[ 10.365159] scsi 1:0:2:31: Attached scsi generic sg10 type 0
[ 10.365216] sd 1:0:3:0: Attached scsi generic sg11 type 0
[ 10.365281] sd 1:0:3:1: Attached scsi generic sg12 type 0
[ 10.365334] sd 1:0:3:2: Attached scsi generic sg13 type 0
[ 10.365411] scsi 1:0:3:31: Attached scsi generic sg14 type 0
[ 10.365473] scsi 1:0:4:0: Attached scsi generic sg15 type 13
[ 10.477690] ccp 0000:c3:00.1: Queue 2 gets LSB 3
[ 10.483694] ccp 0000:c3:00.1: Queue 3 gets LSB 4
[ 10.489712] ccp 0000:c3:00.1: Queue 4 gets LSB 5
[ 10.496805] ccp 0000:c3:00.1: enabled
[ 10.502413] IPMI System Interface driver
[ 10.508170] ipmi_si dmi-ipmi-si.0: ipmi_platform: probing via SMBIOS
[ 10.515374] ipmi_si: SMBIOS: io 0xca8 regsize 1 spacing 4 irq 10
[ 10.522936] ipmi_si: Adding SMBIOS-specified kcs state machine
[ 10.529968] ipmi_si IPI0001:00: ipmi_platform: probing via ACPI
[ 10.537168] ipmi_si IPI0001:00: [io 0x0ca8] regsize 1 spacing 4 irq 10
[ 10.545136] ipmi_si dmi-ipmi-si.0: Removing SMBIOS-specified kcs state machine in favor of ACPI
[ 10.555216] ipmi_si: Adding ACPI-specified kcs state machine
[ 10.562402] ipmi_si: Trying ACPI-specified kcs state machine at i/o address 0xca8, slave address 0x20, irq 10
[ 10.576117] AVX2 version of gcm_enc/dec engaged.
[ 10.576883] sd 1:0:0:0: Embedded Enclosure Device
[ 10.587051] AES CTR mode by8 optimization enabled
[ 10.597850] alg: No test for __gcm-aes-aesni (__driver-gcm-aes-aesni)
[ 10.605117] ipmi_si IPI0001:00: The BMC does not support setting the recv irq bit, compensating, but the BMC needs to be fixed.
[ 10.613201] ipmi_si IPI0001:00: Using irq 10
[ 10.623907] alg: No test for __generic-gcm-aes-aesni (__driver-generic-gcm-aes-aesni)
[ 10.633449] dcdbas dcdbas: Dell Systems Management Base Driver (version 5.6.0-3.3)
[ 10.638497] ipmi_si IPI0001:00: Found new BMC (man_id: 0x0002a2, prod_id: 0x0100, dev_id: 0x20)
[ 10.708809] kvm: Nested Paging enabled
[ 10.714874] ipmi_si IPI0001:00: IPMI kcs interface initialized
[ 10.721353] MCE: In-kernel MCE decoding enabled.
[ 10.729269] AMD64 EDAC driver v3.4.0
[ 10.732873] EDAC amd64: DRAM ECC enabled.
[ 10.736886] EDAC amd64: F17h detected (node 0).
[ 10.741458] EDAC MC: UMC0 chip selects:
[ 10.741460] EDAC amd64: MC: 0: 0MB 1: 0MB
[ 10.746170] EDAC amd64: MC: 2: 16383MB 3: 16383MB
[ 10.746172] EDAC amd64: MC: 4: 0MB 5: 0MB
[ 10.746175] EDAC amd64: MC: 6: 0MB 7: 0MB
[ 10.746179] EDAC MC: UMC1 chip selects:
[ 10.746180] EDAC amd64: MC: 0: 0MB 1: 0MB
[ 10.746181] EDAC amd64: MC: 2: 16383MB 3: 16383MB
[ 10.746181] EDAC amd64: MC: 4: 0MB 5: 0MB
[ 10.746182] EDAC amd64: MC: 6: 0MB 7: 0MB
[ 10.746182] EDAC amd64: using x8 syndromes.
[ 10.746183] EDAC amd64: MCT channel count: 2
[ 10.746367] EDAC MC0: Giving out device to 'amd64_edac' 'F17h': DEV 0000:00:18.3
[ 10.746373] EDAC amd64: DRAM ECC enabled.
[ 10.746375] EDAC amd64: F17h detected (node 1).
[ 10.746417] EDAC MC: UMC0 chip selects:
[ 10.746418] EDAC amd64: MC: 0: 0MB 1: 0MB
[ 10.746419] EDAC amd64: MC: 2: 16383MB 3: 16383MB
[ 10.746420] EDAC amd64: MC: 4: 0MB 5: 0MB
[ 10.746420] EDAC amd64: MC: 6: 0MB 7: 0MB
[ 10.746423] EDAC MC: UMC1 chip selects:
[ 10.746423] EDAC amd64: MC: 0: 0MB 1: 0MB
[ 10.746424] EDAC amd64: MC: 2: 16383MB 3: 16383MB
[ 10.746424] EDAC amd64: MC: 4: 0MB 5: 0MB
[ 10.746425] EDAC amd64: MC: 6: 0MB 7: 0MB
[ 10.746425] EDAC amd64: using x8 syndromes.
[ 10.746426] EDAC amd64: MCT channel count: 2
[ 10.746677] EDAC MC1: Giving out device to 'amd64_edac' 'F17h': DEV 0000:00:19.3
[ 10.746683] EDAC amd64: DRAM ECC enabled.
[ 10.746684] EDAC amd64: F17h detected (node 2).
[ 10.746731] EDAC MC: UMC0 chip selects:
[ 10.746732] EDAC amd64: MC: 0: 0MB 1: 0MB
[ 10.746733] EDAC amd64: MC: 2: 16383MB 3: 16383MB
[ 10.746734] EDAC amd64: MC: 4: 0MB 5: 0MB
[ 10.746734] EDAC amd64: MC: 6: 0MB 7: 0MB
[ 10.746737] EDAC MC: UMC1 chip selects:
[ 10.746738] EDAC amd64: MC: 0: 0MB 1: 0MB
[ 10.746738] EDAC amd64: MC: 2: 16383MB 3: 16383MB
[ 10.746739] EDAC amd64: MC: 4: 0MB 5: 0MB
[ 10.746740] EDAC amd64: MC: 6: 0MB 7: 0MB
[ 10.746740] EDAC amd64: using x8 syndromes.
[ 10.746741] EDAC amd64: MCT channel count: 2
[ 10.747409] EDAC MC2: Giving out device to 'amd64_edac' 'F17h': DEV 0000:00:1a.3
[ 10.747415] EDAC amd64: DRAM ECC enabled.
[ 10.747416] EDAC amd64: F17h detected (node 3).
[ 10.747462] EDAC MC: UMC0 chip selects:
[ 10.747464] EDAC amd64: MC: 0: 0MB 1: 0MB
[ 10.747465] EDAC amd64: MC: 2: 16383MB 3: 16383MB
[ 10.747466] EDAC amd64: MC: 4: 0MB 5: 0MB
[ 10.747466] EDAC amd64: MC: 6: 0MB 7: 0MB
[ 10.747469] EDAC MC: UMC1 chip selects:
[ 10.747470] EDAC amd64: MC: 0: 0MB 1: 0MB
[ 10.747470] EDAC amd64: MC: 2: 16383MB 3: 16383MB
[ 10.747471] EDAC amd64: MC: 4: 0MB 5: 0MB
[ 10.747472] EDAC amd64: MC: 6: 0MB 7: 0MB
[ 10.747472] EDAC amd64: using x8 syndromes.
[ 10.747473] EDAC amd64: MCT channel count: 2
[ 10.748840] EDAC MC3: Giving out device to 'amd64_edac' 'F17h': DEV 0000:00:1b.3
[ 10.748887] EDAC PCI0: Giving out device to module 'amd64_edac' controller 'EDAC PCI controller': DEV '0000:00:18.0' (POLLED)
[ 11.744998] sd 1:0:0:1: Embedded Enclosure Device
[ 11.751783] scsi 1:0:0:31: Embedded Enclosure Device
[ 11.758844] sd 1:0:1:0: Embedded Enclosure Device
[ 11.765748] sd 1:0:1:1: Embedded Enclosure Device
[ 11.772551] scsi 1:0:1:31: Embedded Enclosure Device
[ 11.779588] sd 1:0:2:0: Embedded Enclosure Device
[ 11.786457] sd 1:0:2:1: Embedded Enclosure Device
[ 11.793206] sd 1:0:2:2: Embedded Enclosure Device
[ 11.799955] scsi 1:0:2:31: Embedded Enclosure Device
[ 11.806990] sd 1:0:3:0: Embedded Enclosure Device
[ 11.813917] sd 1:0:3:1: Embedded Enclosure Device
[ 11.822298] sd 1:0:3:2: Embedded Enclosure Device
[ 11.829136] scsi 1:0:3:31: Embedded Enclosure Device
[ 11.836864] ses 1:0:4:0: Attached Enclosure device
[ 42.346433] device-mapper: multipath round-robin: version 1.2.0 loaded
[ 43.220195] Adding 4194300k swap on /dev/sda3. Priority:-2 extents:1 across:4194300k FS
[ 43.256802] type=1305 audit(1575989114.755:3): audit_pid=11117 old=0 auid=4294967295 ses=4294967295 res=1
[ 43.278964] RPC: Registered named UNIX socket transport module.
[ 43.286177] RPC: Registered udp transport module.
[ 43.292269] RPC: Registered tcp transport module.
[ 43.298359] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 43.950029] mlx5_core 0000:01:00.0: slow_pci_heuristic:5575:(pid 11398): Max link speed = 100000, PCI BW = 126016
[ 43.960348] mlx5_core 0000:01:00.0: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[ 43.968625] mlx5_core 0000:01:00.0: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[ 44.429416] tg3 0000:81:00.0: irq 254 for MSI/MSI-X
[ 44.429429] tg3 0000:81:00.0: irq 255 for MSI/MSI-X
[ 44.429447] tg3 0000:81:00.0: irq 256 for MSI/MSI-X
[ 44.429457] tg3 0000:81:00.0: irq 257 for MSI/MSI-X
[ 44.429475] tg3 0000:81:00.0: irq 258 for MSI/MSI-X
[ 44.555599] IPv6: ADDRCONF(NETDEV_UP): em1: link is not ready
[ 47.978010] tg3 0000:81:00.0 em1: Link is up at 1000 Mbps, full duplex
[ 47.984572] tg3 0000:81:00.0 em1: Flow control is on for TX and on for RX
[ 47.991404] tg3 0000:81:00.0 em1: EEE is enabled
[ 47.996057] IPv6: ADDRCONF(NETDEV_CHANGE): em1: link becomes ready
[ 48.909325] IPv6: ADDRCONF(NETDEV_UP): ib0: link is not ready
[ 49.207004] IPv6: ADDRCONF(NETDEV_CHANGE): ib0: link becomes ready
[ 53.047669] FS-Cache: Loaded
[ 53.078307] FS-Cache: Netfs 'nfs' registered for caching
[ 53.088434] Key type dns_resolver registered
[ 53.116954] NFS: Registering the id_resolver key type
[ 53.123389] Key type id_resolver registered
[ 53.128955] Key type id_legacy registered
[ 2837.507441] LNet: HW NUMA nodes: 4, HW CPU cores: 48, npartitions: 4
[ 2837.514979] alg: No test for adler32 (adler32-zlib)
[ 2838.314168] Lustre: Lustre: Build Version: 2.12.3_4_g142b4d4
[ 2838.419091] LNet: 38882:0:(config.c:1627:lnet_inet_enumerate()) lnet: Ignoring interface em2: it's down
[ 2838.428850] LNet: Using FastReg for registration
[ 2838.445215] LNet: Added LNI 10.0.10.54@o2ib7 [8/256/0/180]
[ 2918.774498] LDISKFS-fs (dm-2): file extents enabled, maximum tree depth=5
[ 2918.864382] LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,acl,no_mbcache,nodelalloc
[ 2933.451571] LNetError: 38944:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) lpni 10.0.10.52@o2ib7 added to recovery queue. Health = 900
[ 2933.571615] LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.8.21.28@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 2934.080349] LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.0.10.109@o2ib7 (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 2934.097715] LustreError: Skipped 52 previous similar messages
[ 2935.117183] LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.9.110.52@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 2935.134552] LustreError: Skipped 51 previous similar messages
[ 2937.136131] LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.8.21.36@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 2937.153420] LustreError: Skipped 108 previous similar messages
[ 2941.144530] LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.9.102.22@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 2941.161900] LustreError: Skipped 139 previous similar messages
[ 2950.548805] LustreError: 39140:0:(mgc_request.c:249:do_config_log_add()) MGC10.0.10.51@o2ib7: failed processing log, type 1: rc = -5
[ 2954.857229] LustreError: 137-5: fir-MDT0003_UUID: not available for connect from 10.8.27.29@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 2954.874508] LustreError: Skipped 377 previous similar messages
[ 2981.560188] LustreError: 39140:0:(mgc_request.c:249:do_config_log_add()) MGC10.0.10.51@o2ib7: failed processing log, type 4: rc = -110
[ 3023.335671] Lustre: fir-MDT0003: Imperative Recovery not enabled, recovery window 300-900
[ 3023.426286] Lustre: fir-MDD0003: changelog on
[ 3023.445928] Lustre: fir-MDT0003: in recovery but waiting for the first client to connect
[ 3023.571011] Lustre: fir-MDT0003: Will be in recovery for at least 5:00, or until 1290 clients reconnect
[ 3024.580370] Lustre: fir-MDT0003: Connection restored to b4b07392-f919-4 (at 10.9.101.6@o2ib4)
[ 3024.588921] Lustre: Skipped 9 previous similar messages
[ 3025.244588] Lustre: fir-MDT0003: Connection restored to 91b198e6-ce9d-6e88-4a8a-d97e9eaae698 (at 10.8.26.36@o2ib6)
[ 3025.254938] Lustre: Skipped 1 previous similar message
[ 3026.417314] Lustre: fir-MDT0003: Connection restored to a607d905-25f5-e4a4-883e-1e660689fdcd (at 10.9.106.51@o2ib4)
[ 3026.427746] Lustre: Skipped 7 previous similar messages
[ 3028.437586] Lustre: fir-MDT0003: Connection restored to e70ce965-187a-edf2-7e34-ae13fd3057d5 (at 10.8.26.15@o2ib6)
[ 3028.447939] Lustre: Skipped 95 previous similar messages
[ 3032.440491] Lustre: fir-MDT0003: Connection restored to 4587af08-4157-6320-9e58-cade8713b082 (at 10.9.105.30@o2ib4)
[ 3032.450951] Lustre: Skipped 358 previous similar messages
[ 3040.445782] Lustre: fir-MDT0003: Connection restored to (at 10.8.28.12@o2ib6)
[ 3040.453014] Lustre: Skipped 419 previous similar messages
[ 3056.531830] Lustre: fir-MDT0003: Connection restored to (at 10.9.108.48@o2ib4)
[ 3056.539150] Lustre: Skipped 486 previous similar messages
[ 3062.204314] Lustre: fir-MDT0003: Recovery over after 0:39, of 1290 clients 1290 recovered and 0 were evicted.
[ 3192.631862] LustreError: 11-0: fir-MDT0001-osp-MDT0003: operation mds_statfs to node 10.0.10.51@o2ib7 failed: rc = -107
[ 3192.642648] Lustre: fir-MDT0001-osp-MDT0003: Connection to fir-MDT0001 (at 10.0.10.51@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[ 3241.298592] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.101.36@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 3241.315960] LustreError: Skipped 2 previous similar messages
[ 3270.864291] Lustre: fir-MDT0003: Connection restored to fir-MDT0001-mdtlov_UUID (at 10.0.10.52@o2ib7)
[ 3270.873511] Lustre: Skipped 1 previous similar message
[ 3348.211999] Lustre: fir-MDT0001-osp-MDT0003: Connection restored to 10.0.10.52@o2ib7 (at 10.0.10.52@o2ib7)
[ 4141.826645] Lustre: fir-MDT0003: haven't heard from client 82b9ac9e-bd42-fb9c-cb3e-f327857b510c (at 10.9.0.62@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389b38800, cur 1575993213 expire 1575993063 last 1575992986
[ 6598.564498] Lustre: fir-MDT0003: Connection restored to (at 10.9.0.62@o2ib4)
[ 6599.828045] Lustre: fir-MDT0003: haven't heard from client cec884d3-ca4b-8127-2f6b-7762665aa5f8 (at 10.9.0.64@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389d3a000, cur 1575995671 expire 1575995521 last 1575995444
[ 8787.926814] Lustre: fir-MDT0003: Connection restored to (at 10.9.0.64@o2ib4)
[ 9220.856757] Lustre: fir-MDT0003: haven't heard from client fb9a2d5e-e9b3-4fb9-b988-9954fcfb0920 (at 10.8.0.66@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389de3000, cur 1575998292 expire 1575998142 last 1575998065
[11273.611676] Lustre: fir-MDT0003: Connection restored to fb9a2d5e-e9b3-4fb9-b988-9954fcfb0920 (at 10.8.0.66@o2ib6)
[12762.906846] Lustre: fir-MDT0003: haven't heard from client 40a204f8-61bd-7bf5-8e8b-66a640362528 (at 10.8.21.28@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9503899d1400, cur 1576001834 expire 1576001684 last 1576001607
[14499.557804] Lustre: fir-MDT0003: Connection restored to fd516b75-9a6c-4 (at 10.9.108.39@o2ib4)
[14838.727556] Lustre: fir-MDT0003: Connection restored to (at 10.8.21.14@o2ib6)
[14853.331204] Lustre: fir-MDT0003: Connection restored to 98c710cf-a183-35fe-d60d-8494e153f1c3 (at 10.8.21.13@o2ib6)
[14855.797391] Lustre: fir-MDT0003: Connection restored to 172ec88c-3454-1411-8e15-a9b5202e9e30 (at 10.8.21.8@o2ib6)
[14855.807655] Lustre: Skipped 1 previous similar message
[14865.752580] Lustre: fir-MDT0003: Connection restored to (at 10.8.21.28@o2ib6)
[14865.759810] Lustre: Skipped 1 previous similar message
[14875.954327] Lustre: fir-MDT0003: Connection restored to (at 10.8.21.20@o2ib6)
[14892.358091] Lustre: fir-MDT0003: Connection restored to (at 10.8.20.18@o2ib6)
[14892.365320] Lustre: Skipped 5 previous similar messages
[14926.321780] Lustre: fir-MDT0003: Connection restored to (at 10.8.20.15@o2ib6)
[14926.329015] Lustre: Skipped 13 previous similar messages
[15016.074436] Lustre: fir-MDT0003: Connection restored to 5f11dd29-1211-44a2-2612-f8309cf085b3 (at 10.8.21.18@o2ib6)
[15016.084784] Lustre: Skipped 15 previous similar messages
[15867.781432] Lustre: fir-MDT0003: Connection restored to (at 10.8.22.12@o2ib6)
[15890.643201] Lustre: fir-MDT0003: Connection restored to 92c08489-d99f-9692-0d8e-5d862ef77698 (at 10.8.22.5@o2ib6)
[15964.864086] Lustre: fir-MDT0003: Connection restored to (at 10.8.20.8@o2ib6)
[15964.871223] Lustre: Skipped 2 previous similar messages
[16052.471946] Lustre: 39781:0:(mdd_device.c:1807:mdd_changelog_clear()) fir-MDD0003: Failure to clear the changelog for user 1: -22
[16052.971116] Lustre: 39746:0:(mdd_device.c:1807:mdd_changelog_clear()) fir-MDD0003: Failure to clear the changelog for user 1: -22
[16052.982764] Lustre: 39746:0:(mdd_device.c:1807:mdd_changelog_clear()) Skipped 1572 previous similar messages
[16053.971294] Lustre: 39833:0:(mdd_device.c:1807:mdd_changelog_clear()) fir-MDD0003: Failure to clear the changelog for user 1: -22
[16053.982952] Lustre: 39833:0:(mdd_device.c:1807:mdd_changelog_clear()) Skipped 3310 previous similar messages
[17932.566014] Lustre: fir-MDT0003: Connection restored to (at 10.8.22.4@o2ib6)
[23008.192053] Lustre: fir-MDT0003: Connection restored to (at 10.8.20.27@o2ib6)
[25217.057685] Lustre: fir-MDT0003: haven't heard from client 8fbd1a16-d09d-1ef7-e10d-4e68dc0a9f97 (at 10.8.23.32@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389fda000, cur 1576014288 expire 1576014138 last 1576014061
[25217.079457] Lustre: Skipped 12 previous similar messages
[27392.631002] Lustre: fir-MDT0003: Connection restored to (at 10.8.23.32@o2ib6)
[29005.778458] Lustre: fir-MDT0003: Connection restored to (at 10.8.21.36@o2ib6)
[31462.145114] Lustre: fir-MDT0003: haven't heard from client ee4590b6-1057-e690-5db0-89b0af3963cd (at 10.8.22.30@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9503894bbc00, cur 1576020533 expire 1576020383 last 1576020306
[31838.332180] Lustre: fir-MDT0003: Connection restored to c20915b7-72a8-8f0f-a961-7c81095a2283 (at 10.8.23.29@o2ib6)
[33546.995197] Lustre: fir-MDT0003: Connection restored to (at 10.8.22.30@o2ib6)
[37011.001790] Lustre: fir-MDT0003: Connection restored to (at 10.8.23.20@o2ib6)
[41664.233072] Lustre: fir-MDT0003: Connection restored to 43d748a2-b8c5-e7f9-8b00-d16d4390ff4d (at 10.8.22.6@o2ib6)
[42368.269042] Lustre: fir-MDT0003: haven't heard from client b6bab463-5f5c-8f5c-f09a-8f0ce0f6e1cd (at 10.8.21.31@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389482800, cur 1576031439 expire 1576031289 last 1576031212
[42444.291179] Lustre: fir-MDT0003: haven't heard from client 7515dbe4-f1c8-844a-9186-76f9c6288c34 (at 10.9.104.2@o2ib4) in 222 seconds. I think it's dead, and I am evicting it. exp ffff9503894bc000, cur 1576031515 expire 1576031365 last 1576031293
[42444.312913] Lustre: Skipped 4 previous similar messages
[43596.616765] Lustre: fir-MDT0003: Connection restored to (at 10.9.114.14@o2ib4)
[43643.886939] Lustre: fir-MDT0003: Connection restored to (at 10.8.19.6@o2ib6)
[43832.755545] Lustre: fir-MDT0003: Connection restored to (at 10.9.110.71@o2ib4)
[43872.091163] Lustre: fir-MDT0003: Connection restored to (at 10.9.107.9@o2ib4)
[43956.553510] Lustre: fir-MDT0003: Connection restored to e8872901-9e69-2d9a-e57a-55077a64186b (at 10.9.109.25@o2ib4)
[44086.850690] Lustre: fir-MDT0003: Connection restored to 2ad8ff13-d978-9373-7245-882c6479cc4c (at 10.9.110.63@o2ib4)
[44103.770476] Lustre: fir-MDT0003: Connection restored to (at 10.9.110.62@o2ib4)
[44103.777790] Lustre: Skipped 2 previous similar messages
[44328.166759] Lustre: fir-MDT0003: Connection restored to b5acf087-1850-f5e1-236a-4cc1bab1a9f0 (at 10.9.104.34@o2ib4)
[44328.177191] Lustre: Skipped 2 previous similar messages
[44435.765330] Lustre: fir-MDT0003: Connection restored to b6bab463-5f5c-8f5c-f09a-8f0ce0f6e1cd (at 10.8.21.31@o2ib6)
[44435.775678] Lustre: Skipped 3 previous similar messages
[44653.028791] Lustre: fir-MDT0003: Connection restored to (at 10.8.28.9@o2ib6)
[44653.035938] Lustre: Skipped 4 previous similar messages
[46686.343210] Lustre: fir-MDT0003: haven't heard from client aadbd140-afe6-3cc5-5efa-1bf64465f6e7 (at 10.8.20.34@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389bb6000, cur 1576035757 expire 1576035607 last 1576035530
[46686.364910] Lustre: Skipped 13 previous similar messages
[48797.430382] Lustre: fir-MDT0003: Connection restored to aadbd140-afe6-3cc5-5efa-1bf64465f6e7 (at 10.8.20.34@o2ib6)
[48797.440734] Lustre: Skipped 8 previous similar messages
[53539.926853] Lustre: fir-MDT0003: Connection restored to (at 10.8.23.17@o2ib6)
[60506.799323] Lustre: fir-MDT0003: Connection restored to 77f07ca8-e3bd-72f6-4ac1-3da8889522b3 (at 10.8.22.19@o2ib6)
[61008.924213] Lustre: fir-MDT0003: Connection restored to (at 10.8.20.5@o2ib6)
[66644.626075] Lustre: fir-MDT0003: haven't heard from client 09a03217-f2a1-2632-097f-38339f6cbc7c (at 10.8.22.1@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389487800, cur 1576055715 expire 1576055565 last 1576055488
[66695.130281] Lustre: fir-MDT0003: Connection restored to 37c7e464-6686-fdc0-1c81-eae75026a910 (at 10.8.22.2@o2ib6)
[68590.175786] Lustre: fir-MDT0003: Connection restored to (at 10.8.23.25@o2ib6)
[68764.763183] Lustre: fir-MDT0003: Connection restored to (at 10.8.22.1@o2ib6)
[69954.638618] Lustre: fir-MDT0003: haven't heard from client d48dfcab-ce8f-b93c-3409-a3e76df7c945 (at 10.8.23.22@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389bb7000, cur 1576059025 expire 1576058875 last 1576058798
[72136.162614] Lustre: fir-MDT0003: Connection restored to (at 10.8.23.22@o2ib6)
[87902.706067] Lustre: fir-MDT0003: Connection restored to (at 10.8.22.7@o2ib6)
[92698.912670] Lustre: fir-MDT0003: haven't heard from client 5a6b489d-8a0c-1dc7-c222-8c5330c92213 (at 10.8.8.20@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389af8c00, cur 1576081769 expire 1576081619 last 1576081542
[92879.898829] Lustre: fir-MDT0003: haven't heard from client dcb788f4-67f3-4 (at 10.9.109.25@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9522a4bb6c00, cur 1576081950 expire 1576081800 last 1576081723
[92879.918821] Lustre: Skipped 8 previous similar messages
[92887.311317] Lustre: fir-MDT0003: Connection restored to 714da8dd-1047-4 (at 10.9.107.20@o2ib4)
[93134.535223] Lustre: fir-MDT0003: Connection restored to (at 10.9.110.71@o2ib4)
[93149.517880] Lustre: fir-MDT0003: Connection restored to e8872901-9e69-2d9a-e57a-55077a64186b (at 10.9.109.25@o2ib4)
[94132.257926] Lustre: fir-MDT0003: Connection restored to (at 10.9.117.46@o2ib4)
[94165.193694] Lustre: fir-MDT0003: Connection restored to 4c497e0b-ea41-4 (at 10.8.9.1@o2ib6)
[94379.371020] Lustre: fir-MDT0003: Connection restored to 8a77a7b3-28b8-5200-390a-7fe51bf1be0a (at 10.8.7.5@o2ib6)
[94472.765674] Lustre: fir-MDT0003: Connection restored to (at 10.9.101.60@o2ib4)
[94489.421254] Lustre: fir-MDT0003: Connection restored to 54fd6f2e-cb6c-4 (at 10.9.101.57@o2ib4)
[94500.606611] Lustre: fir-MDT0003: Connection restored to (at 10.9.101.59@o2ib4)
[94588.848720] Lustre: fir-MDT0003: Connection restored to 5a6b489d-8a0c-1dc7-c222-8c5330c92213 (at 10.8.8.20@o2ib6)
[94793.171929] Lustre: fir-MDT0003: Connection restored to fc841094-f1fd-2756-1968-f74105b220e6 (at 10.8.8.30@o2ib6)
[95044.412512] Lustre: fir-MDT0003: Connection restored to (at 10.9.102.48@o2ib4)
[95044.419835] Lustre: Skipped 2 previous similar messages
[95502.554791] Lustre: fir-MDT0003: Connection restored to 6676e5f3-c59e-c628-05b4-c9153b23c3f7 (at 10.8.21.16@o2ib6)
[95502.565144] Lustre: Skipped 1 previous similar message
[103469.783349] Lustre: fir-MDT0003: Connection restored to 5ce2e68e-76b2-bbc3-75c5-66a5c2b02651 (at 10.8.23.15@o2ib6)
[103469.793789] Lustre: Skipped 1 previous similar message
[104880.105661] Lustre: fir-MDT0003: haven't heard from client 45ffa07c-203c-dad9-8f0d-e714fc6465b8 (at 10.8.22.11@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389bb5800, cur 1576093950 expire 1576093800 last 1576093723
[104880.127471] Lustre: Skipped 1 previous similar message
[106557.064041] Lustre: fir-MDT0003: haven't heard from client 704e8622-7442-8eb3-b4e3-c86a69ef45af (at 10.8.20.21@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389428000, cur 1576095627 expire 1576095477 last 1576095400
[106946.217064] Lustre: fir-MDT0003: Connection restored to (at 10.8.22.11@o2ib6)
[106957.608917] Lustre: fir-MDT0003: Connection restored to 4f86dcb5-8d8c-1599-bd44-005eb718eb65 (at 10.8.22.10@o2ib6)
[107013.360901] Lustre: fir-MDT0003: Connection restored to (at 10.8.23.23@o2ib6)
[107596.096007] Lustre: fir-MDT0003: haven't heard from client c3415e6e-dda3-8602-28df-a932f656881d (at 10.9.112.17@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389c59800, cur 1576096666 expire 1576096516 last 1576096439
[108674.867058] Lustre: fir-MDT0003: Connection restored to 704e8622-7442-8eb3-b4e3-c86a69ef45af (at 10.8.20.21@o2ib6)
[108706.406434] Lustre: fir-MDT0003: Connection restored to c3415e6e-dda3-8602-28df-a932f656881d (at 10.9.112.17@o2ib4)
[109066.807950] Lustre: fir-MDT0003: Connection restored to 4c497e0b-ea41-4 (at 10.8.9.1@o2ib6)
[109081.355094] Lustre: fir-MDT0003: Connection restored to (at 10.8.23.13@o2ib6)
[109175.686716] Lustre: fir-MDT0003: Connection restored to 37c7e464-6686-fdc0-1c81-eae75026a910 (at 10.8.22.2@o2ib6)
[109273.635261] Lustre: fir-MDT0003: Connection restored to b34be8aa-32d9-4 (at 10.9.113.13@o2ib4)
[109354.753983] Lustre: fir-MDT0003: Connection restored to (at 10.9.101.60@o2ib4)
[109686.910367] Lustre: fir-MDT0003: Connection restored to (at 10.8.24.7@o2ib6)
[109686.917594] Lustre: Skipped 1 previous similar message
[111190.129672] Lustre: fir-MDT0003: haven't heard from client 000d6715-906a-fe00-99d9-1ba39760e7f7 (at 10.8.22.16@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389b3e400, cur 1576100260 expire 1576100110 last 1576100033
[111683.134894] Lustre: fir-MDT0003: haven't heard from client 85fbdf3d-35db-072c-03b7-e9977baaa2bf (at 10.8.23.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9503894b9c00, cur 1576100753 expire 1576100603 last 1576100526
[111683.156680] Lustre: Skipped 1 previous similar message
[111893.279359] Lustre: fir-MDT0003: Connection restored to (at 10.8.23.12@o2ib6)
[113268.801486] Lustre: fir-MDT0003: Connection restored to (at 10.8.23.8@o2ib6)
[113276.709857] Lustre: fir-MDT0003: Connection restored to (at 10.8.23.18@o2ib6)
[113287.360236] Lustre: fir-MDT0003: Connection restored to (at 10.8.22.18@o2ib6)
[113304.543399] Lustre: fir-MDT0003: Connection restored to 94396c8b-eccd-7da2-de85-f79420b2e641 (at 10.8.23.33@o2ib6)
[113304.553835] Lustre: Skipped 1 previous similar message
[114374.171347] Lustre: fir-MDT0003: haven't heard from client 8c2fd243-a078-4 (at 10.9.117.46@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff951316a31000, cur 1576103444 expire 1576103294 last 1576103217
[114528.774340] Lustre: fir-MDT0003: Connection restored to (at 10.9.117.46@o2ib4)
[114528.781740] Lustre: Skipped 2 previous similar messages
[114768.131164] Lustre: fir-MDT0003: Connection restored to (at 10.8.21.2@o2ib6)
[114779.477475] Lustre: fir-MDT0003: Connection restored to (at 10.8.22.32@o2ib6)
[116444.945773] Lustre: fir-MDT0003: Connection restored to 98b70d1a-7357-ff1b-1e1d-8bd68b6592c2 (at 10.8.23.27@o2ib6)
[116694.497294] Lustre: fir-MDT0003: Connection restored to (at 10.8.22.27@o2ib6)
[130426.448683] Lustre: fir-MDT0003: Connection restored to (at 10.8.20.26@o2ib6)
[132604.453027] Lustre: fir-MDT0003: Connection restored to 207217ac-1163-df36-3120-8bf6c3ecbb93 (at 10.8.23.21@o2ib6)
[140144.748508] Lustre: fir-MDT0003: Connection restored to e15078c5-8209-4 (at 10.8.25.17@o2ib6)
[140198.489155] Lustre: fir-MDT0003: haven't heard from client e15078c5-8209-4 (at 10.8.25.17@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389e53400, cur 1576129268 expire 1576129118 last 1576129041
[140573.492518] Lustre: fir-MDT0003: haven't heard from client 208ccf09-d6ca-4 (at 10.8.25.17@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9533896c9c00, cur 1576129643 expire 1576129493 last 1576129416
[141714.093142] Lustre: fir-MDT0003: Connection restored to e15078c5-8209-4 (at 10.8.25.17@o2ib6)
[142193.515329] Lustre: fir-MDT0003: haven't heard from client 0cfc0c49-f407-4 (at 10.8.25.17@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff95224169e800, cur 1576131263 expire 1576131113 last 1576131036
[144610.388050] Lustre: fir-MDT0003: Connection restored to (at 10.8.22.20@o2ib6)
[144616.509853] Lustre: fir-MDT0003: Connection restored to bd358c1a-07c6-3f9f-7c84-efdb04e29ef9 (at 10.8.21.1@o2ib6)
[144680.792414] Lustre: fir-MDT0003: Connection restored to 26627d4d-9b72-83d5-02a3-73c7f9501a91 (at 10.8.22.26@o2ib6)
[149464.595345] Lustre: fir-MDT0003: Connection restored to (at 10.8.23.26@o2ib6)
[149477.826991] Lustre: fir-MDT0003: Connection restored to ca09bd61-a4b3-111c-b997-9c7823236764 (at 10.8.22.17@o2ib6)
[149480.148389] Lustre: fir-MDT0003: Connection restored to 00850750-7463-78da-94ee-623be2781c44 (at 10.8.22.22@o2ib6)
[149496.950053] Lustre: fir-MDT0003: Connection restored to a507eb44-8ff1-13e2-fab8-30d1823663f8 (at 10.8.22.24@o2ib6)
[149741.859128] Lustre: fir-MDT0003: Connection restored to b756702e-9d2d-2fc3-6975-b8532eea724a (at 10.8.22.14@o2ib6)
[149962.731204] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576138431/real 1576138431] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576139032 ref 1 fl Rpc:X/0/ffffffff rc 0/-1
[149962.759414] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[149962.775818] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[150244.009818] INFO: task mdt00_000:39185 blocked for more than 120 seconds.
[150244.016701] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[150244.024623] mdt00_000 D ffff9533a23ad140 0 39185 2 0x00000080
[150244.031821] Call Trace:
[150244.034367] [<ffffffff89969192>] ? mutex_lock+0x12/0x2f
[150244.039788] [<ffffffff8996af19>] schedule+0x29/0x70
[150244.044848] [<ffffffff8996c805>] rwsem_down_write_failed+0x225/0x3a0
[150244.051382] [<ffffffff894b11a4>] ? dquot_get_dqblk+0x144/0x1f0
[150244.057422] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30
[150244.064212] [<ffffffff8996a24d>] down_write+0x2d/0x3d
[150244.069473] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[150244.076906] [<ffffffffc0a98cf2>] ? cfs_hash_lookup+0xa2/0xd0 [libcfs]
[150244.083547] [<ffffffffc16734dc>] ? lod_qos_statfs_update+0x3c/0x2b0 [lod]
[150244.090623] [<ffffffffc1674eb5>] ? lod_prepare_avoidance+0x375/0x780 [lod]
[150244.097691] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod]
[150244.104712] [<ffffffffc13be382>] ? qsd_op_begin+0x262/0x4b0 [lquota]
[150244.111267] [<ffffffffc144009b>] ? osd_declare_inode_qid+0x27b/0x430 [osd_ldiskfs]
[150244.119028] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod]
[150244.125795] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod]
[150244.133120] [<ffffffffc16775df>] ? lod_sub_declare_create+0xdf/0x210 [lod]
[150244.140295] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod]
[150244.146939] [<ffffffffc0d69f99>] ? lu_context_refill+0x19/0x50 [obdclass]
[150244.153913] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[150244.161853] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd]
[150244.168389] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd]
[150244.174419] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt]
[150244.180899] [<ffffffffc0d7ddf8>] ? upcall_cache_get_entry+0x218/0x8b0 [obdclass]
[150244.188478] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt]
[150244.194615] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[150244.201243] [<ffffffffc154e826>] ? mdt_intent_fixup_resent+0x36/0x220 [mdt]
[150244.208405] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt]
[150244.214690] [<ffffffffc0d47a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[150244.221922] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[150244.228480] [<ffffffffc154ea10>] ? mdt_intent_fixup_resent+0x220/0x220 [mdt]
[150244.235742] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[150244.242542] [<ffffffffc0a96033>] ? cfs_hash_bd_add_locked+0x63/0x80 [libcfs]
[150244.249787] [<ffffffffc0a997be>] ? cfs_hash_add+0xbe/0x1a0 [libcfs]
[150244.256264] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[150244.263466] [<ffffffffc0ff40f0>] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc]
[150244.271076] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[150244.277378] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[150244.284397] [<ffffffffc1033da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[150244.292077] [<ffffffffc0a8abde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[150244.299287] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[150244.307068] [<ffffffffc0ffa805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[150244.314048] [<ffffffff892cfeb4>] ? __wake_up+0x44/0x50
[150244.319426] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[150244.325832] [<ffffffffc1002080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[150244.333383] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[150244.338359] [<ffffffff892c2db0>] ? insert_kthread_work+0x40/0x40
[150244.344621] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[150244.351161] [<ffffffff892c2db0>] ? insert_kthread_work+0x40/0x40
[150244.357360] INFO: task mdt02_000:39191 blocked for more than 120 seconds.
[150244.364260] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[150244.372187] mdt02_000 D ffff9533a7b0c100 0 39191 2 0x00000080
[150244.379421] Call Trace:
[150244.381971] [<ffffffff89969192>] ? mutex_lock+0x12/0x2f
[150244.387388] [<ffffffff8996af19>] schedule+0x29/0x70
[150244.392519] [<ffffffff8996c805>] rwsem_down_write_failed+0x225/0x3a0
[150244.399054] [<ffffffff894b11a4>] ? dquot_get_dqblk+0x144/0x1f0
[150244.405084] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30
[150244.411871] [<ffffffff8996a24d>] down_write+0x2d/0x3d
[150244.417114] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[150244.424527] [<ffffffffc0a98cf2>] ? cfs_hash_lookup+0xa2/0xd0 [libcfs]
[150244.431151] [<ffffffffc16734dc>] ? lod_qos_statfs_update+0x3c/0x2b0 [lod]
[150244.438121] [<ffffffffc1674eb5>] ? lod_prepare_avoidance+0x375/0x780 [lod]
[150244.445190] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod]
[150244.452074] [<ffffffffc13be382>] ? qsd_op_begin+0x262/0x4b0 [lquota]
[150244.458639] [<ffffffffc144009b>] ? osd_declare_inode_qid+0x27b/0x430 [osd_ldiskfs]
[150244.466395] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod]
[150244.473037] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod]
[150244.480357] [<ffffffffc16775df>] ? lod_sub_declare_create+0xdf/0x210 [lod]
[150244.487419] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod]
[150244.494065] [<ffffffffc0d69f99>] ? lu_context_refill+0x19/0x50 [obdclass]
[150244.501035] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[150244.508970] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd]
[150244.515513] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd]
[150244.521553] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt]
[150244.528026] [<ffffffffc0d7ddf8>] ? upcall_cache_get_entry+0x218/0x8b0 [obdclass]
[150244.535755] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt]
[150244.541866] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[150244.548590] [<ffffffffc154e826>] ? mdt_intent_fixup_resent+0x36/0x220 [mdt]
[150244.555761] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt]
[150244.562062] [<ffffffffc0d47a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[150244.569374] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[150244.575921] [<ffffffffc154ea10>] ? mdt_intent_fixup_resent+0x220/0x220 [mdt]
[150244.583233] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[150244.590049] [<ffffffffc0a96033>] ? cfs_hash_bd_add_locked+0x63/0x80 [libcfs]
[150244.597283] [<ffffffffc0a997be>] ? cfs_hash_add+0xbe/0x1a0 [libcfs]
[150244.603854] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[150244.611031] [<ffffffffc0ff40f0>] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc]
[150244.618745] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[150244.624980] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[150244.632105] [<ffffffffc1033da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[150244.639796] [<ffffffffc0a8abde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[150244.646982] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[150244.654778] [<ffffffffc0ffa805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[150244.661678] [<ffffffff892cfeb4>] ? __wake_up+0x44/0x50
[150244.667101] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[150244.673495] [<ffffffffc1002080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[150244.680996] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[150244.685972] [<ffffffff892c2db0>] ? insert_kthread_work+0x40/0x40
[150244.692166] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[150244.698732] [<ffffffff892c2db0>] ? insert_kthread_work+0x40/0x40
[150244.704917] INFO: task mdt03_001:39195 blocked for more than 120 seconds.
[150244.711812] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[150244.719726] mdt03_001 D ffff9533b7b09040 0 39195 2 0x00000080
[150244.726933] Call Trace:
[150244.729476] [<ffffffff89969192>] ? mutex_lock+0x12/0x2f
[150244.734885] [<ffffffff8996af19>] schedule+0x29/0x70
[150244.739975] [<ffffffff8996c805>] rwsem_down_write_failed+0x225/0x3a0
[150244.746504] [<ffffffff894b11a4>] ? dquot_get_dqblk+0x144/0x1f0
[150244.752526] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30
[150244.759316] [<ffffffff8996a24d>] down_write+0x2d/0x3d
[150244.764571] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[150244.771975] [<ffffffffc0a98cf2>] ? cfs_hash_lookup+0xa2/0xd0 [libcfs]
[150244.778612] [<ffffffffc16734dc>] ? lod_qos_statfs_update+0x3c/0x2b0 [lod]
[150244.785586] [<ffffffffc1674eb5>] ? lod_prepare_avoidance+0x375/0x780 [lod]
[150244.792676] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod]
[150244.799559] [<ffffffffc13be382>] ? qsd_op_begin+0x262/0x4b0 [lquota]
[150244.806115] [<ffffffffc144009b>] ? osd_declare_inode_qid+0x27b/0x430 [osd_ldiskfs]
[150244.813863] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod]
[150244.820500] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod]
[150244.827816] [<ffffffffc16775df>] ? lod_sub_declare_create+0xdf/0x210 [lod]
[150244.834885] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod]
[150244.841533] [<ffffffffc0d69f99>] ? lu_context_refill+0x19/0x50 [obdclass]
[150244.848607] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[150244.856547] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd]
[150244.863143] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd]
[150244.869178] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt]
[150244.875661] [<ffffffffc0d7ddf8>] ? upcall_cache_get_entry+0x218/0x8b0 [obdclass]
[150244.883305] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt]
[150244.889423] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[150244.896083] [<ffffffffc154e826>] ? mdt_intent_fixup_resent+0x36/0x220 [mdt]
[150244.903231] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt]
[150244.909530] [<ffffffffc0d47a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[150244.916794] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[150244.923345] [<ffffffffc154ea10>] ? mdt_intent_fixup_resent+0x220/0x220 [mdt]
[150244.930819] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[150244.937632] [<ffffffffc0a96033>] ? cfs_hash_bd_add_locked+0x63/0x80 [libcfs]
[150244.944884] [<ffffffffc0a997be>] ? cfs_hash_add+0xbe/0x1a0 [libcfs]
[150244.951367] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[150244.958553] [<ffffffffc0ff40f0>] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc]
[150244.966216] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[150244.972435] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[150244.979525] [<ffffffffc1033da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[150244.987209] [<ffffffffc0a8abde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[150244.994419] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[150245.002294] [<ffffffffc0ffa805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[150245.009174] [<ffffffff892cfeb4>] ? __wake_up+0x44/0x50
[150245.014565] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[150245.020943] [<ffffffff8996aaba>] ? __schedule+0x42a/0x860
[150245.026561] [<ffffffffc1002080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[150245.034124] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[150245.039100] [<ffffffff892c2db0>] ? insert_kthread_work+0x40/0x40
[150245.045341] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[150245.051883] [<ffffffff892c2db0>] ? insert_kthread_work+0x40/0x40
[150245.058106] INFO: task mdt03_003:39243 blocked for more than 120 seconds.
[150245.065039] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[150245.072964] mdt03_003 D ffff9533b06c8000 0 39243 2 0x00000080
[150245.080209] Call Trace:
[150245.082763] [<ffffffff89969192>] ? mutex_lock+0x12/0x2f
[150245.088177] [<ffffffff8996af19>] schedule+0x29/0x70
[150245.093247] [<ffffffff8996c805>] rwsem_down_write_failed+0x225/0x3a0
[150245.099785] [<ffffffff894b11a4>] ? dquot_get_dqblk+0x144/0x1f0
[150245.105842] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30
[150245.112647] [<ffffffff8996a24d>] down_write+0x2d/0x3d
[150245.117967] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[150245.125388] [<ffffffffc0a98cf2>] ? cfs_hash_lookup+0xa2/0xd0 [libcfs]
[150245.132026] [<ffffffffc131a86b>] ? __ldiskfs_handle_dirty_metadata+0x8b/0x220 [ldiskfs]
[150245.140281] [<ffffffffc16734dc>] ? lod_qos_statfs_update+0x3c/0x2b0 [lod]
[150245.147251] [<ffffffffc1674eb5>] ? lod_prepare_avoidance+0x375/0x780 [lod]
[150245.154396] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod]
[150245.161280] [<ffffffffc13be382>] ? qsd_op_begin+0x262/0x4b0 [lquota]
[150245.167837] [<ffffffffc144009b>] ? osd_declare_inode_qid+0x27b/0x430 [osd_ldiskfs]
[150245.175633] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod]
[150245.182263] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod]
[150245.189625] [<ffffffffc16775df>] ? lod_sub_declare_create+0xdf/0x210 [lod]
[150245.196700] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod]
[150245.203356] [<ffffffffc0d69f99>] ? lu_context_refill+0x19/0x50 [obdclass]
[150245.210409] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[150245.218353] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd]
[150245.224964] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd]
[150245.230986] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt]
[150245.237456] [<ffffffffc0d7ddf8>] ? upcall_cache_get_entry+0x218/0x8b0 [obdclass]
[150245.245083] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt]
[150245.251204] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[150245.257942] [<ffffffffc154e826>] ? mdt_intent_fixup_resent+0x36/0x220 [mdt]
[150245.265102] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt]
[150245.271396] [<ffffffffc0d47a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[150245.278647] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[150245.285183] [<ffffffffc154ea10>] ? mdt_intent_fixup_resent+0x220/0x220 [mdt]
[150245.292452] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[150245.299248] [<ffffffffc0a96033>] ? cfs_hash_bd_add_locked+0x63/0x80 [libcfs]
[150245.306493] [<ffffffffc0a997be>] ? cfs_hash_add+0xbe/0x1a0 [libcfs]
[150245.312964] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[150245.320146] [<ffffffffc0ff40f0>] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc]
[150245.327754] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[150245.334005] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[150245.341003] [<ffffffffc1033da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[150245.348679] [<ffffffffc0a8abde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[150245.355848] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[150245.363621] [<ffffffffc0ffa805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[150245.370515] [<ffffffff892cfeb4>] ? __wake_up+0x44/0x50
[150245.375858] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[150245.382261] [<ffffffffc1002080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[150245.389749] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[150245.394738] [<ffffffff892c2db0>] ?
insert_kthread_work+0x40/0x40 [150245.400923] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [150245.407591] [<ffffffff892c2db0>] ? insert_kthread_work+0x40/0x40 [150245.413843] INFO: task mdt01_007:39709 blocked for more than 120 seconds. [150245.420756] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [150245.428680] mdt01_007 D ffff9513b14fb0c0 0 39709 2 0x00000080 [150245.435917] Call Trace: [150245.438473] [<ffffffff89969192>] ? mutex_lock+0x12/0x2f [150245.443893] [<ffffffff8996af19>] schedule+0x29/0x70 [150245.448978] [<ffffffff8996c805>] rwsem_down_write_failed+0x225/0x3a0 [150245.455523] [<ffffffff894b11a4>] ? dquot_get_dqblk+0x144/0x1f0 [150245.461610] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30 [150245.468402] [<ffffffff8996a24d>] down_write+0x2d/0x3d [150245.473750] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [150245.481171] [<ffffffffc0a98cf2>] ? cfs_hash_lookup+0xa2/0xd0 [libcfs] [150245.487796] [<ffffffffc16734dc>] ? lod_qos_statfs_update+0x3c/0x2b0 [lod] [150245.494866] [<ffffffffc1674eb5>] ? lod_prepare_avoidance+0x375/0x780 [lod] [150245.501938] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod] [150245.508834] [<ffffffffc13be382>] ? qsd_op_begin+0x262/0x4b0 [lquota] [150245.515374] [<ffffffffc144009b>] ? osd_declare_inode_qid+0x27b/0x430 [osd_ldiskfs] [150245.523131] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod] [150245.529788] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod] [150245.537110] [<ffffffffc16775df>] ? lod_sub_declare_create+0xdf/0x210 [lod] [150245.544187] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod] [150245.550821] [<ffffffffc0d69f99>] ? lu_context_refill+0x19/0x50 [obdclass] [150245.557798] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [150245.565743] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd] [150245.572278] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd] [150245.578313] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt] [150245.584777] [<ffffffffc0d7ddf8>] ? upcall_cache_get_entry+0x218/0x8b0 [obdclass] [150245.592377] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [150245.598479] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [150245.605139] [<ffffffffc154e826>] ? mdt_intent_fixup_resent+0x36/0x220 [mdt] [150245.612287] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [150245.618593] [<ffffffffc0d47a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass] [150245.625824] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [150245.632381] [<ffffffffc154ea10>] ? mdt_intent_fixup_resent+0x220/0x220 [mdt] [150245.639730] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [150245.646552] [<ffffffffc0a96033>] ? cfs_hash_bd_add_locked+0x63/0x80 [libcfs] [150245.653849] [<ffffffffc0a997be>] ? cfs_hash_add+0xbe/0x1a0 [libcfs] [150245.660321] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [150245.667541] [<ffffffffc0ff40f0>] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc] [150245.675239] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [150245.681471] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [150245.688480] [<ffffffffc1033da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] [150245.696153] [<ffffffffc0a8abde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs] [150245.703345] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [150245.711131] [<ffffffffc0ffa805>] ? 
ptlrpc_wait_event+0xa5/0x360 [ptlrpc] [150245.718013] [<ffffffff892cfeb4>] ? __wake_up+0x44/0x50 [150245.723377] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [150245.729770] [<ffffffffc1002080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] [150245.737283] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [150245.742254] [<ffffffff892c2db0>] ? insert_kthread_work+0x40/0x40 [150245.748456] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [150245.754986] [<ffffffff892c2db0>] ? insert_kthread_work+0x40/0x40 [150245.761175] INFO: task mdt03_007:39714 blocked for more than 120 seconds. [150245.768082] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [150245.776002] mdt03_007 D ffff9513b14f9040 0 39714 2 0x00000080 [150245.783211] Call Trace: [150245.785752] [<ffffffff89969192>] ? mutex_lock+0x12/0x2f [150245.791160] [<ffffffff8996af19>] schedule+0x29/0x70 [150245.796251] [<ffffffff8996c805>] rwsem_down_write_failed+0x225/0x3a0 [150245.802784] [<ffffffff894b11a4>] ? dquot_get_dqblk+0x144/0x1f0 [150245.808812] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30 [150245.815600] [<ffffffff8996a24d>] down_write+0x2d/0x3d [150245.820847] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [150245.828254] [<ffffffffc0a98cf2>] ? cfs_hash_lookup+0xa2/0xd0 [libcfs] [150245.834894] [<ffffffffc16734dc>] ? lod_qos_statfs_update+0x3c/0x2b0 [lod] [150245.841968] [<ffffffffc1674eb5>] ? lod_prepare_avoidance+0x375/0x780 [lod] [150245.849045] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod] [150245.855957] [<ffffffffc13be382>] ? qsd_op_begin+0x262/0x4b0 [lquota] [150245.862590] [<ffffffffc144009b>] ? osd_declare_inode_qid+0x27b/0x430 [osd_ldiskfs] [150245.870365] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod] [150245.877002] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod] [150245.884327] [<ffffffffc16775df>] ? lod_sub_declare_create+0xdf/0x210 [lod] [150245.891401] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod] [150245.898072] [<ffffffffc0d69f99>] ? lu_context_refill+0x19/0x50 [obdclass] [150245.905040] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [150245.912974] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd] [150245.919513] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd] [150245.925547] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt] [150245.932005] [<ffffffffc0d7ddf8>] ? upcall_cache_get_entry+0x218/0x8b0 [obdclass] [150245.939620] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [150245.945725] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [150245.952367] [<ffffffffc154e826>] ? mdt_intent_fixup_resent+0x36/0x220 [mdt] [150245.959514] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [150245.965805] [<ffffffffc0d47a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass] [150245.973047] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [150245.979587] [<ffffffffc154ea10>] ? mdt_intent_fixup_resent+0x220/0x220 [mdt] [150245.986854] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [150245.993649] [<ffffffffc0a96033>] ? cfs_hash_bd_add_locked+0x63/0x80 [libcfs] [150246.000905] [<ffffffffc0a997be>] ? cfs_hash_add+0xbe/0x1a0 [libcfs] [150246.007372] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [150246.014558] [<ffffffffc0ff40f0>] ? 
lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc] [150246.022163] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [150246.028391] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [150246.035398] [<ffffffffc1033da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] [150246.043201] [<ffffffffc0a8abde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs] [150246.050374] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [150246.058206] [<ffffffffc0ffa805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] [150246.065095] [<ffffffff892cfeb4>] ? __wake_up+0x44/0x50 [150246.070453] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [150246.076875] [<ffffffff8996aaba>] ? __schedule+0x42a/0x860 [150246.082482] [<ffffffffc1002080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] [150246.090020] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [150246.095003] [<ffffffff892c2db0>] ? insert_kthread_work+0x40/0x40 [150246.101201] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [150246.107744] [<ffffffff892c2db0>] ? insert_kthread_work+0x40/0x40 [150246.113965] INFO: task mdt00_016:39744 blocked for more than 120 seconds. [150246.120886] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [150246.128802] mdt00_016 D ffff95039e76d140 0 39744 2 0x00000080 [150246.136113] Call Trace: [150246.138656] [<ffffffff89969192>] ? mutex_lock+0x12/0x2f [150246.144080] [<ffffffff8996af19>] schedule+0x29/0x70 [150246.149136] [<ffffffff8996c805>] rwsem_down_write_failed+0x225/0x3a0 [150246.155668] [<ffffffff894b11a4>] ? dquot_get_dqblk+0x144/0x1f0 [150246.161676] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30 [150246.168472] [<ffffffff8996a24d>] down_write+0x2d/0x3d [150246.173726] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [150246.181136] [<ffffffffc0a98cf2>] ? cfs_hash_lookup+0xa2/0xd0 [libcfs] [150246.187777] [<ffffffffc131a86b>] ? __ldiskfs_handle_dirty_metadata+0x8b/0x220 [ldiskfs] [150246.195962] [<ffffffffc16734dc>] ? lod_qos_statfs_update+0x3c/0x2b0 [lod] [150246.202950] [<ffffffffc1674eb5>] ? lod_prepare_avoidance+0x375/0x780 [lod] [150246.210009] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod] [150246.216903] [<ffffffffc13be382>] ? qsd_op_begin+0x262/0x4b0 [lquota] [150246.223446] [<ffffffffc144009b>] ? osd_declare_inode_qid+0x27b/0x430 [osd_ldiskfs] [150246.231215] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod] [150246.237837] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod] [150246.245167] [<ffffffffc16775df>] ? lod_sub_declare_create+0xdf/0x210 [lod] [150246.252225] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod] [150246.258871] [<ffffffffc0d69f99>] ? lu_context_refill+0x19/0x50 [obdclass] [150246.265840] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [150246.273790] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd] [150246.280339] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd] [150246.286441] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt] [150246.292918] [<ffffffffc0d7ddf8>] ? upcall_cache_get_entry+0x218/0x8b0 [obdclass] [150246.300521] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [150246.306662] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [150246.313303] [<ffffffffc154e826>] ? mdt_intent_fixup_resent+0x36/0x220 [mdt] [150246.320522] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [150246.326816] [<ffffffffc0d47a79>] ? 
lprocfs_counter_add+0xf9/0x160 [obdclass] [150246.334104] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [150246.340728] [<ffffffffc154ea10>] ? mdt_intent_fixup_resent+0x220/0x220 [mdt] [150246.348003] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [150246.354877] [<ffffffffc0a96033>] ? cfs_hash_bd_add_locked+0x63/0x80 [libcfs] [150246.362130] [<ffffffffc0a997be>] ? cfs_hash_add+0xbe/0x1a0 [libcfs] [150246.368632] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [150246.375802] [<ffffffffc0ff40f0>] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc] [150246.383408] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [150246.389643] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [150246.396639] [<ffffffffc1033da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] [150246.404329] [<ffffffffc0a8abde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs] [150246.411501] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [150246.419288] [<ffffffffc0ffa805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] [150246.426170] [<ffffffff892cfeb4>] ? __wake_up+0x44/0x50 [150246.431535] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [150246.437926] [<ffffffffc1002080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] [150246.445411] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [150246.450400] [<ffffffff892c2db0>] ? insert_kthread_work+0x40/0x40 [150246.456582] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [150246.463132] [<ffffffff892c2db0>] ? insert_kthread_work+0x40/0x40 [150246.469315] INFO: task mdt01_024:39759 blocked for more than 120 seconds. [150246.476222] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [150246.484144] mdt01_024 D ffff95039ed49040 0 39759 2 0x00000080 [150246.491349] Call Trace: [150246.493892] [<ffffffff89969192>] ? mutex_lock+0x12/0x2f [150246.499300] [<ffffffff8996af19>] schedule+0x29/0x70 [150246.504375] [<ffffffff8996c805>] rwsem_down_write_failed+0x225/0x3a0 [150246.510916] [<ffffffff894b11a4>] ? dquot_get_dqblk+0x144/0x1f0 [150246.516962] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30 [150246.523758] [<ffffffff8996a24d>] down_write+0x2d/0x3d [150246.529083] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [150246.536548] [<ffffffffc0a98cf2>] ? cfs_hash_lookup+0xa2/0xd0 [libcfs] [150246.543199] [<ffffffffc131a86b>] ? __ldiskfs_handle_dirty_metadata+0x8b/0x220 [ldiskfs] [150246.551479] [<ffffffffc16734dc>] ? lod_qos_statfs_update+0x3c/0x2b0 [lod] [150246.558467] [<ffffffffc1674eb5>] ? lod_prepare_avoidance+0x375/0x780 [lod] [150246.565548] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod] [150246.572519] [<ffffffffc13be382>] ? qsd_op_begin+0x262/0x4b0 [lquota] [150246.579071] [<ffffffffc144009b>] ? osd_declare_inode_qid+0x27b/0x430 [osd_ldiskfs] [150246.586840] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod] [150246.593462] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod] [150246.600779] [<ffffffffc16775df>] ? lod_sub_declare_create+0xdf/0x210 [lod] [150246.607869] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod] [150246.614511] [<ffffffffc0d69f99>] ? 
lu_context_refill+0x19/0x50 [obdclass] [150246.621497] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [150246.629421] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd] [150246.635970] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd] [150246.641993] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt] [150246.648469] [<ffffffffc0d7ddf8>] ? upcall_cache_get_entry+0x218/0x8b0 [obdclass] [150246.656051] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [150246.662172] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [150246.668798] [<ffffffffc154e826>] ? mdt_intent_fixup_resent+0x36/0x220 [mdt] [150246.675973] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [150246.682264] [<ffffffffc0d47a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass] [150246.689509] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [150246.696046] [<ffffffffc154ea10>] ? mdt_intent_fixup_resent+0x220/0x220 [mdt] [150246.703312] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [150246.710109] [<ffffffffc0a96033>] ? cfs_hash_bd_add_locked+0x63/0x80 [libcfs] [150246.717412] [<ffffffffc0a997be>] ? cfs_hash_add+0xbe/0x1a0 [libcfs] [150246.723904] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [150246.731158] [<ffffffffc0ff40f0>] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc] [150246.738793] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [150246.745016] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [150246.752065] [<ffffffffc1033da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] [150246.759753] [<ffffffffc0a8abde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs] [150246.766981] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [150246.774755] [<ffffffffc0ffa805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] [150246.781656] [<ffffffff892cfeb4>] ? __wake_up+0x44/0x50 [150246.787041] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [150246.793440] [<ffffffffc1002080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] [150246.800993] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [150246.805983] [<ffffffff892c2db0>] ? insert_kthread_work+0x40/0x40 [150246.812184] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [150246.818829] [<ffffffff892c2db0>] ? insert_kthread_work+0x40/0x40 [150246.825021] INFO: task mdt03_014:39760 blocked for more than 120 seconds. [150246.831939] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [150246.839861] mdt03_014 D ffff95039ed4a080 0 39760 2 0x00000080 [150246.847081] Call Trace: [150246.849632] [<ffffffff895a21c1>] ? __percpu_counter_add+0x51/0x70 [150246.855920] [<ffffffff89969192>] ? mutex_lock+0x12/0x2f [150246.861321] [<ffffffff8996af19>] schedule+0x29/0x70 [150246.866382] [<ffffffff8996c805>] rwsem_down_write_failed+0x225/0x3a0 [150246.872940] [<ffffffff894b11a4>] ? dquot_get_dqblk+0x144/0x1f0 [150246.878949] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30 [150246.885759] [<ffffffff8996a24d>] down_write+0x2d/0x3d [150246.890994] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [150246.898397] [<ffffffffc0a98cf2>] ? cfs_hash_lookup+0xa2/0xd0 [libcfs] [150246.905031] [<ffffffffc16734dc>] ? lod_qos_statfs_update+0x3c/0x2b0 [lod] [150246.912003] [<ffffffffc1674eb5>] ? lod_prepare_avoidance+0x375/0x780 [lod] [150246.919073] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod] [150246.925957] [<ffffffffc13be382>] ? qsd_op_begin+0x262/0x4b0 [lquota] [150246.932521] [<ffffffffc144009b>] ? 
osd_declare_inode_qid+0x27b/0x430 [osd_ldiskfs] [150246.940276] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod] [150246.946918] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod] [150246.954237] [<ffffffffc16775df>] ? lod_sub_declare_create+0xdf/0x210 [lod] [150246.961291] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod] [150246.967939] [<ffffffffc0d69f99>] ? lu_context_refill+0x19/0x50 [obdclass] [150246.974907] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [150246.982842] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd] [150246.989380] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd] [150246.995443] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt] [150247.001927] [<ffffffffc0d7ddf8>] ? upcall_cache_get_entry+0x218/0x8b0 [obdclass] [150247.009602] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [150247.015714] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [150247.022353] [<ffffffffc154e826>] ? mdt_intent_fixup_resent+0x36/0x220 [mdt] [150247.029543] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [150247.035837] [<ffffffffc0d47a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass] [150247.043143] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [150247.049688] [<ffffffffc154ea10>] ? mdt_intent_fixup_resent+0x220/0x220 [mdt] [150247.056942] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [150247.063760] [<ffffffffc0a96033>] ? cfs_hash_bd_add_locked+0x63/0x80 [libcfs] [150247.070987] [<ffffffffc0a997be>] ? cfs_hash_add+0xbe/0x1a0 [libcfs] [150247.077485] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [150247.084651] [<ffffffffc0ff40f0>] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc] [150247.092273] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [150247.098488] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [150247.105490] [<ffffffffc1033da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] [150247.113163] [<ffffffffc0a8abde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs] [150247.120332] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [150247.128121] [<ffffffffc0ffa805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] [150247.135004] [<ffffffff892cfeb4>] ? __wake_up+0x44/0x50 [150247.140381] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [150247.146769] [<ffffffffc1002080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] [150247.154265] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [150247.159235] [<ffffffff892c2db0>] ? insert_kthread_work+0x40/0x40 [150247.165426] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [150247.171973] [<ffffffff892c2db0>] ? insert_kthread_work+0x40/0x40 [150247.178156] INFO: task mdt02_018:39764 blocked for more than 120 seconds. [150247.185048] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [150247.192964] mdt02_018 D ffff95039f802080 0 39764 2 0x00000080 [150247.200166] Call Trace: [150247.202707] [<ffffffff89969192>] ? mutex_lock+0x12/0x2f [150247.208115] [<ffffffff8996af19>] schedule+0x29/0x70 [150247.213206] [<ffffffff8996c805>] rwsem_down_write_failed+0x225/0x3a0 [150247.219742] [<ffffffff894b11a4>] ? dquot_get_dqblk+0x144/0x1f0 [150247.225767] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30 [150247.232558] [<ffffffff8996a24d>] down_write+0x2d/0x3d [150247.237798] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [150247.245212] [<ffffffffc0a98cf2>] ? cfs_hash_lookup+0xa2/0xd0 [libcfs] [150247.251839] [<ffffffffc16734dc>] ? 
lod_qos_statfs_update+0x3c/0x2b0 [lod] [150247.258820] [<ffffffffc1674eb5>] ? lod_prepare_avoidance+0x375/0x780 [lod] [150247.265881] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod] [150247.272783] [<ffffffffc13be382>] ? qsd_op_begin+0x262/0x4b0 [lquota] [150247.279322] [<ffffffffc144009b>] ? osd_declare_inode_qid+0x27b/0x430 [osd_ldiskfs] [150247.287090] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod] [150247.293717] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod] [150247.301044] [<ffffffffc16775df>] ? lod_sub_declare_create+0xdf/0x210 [lod] [150247.308101] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod] [150247.314749] [<ffffffffc0d69f99>] ? lu_context_refill+0x19/0x50 [obdclass] [150247.321726] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [150247.329663] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd] [150247.336198] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd] [150247.342247] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt] [150247.348710] [<ffffffffc0d7ddf8>] ? upcall_cache_get_entry+0x218/0x8b0 [obdclass] [150247.356293] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [150247.362417] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [150247.369040] [<ffffffffc154e826>] ? mdt_intent_fixup_resent+0x36/0x220 [mdt] [150247.376205] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [150247.382493] [<ffffffffc0d47a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass] [150247.389742] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [150247.396281] [<ffffffffc154ea10>] ? mdt_intent_fixup_resent+0x220/0x220 [mdt] [150247.403561] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [150247.410360] [<ffffffffc0a96033>] ? cfs_hash_bd_add_locked+0x63/0x80 [libcfs] [150247.417602] [<ffffffffc0a997be>] ? cfs_hash_add+0xbe/0x1a0 [libcfs] [150247.424076] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [150247.431245] [<ffffffffc0ff40f0>] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc] [150247.438865] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [150247.445081] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [150247.452089] [<ffffffffc1033da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc] [150247.459753] [<ffffffffc0a8abde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs] [150247.466932] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [150247.474711] [<ffffffffc0ffa805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc] [150247.481617] [<ffffffff892cfeb4>] ? __wake_up+0x44/0x50 [150247.486964] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [150247.493366] [<ffffffffc1002080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc] [150247.500854] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [150247.505829] [<ffffffff892c2db0>] ? insert_kthread_work+0x40/0x40 [150247.512032] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [150247.518561] [<ffffffff892c2db0>] ? 
insert_kthread_work+0x40/0x40 [150564.562709] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576139032/real 1576139032] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576139633 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [150564.590919] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [150564.607325] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [150718.784620] Lustre: 39464:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576139032/real 1576139032] req@ffff952100e69200 x1652547527092272/t0(0) o5->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 432/432 e 0 to 1 dl 1576139788 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1 [150718.812920] LustreError: 39464:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0003: cannot cleanup orphans: rc = -107 [151003.797140] LustreError: 39757:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576139773, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff950147330000/0x5f9f636a2ec03c85 lrc: 3/1,0 mode: --/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 240 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39757 timeout: 0 lvb_type: 0 [151003.797218] LustreError: dumping log to /tmp/lustre-log.1576140073.39712 [151003.843595] LustreError: 39757:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 8 previous similar messages [151003.878780] Lustre: fir-MDT0003: Client 7dc77806-c779-f6f7-b102-8e88c090719f (at 10.9.108.2@o2ib4) reconnecting [151003.879099] Lustre: fir-MDT0003: Connection restored to (at 10.9.108.7@o2ib4) [151003.896262] Lustre: Skipped 8 previous similar messages [151165.738145] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576139634/real 1576139634] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576140235 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [151165.766349] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [151165.782780] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [151165.792706] Lustre: Skipped 8 previous similar messages [151203.804619] LustreError: 39794:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576139973, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff95331e3e0000/0x5f9f636a2ec22424 lrc: 3/0,1 mode: --/CW res: [0x280000dbb:0x18a:0x0].0x0 bits 0x2/0x0 rrc: 249 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39794 timeout: 0 lvb_type: 0 [151203.844179] LustreError: 39794:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 18 previous similar messages [151203.862592] Lustre: fir-MDT0003: Client d8aa6f82-54cf-b7b6-c533-7761ac172b8e (at 10.9.108.6@o2ib4) reconnecting [151289.649686] LustreError: 84149:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576140059, 
300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff94fee666f740/0x5f9f636a2ec35362 lrc: 3/1,0 mode: --/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 255 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 84149 timeout: 0 lvb_type: 0 [151303.848863] LustreError: 39743:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576140073, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff952212b43cc0/0x5f9f636a2ec38d06 lrc: 3/1,0 mode: --/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 255 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39743 timeout: 0 lvb_type: 0 [151303.888511] LustreError: 39743:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 8 previous similar messages [151303.928782] Lustre: fir-MDT0003: Client a14ff60f-458c-0457-be47-2784083e18c9 (at 10.8.18.15@o2ib6) reconnecting [151303.938957] Lustre: Skipped 1 previous similar message [151303.944211] Lustre: fir-MDT0003: Connection restored to a14ff60f-458c-0457-be47-2784083e18c9 (at 10.8.18.15@o2ib6) [151303.954678] Lustre: Skipped 2 previous similar messages [151403.985108] LustreError: 39716:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576140173, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff951302bb3cc0/0x5f9f636a2ec54c8f lrc: 3/1,0 mode: --/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 283 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39716 timeout: 0 lvb_type: 0 [151404.024760] LustreError: 39716:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 5 previous similar messages [151424.473365] LustreError: 84112:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576140193, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff94ff33e03f00/0x5f9f636a2ec587ec lrc: 3/0,1 mode: --/CW res: [0x280000dbb:0x18a:0x0].0x0 bits 0x2/0x0 rrc: 298 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 84112 timeout: 0 lvb_type: 0 [151424.512926] LustreError: 84112:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 1 previous similar message [151475.835001] Lustre: 39464:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576139789/real 1576139789] req@ffff952100e6ba80 x1652547527440464/t0(0) o5->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 432/432 e 0 to 1 dl 1576140545 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1 [151475.863299] LustreError: 39464:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0003: cannot cleanup orphans: rc = -107 [151503.857344] LustreError: 39807:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576140273, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff952212b43840/0x5f9f636a2ec68383 lrc: 3/1,0 mode: --/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 300 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39807 timeout: 0 lvb_type: 0 [151503.857347] LustreError: 39774:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576140273, 300s ago); not entering recovery in server code, just going back to sleep 
ns: mdt-fir-MDT0003_UUID lock: ffff952273ace780/0x5f9f636a2ec683a6 lrc: 3/1,0 mode: --/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 299 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39774 timeout: 0 lvb_type: 0 [151503.857351] LustreError: 39774:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 1 previous similar message [151503.947504] LustreError: 39807:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 16 previous similar messages [151503.955630] Lustre: fir-MDT0003: Client d8aa6f82-54cf-b7b6-c533-7761ac172b8e (at 10.9.108.6@o2ib4) reconnecting [151503.955632] Lustre: Skipped 9 previous similar messages [151603.894587] LustreError: 39729:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576140373, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff94ff1b65cc80/0x5f9f636a2ec804fa lrc: 3/1,0 mode: --/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 311 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39729 timeout: 0 lvb_type: 0 [151648.674990] Lustre: fir-MDT0003: Connection restored to e15078c5-8209-4 (at 10.8.25.17@o2ib6) [151648.683610] Lustre: Skipped 12 previous similar messages [151660.320290] Lustre: 38972:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576139973/real 1576139973] req@ffff95036c70f980 x1652547527501584/t0(0) o6->fir-OST0054-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 0 to 1 dl 1576140729 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [151660.348500] Lustre: fir-OST0054-osc-MDT0003: Connection to fir-OST0054 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [151675.415512] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.8.25.17@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server. [151675.432880] LustreError: Skipped 1383 previous similar messages [151703.980697] Lustre: fir-MDT0003: Client a14ff60f-458c-0457-be47-2784083e18c9 (at 10.8.18.15@o2ib6) reconnecting [151703.990872] Lustre: Skipped 3 previous similar messages [151764.273568] Lustre: 38972:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576140077/real 1576140077] req@ffff9513b4417980 x1652547527537200/t0(0) o6->fir-OST005a-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 0 to 1 dl 1576140833 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [151764.301781] Lustre: fir-OST005a-osc-MDT0003: Connection to fir-OST005a (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [151767.140605] LustreError: 39761:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576140536, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff9522c22a2d00/0x5f9f636a2eca8f75 lrc: 3/0,1 mode: --/CW res: [0x280000dbb:0x18a:0x0].0x0 bits 0x2/0x0 rrc: 320 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39761 timeout: 0 lvb_type: 0 [151767.180163] LustreError: 39761:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 41 previous similar messages [151775.768574] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.8.25.17@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server. 
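The hung-task reports above are produced by the kernel watchdog whose knob the messages themselves name. A minimal sketch of inspecting or silencing it on the MDS, assuming root access (values are illustrative, not a recommendation):

# show the current watchdog threshold (120 s here)
cat /proc/sys/kernel/hung_task_timeout_secs
# silence the warnings entirely, as the log messages suggest
echo 0 > /proc/sys/kernel/hung_task_timeout_secs
# or restore the stock threshold once the contention is resolved
sysctl -w kernel.hung_task_timeout_secs=120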
[151804.004117] Lustre: fir-MDT0003: Client a7c6c322-7850-feae-097c-a35b332d6e36 (at 10.9.108.67@o2ib4) reconnecting [151804.014385] Lustre: Skipped 7 previous similar messages [151864.178805] Lustre: 38993:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576140177/real 1576140177] req@ffff952e0fa1c800 x1652547527569728/t0(0) o6->fir-OST0058-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 0 to 1 dl 1576140933 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [151864.207003] Lustre: 38993:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 1 previous similar message [151864.216754] Lustre: fir-OST0058-osc-MDT0003: Connection to fir-OST0058 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [151864.232928] Lustre: Skipped 1 previous similar message [152073.479411] LustreError: 39729:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576140842, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff95037db26300/0x5f9f636a2ece4c14 lrc: 3/1,0 mode: --/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 348 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39729 timeout: 0 lvb_type: 0 [152073.519051] LustreError: 39729:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 70 previous similar messages [152098.949746] Lustre: 39738:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff95131de10d80 x1649315775045744/t0(0) o101->b4206b2f-67a2-cb01-c899-d99205e22b23@10.9.108.61@o2ib4:153/0 lens 576/3264 e 3 to 0 dl 1576141173 ref 2 fl Interpret:/0/0 rc 0/0 [152098.978812] Lustre: 39738:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 3 previous similar messages [152099.461740] Lustre: 84501:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff9521056ab600 x1649447539405376/t0(0) o101->3dc3e4b3-1daf-f260-3956-f8f68e141bca@10.9.117.42@o2ib4:153/0 lens 592/3264 e 3 to 0 dl 1576141173 ref 2 fl Interpret:/0/0 rc 0/0 [152099.490806] Lustre: 84501:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 10 previous similar messages [152104.064405] Lustre: fir-MDT0003: Client a14ff60f-458c-0457-be47-2784083e18c9 (at 10.8.18.15@o2ib6) reconnecting [152104.074581] Lustre: Skipped 4 previous similar messages [152135.046183] Lustre: 39780:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-31), not sending early reply req@ffff95332cb09680 x1649382709902256/t0(0) o101->409d10d8-f0ec-6d0c-ba45-e36868efac65@10.9.107.67@o2ib4:189/0 lens 576/3264 e 0 to 0 dl 1576141209 ref 2 fl Interpret:/0/0 rc 0/0 [152135.075426] Lustre: 39780:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 5 previous similar messages [152141.024408] Lustre: fir-MDT0003: Client abb5ac98-d884-746d-5bad-e1a980f92130 (at 10.9.110.22@o2ib4) reconnecting [152141.034670] Lustre: Skipped 20 previous similar messages [152157.062508] Lustre: 39753:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-53), not sending early reply req@ffff9533b352f980 x1649314290036640/t0(0) o101->75af6c9a-e740-8c0d-465f-820e82ef6338@10.9.108.60@o2ib4:211/0 lens 576/3264 e 0 to 0 dl 1576141231 ref 2 fl Interpret:/0/0 rc 0/0 [152157.091754] Lustre: 39753:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 5 previous similar messages [152163.029330] Lustre: fir-MDT0003: 
Connection restored to (at 10.9.116.13@o2ib4) [152163.036737] Lustre: Skipped 45 previous similar messages [152205.191073] Lustre: 82366:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff95008c341b00 x1649480420479008/t0(0) o101->fdca5c4a-6cf3-51e3-c2ce-f648bf33defc@10.9.106.15@o2ib4:259/0 lens 1824/3288 e 1 to 0 dl 1576141279 ref 2 fl Interpret:/0/0 rc 0/0 [152205.220249] Lustre: 82366:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 8 previous similar messages [152211.586240] Lustre: fir-MDT0003: Client fdca5c4a-6cf3-51e3-c2ce-f648bf33defc (at 10.9.106.15@o2ib4) reconnecting [152211.596507] Lustre: Skipped 19 previous similar messages [152232.885397] Lustre: 39464:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576140546/real 1576140546] req@ffff952100e6ba80 x1652547527754272/t0(0) o5->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 432/432 e 0 to 1 dl 1576141302 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1 [152232.913767] LustreError: 39464:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0003: cannot cleanup orphans: rc = -107 [152298.888240] Lustre: 39721:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff94fee73d4c80 x1648594879812560/t0(0) o101->515141cc-68cc-d451-ee11-13fe464f05cb@10.9.106.36@o2ib4:353/0 lens 1824/3288 e 1 to 0 dl 1576141373 ref 2 fl Interpret:/0/0 rc 0/0 [152298.917394] Lustre: 39721:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message [152311.638782] Lustre: fir-MDT0003: haven't heard from client 619199f2-141e-aa07-09cb-eb294e06c3f1 (at 10.9.116.4@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389d03c00, cur 1576141381 expire 1576141231 last 1576141154 [152334.960691] Lustre: 84618:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-31), not sending early reply req@ffff950224c11200 x1650957938649600/t0(0) o101->9f1f289d-8652-b8f8-8177-a765b83508cd@10.9.102.60@o2ib4:389/0 lens 576/3264 e 0 to 0 dl 1576141409 ref 2 fl Interpret:/0/0 rc 0/0 [152334.989933] Lustre: 84618:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 4 previous similar messages [152340.988106] Lustre: fir-MDT0003: Client 78732099-0990-5ca4-4c36-fb61df1ffcb4 (at 10.9.110.57@o2ib4) reconnecting [152340.998370] Lustre: Skipped 5 previous similar messages [152366.985083] LNet: Service thread pid 39805 was inactive for 862.56s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [152367.002103] Pid: 39805, comm: mdt00_028 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [152367.012363] Call Trace: [152367.014921] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [152367.021963] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [152367.029257] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [152367.036180] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [152367.043292] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [152367.050298] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [152367.056974] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [152367.063538] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [152367.070399] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [152367.077596] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [152367.083876] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [152367.090916] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [152367.098731] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [152367.105147] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [152367.110160] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [152367.116727] [<ffffffffffffffff>] 0xffffffffffffffff [152367.121855] LustreError: dumping log to /tmp/lustre-log.1576141436.39805 [152368.993110] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576140837/real 1576140837] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576141438 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [152369.021323] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [152399.113491] Lustre: 39873:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff9522266f8000 x1649499528638496/t0(0) o101->42bcd98c-0f08-894b-11f1-d022d4ef353b@10.9.102.37@o2ib4:453/0 lens 576/3264 e 1 to 0 dl 1576141473 ref 2 fl Interpret:/0/0 rc 0/0 [152399.142586] Lustre: 39873:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 11 previous similar messages [152399.753483] LNet: Service thread pid 39798 was inactive for 895.31s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [152399.770504] Pid: 39798, comm: mdt03_023 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [152399.780762] Call Trace: [152399.783314] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [152399.790338] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [152399.797636] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [152399.804561] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [152399.811665] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [152399.818687] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [152399.825359] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [152399.831930] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [152399.838785] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [152399.845979] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [152399.852242] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [152399.859275] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [152399.867090] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [152399.873506] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [152399.878522] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [152399.885099] [<ffffffffffffffff>] 0xffffffffffffffff [152399.890220] LustreError: dumping log to /tmp/lustre-log.1576141469.39798 [152399.897629] Pid: 39815, comm: mdt03_025 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [152399.907904] Call Trace: [152399.910447] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [152399.917471] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [152399.924757] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [152399.931668] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [152399.938761] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [152399.945760] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [152399.952423] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [152399.958985] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [152399.965838] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [152399.973024] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [152399.979280] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [152399.986301] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [152399.994116] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [152400.000534] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [152400.005538] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [152400.012095] [<ffffffffffffffff>] 0xffffffffffffffff [152416.441700] Lustre: fir-OST0054-osc-MDT0003: Connection to fir-OST0054 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [152504.001526] LNet: Service thread pid 39798 completed after 999.56s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). 
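Each "dumping log to /tmp/lustre-log.<time>.<pid>" line above refers to a binary Lustre debug dump. A minimal sketch of decoding one of them for inspection, assuming the file is still present on the MDS (the output filename is arbitrary):

# convert the binary Lustre debug log to readable text
lctl debug_file /tmp/lustre-log.1576141469.39798 /tmp/lustre-log.1576141469.39798.txt
less /tmp/lustre-log.1576141469.39798.txt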
[152504.017776] LNet: Skipped 2 previous similar messages [152510.858903] Lustre: 84583:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff9513022f8000 x1648660392884592/t0(0) o101->8ed76d02-8e05-3e88-fa0d-1c2cd38448b9@10.8.18.24@o2ib6:565/0 lens 1824/3288 e 1 to 0 dl 1576141585 ref 2 fl Interpret:/0/0 rc 0/0 [152510.887974] Lustre: 84583:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 18 previous similar messages [152520.627005] Lustre: fir-OST005a-osc-MDT0003: Connection to fir-OST005a (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [152526.731067] LNet: Service thread pid 84138 was inactive for 894.29s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [152526.748087] LNet: Skipped 1 previous similar message [152526.753147] Pid: 84138, comm: mdt03_043 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [152526.763426] Call Trace: [152526.765985] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30 [152526.772809] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [152526.780263] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod] [152526.787171] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod] [152526.793834] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod] [152526.801178] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod] [152526.807838] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [152526.815797] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd] [152526.822375] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd] [152526.828430] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt] [152526.834938] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [152526.841078] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [152526.847750] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [152526.854063] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [152526.860638] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [152526.867486] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [152526.874695] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [152526.880947] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [152526.887982] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [152526.895797] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [152526.902229] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [152526.907232] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [152526.913807] [<ffffffffffffffff>] 0xffffffffffffffff [152526.918917] LustreError: dumping log to /tmp/lustre-log.1576141596.84138 [152567.691582] LNet: Service thread pid 39185 was inactive for 863.70s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [152567.708605] Pid: 39185, comm: mdt00_000 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [152567.718872] Call Trace: [152567.721425] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [152567.728455] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [152567.735763] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [152567.742687] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [152567.749783] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [152567.756781] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [152567.763448] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [152567.770023] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [152567.776875] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [152567.784070] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [152567.790348] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [152567.797382] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [152567.805198] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [152567.811623] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [152567.816637] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [152567.823202] [<ffffffffffffffff>] 0xffffffffffffffff [152567.828331] LustreError: dumping log to /tmp/lustre-log.1576141637.39185 [152567.835667] LNet: Service thread pid 39187 was inactive for 863.84s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [152600.459981] LNet: Service thread pid 39784 was inactive for 896.47s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [152600.472932] LNet: Skipped 8 previous similar messages [152600.478079] LustreError: dumping log to /tmp/lustre-log.1576141669.39784 [152603.998037] LustreError: 39829:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576141373, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff950336b0a1c0/0x5f9f636a2ed28f49 lrc: 3/1,0 mode: --/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 327 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39829 timeout: 0 lvb_type: 0 [152604.037691] LustreError: 39829:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 90 previous similar messages [152604.049536] LNet: Service thread pid 84138 completed after 971.61s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). 
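The ldlm lock timeouts above all name the same contended resource, [0x280000dbb:0x18a:0x0]. A sketch of mapping that FID back to a pathname, run from any client with the filesystem mounted (the /fir mount point is an assumption inferred from the fir-MDT0003 target name):

# resolve the FID of the contended object to a client-visible path
lfs fid2path /fir "[0x280000dbb:0x18a:0x0]"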
[152604.065782] LNet: Skipped 12 previous similar messages [152604.071423] Lustre: fir-MDT0003: Client d8aa6f82-54cf-b7b6-c533-7761ac172b8e (at 10.9.108.6@o2ib4) reconnecting [152604.081607] Lustre: Skipped 34 previous similar messages [152804.711697] Lustre: fir-MDT0003: Connection restored to cc43915b-6aa0-7796-18f9-1827e6f9b899 (at 10.8.18.12@o2ib6) [152804.722141] Lustre: Skipped 61 previous similar messages [152970.184574] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576141438/real 1576141438] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576142039 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [152970.212797] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 3 previous similar messages [152970.222630] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [152970.238802] Lustre: Skipped 1 previous similar message [152989.935826] LustreError: 39464:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0003: cannot cleanup orphans: rc = -107 [153204.234460] LustreError: 39797:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576141973, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff95130f31af40/0x5f9f636a2edd4081 lrc: 3/1,0 mode: --/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 435 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39797 timeout: 0 lvb_type: 0 [153204.274106] LustreError: 39797:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 187 previous similar messages [153277.196308] Lustre: fir-OST005a-osc-MDT0003: Connection to fir-OST005a (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [153277.212465] Lustre: Skipped 1 previous similar message [153304.165830] Lustre: fir-MDT0003: Client 3532db27-3550-1319-6c1b-3d6651c2c9af (at 10.9.108.62@o2ib4) reconnecting [153304.176124] Lustre: Skipped 20 previous similar messages [153354.133283] Lustre: 84698:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff95223ca17500 x1649503461975008/t0(0) o101->1805d3cc-a54f-56b0-b2d9-c40a9bdcb7fe@10.9.102.47@o2ib4:653/0 lens 592/3264 e 0 to 0 dl 1576142428 ref 2 fl Interpret:/0/0 rc 0/0 [153354.162619] Lustre: 84698:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 15 previous similar messages [153454.486494] Lustre: 84899:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff953339293a80 x1649340847875008/t0(0) o101->d5336f36-1352-ddc7-e966-e696298bb1ae@10.9.106.53@o2ib4:753/0 lens 1824/3288 e 0 to 0 dl 1576142528 ref 2 fl Interpret:/0/0 rc 0/0 [153454.515911] Lustre: 84899:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 64 previous similar messages [153460.123242] Lustre: fir-MDT0003: Connection restored to (at 10.9.104.60@o2ib4) [153460.130668] Lustre: Skipped 80 previous similar messages [153554.839689] Lustre: 84510:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (4/-151), not sending early reply req@ffff94feea384c80 x1652591624942976/t0(0) o101->8c85854b-9514-4@10.9.105.20@o2ib4:98/0 lens 576/3264 e 0 to 0 dl 1576142628 ref 2 fl Interpret:/0/0 rc 
0/0 [153554.867113] Lustre: 84510:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 39 previous similar messages [153571.983893] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576142039/real 1576142039] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576142640 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [153572.012116] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 4 previous similar messages [153654.168917] Lustre: 84834:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff950225ba8480 x1649421730428272/t0(0) o101->041c1209-eec6-f8ce-c95d-e7e9e84ecf6a@10.9.109.68@o2ib4:198/0 lens 592/3264 e 0 to 0 dl 1576142728 ref 2 fl Interpret:/0/0 rc 0/0 [153654.198248] Lustre: 84834:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 10 previous similar messages [153746.958060] LustreError: 39464:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0003: cannot cleanup orphans: rc = -107 [153927.700284] Lustre: fir-OST0054-osc-MDT0003: Connection to fir-OST0054 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [153927.716438] Lustre: Skipped 2 previous similar messages [154131.703064] Lustre: fir-OST0058-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [154131.712980] Lustre: Skipped 58 previous similar messages [154173.359345] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576142641/real 1576142641] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576143242 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [154173.387549] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 5 previous similar messages [154503.980558] LustreError: 39464:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0003: cannot cleanup orphans: rc = -107 [154684.237852] Lustre: fir-OST0054-osc-MDT0003: Connection to fir-OST0054 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [154684.254002] Lustre: Skipped 4 previous similar messages [154775.407005] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576143242/real 1576143242] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576143843 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [154775.435209] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 3 previous similar messages [154775.445286] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [154775.455221] Lustre: Skipped 3 previous similar messages [155007.191075] LustreError: 39834:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576143776, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff9522b4733600/0x5f9f636a2efc80af lrc: 3/1,0 mode: --/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 313 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39834 timeout: 0 lvb_type: 0 [155007.230723] LustreError: 39834:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 117 
previous similar messages [155007.234909] Lustre: fir-MDT0003: Client a7c6c322-7850-feae-097c-a35b332d6e36 (at 10.9.108.67@o2ib4) reconnecting [155007.234911] Lustre: Skipped 113 previous similar messages [155107.170371] LustreError: 39715:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576143876, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff94ff19b0f740/0x5f9f636a2efe7154 lrc: 3/1,0 mode: --/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 343 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39715 timeout: 0 lvb_type: 0 [155107.210019] LustreError: 39715:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 27 previous similar messages [155261.003342] LustreError: 39464:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0003: cannot cleanup orphans: rc = -107 [155376.646764] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576143844/real 1576143844] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576144445 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [155376.674962] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 6 previous similar messages [155376.684798] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [155376.700973] Lustre: Skipped 4 previous similar messages [155376.706483] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [155376.716421] Lustre: Skipped 10 previous similar messages [155407.807120] LustreError: 39242:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576144177, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff95037f83b3c0/0x5f9f636a2f054555 lrc: 3/1,0 mode: --/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 379 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39242 timeout: 0 lvb_type: 0 [155407.845172] Lustre: fir-MDT0003: Client a7c6c322-7850-feae-097c-a35b332d6e36 (at 10.9.108.67@o2ib4) reconnecting [155407.845174] Lustre: Skipped 6 previous similar messages [155407.862347] LustreError: 39242:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 64 previous similar messages [155949.566817] LustreError: 39767:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576144718, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff951362bc33c0/0x5f9f636a2fdd47fe lrc: 3/1,0 mode: --/PR res: [0x2800347a9:0x7e4d:0x0].0x0 bits 0x13/0x0 rrc: 120 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39767 timeout: 0 lvb_type: 0 [155949.606545] LustreError: 39767:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 1 previous similar message [155978.262163] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576144446/real 1576144446] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576145047 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [155978.290363] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 4 previous 
similar messages [155978.300198] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [155978.316364] Lustre: Skipped 4 previous similar messages [155978.321837] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [155978.331759] Lustre: Skipped 28 previous similar messages [156008.105749] Lustre: fir-MDT0003: Client ae1d0080-04fa-5436-e145-ffdf0db9990d (at 10.0.10.3@o2ib7) reconnecting [156008.115837] Lustre: Skipped 23 previous similar messages [156018.025664] LustreError: 39464:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0003: cannot cleanup orphans: rc = -107 [156103.607717] LNet: Service thread pid 84836 was inactive for 202.46s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [156103.624741] Pid: 84836, comm: mdt03_062 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [156103.635000] Call Trace: [156103.637561] [<ffffffffc0fbbac0>] ldlm_completion_ast+0x430/0x860 [ptlrpc] [156103.644592] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [156103.651887] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [156103.658805] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [156103.665911] [<ffffffffc1546ea0>] mdt_object_lock+0x20/0x30 [mdt] [156103.672151] [<ffffffffc157141a>] mdt_reint_open+0x106a/0x3240 [mdt] [156103.678649] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [156103.684788] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [156103.691460] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [156103.697771] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [156103.704355] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [156103.711207] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [156103.718414] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [156103.724675] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [156103.731727] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [156103.739547] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [156103.745993] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [156103.750996] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [156103.757570] [<ffffffffffffffff>] 0xffffffffffffffff [156103.762683] LustreError: dumping log to /tmp/lustre-log.1576145173.84836 [156108.215764] LNet: Service thread pid 84608 was inactive for 202.35s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [156108.232785] Pid: 84608, comm: mdt03_051 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [156108.243040] Call Trace: [156108.245589] [<ffffffffc0fbbac0>] ldlm_completion_ast+0x430/0x860 [ptlrpc] [156108.252616] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [156108.259920] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [156108.266840] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [156108.273934] [<ffffffffc1546ea0>] mdt_object_lock+0x20/0x30 [mdt] [156108.280148] [<ffffffffc157141a>] mdt_reint_open+0x106a/0x3240 [mdt] [156108.286646] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [156108.292777] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [156108.299439] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [156108.305743] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [156108.312320] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [156108.319158] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [156108.326383] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [156108.332637] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [156108.339672] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [156108.347475] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [156108.353902] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [156108.358906] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [156108.365471] [<ffffffffffffffff>] 0xffffffffffffffff [156108.370571] LustreError: dumping log to /tmp/lustre-log.1576145177.84608 [156108.727756] Pid: 39747, comm: mdt01_020 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [156108.738020] Call Trace: [156108.740573] [<ffffffffc0fbbac0>] ldlm_completion_ast+0x430/0x860 [ptlrpc] [156108.747614] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [156108.754936] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [156108.761852] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [156108.768955] [<ffffffffc1546ea0>] mdt_object_lock+0x20/0x30 [mdt] [156108.775181] [<ffffffffc157141a>] mdt_reint_open+0x106a/0x3240 [mdt] [156108.781680] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [156108.787819] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [156108.794480] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [156108.800792] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [156108.807388] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [156108.814234] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [156108.821442] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [156108.827695] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [156108.834737] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [156108.842542] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [156108.848968] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [156108.853971] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [156108.860546] [<ffffffffffffffff>] 0xffffffffffffffff [156108.865667] LustreError: dumping log to /tmp/lustre-log.1576145178.39747 [156109.239798] LNet: Service thread pid 82360 was inactive for 202.40s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [156109.256817] LNet: Skipped 1 previous similar message [156109.261876] Pid: 82360, comm: mdt01_054 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [156109.272144] Call Trace: [156109.274692] [<ffffffffc0fbbac0>] ldlm_completion_ast+0x430/0x860 [ptlrpc] [156109.281714] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [156109.288999] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [156109.295918] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [156109.303014] [<ffffffffc1546ea0>] mdt_object_lock+0x20/0x30 [mdt] [156109.309232] [<ffffffffc157141a>] mdt_reint_open+0x106a/0x3240 [mdt] [156109.315735] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [156109.321867] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [156109.328528] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [156109.334832] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [156109.341406] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [156109.348249] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [156109.355456] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [156109.361700] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [156109.368735] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [156109.376535] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [156109.382975] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [156109.387970] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [156109.394535] [<ffffffffffffffff>] 0xffffffffffffffff [156110.263776] Pid: 39723, comm: mdt01_013 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [156110.274039] Call Trace: [156110.276590] [<ffffffffc0fbbac0>] ldlm_completion_ast+0x430/0x860 [ptlrpc] [156110.283631] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [156110.290927] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [156110.297848] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [156110.304959] [<ffffffffc1546ea0>] mdt_object_lock+0x20/0x30 [mdt] [156110.311200] [<ffffffffc157141a>] mdt_reint_open+0x106a/0x3240 [mdt] [156110.317697] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [156110.323837] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [156110.330507] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [156110.336819] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [156110.343404] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [156110.350246] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [156110.357452] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [156110.363715] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [156110.370760] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [156110.378576] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [156110.385005] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [156110.390009] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [156110.396586] [<ffffffffffffffff>] 0xffffffffffffffff [156110.401694] LustreError: dumping log to /tmp/lustre-log.1576145179.39723 [156124.599950] LNet: Service thread pid 84532 was inactive for 202.37s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
[156124.612893] LNet: Skipped 2 previous similar messages [156124.618037] LustreError: dumping log to /tmp/lustre-log.1576145193.84532 [156224.697192] Lustre: 39195:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff9533b598bf00 x1649315775819536/t0(0) o101->b4206b2f-67a2-cb01-c899-d99205e22b23@10.9.108.61@o2ib4:504/0 lens 376/1600 e 9 to 0 dl 1576145299 ref 2 fl Interpret:/0/0 rc 0/0 [156224.726264] Lustre: 39195:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 11 previous similar messages [156231.609326] LNet: Service thread pid 84577 was inactive for 601.68s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [156231.622274] LNet: Skipped 2 previous similar messages [156231.627423] LustreError: dumping log to /tmp/lustre-log.1576145300.84577 [156233.657308] LNet: Service thread pid 39852 was inactive for 602.05s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [156233.670258] LNet: Skipped 66 previous similar messages [156233.675498] LustreError: dumping log to /tmp/lustre-log.1576145302.39852 [156235.705319] LNet: Service thread pid 84959 was inactive for 601.48s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [156235.718265] LNet: Skipped 13 previous similar messages [156235.723496] LustreError: dumping log to /tmp/lustre-log.1576145305.84959 [156240.729388] Lustre: 39854:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff94f3cba54c80 x1649312130579504/t0(0) o101->ec90253f-845d-b95d-123f-65f4ef78e64a@10.9.108.64@o2ib4:520/0 lens 376/1600 e 9 to 0 dl 1576145315 ref 2 fl Interpret:/0/0 rc 0/0 [156240.758457] Lustre: 39854:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 90 previous similar messages [156241.849396] LNet: Service thread pid 39817 was inactive for 601.35s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [156241.862350] LNet: Skipped 3 previous similar messages [156241.867498] LustreError: dumping log to /tmp/lustre-log.1576145311.39817 [156243.897428] LustreError: dumping log to /tmp/lustre-log.1576145313.82362 [156245.945452] LustreError: dumping log to /tmp/lustre-log.1576145315.39869 [156250.041498] LNet: Service thread pid 39767 was inactive for 600.47s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [156250.054449] LNet: Skipped 18 previous similar messages [156250.059686] LustreError: dumping log to /tmp/lustre-log.1576145319.39767 [156252.089524] LustreError: dumping log to /tmp/lustre-log.1576145321.39745 [156258.233613] LustreError: dumping log to /tmp/lustre-log.1576145327.39858 [156273.785807] Lustre: 84489:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff95035aab1680 x1649328634898848/t0(0) o101->8d232f07-b6ab-bc70-4dd8-277e82f65db5@10.9.107.58@o2ib4:553/0 lens 1792/3288 e 5 to 0 dl 1576145348 ref 2 fl Interpret:/0/0 rc 0/0 [156273.814958] Lustre: 84489:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 9 previous similar messages [156274.617839] LNet: Service thread pid 39718 was inactive for 600.78s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
[156274.630794] LNet: Skipped 6 previous similar messages [156274.635942] LustreError: dumping log to /tmp/lustre-log.1576145343.39718 [156276.665829] LustreError: dumping log to /tmp/lustre-log.1576145345.39820 [156280.761890] LustreError: dumping log to /tmp/lustre-log.1576145350.39757 [156285.881948] LustreError: dumping log to /tmp/lustre-log.1576145355.39774 [156288.954014] LustreError: dumping log to /tmp/lustre-log.1576145358.39788 [156298.170094] LustreError: dumping log to /tmp/lustre-log.1576145367.84137 [156307.386213] LNet: Service thread pid 84580 was inactive for 401.46s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [156307.399162] LNet: Skipped 6 previous similar messages [156307.404308] LustreError: dumping log to /tmp/lustre-log.1576145376.84580 [156313.530285] LustreError: dumping log to /tmp/lustre-log.1576145382.39759 [156314.153091] Lustre: fir-MDT0003: Client f01080a0-cc7c-da9c-568d-51eacd84f956 (at 10.9.114.8@o2ib4) reconnecting [156314.163276] Lustre: Skipped 66 previous similar messages [156319.674356] LustreError: dumping log to /tmp/lustre-log.1576145388.43471 [156323.770411] LustreError: dumping log to /tmp/lustre-log.1576145393.84073 [156325.818437] LustreError: dumping log to /tmp/lustre-log.1576145395.39528 [156328.836643] LNet: Service thread pid 39841 completed after 699.00s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [156328.852887] LNet: Skipped 9 previous similar messages [156329.914512] LustreError: dumping log to /tmp/lustre-log.1576145399.39784 [156334.010535] LustreError: dumping log to /tmp/lustre-log.1576145403.84611 [156350.522787] Lustre: 39814:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff9513899df500 x1649054644474176/t0(0) o101->970b431c-987f-2756-7c6e-fd09602f1d20@10.9.107.23@o2ib4:629/0 lens 1792/3288 e 2 to 0 dl 1576145424 ref 2 fl Interpret:/0/0 rc 0/0 [156350.551948] Lustre: 39814:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 42 previous similar messages [156356.538822] LustreError: dumping log to /tmp/lustre-log.1576145425.84705 [156366.778944] LustreError: dumping log to /tmp/lustre-log.1576145436.84808 [156370.874996] LustreError: dumping log to /tmp/lustre-log.1576145440.84268 [156391.355237] LNet: Service thread pid 39706 was inactive for 601.44s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [156391.368184] LNet: Skipped 50 previous similar messages [156391.373418] LustreError: dumping log to /tmp/lustre-log.1576145460.39706 [156428.837896] LNet: Service thread pid 39731 completed after 798.92s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [156428.854141] LNet: Skipped 17 previous similar messages [156430.267721] LNet: Service thread pid 39855 was inactive for 601.42s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [156430.284744] LNet: Skipped 1 previous similar message [156430.289805] Pid: 39855, comm: mdt00_036 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [156430.300082] Call Trace: [156430.302640] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [156430.309682] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [156430.316977] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [156430.323895] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [156430.330999] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [156430.338002] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [156430.344677] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [156430.351251] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [156430.358112] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [156430.365307] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [156430.371579] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [156430.378610] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [156430.386426] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [156430.392842] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [156430.397857] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [156430.404419] [<ffffffffffffffff>] 0xffffffffffffffff [156430.409557] LustreError: dumping log to /tmp/lustre-log.1576145499.39855 [156430.416997] Pid: 84630, comm: mdt00_066 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [156430.427294] Call Trace: [156430.429844] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [156430.436868] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [156430.444161] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [156430.451081] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [156430.458174] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [156430.465174] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [156430.471837] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [156430.478422] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [156430.485276] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [156430.492474] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [156430.498736] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [156430.505767] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [156430.513583] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [156430.519998] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [156430.525004] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [156430.531560] [<ffffffffffffffff>] 0xffffffffffffffff [156430.536667] Pid: 84659, comm: mdt03_060 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [156430.546944] Call Trace: [156430.549491] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [156430.556502] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [156430.563810] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [156430.570731] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [156430.577815] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [156430.584824] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [156430.591474] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [156430.598050] 
[<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [156430.604892] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [156430.612112] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [156430.618358] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [156430.625402] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [156430.633205] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [156430.639617] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [156430.644609] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [156430.651179] [<ffffffffffffffff>] 0xffffffffffffffff [156430.656269] Pid: 39836, comm: mdt01_046 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [156430.666545] Call Trace: [156430.669095] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [156430.676116] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [156430.683403] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [156430.690312] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [156430.697406] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [156430.704402] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [156430.711065] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [156430.717627] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [156430.724481] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [156430.731677] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [156430.737942] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [156430.744979] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [156430.752796] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [156430.759211] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [156430.764226] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [156430.770782] [<ffffffffffffffff>] 0xffffffffffffffff [156442.555872] LNet: Service thread pid 39849 was inactive for 602.24s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [156442.572924] LNet: Skipped 3 previous similar messages [156442.578082] Pid: 39849, comm: mdt03_035 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [156442.588373] Call Trace: [156442.590923] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [156442.597949] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [156442.605255] [<ffffffffc1546438>] mdt_object_local_lock+0x438/0xb20 [mdt] [156442.612173] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [156442.619268] [<ffffffffc1546ea0>] mdt_object_lock+0x20/0x30 [mdt] [156442.625485] [<ffffffffc157141a>] mdt_reint_open+0x106a/0x3240 [mdt] [156442.631980] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [156442.638128] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [156442.644799] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [156442.651111] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [156442.657688] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [156442.664543] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [156442.671745] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [156442.677988] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [156442.685023] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [156442.692826] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [156442.699254] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [156442.704247] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [156442.710800] [<ffffffffffffffff>] 0xffffffffffffffff [156442.715908] LustreError: dumping log to /tmp/lustre-log.1576145512.39849 [156463.036139] LustreError: dumping log to /tmp/lustre-log.1576145532.84610 [156465.084145] LustreError: dumping log to /tmp/lustre-log.1576145534.39549 [156492.156491] Lustre: 39805:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff94ff236b8480 x1649336609635952/t0(0) o101->f1bc633f-42ca-147f-3963-7d3778b9fb73@10.9.108.25@o2ib4:16/0 lens 1792/3288 e 2 to 0 dl 1576145566 ref 2 fl Interpret:/0/0 rc 0/0 [156492.185576] Lustre: 39805:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 11 previous similar messages [156528.861121] LNet: Service thread pid 39743 completed after 898.11s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [156528.877370] LNet: Skipped 9 previous similar messages [156530.620954] LNet: Service thread pid 39715 was inactive for 601.77s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
[156530.633909] LNet: Skipped 7 previous similar messages [156530.639055] LustreError: dumping log to /tmp/lustre-log.1576145599.39715 [156577.725529] LustreError: dumping log to /tmp/lustre-log.1576145647.84467 [156578.504559] Lustre: fir-MDT0003: Connection restored to (at 10.9.101.40@o2ib4) [156578.511958] Lustre: Skipped 129 previous similar messages [156579.773547] LustreError: dumping log to /tmp/lustre-log.1576145649.84699 [156580.661559] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576145047/real 1576145047] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576145648 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [156580.689761] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 5 previous similar messages [156580.699594] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [156580.715765] Lustre: Skipped 3 previous similar messages [156598.105779] LustreError: 84628:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576145367, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff950352686780/0x5f9f636a2fe3c980 lrc: 3/0,1 mode: --/CW res: [0x2800347aa:0xf864:0x0].0x0 bits 0x2/0x0 rrc: 142 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 84628 timeout: 0 lvb_type: 0 [156598.145431] LustreError: 84628:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 101 previous similar messages [156606.397885] LustreError: dumping log to /tmp/lustre-log.1576145675.39730 [156628.879269] LNet: Service thread pid 39853 completed after 997.95s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [156628.895532] LNet: Skipped 22 previous similar messages [156728.881813] LNet: Service thread pid 84789 completed after 1096.46s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [156728.898155] LNet: Skipped 58 previous similar messages [156775.047962] LustreError: 39464:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0003: cannot cleanup orphans: rc = -107 [156793.280246] Lustre: 85095:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-271), not sending early reply req@ffff9512fa73ba80 x1649044954025712/t0(0) o101->f1233f77-bdd6-3fdd-910c-ce8f15987b77@10.9.108.9@o2ib4:317/0 lens 1816/3288 e 1 to 0 dl 1576145867 ref 2 fl Interpret:/0/0 rc 0/0 [156793.309573] Lustre: 85095:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 37 previous similar messages [156828.885069] LNet: Service thread pid 39718 completed after 1155.04s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). 
[156828.901421] LNet: Skipped 67 previous similar messages [157181.852970] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576145650/real 1576145650] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576146251 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [157181.881172] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 5 previous similar messages [157181.891005] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [157181.907179] Lustre: Skipped 4 previous similar messages [157181.912660] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [157181.922608] Lustre: Skipped 27 previous similar messages [157228.907542] LustreError: 84138:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576145998, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff952211bf5c40/0x5f9f636a2feaaa38 lrc: 3/0,1 mode: --/CW res: [0x280000dbb:0x18a:0x0].0x0 bits 0x2/0x0 rrc: 559 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 84138 timeout: 0 lvb_type: 0 [157228.947104] LustreError: 84138:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 189 previous similar messages [157532.070292] LustreError: 39464:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0003: cannot cleanup orphans: rc = -107 [157782.820357] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576146251/real 1576146251] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576146852 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [157782.848555] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 3 previous similar messages [157782.858391] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [157782.874558] Lustre: Skipped 2 previous similar messages [157782.880042] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [157782.889960] Lustre: Skipped 2 previous similar messages [158117.919441] LustreError: 84969:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576146887, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff9521b9b92880/0x5f9f636a3050dee8 lrc: 3/1,0 mode: --/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 154 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 84969 timeout: 0 lvb_type: 0 [158117.959087] LustreError: 84969:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 39 previous similar messages [158289.092512] LustreError: 39464:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0003: cannot cleanup orphans: rc = -107 [158384.731657] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576146852/real 1576146852] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576147453 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 
[158384.759860] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 5 previous similar messages [158384.769692] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [158384.785857] Lustre: Skipped 3 previous similar messages [158384.791418] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [158384.801341] Lustre: Skipped 3 previous similar messages [158734.017894] Lustre: 39727:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff94fece216300 x1649409161431648/t0(0) o101->d22f0531-865c-6a0a-5b19-ab2316a51d3c@10.9.106.13@o2ib4:748/0 lens 1824/3288 e 6 to 0 dl 1576147808 ref 2 fl Interpret:/0/0 rc 0/0 [158734.047054] Lustre: 39727:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 6 previous similar messages [158852.886340] LustreError: 39755:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576147622, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff9513623baf40/0x5f9f636a305873a1 lrc: 3/0,1 mode: --/PW res: [0x28003688b:0x168e:0x0].0x93746062 bits 0x2/0x0 rrc: 5 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39755 timeout: 0 lvb_type: 0 [158852.926420] LustreError: 39755:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 38 previous similar messages [158985.866951] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576147454/real 1576147454] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576148055 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [158985.895152] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 5 previous similar messages [158985.904987] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [158985.921154] Lustre: Skipped 3 previous similar messages [158985.926684] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [158985.936602] Lustre: Skipped 3 previous similar messages [159046.114697] LustreError: 39464:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0003: cannot cleanup orphans: rc = -107 [159558.729784] Lustre: fir-MDT0003: haven't heard from client 33fb836e-8923-4 (at 10.9.113.13@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff94ff24e76000, cur 1576148628 expire 1576148478 last 1576148401 [159587.498242] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576148055/real 1576148055] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576148656 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [159587.526460] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 6 previous similar messages [159587.536294] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [159587.552482] Lustre: Skipped 4 previous similar messages [159587.557961] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [159587.567878] Lustre: Skipped 5 previous similar messages [159739.495093] LustreError: 84582:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576148508, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff9513b4d1da00/0x5f9f636a3061d255 lrc: 3/1,0 mode: --/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 203 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 84582 timeout: 0 lvb_type: 0 [159739.534738] LustreError: 84582:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 41 previous similar messages [159739.546697] Lustre: fir-MDT0003: Client 231c7fb6-1245-0cbd-1168-637ab8d5a659 (at 10.9.108.57@o2ib4) reconnecting [159739.556965] Lustre: Skipped 83 previous similar messages [159803.136882] LustreError: 39464:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0003: cannot cleanup orphans: rc = -107 [159843.733953] Lustre: fir-MDT0003: haven't heard from client 99c0707c-5cac-72fe-8449-b2fab5cd2307 (at 10.9.103.9@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff950389afa000, cur 1576148913 expire 1576148763 last 1576148686 [159939.548615] Lustre: fir-MDT0003: Client 3dc3e4b3-1daf-f260-3956-f8f68e141bca (at 10.9.117.42@o2ib4) reconnecting [159939.558883] Lustre: Skipped 14 previous similar messages [160188.041561] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576148656/real 1576148656] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576149257 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [160188.069761] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 7 previous similar messages [160188.079595] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [160188.095768] Lustre: Skipped 4 previous similar messages [160188.101263] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [160188.111180] Lustre: Skipped 22 previous similar messages [160454.942824] LustreError: 39847:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576149224, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff9533b3369200/0x5f9f636a306983b6 lrc: 3/1,0 mode: --/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 125 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39847 timeout: 0 lvb_type: 0 [160454.982465] LustreError: 39847:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 27 previous similar messages [160560.159111] LustreError: 39464:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0003: cannot cleanup orphans: rc = -107 [160746.360382] Lustre: 84797:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff9521a129e780 x1649495473404992/t0(0) o101->c664be52-ca23-c3f9-13b4-abe3e43ba1b5@10.9.102.23@o2ib4:495/0 lens 376/1600 e 4 to 0 dl 1576149820 ref 2 fl Interpret:/0/0 rc 0/0 [160746.389458] Lustre: 84797:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 10 previous similar messages [160752.273553] Lustre: fir-MDT0003: Client c664be52-ca23-c3f9-13b4-abe3e43ba1b5 (at 10.9.102.23@o2ib4) reconnecting [160757.392511] Lustre: 84797:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff9522263f1200 x1649316346769632/t0(0) o101->cded0104-b7e2-3351-ef3d-a03eb9e0010a@10.9.108.66@o2ib4:506/0 lens 1800/3288 e 3 to 0 dl 1576149831 ref 2 fl Interpret:/0/0 rc 0/0 [160757.421666] Lustre: 84797:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 20 previous similar messages [160772.704161] Lustre: fir-MDT0003: Client b8b7ffb5-0542-8024-8094-0defd0f81dd2 (at 10.8.27.8@o2ib6) reconnecting [160772.714252] Lustre: Skipped 32 previous similar messages [160789.528899] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576149257/real 1576149257] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576149858 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [160789.557102] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 3 previous similar messages [160789.566936] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations 
using this service will wait for recovery to complete [160789.583103] Lustre: Skipped 2 previous similar messages [160789.588586] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [160789.598508] Lustre: Skipped 37 previous similar messages [160837.425488] Lustre: 39739:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff9503b1a0ba80 x1649467766884768/t0(0) o101->a1a66b9b-a61e-d9ca-b446-334d54c3c8df@10.9.106.31@o2ib4:586/0 lens 584/3264 e 1 to 0 dl 1576149911 ref 2 fl Interpret:/0/0 rc 0/0 [160837.454567] Lustre: 39739:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 15 previous similar messages [161317.181345] LustreError: 39464:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0003: cannot cleanup orphans: rc = -107 [161387.736201] LustreError: 84943:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576150156, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff9512fa7a06c0/0x5f9f636a3074f6fd lrc: 3/1,0 mode: --/PR res: [0x2800347aa:0xf864:0x0].0x0 bits 0x13/0x0 rrc: 94 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 84943 timeout: 0 lvb_type: 0 [161387.775841] LustreError: 84943:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 43 previous similar messages [161389.880225] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576149858/real 1576149858] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576150459 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [161389.908431] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 6 previous similar messages [161389.918270] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [161389.934438] Lustre: Skipped 3 previous similar messages [161389.939933] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [161389.949848] Lustre: Skipped 3 previous similar messages [161991.223561] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576150459/real 1576150459] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576151060 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [161991.251761] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 4 previous similar messages [161991.261595] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [161991.277745] Lustre: Skipped 3 previous similar messages [161991.283239] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [161991.293160] Lustre: Skipped 3 previous similar messages [161998.553652] LustreError: 39189:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576150767, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff95132ca13600/0x5f9f636a307c1271 lrc: 3/1,0 mode: --/PR res: [0x280033c4a:0x10f5c:0x0].0x0 bits 0x13/0x0 rrc: 43 type: IBT flags: 0x40210400000020 nid: local 
remote: 0x0 expref: -99 pid: 39189 timeout: 0 lvb_type: 0 [161998.593388] LustreError: 39189:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 3 previous similar messages [162074.203581] LustreError: 39464:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0003: cannot cleanup orphans: rc = -107 [162592.398882] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576151060/real 1576151060] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576151661 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [162592.427083] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 6 previous similar messages [162592.436917] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [162592.453104] Lustre: Skipped 4 previous similar messages [162592.458591] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [162592.468528] Lustre: Skipped 4 previous similar messages [162618.914207] LustreError: 39759:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576151388, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff951377e46540/0x5f9f636a3081efea lrc: 3/0,1 mode: --/CW res: [0x280000dbb:0x18a:0x0].0x0 bits 0x2/0x0 rrc: 320 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39759 timeout: 0 lvb_type: 0 [162618.953764] LustreError: 39759:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 218 previous similar messages [162831.225798] LustreError: 39464:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0003: cannot cleanup orphans: rc = -107 [162841.817926] Lustre: 84656:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff951300ebf080 x1652510051951424/t0(0) o101->07edfc82-c551-0aed-6d38-6212dfcad486@10.9.106.54@o2ib4:326/0 lens 576/3264 e 1 to 0 dl 1576151916 ref 2 fl Interpret:/0/0 rc 0/0 [162841.846998] Lustre: 84656:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 3 previous similar messages [162846.449977] Lustre: 39729:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff9500ab3c3180 x1648594882245840/t0(0) o101->515141cc-68cc-d451-ee11-13fe464f05cb@10.9.106.36@o2ib4:330/0 lens 592/3264 e 2 to 0 dl 1576151920 ref 2 fl Interpret:/0/0 rc 0/0 [162846.479057] Lustre: 39729:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 12 previous similar messages [163134.786329] Lustre: fir-MDT0003: haven't heard from client a83208a9-361d-4 (at 10.9.112.4@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff95039e9c9400, cur 1576152204 expire 1576152054 last 1576151977 [163147.538397] Lustre: fir-MDT0003: Client b2911fad-dd6c-1241-cacc-189af7d29c2b (at 10.9.109.11@o2ib4) reconnecting [163147.548661] Lustre: Skipped 1 previous similar message [163193.286186] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576151661/real 1576151661] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576152262 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [163193.314387] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 5 previous similar messages [163193.324221] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [163193.340398] Lustre: Skipped 3 previous similar messages [163193.345883] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [163193.355830] Lustre: Skipped 38 previous similar messages [163476.796422] Lustre: fir-MDT0003: haven't heard from client 46023962-0c0f-4f56-ba25-877d19751e9f (at 10.8.18.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389812c00, cur 1576152546 expire 1576152396 last 1576152319 [163476.818222] Lustre: Skipped 1 previous similar message [163588.248027] LustreError: 39464:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0003: cannot cleanup orphans: rc = -107 [163793.877537] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576152262/real 1576152262] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576152863 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [163793.905735] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 7 previous similar messages [163793.915569] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [163793.931735] Lustre: Skipped 3 previous similar messages [163793.937231] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [163793.947174] Lustre: Skipped 3 previous similar messages [163849.201223] LustreError: 39739:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576152618, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff94fefd754140/0x5f9f636a30c1f9ce lrc: 3/1,0 mode: --/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 324 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39739 timeout: 0 lvb_type: 0 [163849.240882] LustreError: 39739:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 187 previous similar messages [163849.242833] Lustre: fir-MDT0003: Client ae9bd656-5a6f-05f5-a9fa-237fb2f346f5 (at 10.9.107.71@o2ib4) reconnecting [163849.242836] Lustre: Skipped 32 previous similar messages [163849.898303] Lustre: fir-MDT0003: Client cc43915b-6aa0-7796-18f9-1827e6f9b899 (at 10.8.18.12@o2ib6) reconnecting [163849.908481] Lustre: Skipped 9 previous similar messages [164309.793467] Lustre: fir-MDT0003: haven't heard from client 3c8d6e9e-a50e-0a1b-c656-8992c6066eb7 (at 10.9.103.17@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff95038942ac00, cur 1576153379 expire 1576153229 last 1576153152 [164345.270342] LustreError: 39464:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0003: cannot cleanup orphans: rc = -107 [164395.716962] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576152863/real 1576152863] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576153464 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [164395.745165] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 5 previous similar messages [164395.754991] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [164395.771161] Lustre: Skipped 3 previous similar messages [164395.776636] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [164395.786575] Lustre: Skipped 15 previous similar messages [164487.769126] LustreError: 84724:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576153256, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff95237a12dc40/0x5f9f636a30cc4e07 lrc: 3/1,0 mode: --/PR res: [0x2800347a9:0x7e4d:0x0].0x0 bits 0x13/0x0 rrc: 84 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 84724 timeout: 0 lvb_type: 0 [164487.808767] LustreError: 84724:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 11 previous similar messages [164664.334319] LustreError: 84738:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576153433, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff95237a4b9f80/0x5f9f636a30ce684a lrc: 3/1,0 mode: --/PR res: [0x2800347aa:0xf864:0x0].0x0 bits 0x13/0x0 rrc: 85 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 84738 timeout: 0 lvb_type: 0 [164664.373965] LustreError: 84738:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 48 previous similar messages [164758.859493] Lustre: 84752:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff95236dac1200 x1649436547660800/t0(0) o101->a3673fdd-c091-ae6f-4781-d627da6f4e17@10.9.117.20@o2ib4:733/0 lens 376/1600 e 9 to 0 dl 1576153833 ref 2 fl Interpret:/0/0 rc 0/0 [164758.888561] Lustre: 84752:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 7 previous similar messages [164759.861503] Lustre: 84752:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff95215a78a400 x1650926758552656/t0(0) o101->7dc77806-c779-f6f7-b102-8e88c090719f@10.9.108.2@o2ib4:734/0 lens 376/1600 e 9 to 0 dl 1576153834 ref 2 fl Interpret:/0/0 rc 0/0 [164759.890486] Lustre: 84752:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 16 previous similar messages [164761.865527] Lustre: 84752:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff952289e30480 x1648799146761280/t0(0) o101->57b26761-b79f-628f-0ec2-0a10fd7ac3bd@10.8.18.17@o2ib6:736/0 lens 376/1600 e 9 to 0 dl 1576153836 ref 2 fl Interpret:/0/0 rc 0/0 [164761.894511] Lustre: 84752:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 25 previous similar messages [164765.461345] Lustre: 
fir-MDT0003: Client a3673fdd-c091-ae6f-4781-d627da6f4e17 (at 10.9.117.20@o2ib4) reconnecting [164765.873583] Lustre: 39852:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff95214dba4c80 x1649316133862704/t0(0) o101->d4052736-4891-c960-79ef-e2aeecd2f5cf@10.9.108.58@o2ib4:740/0 lens 376/1600 e 9 to 0 dl 1576153840 ref 2 fl Interpret:/0/0 rc 0/0 [164765.902653] Lustre: 39852:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 34 previous similar messages [164766.075828] Lustre: fir-MDT0003: Client 24fab89a-6f6a-550a-7225-4734c7f7b849 (at 10.8.27.12@o2ib6) reconnecting [164766.086002] Lustre: Skipped 17 previous similar messages [164767.490314] Lustre: fir-MDT0003: Client bc3bfe9e-ae2a-3090-298d-b1536a6ddfe9 (at 10.9.108.59@o2ib4) reconnecting [164767.500577] Lustre: Skipped 4 previous similar messages [164767.777614] LNet: Service thread pid 84963 was inactive for 602.84s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [164767.794639] Pid: 84963, comm: mdt02_076 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [164767.804895] Call Trace: [164767.807449] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30 [164767.814274] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [164767.821726] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod] [164767.828645] [<ffffffffc1657c7a>] lod_declare_instantiate_components+0x9a/0x1d0 [lod] [164767.836615] [<ffffffffc166a605>] lod_declare_layout_change+0xb65/0x10f0 [lod] [164767.843982] [<ffffffffc16dd2b2>] mdd_declare_layout_change+0x62/0x120 [mdd] [164767.851164] [<ffffffffc16e6022>] mdd_layout_change+0x882/0x1000 [mdd] [164767.857814] [<ffffffffc15472f7>] mdt_layout_change+0x337/0x430 [mdt] [164767.864402] [<ffffffffc154f59e>] mdt_intent_layout+0x7ee/0xcc0 [mdt] [164767.870972] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [164767.877555] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [164767.884423] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [164767.891631] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [164767.897881] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [164767.904927] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [164767.912742] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [164767.919173] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [164767.924177] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [164767.930753] [<ffffffffffffffff>] 0xffffffffffffffff [164767.935862] LustreError: dumping log to /tmp/lustre-log.1576153837.84963 [164767.945251] Pid: 84718, comm: mdt02_056 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [164767.955545] Call Trace: [164767.958095] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30 [164767.964921] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [164767.972372] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod] [164767.979295] [<ffffffffc1657c7a>] lod_declare_instantiate_components+0x9a/0x1d0 [lod] [164767.987264] [<ffffffffc166a605>] lod_declare_layout_change+0xb65/0x10f0 [lod] [164767.994609] [<ffffffffc16dd2b2>] mdd_declare_layout_change+0x62/0x120 [mdd] [164768.001780] [<ffffffffc16e6022>] mdd_layout_change+0x882/0x1000 [mdd] [164768.008439] [<ffffffffc15472f7>] mdt_layout_change+0x337/0x430 [mdt] [164768.015014] [<ffffffffc154f59e>] 
mdt_intent_layout+0x7ee/0xcc0 [mdt] [164768.021598] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [164768.028170] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [164768.035032] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [164768.042226] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [164768.048497] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [164768.055522] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [164768.063336] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [164768.069752] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [164768.074769] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [164768.081323] [<ffffffffffffffff>] 0xffffffffffffffff [164768.086428] Pid: 39716, comm: mdt01_011 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [164768.096684] Call Trace: [164768.099235] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30 [164768.106050] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [164768.113511] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod] [164768.120417] [<ffffffffc1657c7a>] lod_declare_instantiate_components+0x9a/0x1d0 [lod] [164768.128380] [<ffffffffc166a605>] lod_declare_layout_change+0xb65/0x10f0 [lod] [164768.135723] [<ffffffffc16dd2b2>] mdd_declare_layout_change+0x62/0x120 [mdd] [164768.142919] [<ffffffffc16e6022>] mdd_layout_change+0x882/0x1000 [mdd] [164768.149566] [<ffffffffc15472f7>] mdt_layout_change+0x337/0x430 [mdt] [164768.156149] [<ffffffffc154f59e>] mdt_intent_layout+0x7ee/0xcc0 [mdt] [164768.162714] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [164768.169287] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [164768.176129] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [164768.183350] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [164768.189589] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [164768.196623] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [164768.204426] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [164768.210866] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [164768.215856] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [164768.222424] [<ffffffffffffffff>] 0xffffffffffffffff [164768.227513] Pid: 39813, comm: mdt01_039 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [164768.237784] Call Trace: [164768.240331] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30 [164768.247145] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [164768.254605] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod] [164768.261515] [<ffffffffc1657c7a>] lod_declare_instantiate_components+0x9a/0x1d0 [lod] [164768.269478] [<ffffffffc166a605>] lod_declare_layout_change+0xb65/0x10f0 [lod] [164768.276819] [<ffffffffc16dd2b2>] mdd_declare_layout_change+0x62/0x120 [mdd] [164768.284000] [<ffffffffc16e6022>] mdd_layout_change+0x882/0x1000 [mdd] [164768.290655] [<ffffffffc15472f7>] mdt_layout_change+0x337/0x430 [mdt] [164768.297227] [<ffffffffc154f59e>] mdt_intent_layout+0x7ee/0xcc0 [mdt] [164768.303795] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [164768.310384] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [164768.317240] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [164768.324446] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [164768.330683] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] 
[164768.337718] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [164768.345523] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [164768.351948] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [164768.356949] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [164768.363512] [<ffffffffffffffff>] 0xffffffffffffffff [164768.368612] Pid: 39822, comm: mdt01_044 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [164768.378885] Call Trace: [164768.381435] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30 [164768.388249] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [164768.395701] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod] [164768.402609] [<ffffffffc1657c7a>] lod_declare_instantiate_components+0x9a/0x1d0 [lod] [164768.410572] [<ffffffffc166a605>] lod_declare_layout_change+0xb65/0x10f0 [lod] [164768.417915] [<ffffffffc16dd2b2>] mdd_declare_layout_change+0x62/0x120 [mdd] [164768.425098] [<ffffffffc16e6022>] mdd_layout_change+0x882/0x1000 [mdd] [164768.431748] [<ffffffffc15472f7>] mdt_layout_change+0x337/0x430 [mdt] [164768.438323] [<ffffffffc154f59e>] mdt_intent_layout+0x7ee/0xcc0 [mdt] [164768.444888] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [164768.451476] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [164768.458311] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [164768.465512] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [164768.471753] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [164768.478788] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [164768.486593] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [164768.493021] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [164768.498031] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [164768.504583] [<ffffffffffffffff>] 0xffffffffffffffff [164768.509687] LNet: Service thread pid 84137 was inactive for 603.70s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
[164768.522630] LNet: Skipped 14 previous similar messages [164769.636046] Lustre: fir-MDT0003: Client 635a05c8-c7a3-e96d-15e7-653531254cf2 (at 10.9.110.38@o2ib4) reconnecting [164769.646310] Lustre: Skipped 5 previous similar messages [164769.825616] LustreError: dumping log to /tmp/lustre-log.1576153839.39754 [164771.873655] LustreError: dumping log to /tmp/lustre-log.1576153841.84138 [164773.818866] Lustre: fir-MDT0003: Client 83be24ed-ef36-c298-4c93-73347c93a212 (at 10.9.106.26@o2ib4) reconnecting [164773.829123] Lustre: Skipped 14 previous similar messages [164773.921675] LustreError: dumping log to /tmp/lustre-log.1576153843.39195 [164773.921683] Lustre: 39839:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff95332a751f80 x1649311250494608/t0(0) o101->a1cdb09e-9388-b95a-7a8b-46dd482f7de4@10.9.110.53@o2ib4:748/0 lens 376/1600 e 9 to 0 dl 1576153848 ref 2 fl Interpret:/0/0 rc 0/0 [164773.921685] Lustre: 39839:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 21 previous similar messages [164775.969698] LustreError: dumping log to /tmp/lustre-log.1576153845.85095 [164778.017728] LustreError: dumping log to /tmp/lustre-log.1576153847.84880 [164780.065752] LustreError: dumping log to /tmp/lustre-log.1576153849.39242 [164782.113771] LustreError: dumping log to /tmp/lustre-log.1576153851.39744 [164783.248697] Lustre: fir-MDT0003: Client ac744819-a0e9-dce1-af3e-f5ed5c20fc63 (at 10.9.104.14@o2ib4) reconnecting [164783.258955] Lustre: Skipped 17 previous similar messages [164784.161793] LustreError: dumping log to /tmp/lustre-log.1576153853.84148 [164786.209819] LustreError: dumping log to /tmp/lustre-log.1576153855.84697 [164789.945873] Lustre: 39723:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff95131aa4cc80 x1648697234874944/t0(0) o101->b2911fad-dd6c-1241-cacc-189af7d29c2b@10.9.109.11@o2ib4:9/0 lens 584/3264 e 5 to 0 dl 1576153864 ref 2 fl Interpret:/0/0 rc 0/0 [164789.974763] Lustre: 39723:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 18 previous similar messages [164790.305869] LustreError: dumping log to /tmp/lustre-log.1576153859.84835 [164792.353894] LustreError: dumping log to /tmp/lustre-log.1576153861.39527 [164794.401921] LustreError: dumping log to /tmp/lustre-log.1576153863.39832 [164796.449942] LustreError: dumping log to /tmp/lustre-log.1576153865.39783 [164798.497986] LustreError: dumping log to /tmp/lustre-log.1576153867.82365 [164801.342224] Lustre: fir-MDT0003: Client 3c685a68-a01b-8133-ff2a-2e92a7918740 (at 10.9.109.60@o2ib4) reconnecting [164801.352498] Lustre: Skipped 14 previous similar messages [164802.594018] LNet: Service thread pid 85029 was inactive for 602.30s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
[164802.606983] LNet: Skipped 134 previous similar messages [164802.612313] LustreError: dumping log to /tmp/lustre-log.1576153871.85029 [164804.642054] LustreError: dumping log to /tmp/lustre-log.1576153873.84808 [164806.690076] LustreError: dumping log to /tmp/lustre-log.1576153875.39787 [164808.738095] LustreError: dumping log to /tmp/lustre-log.1576153877.84582 [164814.882177] LustreError: dumping log to /tmp/lustre-log.1576153884.39731 [164823.010286] Lustre: 82360:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff95131b3e4800 x1648398813962496/t0(0) o101->abaf865a-8fcb-451b-e3df-c50916747fa5@10.8.27.11@o2ib6:42/0 lens 1808/3288 e 3 to 0 dl 1576153897 ref 2 fl Interpret:/0/0 rc 0/0 [164823.039268] Lustre: 82360:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 10 previous similar messages [164823.074279] LustreError: dumping log to /tmp/lustre-log.1576153892.84941 [164831.266379] LustreError: dumping log to /tmp/lustre-log.1576153900.82366 [164833.402471] Lustre: fir-MDT0003: Client 0c302cf4-1147-d945-dfa2-e9bc796b3175 (at 10.9.101.32@o2ib4) reconnecting [164833.412734] Lustre: Skipped 11 previous similar messages [164835.362427] LustreError: dumping log to /tmp/lustre-log.1576153904.84719 [164839.458492] LustreError: dumping log to /tmp/lustre-log.1576153908.84102 [164841.506502] LustreError: dumping log to /tmp/lustre-log.1576153910.39816 [164843.554525] LustreError: dumping log to /tmp/lustre-log.1576153912.39854 [164845.602554] LustreError: dumping log to /tmp/lustre-log.1576153914.84569 [164857.890707] LustreError: dumping log to /tmp/lustre-log.1576153927.39828 [164859.938732] LustreError: dumping log to /tmp/lustre-log.1576153929.39849 [164861.986760] LustreError: dumping log to /tmp/lustre-log.1576153931.39736 [164864.034785] LustreError: dumping log to /tmp/lustre-log.1576153933.39246 [164864.329993] LNet: Service thread pid 84137 completed after 699.52s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [164864.346232] LNet: Skipped 2 previous similar messages [164868.130830] LNet: Service thread pid 85096 was inactive for 603.79s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
[164868.143777] LNet: Skipped 28 previous similar messages [164868.149013] LustreError: dumping log to /tmp/lustre-log.1576153937.85096 [164870.178868] LustreError: dumping log to /tmp/lustre-log.1576153939.84781 [164872.226886] LustreError: dumping log to /tmp/lustre-log.1576153941.39601 [164874.274911] LustreError: dumping log to /tmp/lustre-log.1576153943.39819 [164886.563061] LustreError: dumping log to /tmp/lustre-log.1576153955.84943 [164900.067244] Lustre: 39837:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff95237abc5a00 x1649453378705024/t0(0) o101->115600dd-e8d0-f526-46d3-125e5e3a170e@10.9.117.8@o2ib4:119/0 lens 576/3264 e 2 to 0 dl 1576153974 ref 2 fl Interpret:/0/0 rc 0/0 [164900.096232] Lustre: 39837:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 17 previous similar messages [164906.239556] Lustre: fir-MDT0003: Client 69741a10-bb12-4 (at 10.9.104.34@o2ib4) reconnecting [164906.248014] Lustre: Skipped 15 previous similar messages [164907.043355] LustreError: dumping log to /tmp/lustre-log.1576153976.39239 [164909.091348] LustreError: dumping log to /tmp/lustre-log.1576153978.39863 [164919.331471] LustreError: dumping log to /tmp/lustre-log.1576153988.39772 [164935.715673] LustreError: dumping log to /tmp/lustre-log.1576154004.39756 [164941.859766] LustreError: dumping log to /tmp/lustre-log.1576154011.39707 [164956.195925] LustreError: dumping log to /tmp/lustre-log.1576154025.39784 [164964.331166] LNet: Service thread pid 39798 completed after 799.50s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [164964.355034] LustreError: 84533:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576153733, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff9533a5a51d40/0x5f9f636a30d1e4e6 lrc: 3/1,0 mode: --/PR res: [0x2800347a9:0x7e4d:0x0].0x0 bits 0x13/0x0 rrc: 85 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 84533 timeout: 0 lvb_type: 0 [164964.394714] LustreError: 84533:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 15 previous similar messages [164966.436053] LustreError: dumping log to /tmp/lustre-log.1576154035.84738 [164976.676178] LustreError: dumping log to /tmp/lustre-log.1576154045.84490 [164980.772226] LustreError: dumping log to /tmp/lustre-log.1576154049.84577 [164997.324443] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576153465/real 1576153465] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576154066 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [164997.352645] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 4 previous similar messages [164997.362482] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [164997.378669] Lustre: Skipped 3 previous similar messages [164997.384166] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [164997.394103] Lustre: Skipped 119 previous similar messages [165009.444578] LNet: Service thread pid 82363 was inactive for 602.66s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
[165009.457524] LNet: Skipped 30 previous similar messages [165009.462762] LustreError: dumping log to /tmp/lustre-log.1576154078.82363 [165027.876812] LustreError: dumping log to /tmp/lustre-log.1576154097.39842 [165064.332396] LNet: Service thread pid 39843 completed after 899.49s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [165064.348640] LNet: Skipped 4 previous similar messages [165069.477339] Lustre: 39831:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff95037cfe6300 x1649318476470624/t0(0) o101->5e12be0f-0711-9952-47cf-1c56f1c332e3@10.9.107.57@o2ib4:288/0 lens 1792/3288 e 2 to 0 dl 1576154143 ref 2 fl Interpret:/0/0 rc 0/0 [165069.506496] Lustre: 39831:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 18 previous similar messages [165075.070242] Lustre: fir-MDT0003: Client 5e12be0f-0711-9952-47cf-1c56f1c332e3 (at 10.9.107.57@o2ib4) reconnecting [165075.080503] Lustre: Skipped 12 previous similar messages [165081.125471] LNet: Service thread pid 39817 was inactive for 600.21s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [165081.142495] LNet: Skipped 4 previous similar messages [165081.147645] Pid: 39817, comm: mdt01_041 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [165081.157925] Call Trace: [165081.160481] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [165081.167524] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [165081.174839] [<ffffffffc1546438>] mdt_object_local_lock+0x438/0xb20 [mdt] [165081.181763] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [165081.188866] [<ffffffffc1546ea0>] mdt_object_lock+0x20/0x30 [mdt] [165081.195083] [<ffffffffc157141a>] mdt_reint_open+0x106a/0x3240 [mdt] [165081.201583] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [165081.207719] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [165081.214379] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [165081.220691] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [165081.227277] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [165081.234125] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [165081.241363] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [165081.247611] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [165081.254657] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [165081.262459] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [165081.268886] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [165081.273889] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [165081.280451] [<ffffffffffffffff>] 0xffffffffffffffff [165081.285572] LustreError: dumping log to /tmp/lustre-log.1576154150.39817 [165081.292980] Pid: 39838, comm: mdt03_029 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [165081.303251] Call Trace: [165081.305799] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [165081.312819] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [165081.320106] [<ffffffffc1546438>] mdt_object_local_lock+0x438/0xb20 [mdt] [165081.327025] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [165081.334127] [<ffffffffc1546ea0>] mdt_object_lock+0x20/0x30 [mdt] [165081.340377] [<ffffffffc157141a>] mdt_reint_open+0x106a/0x3240 [mdt] [165081.346878] 
[<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [165081.353016] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [165081.359678] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [165081.365982] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [165081.372578] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [165081.379424] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [165081.386626] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [165081.392866] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [165081.399902] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [165081.407706] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [165081.414132] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [165081.419126] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [165081.425693] [<ffffffffffffffff>] 0xffffffffffffffff [165102.292742] LustreError: 39464:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0003: cannot cleanup orphans: rc = -107 [165164.349629] LNet: Service thread pid 39807 completed after 999.51s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [165164.365892] LNet: Skipped 3 previous similar messages [165271.039831] LustreError: 39460:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0054-osc-MDT0003: cannot cleanup orphans: rc = -11 [165343.784734] Lustre: 39869:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-489), not sending early reply req@ffff9521e839b600 x1649316347560864/t0(0) o101->cded0104-b7e2-3351-ef3d-a03eb9e0010a@10.9.108.66@o2ib4:562/0 lens 576/3264 e 2 to 0 dl 1576154417 ref 2 fl Interpret:/0/0 rc 0/0 [165343.814064] Lustre: 39869:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 22 previous similar messages [165348.887193] Lustre: fir-MDT0003: Client cded0104-b7e2-3351-ef3d-a03eb9e0010a (at 10.9.108.66@o2ib4) reconnecting [165348.897455] Lustre: Skipped 16 previous similar messages [165355.560858] LNet: Service thread pid 84532 was inactive for 814.26s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [165355.577882] LNet: Skipped 1 previous similar message [165355.582940] Pid: 84532, comm: mdt01_063 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [165355.593216] Call Trace: [165355.595765] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [165355.602801] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [165355.610087] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [165355.617005] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [165355.624101] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [165355.631098] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [165355.637773] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [165355.644333] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [165355.651184] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [165355.658372] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [165355.664634] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [165355.671657] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [165355.679472] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [165355.685890] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [165355.690905] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [165355.697470] [<ffffffffffffffff>] 0xffffffffffffffff [165355.702603] LustreError: dumping log to /tmp/lustre-log.1576154424.84532 [165364.352123] LNet: Service thread pid 39796 completed after 1199.48s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [165364.368456] LNet: Skipped 2 previous similar messages [165374.658110] LustreError: 39472:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005a-osc-MDT0003: cannot cleanup orphans: rc = -11 [165378.089135] LNet: Service thread pid 84501 was inactive for 813.73s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [165378.106173] Pid: 84501, comm: mdt02_050 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [165378.116435] Call Trace: [165378.118990] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [165378.126035] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [165378.133338] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [165378.140254] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [165378.147375] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [165378.154384] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [165378.161055] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [165378.167622] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [165378.174470] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [165378.181678] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [165378.187931] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [165378.194965] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [165378.202790] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [165378.209257] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [165378.214259] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [165378.220822] [<ffffffffffffffff>] 0xffffffffffffffff [165378.225959] LustreError: dumping log to /tmp/lustre-log.1576154447.84501 [165378.233495] Pid: 84149, comm: mdt00_046 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [165378.243783] Call Trace: [165378.246333] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [165378.253346] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [165378.260631] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [165378.267542] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [165378.274651] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [165378.281642] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [165378.288305] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [165378.294869] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [165378.301722] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [165378.308908] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [165378.315162] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [165378.322187] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [165378.330001] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [165378.336416] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [165378.341447] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [165378.348006] [<ffffffffffffffff>] 0xffffffffffffffff [165378.353136] LNet: Service thread pid 39767 was inactive for 813.99s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [165378.366082] LNet: Skipped 1 previous similar message [165437.481868] LNet: Service thread pid 39740 was inactive for 863.63s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [165437.498891] LNet: Skipped 1 previous similar message [165437.503948] Pid: 39740, comm: mdt03_009 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [165437.514241] Call Trace: [165437.516789] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30 [165437.523607] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [165437.531059] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod] [165437.537969] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod] [165437.544631] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod] [165437.551975] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod] [165437.558637] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [165437.566588] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd] [165437.573165] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd] [165437.579207] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt] [165437.585726] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [165437.591868] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [165437.598530] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [165437.604834] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [165437.611410] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [165437.618267] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [165437.625477] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [165437.631727] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [165437.638761] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [165437.646565] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [165437.653006] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [165437.658003] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [165437.664579] [<ffffffffffffffff>] 0xffffffffffffffff [165437.669688] LustreError: dumping log to /tmp/lustre-log.1576154506.39740 [165437.677103] Pid: 84966, comm: mdt03_073 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [165437.687391] Call Trace: [165437.689937] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30 [165437.696762] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [165437.704204] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod] [165437.711114] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod] [165437.717792] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod] [165437.725137] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod] [165437.731799] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [165437.739751] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd] [165437.746330] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd] [165437.752369] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt] [165437.758857] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [165437.764985] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [165437.771649] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [165437.777964] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [165437.784522] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [165437.791387] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [165437.798570] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [165437.804826] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [165437.811848] 
[<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [165437.819667] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [165437.826080] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [165437.831084] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [165437.837644] [<ffffffffffffffff>] 0xffffffffffffffff [165437.842755] Pid: 84798, comm: mdt01_070 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [165437.853013] Call Trace: [165437.855563] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30 [165437.862376] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [165437.869830] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod] [165437.876736] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod] [165437.883398] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod] [165437.890741] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod] [165437.897404] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [165437.905356] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd] [165437.911930] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd] [165437.917989] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt] [165437.924480] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [165437.930608] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [165437.937272] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [165437.943574] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [165437.950149] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [165437.956992] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [165437.964190] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [165437.970433] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [165437.977469] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [165437.985290] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [165437.991716] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [165437.996711] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [165438.003284] [<ffffffffffffffff>] 0xffffffffffffffff [165438.008373] Pid: 84598, comm: mdt03_050 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [165438.018645] Call Trace: [165438.021193] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30 [165438.028022] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [165438.035453] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod] [165438.042375] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod] [165438.049039] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod] [165438.056395] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod] [165438.063047] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [165438.071010] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd] [165438.077572] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd] [165438.083628] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt] [165438.090109] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [165438.096255] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [165438.102906] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [165438.109222] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [165438.115800] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [165438.122655] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [165438.129844] 
[<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [165438.136098] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [165438.143124] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [165438.150942] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [165438.157360] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [165438.162366] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [165438.168921] [<ffffffffffffffff>] 0xffffffffffffffff [165438.174019] Pid: 84656, comm: mdt01_069 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [165438.184284] Call Trace: [165438.186833] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30 [165438.193647] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [165438.201091] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod] [165438.207999] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod] [165438.214661] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod] [165438.222005] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod] [165438.228668] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [165438.236617] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd] [165438.243193] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd] [165438.249252] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt] [165438.255745] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [165438.261872] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [165438.268521] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [165438.274835] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [165438.281399] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [165438.288251] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [165438.295443] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [165438.301702] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [165438.308726] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [165438.316553] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [165438.322960] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [165438.327976] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [165438.334526] [<ffffffffffffffff>] 0xffffffffffffffff [165464.353496] LNet: Service thread pid 84583 completed after 1299.46s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). 
[165464.369825] LNet: Skipped 1 previous similar message [165473.090321] LustreError: 39468:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0058-osc-MDT0003: cannot cleanup orphans: rc = -11 [165519.402880] LustreError: dumping log to /tmp/lustre-log.1576154588.39726 [165571.360523] LustreError: 84500:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576154340, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff9513463a7080/0x5f9f636a30d81253 lrc: 3/0,1 mode: --/CW res: [0x2800347a9:0x7e4d:0x0].0x0 bits 0x2/0x0 rrc: 99 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 84500 timeout: 0 lvb_type: 0 [165571.400082] LustreError: 84500:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 31 previous similar messages [165589.035745] LustreError: dumping log to /tmp/lustre-log.1576154658.84789 [165598.443850] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576154066/real 1576154066] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576154667 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [165598.472055] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 6 previous similar messages [165598.481888] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [165598.498063] Lustre: Skipped 5 previous similar messages [165598.503564] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [165598.513491] Lustre: Skipped 132 previous similar messages [165664.355799] LNet: Service thread pid 84645 completed after 1499.44s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [165664.372126] LNet: Skipped 2 previous similar messages [165679.148842] LustreError: dumping log to /tmp/lustre-log.1576154748.84960 [165752.877757] LNet: Service thread pid 39826 was inactive for 1064.72s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [165752.894876] LNet: Skipped 4 previous similar messages [165752.900024] Pid: 39826, comm: mdt02_031 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [165752.910301] Call Trace: [165752.912857] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30 [165752.919682] [<ffffffffc1673537>] lod_qos_statfs_update+0x97/0x2b0 [lod] [165752.926533] [<ffffffffc16756da>] lod_qos_prep_create+0x16a/0x1890 [lod] [165752.933372] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod] [165752.940048] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod] [165752.947390] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod] [165752.954055] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [165752.962012] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd] [165752.968588] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd] [165752.974631] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt] [165752.981138] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [165752.987278] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [165752.993949] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [165753.000275] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [165753.006862] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [165753.013724] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [165753.020936] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [165753.027191] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [165753.034239] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [165753.042043] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [165753.048473] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [165753.053475] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [165753.060054] [<ffffffffffffffff>] 0xffffffffffffffff [165753.065175] LustreError: dumping log to /tmp/lustre-log.1576154822.39826 [165753.072550] Pid: 39747, comm: mdt01_020 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [165753.082806] Call Trace: [165753.085360] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [165753.092380] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [165753.099675] [<ffffffffc1546438>] mdt_object_local_lock+0x438/0xb20 [mdt] [165753.106594] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [165753.113696] [<ffffffffc1546ea0>] mdt_object_lock+0x20/0x30 [mdt] [165753.119916] [<ffffffffc157141a>] mdt_reint_open+0x106a/0x3240 [mdt] [165753.126421] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [165753.132576] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [165753.139246] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [165753.145559] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [165753.152135] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [165753.158979] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [165753.166191] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [165753.172444] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [165753.179479] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [165753.187282] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [165753.193709] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [165753.198714] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [165753.205295] [<ffffffffffffffff>] 0xffffffffffffffff 
[165805.878422] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0003: cannot cleanup orphans: rc = -11 [165822.510617] LNet: Service thread pid 86195 was inactive for 1118.33s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [165822.527751] LNet: Skipped 1 previous similar message [165822.532831] Pid: 86195, comm: mdt00_085 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [165822.543109] Call Trace: [165822.545667] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30 [165822.552490] [<ffffffffc1673537>] lod_qos_statfs_update+0x97/0x2b0 [lod] [165822.559334] [<ffffffffc16756da>] lod_qos_prep_create+0x16a/0x1890 [lod] [165822.566168] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod] [165822.572831] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod] [165822.580174] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod] [165822.586836] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [165822.594794] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd] [165822.601369] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd] [165822.607421] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt] [165822.613927] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [165822.620066] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [165822.626736] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [165822.633049] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [165822.639648] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [165822.646509] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [165822.653716] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [165822.659969] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [165822.667019] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [165822.674823] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [165822.681252] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [165822.686254] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [165822.692843] [<ffffffffffffffff>] 0xffffffffffffffff [165822.697949] LustreError: dumping log to /tmp/lustre-log.1576154891.86195 [165859.315074] LustreError: 39464:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0003: cannot cleanup orphans: rc = -107 [165892.143461] LNet: Service thread pid 39245 was inactive for 1167.76s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [165892.160572] Pid: 39245, comm: mdt02_006 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [165892.170835] Call Trace: [165892.173392] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [165892.180434] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [165892.187731] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [165892.194648] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [165892.201752] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [165892.208757] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [165892.215428] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [165892.222001] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [165892.228853] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [165892.236057] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [165892.242307] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [165892.249349] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [165892.257153] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [165892.263581] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [165892.268597] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [165892.275180] [<ffffffffffffffff>] 0xffffffffffffffff [165892.280296] LustreError: dumping log to /tmp/lustre-log.1576154961.39245 [165896.239514] Pid: 39848, comm: mdt00_034 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [165896.249775] Call Trace: [165896.252334] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30 [165896.259162] [<ffffffffc1673537>] lod_qos_statfs_update+0x97/0x2b0 [lod] [165896.266016] [<ffffffffc16756da>] lod_qos_prep_create+0x16a/0x1890 [lod] [165896.272852] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod] [165896.279514] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod] [165896.286872] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod] [165896.293538] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [165896.301497] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd] [165896.308073] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd] [165896.314115] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt] [165896.320621] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [165896.326759] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [165896.333430] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [165896.339743] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [165896.346320] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [165896.353200] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [165896.360410] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [165896.366671] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [165896.373715] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [165896.381517] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [165896.387945] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [165896.392950] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [165896.399522] [<ffffffffffffffff>] 0xffffffffffffffff [165896.404634] LustreError: dumping log to /tmp/lustre-log.1576154965.39848 [165904.431610] LNet: Service thread pid 39775 was inactive for 1167.66s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
[165904.444650] LNet: Skipped 10 previous similar messages
[165904.449889] LustreError: dumping log to /tmp/lustre-log.1576154973.39775
[165908.828559] Lustre: fir-MDT0003: Client 7dc77806-c779-f6f7-b102-8e88c090719f (at 10.9.108.2@o2ib4) reconnecting
[165908.838732] Lustre: Skipped 135 previous similar messages
[165914.671762] Lustre: 39243:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff9533aec0a050 x1649053188899120/t0(0) o101->f27082f8-7761-5c5d-b196-67b78beb0e67@10.9.101.66@o2ib4:378/0 lens 584/3264 e 0 to 0 dl 1576154988 ref 2 fl Interpret:/0/0 rc 0/0
[165914.701094] Lustre: 39243:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 40 previous similar messages
[165957.680271] LustreError: dumping log to /tmp/lustre-log.1576155026.39793
[165964.359455] LNet: Service thread pid 39873 completed after 1799.39s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[165964.375787] LNet: Skipped 2 previous similar messages
[165965.872358] LustreError: dumping log to /tmp/lustre-log.1576155035.84964
[166002.736818] LustreError: dumping log to /tmp/lustre-log.1576155071.39852
[166010.928914] LustreError: dumping log to /tmp/lustre-log.1576155080.39805
[166015.024960] LustreError: dumping log to /tmp/lustre-log.1576155084.39786
[166023.217066] LustreError: dumping log to /tmp/lustre-log.1576155092.84752
[166028.062131] LustreError: 39460:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0054-osc-MDT0003: cannot cleanup orphans: rc = -107
[166068.273617] LNet: Service thread pid 39792 was inactive for 1203.92s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[166068.290730] LNet: Skipped 1 previous similar message
[166068.295793] Pid: 39792, comm: mdt00_026 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[166068.306084] Call Trace:
[166068.308635] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[166068.315680] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[166068.322975] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[166068.329892] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[166068.336995] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[166068.343994] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[166068.350663] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[166068.357238] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[166068.364090] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[166068.371301] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[166068.377575] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[166068.384607] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[166068.392420] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[166068.398835] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[166068.403854] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[166068.410417] [<ffffffffffffffff>] 0xffffffffffffffff
[166068.415536] LustreError: dumping log to /tmp/lustre-log.1576155137.39792
[166068.422920] Pid: 84696, comm: mdt02_052 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[166068.433205] Call Trace:
[166068.435753] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[166068.442777] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[166068.450062] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[166068.456972] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[166068.464067] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[166068.471065] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[166068.477728] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[166068.484291] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[166068.491144] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[166068.498330] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[166068.504599] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[166068.511618] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[166068.519430] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[166068.525847] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[166068.530853] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[166068.537409] [<ffffffffffffffff>] 0xffffffffffffffff
[166068.542514] Pid: 82360, comm: mdt01_054 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[166068.552767] Call Trace:
[166068.555311] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[166068.562327] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[166068.569629] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[166068.576540] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[166068.583634] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[166068.590632] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[166068.597293] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[166068.603858] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[166068.610711] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[166068.617897] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[166068.624152] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[166068.631175] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[166068.639003] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[166068.645407] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[166068.650399] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[166068.656966] [<ffffffffffffffff>] 0xffffffffffffffff
[166072.001682] LustreError: 39476:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005c-osc-MDT0003: cannot cleanup orphans: rc = -11
[166072.369672] Pid: 84137, comm: mdt00_044 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[166072.379931] Call Trace:
[166072.382489] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30
[166072.389315] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[166072.396771] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod]
[166072.403702] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod]
[166072.410373] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod]
[166072.417717] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod]
[166072.424381] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[166072.432355] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd]
[166072.438928] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd]
[166072.444973] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt]
[166072.451478] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt]
[166072.457619] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[166072.464280] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt]
[166072.470609] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[166072.477188] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[166072.484045] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[166072.491252] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[166072.497512] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[166072.504556] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[166072.512357] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[166072.518785] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[166072.523789] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[166072.530366] [<ffffffffffffffff>] 0xffffffffffffffff
[166072.535490] LustreError: dumping log to /tmp/lustre-log.1576155141.84137
[166084.657829] Pid: 84686, comm: mdt02_051 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[166084.668086] Call Trace:
[166084.670636] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[166084.677680] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[166084.684976] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[166084.691893] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[166084.698997] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[166084.705994] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[166084.712667] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[166084.719253] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[166084.726115] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[166084.733311] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[166084.739576] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[166084.746607] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[166084.754422] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[166084.760838] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[166084.765854] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[166084.772417] [<ffffffffffffffff>] 0xffffffffffffffff
[166084.777547] LustreError: dumping log to /tmp/lustre-log.1576155153.84686
[166158.386732] LustreError: dumping log to /tmp/lustre-log.1576155227.39837
[166166.578832] LustreError: dumping log to /tmp/lustre-log.1576155235.82356
[166191.155134] LustreError: dumping log to /tmp/lustre-log.1576155260.39764
[166199.347239] LustreError: dumping log to /tmp/lustre-log.1576155268.39757
[166199.659245] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576154667/real 1576154667] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576155268 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[166199.687448] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 8 previous similar messages
[166199.697285] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[166199.713456] Lustre: Skipped 4 previous similar messages
[166199.718942] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[166199.728878] Lustre: Skipped 145 previous similar messages
[166206.550331] LustreError: 84488:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576154975, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff95237c378fc0/0x5f9f636a30dcbfae lrc: 3/1,0 mode: --/PR res: [0x2800347aa:0xf864:0x0].0x0 bits 0x13/0x0 rrc: 103 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 84488 timeout: 0 lvb_type: 0
[166206.590056] LustreError: 84488:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 14 previous similar messages
[166230.112624] LustreError: 39468:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0058-osc-MDT0003: cannot cleanup orphans: rc = -107
[166230.125743] LustreError: 39468:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 1 previous similar message
[166252.595889] LustreError: dumping log to /tmp/lustre-log.1576155321.84646
[166260.787994] LustreError: dumping log to /tmp/lustre-log.1576155329.39761
[166264.884049] LustreError: dumping log to /tmp/lustre-log.1576155334.39843
[166273.076143] LustreError: dumping log to /tmp/lustre-log.1576155342.39241
[166364.364517] LNet: Service thread pid 84596 completed after 2197.86s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[166364.380852] LNet: Skipped 4 previous similar messages
[166367.285305] LustreError: dumping log to /tmp/lustre-log.1576155436.39716
[166457.398408] LNet: Service thread pid 39839 was inactive for 1201.31s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[166457.415514] LNet: Skipped 4 previous similar messages
[166457.420665] Pid: 39839, comm: mdt03_030 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[166457.430936] Call Trace:
[166457.433486] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[166457.440523] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[166457.447834] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[166457.454753] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[166457.461855] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[166457.468862] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[166457.475536] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[166457.482106] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[166457.488967] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[166457.496164] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[166457.502428] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[166457.509457] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[166457.517288] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[166457.523706] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[166457.528723] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[166457.535286] [<ffffffffffffffff>] 0xffffffffffffffff
[166457.540413] LustreError: dumping log to /tmp/lustre-log.1576155526.39839
[166473.782609] Pid: 84500, comm: mdt01_062 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[166473.792870] Call Trace:
[166473.795428] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[166473.802469] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[166473.809783] [<ffffffffc1546438>] mdt_object_local_lock+0x438/0xb20 [mdt]
[166473.816718] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[166473.823798] [<ffffffffc1546ea0>] mdt_object_lock+0x20/0x30 [mdt]
[166473.830027] [<ffffffffc157141a>] mdt_reint_open+0x106a/0x3240 [mdt]
[166473.836513] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt]
[166473.842670] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[166473.849331] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt]
[166473.855656] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[166473.862227] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[166473.869111] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[166473.876339] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[166473.882619] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[166473.889693] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[166473.897530] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[166473.903941] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[166473.908956] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[166473.915530] [<ffffffffffffffff>] 0xffffffffffffffff
[166473.920659] LustreError: dumping log to /tmp/lustre-log.1576155543.84500
[166514.921570] Lustre: fir-MDT0003: Client cded0104-b7e2-3351-ef3d-a03eb9e0010a (at 10.9.108.66@o2ib4) reconnecting
[166514.931836] Lustre: Skipped 141 previous similar messages
[166518.839159] Pid: 39709, comm: mdt01_007 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[166518.849416] Call Trace:
[166518.851975] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[166518.859014] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[166518.866306] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[166518.873221] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[166518.880317] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[166518.887314] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[166518.893977] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[166518.900554] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[166518.907409] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[166518.914596] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[166518.920859] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[166518.927890] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[166518.935707] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[166518.942122] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[166518.947122] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[166518.953692] [<ffffffffffffffff>] 0xffffffffffffffff
[166518.958800] LustreError: dumping log to /tmp/lustre-log.1576155588.39709
[166562.900717] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0003: cannot cleanup orphans: rc = -107
[166567.991763] Pid: 39189, comm: mdt01_001 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[166568.002040] Call Trace:
[166568.004595] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[166568.011637] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[166568.018934] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[166568.025853] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[166568.032956] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[166568.039964] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[166568.046641] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[166568.053214] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[166568.060065] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[166568.067261] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[166568.073526] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[166568.080555] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[166568.088369] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[166568.094788] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[166568.099803] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[166568.106368] [<ffffffffffffffff>] 0xffffffffffffffff
[166568.111510] LustreError: dumping log to /tmp/lustre-log.1576155637.39189
[166568.118867] Pid: 84453, comm: mdt02_047 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[166568.129142] Call Trace:
[166568.131687] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[166568.138709] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[166568.145996] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[166568.152906] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[166568.160000] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[166568.166998] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[166568.173661] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[166568.180243] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[166568.187095] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[166568.194282] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[166568.200537] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[166568.207558] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[166568.215373] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[166568.221790] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[166568.226781] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[166568.233348] [<ffffffffffffffff>] 0xffffffffffffffff
[166568.238444] LNet: Service thread pid 84146 was inactive for 1203.87s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[166568.251506] LNet: Skipped 29 previous similar messages
[166572.087815] LustreError: dumping log to /tmp/lustre-log.1576155641.39790
[166573.623847] Lustre: 39185:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff9503b9b9de80 x1649315777724672/t0(0) o101->b4206b2f-67a2-cb01-c899-d99205e22b23@10.9.108.61@o2ib4:282/0 lens 1800/3288 e 0 to 0 dl 1576155647 ref 2 fl Interpret:/0/0 rc 0/0
[166573.653263] Lustre: 39185:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 9 previous similar messages
[166588.472015] LustreError: dumping log to /tmp/lustre-log.1576155657.39820
[166651.839470] Lustre: fir-MDT0003: haven't heard from client 27dd63c4-0630-b8af-eb2d-2f38c1747230 (at 10.8.19.5@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389fa2c00, cur 1576155721 expire 1576155571 last 1576155494
[166666.296975] LustreError: dumping log to /tmp/lustre-log.1576155735.84583
[166686.777258] LustreError: dumping log to /tmp/lustre-log.1576155755.39855
[166748.217971] LustreError: dumping log to /tmp/lustre-log.1576155817.39187
[166789.178493] Pid: 39794, comm: mdt03_022 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[166789.188758] Call Trace:
[166789.191315] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[166789.198358] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[166789.205672] [<ffffffffc1546438>] mdt_object_local_lock+0x438/0xb20 [mdt]
[166789.212598] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[166789.219709] [<ffffffffc1546ea0>] mdt_object_lock+0x20/0x30 [mdt]
[166789.225942] [<ffffffffc157141a>] mdt_reint_open+0x106a/0x3240 [mdt]
[166789.232454] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt]
[166789.238591] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[166789.245263] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt]
[166789.251573] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[166789.258149] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[166789.265014] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[166789.272225] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[166789.278476] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[166789.285519] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[166789.293322] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[166789.299751] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[166789.304754] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[166789.311331] [<ffffffffffffffff>] 0xffffffffffffffff
[166789.316442] LustreError: dumping log to /tmp/lustre-log.1576155858.39794
[166800.290616] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576155268/real 1576155268] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576155869 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[166800.318820] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 10 previous similar messages
[166800.328741] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[166800.344918] Lustre: Skipped 3 previous similar messages
[166800.350432] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[166800.360360] Lustre: Skipped 152 previous similar messages
[166826.532937] LustreError: 84596:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576155595, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff94fee870a640/0x5f9f636a30e14de4 lrc: 3/1,0 mode: --/PR res: [0x2800347aa:0xf864:0x0].0x0 bits 0x13/0x0 rrc: 96 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 84596 timeout: 0 lvb_type: 0
[166826.572577] LustreError: 84596:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 19 previous similar messages
[166829.023972] LustreError: 39476:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005c-osc-MDT0003: cannot cleanup orphans: rc = -107
[166829.037088] LustreError: 39476:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 2 previous similar messages
[166867.003433] Pid: 84645, comm: mdt01_068 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[166867.013696] Call Trace:
[166867.016246] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[166867.023290] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[166867.030586] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[166867.037504] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[166867.044608] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[166867.051613] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[166867.058282] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[166867.064869] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[166867.071726] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[166867.078922] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[166867.085195] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[166867.092224] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[166867.100047] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[166867.106467] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[166867.111479] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[166867.118045] [<ffffffffffffffff>] 0xffffffffffffffff
[166867.123175] LustreError: dumping log to /tmp/lustre-log.1576155936.84645
[166867.130515] Pid: 84751, comm: mdt02_065 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[166867.140786] Call Trace:
[166867.143329] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[166867.150352] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[166867.157638] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[166867.164548] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[166867.171643] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[166867.178641] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[166867.185302] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[166867.191866] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[166867.198719] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[166867.205905] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[166867.212160] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[166867.219183] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[166867.227000] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[166867.233415] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[166867.238447] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[166867.245002] [<ffffffffffffffff>] 0xffffffffffffffff
[166965.308628] Pid: 84716, comm: mdt00_070 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[166965.318892] Call Trace:
[166965.321454] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30
[166965.328276] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[166965.335731] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod]
[166965.342646] [<ffffffffc1657c7a>] lod_declare_instantiate_components+0x9a/0x1d0 [lod]
[166965.350607] [<ffffffffc166a605>] lod_declare_layout_change+0xb65/0x10f0 [lod]
[166965.357959] [<ffffffffc16dd2b2>] mdd_declare_layout_change+0x62/0x120 [mdd]
[166965.365150] [<ffffffffc16e6022>] mdd_layout_change+0x882/0x1000 [mdd]
[166965.371802] [<ffffffffc15472f7>] mdt_layout_change+0x337/0x430 [mdt]
[166965.378392] [<ffffffffc154f59e>] mdt_intent_layout+0x7ee/0xcc0 [mdt]
[166965.384980] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[166965.391566] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[166965.398424] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[166965.405635] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[166965.411894] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[166965.418939] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[166965.426743] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[166965.433185] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[166965.438189] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[166965.444764] [<ffffffffffffffff>] 0xffffffffffffffff
[166965.449895] LustreError: dumping log to /tmp/lustre-log.1576156034.84716
[167026.749372] LNet: Service thread pid 39831 was inactive for 1203.70s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[167026.766486] LNet: Skipped 8 previous similar messages
[167026.771631] Pid: 39831, comm: mdt00_030 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[167026.781910] Call Trace:
[167026.784467] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[167026.791509] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[167026.798799] [<ffffffffc1546438>] mdt_object_local_lock+0x438/0xb20 [mdt]
[167026.805717] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[167026.812815] [<ffffffffc1546ea0>] mdt_object_lock+0x20/0x30 [mdt]
[167026.819035] [<ffffffffc157141a>] mdt_reint_open+0x106a/0x3240 [mdt]
[167026.825519] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt]
[167026.831685] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[167026.838338] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt]
[167026.844652] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[167026.851216] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[167026.858068] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[167026.865257] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[167026.871518] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[167026.878545] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[167026.886362] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[167026.892789] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[167026.897807] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[167026.904370] [<ffffffffffffffff>] 0xffffffffffffffff
[167026.909490] LustreError: dumping log to /tmp/lustre-log.1576156096.39831
[167064.372957] LNet: Service thread pid 39734 completed after 2896.97s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[167064.389292] LNet: Skipped 16 previous similar messages
[167108.670378] Pid: 84488, comm: mdt00_049 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[167108.680634] Call Trace:
[167108.683184] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[167108.690226] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[167108.697545] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[167108.704466] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[167108.711572] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[167108.718577] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[167108.725248] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[167108.731817] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[167108.738672] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[167108.745868] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[167108.752155] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[167108.759188] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[167108.767004] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[167108.773419] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[167108.778436] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[167108.785000] [<ffffffffffffffff>] 0xffffffffffffffff
[167108.790119] LustreError: dumping log to /tmp/lustre-log.1576156177.84488
[167116.862480] Pid: 39860, comm: mdt03_040 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[167116.872742] Call Trace:
[167116.875300] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[167116.882343] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[167116.889655] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[167116.896582] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[167116.903686] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[167116.910691] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[167116.917364] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[167116.923935] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[167116.930796] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[167116.937993] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[167116.944259] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[167116.951302] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[167116.959120] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[167116.965535] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[167116.970552] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[167116.977116] [<ffffffffffffffff>] 0xffffffffffffffff
[167116.982244] LustreError: dumping log to /tmp/lustre-log.1576156186.39860
[167120.958533] Pid: 39796, comm: mdt02_026 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[167120.968793] Call Trace:
[167120.971344] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[167120.978366] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[167120.985665] [<ffffffffc1546438>] mdt_object_local_lock+0x438/0xb20 [mdt]
[167120.992588] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[167120.999697] [<ffffffffc1546ea0>] mdt_object_lock+0x20/0x30 [mdt]
[167121.005916] [<ffffffffc157141a>] mdt_reint_open+0x106a/0x3240 [mdt]
[167121.012411] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt]
[167121.018542] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[167121.025205] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt]
[167121.031510] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[167121.038085] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[167121.044927] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[167121.052138] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[167121.058377] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[167121.065410] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[167121.073214] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[167121.079641] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[167121.084637] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[167121.091202] [<ffffffffffffffff>] 0xffffffffffffffff
[167121.096303] LustreError: dumping log to /tmp/lustre-log.1576156190.39796
[167137.842524] Lustre: fir-MDT0003: Client 4e97c29c-283b-4253-402d-db9d46beedd7 (at 10.9.101.39@o2ib4) reconnecting
[167137.852806] Lustre: Skipped 150 previous similar messages
[167166.015075] Pid: 39190, comm: mdt01_002 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[167166.025339] Call Trace:
[167166.027897] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[167166.034938] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[167166.042236] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[167166.049152] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[167166.056256] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[167166.063267] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[167166.069944] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[167166.076531] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[167166.083395] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[167166.090590] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[167166.096863] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[167166.103894] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[167166.111708] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[167166.118125] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[167166.123140] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[167166.129702] [<ffffffffffffffff>] 0xffffffffffffffff
[167166.134827] LustreError: dumping log to /tmp/lustre-log.1576156235.39190
[167166.142226] Pid: 84276, comm: mdt02_043 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[167166.152505] Call Trace:
[167166.155048] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30
[167166.161871] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[167166.169324] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod]
[167166.176234] [<ffffffffc1657c7a>] lod_declare_instantiate_components+0x9a/0x1d0 [lod]
[167166.184197] [<ffffffffc166a605>] lod_declare_layout_change+0xb65/0x10f0 [lod]
[167166.191538] [<ffffffffc16dd2b2>] mdd_declare_layout_change+0x62/0x120 [mdd]
[167166.198729] [<ffffffffc16e6022>] mdd_layout_change+0x882/0x1000 [mdd]
[167166.205379] [<ffffffffc15472f7>] mdt_layout_change+0x337/0x430 [mdt]
[167166.211955] [<ffffffffc154f59e>] mdt_intent_layout+0x7ee/0xcc0 [mdt]
[167166.218518] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[167166.225103] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[167166.231952] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[167166.239159] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[167166.245411] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[167166.252454] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[167166.260258] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[167166.266698] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[167166.271699] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[167166.278279] [<ffffffffffffffff>] 0xffffffffffffffff
[167174.207175] LNet: Service thread pid 82364 was inactive for 1202.15s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[167174.220214] LNet: Skipped 6 previous similar messages
[167174.225360] LustreError: dumping log to /tmp/lustre-log.1576156243.82364
[167198.783458] LustreError: dumping log to /tmp/lustre-log.1576156267.85077
[167214.655659] Lustre: 84790:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff952238361b00 x1648702464453552/t0(0) o101->8960bab2-ad07-45af-2f53-f9cf8eadf367@10.9.109.70@o2ib4:168/0 lens 576/3264 e 0 to 0 dl 1576156288 ref 2 fl Interpret:/0/0 rc 0/0
[167214.684986] Lustre: 84790:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 12 previous similar messages
[167256.128138] LustreError: dumping log to /tmp/lustre-log.1576156325.39727
[167260.224196] LustreError: dumping log to /tmp/lustre-log.1576156329.39763
[167268.416290] LustreError: dumping log to /tmp/lustre-log.1576156337.39822
[167325.832172] Lustre: fir-MDT0003: haven't heard from client d4e78436-48cb-55f2-4bab-88419072f51d (at 10.9.103.16@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389fa7c00, cur 1576156395 expire 1576156245 last 1576156168
[167350.337300] LustreError: dumping log to /tmp/lustre-log.1576156419.39778
[167366.721502] LustreError: dumping log to /tmp/lustre-log.1576156435.39781
[167373.346598] LustreError: 39464:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST0056-osc-MDT0003: cannot cleanup orphans: rc = -107
[167373.359723] LustreError: 39464:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 3 previous similar messages
[167399.489901] LustreError: dumping log to /tmp/lustre-log.1576156468.84965
[167401.553932] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576155869/real 1576155869] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576156470 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[167401.582137] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 12 previous similar messages
[167401.592057] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[167401.608235] Lustre: Skipped 4 previous similar messages
[167401.613716] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[167401.623634] Lustre: Skipped 156 previous similar messages
[167419.970151] Pid: 39751, comm: mdt03_011 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[167419.980409] Call Trace:
[167419.982961] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[167419.989996] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[167419.997292] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[167420.004210] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[167420.011313] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[167420.018319] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[167420.024989] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[167420.031561] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[167420.038416] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[167420.045610] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[167420.051872] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[167420.058896] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[167420.066712] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[167420.073128] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[167420.078127] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[167420.084712] [<ffffffffffffffff>] 0xffffffffffffffff
[167420.089821] LustreError: dumping log to /tmp/lustre-log.1576156489.39751
[167441.500416] LustreError: 39807:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576156210, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff9513a994fbc0/0x5f9f636a30e60fcf lrc: 3/1,0 mode: --/PR res: [0x2800347aa:0xf864:0x0].0x0 bits 0x13/0x0 rrc: 93 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39807 timeout: 0 lvb_type: 0
[167441.540060] LustreError: 39807:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 23 previous similar messages
[167460.930634] Pid: 39780, comm: mdt03_017 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[167460.940891] Call Trace:
[167460.943440] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[167460.950468] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[167460.957783] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[167460.964700] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[167460.971801] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[167460.978809] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[167460.985480] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[167460.992050] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[167460.998904] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[167461.006100] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[167461.012364] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[167461.019411] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[167461.027227] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[167461.033643] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[167461.038660] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[167461.045222] [<ffffffffffffffff>] 0xffffffffffffffff
[167461.050344] LustreError: dumping log to /tmp/lustre-log.1576156530.39780
[167461.057619] Pid: 39753, comm: mdt03_012 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[167461.067916] Call Trace:
[167461.070467] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[167461.077480] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[167461.084782] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[167461.091702] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[167461.098805] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[167461.105811] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[167461.112474] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[167461.119037] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[167461.125890] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[167461.133088] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[167461.139342] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[167461.146364] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[167461.154192] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[167461.160611] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[167461.165610] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[167461.172179] [<ffffffffffffffff>] 0xffffffffffffffff
[167518.275458] Pid: 39777, comm: mdt03_016 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[167518.285717] Call Trace:
[167518.288267] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[167518.295300] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[167518.302600] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[167518.309513] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[167518.316621] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[167518.323640] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[167518.330302] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[167518.336874] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[167518.343729] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[167518.350923] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[167518.357197] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[167518.364246] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[167518.372070] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[167518.378495] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[167518.383510] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[167518.390096] [<ffffffffffffffff>] 0xffffffffffffffff
[167518.395219] LustreError: dumping log to /tmp/lustre-log.1576156587.39777
[167567.427925] Pid: 39856, comm: mdt03_038 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[167567.438182] Call Trace:
[167567.440733] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[167567.447768] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[167567.455054] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[167567.461975] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[167567.469078] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[167567.476082] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[167567.482754] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[167567.489325] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[167567.496179] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[167567.503376] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[167567.509641] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[167567.516668] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[167567.524484] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[167567.530899] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[167567.535901] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[167567.542469] [<ffffffffffffffff>] 0xffffffffffffffff
[167567.547578] LustreError: dumping log to /tmp/lustre-log.1576156636.39856
[167664.380322] LNet: Service thread pid 39713 completed after 3494.64s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[167664.396655] LNet: Skipped 16 previous similar messages
[167665.733132] LustreError: dumping log to /tmp/lustre-log.1576156734.39813
[167727.173877] LNet: Service thread pid 84596 was inactive for 1200.62s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[167727.190991] LNet: Skipped 10 previous similar messages
[167727.196230] Pid: 84596, comm: mdt00_060 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[167727.206512] Call Trace:
[167727.209068] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[167727.216109] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[167727.223407] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[167727.230332] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[167727.237436] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[167727.244441] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[167727.251113] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[167727.257675] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[167727.264537] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[167727.271750] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[167727.278012] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[167727.285047] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[167727.292868] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[167727.299285] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[167727.304320] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[167727.310916] [<ffffffffffffffff>] 0xffffffffffffffff
[167727.316031] LustreError: dumping log to /tmp/lustre-log.1576156796.84596
[167747.654128] Pid: 39721, comm: mdt00_009 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[167747.664384] Call Trace:
[167747.666942] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[167747.673985] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[167747.681283] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[167747.688199] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[167747.695304] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[167747.702308] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[167747.708971] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[167747.715537] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[167747.722402] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[167747.729615] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[167747.735890] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[167747.742911] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[167747.750736] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[167747.757151] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[167747.762166] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[167747.768734] [<ffffffffffffffff>] 0xffffffffffffffff
[167747.773859] LustreError: dumping log to /tmp/lustre-log.1576156816.39721
[167753.770382] Lustre: fir-MDT0003: Client a9d053ed-e7f0-2ae2-0053-2ec126b51fb7 (at 10.9.107.72@o2ib4) reconnecting
[167753.780645] Lustre: Skipped 154 previous similar messages
[167768.134377] Pid: 39244, comm: mdt02_005 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[167768.144635] Call Trace:
[167768.147187] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[167768.154219] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[167768.161512] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[167768.168425] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[167768.175522] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[167768.182518] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[167768.189178] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[167768.195743] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[167768.202597] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[167768.209791] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[167768.216056] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[167768.223077] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[167768.230895] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[167768.237309] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[167768.242308] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[167768.248879] [<ffffffffffffffff>] 0xffffffffffffffff
[167768.253985] LustreError: dumping log to /tmp/lustre-log.1576156837.39244
[167768.261349] Pid: 39759, comm: mdt01_024 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[167768.271624] Call Trace:
[167768.274165] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[167768.281181] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[167768.288469] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[167768.295378] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[167768.302473] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[167768.309468] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[167768.316134] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[167768.322695] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[167768.329549] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[167768.336749] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[167768.343011] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[167768.350029] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[167768.357846] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[167768.364264] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[167768.369252] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[167768.375821] [<ffffffffffffffff>] 0xffffffffffffffff
[167768.380912] Pid: 85079, comm: mdt00_074 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[167768.391177] Call Trace:
[167768.393726] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[167768.400754] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[167768.408044] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[167768.414954] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[167768.422046] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[167768.429043] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[167768.435710] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[167768.442269] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[167768.449124] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[167768.456311] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[167768.462565] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[167768.469603] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[167768.477419] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[167768.483836] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[167768.488844] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[167768.495399] [<ffffffffffffffff>] 0xffffffffffffffff
[167792.710676] LNet: Service thread pid 39732 was inactive for 1203.60s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[167792.723717] LNet: Skipped 11 previous similar messages
[167792.728952] LustreError: dumping log to /tmp/lustre-log.1576156861.39732
[167841.863298] Lustre: 82358:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (3/-152), not sending early reply req@ffff9513b2ad6780 x1649329211504384/t0(0) o101->882378af-0b41-73ee-5c10-5cc51464645c@10.9.108.22@o2ib4:39/0 lens 1792/3288 e 0 to 0 dl 1576156914 ref 2 fl Interpret:/0/0 rc 0/0
[167841.892630] Lustre: 82358:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 10 previous similar messages
[167866.439577] LustreError: dumping log to /tmp/lustre-log.1576156935.39760
[167927.880334] LustreError: dumping log to /tmp/lustre-log.1576156997.39243
[167959.843508] Lustre: fir-MDT0003: haven't heard from client a322cdb3-da3a-2edb-3b54-5c31a21230cc (at 10.9.104.20@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389a0b000, cur 1576157029 expire 1576156879 last 1576156802
[167968.840822] LustreError: dumping log to /tmp/lustre-log.1576157038.39729
[167972.936877] LustreError: dumping log to /tmp/lustre-log.1576157042.84968
[168002.209236] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576156470/real 1576156470] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576157071 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[168002.237435] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 9 previous similar messages
[168002.247269] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[168002.263436] Lustre: Skipped 4 previous similar messages
[168002.268932] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[168002.278872] Lustre: Skipped 156 previous similar messages
[168067.146040] Pid: 84761, comm: mdt02_067 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[168067.156300] Call Trace:
[168067.158859] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[168067.165900] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[168067.173212] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[168067.180149] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[168067.187252] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[168067.194251] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[168067.200919] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[168067.207492] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[168067.214352] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [168067.221549] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [168067.227815] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [168067.234844] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [168067.242662] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [168067.249091] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [168067.254107] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [168067.260691] [<ffffffffffffffff>] 0xffffffffffffffff [168067.265827] LustreError: dumping log to /tmp/lustre-log.1576157136.84761 [168076.932154] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0003: cannot cleanup orphans: rc = -107 [168076.945279] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 4 previous similar messages [168120.394670] Pid: 84967, comm: mdt01_082 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [168120.404930] Call Trace: [168120.407486] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [168120.414528] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [168120.421916] [<ffffffffc1546438>] mdt_object_local_lock+0x438/0xb20 [mdt] [168120.428864] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [168120.436050] [<ffffffffc1546ea0>] mdt_object_lock+0x20/0x30 [mdt] [168120.442280] [<ffffffffc157141a>] mdt_reint_open+0x106a/0x3240 [mdt] [168120.448807] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [168120.454950] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [168120.461613] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [168120.467915] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [168120.474489] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [168120.481332] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [168120.488532] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [168120.494796] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [168120.501835] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [168120.509638] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [168120.516063] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [168120.521061] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [168120.527616] [<ffffffffffffffff>] 0xffffffffffffffff [168120.532734] LustreError: dumping log to /tmp/lustre-log.1576157189.84967 [168120.540095] Pid: 39754, comm: mdt01_022 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [168120.550363] Call Trace: [168120.552907] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [168120.559920] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [168120.567224] [<ffffffffc1546438>] mdt_object_local_lock+0x438/0xb20 [mdt] [168120.574136] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [168120.581231] [<ffffffffc1546ea0>] mdt_object_lock+0x20/0x30 [mdt] [168120.587460] [<ffffffffc157141a>] mdt_reint_open+0x106a/0x3240 [mdt] [168120.594048] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [168120.600196] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [168120.606923] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [168120.613232] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [168120.619816] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [168120.626677] [<ffffffffc0fcb336>] 
[168120.633888] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[168120.640126] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[168120.647146] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[168120.654961] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[168120.661377] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[168120.666368] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[168120.672939] [<ffffffffffffffff>] 0xffffffffffffffff
[168153.544081] LustreError: 39240:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576156922, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff95237a0b3f00/0x5f9f636a30eb6cb6 lrc: 3/1,0 mode: --/PR res: [0x2800347aa:0xf864:0x0].0x0 bits 0x13/0x0 rrc: 77 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39240 timeout: 0 lvb_type: 0
[168153.583722] LustreError: 39240:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 17 previous similar messages
[168165.451225] Pid: 39728, comm: mdt02_010 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[168165.461482] Call Trace:
[168165.464036] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[168165.471077] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[168165.478374] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[168165.485290] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[168165.492394] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[168165.499400] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[168165.506084] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[168165.512661] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[168165.519512] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[168165.526709] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[168165.532980] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[168165.540013] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[168165.547827] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[168165.554244] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[168165.559257] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[168165.565822] [<ffffffffffffffff>] 0xffffffffffffffff
[168165.570957] LustreError: dumping log to /tmp/lustre-log.1576157234.39728
[168177.739370] Pid: 84255, comm: mdt03_044 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[168177.749631] Call Trace:
[168177.752188] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[168177.759222] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[168177.766520] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[168177.773436] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[168177.780543] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[168177.787549] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[168177.794220] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[168177.800806] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[168177.807660] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[168177.814856] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[168177.821122] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[168177.828152] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[168177.835967] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[168177.842383] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[168177.847398] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[168177.853961] [<ffffffffffffffff>] 0xffffffffffffffff
[168177.859089] LustreError: dumping log to /tmp/lustre-log.1576157247.84255
[168254.846962] Lustre: fir-MDT0003: haven't heard from client ee45735a-3c72-071c-fe40-2e82d3a751bd (at 10.8.7.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389a86800, cur 1576157324 expire 1576157174 last 1576157097
[168255.564333] LustreError: dumping log to /tmp/lustre-log.1576157324.85064
[168264.387573] LNet: Service thread pid 39846 completed after 4093.69s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[168264.403912] LNet: Skipped 7 previous similar messages
[168267.852488] LustreError: dumping log to /tmp/lustre-log.1576157337.39836
[168292.428795] LustreError: dumping log to /tmp/lustre-log.1576157361.85019
[168345.677455] LustreError: dumping log to /tmp/lustre-log.1576157414.39807
[168358.552491] Lustre: fir-MDT0003: Client bc3bfe9e-ae2a-3090-298d-b1536a6ddfe9 (at 10.9.108.59@o2ib4) reconnecting
[168358.562756] Lustre: Skipped 159 previous similar messages
[168366.157711] LustreError: dumping log to /tmp/lustre-log.1576157435.84715
[168439.886614] LNet: Service thread pid 39806 was inactive for 1203.89s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[168439.903724] LNet: Skipped 9 previous similar messages
[168439.908871] Pid: 39806, comm: mdt01_038 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[168439.919141] Call Trace:
[168439.921701] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[168439.928742] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[168439.936053] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[168439.942974] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[168439.950078] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[168439.957084] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[168439.963754] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[168439.970326] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[168439.977186] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[168439.984385] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[168439.990657] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[168439.997688] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[168440.005519] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[168440.011934] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[168440.016952] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[168440.023516] [<ffffffffffffffff>] 0xffffffffffffffff
[168440.028634] LustreError: dumping log to /tmp/lustre-log.1576157509.39806
[168521.295643] Lustre: 84630:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-457), not sending early reply req@ffff94fee1ea6780 x1649369987870592/t0(0) o101->227d7a25-50be-a469-9b6d-83846499cd76@10.8.27.14@o2ib6:720/0 lens 1792/3288 e 0 to 0 dl 1576157595 ref 2 fl Interpret:/0/0 rc 0/0
[168521.324977] Lustre: 84630:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 13 previous similar messages
[168599.632592] Pid: 85020, comm: mdt01_084 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[168599.642854] Call Trace:
[168599.645406] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30
[168599.652229] [<ffffffffc1673537>] lod_qos_statfs_update+0x97/0x2b0 [lod]
[168599.659074] [<ffffffffc16756da>] lod_qos_prep_create+0x16a/0x1890 [lod]
[168599.665898] [<ffffffffc1657c7a>] lod_declare_instantiate_components+0x9a/0x1d0 [lod]
[168599.673861] [<ffffffffc166a605>] lod_declare_layout_change+0xb65/0x10f0 [lod]
[168599.681201] [<ffffffffc16dd2b2>] mdd_declare_layout_change+0x62/0x120 [mdd]
[168599.688394] [<ffffffffc16e6022>] mdd_layout_change+0x882/0x1000 [mdd]
[168599.695048] [<ffffffffc15472f7>] mdt_layout_change+0x337/0x430 [mdt]
[168599.701638] [<ffffffffc154f59e>] mdt_intent_layout+0x7ee/0xcc0 [mdt]
[168599.708199] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[168599.714775] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[168599.721651] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[168599.728861] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[168599.735109] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[168599.742154] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[168599.749957] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[168599.756386] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[168599.761390] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[168599.767964] [<ffffffffffffffff>] 0xffffffffffffffff
[168599.773078] LustreError: dumping log to /tmp/lustre-log.1576157668.85020
[168603.328639] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576157071/real 1576157071] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576157672 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[168603.356838] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15 previous similar messages
[168603.366758] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[168603.382924] Lustre: Skipped 5 previous similar messages
[168603.388407] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[168603.398346] Lustre: Skipped 155 previous similar messages
[168624.208900] Pid: 86140, comm: mdt00_081 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[168624.219162] Call Trace:
[168624.221717] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[168624.228758] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[168624.236063] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[168624.242998] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[168624.250102] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[168624.257110] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[168624.263791] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[168624.270359] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[168624.277222] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[168624.284419] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[168624.290698] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[168624.297727] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[168624.305544] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[168624.311976] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[168624.316992] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[168624.323556] [<ffffffffffffffff>] 0xffffffffffffffff
[168624.328671] LustreError: dumping log to /tmp/lustre-log.1576157693.86140
[168665.169398] Pid: 39763, comm: mdt01_026 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[168665.179656] Call Trace:
[168665.182213] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[168665.189256] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[168665.196569] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[168665.203486] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[168665.210576] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[168665.217581] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[168665.224250] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[168665.230823] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[168665.237676] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[168665.244873] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[168665.251144] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[168665.258175] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[168665.266002] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[168665.272414] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[168665.277430] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[168665.283995] [<ffffffffffffffff>] 0xffffffffffffffff
[168665.289116] LustreError: dumping log to /tmp/lustre-log.1576157734.39763
[168665.296419] Pid: 84834, comm: mdt00_071 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[168665.306693] Call Trace:
[168665.309238] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[168665.316259] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[168665.323544] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[168665.330471] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[168665.337567] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[168665.344565] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[168665.351228] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[168665.357792] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[168665.364643] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[168665.371833] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[168665.378086] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[168665.385111] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[168665.392924] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[168665.399354] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[168665.404348] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[168665.410919] [<ffffffffffffffff>] 0xffffffffffffffff
[168665.416009] LNet: Service thread pid 39779 was inactive for 1201.01s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[168665.429058] LNet: Skipped 20 previous similar messages
[168677.457548] LustreError: dumping log to /tmp/lustre-log.1576157746.39789
[168757.701534] LustreError: 39821:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576157526, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff95338a2486c0/0x5f9f636a30f035a1 lrc: 3/1,0 mode: --/PR res: [0x2800347aa:0xf864:0x0].0x0 bits 0x13/0x0 rrc: 84 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39821 timeout: 0 lvb_type: 0
[168757.741184] LustreError: 39821:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 21 previous similar messages
[168767.570651] Pid: 84830, comm: mdt01_074 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[168767.580915] Call Trace:
[168767.583466] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[168767.590508] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[168767.597809] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[168767.604730] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[168767.611835] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[168767.618840] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[168767.625510] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[168767.632082] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[168767.638946] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[168767.646157] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[168767.652429] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[168767.659460] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[168767.667277] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[168767.673692] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[168767.678707] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[168767.685269] [<ffffffffffffffff>] 0xffffffffffffffff
[168767.690391] LustreError: dumping log to /tmp/lustre-log.1576157836.84830
[168767.697823] Pid: 39714, comm: mdt03_007 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[168767.708113] Call Trace:
[168767.710660] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30
[168767.717484] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[168767.724926] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod]
[168767.731836] [<ffffffffc1657c7a>] lod_declare_instantiate_components+0x9a/0x1d0 [lod]
[168767.739799] [<ffffffffc166a605>] lod_declare_layout_change+0xb65/0x10f0 [lod]
[168767.747142] [<ffffffffc16dd2b2>] mdd_declare_layout_change+0x62/0x120 [mdd]
[168767.754325] [<ffffffffc16e6022>] mdd_layout_change+0x882/0x1000 [mdd]
[168767.760973] [<ffffffffc15472f7>] mdt_layout_change+0x337/0x430 [mdt]
[168767.767550] [<ffffffffc154f59e>] mdt_intent_layout+0x7ee/0xcc0 [mdt]
[168767.774114] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[168767.780704] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[168767.787546] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[168767.794745] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[168767.800990] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[168767.808031] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[168767.815834] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[168767.822264] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[168767.827258] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[168767.833823] [<ffffffffffffffff>] 0xffffffffffffffff
[168767.838922] Pid: 84644, comm: mdt03_058 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[168767.849207] Call Trace:
[168767.851756] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[168767.858785] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[168767.866059] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[168767.872980] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[168767.880063] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[168767.887074] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[168767.893729] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[168767.900306] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[168767.907149] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[168767.914356] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[168767.920599] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[168767.927638] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[168767.935437] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[168767.941865] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[168767.946860] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[168767.953428] [<ffffffffffffffff>] 0xffffffffffffffff
[168767.958519] Pid: 39724, comm: mdt00_011 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[168767.968771] Call Trace:
[168767.971314] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30
[168767.978128] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod]
[168767.985592] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod]
[168767.992499] [<ffffffffc1657c7a>] lod_declare_instantiate_components+0x9a/0x1d0 [lod]
[168768.000461] [<ffffffffc166a605>] lod_declare_layout_change+0xb65/0x10f0 [lod]
[168768.007805] [<ffffffffc16dd2b2>] mdd_declare_layout_change+0x62/0x120 [mdd]
[168768.014986] [<ffffffffc16e6022>] mdd_layout_change+0x882/0x1000 [mdd]
[168768.021647] [<ffffffffc15472f7>] mdt_layout_change+0x337/0x430 [mdt]
[168768.028222] [<ffffffffc154f59e>] mdt_intent_layout+0x7ee/0xcc0 [mdt]
[168768.034783] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[168768.041360] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[168768.048217] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[168768.055418] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[168768.061660] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[168768.068695] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[168768.076498] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[168768.082917] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[168768.087913] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[168768.094479] [<ffffffffffffffff>] 0xffffffffffffffff
[168833.965474] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0003: cannot cleanup orphans: rc = -107
[168833.978591] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 5 previous similar messages
[168853.587705] Pid: 39723, comm: mdt01_013 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[168853.597962] Call Trace:
[168853.600512] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[168853.607555] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[168853.614852] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[168853.621770] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[168853.628862] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[168853.635868] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[168853.642540] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[168853.649104] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[168853.655958] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[168853.663169] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[168853.669441] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[168853.676476] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[168853.684288] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[168853.690706] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[168853.695721] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[168853.702283] [<ffffffffffffffff>] 0xffffffffffffffff
[168853.707405] LustreError: dumping log to /tmp/lustre-log.1576157922.39723
[168864.411124] LNet: Service thread pid 39242 completed after 4687.68s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[168864.427456] LNet: Skipped 21 previous similar messages
[168865.875848] LustreError: dumping log to /tmp/lustre-log.1576157935.85093
[168890.876848] Lustre: fir-MDT0003: haven't heard from client 2d6a9cf7-46ee-4 (at 10.8.7.5@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff95138e170000, cur 1576157960 expire 1576157810 last 1576157733
[168966.845328] Lustre: fir-MDT0003: haven't heard from client 19c70918-a172-38a5-2512-02b987cb686f (at 10.9.116.8@o2ib4) in 152 seconds. I think it's dead, and I am evicting it. exp ffff950389996800, cur 1576158036 expire 1576157886 last 1576157884
[168972.524053] Lustre: fir-MDT0003: Client a3673fdd-c091-ae6f-4781-d627da6f4e17 (at 10.9.117.20@o2ib4) reconnecting
[168972.534313] Lustre: Skipped 155 previous similar messages
[169054.294178] LustreError: dumping log to /tmp/lustre-log.1576158123.39240
[169066.582301] LustreError: dumping log to /tmp/lustre-log.1576158135.84871
[169074.774395] LNet: Service thread pid 89409 was inactive for 1200.30s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[169074.791507] LNet: Skipped 9 previous similar messages
[169074.796654] Pid: 89409, comm: mdt01_087 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[169074.806930] Call Trace:
[169074.809481] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[169074.816521] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[169074.823817] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[169074.830736] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[169074.837831] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[169074.844853] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[169074.851524] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[169074.858098] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[169074.864948] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[169074.872149] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[169074.878418] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[169074.885449] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[169074.893266] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[169074.899680] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[169074.904698] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[169074.911260] [<ffffffffffffffff>] 0xffffffffffffffff
[169074.916396] LustreError: dumping log to /tmp/lustre-log.1576158144.89409
[169126.743042] Lustre: 84807:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-962), not sending early reply req@ffff951389739b00 x1649442866796560/t0(0) o101->1ca33a17-2a16-9d12-d021-e37db0ce1d5c@10.9.117.37@o2ib4:570/0 lens 576/3264 e 0 to 0 dl 1576158200 ref 2 fl Interpret:/0/0 rc 0/0
[169126.772376] Lustre: 84807:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 20 previous similar messages
[169168.983554] Pid: 39186, comm: mdt00_001 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[169168.993814] Call Trace:
[169168.996373] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[169169.003415] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[169169.010730] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[169169.017643] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[169169.024740] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[169169.031746] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[169169.038415] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[169169.044987] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[169169.051841] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[169169.059038] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[169169.065301] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[169169.072333] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[169169.080161] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[169169.086573] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[169169.091586] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[169169.098150] [<ffffffffffffffff>] 0xffffffffffffffff
[169169.103272] LustreError: dumping log to /tmp/lustre-log.1576158238.39186
[169204.551988] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576157672/real 1576157672] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576158273 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[169204.580190] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 12 previous similar messages
[169204.590110] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[169204.606281] Lustre: Skipped 4 previous similar messages
[169204.611755] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[169204.621687] Lustre: Skipped 157 previous similar messages
[169246.808513] Pid: 84110, comm: mdt00_041 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[169246.818770] Call Trace:
[169246.821320] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[169246.828359] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[169246.835663] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[169246.842579] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[169246.849684] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[169246.856692] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[169246.863382] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[169246.869958] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[169246.876818] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[169246.884016] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[169246.890277] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[169246.897310] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[169246.905125] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[169246.911540] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[169246.916557] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[169246.923125] [<ffffffffffffffff>] 0xffffffffffffffff
[169246.928264] LustreError: dumping log to /tmp/lustre-log.1576158316.84110
[169358.863898] LustreError: 39725:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576158128, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff95130db169c0/0x5f9f636a30f4d4bd lrc: 3/1,0 mode: --/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 108 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39725 timeout: 0 lvb_type: 0
[169358.903537] LustreError: 39725:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 13 previous similar messages
[169394.266328] Pid: 82358, comm: mdt01_052 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[169394.276581] Call Trace:
[169394.279135] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[169394.286176] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[169394.293475] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[169394.300392] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[169394.307518] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[169394.314527] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[169394.321206] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[169394.327778] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[169394.334639] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[169394.341836] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[169394.348107] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[169394.355137] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[169394.362954] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[169394.369387] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[169394.374403] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[169394.380965] [<ffffffffffffffff>] 0xffffffffffffffff
[169394.386096] LustreError: dumping log to /tmp/lustre-log.1576158463.82358
[169439.322881] Pid: 39745, comm: mdt00_017 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[169439.333140] Call Trace:
[169439.335692] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[169439.342733] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[169439.350032] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[169439.356948] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[169439.364051] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[169439.371056] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[169439.377741] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[169439.384318] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[169439.391168] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[169439.398366] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[169439.404630] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[169439.411662] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[169439.419476] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[169439.425908] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[169439.430926] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[169439.437489] [<ffffffffffffffff>] 0xffffffffffffffff
[169439.442624] LustreError: dumping log to /tmp/lustre-log.1576158508.39745
[169455.707079] Pid: 39788, comm: mdt00_025 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[169455.717338] Call Trace:
[169455.719896] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[169455.726938] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[169455.734236] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[169455.741153] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[169455.748257] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[169455.755265] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[169455.761940] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[169455.768529] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[169455.775385] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[169455.782580] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[169455.788855] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[169455.795884] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[169455.803696] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[169455.810114] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[169455.815129] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[169455.821692] [<ffffffffffffffff>] 0xffffffffffffffff
[169455.826815] LustreError: dumping log to /tmp/lustre-log.1576158524.39788
[169467.995236] Pid: 39798, comm: mdt03_023 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[169468.005496] Call Trace:
[169468.008054] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[169468.015095] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[169468.022389] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[169468.029310] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[169468.036412] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[169468.043416] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[169468.050088] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[169468.056661] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[169468.063540] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[169468.070737] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[169468.077008] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[169468.084039] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[169468.091862] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[169468.098278] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[169468.103294] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[169468.109858] [<ffffffffffffffff>] 0xffffffffffffffff
[169468.114989] LustreError: dumping log to /tmp/lustre-log.1576158537.39798
[169468.122354] Pid: 85078, comm: mdt00_073 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[169468.132635] Call Trace:
[169468.135178] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[169468.142199] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[169468.149486] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[169468.156397] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[169468.163491] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[169468.170487] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[169468.177149] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[169468.183714] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[169468.190566] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[169468.197769] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[169468.204025] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[169468.211048] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[169468.218864] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[169468.225279] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[169468.230285] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[169468.236832] [<ffffffffffffffff>] 0xffffffffffffffff
[169468.241937] LNet: Service thread pid 84720 was inactive for 1203.81s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[169468.254969] LNet: Skipped 8 previous similar messages
[169564.419572] LNet: Service thread pid 39706 completed after 5381.68s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[169564.435906] LNet: Skipped 11 previous similar messages
[169566.300448] LustreError: dumping log to /tmp/lustre-log.1576158635.82361
[169573.547519] Lustre: fir-MDT0003: Client a3673fdd-c091-ae6f-4781-d627da6f4e17 (at 10.9.117.20@o2ib4) reconnecting
[169573.557781] Lustre: Skipped 161 previous similar messages
[169590.998754] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0003: cannot cleanup orphans: rc = -107
[169591.011873] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 5 previous similar messages
[169603.164890] LustreError: dumping log to /tmp/lustre-log.1576158672.84969
[169660.509605] LustreError: dumping log to /tmp/lustre-log.1576158729.39821
[169664.605646] LustreError: dumping log to /tmp/lustre-log.1576158733.39766
[169669.862586] Lustre: fir-MDT0003: haven't heard from client 75c6d6d0-df4c-7543-716f-77a06d0b577a (at 10.9.103.68@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9523b0050000, cur 1576158739 expire 1576158589 last 1576158512
[169738.334549] LNet: Service thread pid 84790 was inactive for 1200.36s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[169738.351659] LNet: Skipped 7 previous similar messages
[169738.356808] Pid: 84790, comm: mdt02_070 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[169738.367089] Call Trace:
[169738.369642] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[169738.376699] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[169738.383983] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[169738.390900] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[169738.398004] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[169738.405008] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[169738.411676] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[169738.418251] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[169738.425097] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[169738.432307] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[169738.438579] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[169738.445628] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[169738.453437] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[169738.459866] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[169738.464869] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[169738.471453] [<ffffffffffffffff>] 0xffffffffffffffff
[169738.476561] LustreError: dumping log to /tmp/lustre-log.1576158807.84790
[169747.038669] Lustre: 39744:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff950383982400 x1649614697528256/t0(0) o101->5b41e348-8633-a21d-46d9-7918979d9d25@10.9.104.19@o2ib4:436/0 lens 584/3264 e 0 to 0 dl 1576158821 ref 2 fl Interpret:/0/0 rc 0/0
[169747.068006] Lustre: 39744:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 21 previous similar messages
[169767.006906] Pid: 39846, comm: mdt01_048 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[169767.017171] Call Trace:
[169767.019727] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[169767.026769] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[169767.034069] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[169767.040983] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[169767.048090] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[169767.055097] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[169767.061776] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[169767.068346] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[169767.075200] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[169767.082412] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[169767.088685] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[169767.095714] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[169767.103531] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[169767.109946] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[169767.114962] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[169767.121532] [<ffffffffffffffff>] 0xffffffffffffffff
[169767.126655] LustreError: dumping log to /tmp/lustre-log.1576158836.39846
[169767.134087] Pid: 39829, comm: mdt00_029 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[169767.144379] Call Trace:
[169767.146929] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[169767.153945] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[169767.161233] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[169767.168153] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[169767.175246] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[169767.182242] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[169767.188915] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[169767.195478] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[169767.202330] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[169767.209528] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[169767.215805] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[169767.222829] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[169767.230644] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[169767.237062] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[169767.242052] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[169767.248607] [<ffffffffffffffff>] 0xffffffffffffffff
[169767.253710] Pid: 39739, comm: mdt00_015 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[169767.263966] Call Trace:
[169767.266517] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[169767.273531] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[169767.280839] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[169767.287782] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[169767.294860] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[169767.301877] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[169767.308528] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[169767.315104] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[169767.321944] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[169767.329152] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[169767.335404] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[169767.342447] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[169767.350266] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[169767.356682] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[169767.361672] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[169767.368243] [<ffffffffffffffff>] 0xffffffffffffffff
[169779.295055] Pid: 84832, comm: mdt01_076 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[169779.305315] Call Trace:
[169779.307866] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30
[169779.314692] [<ffffffffc1673537>] lod_qos_statfs_update+0x97/0x2b0 [lod]
[169779.321541] [<ffffffffc16756da>] lod_qos_prep_create+0x16a/0x1890 [lod]
[169779.328371] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod]
[169779.335032] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod]
[169779.342390] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod]
[169779.349064] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[169779.357013] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd]
[169779.363590] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd]
[169779.369631] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt]
[169779.376138] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt]
[169779.382278] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[169779.388938] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt]
[169779.395251] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[169779.401826] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[169779.408699] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[169779.415912] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[169779.422159] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[169779.429204] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[169779.437008] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[169779.443432] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[169779.448439] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[169779.455013] [<ffffffffffffffff>] 0xffffffffffffffff
[169779.460125] LustreError: dumping log to /tmp/lustre-log.1576158848.84832
[169805.519380] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576158273/real 1576158273] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576158874 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[169805.547583] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 8 previous similar messages
[169805.557422] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[169805.573590] Lustre: Skipped 2 previous similar messages
[169805.579075] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[169805.589015] Lustre: Skipped 158 previous similar messages
[169864.423817] Lustre: 39527:0:(service.c:2165:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (4186:1490s); client may timeout. req@ffff94fefc326780 x1648771058724240/t0(0) o101->c104d961-ddd0-a5eb-3382-4ecbd88b591c@10.8.18.16@o2ib6:583/0 lens 584/536 e 8 to 0 dl 1576157443 ref 1 fl Complete:/0/0 rc 0/0
[169864.452638] Lustre: 39527:0:(service.c:2165:ptlrpc_server_handle_request()) Skipped 1 previous similar message
[169865.312116] LustreError: dumping log to /tmp/lustre-log.1576158934.82357
[169955.425253] LustreError: dumping log to /tmp/lustre-log.1576159024.84454
[169964.202358] LustreError: 39179:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 100s: evicting client at 10.9.108.66@o2ib4 ns: mdt-fir-MDT0003_UUID lock: ffff95237b6af980/0x5f9f636a30cd2a5d lrc: 3/0,0 mode: PR/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 121 type: IBT flags: 0x60200400000020 nid: 10.9.108.66@o2ib4 remote: 0x48a0cf2e6b7e9285 expref: 34 pid: 39828 timeout: 169961 lvb_type: 0
[169964.467363] LustreError: 39720:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576158733, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff950371b03180/0x5f9f636a30f9c58d lrc: 3/1,0 mode: --/PR res: [0x2800347aa:0xf864:0x0].0x0 bits 0x13/0x0 rrc: 101 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39720 timeout: 0 lvb_type: 0
[169964.507112] LustreError: 39720:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 10 previous similar messages
[169967.713397] LustreError: dumping log to /tmp/lustre-log.1576159036.39529
[170066.018611] Pid: 84630, comm: mdt00_066 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[170066.028869] Call Trace:
[170066.031421] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[170066.038463] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[170066.045764] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[170066.052703] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[170066.059797] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[170066.066794] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[170066.073455] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[170066.080020] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[170066.086872] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[170066.094067] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[170066.100332] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[170066.107363] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[170066.115163] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[170066.121607] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[170066.126613] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[170066.133188] [<ffffffffffffffff>] 0xffffffffffffffff
[170066.138298] LustreError: dumping log to /tmp/lustre-log.1576159135.84630
[170070.114663] Pid: 39741, comm: mdt02_012 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[170070.124920] Call Trace:
[170070.127475] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[170070.134520] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[170070.141815] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[170070.148735] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[170070.155851] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[170070.162851] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[170070.169513] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[170070.176084] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[170070.182948] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[170070.190142] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[170070.196415] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[170070.203448] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[170070.211263] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[170070.217679] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[170070.222707] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[170070.229273] [<ffffffffffffffff>] 0xffffffffffffffff
[170070.234402] LustreError: dumping log to /tmp/lustre-log.1576159139.39741
[170090.891458] Lustre: fir-MDT0003: haven't heard from client 030cce72-3f78-2631-9a21-d2dac6dcbefa (at 10.8.19.1@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389a83800, cur 1576159160 expire 1576159010 last 1576158933
[170094.690979] Pid: 84608, comm: mdt03_051 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[170094.701238] Call Trace:
[170094.703789] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30
[170094.710617] [<ffffffffc1673537>] lod_qos_statfs_update+0x97/0x2b0 [lod]
[170094.717477] [<ffffffffc16756da>] lod_qos_prep_create+0x16a/0x1890 [lod]
[170094.724309] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod]
[170094.730980] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod]
[170094.738332] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod]
[170094.744994] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[170094.752953] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd]
[170094.759528] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd]
[170094.765573] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt]
[170094.772077] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt]
[170094.778216] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[170094.784892] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt]
[170094.791200] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[170094.797781] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[170094.804633] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[170094.811841] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[170094.818093] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[170094.825136] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[170094.832938] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[170094.839369] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[170094.844371] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[170094.850931] [<ffffffffffffffff>] 0xffffffffffffffff
[170094.856063] LustreError: dumping log to /tmp/lustre-log.1576159163.84608
[170168.419875] Pid: 84899, comm: mdt03_065 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[170168.430130] Call Trace:
[170168.432679] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[170168.439715] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[170168.447014] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[170168.453938] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[170168.461043] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[170168.468063] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[170168.474735] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[170168.481307] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[170168.488160] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[170168.495355] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[170168.501619] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[170168.508650] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[170168.516466] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[170168.522880] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[170168.527899] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[170168.534475] [<ffffffffffffffff>] 0xffffffffffffffff
[170168.539599] LustreError: dumping log to /tmp/lustre-log.1576159237.84899
[170174.570672] Lustre: fir-MDT0003: Client a3673fdd-c091-ae6f-4781-d627da6f4e17 (at 10.9.117.20@o2ib4) reconnecting
[170174.580941] Lustre: Skipped 168 previous similar messages
[170176.611985] Pid: 84597, comm: mdt01_066 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[170176.622245] Call Trace:
[170176.624791] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30
[170176.631614] [<ffffffffc1673537>] lod_qos_statfs_update+0x97/0x2b0 [lod]
[170176.638459] [<ffffffffc16756da>] lod_qos_prep_create+0x16a/0x1890 [lod]
[170176.645283] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod]
[170176.651958] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod]
[170176.659305] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod]
[170176.665970] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd]
[170176.673925] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd]
[170176.680500] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd]
[170176.686543] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt]
[170176.693042] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt]
[170176.699178] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[170176.705843] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt]
[170176.712146] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[170176.718742] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[170176.725589] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[170176.732796] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[170176.739038] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[170176.746075] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[170176.753878] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[170176.760304] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[170176.765308] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[170176.771883] [<ffffffffffffffff>] 0xffffffffffffffff
[170176.776976] LustreError: dumping log to /tmp/lustre-log.1576159245.84597
[170201.188280] LNet: Service thread pid 39185 was inactive for 1203.94s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[170201.201316] LNet: Skipped 18 previous similar messages
[170201.206553] LustreError: dumping log to /tmp/lustre-log.1576159270.39185
[170229.860638] LustreError: dumping log to /tmp/lustre-log.1576159299.84182
[170262.629044] LustreError: dumping log to /tmp/lustre-log.1576159331.39725
[170264.429346] LNet: Service thread pid 84721 completed after 6099.89s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [170264.445699] LNet: Skipped 78 previous similar messages [170266.725104] LustreError: dumping log to /tmp/lustre-log.1576159335.39734 [170283.868023] Lustre: fir-MDT0003: haven't heard from client 7e6b1bcc-06cc-6146-e31c-86eefaf425fd (at 10.9.101.53@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389991800, cur 1576159353 expire 1576159203 last 1576159126 [170283.889927] Lustre: Skipped 1 previous similar message [170336.357953] LustreError: dumping log to /tmp/lustre-log.1576159405.84278 [170348.032108] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0003: cannot cleanup orphans: rc = -107 [170348.045229] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 5 previous similar messages [170353.126171] Lustre: 84789:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-887), not sending early reply req@ffff952124a37080 x1649316348406560/t0(0) o101->cded0104-b7e2-3351-ef3d-a03eb9e0010a@10.9.108.66@o2ib4:287/0 lens 592/3264 e 0 to 0 dl 1576159427 ref 2 fl Interpret:/0/0 rc 0/0 [170353.155505] Lustre: 84789:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 31 previous similar messages [170365.030305] LustreError: dumping log to /tmp/lustre-log.1576159434.39844 [170369.126355] LNet: Service thread pid 84619 was inactive for 1204.67s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [170369.143472] LNet: Skipped 9 previous similar messages [170369.148618] Pid: 84619, comm: mdt03_056 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [170369.158895] Call Trace: [170369.161443] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [170369.168479] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [170369.175780] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [170369.182706] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [170369.189812] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [170369.196812] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [170369.203472] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [170369.210036] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [170369.216888] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [170369.224077] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [170369.230342] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [170369.237377] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [170369.245198] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [170369.251611] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [170369.256626] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [170369.263189] [<ffffffffffffffff>] 0xffffffffffffffff [170369.268310] LustreError: dumping log to /tmp/lustre-log.1576159438.84619 [170377.318471] Pid: 84836, comm: mdt03_062 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [170377.328730] Call Trace: [170377.331291] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [170377.338323] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [170377.345635] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [170377.352552] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] 
[170377.359655] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [170377.366662] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [170377.373332] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [170377.379904] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [170377.386757] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [170377.393953] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [170377.400219] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [170377.407248] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [170377.415065] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [170377.421496] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [170377.426511] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [170377.433074] [<ffffffffffffffff>] 0xffffffffffffffff [170377.438197] LustreError: dumping log to /tmp/lustre-log.1576159446.84836 [170406.078815] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576158874/real 1576158874] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576159475 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [170406.107021] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15 previous similar messages [170406.116940] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [170406.133112] Lustre: Skipped 5 previous similar messages [170406.138601] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [170406.148531] Lustre: Skipped 128 previous similar messages [170459.239481] Pid: 39242, comm: mdt01_004 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [170459.249739] Call Trace: [170459.252298] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30 [170459.259129] [<ffffffffc1673537>] lod_qos_statfs_update+0x97/0x2b0 [lod] [170459.265980] [<ffffffffc16756da>] lod_qos_prep_create+0x16a/0x1890 [lod] [170459.272807] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod] [170459.279467] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod] [170459.286812] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod] [170459.293487] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [170459.301442] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd] [170459.308016] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd] [170459.314060] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt] [170459.320567] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [170459.326705] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [170459.333376] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [170459.339679] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [170459.346263] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [170459.353122] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [170459.360344] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [170459.366607] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [170459.373651] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [170459.381456] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [170459.387889] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [170459.392894] [<ffffffff89977c24>] 
ret_from_fork_nospec_begin+0xe/0x21 [170459.399470] [<ffffffffffffffff>] 0xffffffffffffffff [170459.404578] LustreError: dumping log to /tmp/lustre-log.1576159528.39242 [170464.448100] Lustre: 84646:0:(service.c:2165:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (1511:3903s); client may timeout. req@ffff95138bb33600 x1649154887214032/t584152454417(0) o101->295209bb-0224-d868-bd7c-cd75c3b19a1c@10.8.18.20@o2ib6:264/0 lens 1800/904 e 0 to 0 dl 1576155630 ref 1 fl Complete:/0/0 rc 0/0 [170467.431565] Pid: 39713, comm: mdt01_009 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [170467.441829] Call Trace: [170467.444394] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30 [170467.451220] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [170467.458673] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod] [170467.465589] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod] [170467.472251] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod] [170467.479594] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod] [170467.486257] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [170467.494216] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd] [170467.500792] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd] [170467.506851] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt] [170467.513359] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [170467.519498] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [170467.526169] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [170467.532472] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [170467.539044] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [170467.545904] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [170467.553117] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [170467.559375] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [170467.566416] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [170467.574220] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [170467.580663] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [170467.585669] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [170467.592244] [<ffffffffffffffff>] 0xffffffffffffffff [170467.597354] LustreError: dumping log to /tmp/lustre-log.1576159536.39713 [170467.604697] Pid: 39785, comm: mdt01_032 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [170467.615000] Call Trace: [170467.617549] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [170467.624563] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [170467.631852] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [170467.638776] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [170467.645870] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [170467.652870] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [170467.659531] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [170467.666095] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [170467.672948] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [170467.680137] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [170467.686388] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [170467.693415] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [170467.701228] [<ffffffffc1002bac>] 
ptlrpc_main+0xb2c/0x1460 [ptlrpc] [170467.707658] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [170467.712667] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [170467.719231] [<ffffffffffffffff>] 0xffffffffffffffff [170508.392078] LustreError: dumping log to /tmp/lustre-log.1576159577.86141 [170553.448711] LustreError: dumping log to /tmp/lustre-log.1576159622.84658 [170564.449189] Lustre: 84965:0:(service.c:2165:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (3779:587s); client may timeout. req@ffff95330df1ba80 x1652361348225856/t584152458069(0) o101->8442b5f1-7da8-4@10.9.104.25@o2ib4:657/0 lens 1800/904 e 0 to 0 dl 1576159046 ref 1 fl Complete:/0/0 rc 0/0 [170564.477230] Lustre: 84965:0:(service.c:2165:ptlrpc_server_handle_request()) Skipped 1 previous similar message [170570.007861] LustreError: 84958:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576159339, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff9533ad7a6540/0x5f9f636a30ff157e lrc: 3/0,1 mode: --/PW res: [0x28003688b:0x168e:0x0].0x93746263 bits 0x2/0x0 rrc: 5 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 84958 timeout: 0 lvb_type: 0 [170570.047933] LustreError: 84958:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 54 previous similar messages [170573.928879] LustreError: dumping log to /tmp/lustre-log.1576159643.39715 [170776.445132] Lustre: fir-MDT0003: Client 24fab89a-6f6a-550a-7225-4734c7f7b849 (at 10.8.27.12@o2ib6) reconnecting [170776.455307] Lustre: Skipped 144 previous similar messages [170854.880752] Lustre: fir-MDT0003: haven't heard from client 7ac0db55-de36-c1c6-f1a9-d7191d6b9947 (at 10.9.103.29@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389e8dc00, cur 1576159924 expire 1576159774 last 1576159697 [170864.480605] LNet: Service thread pid 84610 completed after 6693.86s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). 
[170864.496935] LNet: Skipped 24 previous similar messages [170868.844521] Pid: 39720, comm: mdt02_008 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [170868.854775] Call Trace: [170868.857328] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [170868.864369] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [170868.871667] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [170868.878583] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [170868.885687] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [170868.892691] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [170868.899364] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [170868.905934] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [170868.912798] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [170868.920009] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [170868.926281] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [170868.933314] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [170868.941129] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [170868.947545] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [170868.952559] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [170868.959122] [<ffffffffffffffff>] 0xffffffffffffffff [170868.964252] LustreError: dumping log to /tmp/lustre-log.1576159938.39720 [170889.324767] Pid: 39814, comm: mdt01_040 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [170889.335039] Call Trace: [170889.337593] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [170889.344633] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [170889.351965] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [170889.358881] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [170889.365985] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [170889.372983] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [170889.379652] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [170889.386225] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [170889.393087] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [170889.400283] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [170889.406568] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [170889.413597] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [170889.421416] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [170889.427835] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [170889.432851] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [170889.439413] [<ffffffffffffffff>] 0xffffffffffffffff [170889.444532] LustreError: dumping log to /tmp/lustre-log.1576159958.39814 [170913.901075] Pid: 89444, comm: mdt01_088 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [170913.911334] Call Trace: [170913.913883] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [170913.920928] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [170913.928227] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [170913.935141] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [170913.942246] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [170913.949250] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [170913.955923] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 
[mdt] [170913.962492] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [170913.969348] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [170913.976543] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [170913.982819] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [170913.989838] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [170913.997651] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [170914.004069] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [170914.009083] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [170914.015647] [<ffffffffffffffff>] 0xffffffffffffffff [170914.020770] LustreError: dumping log to /tmp/lustre-log.1576159983.89444 [171007.046228] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576159475/real 1576159475] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576160076 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [171007.074433] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 9 previous similar messages [171007.084264] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [171007.100433] Lustre: Skipped 4 previous similar messages [171007.105921] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [171007.115857] Lustre: Skipped 160 previous similar messages [171014.510345] Lustre: 84957:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff95331b74da00 x1649045963991616/t0(0) o101->5aa7d6de-875b-7f1e-fa2e-01fb0b68841a@10.9.108.1@o2ib4:193/0 lens 1808/3288 e 0 to 0 dl 1576160088 ref 2 fl Interpret:/0/0 rc 0/0 [171014.539673] Lustre: 84957:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 73 previous similar messages [171065.454975] LNet: Service thread pid 84611 was inactive for 1201.01s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [171065.472087] LNet: Skipped 7 previous similar messages [171065.477233] Pid: 84611, comm: mdt03_054 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [171065.487532] Call Trace: [171065.490088] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [171065.497129] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [171065.504427] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [171065.511354] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [171065.518458] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [171065.525461] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [171065.532133] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [171065.538703] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [171065.545553] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [171065.552774] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [171065.559038] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [171065.566080] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [171065.573884] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [171065.580312] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [171065.585316] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [171065.591876] [<ffffffffffffffff>] 0xffffffffffffffff [171065.596999] LustreError: dumping log to /tmp/lustre-log.1576160134.84611 [171065.604462] Pid: 39851, comm: mdt03_036 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [171065.614739] Call Trace: [171065.617284] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [171065.624305] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [171065.631603] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [171065.638515] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [171065.645624] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [171065.652620] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [171065.659282] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [171065.665846] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [171065.672703] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [171065.679895] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [171065.686165] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [171065.693190] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [171065.701007] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [171065.707421] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [171065.712427] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [171065.718991] [<ffffffffffffffff>] 0xffffffffffffffff [171065.724088] LNet: Service thread pid 39719 was inactive for 1201.28s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
[171065.737116] LNet: Skipped 9 previous similar messages [171069.550994] LustreError: dumping log to /tmp/lustre-log.1576160138.84835 [171073.647045] LustreError: dumping log to /tmp/lustre-log.1576160142.39782 [171077.743091] LustreError: dumping log to /tmp/lustre-log.1576160146.84807 [171105.065446] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0003: cannot cleanup orphans: rc = -107 [171105.078569] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 5 previous similar messages [171164.484457] Lustre: 39747:0:(service.c:2165:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (3344:3130s); client may timeout. req@ffff9513bea67980 x1648792099283056/t584152480277(0) o101->89a72012-e97c-d585-565d-f62ed54d0fcf@10.8.7.2@o2ib6:572/0 lens 1800/904 e 0 to 0 dl 1576157103 ref 1 fl Complete:/0/0 rc 0/0 [171164.484481] LustreError: 39241:0:(ldlm_lockd.c:1348:ldlm_handle_enqueue0()) ### lock on destroyed export ffff950389812400 ns: mdt-fir-MDT0003_UUID lock: ffff951313a82880/0x5f9f636a30d6677d lrc: 3/0,0 mode: PR/PR res: [0x2800347aa:0xf864:0x0].0x0 bits 0x13/0x0 rrc: 98 type: IBT flags: 0x50200400000020 nid: 10.9.108.66@o2ib4 remote: 0x48a0cf2e6b7e9311 expref: 6 pid: 39241 timeout: 0 lvb_type: 0 [171164.548923] Lustre: 39747:0:(service.c:2165:ptlrpc_server_handle_request()) Skipped 1 previous similar message [171167.856214] LustreError: dumping log to /tmp/lustre-log.1576160236.39746 [171171.952260] Pid: 90038, comm: mdt01_093 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [171171.962525] Call Trace: [171171.965080] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [171171.972123] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [171171.979423] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [171171.986347] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [171171.993451] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [171172.000455] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [171172.007128] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [171172.013698] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [171172.020560] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [171172.027757] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [171172.034035] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [171172.041069] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [171172.048884] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [171172.055299] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [171172.060316] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [171172.066877] [<ffffffffffffffff>] 0xffffffffffffffff [171172.071999] LustreError: dumping log to /tmp/lustre-log.1576160241.90038 [171172.079298] Pid: 82359, comm: mdt01_053 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [171172.089568] Call Trace: [171172.092112] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [171172.099134] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [171172.106421] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [171172.113331] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [171172.120426] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [171172.127424] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [171172.134085] [<ffffffffc154cbb5>] 
mdt_intent_policy+0x435/0xd80 [mdt] [171172.140650] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [171172.147505] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [171172.154689] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [171172.160945] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [171172.167982] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [171172.175797] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [171172.182215] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [171172.187207] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [171172.193775] [<ffffffffffffffff>] 0xffffffffffffffff [171188.336461] Pid: 84499, comm: mdt00_052 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [171188.346720] Call Trace: [171188.349279] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [171188.356328] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [171188.363629] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [171188.370553] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [171188.377669] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [171188.384671] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [171188.391341] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [171188.397915] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [171188.404784] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [171188.411983] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [171188.418264] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [171188.425298] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [171188.433115] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [171188.439553] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [171188.444571] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [171188.451137] [<ffffffffffffffff>] 0xffffffffffffffff [171188.456265] LustreError: dumping log to /tmp/lustre-log.1576160257.84499 [171188.463583] Pid: 39752, comm: mdt02_015 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [171188.473861] Call Trace: [171188.476413] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [171188.483433] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [171188.490730] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [171188.497648] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [171188.504757] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [171188.511758] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [171188.518428] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [171188.524999] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [171188.531840] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [171188.539047] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [171188.545298] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [171188.552347] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [171188.560144] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [171188.566557] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [171188.571585] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [171188.578145] [<ffffffffffffffff>] 0xffffffffffffffff [171188.583250] Pid: 39772, comm: mdt02_020 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [171188.593506] 
Call Trace: [171188.596050] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [171188.603070] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [171188.610358] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [171188.617269] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [171188.624363] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [171188.631359] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [171188.638035] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [171188.644602] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [171188.651461] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [171188.658652] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [171188.664902] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [171188.671945] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [171188.679757] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [171188.686185] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [171188.691180] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [171188.697754] [<ffffffffffffffff>] 0xffffffffffffffff [171241.585122] LustreError: dumping log to /tmp/lustre-log.1576160310.90040 [171267.242437] LustreError: 85019:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576160036, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff9513b185d580/0x5f9f636a31047ca6 lrc: 3/1,0 mode: --/PR res: [0x280036928:0xaa0c:0x0].0x0 bits 0x13/0x0 rrc: 4 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 85019 timeout: 0 lvb_type: 0 [171267.281992] LustreError: 85019:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 6 previous similar messages [171286.641675] LustreError: dumping log to /tmp/lustre-log.1576160355.84577 [171303.025875] LustreError: dumping log to /tmp/lustre-log.1576160372.90005 [171335.794277] LustreError: dumping log to /tmp/lustre-log.1576160404.84964 [171364.486988] Lustre: 39830:0:(service.c:2165:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (6224:947s); client may timeout. req@ffff9533b6ea7500 x1649317735972256/t584152487661(0) o101->3532db27-3550-1319-6c1b-3d6651c2c9af@10.9.108.62@o2ib4:346/0 lens 1800/904 e 4 to 0 dl 1576159486 ref 1 fl Complete:/0/0 rc 0/0 [171364.516846] Lustre: 39830:0:(service.c:2165:ptlrpc_server_handle_request()) Skipped 4 previous similar messages [171376.498927] Lustre: fir-MDT0003: Client 041c1209-eec6-f8ce-c95d-e7e9e84ecf6a (at 10.9.109.68@o2ib4) reconnecting [171376.509191] Lustre: Skipped 138 previous similar messages [171401.331091] LustreError: dumping log to /tmp/lustre-log.1576160470.39847 [171464.220868] LustreError: 39179:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 100s: evicting client at 10.8.18.12@o2ib6 ns: mdt-fir-MDT0003_UUID lock: ffff951389542f40/0x5f9f636a30ce6882 lrc: 3/0,0 mode: PR/PR res: [0x2800347a9:0x7e4d:0x0].0x0 bits 0x13/0x0 rrc: 102 type: IBT flags: 0x60200400000020 nid: 10.8.18.12@o2ib6 remote: 0x7ba85d36aa32e8cc expref: 15 pid: 39549 timeout: 171461 lvb_type: 0 [171464.259023] LustreError: 39179:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 1 previous similar message [171464.488905] LNet: Service thread pid 39528 completed after 1499.96s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). 
[171464.505231] LNet: Skipped 83 previous similar messages [171466.867895] LustreError: dumping log to /tmp/lustre-log.1576160535.39773 [171470.963958] LustreError: dumping log to /tmp/lustre-log.1576160540.84958 [171475.059992] Pid: 39740, comm: mdt03_009 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [171475.070249] Call Trace: [171475.072800] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [171475.079824] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [171475.087119] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [171475.094038] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [171475.101142] [<ffffffffc1546ea0>] mdt_object_lock+0x20/0x30 [mdt] [171475.107360] [<ffffffffc157141a>] mdt_reint_open+0x106a/0x3240 [mdt] [171475.113856] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [171475.119995] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [171475.126656] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [171475.132959] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [171475.139537] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [171475.146377] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [171475.153584] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [171475.159844] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [171475.166888] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [171475.174690] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [171475.181120] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [171475.186123] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [171475.192714] [<ffffffffffffffff>] 0xffffffffffffffff [171475.197823] LustreError: dumping log to /tmp/lustre-log.1576160544.39740 [171475.205099] Pid: 90020, comm: mdt01_092 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [171475.215373] Call Trace: [171475.217918] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [171475.224932] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [171475.232219] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [171475.239128] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [171475.246224] [<ffffffffc1546ea0>] mdt_object_lock+0x20/0x30 [mdt] [171475.252441] [<ffffffffc157141a>] mdt_reint_open+0x106a/0x3240 [mdt] [171475.258942] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [171475.265069] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [171475.271732] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [171475.278033] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [171475.284608] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [171475.291450] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [171475.298650] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [171475.304891] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [171475.311927] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [171475.319731] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [171475.326171] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [171475.331161] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [171475.337729] [<ffffffffffffffff>] 0xffffffffffffffff [171565.173105] Pid: 39802, comm: mdt03_024 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [171565.183364] Call Trace: [171565.185915] [<ffffffff89588c47>] 
call_rwsem_down_write_failed+0x17/0x30 [171565.192738] [<ffffffffc1673537>] lod_qos_statfs_update+0x97/0x2b0 [lod] [171565.199586] [<ffffffffc16756da>] lod_qos_prep_create+0x16a/0x1890 [lod] [171565.206424] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod] [171565.213086] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod] [171565.220429] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod] [171565.227093] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [171565.235051] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd] [171565.241625] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd] [171565.247671] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt] [171565.254168] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [171565.260297] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [171565.266961] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [171565.273280] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [171565.279870] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [171565.286723] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [171565.293930] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [171565.300184] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [171565.307226] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [171565.315028] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [171565.321458] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [171565.326460] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [171565.333029] [<ffffffffffffffff>] 0xffffffffffffffff [171565.338149] LustreError: dumping log to /tmp/lustre-log.1576160634.39802 [171608.141655] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576160076/real 1576160076] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576160677 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [171608.169858] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 14 previous similar messages [171608.179773] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [171608.195942] Lustre: Skipped 5 previous similar messages [171608.201422] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [171608.211352] Lustre: Skipped 132 previous similar messages [171615.349740] Lustre: 84715:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff9503807a0d80 x1649315415417056/t0(0) o101->25827f45-931d-eb26-8907-81e567064f86@10.9.106.11@o2ib4:39/0 lens 376/1600 e 0 to 0 dl 1576160689 ref 2 fl Interpret:/0/0 rc 0/0 [171615.378979] Lustre: 84715:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 44 previous similar messages [171667.574372] LNet: Service thread pid 39194 was inactive for 1203.07s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [171667.591481] LNet: Skipped 9 previous similar messages [171667.596635] Pid: 39194, comm: mdt03_000 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [171667.606910] Call Trace: [171667.609467] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30 [171667.616301] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [171667.623753] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod] [171667.630664] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod] [171667.637324] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod] [171667.644667] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod] [171667.651329] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [171667.659290] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd] [171667.665864] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd] [171667.671909] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt] [171667.678413] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [171667.684551] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [171667.691215] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [171667.697528] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [171667.704111] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [171667.710971] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [171667.718192] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [171667.724455] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [171667.731499] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [171667.739302] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [171667.745730] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [171667.750732] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [171667.757292] [<ffffffffffffffff>] 0xffffffffffffffff [171667.762417] LustreError: dumping log to /tmp/lustre-log.1576160736.39194 [171667.770013] Pid: 84943, comm: mdt03_067 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [171667.780300] Call Trace: [171667.782848] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30 [171667.789669] [<ffffffffc1670225>] lod_alloc_qos.constprop.18+0x205/0x1840 [lod] [171667.797099] [<ffffffffc1676847>] lod_qos_prep_create+0x12d7/0x1890 [lod] [171667.804007] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod] [171667.810667] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod] [171667.818011] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod] [171667.824673] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [171667.832627] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd] [171667.839212] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd] [171667.845252] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt] [171667.851764] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [171667.857898] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [171667.864557] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [171667.870870] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [171667.877447] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [171667.884288] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [171667.891495] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [171667.897739] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [171667.904771] 
[<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [171667.912575] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [171667.919016] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [171667.924008] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [171667.930575] [<ffffffffffffffff>] 0xffffffffffffffff [171667.935668] LNet: Service thread pid 39786 was inactive for 1203.29s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [171667.948716] LNet: Skipped 75 previous similar messages [171745.399335] LustreError: dumping log to /tmp/lustre-log.1576160814.84789 [171765.879586] LustreError: dumping log to /tmp/lustre-log.1576160835.84276 [171862.098783] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0003: cannot cleanup orphans: rc = -107 [171862.111906] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 5 previous similar messages [171864.494699] Lustre: 39189:0:(service.c:2165:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (5809:691s); client may timeout. req@ffff9512fe294c80 x1649317736188800/t0(0) o101->3532db27-3550-1319-6c1b-3d6651c2c9af@10.9.108.62@o2ib4:347/0 lens 576/536 e 0 to 0 dl 1576160242 ref 1 fl Complete:/0/0 rc 0/0 [171864.495013] LustreError: 39753:0:(ldlm_lockd.c:1348:ldlm_handle_enqueue0()) ### lock on destroyed export ffff950389812400 ns: mdt-fir-MDT0003_UUID lock: ffff950368ad1b00/0x5f9f636a30df617c lrc: 3/0,0 mode: PR/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 106 type: IBT flags: 0x50200400000020 nid: 10.9.108.66@o2ib4 remote: 0x48a0cf2e6b7e9357 expref: 4 pid: 39753 timeout: 0 lvb_type: 0 [171864.558296] Lustre: 39189:0:(service.c:2165:ptlrpc_server_handle_request()) Skipped 10 previous similar messages [171964.227037] LustreError: 39179:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 100s: evicting client at 10.8.18.17@o2ib6 ns: mdt-fir-MDT0003_UUID lock: ffff951301650b40/0x5f9f636a30e197b8 lrc: 3/0,0 mode: PR/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 120 type: IBT flags: 0x60200400000020 nid: 10.8.18.17@o2ib6 remote: 0x52560b3815716641 expref: 19 pid: 39759 timeout: 171961 lvb_type: 0 [171964.264949] LustreError: 39179:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 2 previous similar messages [171968.044081] LustreError: 84597:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576160737, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff9513b7547080/0x5f9f636a3109fc01 lrc: 3/1,0 mode: --/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 117 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 84597 timeout: 0 lvb_type: 0 [171968.083728] LustreError: 84597:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 48 previous similar messages [171973.283932] LustreError: 52104:0:(ldlm_lockd.c:2324:ldlm_cancel_handler()) ldlm_cancel from 10.9.108.65@o2ib4 arrived at 1576161042 with bad export cookie 6890335261548486220 [171978.874211] Pid: 90039, comm: mdt01_094 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [171978.884492] Call Trace: [171978.887050] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [171978.894092] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [171978.901391] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [171978.908306] [<ffffffffc1546b90>] 
mdt_object_lock_internal+0x70/0x360 [mdt] [171978.915410] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [171978.922416] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [171978.929086] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [171978.935657] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [171978.942511] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [171978.949708] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [171978.955980] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [171978.963010] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [171978.970825] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [171978.977241] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [171978.982260] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [171978.988821] [<ffffffffffffffff>] 0xffffffffffffffff [171978.993953] LustreError: dumping log to /tmp/lustre-log.1576161048.90039 [171980.483532] Lustre: fir-MDT0003: Client de02546b-f416-f3b2-d476-06fb4a31366f (at 10.8.7.20@o2ib6) reconnecting [171980.493620] Lustre: Skipped 97 previous similar messages [172064.891274] Pid: 84752, comm: mdt02_066 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [172064.901562] Call Trace: [172064.904114] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [172064.911163] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [172064.918466] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [172064.925424] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [172064.932542] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [172064.939571] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [172064.946273] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [172064.952859] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [172064.959720] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [172064.966916] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [172064.973179] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [172064.980211] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [172064.988026] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [172064.994441] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [172064.999457] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [172065.006022] [<ffffffffffffffff>] 0xffffffffffffffff [172065.011155] LustreError: dumping log to /tmp/lustre-log.1576161134.84752 [172164.499970] LNet: Service thread pid 82364 completed after 6192.39s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). 
[172164.516533] LNet: Skipped 55 previous similar messages [172209.109054] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576160677/real 1576160677] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576161278 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [172209.137257] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 11 previous similar messages [172209.147177] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [172209.163352] Lustre: Skipped 4 previous similar messages [172209.168831] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [172209.178760] Lustre: Skipped 145 previous similar messages [172228.221300] Lustre: 86186:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff94f3cc764800 x1649309808627536/t0(0) o101->1431f338-e19b-6337-4b33-ec6ebaff454a@10.8.18.22@o2ib6:652/0 lens 592/3264 e 0 to 0 dl 1576161302 ref 2 fl Interpret:/0/0 rc 0/0 [172228.250545] Lustre: 86186:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 52 previous similar messages [172264.501348] Lustre: 84761:0:(service.c:2165:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (5241:159s); client may timeout. req@ffff95237f14ec00 x1649325572460832/t0(0) o101->358a15c3-f754-d3ae-5032-fe8c697bbb16@10.9.106.55@o2ib4:522/0 lens 576/536 e 0 to 0 dl 1576161174 ref 1 fl Complete:/0/0 rc 0/0 [172264.501542] LustreError: 84738:0:(ldlm_lockd.c:1348:ldlm_handle_enqueue0()) ### lock on destroyed export ffff950389959400 ns: mdt-fir-MDT0003_UUID lock: ffff95333f62a640/0x5f9f636a30fb3c3e lrc: 3/0,0 mode: PR/PR res: [0x2800347a9:0x7e4d:0x0].0x0 bits 0x13/0x0 rrc: 91 type: IBT flags: 0x50200400000020 nid: 10.9.108.59@o2ib4 remote: 0xcb92faf98ce52ac7 expref: 4 pid: 84738 timeout: 0 lvb_type: 0 [172264.565026] Lustre: 84761:0:(service.c:2165:ptlrpc_server_handle_request()) Skipped 3 previous similar messages [172364.231971] LustreError: 39179:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 100s: evicting client at 10.8.27.14@o2ib6 ns: mdt-fir-MDT0003_UUID lock: ffff95138e0bb840/0x5f9f636a30ddf6e8 lrc: 3/0,0 mode: PR/PR res: [0x2800347a9:0x7e4d:0x0].0x0 bits 0x13/0x0 rrc: 117 type: IBT flags: 0x60200400000020 nid: 10.8.27.14@o2ib6 remote: 0x7b5ba68f77c3d452 expref: 13 pid: 39822 timeout: 172361 lvb_type: 0 [172364.269967] LustreError: 39179:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 4 previous similar messages [172367.999018] LNet: Service thread pid 84686 was inactive for 1203.49s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [172368.016127] LNet: Skipped 3 previous similar messages [172368.021274] Pid: 84686, comm: mdt02_051 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [172368.031549] Call Trace: [172368.034103] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [172368.041145] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [172368.048431] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [172368.055349] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [172368.062453] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [172368.069459] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [172368.076130] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [172368.082691] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [172368.089545] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [172368.096742] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [172368.103013] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [172368.110045] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [172368.117860] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [172368.124276] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [172368.129293] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [172368.135855] [<ffffffffffffffff>] 0xffffffffffffffff [172368.141001] LustreError: dumping log to /tmp/lustre-log.1576161437.84686 [172368.148663] Pid: 39852, comm: mdt02_037 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [172368.158970] Call Trace: [172368.161515] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [172368.168527] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [172368.175814] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [172368.182725] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [172368.189819] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [172368.196824] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [172368.203489] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [172368.210066] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [172368.216922] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [172368.224107] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [172368.230364] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [172368.237386] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [172368.245201] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [172368.251617] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [172368.256624] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [172368.263189] [<ffffffffffffffff>] 0xffffffffffffffff [172368.268283] Pid: 84965, comm: mdt03_072 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [172368.278552] Call Trace: [172368.281096] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [172368.288118] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [172368.295393] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [172368.302315] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [172368.309408] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [172368.316415] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [172368.323075] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [172368.329651] 
[<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [172368.336499] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [172368.343721] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [172368.349958] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [172368.356992] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [172368.364797] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [172368.371227] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [172368.376219] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [172368.382785] [<ffffffffffffffff>] 0xffffffffffffffff [172368.387872] Pid: 39709, comm: mdt01_007 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [172368.398144] Call Trace: [172368.400691] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [172368.407707] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [172368.415011] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [172368.421918] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [172368.429016] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [172368.436021] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [172368.442682] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [172368.449246] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [172368.456101] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [172368.463287] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [172368.469542] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [172368.476580] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [172368.484397] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [172368.490813] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [172368.495805] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [172368.502373] [<ffffffffffffffff>] 0xffffffffffffffff [172368.507456] Pid: 39713, comm: mdt01_009 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [172368.517730] Call Trace: [172368.520277] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [172368.527305] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [172368.534580] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [172368.541524] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [172368.548604] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [172368.555610] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [172368.562261] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [172368.568836] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [172368.575679] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [172368.582878] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [172368.589119] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [172368.596155] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [172368.603957] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [172368.610399] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [172368.615388] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [172368.621942] [<ffffffffffffffff>] 0xffffffffffffffff [172368.627040] LNet: Service thread pid 84149 was inactive for 1204.12s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
[172368.640065] LNet: Skipped 4 previous similar messages
[172564.509410] LustreError: 39718:0:(ldlm_lockd.c:1348:ldlm_handle_enqueue0()) ### lock on destroyed export ffff950389fdf800 ns: mdt-fir-MDT0003_UUID lock: ffff95237a279f80/0x5f9f636a310b65ae lrc: 3/0,0 mode: PR/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 143 type: IBT flags: 0x50200400000020 nid: 10.9.108.65@o2ib4 remote: 0xd16b3df9ca42f4fb expref: 9 pid: 39718 timeout: 0 lvb_type: 0
[172564.544216] LustreError: 39718:0:(ldlm_lockd.c:1348:ldlm_handle_enqueue0()) Skipped 1 previous similar message
[172564.554366] Lustre: 39793:0:(service.c:2165:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (755:327s); client may timeout. req@ffff9522587ba400 x1648771060079168/t0(0) o101->c104d961-ddd0-a5eb-3382-4ecbd88b591c@10.8.18.16@o2ib6:656/0 lens 592/536 e 0 to 0 dl 1576161306 ref 1 fl Complete:/0/0 rc 0/0
[172564.583000] Lustre: 39793:0:(service.c:2165:ptlrpc_server_handle_request()) Skipped 18 previous similar messages
[172564.609456] LustreError: dumping log to /tmp/lustre-log.1576161633.85029
[172619.132122] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0003: cannot cleanup orphans: rc = -107
[172619.145248] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 5 previous similar messages
[172620.504365] Lustre: fir-MDT0003: Client 4095f119-51dd-a7fd-e70a-75b35c300b3d (at 10.9.110.58@o2ib4) reconnecting
[172620.514627] Lustre: Skipped 105 previous similar messages
[172664.647683] LustreError: 84490:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576161433, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff951357625e80/0x5f9f636a310f533f lrc: 3/1,0 mode: --/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 131 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 84490 timeout: 0 lvb_type: 0
[172664.687324] LustreError: 84490:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 73 previous similar messages
[172714.236290] LustreError: 39179:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 150s: evicting client at 10.9.108.57@o2ib4 ns: mdt-fir-MDT0003_UUID lock: ffff9513082698c0/0x5f9f636a30fb452f lrc: 3/0,0 mode: PR/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 131 type: IBT flags: 0x60200400000020 nid: 10.9.108.57@o2ib4 remote: 0x50aa58b02d27d6de expref: 66 pid: 84882 timeout: 172661 lvb_type: 0
[172714.274371] LustreError: 39179:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 2 previous similar messages
[172810.556489] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576161278/real 1576161278] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576161879 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[172810.584688] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 7 previous similar messages
[172810.594523] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[172810.610698] Lustre: Skipped 2 previous similar messages
[172810.616207] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[172810.626148] Lustre: Skipped 46 previous similar messages
[172864.513542] LNet: Service thread pid 39799 completed after 1500.00s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[172864.529880] LNet: Skipped 153 previous similar messages
[172914.821773] Lustre: 39778:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff9513b3f4c800 x1649437505573952/t0(0) o101->a069d4e3-7e33-a168-b75e-93beddcfbdc0@10.9.117.39@o2ib4:583/0 lens 592/3264 e 0 to 0 dl 1576161988 ref 2 fl Interpret:/0/0 rc 0/0
[172914.851104] Lustre: 39778:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 29 previous similar messages
[172966.022391] Pid: 82356, comm: mdt01_050 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[172966.032654] Call Trace:
[172966.035212] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[172966.042253] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[172966.049546] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[172966.056484] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[172966.063580] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[172966.070576] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[172966.077238] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[172966.083802] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[172966.090654] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[172966.097852] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[172966.104124] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[172966.111146] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[172966.118962] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[172966.125405] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[172966.130428] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[172966.136992] [<ffffffffffffffff>] 0xffffffffffffffff
[172966.142110] LustreError: dumping log to /tmp/lustre-log.1576162035.82356
[173068.423659] LNet: Service thread pid 39751 was inactive for 1203.80s. The thread might be hung, or it might only be slow and will resume later.
Dumping the stack trace for debugging purposes: [173068.440771] LNet: Skipped 5 previous similar messages [173068.445918] Pid: 39751, comm: mdt03_011 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [173068.456197] Call Trace: [173068.458754] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [173068.465794] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [173068.473091] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [173068.480025] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [173068.487132] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [173068.494135] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [173068.500806] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [173068.507371] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [173068.514222] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [173068.521419] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [173068.527689] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [173068.534723] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [173068.542537] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [173068.548968] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [173068.553988] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [173068.560550] [<ffffffffffffffff>] 0xffffffffffffffff [173068.565679] LustreError: dumping log to /tmp/lustre-log.1576162137.39751 [173068.573128] Pid: 39813, comm: mdt01_039 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [173068.583417] Call Trace: [173068.585966] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [173068.592987] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [173068.600283] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [173068.607203] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [173068.614320] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [173068.621309] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [173068.627959] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [173068.634536] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [173068.641377] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [173068.648584] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [173068.654835] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [173068.661879] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [173068.669683] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [173068.676109] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [173068.681120] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [173068.687696] [<ffffffffffffffff>] 0xffffffffffffffff [173068.692782] Pid: 39189, comm: mdt01_001 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [173068.703065] Call Trace: [173068.705612] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [173068.712634] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [173068.719922] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [173068.726830] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [173068.733926] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [173068.740924] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [173068.747608] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [173068.754164] 
[<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [173068.761016] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [173068.768215] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [173068.774478] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [173068.781508] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [173068.789324] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [173068.795739] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [173068.800746] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [173068.807303] [<ffffffffffffffff>] 0xffffffffffffffff [173068.812421] Pid: 39729, comm: mdt00_013 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [173068.822678] Call Trace: [173068.825222] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [173068.832244] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [173068.839532] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [173068.846441] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [173068.853538] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [173068.860541] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [173068.867203] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [173068.873769] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [173068.880633] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [173068.887825] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [173068.894089] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [173068.901119] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [173068.908935] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [173068.915352] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [173068.920343] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [173068.926917] [<ffffffffffffffff>] 0xffffffffffffffff [173068.932014] LNet: Service thread pid 84511 was inactive for 1204.42s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
[173068.945074] LNet: Skipped 30 previous similar messages
[173072.519704] LustreError: dumping log to /tmp/lustre-log.1576162141.39759
[173080.711805] LustreError: dumping log to /tmp/lustre-log.1576162149.39721
[173164.521475] LustreError: 39826:0:(ldlm_lockd.c:1348:ldlm_handle_enqueue0()) ### lock on destroyed export ffff9503898e4800 ns: mdt-fir-MDT0003_UUID lock: ffff9521e873dc40/0x5f9f636a31155b83 lrc: 3/0,0 mode: CR/CR res: [0x2800347a5:0x451c:0x0].0x0 bits 0x9/0x0 rrc: 2 type: IBT flags: 0x50200000000000 nid: 10.9.108.57@o2ib4 remote: 0x50aa58b02d27f355 expref: 5 pid: 39826 timeout: 0 lvb_type: 0
[173164.556175] LustreError: 39826:0:(ldlm_lockd.c:1348:ldlm_handle_enqueue0()) Skipped 19 previous similar messages
[173166.728865] LustreError: dumping log to /tmp/lustre-log.1576162235.39187
[173174.920967] LustreError: dumping log to /tmp/lustre-log.1576162244.39756
[173179.017019] LustreError: dumping log to /tmp/lustre-log.1576162248.89968
[173183.113068] LustreError: dumping log to /tmp/lustre-log.1576162252.82367
[173199.497272] LustreError: dumping log to /tmp/lustre-log.1576162268.90070
[173276.108228] LustreError: 39722:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576162045, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff9512fa677080/0x5f9f636a3113feef lrc: 3/1,0 mode: --/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 140 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39722 timeout: 0 lvb_type: 0
[173276.147879] LustreError: 39722:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 111 previous similar messages
[173276.525927] Lustre: fir-MDT0003: Client d5336f36-1352-ddc7-e966-e696298bb1ae (at 10.9.106.53@o2ib4) reconnecting
[173276.536191] Lustre: Skipped 63 previous similar messages
[173364.519609] LustreError: 43471:0:(ldlm_lockd.c:1348:ldlm_handle_enqueue0()) ### lock on destroyed export ffff9503899d5000 ns: mdt-fir-MDT0003_UUID lock: ffff9503660033c0/0x5f9f636a3116c08a lrc: 3/0,0 mode: CR/CR res: [0x280033b72:0x9afe:0x0].0x0 bits 0x9/0x0 rrc: 2 type: IBT flags: 0x50200000000000 nid: 10.8.27.12@o2ib6 remote: 0x1a84d1693ee69398 expref: 3 pid: 43471 timeout: 0 lvb_type: 0
[173364.554169] Lustre: 43471:0:(service.c:2165:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (755:45s); client may timeout.
req@ffff95136efb6050 x1648475366637072/t584152562184(0) o101->24fab89a-6f6a-550a-7225-4734c7f7b849@10.8.27.12@o2ib6:228/0 lens 1800/560 e 0 to 0 dl 1576162388 ref 1 fl Complete:/0/0 rc -107/-107 [173367.435340] Pid: 84598, comm: mdt03_050 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [173367.445599] Call Trace: [173367.448157] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [173367.455180] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [173367.462479] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [173367.469407] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [173367.476519] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [173367.483523] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [173367.490195] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [173367.496774] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [173367.503637] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [173367.510832] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [173367.517106] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [173367.524135] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [173367.531951] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [173367.538366] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [173367.543382] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [173367.549944] [<ffffffffffffffff>] 0xffffffffffffffff [173367.555067] LustreError: dumping log to /tmp/lustre-log.1576162436.84598 [173367.562600] Pid: 82364, comm: mdt01_058 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [173367.572893] Call Trace: [173367.575438] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [173367.582452] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [173367.589739] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [173367.596649] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [173367.603744] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [173367.610740] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [173367.617404] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [173367.623967] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [173367.630833] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [173367.638016] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [173367.644269] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [173367.651294] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [173367.659099] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [173367.665515] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [173367.670520] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [173367.677078] [<ffffffffffffffff>] 0xffffffffffffffff [173376.165462] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0003: cannot cleanup orphans: rc = -107 [173376.178588] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 5 previous similar messages [173411.547884] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576161879/real 1576161879] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576162480 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [173411.576083] 
Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15 previous similar messages
[173411.586001] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[173411.602168] Lustre: Skipped 5 previous similar messages
[173411.607689] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[173411.617605] Lustre: Skipped 145 previous similar messages
[173441.164246] Pid: 90706, comm: mdt01_100 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[173441.174506] Call Trace:
[173441.177056] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[173441.184099] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[173441.191413] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[173441.198330] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[173441.205434] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[173441.212439] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[173441.219109] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[173441.225674] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[173441.232535] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[173441.239732] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[173441.246003] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[173441.253049] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[173441.260868] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[173441.267284] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[173441.272298] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[173441.278862] [<ffffffffffffffff>] 0xffffffffffffffff
[173441.283992] LustreError: dumping log to /tmp/lustre-log.1576162510.90706
[173464.521128] LustreError: 90706:0:(ldlm_lockd.c:1348:ldlm_handle_enqueue0()) ### lock on destroyed export ffff95214e2ff000 ns: mdt-fir-MDT0003_UUID lock: ffff9533b0631f80/0x5f9f636a310e54f3 lrc: 3/0,0 mode: PR/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 158 type: IBT flags: 0x50200400000020 nid: 10.8.18.16@o2ib6 remote: 0x5d87cb40b6a90727 expref: 2 pid: 90706 timeout: 0 lvb_type: 0
[173464.555914] Lustre: 90706:0:(service.c:2165:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (755:471s); client may timeout. req@ffff951302341b00 x1648771060217824/t0(0) o101->c104d961-ddd0-a5eb-3382-4ecbd88b591c@10.8.18.16@o2ib6:657/0 lens 592/536 e 0 to 0 dl 1576162062 ref 1 fl Complete:/0/0 rc -107/-107
[173464.555949] LNet: Service thread pid 39759 completed after 1595.19s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[173464.555952] LNet: Skipped 1 previous similar message [173464.606487] Lustre: 90706:0:(service.c:2165:ptlrpc_server_handle_request()) Skipped 4 previous similar messages [173465.740577] Pid: 84277, comm: mdt02_044 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [173465.750836] Call Trace: [173465.753393] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [173465.760438] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [173465.767734] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [173465.774658] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [173465.781763] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [173465.788768] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [173465.795440] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [173465.802029] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [173465.808888] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [173465.816086] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [173465.822358] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [173465.829390] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [173465.837204] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [173465.843622] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [173465.848635] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [173465.855199] [<ffffffffffffffff>] 0xffffffffffffffff [173465.860330] LustreError: dumping log to /tmp/lustre-log.1576162534.84277 [173465.867773] Pid: 84611, comm: mdt03_054 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [173465.878061] Call Trace: [173465.880608] [<ffffffff89588c47>] call_rwsem_down_write_failed+0x17/0x30 [173465.887429] [<ffffffffc1673537>] lod_qos_statfs_update+0x97/0x2b0 [lod] [173465.894266] [<ffffffffc16756da>] lod_qos_prep_create+0x16a/0x1890 [lod] [173465.901090] [<ffffffffc1677015>] lod_prepare_create+0x215/0x2e0 [lod] [173465.907751] [<ffffffffc1666e1e>] lod_declare_striped_create+0x1ee/0x980 [lod] [173465.915095] [<ffffffffc166b6f4>] lod_declare_create+0x204/0x590 [lod] [173465.921756] [<ffffffffc16e1ca2>] mdd_declare_create_object_internal+0xe2/0x2f0 [mdd] [173465.929706] [<ffffffffc16d16dc>] mdd_declare_create+0x4c/0xcb0 [mdd] [173465.936298] [<ffffffffc16d5067>] mdd_create+0x847/0x14e0 [mdd] [173465.942344] [<ffffffffc15725ff>] mdt_reint_open+0x224f/0x3240 [mdt] [173465.948842] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [173465.954979] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [173465.961642] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [173465.967945] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [173465.974520] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [173465.981361] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [173465.988561] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [173465.994805] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [173466.001853] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [173466.009660] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [173466.016085] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [173466.021080] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [173466.027648] [<ffffffffffffffff>] 0xffffffffffffffff [173568.141813] LustreError: dumping log to /tmp/lustre-log.1576162637.84699 [173584.526016] LustreError: dumping log to 
/tmp/lustre-log.1576162653.84073 [173614.222394] Lustre: 39747:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff9512f2b5cc80 x1650958613876480/t0(0) o101->717fa73e-8071-a76f-931e-8957a8ca32aa@10.9.101.41@o2ib4:528/0 lens 1792/3288 e 0 to 0 dl 1576162688 ref 2 fl Interpret:/0/0 rc 0/0 [173614.251814] Lustre: 39747:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 160 previous similar messages [173645.966773] LustreError: dumping log to /tmp/lustre-log.1576162715.39784 [173764.526390] LustreError: 39784:0:(ldlm_lockd.c:1348:ldlm_handle_enqueue0()) ### lock on destroyed export ffff950389ec7800 ns: mdt-fir-MDT0003_UUID lock: ffff9521c736b600/0x5f9f636a310fe750 lrc: 3/0,0 mode: PR/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 147 type: IBT flags: 0x50200400000020 nid: 10.9.106.54@o2ib4 remote: 0x47b7ef4311c30df5 expref: 8 pid: 39784 timeout: 0 lvb_type: 0 [173764.527334] Lustre: 39796:0:(service.c:2165:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (755:445s); client may timeout. req@ffff9521bce3d580 x1649314293998464/t584152577234(0) o101->75af6c9a-e740-8c0d-465f-820e82ef6338@10.9.108.60@o2ib4:228/0 lens 1800/560 e 0 to 0 dl 1576162388 ref 1 fl Complete:/0/0 rc -107/-107 [173764.591530] LustreError: 39784:0:(ldlm_lockd.c:1348:ldlm_handle_enqueue0()) Skipped 5 previous similar messages [173768.848345] LNet: Service thread pid 39188 was inactive for 1204.27s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [173768.865455] LNet: Skipped 8 previous similar messages [173768.870602] Pid: 39188, comm: mdt01_000 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [173768.880878] Call Trace: [173768.883433] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [173768.890506] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [173768.897794] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [173768.904731] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [173768.911823] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [173768.918841] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [173768.925500] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [173768.932086] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [173768.938949] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [173768.946159] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [173768.952417] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [173768.959463] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [173768.967265] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [173768.973694] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [173768.978700] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [173768.985266] [<ffffffffffffffff>] 0xffffffffffffffff [173768.990392] LustreError: dumping log to /tmp/lustre-log.1576162838.39188 [173768.997857] Pid: 39763, comm: mdt01_026 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [173769.008130] Call Trace: [173769.010675] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [173769.017695] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [173769.024982] [<ffffffffc1546438>] mdt_object_local_lock+0x438/0xb20 [mdt] [173769.031901] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [173769.038995] 
[<ffffffffc1546ea0>] mdt_object_lock+0x20/0x30 [mdt] [173769.045214] [<ffffffffc157141a>] mdt_reint_open+0x106a/0x3240 [mdt] [173769.051708] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [173769.057849] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [173769.064520] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [173769.070848] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [173769.077424] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [173769.084265] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [173769.091472] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [173769.097726] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [173769.104768] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [173769.112580] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [173769.119010] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [173769.124011] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [173769.130578] [<ffffffffffffffff>] 0xffffffffffffffff [173769.135687] Pid: 84628, comm: mdt03_057 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [173769.145967] Call Trace: [173769.148508] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [173769.155547] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [173769.162829] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [173769.169751] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [173769.176837] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [173769.183843] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [173769.190494] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [173769.197071] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [173769.203926] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [173769.211136] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [173769.217389] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [173769.224432] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [173769.232235] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [173769.238660] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [173769.243657] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [173769.250218] [<ffffffffffffffff>] 0xffffffffffffffff [173769.255317] Pid: 39770, comm: mdt01_029 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [173769.265576] Call Trace: [173769.268121] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [173769.275142] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [173769.282449] [<ffffffffc1546438>] mdt_object_local_lock+0x438/0xb20 [mdt] [173769.289365] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [173769.296462] [<ffffffffc1546ea0>] mdt_object_lock+0x20/0x30 [mdt] [173769.302687] [<ffffffffc157141a>] mdt_reint_open+0x106a/0x3240 [mdt] [173769.309183] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [173769.315323] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [173769.321971] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [173769.328295] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [173769.334857] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [173769.341709] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [173769.348907] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [173769.355169] [<ffffffffc105836a>] 
tgt_request_handle+0xaea/0x1580 [ptlrpc] [173769.362202] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [173769.370016] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [173769.376432] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [173769.381438] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [173769.388002] [<ffffffffffffffff>] 0xffffffffffffffff [173769.393107] Pid: 39259, comm: mdt03_004 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [173769.403387] Call Trace: [173769.405934] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [173769.412953] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [173769.420261] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [173769.427182] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [173769.434279] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [173769.441295] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [173769.447951] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [173769.454536] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [173769.461377] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [173769.468595] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [173769.474844] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [173769.481878] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [173769.489682] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [173769.496110] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [173769.501105] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [173769.507683] [<ffffffffffffffff>] 0xffffffffffffffff [173769.512766] LNet: Service thread pid 82360 was inactive for 1204.98s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
[173769.525822] LNet: Skipped 61 previous similar messages
[173777.040389] LustreError: dumping log to /tmp/lustre-log.1576162846.39715
[173789.328536] LustreError: dumping log to /tmp/lustre-log.1576162858.86140
[173793.424584] LustreError: dumping log to /tmp/lustre-log.1576162862.39815
[173805.712739] LustreError: dumping log to /tmp/lustre-log.1576162874.82363
[173822.096937] LustreError: dumping log to /tmp/lustre-log.1576162891.84542
[173838.481140] LustreError: dumping log to /tmp/lustre-log.1576162907.85064
[173920.402160] LustreError: dumping log to /tmp/lustre-log.1576162989.39549
[173920.694898] Lustre: fir-MDT0003: Client fdca5c4a-6cf3-51e3-c2ce-f648bf33defc (at 10.9.106.15@o2ib4) reconnecting
[173920.705165] Lustre: Skipped 148 previous similar messages
[173924.498200] LustreError: dumping log to /tmp/lustre-log.1576162993.39785
[173940.882406] LustreError: dumping log to /tmp/lustre-log.1576163009.39528
[173964.527704] LustreError: 84278:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576162733, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff9521e8739440/0x5f9f636a31191168 lrc: 3/1,0 mode: --/PR res: [0x2800347a9:0x7e4d:0x0].0x0 bits 0x13/0x0 rrc: 99 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 84278 timeout: 0 lvb_type: 0
[173964.567373] LustreError: 84278:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 50 previous similar messages
[173964.567973] LustreError: 86139:0:(ldlm_lockd.c:1348:ldlm_handle_enqueue0()) ### lock on destroyed export ffff950389486800 ns: mdt-fir-MDT0003_UUID lock: ffff9500a722e780/0x5f9f636a3107ca6b lrc: 3/0,0 mode: PR/PR res: [0x2800347a9:0x7e4d:0x0].0x0 bits 0x13/0x0 rrc: 92 type: IBT flags: 0x50200400000020 nid: 10.8.18.12@o2ib6 remote: 0x7ba85d36aa32e9e4 expref: 2 pid: 86139 timeout: 0 lvb_type: 0
[173964.568001] Lustre: 86139:0:(service.c:2165:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (755:1845s); client may timeout. req@ffff95034e796300 x1648773927120112/t0(0) o101->cc43915b-6aa0-7796-18f9-1827e6f9b899@10.8.18.12@o2ib6:538/0 lens 584/536 e 0 to 0 dl 1576161188 ref 1 fl Complete:/0/0 rc -107/-107
[173964.568003] Lustre: 86139:0:(service.c:2165:ptlrpc_server_handle_request()) Skipped 1 previous similar message
[174012.051277] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576162480/real 1576162480] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576163081 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[174012.079476] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 9 previous similar messages
[174012.089308] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete
[174012.105475] Lustre: Skipped 4 previous similar messages
[174012.110979] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7)
[174012.120911] Lustre: Skipped 71 previous similar messages
[174064.571528] LNet: Service thread pid 39749 completed after 1500.04s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[174064.587864] LNet: Skipped 113 previous similar messages [174133.198778] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0003: cannot cleanup orphans: rc = -107 [174133.211904] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 5 previous similar messages [174164.574650] LustreError: 39188:0:(ldlm_lockd.c:1348:ldlm_handle_enqueue0()) ### lock on destroyed export ffff950389959800 ns: mdt-fir-MDT0003_UUID lock: ffff951300397080/0x5f9f636a3110e1ac lrc: 3/0,0 mode: PR/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 192 type: IBT flags: 0x50200400000020 nid: 10.9.108.60@o2ib4 remote: 0x286615e8b93d8e54 expref: 2 pid: 39188 timeout: 0 lvb_type: 0 [174164.609447] LustreError: 39188:0:(ldlm_lockd.c:1348:ldlm_handle_enqueue0()) Skipped 5 previous similar messages [174164.619682] Lustre: 39188:0:(service.c:2165:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (755:845s); client may timeout. req@ffff95138a9b3180 x1649314293998480/t0(0) o101->75af6c9a-e740-8c0d-465f-820e82ef6338@10.9.108.60@o2ib4:228/0 lens 576/536 e 0 to 0 dl 1576162388 ref 1 fl Complete:/0/0 rc -107/-107 [174164.648951] Lustre: 39188:0:(service.c:2165:ptlrpc_server_handle_request()) Skipped 7 previous similar messages [174178.453278] Pid: 39722, comm: mdt00_010 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [174178.463538] Call Trace: [174178.466090] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [174178.473135] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [174178.480433] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [174178.487357] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [174178.494460] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [174178.501465] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [174178.508135] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [174178.514710] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [174178.521578] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [174178.528775] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [174178.535039] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [174178.542074] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [174178.549893] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [174178.556308] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [174178.561323] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [174178.567888] [<ffffffffffffffff>] 0xffffffffffffffff [174178.573009] LustreError: dumping log to /tmp/lustre-log.1576163247.39722 [174214.677716] Lustre: 84698:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff95223e6ead00 x1648799148520192/t0(0) o101->57b26761-b79f-628f-0ec2-0a10fd7ac3bd@10.8.18.17@o2ib6:373/0 lens 592/3264 e 0 to 0 dl 1576163288 ref 2 fl Interpret:/0/0 rc 0/0 [174214.706959] Lustre: 84698:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 55 previous similar messages [174366.871561] Pid: 84276, comm: mdt02_043 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [174366.881824] Call Trace: [174366.884382] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [174366.891422] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [174366.898730] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [174366.905643] 
[<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [174366.912750] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [174366.919761] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [174366.926426] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [174366.932996] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [174366.939852] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [174366.947047] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [174366.953317] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [174366.960349] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [174366.968165] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [174366.974580] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [174366.979598] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [174366.986173] [<ffffffffffffffff>] 0xffffffffffffffff [174366.991298] LustreError: dumping log to /tmp/lustre-log.1576163436.84276 [174366.998920] Pid: 39819, comm: mdt03_026 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [174367.009210] Call Trace: [174367.011756] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [174367.018772] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [174367.026059] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [174367.032967] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [174367.040080] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [174367.047086] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [174367.053772] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [174367.060337] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [174367.067189] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [174367.074377] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [174367.080633] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [174367.087655] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [174367.095470] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [174367.101876] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [174367.106884] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [174367.113438] [<ffffffffffffffff>] 0xffffffffffffffff [174367.118545] Pid: 84149, comm: mdt00_046 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [174367.128812] Call Trace: [174367.131357] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [174367.138373] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [174367.145662] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [174367.152568] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [174367.159665] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [174367.166663] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [174367.173324] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [174367.179893] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [174367.186771] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [174367.193980] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [174367.200233] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [174367.207256] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [174367.215073] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [174367.221489] [<ffffffff892c2e81>] kthread+0xd1/0xe0 
[174367.226480] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[174367.233050] [<ffffffffffffffff>] 0xffffffffffffffff
[174367.238141] Pid: 39243, comm: mdt03_003 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[174367.248413] Call Trace:
[174367.250961] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[174367.257974] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[174367.265263] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[174367.272172] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[174367.279267] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[174367.286262] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[174367.292925] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[174367.299490] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[174367.306340] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[174367.313528] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[174367.319797] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[174367.326823] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[174367.334656] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[174367.341072] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[174367.346080] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[174367.352634] [<ffffffffffffffff>] 0xffffffffffffffff
[174460.934422] Lustre: fir-MDT0003: haven't heard from client 3bd651a1-07e6-0cec-1800-45156860eb64 (at 10.9.110.39@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389d3b400, cur 1576163530 expire 1576163380 last 1576163303
[174465.176771] LNet: Service thread pid 84723 was inactive for 1200.49s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[174465.189809] LNet: Skipped 95 previous similar messages
[174465.195046] LustreError: dumping log to /tmp/lustre-log.1576163534.84723
[174520.747839] Lustre: fir-MDT0003: Client de02546b-f416-f3b2-d476-06fb4a31366f (at 10.8.7.20@o2ib6) reconnecting
[174520.757930] Lustre: Skipped 105 previous similar messages
[174522.521480] LNet: Service thread pid 39833 was inactive for 1201.92s. The thread might be hung, or it might only be slow and will resume later.
Dumping the stack trace for debugging purposes: [174522.538640] LNet: Skipped 9 previous similar messages [174522.543789] Pid: 39833, comm: mdt00_031 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [174522.554066] Call Trace: [174522.556638] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [174522.563681] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [174522.571018] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [174522.577940] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [174522.585063] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [174522.592076] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [174522.598781] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [174522.605362] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [174522.612232] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [174522.619446] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [174522.625703] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [174522.632770] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [174522.640594] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [174522.647058] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [174522.652067] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [174522.658668] [<ffffffffffffffff>] 0xffffffffffffffff [174522.663787] LustreError: dumping log to /tmp/lustre-log.1576163591.39833 [174522.671167] Pid: 84880, comm: mdt01_080 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [174522.681456] Call Trace: [174522.684003] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [174522.691024] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [174522.698325] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [174522.705247] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [174522.712352] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [174522.719356] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [174522.726027] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [174522.732600] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [174522.739451] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [174522.746667] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [174522.752921] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [174522.759953] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [174522.767780] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [174522.774200] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [174522.779213] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [174522.785780] [<ffffffffffffffff>] 0xffffffffffffffff [174588.058288] Pid: 39775, comm: mdt00_023 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [174588.068548] Call Trace: [174588.071099] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [174588.078133] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [174588.085429] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [174588.092359] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [174588.099472] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [174588.106492] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [174588.113163] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [174588.119737] 
[<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [174588.126605] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [174588.133801] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [174588.140066] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [174588.147097] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [174588.154911] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [174588.161329] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [174588.166344] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [174588.172921] [<ffffffffffffffff>] 0xffffffffffffffff [174588.178054] LustreError: dumping log to /tmp/lustre-log.1576163657.39775 [174588.185383] Pid: 39794, comm: mdt03_022 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [174588.195657] Call Trace: [174588.198199] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [174588.205222] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [174588.212512] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [174588.219420] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [174588.226516] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [174588.233511] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [174588.240186] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [174588.246747] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [174588.253598] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [174588.260802] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [174588.267059] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [174588.274082] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [174588.281896] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [174588.288311] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [174588.293320] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [174588.299875] [<ffffffffffffffff>] 0xffffffffffffffff [174613.466608] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576163081/real 1576163081] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576163682 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [174613.494810] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15 previous similar messages [174613.504733] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [174613.520909] Lustre: Skipped 5 previous similar messages [174613.526390] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [174613.536329] Lustre: Skipped 114 previous similar messages [174637.210904] Pid: 84967, comm: mdt01_082 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [174637.221169] Call Trace: [174637.223726] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [174637.230775] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [174637.238091] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [174637.245016] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [174637.252112] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [174637.259116] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [174637.265789] [<ffffffffc154cbb5>] 
mdt_intent_policy+0x435/0xd80 [mdt] [174637.272360] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [174637.279222] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [174637.286417] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [174637.292690] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [174637.299722] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [174637.307540] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [174637.313953] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [174637.318968] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [174637.325532] [<ffffffffffffffff>] 0xffffffffffffffff [174637.330653] LustreError: dumping log to /tmp/lustre-log.1576163706.84967 [174663.514226] LustreError: 39750:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576163432, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff95237a4af2c0/0x5f9f636a311e766e lrc: 3/1,0 mode: --/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 197 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 39750 timeout: 0 lvb_type: 0 [174663.553873] LustreError: 39750:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 80 previous similar messages [174664.581586] LNet: Service thread pid 39753 completed after 2100.04s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [174664.597918] LNet: Skipped 11 previous similar messages [174665.883255] LustreError: dumping log to /tmp/lustre-log.1576163734.84511 [174768.284485] LustreError: dumping log to /tmp/lustre-log.1576163837.39734 [174826.397200] Lustre: 39838:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-262), not sending early reply req@ffff9533ab636c00 x1650926760823968/t0(0) o101->7dc77806-c779-f6f7-b102-8e88c090719f@10.9.108.2@o2ib4:230/0 lens 592/3264 e 0 to 0 dl 1576163900 ref 2 fl Interpret:/0/0 rc 0/0 [174826.426446] Lustre: 39838:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 66 previous similar messages [174870.685765] Pid: 39757, comm: mdt00_020 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [174870.696026] Call Trace: [174870.698584] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [174870.705632] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [174870.712935] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [174870.719866] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [174870.726971] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [174870.733976] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [174870.740648] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [174870.747217] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [174870.754081] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [174870.761276] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [174870.767547] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [174870.774582] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [174870.782397] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [174870.788827] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [174870.793843] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [174870.800407] [<ffffffffffffffff>] 0xffffffffffffffff [174870.805526] LustreError: dumping log to 
/tmp/lustre-log.1576163939.39757 [174890.231987] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0003: cannot cleanup orphans: rc = -107 [174890.245111] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 5 previous similar messages [174964.894900] Pid: 84705, comm: mdt02_055 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [174964.905163] Call Trace: [174964.907720] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [174964.914764] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [174964.922060] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [174964.928978] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [174964.936083] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [174964.943088] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [174964.949758] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [174964.956330] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [174964.963193] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [174964.970406] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [174964.976677] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [174964.983708] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [174964.991524] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [174964.997941] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [174965.002954] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [174965.009518] [<ffffffffffffffff>] 0xffffffffffffffff [174965.014639] LustreError: dumping log to /tmp/lustre-log.1576164034.84705 [174965.022217] Pid: 39742, comm: mdt03_010 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [174965.032476] Call Trace: [174965.035021] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [174965.042044] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [174965.049315] [<ffffffffc1546438>] mdt_object_local_lock+0x438/0xb20 [mdt] [174965.056227] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [174965.063329] [<ffffffffc1546ea0>] mdt_object_lock+0x20/0x30 [mdt] [174965.069544] [<ffffffffc157141a>] mdt_reint_open+0x106a/0x3240 [mdt] [174965.076040] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [174965.082172] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [174965.088833] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [174965.095143] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [174965.101705] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [174965.108559] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [174965.115747] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [174965.122001] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [174965.129025] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [174965.136839] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [174965.143255] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [174965.148260] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [174965.154830] [<ffffffffffffffff>] 0xffffffffffffffff [174965.159930] Pid: 84964, comm: mdt02_077 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [174965.170198] Call Trace: [174965.172746] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [174965.179758] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] 
[174965.187048] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [174965.193955] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [174965.201056] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [174965.208048] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [174965.214709] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [174965.221275] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [174965.228127] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [174965.235328] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [174965.241595] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [174965.248616] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [174965.256437] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [174965.262847] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [174965.267844] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [174965.274417] [<ffffffffffffffff>] 0xffffffffffffffff [174965.279504] Pid: 39828, comm: mdt02_032 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [174965.289776] Call Trace: [174965.292319] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [174965.299336] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [174965.306624] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [174965.313532] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [174965.320631] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [174965.327625] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [174965.334285] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [174965.340848] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [174965.347702] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [174965.354890] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [174965.361146] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [174965.368182] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [174965.376000] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [174965.382416] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [174965.387423] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [174965.393976] [<ffffffffffffffff>] 0xffffffffffffffff [174968.990947] LustreError: dumping log to /tmp/lustre-log.1576164038.39240 [175034.527755] LustreError: dumping log to /tmp/lustre-log.1576164103.84646 [175165.601391] LNet: Service thread pid 39191 was inactive for 1201.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
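The "dumping log to /tmp/lustre-log.<epoch>.<pid>" lines above refer to binary Lustre debug dumps written when the service-thread watchdog fires. A minimal sketch for decoding one of them into readable text on the MDS, assuming the standard lctl utility from this Lustre release (the dump path is copied from the message above):

    # lctl debug_file <input> [output] converts a binary debug dump to text.
    lctl debug_file /tmp/lustre-log.1576163706.84967 /tmp/lustre-log.1576163706.84967.txt
    # The decoded text can then be searched for the stuck enqueues, e.g.:
    grep -m 20 ldlm_completion_ast /tmp/lustre-log.1576163706.84967.txt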
[175165.614427] LNet: Skipped 42 previous similar messages [175165.619664] LustreError: dumping log to /tmp/lustre-log.1576164234.39191 [175178.710888] Lustre: fir-MDT0003: Client 86e62df2-7242-7426-b57d-2f82dc7070cb (at 10.9.102.40@o2ib4) reconnecting [175178.721148] Lustre: Skipped 121 previous similar messages [175215.385973] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576163682/real 1576163682] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576164283 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [175215.414175] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 11 previous similar messages [175215.424096] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [175215.440262] Lustre: Skipped 4 previous similar messages [175215.445743] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [175215.455674] Lustre: Skipped 118 previous similar messages [175264.607591] LustreError: 84609:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576164033, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff95213023ba80/0x5f9f636a31235b1a lrc: 3/1,0 mode: --/PR res: [0x2800347aa:0xf864:0x0].0x0 bits 0x13/0x0 rrc: 66 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 84609 timeout: 0 lvb_type: 0 [175264.647238] LustreError: 84609:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 50 previous similar messages [175268.002627] LNet: Service thread pid 39239 was inactive for 1203.21s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [175268.019734] LNet: Skipped 9 previous similar messages [175268.024880] Pid: 39239, comm: mdt02_003 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [175268.035152] Call Trace: [175268.037710] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [175268.044754] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [175268.052074] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [175268.058999] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [175268.066103] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [175268.073110] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [175268.079779] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [175268.086354] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [175268.093205] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [175268.100401] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [175268.106667] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [175268.113696] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [175268.121527] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [175268.127944] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [175268.132962] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [175268.139525] [<ffffffffffffffff>] 0xffffffffffffffff [175268.144644] LustreError: dumping log to /tmp/lustre-log.1576164337.39239 [175268.152073] Pid: 84871, comm: mdt01_079 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [175268.162356] Call Trace: [175268.164904] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [175268.171928] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [175268.179215] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [175268.186147] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [175268.193244] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [175268.200250] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [175268.206912] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [175268.213485] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [175268.220347] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [175268.227543] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [175268.233807] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [175268.240860] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [175268.248683] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [175268.255108] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [175268.260113] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [175268.266690] [<ffffffffffffffff>] 0xffffffffffffffff [175268.271782] Pid: 84968, comm: mdt02_078 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [175268.282058] Call Trace: [175268.284603] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [175268.291641] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [175268.298921] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [175268.305853] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [175268.312946] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [175268.319978] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [175268.326640] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [175268.333214] 
[<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [175268.340065] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [175268.347263] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [175268.353506] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [175268.360542] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [175268.368344] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [175268.374773] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [175268.379766] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [175268.386355] [<ffffffffffffffff>] 0xffffffffffffffff [175268.391449] Pid: 84581, comm: mdt00_059 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [175268.401721] Call Trace: [175268.404265] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [175268.411287] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [175268.418577] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [175268.425492] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [175268.432600] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [175268.439607] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [175268.446284] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [175268.452867] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [175268.459725] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [175268.466919] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [175268.473176] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [175268.480201] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [175268.488020] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [175268.494437] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [175268.499439] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [175268.506006] [<ffffffffffffffff>] 0xffffffffffffffff [175366.307844] Pid: 39801, comm: mdt02_027 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [175366.318103] Call Trace: [175366.320653] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [175366.327697] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [175366.334995] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [175366.341911] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [175366.349016] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [175366.356020] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [175366.362693] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [175366.369264] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [175366.376140] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [175366.383338] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [175366.389602] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [175366.396634] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [175366.404444] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [175366.410863] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [175366.415878] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [175366.422443] [<ffffffffffffffff>] 0xffffffffffffffff [175366.427563] LustreError: dumping log to /tmp/lustre-log.1576164435.39801 [175388.953731] Lustre: fir-MDT0003: haven't heard from client c1504d4c-7504-c251-de3c-6f26c7b8e7d5 (at 10.9.102.26@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff950389be9000, cur 1576164458 expire 1576164308 last 1576164231 [175469.989171] Lustre: 84958:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-206), not sending early reply req@ffff9533b7507080 x1649421735352960/t0(0) o101->041c1209-eec6-f8ce-c95d-e7e9e84ecf6a@10.9.109.68@o2ib4:119/0 lens 584/3264 e 0 to 0 dl 1576164544 ref 2 fl Interpret:/0/0 rc 0/0 [175470.018505] Lustre: 84958:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 39 previous similar messages [175567.014422] LustreError: dumping log to /tmp/lustre-log.1576164636.39750 [175647.265420] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0003: cannot cleanup orphans: rc = -107 [175647.278546] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 5 previous similar messages [175665.319632] Pid: 39190, comm: mdt01_002 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [175665.329899] Call Trace: [175665.332454] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [175665.339497] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [175665.346792] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [175665.353710] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [175665.360818] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [175665.367821] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [175665.374489] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [175665.381054] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [175665.387906] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [175665.395102] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [175665.401387] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [175665.408415] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [175665.416229] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [175665.422645] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [175665.427662] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [175665.434225] [<ffffffffffffffff>] 0xffffffffffffffff [175665.439344] LustreError: dumping log to /tmp/lustre-log.1576164734.39190 [175669.415677] Pid: 39772, comm: mdt02_020 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [175669.425938] Call Trace: [175669.428492] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [175669.435535] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [175669.442849] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [175669.449770] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [175669.456866] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [175669.463871] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [175669.470539] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [175669.477105] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [175669.483959] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [175669.491155] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [175669.497434] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [175669.504481] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [175669.512296] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [175669.518713] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [175669.523729] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [175669.530292] 
[<ffffffffffffffff>] 0xffffffffffffffff [175669.535416] LustreError: dumping log to /tmp/lustre-log.1576164738.39772 [175816.481520] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576164284/real 1576164284] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576164885 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [175816.509724] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 8 previous similar messages [175816.519560] Lustre: fir-OST0056-osc-MDT0003: Connection to fir-OST0056 (at 10.0.10.115@o2ib7) was lost; in progress operations using this service will wait for recovery to complete [175816.535733] Lustre: Skipped 2 previous similar messages [175816.541215] Lustre: fir-OST0056-osc-MDT0003: Connection restored to 10.0.10.115@o2ib7 (at 10.0.10.115@o2ib7) [175816.551153] Lustre: Skipped 159 previous similar messages [175833.340275] Lustre: fir-MDT0003: Client 24fab89a-6f6a-550a-7225-4734c7f7b849 (at 10.8.27.12@o2ib6) reconnecting [175833.350459] Lustre: Skipped 157 previous similar messages [175866.026152] Pid: 39847, comm: mdt03_034 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [175866.036413] Call Trace: [175866.038970] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [175866.046019] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [175866.053317] [<ffffffffc1546438>] mdt_object_local_lock+0x438/0xb20 [mdt] [175866.060263] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [175866.067358] [<ffffffffc1546ea0>] mdt_object_lock+0x20/0x30 [mdt] [175866.073586] [<ffffffffc157141a>] mdt_reint_open+0x106a/0x3240 [mdt] [175866.080086] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [175866.086221] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [175866.092895] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [175866.099206] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [175866.105792] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [175866.112654] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [175866.119867] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [175866.126125] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [175866.133170] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [175866.140980] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [175866.147418] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [175866.152420] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [175866.159000] [<ffffffffffffffff>] 0xffffffffffffffff [175866.164113] LustreError: dumping log to /tmp/lustre-log.1576164935.39847 [175866.171611] Pid: 84644, comm: mdt03_058 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [175866.181884] Call Trace: [175866.184431] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [175866.191454] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [175866.198748] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [175866.205668] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [175866.212770] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [175866.219769] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [175866.226429] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [175866.232993] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [175866.239848] 
[<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [175866.247060] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [175866.253323] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [175866.260355] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [175866.268168] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [175866.274586] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [175866.279592] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [175866.286157] [<ffffffffffffffff>] 0xffffffffffffffff [175866.291261] Pid: 84880, comm: mdt01_080 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [175866.301517] Call Trace: [175866.304067] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [175866.311091] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [175866.318397] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [175866.325303] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [175866.332408] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [175866.339405] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [175866.346066] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [175866.352638] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [175866.359492] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [175866.366686] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [175866.372941] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [175866.379988] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [175866.387819] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [175866.394233] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [175866.399247] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [175866.405800] [<ffffffffffffffff>] 0xffffffffffffffff [175866.410907] LNet: Service thread pid 39753 was inactive for 1201.68s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [175866.423937] LNet: Skipped 45 previous similar messages [175964.617373] LustreError: 39844:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576164733, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff950147332880/0x5f9f636a31288e5c lrc: 3/1,0 mode: --/PR res: [0x2800347aa:0xf864:0x0].0x0 bits 0x13/0x0 rrc: 53 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 39844 timeout: 0 lvb_type: 0 [175964.619583] LNet: Service thread pid 39728 completed after 1300.00s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [175964.673380] LustreError: 39844:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 18 previous similar messages [175968.427418] LNet: Service thread pid 39770 was inactive for 1203.69s. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [175968.444523] LNet: Skipped 9 previous similar messages [175968.449671] Pid: 39770, comm: mdt01_029 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [175968.459953] Call Trace: [175968.462509] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [175968.469551] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [175968.476844] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [175968.483764] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [175968.490868] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [175968.497880] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [175968.504551] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [175968.511125] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [175968.517976] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [175968.525176] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [175968.531453] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [175968.538485] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [175968.546301] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [175968.552718] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [175968.557733] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [175968.564311] [<ffffffffffffffff>] 0xffffffffffffffff [175968.569433] LustreError: dumping log to /tmp/lustre-log.1576165037.39770 [175968.576899] Pid: 84501, comm: mdt02_050 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [175968.587187] Call Trace: [175968.589736] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [175968.596759] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [175968.604055] [<ffffffffc1546438>] mdt_object_local_lock+0x438/0xb20 [mdt] [175968.610973] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [175968.618068] [<ffffffffc1546ea0>] mdt_object_lock+0x20/0x30 [mdt] [175968.624296] [<ffffffffc157141a>] mdt_reint_open+0x106a/0x3240 [mdt] [175968.630807] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt] [175968.636964] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt] [175968.643627] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt] [175968.649932] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [175968.656507] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [175968.663346] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [175968.670557] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [175968.676797] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [175968.683834] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [175968.691638] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [175968.698077] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [175968.703076] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [175968.709643] [<ffffffffffffffff>] 0xffffffffffffffff [176100.948369] Lustre: fir-MDT0003: haven't heard from client 3c020cd0-089d-acb1-e879-86429192cebf (at 10.8.27.2@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff950389aff000, cur 1576165170 expire 1576165020 last 1576164943 [176114.861252] Lustre: 84965:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff9533192cb600 x1649438700512128/t0(0) o101->ec235f57-e420-df70-aa31-3bf46ecc12f9@10.9.117.17@o2ib4:8/0 lens 584/3264 e 0 to 0 dl 1576165188 ref 2 fl Interpret:/0/0 rc 0/0 [176114.890407] Lustre: 84965:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 20 previous similar messages [176169.133910] Pid: 39719, comm: mdt01_012 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [176169.144169] Call Trace: [176169.146721] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [176169.153760] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [176169.161056] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [176169.168001] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [176169.175124] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [176169.182124] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [176169.188852] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [176169.195451] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [176169.202328] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [176169.209544] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [176169.215813] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [176169.222847] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [176169.230683] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [176169.237111] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [176169.242153] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [176169.248725] [<ffffffffffffffff>] 0xffffffffffffffff [176169.253860] LustreError: dumping log to /tmp/lustre-log.1576165238.39719 [176169.261287] Pid: 84580, comm: mdt00_058 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [176169.271573] Call Trace: [176169.274127] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [176169.281191] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [176169.288487] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [176169.295428] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [176169.302525] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [176169.309539] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [176169.316227] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [176169.322799] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [176169.329687] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [176169.336891] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [176169.343145] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [176169.350168] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [176169.357985] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [176169.364415] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [176169.369424] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [176169.375979] [<ffffffffffffffff>] 0xffffffffffffffff [176169.381084] Pid: 84609, comm: mdt03_052 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [176169.391338] Call Trace: [176169.393881] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [176169.400895] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] 
[176169.408185] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [176169.415092] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [176169.422189] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [176169.429198] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [176169.435862] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [176169.442426] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [176169.449279] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [176169.456467] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [176169.462722] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [176169.469744] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [176169.477559] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [176169.484001] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [176169.489008] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [176169.495606] [<ffffffffffffffff>] 0xffffffffffffffff [176339.575025] LNetError: 38927:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 0 seconds [176339.585108] LNetError: 38927:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Timed out RDMA with 10.0.10.115@o2ib7 (31): c: 0, oc: 0, rc: 8 [176339.597711] LNetError: 38927:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.54@o2ib7 added to recovery queue. Health = 900 [176340.575046] LNet: 38927:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.115@o2ib7: 1 seconds [176341.575059] LNet: 38927:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.115@o2ib7: 2 seconds [176341.585311] LNet: 38927:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 4 previous similar messages [176342.575077] LNet: 38927:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.115@o2ib7: 3 seconds [176342.585332] LNet: 38927:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 3 previous similar messages [176344.575093] LNet: 38927:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.115@o2ib7: 5 seconds [176344.585346] LNet: 38927:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 2 previous similar messages [176351.575181] LNet: 38927:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.115@o2ib7: 1 seconds [176351.585437] LNet: 38927:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 7 previous similar messages [176365.575354] LNet: 38927:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.115@o2ib7: 0 seconds [176365.585610] LNet: 38927:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 1 previous similar message [176365.594925] LNetError: 38927:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.54@o2ib7 added to recovery queue. 
Health = 900 [176365.744350] Pid: 39840, comm: mdt03_031 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [176365.754606] Call Trace: [176365.757157] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [176365.764192] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [176365.771486] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [176365.778404] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [176365.785507] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [176365.792516] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [176365.799186] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [176365.805760] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [176365.812610] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [176365.819823] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [176365.826087] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [176365.833118] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [176365.840934] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [176365.847349] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [176365.852366] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [176365.858928] [<ffffffffffffffff>] 0xffffffffffffffff [176365.864049] LustreError: dumping log to /tmp/lustre-log.1576165434.39840 [176371.664441] LNetError: 92020:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.54@o2ib7 added to recovery queue. Health = 900 [176404.298844] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) fir-OST005e-osc-MDT0003: cannot cleanup orphans: rc = -11 [176404.311888] LustreError: 39480:0:(osp_precreate.c:940:osp_precreate_cleanup_orphans()) Skipped 5 previous similar messages [176405.575858] LNet: 38927:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.115@o2ib7: 6 seconds [176405.586116] LNet: 38927:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 1 previous similar message [176405.595432] LNetError: 38927:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.54@o2ib7 added to recovery queue. Health = 900 [176409.860112] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.9.104.8@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [176417.476478] Lustre: fir-MDT0003: Connection restored to (at 10.9.105.48@o2ib4) [176417.483885] Lustre: Skipped 140 previous similar messages [176417.585001] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576164885/real 1576164885] req@ffff953387792880 x1652547526839584/t0(0) o6->fir-OST0056-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 544/432 e 2 to 1 dl 1576165486 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [176417.613197] Lustre: 38988:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 18 previous similar messages [176418.291088] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.9.109.59@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [176434.298009] Lustre: fir-MDT0003: Client 4c53748e-746c-128b-a760-b7a4f9c1d7e9 (at 10.9.106.7@o2ib4) reconnecting [176434.308194] Lustre: Skipped 153 previous similar messages [176437.844990] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.8.7.18@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server. 
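The kiblnd_check_conns timeouts and "added to recovery queue" messages above point at the o2ib7 fabric between the local NI 10.0.10.54@o2ib7 and the OSS NID 10.0.10.115@o2ib7 (Health = 900 is below the fully healthy value of 1000). A minimal sketch of the connectivity checks one might run from this node, assuming lctl and lnetctl from a 2.12-era Lustre install:

    # LNet-level reachability of the OSS NID that keeps timing out:
    lctl ping 10.0.10.115@o2ib7
    # Local NI state, including the health counters the messages refer to:
    lnetctl net show -v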
[176448.386482] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.9.102.5@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [176455.576471] LNet: 38927:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.115@o2ib7: 6 seconds [176455.586730] LNet: 38927:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 14 previous similar messages [176455.596219] LNetError: 38927:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.54@o2ib7 added to recovery queue. Health = 900 [176459.595862] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.9.107.3@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [176480.433771] Pid: 39814, comm: mdt01_040 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [176480.444031] Call Trace: [176480.446581] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [176480.453625] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [176480.460921] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [176480.467847] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [176480.474951] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [176480.481957] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [176480.488628] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [176480.495200] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [176480.502053] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [176480.509263] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [176480.515533] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [176480.522562] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [176480.530374] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [176480.536790] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [176480.541805] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [176480.548370] [<ffffffffffffffff>] 0xffffffffffffffff [176480.553489] LustreError: dumping log to /tmp/lustre-log.1576165549.39814 [176481.162904] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.9.116.13@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [176481.180358] LustreError: Skipped 5 previous similar messages [176505.577083] LNetError: 38927:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.54@o2ib7 added to recovery queue. Health = 900 [176514.637592] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.109.56@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [176514.655049] LustreError: Skipped 79 previous similar messages [176520.962271] Lustre: fir-MDT0003: haven't heard from client fir-MDT0003-lwp-OST005a_UUID (at 10.0.10.115@o2ib7) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9513aab29c00, cur 1576165590 expire 1576165440 last 1576165363 [176527.577335] LNet: 38927:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.115@o2ib7: 0 seconds [176527.587593] LNet: 38927:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 26 previous similar messages [176556.577687] LNetError: 38927:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.54@o2ib7 added to recovery queue. 
Health = 900 [176566.450803] Pid: 84738, comm: mdt02_063 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [176566.461059] Call Trace: [176566.463615] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [176566.470649] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [176566.477965] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [176566.484888] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [176566.491995] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [176566.499001] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [176566.505673] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [176566.512243] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [176566.519098] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [176566.526294] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [176566.532583] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [176566.539614] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [176566.547428] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [176566.553844] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [176566.558859] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [176566.565423] [<ffffffffffffffff>] 0xffffffffffffffff [176566.570547] LustreError: dumping log to /tmp/lustre-log.1576165635.84738 [176566.578235] Pid: 84658, comm: mdt03_059 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [176566.588526] Call Trace: [176566.591074] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [176566.598096] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [176566.605394] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [176566.612304] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [176566.619404] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [176566.626402] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [176566.633072] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [176566.639637] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [176566.646487] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [176566.653676] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [176566.659932] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [176566.666970] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [176566.674787] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [176566.681223] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [176566.686234] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [176566.692789] [<ffffffffffffffff>] 0xffffffffffffffff [176566.697894] Pid: 39196, comm: mdt03_002 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [176566.708152] Call Trace: [176566.710700] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [176566.717715] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [176566.725004] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [176566.731911] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [176566.739023] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [176566.746025] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [176566.752698] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [176566.759265] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [176566.766115] [<ffffffffc0fcb336>] 
ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [176566.773304] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [176566.779558] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [176566.786580] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [176566.794390] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [176566.800818] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [176566.805828] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [176566.812382] [<ffffffffffffffff>] 0xffffffffffffffff [176566.817478] LNet: Service thread pid 84718 was inactive for 1202.19s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [176566.830505] LNet: Skipped 19 previous similar messages [176605.578278] LNetError: 38927:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.54@o2ib7 added to recovery queue. Health = 900 [176605.590273] LNetError: 38927:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 1 previous similar message [176657.578925] LNet: 38927:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.115@o2ib7: 7 seconds [176657.589180] LNet: 38927:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 32 previous similar messages [176674.579141] LNetError: 38927:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.54@o2ib7 added to recovery queue. Health = 900 [176674.591136] LNetError: 38927:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 3 previous similar messages [176681.140209] LNet: Service thread pid 84630 was inactive for 1204.04s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [176681.157324] LNet: Skipped 9 previous similar messages [176681.162473] Pid: 84630, comm: mdt00_066 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [176681.172746] Call Trace: [176681.175295] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [176681.182343] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [176681.189644] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [176681.196564] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [176681.203675] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [176681.210678] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [176681.217351] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [176681.223914] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [176681.230766] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [176681.237962] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [176681.244231] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [176681.251259] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [176681.259072] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [176681.265488] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [176681.270516] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [176681.277080] [<ffffffffffffffff>] 0xffffffffffffffff [176681.282206] LustreError: dumping log to /tmp/lustre-log.1576165750.84630 [176714.932644] Lustre: 39723:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff9513b1a91f80 x1649442039911456/t0(0) o101->9b2adb9a-cb2a-9455-2efd-8f2f55c616be@10.9.117.15@o2ib4:608/0 lens 584/3264 e 0 to 0 dl 1576165788 ref 2 fl Interpret:/0/0 rc 0/0 [176714.961974] Lustre: 39723:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 15 previous similar 
messages [176764.629860] LNet: Service thread pid 39847 completed after 2100.00s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [176764.646193] LNet: Skipped 2 previous similar messages [176807.597053] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.9.107.9@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [176807.614427] LustreError: Skipped 111 previous similar messages [176857.970114] Lustre: fir-MDT0003: haven't heard from client 2f29ff9b-1f0b-7030-94fa-3b368aa715dc (at 10.9.103.24@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389bb6400, cur 1576165927 expire 1576165777 last 1576165700 [176857.992014] Lustre: Skipped 5 previous similar messages [176865.462491] Pid: 84138, comm: mdt03_043 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [176865.472754] Call Trace: [176865.475303] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [176865.482348] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [176865.489626] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [176865.496546] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [176865.503662] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [176865.510687] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [176865.517358] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [176865.523933] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [176865.530792] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [176865.537988] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [176865.544252] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [176865.551287] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [176865.559120] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [176865.565531] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [176865.570548] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [176865.577125] [<ffffffffffffffff>] 0xffffffffffffffff [176865.582258] LustreError: dumping log to /tmp/lustre-log.1576165934.84138 [176865.589671] Pid: 39777, comm: mdt03_016 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [176865.599962] Call Trace: [176865.602508] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [176865.609530] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [176865.616818] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [176865.623727] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [176865.630821] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [176865.637828] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [176865.644491] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [176865.651054] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [176865.657907] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [176865.665095] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [176865.671348] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [176865.678371] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [176865.686187] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [176865.692603] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [176865.697610] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [176865.704164] [<ffffffffffffffff>] 
0xffffffffffffffff [176865.709285] Pid: 39844, comm: mdt00_033 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019 [176865.719539] Call Trace: [176865.722084] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [176865.729101] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc] [176865.736386] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt] [176865.743296] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt] [176865.750391] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt] [176865.757388] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt] [176865.764049] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt] [176865.770614] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc] [176865.777485] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc] [176865.784669] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc] [176865.790924] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc] [176865.797949] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc] [176865.805761] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc] [176865.812171] [<ffffffff892c2e81>] kthread+0xd1/0xe0 [176865.817177] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21 [176865.823731] [<ffffffffffffffff>] 0xffffffffffffffff [176947.770516] LustreError: 85020:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576165716, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff951315f85100/0x5f9f636a312ee7f9 lrc: 3/0,1 mode: --/PW res: [0x28003688b:0x1690:0x0].0x93742d65 bits 0x2/0x0 rrc: 5 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 85020 timeout: 0 lvb_type: 0 [176947.810595] LustreError: 85020:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 21 previous similar messages [176964.632817] LNet: Service thread pid 39773 completed after 2300.00s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [176964.649151] LNet: Skipped 1 previous similar message [176997.636735] LustreError: 137-5: fir-MDT0001_UUID: not available for connect from 10.9.107.9@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. 
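The "enqueued at ..., 300s ago" figure in the ldlm_expired_completion_wait messages is the lock-enqueue wait derived from the obd timeout, and the "completed after NNNN.NNs" lines show how far behind the MDT service threads are running. A minimal sketch for inspecting the relevant tunables and lock state on the MDS, assuming the usual lctl parameter names for this release (the namespace name mdt-fir-MDT0003_UUID is taken from the messages above):

    # Base obd timeout and adaptive-timeout bounds:
    lctl get_param timeout at_min at_max
    # Granted-lock count in the MDT namespace named in the timeout messages:
    lctl get_param ldlm.namespaces.mdt-fir-MDT0003_UUID.lock_count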
[176997.654104] LustreError: Skipped 1 previous similar message
[177044.802995] Lustre: fir-MDT0003: Client f07c5f6c-7b37-2e06-010b-5b9cd07d02b4 (at 10.9.110.36@o2ib4) reconnecting
[177044.813256] Lustre: Skipped 268 previous similar messages
[177044.818776] Lustre: fir-MDT0003: Connection restored to f07c5f6c-7b37-2e06-010b-5b9cd07d02b4 (at 10.9.110.36@o2ib4)
[177044.829328] Lustre: Skipped 292 previous similar messages
[177064.105022] Lustre: 38992:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1576165377/real 1576165377] req@ffff95218bbc1f80 x1652547536815696/t0(0) o400->fir-OST005e-osc-MDT0003@10.0.10.115@o2ib7:28/4 lens 224/224 e 0 to 1 dl 1576166133 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
[177064.133481] Lustre: 38992:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 23 previous similar messages
[177066.169051] Pid: 39707, comm: mdt00_007 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[177066.179316] Call Trace:
[177066.181874] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[177066.188917] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[177066.196219] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[177066.203140] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[177066.210254] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[177066.217273] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[177066.223944] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[177066.230516] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[177066.237378] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[177066.244574] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[177066.250839] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[177066.257870] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[177066.265684] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[177066.272100] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[177066.277118] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[177066.283694] [<ffffffffffffffff>] 0xffffffffffffffff
[177066.288818] LustreError: dumping log to /tmp/lustre-log.1576166135.39707
[177066.296248] Pid: 84182, comm: mdt00_047 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[177066.306542] Call Trace:
[177066.309085] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[177066.316100] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[177066.323387] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[177066.330296] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[177066.337390] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[177066.344389] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[177066.351065] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[177066.357623] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[177066.364476] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[177066.371662] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[177066.377917] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[177066.384942] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[177066.392768] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[177066.399180] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[177066.404187] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[177066.410743] [<ffffffffffffffff>] 0xffffffffffffffff
[177066.415861] Pid: 39802, comm: mdt03_024 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[177066.426116] Call Trace:
[177066.428659] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[177066.435668] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[177066.442956] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[177066.449863] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[177066.456958] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[177066.463955] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[177066.470638] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[177066.477198] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[177066.484068] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[177066.491257] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[177066.497523] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[177066.504554] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[177066.512371] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[177066.518782] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[177066.523772] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[177066.530342] [<ffffffffffffffff>] 0xffffffffffffffff
[177066.535427] Pid: 84716, comm: mdt00_070 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[177066.545698] Call Trace:
[177066.548246] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[177066.555288] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[177066.562558] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[177066.569481] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[177066.576563] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[177066.583576] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[177066.590222] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[177066.596782] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[177066.603635] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[177066.610824] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[177066.617091] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[177066.624110] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[177066.631923] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[177066.638330] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[177066.643337] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[177066.649894] [<ffffffffffffffff>] 0xffffffffffffffff
[177066.654991] Pid: 86195, comm: mdt00_085 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[177066.665243] Call Trace:
[177066.667786] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[177066.674802] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[177066.682106] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[177066.689015] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[177066.696113] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[177066.703109] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[177066.709769] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[177066.716332] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[177066.723187] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[177066.730374] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[177066.736637] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[177066.743660] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[177066.751479] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[177066.757889] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[177066.762883] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[177066.769451] [<ffffffffffffffff>] 0xffffffffffffffff
[177164.635972] LNet: Service thread pid 39858 completed after 2500.00s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[177164.652329] LNet: Skipped 9 previous similar messages
[177168.570348] LNet: Service thread pid 90715 was inactive for 1203.93s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[177168.583382] LNet: Skipped 10 previous similar messages
[177168.588626] LustreError: dumping log to /tmp/lustre-log.1576166237.90715
[177206.958314] Lustre: fir-MDT0003: haven't heard from client 4c5e6f33-2d0c-f229-3fed-c30688bbed72 (at 10.9.116.19@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389e1b800, cur 1576166276 expire 1576166126 last 1576166049
[177282.958941] Lustre: fir-MDT0003: haven't heard from client 29e66763-b95c-3d3e-5532-53facc0d6b7a (at 10.9.109.32@o2ib4) in 220 seconds. I think it's dead, and I am evicting it. exp ffff950389ccb800, cur 1576166352 expire 1576166202 last 1576166132
[177282.980854] Lustre: Skipped 19 previous similar messages
[177300.677996] LustreError: 166-1: MGC10.0.10.51@o2ib7: Connection to MGS (at 10.0.10.51@o2ib7) was lost; in progress operations using this service will fail
[177300.691911] LustreError: 39180:0:(ldlm_request.c:147:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576166069, 300s ago), entering recovery for MGS@MGC10.0.10.51@o2ib7_0 ns: MGC10.0.10.51@o2ib7 lock: ffff95132ee90d80/0x5f9f636a313178be lrc: 4/1,0 mode: --/CR res: [0x726966:0x2:0x0].0x0 rrc: 2 type: PLN flags: 0x1000000000000 nid: local remote: 0xc3c20c06c1d5a2e4 expref: -99 pid: 39180 timeout: 0 lvb_type: 0
[177300.729741] LustreError: 92296:0:(ldlm_resource.c:1147:ldlm_resource_complain()) MGC10.0.10.51@o2ib7: namespace resource [0x726966:0x2:0x0].0x0 (ffff95034e349500) refcount nonzero (1) after lock cleanup; forcing cleanup.
[177364.639329] LNet: Service thread pid 84718 completed after 2000.00s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[177364.639343] Lustre: 39844:0:(service.c:2165:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (1511:189s); client may timeout. req@ffff9503668cde80 x1651217296246144/t0(0) o101->a1acf167-afde-6f5a-879d-1a7c0814f282@10.9.117.21@o2ib4:308/0 lens 584/536 e 0 to 0 dl 1576166244 ref 1 fl Complete:/0/0 rc 0/0
[177364.684590] LNet: Skipped 14 previous similar messages
[177462.246945] Lustre: fir-MDT0003: haven't heard from client 646257db-4a10-1d7d-1435-2f2425d1bdb2 (at 10.8.18.26@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389995000, cur 1576166531 expire 1576166381 last 1576166304
[177462.268777] Lustre: Skipped 1 previous similar message
[177464.295070] LustreError: 39179:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 100s: evicting client at 10.9.117.21@o2ib4 ns: mdt-fir-MDT0003_UUID lock: ffff950147332880/0x5f9f636a31288e5c lrc: 3/0,0 mode: PR/PR res: [0x2800347aa:0xf864:0x0].0x0 bits 0x13/0x0 rrc: 35 type: IBT flags: 0x60200400000020 nid: 10.9.117.21@o2ib4 remote: 0x291c575ea196512a expref: 16 pid: 39844 timeout: 177461 lvb_type: 0
[177464.333155] LustreError: 39179:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 6 previous similar messages
[177514.686759] Lustre: 39838:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff953386b0d100 x1649614765265024/t0(0) o101->5b41e348-8633-a21d-46d9-7918979d9d25@10.9.104.19@o2ib4:653/0 lens 592/3264 e 0 to 0 dl 1576166588 ref 2 fl Interpret:/0/0 rc 0/0
[177514.716087] Lustre: 39838:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 7 previous similar messages
[177548.098113] LustreError: 84699:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576166317, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff95215c7460c0/0x5f9f636a31333300 lrc: 3/0,1 mode: --/PW res: [0x28003688b:0x1690:0x0].0x93742d66 bits 0x2/0x0 rrc: 5 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 84699 timeout: 0 lvb_type: 0
[177548.138188] LustreError: 84699:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 15 previous similar messages
[177564.642186] LustreError: 39527:0:(ldlm_lockd.c:1348:ldlm_handle_enqueue0()) ### lock on destroyed export ffff950389d3e400 ns: mdt-fir-MDT0003_UUID lock: ffff95237b835340/0x5f9f636a3110e32d lrc: 3/0,0 mode: PR/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 222 type: IBT flags: 0x50200400000020 nid: 10.9.109.70@o2ib4 remote: 0xd155d59b44851db3 expref: 2 pid: 39527 timeout: 0 lvb_type: 0
[177564.642191] LNet: Service thread pid 39822 completed after 5000.02s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[177564.693342] Lustre: 39527:0:(service.c:2165:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (4535:465s); client may timeout. req@ffff95036c709f80 x1648702465619248/t0(0) o101->8960bab2-ad07-45af-2f53-f9cf8eadf367@10.9.109.70@o2ib4:228/0 lens 576/536 e 0 to 0 dl 1576166168 ref 1 fl Complete:/0/0 rc -107/-107
[177564.722666] Lustre: 39527:0:(service.c:2165:ptlrpc_server_handle_request()) Skipped 1 previous similar message
[177606.954846] LustreError: 166-1: MGC10.0.10.51@o2ib7: Connection to MGS (at 10.0.10.51@o2ib7) was lost; in progress operations using this service will fail
[177606.968763] LustreError: 39180:0:(ldlm_request.c:147:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576166376, 300s ago), entering recovery for MGS@MGC10.0.10.51@o2ib7_0 ns: MGC10.0.10.51@o2ib7 lock: ffff95038086cec0/0x5f9f636a313396ea lrc: 4/1,0 mode: --/CR res: [0x726966:0x2:0x0].0x0 rrc: 2 type: PLN flags: 0x1000000000000 nid: local remote: 0xc3c20c06c1d77a8b expref: -99 pid: 39180 timeout: 0 lvb_type: 0
[177607.006604] LustreError: 92409:0:(ldlm_resource.c:1147:ldlm_resource_complain()) MGC10.0.10.51@o2ib7: namespace resource [0x726966:0x2:0x0].0x0 (ffff9500afa1da40) refcount nonzero (1) after lock cleanup; forcing cleanup.
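The "dumping log to /tmp/lustre-log.<epoch>.<pid>" messages above refer to binary Lustre debug dumps, not plain text. A minimal sketch of how such dumps are usually inspected, assuming the lctl utility from the Lustre tools is installed on this server (file names taken from the messages above):

    # Convert a binary Lustre debug dump to readable text with lctl debug_file.
    lctl debug_file /tmp/lustre-log.1576166135.39707 /tmp/lustre-log.1576166135.39707.txt
    # The numeric component of the file name is a Unix epoch; decode it to wall-clock time.
    date -d @1576166135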
[177629.832534] LustreError: 137-5: fir-MDT0002_UUID: not available for connect from 10.9.107.9@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[177629.849908] LustreError: Skipped 6 previous similar messages
[177688.661814] Lustre: fir-MDT0003: Client 3dc3e4b3-1daf-f260-3956-f8f68e141bca (at 10.9.117.42@o2ib4) reconnecting
[177688.672075] Lustre: Skipped 122 previous similar messages
[177688.677594] Lustre: fir-MDT0003: Connection restored to (at 10.9.117.42@o2ib4)
[177688.684988] Lustre: Skipped 125 previous similar messages
[177714.298192] LustreError: 39179:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 150s: evicting client at 10.9.108.66@o2ib4 ns: mdt-fir-MDT0003_UUID lock: ffff950366006540/0x5f9f636a3110e2d9 lrc: 3/0,0 mode: PR/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 221 type: IBT flags: 0x60200400000020 nid: 10.9.108.66@o2ib4 remote: 0x48a0cf2e6b7e9667 expref: 12 pid: 82359 timeout: 177711 lvb_type: 0
[177764.644069] LNet: Service thread pid 84716 completed after 1900.00s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[177764.660508] LNet: Skipped 6 previous similar messages
[177913.281639] LustreError: 166-1: MGC10.0.10.51@o2ib7: Connection to MGS (at 10.0.10.51@o2ib7) was lost; in progress operations using this service will fail
[177913.295553] LustreError: 39180:0:(ldlm_request.c:147:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576166682, 300s ago), entering recovery for MGS@MGC10.0.10.51@o2ib7_0 ns: MGC10.0.10.51@o2ib7 lock: ffff95038086f500/0x5f9f636a3135ac5d lrc: 4/1,0 mode: --/CR res: [0x726966:0x2:0x0].0x0 rrc: 2 type: PLN flags: 0x1000000000000 nid: local remote: 0xc3c20c06c1d991a9 expref: -99 pid: 39180 timeout: 0 lvb_type: 0
[177913.333390] LustreError: 92490:0:(ldlm_resource.c:1147:ldlm_resource_complain()) MGC10.0.10.51@o2ib7: namespace resource [0x726966:0x2:0x0].0x0 (ffff950149d0e900) refcount nonzero (1) after lock cleanup; forcing cleanup.
[177935.970296] Lustre: fir-MDT0003: haven't heard from client db44fcc6-df61-0a83-7c51-af3e9a77d479 (at 10.8.7.13@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9503894b8000, cur 1576167005 expire 1576166855 last 1576166778
[177935.992008] Lustre: Skipped 14 previous similar messages
[177967.300299] LNet: Service thread pid 39827 was inactive for 1202.65s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[177967.317404] LNet: Skipped 8 previous similar messages
[177967.322555] Pid: 39827, comm: mdt01_045 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[177967.332831] Call Trace:
[177967.335390] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[177967.342430] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[177967.349726] [<ffffffffc1546438>] mdt_object_local_lock+0x438/0xb20 [mdt]
[177967.356661] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[177967.363765] [<ffffffffc1546ea0>] mdt_object_lock+0x20/0x30 [mdt]
[177967.369992] [<ffffffffc157141a>] mdt_reint_open+0x106a/0x3240 [mdt]
[177967.376488] [<ffffffffc1565693>] mdt_reint_rec+0x83/0x210 [mdt]
[177967.382629] [<ffffffffc15421b3>] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[177967.389300] [<ffffffffc154ea92>] mdt_intent_open+0x82/0x3a0 [mdt]
[177967.395610] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[177967.402195] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[177967.409043] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[177967.416253] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[177967.422528] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[177967.429575] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[177967.437375] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[177967.443806] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[177967.448806] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[177967.455381] [<ffffffffffffffff>] 0xffffffffffffffff
[177967.460493] LustreError: dumping log to /tmp/lustre-log.1576167036.39827
[177967.468272] Pid: 39847, comm: mdt03_034 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[177967.478560] Call Trace:
[177967.481105] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[177967.488128] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[177967.495416] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[177967.502325] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[177967.509429] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[177967.516435] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[177967.523106] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[177967.529666] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[177967.536523] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[177967.543716] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[177967.549973] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[177967.557008] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[177967.564828] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[177967.571245] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[177967.576247] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[177967.582823] [<ffffffffffffffff>] 0xffffffffffffffff
[178064.648189] Lustre: 39797:0:(service.c:2165:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (1823:377s); client may timeout. req@ffff9513b2800480 x1649447543934496/t584152732117(0) o101->3dc3e4b3-1daf-f260-3956-f8f68e141bca@10.9.117.42@o2ib4:65/0 lens 1800/904 e 0 to 0 dl 1576166756 ref 1 fl Complete:/0/0 rc 0/0
[178064.648810] LNet: Service thread pid 90019 completed after 5500.02s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[178064.648812] LNet: Skipped 5 previous similar messages
[178064.649674] LustreError: 84899:0:(ldlm_lockd.c:1348:ldlm_handle_enqueue0()) ### lock on destroyed export ffff9523b958ec00 ns: mdt-fir-MDT0003_UUID lock: ffff9533ad7a5100/0x5f9f636a3132b8f7 lrc: 3/0,0 mode: PR/PR res: [0x2800347aa:0xf864:0x0].0x0 bits 0x13/0x0 rrc: 30 type: IBT flags: 0x50200400000020 nid: 10.9.117.21@o2ib4 remote: 0x291c575ea1965281 expref: 2 pid: 84899 timeout: 0 lvb_type: 0
[178064.734197] Lustre: 39797:0:(service.c:2165:ptlrpc_server_handle_request()) Skipped 4 previous similar messages
[178114.758135] Lustre: 92685:0:(service.c:1372:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff95335a630d80 x1649831375167744/t0(0) o101->a957717b-6209-b5a8-8735-870660cbdba3@10.9.117.26@o2ib4:498/0 lens 584/3264 e 0 to 0 dl 1576167188 ref 2 fl Interpret:/0/0 rc 0/0
[178114.787461] Lustre: 92685:0:(service.c:1372:ptlrpc_at_send_early_reply()) Skipped 19 previous similar messages
[178164.652061] Lustre: 39749:0:(service.c:2165:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (755:45s); client may timeout. req@ffff951304734050 x1649331490446240/t584152758485(0) o101->6fe05dcf-b9e2-99d7-33ce-acbd0a395824@10.9.117.43@o2ib4:498/0 lens 1800/904 e 0 to 0 dl 1576167188 ref 1 fl Complete:/0/0 rc 0/0
[178164.681743] Lustre: 39749:0:(service.c:2165:ptlrpc_server_handle_request()) Skipped 1 previous similar message
[178168.006782] LNet: Service thread pid 84532 was inactive for 1203.15s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[178168.023890] LNet: Skipped 1 previous similar message
[178168.028948] Pid: 84532, comm: mdt01_063 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[178168.039226] Call Trace:
[178168.041783] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[178168.048827] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[178168.056113] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[178168.063031] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[178168.070125] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[178168.077123] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[178168.083786] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[178168.090366] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[178168.097220] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[178168.104407] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[178168.110670] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[178168.117702] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[178168.125515] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[178168.131932] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[178168.136950] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[178168.143511] [<ffffffffffffffff>] 0xffffffffffffffff
[178168.148632] LustreError: dumping log to /tmp/lustre-log.1576167237.84532
[178222.688459] LustreError: 166-1: MGC10.0.10.51@o2ib7: Connection to MGS (at 10.0.10.51@o2ib7) was lost; in progress operations using this service will fail
[178222.702371] LustreError: 39180:0:(ldlm_request.c:147:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576166991, 300s ago), entering recovery for MGS@MGC10.0.10.51@o2ib7_0 ns: MGC10.0.10.51@o2ib7 lock: ffff95038086af40/0x5f9f636a3137bb1d lrc: 4/1,0 mode: --/CR res: [0x726966:0x2:0x0].0x0 rrc: 2 type: PLN flags: 0x1000000000000 nid: local remote: 0xc3c20c06c1dcd780 expref: -99 pid: 39180 timeout: 0 lvb_type: 0
[178222.740210] LustreError: 92730:0:(ldlm_resource.c:1147:ldlm_resource_complain()) MGC10.0.10.51@o2ib7: namespace resource [0x726966:0x2:0x0].0x0 (ffff950149d0f8c0) refcount nonzero (1) after lock cleanup; forcing cleanup.
[178242.723055] Lustre: fir-MDT0003: haven't heard from client d59b4a25-94cd-9118-509c-0144bd0df5bb (at 10.9.109.19@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389c5ec00, cur 1576167311 expire 1576167161 last 1576167084
[178300.692570] Lustre: fir-MDT0003: Client 05fc04dc-2fd8-3281-f920-e1a710b648d6 (at 10.8.18.21@o2ib6) reconnecting
[178300.702754] Lustre: Skipped 136 previous similar messages
[178300.708267] Lustre: fir-MDT0003: Connection restored to (at 10.8.18.21@o2ib6)
[178300.715577] Lustre: Skipped 146 previous similar messages
[178319.052535] Lustre: fir-MDT0003: haven't heard from client ec95172c-af62-15aa-37b1-9f40e3145075 (at 10.9.107.7@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff950389fa0c00, cur 1576167388 expire 1576167238 last 1576167161
[178319.074351] Lustre: Skipped 19 previous similar messages
[178320.471630] LustreError: 84958:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576167089, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0003_UUID lock: ffff9533b7be7740/0x5f9f636a313867d1 lrc: 3/1,0 mode: --/PR res: [0x280000dbb:0x18a:0x0].0x0 bits 0x13/0x0 rrc: 354 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 84958 timeout: 0 lvb_type: 0
[178320.511273] LustreError: 84958:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) Skipped 29 previous similar messages
[178368.713171] LNet: Service thread pid 39858 was inactive for 1203.88s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[178368.730283] Pid: 39858, comm: mdt00_037 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[178368.740540] Call Trace:
[178368.743092] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[178368.750133] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[178368.757430] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[178368.764348] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[178368.771452] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[178368.778456] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[178368.785127] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[178368.791690] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[178368.798553] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[178368.805748] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[178368.812010] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[178368.819043] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[178368.826859] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[178368.833294] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[178368.838314] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[178368.844880] [<ffffffffffffffff>] 0xffffffffffffffff
[178368.850000] LustreError: dumping log to /tmp/lustre-log.1576167437.39858
[178368.857561] Pid: 39765, comm: mdt02_019 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[178368.867852] Call Trace:
[178368.870400] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[178368.877423] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[178368.884718] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[178368.891662] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[178368.898770] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[178368.905787] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[178368.912457] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[178368.919045] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[178368.925905] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[178368.933111] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[178368.939364] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[178368.946398] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[178368.954202] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[178368.960631] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[178368.965639] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[178368.972207] [<ffffffffffffffff>] 0xffffffffffffffff
[178368.977301] Pid: 39773, comm: mdt00_022 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[178368.987576] Call Trace:
[178368.990120] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[178368.997145] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[178369.004432] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[178369.011351] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[178369.018456] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[178369.025452] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[178369.032127] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[178369.038694] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[178369.045557] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[178369.052743] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[178369.058998] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[178369.066021] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[178369.073834] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[178369.080268] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[178369.085277] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[178369.091838] [<ffffffffffffffff>] 0xffffffffffffffff
[178524.363111] Pid: 39834, comm: mdt02_034 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[178524.373375] Call Trace:
[178524.375938] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[178524.382983] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[178524.390279] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[178524.397195] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[178524.404299] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[178524.411321] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[178524.417991] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[178524.424567] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[178524.431427] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[178524.438623] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[178524.444886] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[178524.451920] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[178524.459735] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[178524.466148] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[178524.471164] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[178524.477742] [<ffffffffffffffff>] 0xffffffffffffffff
[178524.482864] LustreError: dumping log to /tmp/lustre-log.1576167593.39834
[178530.065170] LustreError: 166-1: MGC10.0.10.51@o2ib7: Connection to MGS (at 10.0.10.51@o2ib7) was lost; in progress operations using this service will fail
[178530.079095] LustreError: 39180:0:(ldlm_request.c:147:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1576167299, 300s ago), entering recovery for MGS@MGC10.0.10.51@o2ib7_0 ns: MGC10.0.10.51@o2ib7 lock: ffff950371ac18c0/0x5f9f636a3164e1e4 lrc: 4/1,0 mode: --/CR res: [0x726966:0x2:0x0].0x0 rrc: 2 type: PLN flags: 0x1000000000000 nid: local remote: 0xc3c20c06c1e93302 expref: -99 pid: 39180 timeout: 0 lvb_type: 0
[178530.116939] LustreError: 92920:0:(ldlm_resource.c:1147:ldlm_resource_complain()) MGC10.0.10.51@o2ib7: namespace resource [0x726966:0x2:0x0].0x0 (ffff950110769200) refcount nonzero (1) after lock cleanup; forcing cleanup.
[178536.651253] Pid: 39735, comm: mdt03_008 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[178536.661514] Call Trace:
[178536.664074] [<ffffffffc0fbbb75>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc]
[178536.671116] [<ffffffffc0fbc5e1>] ldlm_cli_enqueue_local+0x231/0x830 [ptlrpc]
[178536.678434] [<ffffffffc154650b>] mdt_object_local_lock+0x50b/0xb20 [mdt]
[178536.685352] [<ffffffffc1546b90>] mdt_object_lock_internal+0x70/0x360 [mdt]
[178536.692456] [<ffffffffc1547cfa>] mdt_getattr_name_lock+0x90a/0x1c30 [mdt]
[178536.699462] [<ffffffffc154fd25>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[178536.706133] [<ffffffffc154cbb5>] mdt_intent_policy+0x435/0xd80 [mdt]
[178536.712703] [<ffffffffc0fa2d46>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[178536.719567] [<ffffffffc0fcb336>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[178536.726763] [<ffffffffc1053a12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[178536.733034] [<ffffffffc105836a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[178536.740065] [<ffffffffc0fff24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[178536.747904] [<ffffffffc1002bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[178536.754324] [<ffffffff892c2e81>] kthread+0xd1/0xe0
[178536.759338] [<ffffffff89977c24>] ret_from_fork_nospec_begin+0xe/0x21
[178536.765902] [<ffffffffffffffff>] 0xffffffffffffffff
[178536.771032] LustreError: dumping log to /tmp/lustre-log.1576167605.39735
[178536.778389] LNet: Service thread pid 39241 was inactive for 1204.24s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[178536.791425] LNet: Skipped 1 previous similar message
[178564.043637] SysRq : Trigger a crash
[178564.047293] BUG: unable to handle kernel NULL pointer dereference at (null)
[178564.055260] IP: [<ffffffff89664446>] sysrq_handle_crash+0x16/0x20
[178564.061474] PGD 4012a01067 PUD 401b27e067 PMD 0
[178564.066259] Oops: 0002 [#1] SMP
[178564.069629] Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) ldiskfs(OE) lustre(OE) lmv(OE) mdc(OE) osc(OE) lov(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) dell_rbu sunrpc vfat fat dm_round_robin amd64_edac_mod edac_mce_amd kvm_amd kvm irqbypass crc32_pclmul ghash_clmulni_intel dcdbas ses aesni_intel enclosure lrw gf128mul glue_helper ablk_helper sg cryptd ipmi_si pcspkr ccp dm_multipath ipmi_devintf dm_mod ipmi_msghandler i2c_piix4 k10temp acpi_power_meter ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic mlx5_ib(OE)
[178564.141996] ib_uverbs(OE) ib_core(OE) i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops mlx5_core(OE) ttm ahci mlxfw(OE) devlink crct10dif_pclmul libahci crct10dif_common mpt3sas(OE) drm tg3 mlx_compat(OE) libata crc32c_intel raid_class ptp megaraid_sas scsi_transport_sas drm_panel_orientation_quirks pps_core
[178564.170920] CPU: 18 PID: 92886 Comm: bash Kdump: loaded Tainted: G OE ------------ 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[178564.183251] Hardware name: Dell Inc. PowerEdge R6415/07YXFK, BIOS 1.10.6 08/15/2019
[178564.190990] task: ffff9521056fe180 ti: ffff952212ae0000 task.ti: ffff952212ae0000
[178564.198558] RIP: 0010:[<ffffffff89664446>] [<ffffffff89664446>] sysrq_handle_crash+0x16/0x20
[178564.207190] RSP: 0018:ffff952212ae3e58 EFLAGS: 00010246
[178564.212590] RAX: ffffffff89664430 RBX: ffffffff89ee4f80 RCX: 0000000000000000
[178564.219808] RDX: 0000000000000000 RSI: ffff9523bf713898 RDI: 0000000000000063
[178564.227026] RBP: ffff952212ae3e58 R08: ffffffff8a1e38bc R09: ffffffff8a2956e3
[178564.234246] R10: 0000000000001f2f R11: 0000000000001f2e R12: 0000000000000063
[178564.241464] R13: 0000000000000000 R14: 0000000000000007 R15: 0000000000000000
[178564.248685] FS: 00007f0435f02740(0000) GS:ffff9523bf700000(0000) knlGS:0000000000000000
[178564.256858] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[178564.262691] CR2: 0000000000000000 CR3: 0000003fa52dc000 CR4: 00000000003407e0
[178564.269911] Call Trace:
[178564.272452] [<ffffffff89664c6d>] __handle_sysrq+0x10d/0x170
[178564.278197] [<ffffffff896650d8>] write_sysrq_trigger+0x28/0x40
[178564.284207] [<ffffffff894b9710>] proc_reg_write+0x40/0x80
[178564.289780] [<ffffffff89442700>] vfs_write+0xc0/0x1f0
[178564.295003] [<ffffffff8944351f>] SyS_write+0x7f/0xf0
[178564.300146] [<ffffffff89977ddb>] system_call_fastpath+0x22/0x27
[178564.306233] Code: eb 9b 45 01 f4 45 39 65 34 75 e5 4c 89 ef e8 e2 f7 ff ff eb db 66 66 66 66 90 55 48 89 e5 c7 05 91 31 7e 00 01 00 00 00 0f ae f8 <c6> 04 25 00 00 00 00 01 5d c3 66 66 66 66 90 55 31 c0 c7 05 0e
[178564.326842] RIP [<ffffffff89664446>] sysrq_handle_crash+0x16/0x20
[178564.333134] RSP <ffff952212ae3e58>
[178564.336715] CR2: 0000000000000000
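The final oops is not a spontaneous crash: "SysRq : Trigger a crash" together with the write_sysrq_trigger()/sysrq_handle_crash() frames shows the NULL dereference was requested through the magic SysRq interface, presumably by an operator collecting a vmcore with the loaded kdump service ("Kdump: loaded" above). A minimal sketch of that sequence on an EL7 system, assuming a root shell on the console:

    # Verify the crash kernel is armed before deliberately crashing (RHEL 7 style).
    systemctl is-active kdump
    # Trigger the same NULL-pointer crash path recorded above; kdump then captures a vmcore.
    echo c > /proc/sysrq-trigger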