[LU-10625] recovery-mds-scale test_failover_ost: test_failover_ost returned 7
Created: 07/Feb/18  Updated: 27/Mar/20  Resolved: 27/Mar/20

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.11.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Sarah Liu
Assignee: Yang Sheng
Resolution: Cannot Reproduce
Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

recovery-mds-scale test_failover_ost - test_failover_ost returned 7

This issue was created by maloo for sarah_lw <wei3.liu@intel.com>

This issue relates to the following test suite run:
https://testing.hpdd.intel.com/test_sets/b989acc6-ff51-11e7-a7cd-52540065bddc

test_failover_ost failed with the following error:

test_failover_ost returned 7

client dmesg

Server failover period: 1200 seconds
Exited after:           0 seconds
Number of failovers before exit:
mds1: 0 times
ost1: 0 times
ost2: 0 times
ost3: 0 times
ost4: 0 times
ost5: 0 times
ost6: 0 times
ost
[86211.448649] Lustre: DEBUG MARKER: Duration: 86400
[86211.613986] Lustre: DEBUG MARKER: test -f /tmp/client-load.pid &&
        { kill -s TERM $(cat /tmp/client-load.pid); rm -f /tmp/client-load.pid; }
[86401.075681] INFO: task sync:23671 blocked for more than 120 seconds.
[86401.076443] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[86401.077287] sync            D ffff880036be0000     0 23671  23636 0x00000080
[86401.078094] Call Trace:
[86401.078486]  [<ffffffff816a9700>] ? bit_wait+0x50/0x50
[86401.079156]  [<ffffffff816ab6d9>] schedule+0x29/0x70
[86401.079713]  [<ffffffff816a90e9>] schedule_timeout+0x239/0x2c0
[86401.080472]  [<ffffffff810cf98c>] ? dequeue_entity+0x11c/0x5d0
[86401.081134]  [<ffffffff81062efe>] ? kvm_clock_get_cycles+0x1e/0x20
[86401.081802]  [<ffffffff816a9700>] ? bit_wait+0x50/0x50
[86401.082360]  [<ffffffff816aac5d>] io_schedule_timeout+0xad/0x130
[86401.083121]  [<ffffffff816aacf8>] io_schedule+0x18/0x20
[86401.083693]  [<ffffffff816a9711>] bit_wait_io+0x11/0x50
[86401.084315]  [<ffffffff816a9235>] __wait_on_bit+0x65/0x90
[86401.084968]  [<ffffffff811839b1>] wait_on_page_bit+0x81/0xa0
[86401.085676]  [<ffffffff810b3570>] ? wake_bit_function+0x40/0x40
[86401.086327]  [<ffffffff81183ae1>] __filemap_fdatawait_range+0x111/0x190
[86401.087055]  [<ffffffff811868d7>] filemap_fdatawait_keep_errors+0x27/0x30
[86401.087913]  [<ffffffff8122fdcd>] sync_inodes_sb+0x16d/0x1f0
[86401.088524]  [<ffffffff812353e0>] ? generic_write_sync+0x60/0x60
[86401.089179]  [<ffffffff812353f9>] sync_inodes_one_sb+0x19/0x20
[86401.089911]  [<ffffffff81206c61>] iterate_supers+0xc1/0x120
[86401.090500]  [<ffffffff812356c4>] sys_sync+0x44/0xb0
[86401.091065]  [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
[86401.091795]  [<ffffffff816b889d>] ? system_call_after_swapgs+0xca/0x214
[86521.091648] INFO: task sync:23671 blocked for more than 120 seconds.
[86521.092460] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[86521.093362] sync            D ffff880036be0000     0 23671  23636 0x00000080
[86521.094227] Call Trace:
[86521.094524]  [<ffffffff816a9700>] ? bit_wait+0x50/0x50
[86521.095146]  [<ffffffff816ab6d9>] schedule+0x29/0x70
[86521.095867]  [<ffffffff816a90e9>] schedule_timeout+0x239/0x2c0
[86521.096549]  [<ffffffff810cf98c>] ? dequeue_entity+0x11c/0x5d0
[86521.097238]  [<ffffffff81062efe>] ? kvm_clock_get_cycles+0x1e/0x20
[86521.098039]  [<ffffffff816a9700>] ? bit_wait+0x50/0x50
[86521.098641]  [<ffffffff816aac5d>] io_schedule_timeout+0xad/0x130
[86521.099402]  [<ffffffff816aacf8>] io_schedule+0x18/0x20
[86521.100042]  [<ffffffff816a9711>] bit_wait_io+0x11/0x50
[86521.100724]  [<ffffffff816a9235>] __wait_on_bit+0x65/0x90
[86521.101351]  [<ffffffff811839b1>] wait_on_page_bit+0x81/0xa0
[86521.102021]  [<ffffffff810b3570>] ? wake_bit_function+0x40/0x40
[86521.102790]  [<ffffffff81183ae1>] __filemap_fdatawait_range+0x111/0x190
[86521.103543]  [<ffffffff811868d7>] filemap_fdatawait_keep_errors+0x27/0x30
[86521.104338]  [<ffffffff8122fdcd>] sync_inodes_sb+0x16d/0x1f0
[86521.105080]  [<ffffffff812353e0>] ? generic_write_sync+0x60/0x60
[86521.105778]  [<ffffffff812353f9>] sync_inodes_one_sb+0x19/0x20
[86521.106516]  [<ffffffff81206c61>] iterate_supers+0xc1/0x120
[86521.107177]  [<ffffffff812356c4>] sys_sync+0x44/0xb0
[86521.107837]  [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
[86521.108535]  [<ffffffff816b889d>] ? system_call_after_swapgs+0xca/0x214
[86641.108765] INFO: task sync:23671 blocked for more than 120 seconds.
[86641.109585] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message


 Comments   
Comment by James Nunez (Inactive) [ 08/Feb/18 ]

From the suite_log, we see a problem getting the OST recovery_status:

PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/qt-3.3/bin:/usr/lib64/compat-openmpi16/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/usr/sbin:/sbin:/bin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh _wait_recovery_complete *.lustre-OST0006.recovery_status 1475 
onyx-43vm5: onyx-43vm5.onyx.hpdd.intel.com: executing _wait_recovery_complete *.lustre-OST0006.recovery_status 1475
onyx-43vm5: error: get_param: param_path '*/lustre-OST0006/recovery_status': No such file or directory
ost7 recovery is not completed!
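
The log above shows a missing parameter path rather than a recovery that timed out. As a rough illustration only, the Python sketch below separates those two cases from captured command output. The function name classify_recovery_output is hypothetical, and the "status:" line format of recovery_status is an assumption that this ticket does not show.

```python
import re


def classify_recovery_output(text):
    """Classify text captured from a recovery_status query.

    Returns "missing", "complete", "recovering", or "unknown".
    The "status: <STATE>" line format is an assumption.
    """
    if "No such file or directory" in text:
        # The parameter path does not exist, e.g. the target is not mounted.
        return "missing"
    match = re.search(r"^status:\s*(\S+)", text, re.MULTILINE)
    if match is None:
        return "unknown"
    state = match.group(1).upper()
    if state == "COMPLETE":
        return "complete"
    if state == "RECOVERING":
        return "recovering"
    return "unknown"
```

With the error line from the suite_log as input, this sketch reports "missing", which points at the mount state of the target rather than at recovery itself.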
Comment by Joseph Gmitter (Inactive) [ 12/Feb/18 ]

Hi Emoly,

Would you be able to look into this one?

Thanks.
Joe

Comment by Brad Hoagland (Inactive) [ 12/Feb/18 ]

Hi Yang Sheng,

Can you take a look at this one?

Thanks,

Brad

Comment by Yang Sheng [ 13/Feb/18 ]

It looks like Lustre was not mounted on OSS 43vm5, so the test case failed to get the recovery status. The client also encountered local storage corruption:

.....
[169060.860916] EXT4-fs error (device vda1): ext4_get_branch:170: inode #8: block 2553887680: comm jbd2/vda1-8: invalid block
[169060.865212] jbd2_journal_bmap: journal block not found at offset 25612 on vda1-8
[169060.866129] Aborting journal on device vda1-8.
[169060.867083] EXT4-fs error (device vda1): ext4_journal_check_start:56: Detected aborted journal
[169060.868197] EXT4-fs (vda1): Remounting filesystem read-only
[172647.495043] LustreError: can't open /tmp/lustre-log1516486001.30309 for dump: rc -30
.......

But this seems unrelated to the OSS failure. Is this issue hit frequently?

Thanks,
YangSheng
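
A minimal Python sketch for spotting this kind of local corruption in a saved dmesg log, assuming the message text matches the excerpt above. The function name readonly_remounts is hypothetical.

```python
import re

# Matches lines such as:
#   EXT4-fs (vda1): Remounting filesystem read-only
READONLY_RE = re.compile(
    r"EXT4-fs \((?P<dev>[^)]+)\): Remounting filesystem read-only"
)


def readonly_remounts(dmesg_text):
    """Return the devices that ext4 remounted read-only, in log order."""
    return [m.group("dev") for m in READONLY_RE.finditer(dmesg_text)]
```

A non-empty result suggests that later write failures on that node, such as the "rc -30" log dump error, may come from the read-only local filesystem.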

Comment by James Nunez (Inactive) [ 13/Feb/18 ]

Yang Sheng - So far, we've only seen this type of failure once. Thanks for investigating this issue.

Comment by Sarah Liu [ 19/Mar/18 ]

Another one on master, ZFS failover test, tag-2.10.59:

https://testing.hpdd.intel.com/test_sets/7b9bba92-2a1e-11e8-b3c6-52540065bddc

 

I found that the OSS hit a kernel panic:

console.trevis-6vm1.log

[37623.026896] Lustre: lustre-OST0000: Connection restored to lustre-MDT0000-mdtlov_UUID (at 10.9.4.57@tcp)
[37623.031350] Lustre: Skipped 5 previous similar messages
[37623.903131] Lustre: 25608:0:(client.c:2100:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1521077843/real 1521077843] req@ffff88001cff4900 x1594926482776128/t0(0) o400->lustre-MDT0000-lwp-OST0000@10.9.4.58@tcp:12/10 lens 224/224 e 0 to 1 dl 1521077852 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
[37623.915211] Lustre: 25608:0:(client.c:2100:ptlrpc_expire_one_request()) Skipped 6 previous similar messages
[37623.924087] Lustre: Evicted from MGS (at 10.9.4.57@tcp) after server handle changed from 0xcc9d4faef44b16 to 0x80d1978e791999d7
[37623.931873] Lustre: MGC10.9.4.57@tcp: Connection restored to 10.9.4.57@tcp (at 10.9.4.57@tcp)
[37623.940042] Lustre: Skipped 4 previous similar messages
[37625.913058] Lustre: 25609:0:(client.c:2100:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1521077845/real 1521077845] req@ffff88001cff7300 x1594926482776256/t0(0) o400->lustre-MDT0000-lwp-OST0000@10.9.4.58@tcp:12/10 lens 224/224 e 0 to 1 dl 1521077854 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
[37625.922469] Lustre: 25609:0:(client.c:2100:ptlrpc_expire_one_request()) Skipped 6 previous similar messages
[37655.308664] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-6vm3.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
[37657.039527] Lustre: DEBUG MARKER: trevis-6vm3.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
[37658.582350] Lustre: lustre-OST0002: deleting orphan objects from 0x0:31686 to 0x0:31745
[37658.587893] Lustre: lustre-OST0003: deleting orphan objects from 0x0:32116 to 0x0:32193
[37658.591017] Lustre: lustre-OST0004: deleting orphan objects from 0x0:31783 to 0x0:31937
[37658.591141] Lustre: lustre-OST0006: deleting orphan objects from 0x0:31913 to 0x0:31937
[37658.591245] Lustre: lustre-OST0005: deleting orphan objects from 0x0:31820 to 0x0:31873
[37658.591345] Lustre: lustre-OST0001: deleting orphan objects from 0x0:31845 to 0x0:31905
[37658.591443] Lustre: lustre-OST0000: deleting orphan objects from 0x0:31788 to 0x0:31873
[37658.928567] LustreError: 167-0: lustre-MDT0000-lwp-OST0000: This client was evicted by lustre-MDT0000; in progress operations using this service will fail.
[37658.934793] LustreError: Skipped 6 previous similar messages
[37658.939547] Lustre: lustre-MDT0000-lwp-OST0000: Connection restored to 10.9.4.57@tcp (at 10.9.4.57@tcp)
[37660.015817] Lustre: DEBUG MARKER: /usr/sbin/lctl mark ==== Checking the clients loads AFTER failover -- failure NOT OK
[37660.350343] Lustre: DEBUG MARKER: ==== Checking the clients loads AFTER failover -- failure NOT OK
[37663.692895] Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 has failed over 32 times, and counting...
[37663.954190] Lustre: DEBUG MARKER: mds1 has failed over 32 times, and counting...
[37695.408011] connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4332352682, last ping 4332357696, now 4332362704
[37695.408011] connection1:0: detected conn error (1022)
[37700.828017] Kernel panic - not syncing: Pool 'lustre-ost3' has encountered an uncorrectable I/O failure and the failure mode property for this pool is set to panic.
[37700.828017] CPU: 0 PID: 28963 Comm: mmp Tainted: P OE ------------ 3.10.0-693.21.1.el7_lustre.x86_64 #1
[37700.828017] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[37700.828017] Call Trace:
[37700.828017] [<ffffffff816ae7c8>] dump_stack+0x19/0x1b
[37700.828017] [<ffffffff816a8634>] panic+0xe8/0x21f
[37700.828017] [<ffffffffc08da256>] zio_suspend+0x106/0x110 [zfs]
[37700.828017] [<ffffffffc0860dda>] mmp_thread+0x70a/0x760 [zfs]
[37700.828017] [<ffffffffc0860560>] ? mmp_random_leaf+0xb0/0xb0 [zfs]
[37700.828017] [<ffffffffc08606d0>] ? mmp_write_done+0x170/0x170 [zfs]
[37700.828017] [<ffffffffc0742fc3>] thread_generic_wrapper+0x73/0x80 [spl]
[37700.828017] [<ffffffffc0742f50>] ? __thread_exit+0x20/0x20 [spl]
[37700.828017] [<ffffffff810b4031>] kthread+0xd1/0xe0
[37700.828017] [<ffffffff810b3f60>] ? insert_kthread_work+0x40/0x40
[37700.828017] [<ffffffff816c0577>] ret_from_fork+0x77/0xb0
[37700.828017] [<ffffffff810b3f60>] ? insert_kthread_work+0x40/0x40
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Initializing cgroup subsys cpuacct
[ 0.000000] Linux version 3.10.0-693.21.1.el7_lustre.x86_64 (jenkins@trevis-309-el7-x8664-2.trevis.hpdd.intel.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC) ) #1 SMP Mon Mar 12 15:21:31 UTC 2018
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.10.0-693.21.1.el7_lustre.x86_64 root=UUID=36a4fa8e-8395-4c4c-9d40-93a0779cd2bb ro console=tty0 LANG=en_US.UTF-8 console=ttyS0,115200 net.ifnames=0 irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 rootflags=nofail acpi_no_memhotplug transparent_hugepage=never disable_cpu_apicid=0 elfcorehdr=867708K
[ 0.000000] Disabled fast string operations
[ 0.000000] e820: BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x0000000000000fff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000001000-0x000000000009dbff] usable
[ 0.000000] BIOS-e820: [mem 0x000000000009dc00-0x000000000009ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000021000000-0x0000000034f5efff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000034fffc00-0x0000000034ffffff] usable
[ 0.000000] BIOS-e820: [mem 0x000000007fffd000-0x000000007fffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fffbc000-0x00000000ffffffff] reserved
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] SMBIOS 2.4 present.
[ 0.000000] Hypervisor detected: KVM
[ 0.000000] e820: last_pfn = 0x35000 max_arch_pfn = 0x400000000
[ 0.000000] x86 PAT enabled: cpu 0, old 0x70106, new 0x7010600070106
[ 0.000000] x2apic enabled by BIOS, switching to x2apic ops
[ 0.000000] found SMP MP-table at [mem 0x000fda30-0x000fda3f] mapped at [ffff8800000fda30]
[ 0.000000] iBFT found at 0x9aff0.
[ 0.000000] Using GB pages for direct mapping
[ 0.000000] RAMDISK: [mem 0x322a2000-0x339fffff]
[ 0.000000] Early table checksum verification disabled
[ 0.000000] ACPI: RSDP 00000000000fd9e0 00014 (v00 BOCHS )
[ 0.000000] ACPI: RSDT 000000007fffd5d0 00034 (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001)
[ 0.000000] ACPI: FACP 000000007ffffe20 00074 (v01 BOCHS BXPCFACP 00000001 BXPC 00000001)
[ 0.000000] ACPI: DSDT 000000007fffd910 024A2 (v01 BXPC BXDSDT 00000001 INTL 20090123)
[ 0.000000] ACPI: FACS 000000007ffffdc0 00040
[ 0.000000] ACPI: SSDT 000000007fffd810 000FF (v01 BOCHS BXPCSSDT 00000001 BXPC 00000001)
[ 0.000000] ACPI: APIC 000000007fffd720 00080 (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001)
[ 0.000000] ACPI: SSDT 000000007fffd610 0010F (v01 BXPC BXSSDTPC 00000001 INTL 20090123)
[ 0.000000] Setting APIC routing to cluster x2apic.
[ 0.000000] NUMA turned off
[ 0.000000] Faking a node at [mem 0x0000000000000000-0x0000000034ffffff]
[ 0.000000] NODE_DATA(0) allocated [mem 0x34f38000-0x34f5efff]
[ 0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[ 0.000000] kvm-clock: cpu 0, msr 0:34ee8001, primary cpu clock
[ 0.000000] Zone ranges:
[ 0.000000] DMA [mem 0x00001000-0x00ffffff]
[ 0.000000] DMA32 [mem 0x01000000-0xffffffff]
[ 0.000000] Normal empty
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x00001000-0x0009cfff]
[ 0.000000] node 0: [mem 0x21000000-0x34f5efff]
[ 0.000000] Initmem setup node 0 [mem 0x00001000-0x34f5efff]
[ 0.000000] ACPI: PM-Timer IO Port: 0xb008
[ 0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
[ 0.000000] ACPI: NR_CPUS/possible_cpus limit of 1 reached. Processor 1/0x1 ignored.
[ 0.000000] ACPI: LAPIC_NMI (acpi_id[0xff] dfl dfl lint[0x1])
[ 0.000000] ACPI: IOAPIC (id[0x00] address[0xfec00000] gsi_base[0])
[ 0.000000] IOAPIC[0]: apic_id 0, version 17, address 0xfec00000, GSI 0-23
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level)
[ 0.000000] Using ACPI (MADT) for SMP configuration information
[ 0.000000] smpboot: 2 Processors exceeds NR_CPUS limit of 1
[ 0.000000] smpboot: Allowing 1 CPUs, 0 hotplug CPUs
[ 0.000000] PM: Registered nosave memory: [mem 0x0009d000-0x0009dfff]
[ 0.000000] PM: Registered nosave memory: [mem 0x0009e000-0x0009ffff]
[ 0.000000] PM: Registered nosave memory: [mem 0x000a0000-0x000effff]
[ 0.000000] PM: Registered nosave memory: [mem 0x000f0000-0x000fffff]
[ 0.000000] PM: Registered nosave memory: [mem 0x00100000-0x20ffffff]
[ 0.000000] PM: Registered nosave memory: [mem 0x34f5f000-0x34ffffff]
[ 0.000000] e820: [mem 0x80000000-0xfffbbfff] available for PCI devices
[ 0.000000] Booting paravirtualized kernel on KVM
[ 0.000000] setup_percpu: NR_CPUS:5120 nr_cpumask_bits:1 nr_cpu_ids:1 nr_node_ids:1
[ 0.000000] PERCPU: Embedded 35 pages/cpu @ffff880034c00000 s104600 r8192 d30568 u2097152
[ 0.000000] kvm-stealtime: cpu 0, msr 34c13480
[ 0.000000] Built 1 zonelists in Node order, mobility grouping on. Total pages: 80611
[ 0.000000] Policy zone: DMA32
[ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.10.0-693.21.1.el7_lustre.x86_64 root=UUID=36a4fa8e-8395-4c4c-9d40-93a0779cd2bb ro console=tty0 LANG=en_US.UTF-8 console=ttyS0,115200 net.ifnames=0 irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 rootflags=nofail acpi_no_memhotplug transparent_hugepage=never disable_cpu_apicid=0 elfcorehdr=867708K
[ 0.000000] Misrouted IRQ fixup and polling support enabled
[ 0.000000] This may significantly impact system performance
[ 0.000000] Disabling memory control group subsystem
[ 0.000000] PID hash table entries: 2048 (order: 2, 16384 bytes)
[ 0.000000] x86/fpu: xstate_offset[2]: 0240, xstate_sizes[2]: 0100
[ 0.000000] xsave: enabled xstate_bv 0x7, cntxt size 0x340 using standard form
[ 0.000000] Memory: 278152k/868352k available (6940k kernel code, 540692k absent, 49508k reserved, 4560k data, 1792k init)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[ 0.000000] x86/pti: Unmapping kernel while in userspace
[ 0.000000] Hierarchical RCU implementation.
[ 0.000000] RCU restricting CPUs from NR_CPUS=5120 to nr_cpu_ids=1.
[ 0.000000] NR_IRQS:327936 nr_irqs:256 0
[ 0.000000] Console: colour dummy device 80x25
[ 0.000000] console [tty0] enabled
[ 0.000000] console [ttyS0] enabled
[ 0.000000] tsc: Detected 2693.508 MHz processor
[ 0.001000] Calibrating delay loop (skipped) preset value.. 5387.01 BogoMIPS (lpj=2693508)
[ 0.001000] pid_max: default: 32768 minimum: 301
[ 0.001000] Security Framework initialized
[ 0.001000] SELinux: Initializing.
[ 0.001000] Yama: becoming mindful.
[ 0.001000] Dentry cache hash table entries: 65536 (order: 7, 524288 bytes)
[ 0.001000] Inode-cache hash table entries: 32768 (order: 6, 262144 bytes)
[ 0.001000] Mount-cache hash table entries: 1024 (order: 1, 8192 bytes)
[ 0.001000] Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes)
[ 0.001000] Initializing cgroup subsys memory
[ 0.001000] Initializing cgroup subsys devices
[ 0.001000] Initializing cgroup subsys freezer
[ 0.001000] Initializing cgroup subsys net_cls
[ 0.001000] Initializing cgroup subsys blkio
[ 0.001000] Initializing cgroup subsys perf_event
[ 0.001000] Initializing cgroup subsys hugetlb
[ 0.001000] Initializing cgroup subsys pids
[ 0.001000] Initializing cgroup subsys net_prio
[ 0.001000] Disabled fast string operations
[ 0.001000] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0
[ 0.001000] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0
[ 0.001000] tlb_flushall_shift: 6
[ 0.001000] FEATURE SPEC_CTRL Not Present
[ 0.001000] FEATURE IBPB_SUPPORT Not Present
[ 0.001000] Spectre V2 : Vulnerable: Retpoline without IBPB
[ 0.001000] Freeing SMP alternatives: 24k freed
[ 0.001000] ACPI: Core revision 20130517
[ 0.001000] ACPI: All ACPI Tables successfully acquired
[ 0.001000] ftrace: allocating 26649 entries in 105 pages
[ 0.001000] Switched APIC routing to physical x2apic.
[ 0.001000] ------------[ cut here ]------------
[ 0.001000] WARNING: CPU: 0 PID: 1 at arch/x86/kernel/apic/apic.c:1411 setup_local_APIC+0x2d4/0x3e0
[ 0.001000] Modules linked in:
[ 0.001000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-693.21.1.el7_lustre.x86_64 #1
[ 0.001000] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[ 0.001000] Call Trace:
[ 0.001000] [<ffffffff816ae7c8>] dump_stack+0x19/0x1b
[ 0.001000] [<ffffffff8108ae58>] __warn+0xd8/0x100
[ 0.001000] [<ffffffff8108af9d>] warn_slowpath_null+0x1d/0x20
[ 0.001000] [<ffffffff81055124>] setup_local_APIC+0x2d4/0x3e0
[ 0.001000] [<ffffffff81b6c145>] native_smp_prepare_cpus+0x2a6/0x3b4
[ 0.001000] [<ffffffff81b581d7>] kernel_init_freeable+0xc3/0x21f
[ 0.001000] [<ffffffff8169d5e0>] ? rest_init+0x80/0x80
[ 0.001000] [<ffffffff8169d5ee>] kernel_init+0xe/0xf0
[ 0.001000] [<ffffffff816c0577>] ret_from_fork+0x77/0xb0
[ 0.001000] [<ffffffff8169d5e0>] ? rest_init+0x80/0x80
[ 0.001000] ---[ end trace f68728a0d3053b52 ]---
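
The panic line in the console log above names the affected ZFS pool. A small Python sketch for pulling the pool name out of a saved console log, assuming the panic message wording matches the line shown here. The function name panicked_pools is hypothetical.

```python
import re

# Matches the panic text seen in the console log, capturing the pool name.
PANIC_RE = re.compile(
    r"Kernel panic - not syncing: Pool '(?P<pool>[^']+)' "
    r"has encountered an uncorrectable I/O failure"
)


def panicked_pools(console_text):
    """Return pool names from matching kernel panic lines, in log order."""
    return [m.group("pool") for m in PANIC_RE.finditer(console_text)]
```

Run against this console log, the sketch would report the pool from the panic line, which helps tell a storage-side panic apart from a Lustre recovery problem.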

Comment by Yang Sheng [ 27/Mar/20 ]

Please reopen this if it is hit again.
