Details
Type: Bug
Resolution: Unresolved
Priority: Major
Fix Version/s: None
Affects Version/s: Lustre 2.10.4
Labels: None
Severity: 3
Rank (Obsolete): 9223372036854775807
Description
We are experiencing kernel panics during LRU page reclaim, after which the systems reboot.
At the time of the panic, the console displays the following:
kernel:[1333527.166678] LustreError: 68:0:(cl_page.c:410:cl_vmpage_page()) ASSERTION( page->cp_type == CPT_CACHEABLE ) failed:
This has happened on several different systems. Crash dump summary and kernel log from one of them:
KERNEL: /usr/lib/debug/lib/modules/3.10.0-514.el7.x86_64/vmlinux
DUMPFILE: /var/crash/127.0.0.1-2018-11-07-19:19:44/vmcore [PARTIAL DUMP]
CPUS: 8
DATE: Wed Nov 7 19:18:32 2018
UPTIME: 15 days, 10:25:29
LOAD AVERAGE: 4.80, 4.70, 3.88
TASKS: 635
NODENAME: scdm1804.jlab.org
RELEASE: 3.10.0-514.el7.x86_64
VERSION: #1 SMP Tue Nov 22 16:42:41 UTC 2016
MACHINE: x86_64 (3600 Mhz)
MEMORY: 95.4 GB
PANIC: "Kernel panic - not syncing: LBUG"
PID: 68
COMMAND: "khugepaged"
TASK: ffff880c402aedd0 [THREAD_INFO: ffff880c3c294000]
CPU: 7
STATE: TASK_RUNNING (PANIC)
[1333527.166678] LustreError: 68:0:(cl_page.c:410:cl_vmpage_page()) ASSERTION( page->cp_type == CPT_CACHEABLE ) failed:
[1333527.167005] LustreError: 68:0:(cl_page.c:410:cl_vmpage_page()) LBUG
[1333527.167173] Pid: 68, comm: khugepaged
[1333527.167174]
Call Trace:
[1333527.167193] [<ffffffffa0ac27ee>] libcfs_call_trace+0x4e/0x60 [libcfs]
[1333527.167200] [<ffffffffa0ac287c>] lbug_with_loc+0x4c/0xb0 [libcfs]
[1333527.167234] [<ffffffffa0c8f870>] cl_page_slice_add+0x0/0x140 [obdclass]
[1333527.167261] [<ffffffffa109dba3>] ll_releasepage+0x73/0x1a0 [lustre]
[1333527.167266] [<ffffffff81180462>] try_to_release_page+0x32/0x50
[1333527.167269] [<ffffffff811953a0>] shrink_page_list+0x950/0xb00
[1333527.167273] [<ffffffff81195bda>] shrink_inactive_list+0x1fa/0x630
[1333527.167276] [<ffffffff81196775>] shrink_lruvec+0x385/0x770
[1333527.167279] [<ffffffff810c4e83>] ? wake_up_process+0x23/0x40
[1333527.167282] [<ffffffff81196bd6>] shrink_zone+0x76/0x1a0
[1333527.167285] [<ffffffff81196f6d>] zone_reclaim+0x26d/0x2f0
[1333527.167288] [<ffffffff8118a424>] get_page_from_freelist+0x2c4/0x9f0
[1333527.167292] [<ffffffff81029569>] ? __switch_to+0xd9/0x4c0
[1333527.167295] [<ffffffff8168b070>] ? __schedule+0x3b0/0x990
[1333527.167298] [<ffffffff8118acc6>] __alloc_pages_nodemask+0x176/0x420
[1333527.167300] [<ffffffff8118e920>] ? __pagevec_lru_add_fn+0x0/0x220
[1333527.167303] [<ffffffff811e8983>] khugepaged_scan_mm_slot+0x433/0xc70
[1333527.167306] [<ffffffff811e9417>] khugepaged+0x257/0x480
[1333527.167310] [<ffffffff810b1600>] ? autoremove_wake_function+0x0/0x40
[1333527.167312] [<ffffffff811e91c0>] ? khugepaged+0x0/0x480
[1333527.167315] [<ffffffff810b052f>] kthread+0xcf/0xe0
[1333527.167317] [<ffffffff810b0460>] ? kthread+0x0/0xe0
[1333527.167321] [<ffffffff81696518>] ret_from_fork+0x58/0x90
[1333527.167324] [<ffffffff810b0460>] ? kthread+0x0/0xe0
[1333527.167326]
[1333527.167327] Kernel panic - not syncing: LBUG
[1333527.167490] CPU: 7 PID: 68 Comm: khugepaged Tainted: G W OE ------------ 3.10.0-514.el7.x86_64 #1
[1333527.167812] Hardware name: Supermicro SYS-2029BT-HNR/X11DPT-B, BIOS 2.0b 02/24/2018
[1333527.168125] ffffffffa0ae0e8b 00000000a7bd8849 ffff880c3c2976c8 ffffffff81685fac
[1333527.168453] ffff880c3c297748 ffffffff8167f3b3 ffffffff00000008 ffff880c3c297758
[1333527.168774] ffff880c3c2976f8 00000000a7bd8849 00000000a7bd8849 0000000000000246
[1333527.169096] Call Trace:
[1333527.169252] [<ffffffff81685fac>] dump_stack+0x19/0x1b
[1333527.169417] [<ffffffff8167f3b3>] panic+0xe3/0x1f2
[1333527.169586] [<ffffffffa0ac2894>] lbug_with_loc+0x64/0xb0 [libcfs]
[1333527.169787] [<ffffffffa0c8f870>] cl_vmpage_page+0x140/0x140 [obdclass]
[1333527.169969] [<ffffffffa109dba3>] ll_releasepage+0x73/0x1a0 [lustre]
[1333527.170138] [<ffffffff81180462>] try_to_release_page+0x32/0x50
[1333527.170305] [<ffffffff811953a0>] shrink_page_list+0x950/0xb00
[1333527.170471] [<ffffffff81195bda>] shrink_inactive_list+0x1fa/0x630
[1333527.170639] [<ffffffff81196775>] shrink_lruvec+0x385/0x770
[1333527.170804] [<ffffffff810c4e83>] ? wake_up_process+0x23/0x40
[1333527.170971] [<ffffffff81196bd6>] shrink_zone+0x76/0x1a0
[1333527.171135] [<ffffffff81196f6d>] zone_reclaim+0x26d/0x2f0
[1333527.171300] [<ffffffff8118a424>] get_page_from_freelist+0x2c4/0x9f0
[1333527.171469] [<ffffffff81029569>] ? __switch_to+0xd9/0x4c0
[1333527.171634] [<ffffffff8168b070>] ? __schedule+0x3b0/0x990
[1333527.171799] [<ffffffff8118acc6>] __alloc_pages_nodemask+0x176/0x420
[1333527.171967] [<ffffffff8118e920>] ? lru_deactivate_fn+0x1d0/0x1d0
[1333527.172134] [<ffffffff811e8983>] khugepaged_scan_mm_slot+0x433/0xc70
[1333527.172303] [<ffffffff811e9417>] khugepaged+0x257/0x480
[1333527.172468] [<ffffffff810b1600>] ? wake_up_atomic_t+0x30/0x30
[1333527.172633] [<ffffffff811e91c0>] ? khugepaged_scan_mm_slot+0xc70/0xc70
[1333527.172813] [<ffffffff810b052f>] kthread+0xcf/0xe0
[1333527.172975] [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
[1333527.173142] [<ffffffff81696518>] ret_from_fork+0x58/0x90
[1333527.173306] [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
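
To compare this panic with the ones seen on the other systems, the call path can be pulled out of the trace mechanically. The following is only a rough sketch, assuming the log lines keep the format shown above; the names FRAME_RE, parse_frames and reliable_path are made up for illustration and are not part of Lustre or the crash utility. Frames marked with "?" are treated as unreliable and skipped.

```python
import re

# Matches one stack frame, for example:
# [1333527.169969] [<ffffffffa109dba3>] ll_releasepage+0x73/0x1a0 [lustre]
# The optional "? " marks a frame the kernel unwinder was not sure about.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(?P<mod>\w+)\])?"
)


def parse_frames(log_text):
    """Return one dict per stack frame found in log_text, innermost first."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m is None:
            continue  # not a stack frame (timestamps, register dump, etc.)
        frames.append({
            "symbol": m.group("sym"),
            "module": m.group("mod") or "kernel",  # no [module] tag: core kernel
            "reliable": m.group("unreliable") is None,
        })
    return frames


def reliable_path(frames):
    """Symbols of the reliable frames, outermost caller first."""
    return [f["symbol"] for f in reversed(frames) if f["reliable"]]


# A few lines copied from the second call trace in this report.
SAMPLE = """\
[1333527.169787] [<ffffffffa0c8f870>] cl_vmpage_page+0x140/0x140 [obdclass]
[1333527.169969] [<ffffffffa109dba3>] ll_releasepage+0x73/0x1a0 [lustre]
[1333527.170138] [<ffffffff81180462>] try_to_release_page+0x32/0x50
[1333527.170804] [<ffffffff810c4e83>] ? wake_up_process+0x23/0x40
[1333527.172303] [<ffffffff811e9417>] khugepaged+0x257/0x480
"""

if __name__ == "__main__":
    frames = parse_frames(SAMPLE)
    print(" -> ".join(reliable_path(frames)))
    print(sorted({f["module"] for f in frames if f["module"] != "kernel"}))
```

Running the same functions over the kernel log from each crashed node should show whether every panic enters Lustre through ll_releasepage from the page reclaim path, or whether other callers are involved.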