Details
-
Bug
-
Resolution: Fixed
-
Major
-
Lustre 2.3.0
-
lustre 2.2.93
bullxlinux distribution (based on redhat 6.2)
kernel 2.6.32-220
-
3
-
4442
Description
The lustre version is 2.2.93.
When reading the file /proc/fs/lustre/ost/OSS/ost_create/req_history, the system crashed with LBUG ASSERTION( !list_empty(&svcpt->scp_hist_reqs).
Here are some information from the core dump
KERNEL: /usr/lib/debug/lib/modules/2.6.32-220.23.1.bl6.Bull.28.8.x86_64/vmlinux
DUMPFILE: /var/crash/127.0.0.1-2012-09-07-09:51:01/vmcore [PARTIAL DUMP]
CPUS: 16
DATE: Fri Sep 7 09:50:45 2012
UPTIME: 1 days, 19:58:49
LOAD AVERAGE: 0.05, 0.05, 0.05
TASKS: 1006
NODENAME: mo88
RELEASE: 2.6.32-220.23.1.bl6.Bull.28.8.x86_64
VERSION: #1 SMP Thu Jul 5 17:34:18 CEST 2012
MACHINE: x86_64 (2199 Mhz)
MEMORY: 32 GB
PANIC: "Kernel panic - not syncing: LBUG"
PID: 29617
COMMAND: "cat"
TASK: ffff8806e65437d0 [THREAD_INFO: ffff8804dbf1c000]
CPU: 9
STATE: TASK_RUNNING (PANIC)
crash> bt
PID: 29617 TASK: ffff8806e65437d0 CPU: 9 COMMAND: "cat"
#0 [ffff8804dbf1fbf0] machine_kexec at ffffffff8102895b
#1 [ffff8804dbf1fc50] crash_kexec at ffffffff810a4622
#2 [ffff8804dbf1fd20] panic at ffffffff81484647
#3 [ffff8804dbf1fda0] lbug_with_loc at ffffffffa0680f6b [libcfs]
#4 [ffff8804dbf1fdc0] ptlrpc_lprocfs_svc_req_history_seek at ffffffffa0c30104 [ptlrpc]
#5 [ffff8804dbf1fdd0] ptlrpc_lprocfs_svc_req_history_next at ffffffffa0c301e1 [ptlrpc]
#6 [ffff8804dbf1fe20] seq_read at ffffffff81185e9a
#7 [ffff8804dbf1fea0] proc_reg_read at ffffffff811c84ee
#8 [ffff8804dbf1fef0] vfs_read at ffffffff81163a15
#9 [ffff8804dbf1ff30] sys_read at ffffffff81163b51
#10 [ffff8804dbf1ff80] system_call_fastpath at ffffffff810030f2
RIP: 0000003dc64d83f0 RSP: 00007fff6cb0c9e0 RFLAGS: 00010206
RAX: 0000000000000000 RBX: ffffffff810030f2 RCX: 00000000024a7030
RDX: 0000000000008000 RSI: 000000000249f000 RDI: 0000000000000003
RBP: 000000000249f000 R8: 0000000000000003 R9: 0000000001000000
R10: 0000000000008fff R11: 0000000000000246 R12: ffffffffffff8000
R13: 0000000000000003 R14: 0000000000008000 R15: 0000000000000003
ORIG_RAX: 0000000000000000 CS: 0033 SS: 002b
crash> dmesg | tail -n 50
Lustre: fsperf-OST0005: Now serving fsperf-OST0005 on /dev/dm-11 with recovery enabled
Lustre: 27386:0:(ldlm_lib.c:2110:target_recovery_init()) RECOVERY: service fsperf-OST000a, 1 recoverable clients, last_transno 1340929
Lustre: 27386:0:(ldlm_lib.c:2110:target_recovery_init()) Skipped 3 previous similar messages
Lustre: fsperf-OST000a: Now serving fsperf-OST000a on /dev/dm-26 with recovery enabled
Lustre: Skipped 3 previous similar messages
Lustre: 27419:0:(ldlm_lib.c:2110:target_recovery_init()) RECOVERY: service fsperf-OST0001, 1 recoverable clients, last_transno 1340929
Lustre: 27419:0:(ldlm_lib.c:2110:target_recovery_init()) Skipped 6 previous similar messages
Lustre: fsperf-OST0001: Now serving fsperf-OST0001 on /dev/dm-16 with recovery enabled
Lustre: Skipped 6 previous similar messages
LustreError: 137-5: UUID 'fsperf-OST000f_UUID' is not available for connect (no target)
Lustre: fsperf-OST0001: Will be in recovery for at least 5:00, or until 1 client reconnects
Lustre: fsperf-OST000b: Recovery over after 0:01, of 1 clients 1 recovered and 0 were evicted.
Lustre: fsperf-OST000b: received MDS connection from 32.0.0.39@o2ib1
Lustre: Skipped 14 previous similar messages
Lustre: fsperf-OST0006: Recovery over after 0:01, of 1 clients 1 recovered and 0 were evicted.
Lustre: fsperf-OST000e: received MDS connection from 32.0.0.39@o2ib1
Lustre: Skipped 13 previous similar messages
Lustre: Echo OBD driver; http://www.lustre.org/
mlx4_core 0000:04:00.0: vpd r/w failed. This is likely a firmware bug on this device. Contact the card vendor for a firmware update.
mlx4_core 0000:82:00.0: vpd r/w failed. This is likely a firmware bug on this device. Contact the card vendor for a firmware update.
process `cat' is using deprecated sysctl (syscall) net.ipv6.neigh.default.retrans_time; Use net.ipv6.neigh.default.retrans_time_ms instead.
LustreError: 29617:0:(lproc_ptlrpc.c:431:ptlrpc_lprocfs_svc_req_history_seek()) ASSERTION( !list_empty(&svcpt->scp_hist_reqs) ) failed:
LustreError: 29617:0:(lproc_ptlrpc.c:431:ptlrpc_lprocfs_svc_req_history_seek()) LBUG
Pid: 29617, comm: cat
Call Trace:
[<ffffffffa0680905>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[<ffffffffa0680f17>] lbug_with_loc+0x47/0xb0 [libcfs]
[<ffffffffa0c30104>] ptlrpc_lprocfs_svc_req_history_seek+0xf4/0x100 [ptlrpc]
[<ffffffffa0c301e1>] ptlrpc_lprocfs_svc_req_history_next+0x71/0x1b0 [ptlrpc]
[<ffffffff81185e9a>] seq_read+0x24a/0x3f0
[<ffffffff811c84ee>] proc_reg_read+0x7e/0xc0
[<ffffffff81163a15>] vfs_read+0xb5/0x1a0
[<ffffffff810c0e1a>] ? audit_syscall_entry+0x26a/0x290
[<ffffffff81163b51>] sys_read+0x51/0x90
[<ffffffff810030f2>] system_call_fastpath+0x16/0x1b
Kernel panic - not syncing: LBUG
Pid: 29617, comm: cat Not tainted 2.6.32-220.23.1.bl6.Bull.28.8.x86_64 #1
Call Trace:
[<ffffffff81484640>] ? panic+0x78/0x143
[<ffffffffa0680f6b>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
[<ffffffffa0c30104>] ? ptlrpc_lprocfs_svc_req_history_seek+0xf4/0x100 [ptlrpc]
[<ffffffffa0c301e1>] ? ptlrpc_lprocfs_svc_req_history_next+0x71/0x1b0 [ptlrpc]
[<ffffffff81185e9a>] ? seq_read+0x24a/0x3f0
[<ffffffff811c84ee>] ? proc_reg_read+0x7e/0xc0
[<ffffffff81163a15>] ? vfs_read+0xb5/0x1a0
[<ffffffff810c0e1a>] ? audit_syscall_entry+0x26a/0x290
[<ffffffff81163b51>] ? sys_read+0x51/0x90
[<ffffffff810030f2>] ? system_call_fastpath+0x16/0x1b
crash> files
PID: 29617 TASK: ffff8806e65437d0 CPU: 9 COMMAND: "cat"
ROOT: / CWD: /root
FD FILE DENTRY INODE TYPE PATH
0 ffff88045cba6180 ffff880217b5b800 ffff88046a2e89c8 FIFO
1 ffff88045cba6600 ffff880217b5bbc0 ffff8802ace6e148 FIFO
2 ffff88045cba6cc0 ffff880217b5b380 ffff88023fd6a048 FIFO
3 ffff880872bc5240 ffff880519514480 ffff880874370d38 REG /proc/fs/lustre/ost/OSS/ost_create/req_history
I can provide additional information from the dump if needed.