[LU-8467] MDS crashed with (tgt_lastrcvd.c:1054:tgt_client_del()) LBUG Created: 02/Aug/16 Updated: 17/Oct/16 Resolved: 17/Oct/16 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.9.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Frank Heckes (Inactive) | Assignee: | Mikhail Pershin |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | soak | ||
| Environment: | lola |
| Attachments: |
|
||||
| Issue Links: |
|
||||
| Severity: | 3 | ||||
| Rank (Obsolete): | 9223372036854775807 | ||||
| Description |
|
Error happened during soak testing of build '20160727' (see https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160727). The issue may be a duplicate of https://jira.hpdd.intel.com/browse/LU-8165.
Sequence of events:
I couldn't extract the debug log from the kernel dump:
KERNEL: usr/lib/debug/lib/modules/2.6.32-573.26.1.el6_lustre.x86_64/vmlinux
DUMPFILE: 127.0.0.1-2016-08-01-10:04:00/vmcore [PARTIAL DUMP]
CPUS: 32
DATE: Mon Aug 1 10:03:45 2016
UPTIME: 2 days, 19:40:09
LOAD AVERAGE: 16.98, 16.25, 16.97
TASKS: 1536
NODENAME: lola-9.lola.whamcloud.com
RELEASE: 2.6.32-573.26.1.el6_lustre.x86_64
VERSION: #1 SMP Tue Jul 26 04:04:13 PDT 2016
MACHINE: x86_64 (2693 Mhz)
MEMORY: 31.9 GB
PANIC: "Kernel panic - not syncing: LBUG"
PID: 6208
COMMAND: "ll_evictor"
TASK: ffff880413fe2040 [THREAD_INFO: ffff880413fec000]
CPU: 25
STATE: TASK_RUNNING (PANIC)
crash> extend /scratch/crash_lustre/lustre.so
/scratch/crash_lustre/lustre.so: shared object loaded
crash> lustre -l /scratch/lola-9-latest-crash.bin
lustre_walk_cpus(0, 5, 1)
cmd p (*cfs_trace_data[0])[0].tcd.tcd_cur_pages // p (*cfs_trace_data[0])[0].tcd.tcd_pages.next
lustre: gdb request failed: "p (*cfs_trace_data[0])[0].tcd.tcd_cur_pages"
Attached files: |
| Comments |
| Comment by Frank Heckes (Inactive) [ 02/Aug/16 ] |
|
Crash file has been saved to lhn.lola.hpdd.intel.com:/scratch/crashdumps/lu-8467/lola-9/127.0.0.1-2016-08-01-10\:04\:00/ |
| Comment by Peter Jones [ 10/Sep/16 ] |
|
Does this issue still occur now? |
| Comment by Cliff White (Inactive) [ 17/Oct/16 ] |
|
We have not seen this issue reappear since running the tip of 2.9; continuing to test. |
| Comment by Peter Jones [ 17/Oct/16 ] |
|
OK, then let's close out the ticket for now and reopen it if it ever recurs.