[LU-2820] Crash in lmv_readpage Created: 15/Feb/13 Updated: 09/Jan/20 Resolved: 09/Jan/20 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical |
| Reporter: | Oleg Drokin | Assignee: | WC Triage |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 6825 |
| Description |
|
Just had a dbench run crash in lmv_readpage:
#0 lu_dirent_next (ent=0xffff880025404387)
at /home/green/git/lustre-release/lustre/include/lustre/lustre_idl.h:957
#1 lmv_readpage (exp=<optimized out>, op_data=<optimized out>,
pages=<optimized out>, request=<optimized out>)
at /home/green/git/lustre-release/lustre/lmv/lmv_obd.c:1955
#2 0xffffffffa0dd4bc0 in md_readpage (request=0xffff880055041ba0,
pages=0xffff88008e7d87f0, opdata=0xffff88003fb9edf0, exp=0xffff8800900b6bf0)
at /home/green/git/lustre-release/lustre/include/obd_class.h:2052
#3 ll_dir_filler (_hash=<optimized out>, page0=0xffffea0000825f90)
at /home/green/git/lustre-release/lustre/llite/dir.c:188
#4 0xffffffff811142db in __read_cache_page (gfp=<optimized out>,
data=<optimized out>, filler=<optimized out>, index=<optimized out>,
mapping=<optimized out>) at mm/filemap.c:1771
#5 do_read_cache_page (mapping=0xffff88005461bc70, index=18446744073709551615,
filler=0xffffffffa0dd4920 <ll_dir_filler>, data=0xffff880055041d60,
gfp=<optimized out>) at mm/filemap.c:1791
#6 0xffffffff8111443c in read_cache_page_async (mapping=<optimized out>,
index=<optimized out>, filler=<optimized out>, data=<optimized out>)
at mm/filemap.c:1837
#7 0xffffffff8111444e in read_cache_page (mapping=<optimized out>,
index=<optimized out>, filler=<optimized out>, data=<optimized out>)
at mm/filemap.c:1894
#8 0xffffffffa0dd276d in ll_get_dir_page (dir=0xffff88005461bb08, hash=0, chain=<optimized out>)
at /home/green/git/lustre-release/lustre/llite/dir.c:417
#9 0xffffffffa0dd3387 in ll_dir_read (inode=0xffff88005461bb08,
_pos=0xffff880055041ea0, cookie=0xffff880055041f38,
filldir=0xffffffff8118f230 <filldir>)
at /home/green/git/lustre-release/lustre/llite/dir.c:492
#10 0xffffffffa0dd3749 in ll_readdir (filp=0xffff8800824d3f08,
cookie=0xffff880055041f38, filldir=0xffffffff8118f230 <filldir>)
at /home/green/git/lustre-release/lustre/llite/dir.c:616
#11 0xffffffff8118f4c0 in vfs_readdir (file=0xffff8800824d3f08,
filler=0xffffffff8118f230 <filldir>, buf=0xffff880055041f38)
at fs/readdir.c:39
#12 0xffffffff8118f6b9 in sys_getdents (fd=<optimized out>, dirent=0xbcc068,
count=32768) at fs/readdir.c:213
#13 0xffffffff8100b0f2 in system_call_fastpath ()
at arch/x86/kernel/entry_64.S:488
#14 0x0000000000000246 in per_cpu__irq_stack_union ()
Cannot access memory at address 0xffffffffffffffb0
It failed because the directory entry is outside of the mapped area:
(gdb) p tmp
$1 = (struct lu_dirent *) 0xffff880025404387
(gdb) p ent
$2 = (struct lu_dirent *) 0xffff880025404387
(gdb) p ent->lde_reclen
Cannot access memory at address 0xffff88002540439f <-- Not mapped
(gdb) p dp
$6 = (struct lu_dirpage *) 0xffff8800253fd000
(gdb) p *dp
$8 = {ldp_hash_start = 21410096, ldp_hash_end = 21410112, ldp_flags = 21337584,
ldp_pad0 = 0, ldp_entries = 0xffff8800253fd018}
(gdb) p ent
$9 = (struct lu_dirent *) 0xffff880025404387
(gdb) p 0xffff880025404387-0xffff8800253fd018
$10 = 29551
(gdb) p LDF_EMPTY
$11 = LDF_EMPTY
(gdb) p dp->ldp_flags & LDF_EMPTY
$12 = 0
(gdb) p ((struct lu_dirent *)0xffff8800253fd018)->lde_reclen
$13 = 29551
So the first directory entry in the page claims to be 29551 bytes (about 29k) long, which does not make much sense to me. There were a bunch of watchdog firings on OSTs right before this happened, but I don't think they are related. |
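For reference, the numbers in the gdb session are self-consistent: the faulting address 0xffff88002540439f is ent + 0x18, and ent is ldp_entries + 29551, which puts it 0x18 + 29551 = 29575 bytes past the start of dp. The sketch below shows the kind of bounds check on the record length that would catch such an entry before it is dereferenced. It is only an illustration, not the lmv_readpage code: struct demo_dirent, demo_walk_page, DEMO_PAGE_SIZE and DEMO_HDR_SIZE are hypothetical names, the 4096-byte page size is an assumption, the layout is inferred from the offsets above, and the sketch ignores the little-endian conversion the real code would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for struct lu_dirent. The layout is an assumption
 * chosen so that reclen sits at offset 0x18, as the faulting address suggests.
 */
struct demo_dirent {
    uint8_t  fid[16];
    uint64_t hash;
    uint16_t reclen;    /* byte distance to the next entry, 0 on the last one */
    uint16_t namelen;
    uint32_t attrs;
};

#define DEMO_PAGE_SIZE 4096u    /* assumed page size */
#define DEMO_HDR_SIZE  24u      /* entries start at dp + 0x18 in the gdb output */

/*
 * Walk the entries of one directory page held in a DEMO_PAGE_SIZE buffer.
 * Returns the number of entries seen, or -1 as soon as an entry header or
 * a record length would reach outside the page.
 */
static int demo_walk_page(const uint8_t *page)
{
    size_t off = DEMO_HDR_SIZE;
    int n = 0;

    assert(page != NULL);
    for (;;) {
        struct demo_dirent ent;

        /* The fixed-size entry header itself must fit in the page. */
        if (off + sizeof(ent) > DEMO_PAGE_SIZE)
            return -1;
        memcpy(&ent, page + off, sizeof(ent));
        n++;

        if (ent.reclen == 0)    /* last entry in the page */
            return n;

        /* Reject lengths that overlap this header or leave the page. */
        if (ent.reclen < sizeof(ent) || ent.reclen > DEMO_PAGE_SIZE - off)
            return -1;
        off += ent.reclen;
    }
}
```

With a first entry whose record length is 29551, as in the dump, demo_walk_page returns -1 instead of stepping to an unmapped address. This only detects the bad length; it says nothing about how the page got into that state.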
| Comments |
| Comment by Oleg Drokin [ 15/Feb/13 ] |
|
I also have a crashdump if needed. |
| Comment by Andreas Dilger [ 09/Jan/20 ] |
|
Close old bug |