Details
- Type: Bug
- Resolution: Unresolved
- Priority: Blocker
- Fix Version/s: None
- Affects Version/s: Lustre 2.8.0
- Environment: lola; build: https://build.hpdd.intel.com/job/lustre-b2_8/12/
- Severity: 3
Description
Error occurred during soak testing of build '20160309' (b2_8 RC5) (see also: https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160309). DNE is enabled. MDTs were formatted using ldiskfs, OSTs using zfs. MDS nodes are configured in an active-active HA failover configuration. (For the test set-up configuration see https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-Configuration)
Sequence of events:
- mds_restart : 2016-03-11 03:41:05,597 - 2016-03-11 03:54:49,109 lola-8
- 2016-03-11 03:56 Lustre client lola-32 crashed with the following error:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffffa0a0241f>] ll_open_cleanup+0xaf/0x600 [lustre]
PGD 38a867067 PUD 775372067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:02.0/0000:06:00.0/infiniband_mad/umad0/port
CPU 1
Modules linked in: osc(U) mgc(U) lustre(U) lov(U) mdc(U) fid(U) lmv(U) fld(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) sha512_generic crc32c_intel libcfs(U) nfsi]
Pid: 201682, comm: simul Not tainted 2.6.32-504.30.3.el6.x86_64 #1 Intel Corporation S2600GZ/S2600GZ
RIP: 0010:[<ffffffffa0a0241f>] [<ffffffffa0a0241f>] ll_open_cleanup+0xaf/0x600 [lustre]
RSP: 0018:ffff8801d3f298e8 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff8807f7271a00 RCX: ffff88102dd13ca0
RDX: 0000000000000002 RSI: 0000000000000000 RDI: ffff88102f395000
RBP: ffff8801d3f29928 R08: ffff88034f47e9c0 R09: 0000000000000000
R10: 0000000000000010 R11: 0000000000000000 R12: 0000000000000000
R13: ffff88082dd02c00 R14: ffff88081a5fc800 R15: ffff8801d3f29988
FS: 00007f72f89d3700(0000) GS:ffff880045e20000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 00000003b89f2000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process simul (pid: 201682, threadinfo ffff8801d3f28000, task ffff8804c6d73520)
Stack:
 ffff8801d3f298f8 0000000000000000 ffff8801d3f29928 0000000000000001
 ffff88082dd02c00 ffff88081a5fc800 ffff8809648898c0 ffff8801d3f29988
 ffff8801d3f299f8 ffffffffa0a09c4a fffffffffffffffb 00ff880a531788c0
Call Trace:
 [<ffffffffa0a09c4a>] ll_prep_inode+0x20a/0xc40 [lustre]
 [<ffffffffa07804b2>] ? __req_capsule_get+0x162/0x6e0 [ptlrpc]
 [<ffffffffa0a214f0>] ? ll_md_blocking_ast+0x0/0x7d0 [lustre]
 [<ffffffffa0a21fe1>] ll_lookup_it_finish+0x321/0x12e0 [lustre]
 [<ffffffffa0723dd0>] ? ldlm_lock_decref_internal+0x2e0/0xa80 [ptlrpc]
 [<ffffffffa0569925>] ? class_handle2object+0x95/0x190 [obdclass]
 [<ffffffff81174ab3>] ? kmem_cache_alloc_trace+0x1b3/0x1c0
 [<ffffffffa0a1e439>] ? ll_i2suppgid+0x19/0x30 [lustre]
 [<ffffffffa0a1e47e>] ? ll_i2gids+0x2e/0xd0 [lustre]
 [<ffffffffa0a214f0>] ? ll_md_blocking_ast+0x0/0x7d0 [lustre]
 [<ffffffffa0a23226>] ll_lookup_it+0x286/0xda0 [lustre]
 [<ffffffffa0a23dc9>] ll_lookup_nd+0x89/0x4f0 [lustre]
 [<ffffffff8119e055>] do_lookup+0x1a5/0x230
 [<ffffffff8119ece4>] __link_path_walk+0x7a4/0x1000
 [<ffffffff8119f7fa>] path_walk+0x6a/0xe0
 [<ffffffff8119fa0b>] filename_lookup+0x6b/0xc0
 [<ffffffff8122daa6>] ? security_file_alloc+0x16/0x20
 [<ffffffff811a0ee4>] do_filp_open+0x104/0xd20
 [<ffffffff81063c63>] ? perf_event_task_sched_out+0x33/0x70
 [<ffffffff8129943a>] ? strncpy_from_user+0x4a/0x90
 [<ffffffff811ae392>] ? alloc_fd+0x92/0x160
 [<ffffffff8118b157>] do_sys_open+0x67/0x130
 [<ffffffff8118b260>] sys_open+0x20/0x30
 [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
Code: ba 38 01 00 00 31 f6 e8 30 62 b6 ff f6 05 fd 6b a9 ff 10 74 0d 80 3d f0 6b a9 ff 00 0f 88 ba 01 00 00 48 85 db 0f 84 16 02 00 00 <49> 8b 04 24 48 89 03 49
RIP [<ffffffffa0a0241f>] ll_open_cleanup+0xaf/0x600 [lustre]
 RSP <ffff8801d3f298e8>
CR2: 0000000000000000
Attached are the client (lola-32) messages, console log, and vmcore-dmesg.txt file.
The issue appears to be caused by a race between replay and the normal processing of the open request.
Thread 1:
Sends the LDLM_ENQUEUE request to MDT1 to open the directory, and after receiving the reply from MDT1, sends requests to the other MDTs to get the attributes. It then gets stuck waiting on one of those requests to the other MDTs.
Thread 2:
Closes the file, so the "rq_replay" flag of the open request is cleared in "mdc_close".
Then MDT1 failed over and recovery was initiated. The LDLM_ENQUEUE request from Thread 1 was replayed to MDT1 but failed with -ENOTCONN, so its "rq_repmsg" contains only the "PTLRPC_BODY" field.
The issue is then triggered in "ll_open_cleanup", which dereferences reply fields that are no longer present.