[LU-2186] seq_server_alloc_meta() NULL deref
Created: 15/Oct/12  Updated: 02/Jan/13  Resolved: 08/Nov/12

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: Lustre 2.4.0

Type: Bug
Priority: Blocker
Reporter: Brian Behlendorf
Assignee: Alex Zhuravlev
Resolution: Fixed
Votes: 0
Labels: topsequoia
Environment:

RHEL6.2, the Sequoia MDS


Issue Links:
Duplicate
is duplicated by LU-2256 NULL pointer dereference in seq_serve... Resolved
Severity: 3
Epic: server
Project: Orion
Rank (Obsolete): 5228

 Description   

Observed while running a current version of master, 2.3.53.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: [<ffffffffa06ad93e>] seq_server_alloc_meta+0x51e/0x700 [fid]
PGD 0 
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:80/0000:80:02.2/0000:83:00.0/host7/port-7:0/expander-7:0/port-7:0:13/end_device-7:0:13/target7:0:17/7:0:17:0/timeout
CPU 9 

Pid: 33477, comm: mdt_mdss_0003 Tainted: P        W  ----------------   2.6.32-220.23.1.1chaos.ch5.x86_64 #1 appro 2620x-in/S2600GZ
RIP: 0010:[<ffffffffa06ad93e>]  [<ffffffffa06ad93e>] seq_server_alloc_meta+0x51e/0x700 [fid]
RSP: 0018:ffff881fa8007ca0  EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000200003e98 RCX: 0000000200004280
RDX: 00000000000003e8 RSI: ffff881faadb40c0 RDI: ffff880fcc9e9500
RBP: ffff881fa8007ce0 R08: 0000000000000000 R09: ffff881e0ee63e00
R10: 0000000000000009 R11: ffffffffa09e2090 R12: ffff881e0ee63fe8
R13: ffff881faadb4130 R14: ffff881faadb40c0 R15: ffff880fcc9e9500
FS:  00007ffff7fdc700(0000) GS:ffff881078820000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000010 CR3: 0000000001a85000 CR4: 00000000000406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 
Process mdt_mdss_0003 (pid: 33477, threadinfo ffff881fa8006000, task ffff8820150b8080)
Stack:
 ffff881fa8007cb0 ffff881a5019a400 ffff880fcc9e9b40 ffff881a5019a400
<0> ffff880fcc9e9b40 ffff880fcc9e9500 ffff881e0ee63fe8 00000000ffffffea
<0> ffff881fa8007d30 ffffffffa06ade9f ffff881fa8007d10 ffffc900c2888988
Call Trace:
 [<ffffffffa06ade9f>] seq_query+0x37f/0x6d0 [fid]
 [<ffffffffa0f39322>] mdt_handle_common+0x932/0x1760 [mdt]
 [<ffffffffa0f3a1c5>] mdt_mdss_handle+0x15/0x20 [mdt]
 [<ffffffffa0948bfc>] ptlrpc_server_handle_request+0x41c/0xe00 [ptlrpc]
 [<ffffffffa05b26be>] ? cfs_timer_arm+0xe/0x10 [libcfs]
 [<ffffffffa05c414f>] ? lc_watchdog_touch+0x6f/0x180 [libcfs]
 [<ffffffffa093ffb9>] ? ptlrpc_wait_event+0xa9/0x2a0 [ptlrpc]
 [<ffffffff81051ba3>] ? __wake_up+0x53/0x70
 [<ffffffffa094a1ec>] ptlrpc_main+0xc0c/0x19f0 [ptlrpc]
 [<ffffffffa09495e0>] ? ptlrpc_main+0x0/0x19f0 [ptlrpc]
 [<ffffffff8100c14a>] child_rip+0xa/0x20
 [<ffffffffa09495e0>] ? ptlrpc_main+0x0/0x19f0 [ptlrpc]
 [<ffffffffa09495e0>] ? ptlrpc_main+0x0/0x19f0 [ptlrpc]
 [<ffffffff8100c140>] ? child_rip+0x0/0x20


 Comments   
Comment by Peter Jones [ 15/Oct/12 ]

Alex

Could you please assign someone to look into this one?

Thanks

Peter

Comment by Brian Behlendorf [ 15/Oct/12 ]
(gdb) list *(seq_server_alloc_meta+0x51e)
0x196e is in seq_server_alloc_meta (/builddir/build/BUILD/lustre-2.3.53/lustre/fid/fid_handler.c:211).
206     /builddir/build/BUILD/lustre-2.3.53/lustre/fid/fid_handler.c: No such file or directory.
        in /builddir/build/BUILD/lustre-2.3.53/lustre/fid/fid_handler.c


        if (range_is_exhausted(loset)) {
                /* reached high water mark. */
>>>             struct lu_device *dev = seq->lss_site->ms_lu->ls_top_dev;
                int obd_num_clients = dev->ld_obd->obd_num_exports;
                __u64 set_sz;
        }

It looks like seq->lss_site->ms_lu is NULL. At least, that is consistent with the offset in the NULL dereference (the faulting address is 0x10) and is roughly where gdb pointed me. I'm not sure how that can happen.
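
The reasoning above (a NULL ms_lu producing a fault at a small non-zero address) can be illustrated with a minimal, self-contained sketch. The structure layouts below are hypothetical stand-ins, not copied from the Lustre headers: the placeholder field is chosen only so that ls_top_dev lands at offset 0x10, matching the CR2 value in the oops. The guarded accessor seq_num_exports() is likewise an illustrative name and is not a description of what patch 4280 does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the structures named in the
 * excerpt. Field order and sizes are ASSUMPTIONS made for illustration. */
struct obd_device { int obd_num_exports; };
struct lu_device  { struct obd_device *ld_obd; };

struct lu_site {
        uint64_t          ls_placeholder[2]; /* assumed 0x10 bytes of earlier fields */
        struct lu_device *ls_top_dev;        /* therefore at offset 0x10 */
};

struct md_site       { struct lu_site *ms_lu; };
struct lu_server_seq { struct md_site *lss_site; };

/* Address the CPU would read when loading ->ls_top_dev from a lu_site
 * located at site_base. With site_base == 0 (a NULL ms_lu) this is the
 * small non-zero address that shows up as the faulting address. */
static uintptr_t top_dev_load_address(uintptr_t site_base)
{
        return site_base + offsetof(struct lu_site, ls_top_dev);
}

/* Guarded walk of seq->lss_site->ms_lu->ls_top_dev->ld_obd.
 * Returns 0 and stores the export count in *out on success, or -1 if
 * any pointer in the chain is NULL, instead of dereferencing it. */
static int seq_num_exports(const struct lu_server_seq *seq, int *out)
{
        const struct lu_device *dev;

        if (seq == NULL || seq->lss_site == NULL ||
            seq->lss_site->ms_lu == NULL)
                return -1;

        dev = seq->lss_site->ms_lu->ls_top_dev;
        if (dev == NULL || dev->ld_obd == NULL)
                return -1;

        *out = dev->ld_obd->obd_num_exports;
        return 0;
}

/* Sanity check of the layout assumption used in this sketch. */
static void check_layout_assumption(void)
{
        assert(offsetof(struct lu_site, ls_top_dev) == 0x10);
}
```

Under these assumptions, top_dev_load_address(0) evaluates to 0x10, which is why a NULL ms_lu is a plausible reading of the oops. A guard like this would only avoid the crash; it does not explain why ms_lu was unset when the request arrived.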

Comment by Alex Zhuravlev [ 16/Oct/12 ]

Please try with http://review.whamcloud.com/4280

Comment by Christopher Morrone [ 31/Oct/12 ]

I added patch 4280 to our 2.3.54-llnl branch.

Comment by Alex Zhuravlev [ 08/Nov/12 ]

landed on master
