[LU-3201] lmv_locate_mds() must check return of lmv_find_target()
Created: 22/Apr/13  Updated: 25/Apr/13  Resolved: 25/Apr/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: Lustre 2.4.0

Type: Bug
Priority: Minor
Reporter: John Hammond
Assignee: John Hammond
Resolution: Fixed
Votes: 0
Labels: lmv

Severity: 3
Rank (Obsolete): 7821

 Description   

lmv_locate_mds() calls lmv_find_target() and dereferences the result but does not check for an ERR_PTR().
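
The pattern at issue can be illustrated with a minimal userspace sketch. This is not the Lustre code or the patch under review: `demo_find_target()`, `demo_locate_mds()`, and `struct demo_tgt` are hypothetical stand-ins, and the `ERR_PTR()`/`IS_ERR()`/`PTR_ERR()` helpers are simplified re-implementations of the kernel helpers from `linux/err.h`, written here so the example compiles outside the kernel. It only shows the general idea of testing a pointer-returning lookup for an encoded errno before dereferencing it.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's ERR_PTR helpers (linux/err.h). */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
    /* Error pointers occupy the top MAX_ERRNO values of the address space. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical target descriptor; not the real Lustre structure. */
struct demo_tgt {
    int idx;
};

static struct demo_tgt demo_targets[2] = { { 0 }, { 1 } };

/* Hypothetical lookup: returns a target, or an errno encoded as a pointer. */
static struct demo_tgt *demo_find_target(int mds)
{
    if (mds < 0 || mds >= 2)
        return ERR_PTR(-EIO);
    return &demo_targets[mds];
}

/* Checked caller: returns 0 and sets *idx, or returns a negative errno. */
static int demo_locate_mds(int mds, int *idx)
{
    struct demo_tgt *tgt = demo_find_target(mds);

    if (IS_ERR(tgt))
        return (int)PTR_ERR(tgt);   /* propagate the error, do not dereference */

    *idx = tgt->idx;
    return 0;
}
```

In this sketch, calling `demo_locate_mds()` with an out-of-range index returns `-EIO` and leaves `*idx` untouched, whereas omitting the `IS_ERR()` test would read through a pointer whose value is a small negative number.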



 Comments   
Comment by John Hammond [ 22/Apr/13 ]

Please see http://review.whamcloud.com/6116.

Comment by John Hammond [ 23/Apr/13 ]

I should have mentioned this earlier: the missing check causes an easily reproducible Oops when running racer with DNE:

# MDSCOUNT=2 MOUNT_2=y llmount.sh
# sh ./lustre/tests/racer.sh
Lustre: DEBUG MARKER: == racer test 1: racer on clients: m DURATION=300 == 09:04:48 (1366725888)
Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000280000400-0x00000002c0000400):0:mdt
Lustre: cli-ctl-lustre-MDT0000-osp-MDT0001: Allocated super-sequence [0x00000002c0000400-0x0000000300000400):1:mdt]
LustreError: 18097:0:(fld_handler.c:158:fld_server_lookup()) srv-lustre-MDT0000: FLD cache range [0x0000000280000400-0x00000002c0000400):0:mdt does not matchrequested flag ffff8801: rc = -5
LustreError: 19895:0:(lmv_fld.c:78:lmv_fld_lookup()) Error while looking for mds number. Seq 0x280000400, err = -5
BUG: unable to handle kernel NULL pointer dereference at 000000000000002b
IP: [<ffffffffa0578c72>] lmv_locate_mds+0x92/0xb0 [lmv]
PGD 16eb72067 PUD 17abf1067 PMD 0 
Oops: 0000 [#1] SMP 
last sysfs file: /sys/devices/system/cpu/possible
CPU 1 
Modules linked in: lustre(U) ofd(U) osp(U) lod(U) ost(U) mdt(U) osd_ldiskfs(U) fsfilt_ldiskfs(U) ldiskfs(U) mdd(U) mgs(U) lquota(U) obdecho(U) mgc(U) lov(U) osc(U) mdc(U) lmv(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ksocklnd(U) lnet(U) libcfs(U) exportfs jbd sha512_generic sha256_generic autofs4 nfs lockd fscache nfs_acl auth_rpcgss sunrpc ipv6 zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) spl(U) zlib_deflate microcode virtio_balloon virtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk pata_acpi ata_generic ata_piix virtio_pci virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod [last unloaded: libcfs]

Pid: 19895, comm: mv Tainted: P           ---------------    2.6.32-279.19.1.el6_lustre_gcov.x86_64 #1 Bochs Bochs
RIP: 0010:[<ffffffffa0578c72>]  [<ffffffffa0578c72>] lmv_locate_mds+0x92/0xb0 [lmv]
RSP: 0018:ffff88017ea459d8  EFLAGS: 00010282
RAX: fffffffffffffffb RBX: ffff88019931e800 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88018ce1b940
RBP: ffff88017ea459f8 R08: ffffffff81c01b00 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffff88016babe600
R13: ffff88017d5a6c00 R14: 0000000000000001 R15: ffff88014e83a4c0
FS:  00007f2a796d07a0(0000) GS:ffff880028300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000000002b CR3: 000000016eb6f000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process mv (pid: 19895, threadinfo ffff88017ea44000, task ffff88017ea43500)
Stack:
 0000000000000095 0000000000000080 ffff88016babe600 ffff88019931e2f8
<d> ffff88017ea45a78 ffffffffa058cdc9 ffffffffffffffff ffffffffa0aa727b
<d> 0000000051769500 ffff88018ce1b8c0 ffff880100000030 ffff88017ea45b28
Call Trace:
 [<ffffffffa058cdc9>] lmv_intent_lookup+0x59/0x770 [lmv]
 [<ffffffffa0aa727b>] ? cfs_set_ptldebug_header+0x2b/0xc0 [libcfs]
 [<ffffffffa058e0ba>] lmv_intent_lock+0x31a/0x370 [lmv]
 [<ffffffffa0ee88d0>] ? ll_md_blocking_ast+0x0/0x750 [lustre]
 [<ffffffffa0ee799e>] ? ll_i2gids+0x2e/0xd0 [lustre]
 [<ffffffffa0ece3ca>] ? ll_prep_md_op_data+0xfa/0x3a0 [lustre]
 [<ffffffffa0eecf21>] ll_lookup_it+0x3a1/0xbf0 [lustre]
 [<ffffffffa0ee88d0>] ? ll_md_blocking_ast+0x0/0x750 [lustre]
 [<ffffffffa0eed7fc>] ll_lookup_nd+0x8c/0x430 [lustre]
 [<ffffffff81190087>] ? d_alloc+0x137/0x1b0
 [<ffffffff81185d45>] do_lookup+0x1a5/0x230
 [<ffffffff81186604>] __link_path_walk+0x734/0x1030
 [<ffffffff8113c307>] ? handle_pte_fault+0xf7/0xb50
 [<ffffffff8118718a>] path_walk+0x6a/0xe0
 [<ffffffff8118735b>] do_path_lookup+0x5b/0xa0
 [<ffffffff81187fc7>] user_path_at+0x57/0xa0
 [<ffffffff8126a2e1>] ? cpumask_any_but+0x31/0x50
 [<ffffffff8117cccc>] vfs_fstatat+0x3c/0x80
 [<ffffffff8117ce3b>] vfs_stat+0x1b/0x20
 [<ffffffff8117ce64>] sys_newstat+0x24/0x50
 [<ffffffff810d3d27>] ? audit_syscall_entry+0x1d7/0x200
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Code: 01 48 83 c2 08 39 cf 7f e8 8b 34 25 30 00 00 00 31 c0 41 89 74 24 40 48 83 c4 10 5b 41 5c c9 c3 66 0f 1f 84 00 00 00 00 00 48 98 <8b> 70 30 41 89 74 24 40 48 83 c4 10 5b 41 5c c9 c3 0f 1f 44 00 
LustreError: 18096:0:(fld_handler.c:158:fld_server_lookup()) srv-lustre-MDT0000: FLD cache range [0x0000000280000400-0x00000002c0000400):0:mdt does not matchrequested flag ffff8801: rc = -5
RIP  [<ffffffffa0578c72>] lmv_locate_mds+0x92/0xb0 [lmv]
 RSP <ffff88017ea459d8>
CR2: 000000000000002b
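
The register dump above appears consistent with an error pointer being dereferenced. RAX holds fffffffffffffffb, which is -5 as a 64-bit two's-complement value and matches the err = -5 reported by lmv_fld_lookup(). The marked bytes `<8b> 70 30` in the Code line decode as a 32-bit load from offset 0x30 relative to RAX, and -5 + 0x30 = 0x2b, the address reported in CR2. The small sketch below just reproduces that arithmetic; `fault_address()` is a hypothetical helper introduced for illustration.

```c
#include <stdint.h>

/*
 * Address touched when a load at `offset` is applied to an
 * errno-encoded pointer such as ERR_PTR(-5).
 */
static uint64_t fault_address(int64_t err, uint64_t offset)
{
    return (uint64_t)err + offset;   /* unsigned addition wraps modulo 2^64 */
}
```

With the values from this oops, `fault_address(-5, 0x30)` evaluates to 0x2b.
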
Comment by John Hammond [ 25/Apr/13 ]

Patch landed to master.
