[LU-4360] Use after free in ksocklnd Created: 09/Dec/13 Updated: 11/Mar/14 Resolved: 23/Jan/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.6.0, Lustre 2.5.1, Lustre 2.4.3 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Maloo | Assignee: | Emoly Liu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||
| Severity: | 3 | ||||
| Rank (Obsolete): | 11940 | ||||
| Description |
|
This issue was created by maloo for Oleg Drokin <green@whamcloud.com> This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/d275a9d4-5eb4-11e3-ae30-52540035b04c. The sub-test lustre-initialization_1 failed with the following error:
Info required for matching: lustre-initialization-1 lustre-initialization_1 Testing with DEBUG_PAGEALLOC enabled frequently OST fails to starup hitting an oops: 11:23:12:[ 289.592410] BUG: unable to handle kernel paging request at ffff880013a91e30 11:23:12:[ 289.593112] IP: [<ffffffffa09ef395>] ksocknal_send+0x165/0x450 [ksocklnd] 11:23:12:[ 289.593691] PGD 1a26063 PUD 1a2a063 PMD 19d067 PTE 13a91060 11:23:12:[ 289.594169] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC 11:23:12:[ 289.594557] last sysfs file: /sys/devices/system/cpu/possible 11:23:12:[ 289.595026] CPU 0 11:23:12:[ 289.595188] Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) sha512_generic sha256_generic libcfs(U) ldiskfs(U) jbd2 nfsd exportfs autofs4 nfs lockd fscache auth_rpcgss nfs_acl sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa ib_mad ib_core microcode virtio_balloon 8139too 8139cp mii i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib] 11:23:12:[ 289.596009] 11:23:12:[ 289.596009] Pid: 5138, comm: ll_cfg_requeue Not tainted 2.6.32-358.23.2.el6_lustre.g3ddc521.x86_64 #1 Red Hat KVM 11:23:12:[ 289.596009] RIP: 0010:[<ffffffffa09ef395>] [<ffffffffa09ef395>] ksocknal_send+0x165/0x450 [ksocklnd] 11:23:12:[ 289.596009] RSP: 0018:ffff88002e315880 EFLAGS: 00010286 11:23:12:[ 289.596009] RAX: 0000000000000000 RBX: ffff880013a79ef0 RCX: 0000000000000009 11:23:12:[ 289.596009] RDX: ffff880002200000 RSI: 0000000000000292 RDI: ffffffffa09feb38 11:23:12:[ 289.596009] RBP: ffff88002e3158f0 R08: 0000000000000000 R09: 0000000000000000 11:23:12:[ 289.596009] R10: ffff8800148179e0 R11: 0000000000000000 R12: ffff880013a91df0 11:23:12:[ 289.596009] R13: 0000000000000001 R14: ffff880015c6e740 R15: 0000000000000148 11:23:12:[ 289.596009] FS: 0000000000000000(0000) GS:ffff880002200000(0000) knlGS:0000000000000000 11:23:12:[ 289.596009] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 11:23:12:[ 289.596009] CR2: ffff880013a91e30 CR3: 000000001541c000 CR4: 00000000000006f0 11:23:12:[ 289.596009] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 11:23:12:[ 289.596009] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 11:23:12:[ 289.596009] Process ll_cfg_requeue (pid: 5138, threadinfo ffff88002e314000, task ffff88002e3121c0) 11:23:12:[ 289.596009] Stack: 11:23:12:[ 289.596009] ffff8800350d9001 0000000113a91df0 0000000000000000 ffff8800346a4df0 11:23:12:[ 289.596009] <d> 0000000000000000 000000e800000000 000200000a0a11cc 0000000000003039 11:23:12:[ 289.596009] <d> ffff88002e315910 ffff8800346a4df0 ffff880013a91df0 ffff880013a91df0 11:23:12:[ 289.596009] Call Trace: 11:23:12:[ 289.596009] [<ffffffffa0532dfb>] lnet_ni_send+0x4b/0xf0 [lnet] 11:23:12:[ 289.596009] [<ffffffffa0537005>] lnet_send+0x655/0xb80 [lnet] 11:23:12:[ 289.596009] [<ffffffffa053806a>] LNetPut+0x31a/0x860 [lnet] 11:23:12:[ 289.596009] [<ffffffffa0808dc0>] ptl_send_buf+0x1e0/0x550 [ptlrpc] 11:23:12:[ 289.596009] [<ffffffff8103b82c>] ? kvm_clock_read+0x1c/0x20 11:23:12:[ 289.596009] [<ffffffffa080c2bd>] ptl_send_rpc+0x4dd/0xcc0 [ptlrpc] 11:23:12:[ 289.596009] [<ffffffffa0800994>] ptlrpc_send_new_req+0x454/0x7c0 [ptlrpc] 11:23:12:[ 289.596009] [<ffffffffa08061fe>] ptlrpc_set_wait+0x5ce/0x830 [ptlrpc] 11:23:12:[ 289.596009] [<ffffffffa08101c6>] ? lustre_msg_set_jobid+0xb6/0x140 [ptlrpc] 11:23:12:[ 289.596009] [<ffffffffa08064e7>] ptlrpc_queue_wait+0x87/0x220 [ptlrpc] 11:23:12:[ 289.596009] [<ffffffffa07e1ef5>] ldlm_cli_enqueue+0x365/0x790 [ptlrpc] 11:23:12:[ 289.596009] [<ffffffffa07e6bd0>] ? ldlm_completion_ast+0x0/0x930 [ptlrpc] 11:23:12:[ 289.596009] [<ffffffffa0db0080>] ? mgc_blocking_ast+0x0/0x810 [mgc] 11:23:12:[ 289.596009] [<ffffffffa0db5633>] mgc_process_log+0x283/0x17f0 [mgc] 11:23:12:[ 289.596009] [<ffffffff81080722>] ? del_timer_sync+0x22/0x30 11:23:12:[ 289.596009] [<ffffffffa0db0080>] ? mgc_blocking_ast+0x0/0x810 [mgc] 11:23:12:[ 289.596009] [<ffffffffa07e6bd0>] ? ldlm_completion_ast+0x0/0x930 [ptlrpc] 11:23:12:[ 289.596009] [<ffffffffa0487951>] ? libcfs_debug_msg+0x41/0x50 [libcfs] 11:23:12:[ 289.596009] [<ffffffffa0db775a>] mgc_requeue_thread+0x31a/0x7a0 [mgc] 11:23:12:[ 289.596009] [<ffffffff8105bca0>] ? default_wake_function+0x0/0x20 11:23:12:[ 289.596009] [<ffffffffa0db7440>] ? mgc_requeue_thread+0x0/0x7a0 [mgc] 11:23:12:[ 289.596009] [<ffffffff81095696>] kthread+0x96/0xa0 11:23:12:[ 289.596009] [<ffffffff8100c10a>] child_rip+0xa/0x20 11:23:12:[ 289.596009] [<ffffffff81095600>] ? kthread+0x0/0xa0 11:23:12:[ 289.596009] [<ffffffff8100c100>] ? child_rip+0x0/0x20 11:23:12:[ 289.596009] Code: 00 c7 43 60 c1 00 00 00 48 89 de 48 c7 43 70 00 00 00 00 48 c7 43 68 00 00 00 00 48 8b 55 c0 8b 4d c8 48 8b 7d a8 e8 7b fa ff ff <41> f6 44 24 40 08 74 1b 8b 4d b4 85 c9 0f 84 c8 00 00 00 65 48 11:23:12:[ 289.596009] RIP [<ffffffffa09ef395>] ksocknal_send+0x165/0x450 [ksocklnd] 11:23:12:[ 289.596009] RSP <ffff88002e315880> 11:23:12:[ 289.596009] CR2: ffff880013a91e30 |
| Comments |
| Comment by Peter Jones [ 09/Dec/13 ] |
|
Emoly Could you please look into this one? Thanks Peter |
| Comment by Emoly Liu [ 13/Dec/13 ] |
|
I built lustre on kerne-2.6.32.358.23.2.el6 with CONFIG_DEBUG_PAGEALLOC=y, and ran llmount.sh or started OST manually many times, but failed to hit this failure. Do I miss anything? |
| Comment by Oleg Drokin [ 28/Dec/13 ] |
|
The trick seems to be to have real networking in place and real clients. Interestingly enough my own stacktrace is different which seems to indicate a generic class of problem somewhere, that causes a request taht is being sent to be freed mid-sending. <1>[125127.785385] BUG: unable to handle kernel paging request at ffff8800814bee30 <1>[125127.785701] IP: [<ffffffffa0559395>] ksocknal_send+0x165/0x450 [ksocklnd] <4>[125127.786001] PGD 1a26063 PUD 501067 PMD 50c067 PTE 80000000814be060 <4>[125127.786417] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC <4>[125127.786685] last sysfs file: /sys/devices/system/cpu/possible <4>[125127.786946] CPU 2 <4>[125127.786983] Modules linked in: lustre ofd osp lod ost mdt mdd mgs osd_ldiskfs ldiskfs exportfs lquota lfsck jbd obdecho mgc lov osc mdc lmv fid fld ptlrpc obdclass ksocklnd lnet sha512_generic sha256_generic libcfs ext4 jbd2 mbcache ppdev parport_pc parport virtio_balloon virtio_console i2c_piix4 i2c_core virtio_blk virtio_net virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache auth_rpcgss nfs_acl sunrpc be2iscsi bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: speedstep_lib] <4>[125127.789257] <4>[125127.789257] Pid: 7588, comm: ldlm_cn00_000 Not tainted 2.6.32-rhe6.4-debug2 #1 Bochs Bochs <4>[125127.789257] RIP: 0010:[<ffffffffa0559395>] [<ffffffffa0559395>] ksocknal_send+0x165/0x450 [ksocklnd] <4>[125127.789257] RSP: 0018:ffff8800b611da80 EFLAGS: 00010286 <4>[125127.789257] RAX: 0000000000000000 RBX: ffff88007a167ef0 RCX: ffffc9000638e060 <4>[125127.789257] RDX: 0000000000000001 RSI: 0000000000000282 RDI: ffffffffa0568658 <4>[125127.789257] RBP: ffff8800b611daf0 R08: 0000000000000000 R09: 0000000000000001 <4>[125127.789257] R10: 0000000000000001 R11: ffffffff81d5fb40 R12: ffff8800814bedf0 <4>[125127.789257] R13: 0000000000000001 R14: ffff8800b658fad0 R15: 00000000000000c0 <4>[125127.789257] FS: 0000000000000000(0000) GS:ffff880006280000(0000) knlGS:0000000000000000 <4>[125127.789257] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b <4>[125127.789257] CR2: ffff8800814bee30 CR3: 0000000001a25000 CR4: 00000000000006e0 <4>[125127.789257] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 <4>[125127.789257] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 <4>[125127.789257] Process ldlm_cn00_000 (pid: 7588, threadinfo ffff8800b611c000, task ffff8800b611a240) <4>[125127.789257] Stack: <4>[125127.789257] ffff8800814be001 00000001a04f6dec 0000000000000000 ffff8800b5974df0 <4>[125127.789257] <d> 0000000000000000 000000e800000000 00020000c0a80abf 0000000000003039 <4>[125127.789257] <d> ffff8800b611db10 ffff8800b5974df0 ffff8800814bedf0 ffff8800814bedf0 <4>[125127.789257] Call Trace: <4>[125127.789257] [<ffffffffa04f1dfb>] lnet_ni_send+0x4b/0xf0 [lnet] <4>[125127.789257] [<ffffffffa04f6005>] lnet_send+0x655/0xb80 [lnet] <4>[125127.789257] [<ffffffffa04f706a>] LNetPut+0x31a/0x860 [lnet] <4>[125127.789257] [<ffffffffa07ff290>] ptl_send_buf+0x1e0/0x550 [ptlrpc] <4>[125127.789257] [<ffffffffa081f04b>] ? at_measured+0x25b/0x380 [ptlrpc] <4>[125127.789257] [<ffffffffa0841495>] ? null_authorize+0x75/0x100 [ptlrpc] <4>[125127.789257] [<ffffffffa07ff884>] ptlrpc_send_reply+0x284/0x800 [ptlrpc] <4>[125127.789257] [<ffffffffa0800007>] ptlrpc_reply+0x27/0x30 [ptlrpc] <4>[125127.789257] [<ffffffffa07e01b8>] ldlm_handle_cancel+0x188/0x240 [ptlrpc] <4>[125127.789257] [<ffffffffa07e5c79>] ldlm_cancel_handler+0x1e9/0x500 [ptlrpc] <4>[125127.789257] [<ffffffffa0817c13>] ptlrpc_main+0xcd3/0x1940 [ptlrpc] <4>[125127.789257] [<ffffffffa0816f40>] ? ptlrpc_main+0x0/0x1940 [ptlrpc] <4>[125127.789257] [<ffffffff81094726>] kthread+0x96/0xa0 <4>[125127.789257] [<ffffffff8100c10a>] child_rip+0xa/0x20 <4>[125127.789257] [<ffffffff81094690>] ? kthread+0x0/0xa0 <4>[125127.789257] [<ffffffff8100c100>] ? child_rip+0x0/0x20 <4>[125127.789257] Code: 00 c7 43 60 c1 00 00 00 48 89 de 48 c7 43 70 00 00 00 00 48 c7 43 68 00 00 00 00 48 8b 55 c0 8b 4d c8 48 8b 7d a8 e8 7b fa ff ff <41> f6 44 24 40 08 74 1b 8b 4d b4 85 c9 0f 84 c8 00 00 00 65 48 <1>[125127.789257] RIP [<ffffffffa0559395>] ksocknal_send+0x165/0x450 [ksocklnd] |
| Comment by Oleg Drokin [ 28/Dec/13 ] |
|
This bug was introduced by Isaac in commit e2a2fab993d01597010cb2b44df44a522af0eec8 there is a very profilic use after free in ksocknal_send - once it calls ksocknal_launch_packet - this might trigger a callback that frees the request, and then ksocknal_send tries to access the request again with and we die Potentially other LNDs are succeptible to a similar problem as well. I verified that moving call to ksocknal_launch_packet after the vmflush check fixes the issue for me, but I am not 100% sure if it actually never allocates anything (making it the valid fix) or not. |
| Comment by Oleg Drokin [ 28/Dec/13 ] |
|
My proposed patch: http://review.whamcloud.com/8667 |
| Comment by Andreas Dilger [ 23/Jan/14 ] |
|
Oleg's patch has landed, and recent problems reported against this bug appear to be related to configuration of the test infrastructure |