[LU-2167] Race in ptlrpcd_stop/fini vs ptlrpcd leaving to use after free Created: 13/Oct/12 Updated: 19/Apr/13 Resolved: 23/Oct/12 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | Lustre 2.4.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Oleg Drokin | Assignee: | WC Triage |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 5198 |
| Description |
|
I met this oops on my testing system several times and thought it did not make any sense initially, but after looking into the code I see that it does make sense after all. [11898.846141] BUG: unable to handle kernel paging request at ffff88008fb39800 [11898.846670] IP: [<ffffffffa1368c93>] ptlrpcd+0x2c3/0x3a0 [ptlrpc] [11898.847291] PGD 1a26063 PUD 501067 PMD 57f067 PTE 800000008fb39160 [11898.847789] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC [11898.848237] last sysfs file: /sys/devices/system/cpu/possible [11898.848678] CPU 2 [11898.848743] Modules linked in: ptlrpc(-) obdclass lvfs ksocklnd lnet libcfs exportfs jbd sha512_generic sha256_generic ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ext4 mbcache jbd2 virtio_balloon virtio_console i2c_piix4 i2c_core virtio_blk virtio_net virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache nfs_acl auth_rpcgss sunrpc be2iscsi bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: fld] [11898.850099] [11898.850099] Pid: 30737, comm: ptlrpcd_rcv Not tainted 2.6.32-debug #6 Bochs Bochs [11898.850099] RIP: 0010:[<ffffffffa1368c93>] [<ffffffffa1368c93>] ptlrpcd+0x2c3/0x3a0 [ptlrpc] [11898.850099] RSP: 0018:ffff880088529e70 EFLAGS: 00010286 [11898.850099] RAX: 0000000000000002 RBX: ffff88008fb39800 RCX: 0000000000000001 [11898.850099] RDX: 0000000000000001 RSI: 0000000000000282 RDI: 0000000000000282 [11898.850099] RBP: ffff880088529f40 R08: 0000000000000000 R09: 0000000000000001 [11898.850099] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880088529ea0 [11898.850099] R13: ffff880083785f30 R14: 0000000000000002 R15: ffff880088529ee0 [11898.850099] FS: 00007f449616c700(0000) GS:ffff880006300000(0000) knlGS:0000000000000000 [11898.850099] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [11898.850099] CR2: ffff88008fb39800 CR3: 00000000b3178000 CR4: 00000000000006e0 [11898.850099] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [11898.850099] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [11898.850099] Process ptlrpcd_rcv (pid: 30737, threadinfo ffff880088528000, task ffff880070208500) [11898.850099] Stack: [11898.850099] ffff880070208500 ffffffffffffffff ffff880083785f40 0000000000000000 [11898.850099] <d> ffff880070208500 0000000000000000 00000004b0000008 0000000000000000 [11898.850099] <d> 0000000000000000 ffff880088529eb8 ffff880088529eb8 0000000000000054 [11898.850099] Call Trace: [11898.850099] [<ffffffff81057d60>] ? default_wake_function+0x0/0x20 [11898.850099] [<ffffffffa13689d0>] ? ptlrpcd+0x0/0x3a0 [ptlrpc] [11898.850099] [<ffffffff8100c14a>] child_rip+0xa/0x20 [11898.850099] [<ffffffffa13689d0>] ? ptlrpcd+0x0/0x3a0 [ptlrpc] [11898.850099] [<ffffffffa13689d0>] ? ptlrpcd+0x0/0x3a0 [ptlrpc] [11898.850099] [<ffffffff8100c140>] ? child_rip+0x0/0x20 [11898.850099] Code: e9 ba fe ff ff 0f 1f 00 49 8d 45 40 49 39 45 40 74 08 4c 89 ef e8 8e f6 fc ff 4c 89 e7 e8 46 1f ea ff 48 8d 7b 50 e8 8d 91 ce df <f0> 80 23 fd f0 80 23 fb f0 80 23 ef f0 80 63 02 fe e9 17 fe ff [11898.850099] RIP [<ffffffffa1368c93>] ptlrpcd+0x2c3/0x3a0 [ptlrpc] [11898.850099] RSP <ffff880088529e70> [11898.850099] CR2: ffff88008fb39800 Looking at the code, in ptlrpcd() we see: if (!cfs_list_empty(&set->set_requests))
ptlrpc_set_wait(set);
lu_context_fini(&env.le_ctx);
cfs_complete(&pc->pc_finishing);
cfs_clear_bit(LIOD_START, &pc->pc_flags); <=== Crash Here
cfs_clear_bit(LIOD_STOP, &pc->pc_flags);
cfs_clear_bit(LIOD_FORCE, &pc->pc_flags);
And in ptlrpcd_stop/fini(): {
...
#ifdef __KERNEL__
cfs_wait_for_completion(&pc->pc_finishing);
#else
liblustre_deregister_wait_callback(pc->pc_wait_callback);
liblustre_deregister_idle_callback(pc->pc_idle_callback);
#endif
lu_context_fini(&pc->pc_env.le_ctx);
...
EXIT;
}
static void ptlrpcd_fini(void)
{
int i;
ENTRY;
if (ptlrpcds != NULL) {
for (i = 0; i < ptlrpcds->pd_nthreads; i++)
ptlrpcd_stop(&ptlrpcds->pd_threads[i], 0);
ptlrpcd_stop(&ptlrpcds->pd_thread_rcv, 0);
lost race =>> OBD_FREE(ptlrpcds, ptlrpcds->pd_size);
ptlrpcds = NULL;
}
}
I think the proper solution to avoid the race should be to move cfs_clear_bit()s to before signalling the completion. |
| Comments |
| Comment by Oleg Drokin [ 13/Oct/12 ] |
|
Suggested patch in http://review.whamcloud.com/4264 |
| Comment by Oleg Drokin [ 23/Oct/12 ] |
|
Patch landed, resolving |