Journal Article

Twenty Years of Mixture of Experts

TLDR
A comprehensive survey of the mixture of experts (ME), discussing the fundamental models for regression and classification and also their training with the expectation-maximization algorithm, and covering the variational learning of ME in detail.
Abstract
In this paper, we provide a comprehensive survey of the mixture of experts (ME). We discuss the fundamental models for regression and classification and also their training with the expectation-maximization algorithm. We follow the discussion with improvements to the ME model and focus particularly on the mixtures of Gaussian process experts. We provide a review of the literature for other training methods, such as the alternative localized ME training, and cover the variational learning of ME in detail. In addition, we describe the model selection literature, which encompasses finding the optimum number of experts as well as the depth of the tree. We present the advances in ME in the classification area and discuss some issues concerning the classification model. We list the statistical properties of ME, discuss how the model has been modified over the years, compare ME to some popular algorithms, and list several applications. We conclude our survey with future directions and provide a list of publicly available datasets and a list of publicly available software that implements ME. Finally, we provide examples for regression and classification. We believe that the study described in this paper will provide quick access to the relevant literature for researchers and practitioners who would like to improve or use ME, and that it will stimulate further studies in ME.
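
As a rough illustration of the basic ME regression model and the expectation step of its EM training, the sketch below combines linear experts through a softmax gate and computes each expert's posterior responsibility for a target. It is not taken from the survey: the function names (`me_predict`, `responsibilities`), the linear form of the experts and gate, and the shared noise level `sigma` are all assumptions made for the example.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax, shifted by the row maximum for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def me_predict(X, V, W):
    """Hypothetical helper: mean prediction of a mixture of k linear experts.

    X: (n, d) inputs, V: (d, k) gate weights, W: (d, k) expert weights.
    """
    gates = softmax(X @ V)            # (n, k) gating probabilities g_i(x)
    expert_means = X @ W              # (n, k) one prediction per expert
    y_hat = (gates * expert_means).sum(axis=1)
    return y_hat, gates, expert_means

def responsibilities(y, gates, expert_means, sigma=1.0):
    """Hypothetical helper: E-step posteriors h_i proportional to g_i(x) * N(y | mean_i, sigma^2).

    Assumes every expert shares the same sigma, so the Gaussian
    normalising constant cancels. A log-space version would be safer
    when targets lie far from every expert mean.
    """
    lik = np.exp(-0.5 * ((y[:, None] - expert_means) / sigma) ** 2)
    h = gates * lik
    return h / h.sum(axis=1, keepdims=True)

# Toy usage: two experts, two features plus a bias column.
rng = np.random.default_rng(0)
X = np.hstack([rng.normal(size=(5, 2)), np.ones((5, 1))])
V = np.zeros((3, 2))                  # zero gate weights give uniform gates
W = rng.normal(size=(3, 2))
y_hat, gates, means = me_predict(X, V, W)
h = responsibilities(y_hat, gates, means)
```

In a full EM loop, the responsibilities `h` would weight the updates of each expert and of the gate in the maximization step; that step is omitted here.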

