Journal Article

A Review of Ensemble Methods in Bioinformatics

TL;DR: This article provides a review of the most widely used ensemble learning methods and their application in various bioinformatics problems, including the main topics of gene expression, mass spectrometry-based proteomics, gene-gene interaction identification from genome-wide association studies, and prediction of regulatory elements from DNA and protein sequences.
Abstract
Ensemble learning is an intensively studied technique in machine learning and pattern recognition. Recent work in computational biology has seen an increasing use of ensemble learning methods due to their unique advantages in dealing with small sample sizes, high dimensionality, and complex data structures. The aim of this article is two-fold. First, we provide a review of the most widely used ensemble learning methods and their application in various bioinformatics problems, including the main topics of gene expression, mass spectrometry-based proteomics, gene-gene interaction identification from genome-wide association studies, and prediction of regulatory elements from DNA and protein sequences. Second, we try to identify and summarize future trends of ensemble methods in bioinformatics. Promising directions such as ensembles of support vector machines, meta-ensembles, and ensemble-based feature selection are discussed.


Citations
Journal Article

Critical assessment of automated flow cytometry data analysis techniques

TL;DR: Several methods performed well as compared to manual gating or external variables using statistical performance measures, which suggests that automated methods have reached a sufficient level of maturity and accuracy for reliable use in FCM data analysis.
Journal Article

A Survey on Filter Techniques for Feature Selection in Gene Expression Microarray Analysis

TL;DR: This survey focuses on filter feature selection methods for informative feature discovery in gene expression microarray (GEM) analysis, which is also known as differentially expressed genes (DEGs) discovery, gene prioritization, or biomarker discovery, and presents them in a unified framework.
Book Chapter

Random Forest for Bioinformatics

Yanjun Qi
TL;DR: The Random Forest technique, which includes an ensemble of decision trees and incorporates feature selection and interactions naturally in the learning process, is a popular choice because it is nonparametric, interpretable, efficient, and has high prediction accuracy for many types of data.
Journal Article

Ensemble Classification and Regression-Recent Developments, Applications and Future Directions [Review Article]

TL;DR: This paper reviews traditional as well as state-of-the-art ensemble methods and thus can serve as an extensive summary for practitioners and beginners.
Journal Article

Efficient Machine Learning for Big Data

TL;DR: This paper reviews the theoretical and experimental data-modeling literature in large-scale, data-intensive fields, and introduces new algorithmic approaches that minimize memory requirements and processing to reduce computational cost, while maintaining or improving predictive/classification accuracy and stability.
References
Journal Article

Random Forests

TL;DR: Internal estimates monitor error, strength, and correlation; these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.
Journal Article

Bagging predictors

Leo Breiman
TL;DR: Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy.
Proceedings Article

Experiments with a new boosting algorithm

TL;DR: This paper describes experiments carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems, and compares boosting to Breiman's "bagging" method when used to aggregate various classifiers.
Journal Article

Mass spectrometry-based proteomics

TL;DR: The ability of mass spectrometry to identify and, increasingly, to precisely quantify thousands of proteins from complex samples can be expected to impact broadly on biology and medicine.