Journal Article

A novel point density based validity index for clustering gene expression datasets

TLDR
A new cluster validity index (ARPoints index) for the purpose of cluster validation is proposed and a new approach to determine the compactness measure and distinctness measure of clusters is presented.
Abstract
Elucidating the patterns hidden in gene expression data offers an opportunity to identify co-expressed genes and biologically relevant groupings of genes. However, the large number of genes and the complexity of biological networks make microarray data considerably harder to comprehend and interpret. A first step toward addressing this challenge is the use of clustering techniques, and validating the results of a clustering algorithm is an important part of the clustering process. In this paper, we propose a new cluster validity index (the ARPoints index) for cluster validation, together with a new approach to determining the compactness and distinctness measures of clusters. We revisit commonly known indices, compare them thoroughly with the proposed index, and summarize the performance evaluation of the different indices. Experimental results show that the proposed index performs better than the commonly known cluster validity indices.
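The abstract does not define the ARPoints index, so it cannot be reproduced here. As an illustration of how a validity index combines a compactness term with a separation term, the following is a minimal sketch of the Davies-Bouldin index, one of the commonly known indices. The function names and the toy data are assumptions made for this example and do not come from the paper.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length numeric vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def euclidean(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def davies_bouldin(clusters):
    """Davies-Bouldin index of a partition; lower values are better.

    `clusters` is a list of clusters, each a non-empty list of vectors.
    Assumes at least two clusters with distinct centroids.
    """
    centers = [centroid(c) for c in clusters]
    # Compactness: mean distance of each member to its own centroid.
    scatter = [
        sum(euclidean(p, ctr) for p in c) / len(c)
        for c, ctr in zip(clusters, centers)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case ratio of combined scatter to centroid separation.
        total += max(
            (scatter[i] + scatter[j]) / euclidean(centers[i], centers[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Toy two-dimensional "expression profiles" (hypothetical data).
tight = [
    [[0, 0], [0, 1], [1, 0], [1, 1]],
    [[10, 10], [10, 11], [11, 10], [11, 11]],
]
mixed = [
    [[0, 0], [0, 1], [10, 10], [10, 11]],
    [[1, 0], [1, 1], [11, 10], [11, 11]],
]

print(davies_bouldin(tight))  # well-separated partition, small value
print(davies_bouldin(mixed))  # poorly separated partition, much larger value
```

A validity index of this kind is typically evaluated on the partitions produced for several candidate cluster counts, and the partition with the best score is retained.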

