Journal Article
A novel point density based validity index for clustering gene expression datasets
M. Arif Wani, Romana Riyaz
TL;DR: A new cluster validity index (the ARPoints index) is proposed for cluster validation, together with a new approach to determining the compactness and distinctness measures of clusters.
Abstract:
Elucidating the patterns hidden in gene expression data offers an opportunity for identifying co-expressed genes and biologically relevant grouping of genes. However, the large number of genes and the complexity of biological networks greatly increase the challenges of comprehending and interpreting the microarray data. A first step toward addressing this challenge is the use of clustering techniques. Validation of results obtained from a clustering algorithm is an important part of the clustering process. In this paper, we propose a new cluster validity index (ARPoints index) for the purpose of cluster validation. A new approach to determine the compactness measure and distinctness measure of clusters is presented. We revisit commonly known indices and conduct a thorough comparison of these indices with the proposed index and provide a summary of performance evaluation of different indices. Experimental results show that the proposed index performs better than the commonly known cluster validity indices.
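The abstract does not give the ARPoints formulas, so the following is only a generic illustration of what compactness and distinctness measures capture, not the authors' definitions. It assumes Euclidean distance, takes compactness to be the mean distance from each point to its own cluster centroid, and takes distinctness to be the smallest distance between any two centroids; the function name and both definitions are hypothetical choices made for this sketch.

```python
import numpy as np

def compactness_and_distinctness(X, labels):
    """Generic illustration only; these are NOT the ARPoints definitions."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    ids = np.unique(labels)  # sorted unique cluster labels
    centroids = np.array([X[labels == c].mean(axis=0) for c in ids])

    # Compactness (assumed definition): mean distance from each point
    # to the centroid of its own cluster. Smaller means tighter clusters.
    own_centroid = centroids[np.searchsorted(ids, labels)]
    compactness = np.linalg.norm(X - own_centroid, axis=1).mean()

    # Distinctness (assumed definition): smallest distance between two
    # different centroids. Larger means better separated clusters.
    dists = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # ignore each centroid's distance to itself
    distinctness = dists.min()
    return compactness, distinctness

if __name__ == "__main__":
    X = [[0, 0], [0, 2], [10, 0], [10, 2]]
    print(compactness_and_distinctness(X, [0, 0, 1, 1]))
```

A validity index typically combines two such quantities into one score, so that partitions with tight, well-separated clusters can be ranked against alternatives.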
Citations
Journal Article
Enhancement Clustering Evaluation Result of Davies-Bouldin Index with Determining Initial Centroid of K-Means Algorithm
Journal Article
Analysis of determining centroid clustering x-means algorithm with davies-bouldin index evaluation
TL;DR: The number of cluster centroids is determined using the X-means algorithm with Davies-Bouldin Index evaluation. The X-means method is modified to perform several centroid determinations; the procedure takes 11 iterations and produces cluster members that have a good level of similarity with other data.
Proceedings Article
Multiclass SVM algorithms for wind speed prediction
M. Arif Wani, Heena Farooq Bhat
TL;DR: It has been shown that multiclass Directed Acyclic Graph based Support Vector Machine algorithm produces better results than other multiclass SVM algorithms.
Proceedings Article
A New Framework for Fine Tuning of Deep Networks
M. Arif Wani, Saduf Afzal
TL;DR: This paper proposes a hybrid approach that integrates a gain parameter based backpropagation algorithm with the dropout technique and evaluates its effectiveness in fine tuning deep neural networks on three benchmark datasets. The results indicate that the proposed hybrid approach performs better fine tuning than the backpropagation algorithm alone.
Proceedings Article
Improved K-Means Algorithm on Home Industry Data Clustering in the Province of Bangka Belitung
Hadi Santoso, Hilyah Magdalena
TL;DR: This study aims to identify the optimal grouping by improving the k-means algorithm and then testing it with internal cluster validity measures, namely the Davies-Bouldin Index and the Silhouette Index, on home industry data from Bangka Belitung Island Province.
References
Some methods for classification and analysis of multivariate observations
TL;DR: This paper describes the k-means process, which partitions an N-dimensional population into k sets on the basis of a sample. The k-means concept generalizes the ordinary sample mean, and the process is shown to give partitions that are reasonably efficient in the sense of within-class variance.
Journal Article
Cluster analysis and display of genome-wide expression patterns
TL;DR: A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is described that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression, finding in the budding yeast Saccharomyces cerevisiae that clustering gene expression data groups together efficiently genes of known similar function.
Journal Article
Silhouettes: a graphical aid to the interpretation and validation of cluster analysis
TL;DR: A new graphical display is proposed for partitioning techniques, where each cluster is represented by a so-called silhouette, which is based on the comparison of its tightness and separation, and provides an evaluation of clustering validity.
Journal Article
Gene expression profiling predicts clinical outcome of breast cancer
Laura J. van't Veer, Hongyue Dai, Marc J. van de Vijver, Yudong D. He, Augustinus A. M. Hart, Mao Mao, Hans Peterse, Karin van der Kooy, Matthew J. Marton, Anke T. Witteveen, George J. Schreiber, Ron M. Kerkhoven, Christopher J. Roberts, Peter S. Linsley, René Bernards, Stephen H. Friend
TL;DR: DNA microarray analysis on primary breast tumours of 117 young patients is used and supervised classification is applied to identify a gene expression signature strongly predictive of a short interval to distant metastases (‘poor prognosis’ signature) in patients without tumour cells in local lymph nodes at diagnosis, providing a strategy to select patients who would benefit from adjuvant therapy.
Journal Article
A Cluster Separation Measure
TL;DR: A measure is presented that indicates the similarity of clusters, which are assumed to have a data density that is a decreasing function of distance from a vector characteristic of the cluster. The measure can be used to infer the appropriateness of data partitions.
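The cluster separation measure summarized in the last entry is widely known as the Davies-Bouldin index, which several of the citing papers above use for evaluation. The following is a minimal sketch of its commonly used form, assuming Euclidean distance, the cluster centroid as the characteristic vector, and mean point-to-centroid distance as the within-cluster scatter. Those choices and the function name `davies_bouldin` are assumptions of this sketch, not details stated in the summary.

```python
import numpy as np

def davies_bouldin(X, labels):
    """Davies-Bouldin index in its common Euclidean form (lower is better)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    ids = np.unique(labels)
    if len(ids) < 2:
        raise ValueError("the index needs at least two clusters")

    centroids = np.array([X[labels == c].mean(axis=0) for c in ids])

    # S_i: mean distance from the points of cluster i to its centroid.
    scatter = np.array([
        np.linalg.norm(X[labels == c] - centroids[i], axis=1).mean()
        for i, c in enumerate(ids)
    ])

    # M_ij: distance between the centroids of clusters i and j.
    separation = np.linalg.norm(
        centroids[:, None, :] - centroids[None, :, :], axis=2
    )
    np.fill_diagonal(separation, np.inf)  # makes the i == j ratio zero

    # R_ij = (S_i + S_j) / M_ij; the index averages the worst R_ij per cluster.
    ratios = (scatter[:, None] + scatter[None, :]) / separation
    return ratios.max(axis=1).mean()

if __name__ == "__main__":
    X = [[0, 0], [0, 2], [10, 0], [10, 2]]
    print(davies_bouldin(X, [0, 0, 1, 1]))  # natural grouping
    print(davies_bouldin(X, [0, 1, 0, 1]))  # poor grouping
```

On this toy data the natural grouping scores lower than the poor one, which is the sense in which the measure can be used to compare candidate partitions. The sketch does not guard against two clusters sharing the same centroid, where the ratio becomes infinite.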
Related Papers (5)
A new cluster validity index using maximum cluster spread based compactness measure
M. Arif Wani, Romana Riyaz
Local and Global Data Spread Based Index for Determining Number of Clusters in a Dataset
Romana Riyaz, M. Arif Wani