Journal ArticleDOI

A novel method to measure the semantic similarity of HPO terms

TLDR
PhenoSim, a new similarity measure that combines a noise reduction component for noisy patient phenotype data with a path-constrained Information Content-based method for measuring phenotype semantic similarity, could effectively improve the performance of HPO-based phenotype similarity measurement, thus increasing the accuracy of phenotype-based causative gene prediction and disease prediction.
Abstract
Making a precise disease diagnosis from complex clinical features and a highly heterogeneous genetic background is critical yet remains challenging. Recently, phenotype similarity has been effectively applied to model patient phenotype data. However, existing measures are adapted from Gene Ontology-based term similarity models, which are not optimised for human phenotype ontologies. We propose a new similarity measure called PhenoSim. Our model includes a noise reduction component to model the noisy patient phenotype data, and a path-constrained Information Content-based method for measuring phenotype semantic similarity. Evaluation tests compared PhenoSim with four existing approaches. The results showed that PhenoSim could effectively improve the performance of Human Phenotype Ontology (HPO)-based phenotype similarity measurement, thus increasing the accuracy of phenotype-based causative gene prediction and disease prediction.
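The abstract does not spell out PhenoSim's path constraint or its noise reduction step, so the following is only a minimal sketch of generic Information Content-based term similarity (Resnik- and Lin-style), the family of measures the abstract refers to, and not PhenoSim itself. The toy ontology, the term names, and the annotation counts are all hypothetical.

```python
import math

# Toy ontology (hypothetical terms, not real HPO IDs): term -> set of parents.
PARENTS = {
    "root": set(),
    "A": {"root"},
    "B": {"root"},
    "A1": {"A"},
    "A2": {"A"},
    "B1": {"B"},
}

# Hypothetical number of diseases annotated directly to each term.
DIRECT_COUNTS = {"root": 0, "A": 1, "B": 1, "A1": 2, "A2": 3, "B1": 1}


def ancestors(term):
    """Return the term itself together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = -log p(t), where p(t) counts annotations to t or its descendants."""
    total = sum(DIRECT_COUNTS.values())
    propagated = {t: 0 for t in PARENTS}
    for term, count in DIRECT_COUNTS.items():
        for anc in ancestors(term):
            propagated[anc] += count
    return {t: -math.log(propagated[t] / total) for t in PARENTS}


def resnik(a, b, ic):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(a) & ancestors(b)
    return max(ic[t] for t in common)


def lin(a, b, ic):
    """Lin similarity: shared IC normalised by the IC of both terms."""
    denom = ic[a] + ic[b]
    return 2 * resnik(a, b, ic) / denom if denom > 0 else 0.0


ic = information_content()
print(resnik("A1", "A2", ic))  # siblings share the ancestor "A"
print(resnik("A1", "B1", ic))  # only the root is shared
print(lin("A1", "A1", ic))     # a term compared with itself
```

In this sketch, two terms that share only the root get a Resnik score of zero, because the root carries no information. A path-constrained variant would additionally take the ontology paths between the terms into account, which is not modelled here.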


Citations
Journal ArticleDOI

Predicting Parkinson's Disease Genes Based on Node2vec and Autoencoder.

TL;DR: N2A-SVM, a novel method for Parkinson's disease gene prediction, has three parts: extracting gene features from a network, reducing their dimension with a deep neural network, and predicting Parkinson's disease genes with a machine learning method.
Journal ArticleDOI

InfAcrOnt: calculating cross-ontology term similarities using information flow by a random walk

TL;DR: In benchmark experiments on sub-ontologies of GO, InfAcrOnt achieves a high average area under the receiver operating characteristic curve (AUC) and low standard deviations on both human and yeast benchmark datasets, exhibiting superior performance.
Journal ArticleDOI

BCDForest: a boosting cascade deep forest model towards the classification of cancer subtypes based on gene expression data

TL;DR: The multi-class-grained scanning and boosting strategy in the model provide an effective way to ease overfitting and improve the robustness of the deep forest model on small-scale data.
Journal ArticleDOI

Exposing the Causal Effect of C-Reactive Protein on the Risk of Type 2 Diabetes Mellitus: A Mendelian Randomization Study.

TL;DR: A Mendelian randomization analysis using genetic variations as instrumental variables (IVs) showed that high levels of CRP significantly increase the risk of T2DM, and a subsequent analysis of the relationship between CRP and type 1 diabetes mellitus (T1DM) supported the conclusion that CRP levels cannot determine the risk of developing T1DM.
Journal ArticleDOI

k-Skip-n-Gram-RF: A Random Forest Based Method for Alzheimer's Disease Protein Identification.

TL;DR: In the proposed method, gene protein information is extracted as adaptive k-skip-n-gram features, and the feature vectors are classified by random forest.