Network Propagation-Based Semi-supervised Identification of Genes Associated with Autism Spectrum Disorder

Authors: Hugo F. M. C. Martiniano, Muhammad Asif, Astrid Moura Vicente, Luís Correia
Year of publication: 2020
Subject:
Source: Datacite
ORCID
Microsoft Academic Graph
CIBB
Computational Intelligence Methods for Bioinformatics and Biostatistics ISBN: 9783030345846
Description: Autism Spectrum Disorder (ASD) is an etiologically and clinically heterogeneous neurodevelopmental disorder with more than 800 putative risk genes. This heterogeneity, coupled with the low penetrance of most ASD-associated mutations, presents a challenge in identifying the relevant genetic determinants of ASD. We developed a semi-supervised machine learning method for gene scoring and classification, based on network propagation using a variant of the random walk with restart algorithm, to identify and rank genes according to their association with known ASD-related genes. The method combines information from protein-protein interactions with sets of positive (disease-related) and negative (disease-unrelated) genes. Our results indicate that the proposed method can classify held-out known disease genes in a cross-validation setting with good performance (area under the receiver operating characteristic curve ~0.85, area under the precision-recall curve ~0.8 and Matthews correlation coefficient of 0.57). We found a set of top-ranking novel candidate genes identified by the method to be significantly enriched for pathways related to synaptic transmission and ion transport, and for specific neurotransmitter-associated pathways previously shown to be associated with ASD. Most of the novel candidate genes were found to be targeted by de novo single nucleotide variants in ASD patients.
Database: OpenAIRE
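
The description above mentions network propagation with a variant of random walk with restart, seeded with positive and negative genes. The abstract does not specify the variant, so the following is only a minimal sketch of the standard algorithm under stated assumptions: the function name, the toy five-node network, the gene labels g1 to g5, the +1/-1 seed encoding and the restart probability of 0.5 are all hypothetical and are not taken from the paper.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Propagate seed scores over an undirected network.

    adjacency: dict mapping each node to the set of its neighbours
               (assumed symmetric: if b is in adjacency[a], a is in adjacency[b]).
    seeds:     dict mapping seed nodes to an initial score
               (assumption: +1 for positive genes, -1 for negative genes).
    Returns a dict mapping every node to its propagated score.
    """
    nodes = sorted(adjacency)
    p0 = {n: float(seeds.get(n, 0.0)) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {}
        for n in nodes:
            # Each neighbour m passes on its score divided evenly by its degree.
            spread = sum(p[m] / len(adjacency[m]) for m in adjacency[n])
            # Mix the propagated score with a restart at the seed vector.
            nxt[n] = (1.0 - restart) * spread + restart * p0[n]
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy path network g1 - g2 - g3 - g4 - g5 (hypothetical labels, not real genes).
toy_network = {
    "g1": {"g2"},
    "g2": {"g1", "g3"},
    "g3": {"g2", "g4"},
    "g4": {"g3", "g5"},
    "g5": {"g4"},
}
toy_seeds = {"g1": 1.0, "g5": -1.0}  # one positive and one negative seed

scores = random_walk_with_restart(toy_network, toy_seeds)
for gene, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(gene, round(score, 4))
```

In this sketch, unlabeled nodes closer to the positive seed end up with higher scores than nodes closer to the negative seed, which is the intuition behind ranking candidate genes by propagated score. A classifier could then threshold these scores, but how the published method does so is not stated in the record.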