Learning with phenotypic similarity improves the prediction of functional effects of missense variants in voltage-gated sodium channels

Author:	Christian Malte Boßelmann, Ulrike B.S. Hedrich, Holger Lerche, Nico Pfeifer
Year of publication:	2022
DOI:	10.1101/2022.09.29.510111
Description:	Background: Missense variants in genes encoding voltage-gated sodium channels are associated with a spectrum of severe diseases affecting neuronal and muscle cells, the so-called sodium channelopathies. Variant effects on the biophysical function of the channel correlate with clinical features and can in most cases be categorized as an overall gain- or loss-of-function. This information enables a timely diagnosis, facilitates precision therapy, and guides prognosis. Machine learning models may be able to rapidly generate supporting evidence by predicting variant functional effects. Methods: Here, we describe a novel multi-task multi-kernel learning framework capable of harmonizing functional results and structural information with clinical phenotypes. We included 62 sequence- and structure-based features such as amino acid physicochemical properties, substitution radicality, conservation, protein-protein interaction sites, expert annotation, and others. We harmonized phenotypes as Human Phenotype Ontology (HPO) terms and compared different measures of phenotypic similarity under simulated sparsity or noise. The final model was trained on whole-cell patch-clamp recordings of 375 unique non-synonymous missense variants, each expressed in mammalian cells. Results: Our gain- or loss-of-function classifier outperformed both conventional baseline and state-of-the-art methods on internal validation (mean accuracy 0.837 ± 0.035, mean AU-ROC 0.890 ± 0.023) and on an independent set of recently described variants (n = 30, accuracy 0.967, AU-ROC 1.000). Model performance was robust across different phenotypic similarity measures and largely insensitive to phenotypic noise or sparsity. Localized multi-kernel learning offered biological insight and interpretability by highlighting channels with implicit genotype-phenotype correlations or latent task similarity for downstream analysis. Conclusions: Learning with phenotypic similarity makes efficient use of clinical information to enable accurate and robust prediction of variant functional effects. Our framework extends the use of Human Phenotype Ontology terms towards kernel-based methods in machine learning. Training data, pre-trained models, and a web-based graphical user interface for the model are publicly available.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::874c74115beab1fcb4c847b5591c9d5e https://doi.org/10.1101/2022.09.29.510111
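
The description above outlines a multi-task multi-kernel approach in which phenotypic similarity relates the tasks to one another. The following is a minimal sketch of that general idea, not the authors' implementation. Everything in it is an assumption made for illustration: the Jaccard index over sets of phenotype terms as the similarity measure, the fixed kernel weight `w`, the kernel ridge classifier, the placeholder term identifiers, and the random toy data. The record specifies none of these.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    """Gaussian kernel between the rows of X and the rows of Y."""
    sq = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def jaccard_task_kernel(term_sets):
    """Task-by-task similarity: Jaccard index of phenotype term sets (assumed measure)."""
    n = len(term_sets)
    K = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            union = term_sets[i] | term_sets[j]
            sim = len(term_sets[i] & term_sets[j]) / len(union) if union else 0.0
            K[i, j] = K[j, i] = sim
    return K

def multitask_kernel(Xa, ta, Xb, tb, K_task, w=0.5):
    """Fixed-weight sum of two feature kernels, multiplied by task similarity."""
    # Columns 0-1 stand in for sequence features, columns 2-3 for structure features.
    K_feat = w * rbf_kernel(Xa[:, :2], Xb[:, :2]) + (1 - w) * rbf_kernel(Xa[:, 2:], Xb[:, 2:])
    return K_feat * K_task[np.ix_(ta, tb)]

def fit_kernel_ridge(K, y, lam=1e-2):
    """Solve (K + lam * I) alpha = y for the dual coefficients."""
    return np.linalg.solve(K + lam * np.eye(len(y)), y)

# Placeholder phenotype term sets for three hypothetical tasks (e.g. three channels).
term_sets = [{"term_a", "term_b"}, {"term_a", "term_c"}, {"term_d"}]
K_task = jaccard_task_kernel(term_sets)

# Toy variants: four made-up features, a task index, and a label (+1 gain, -1 loss).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 4))
tasks = rng.integers(0, 3, size=40)
y = np.where(X[:, 0] >= 0, 1.0, -1.0)

K_train = multitask_kernel(X, tasks, X, tasks, K_task)
alpha = fit_kernel_ridge(K_train, y)

# Score new variants against the training set, then threshold at zero.
X_new = rng.normal(size=(5, 4))
tasks_new = rng.integers(0, 3, size=5)
scores = multitask_kernel(X_new, tasks_new, X, tasks, K_task) @ alpha
preds = np.where(scores >= 0, 1, -1)
```

Multiplying the feature kernel by the task kernel means that variants from phenotypically similar tasks share information, while variants from tasks with no shared terms do not influence each other. A learned or localized kernel weighting, as mentioned in the description, would replace the fixed `w` used here.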