Prediction of LncRNA Subcellular Localization with Deep Learning from Sequence Features.

Authors: Gudenas BL, Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA; Wang L, Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA (liangjw@clemson.edu).
Language: English
Source: Scientific reports [Sci Rep] 2018 Nov 06; Vol. 8 (1), pp. 16385. Date of Electronic Publication: 2018 Nov 06.
DOI: 10.1038/s41598-018-34708-w
Abstract: Long non-coding RNAs (lncRNAs) are involved in biological processes throughout the cell, including in the nucleus, chromatin and cytosol. However, most lncRNAs remain unannotated, and their functional annotation is difficult because of their low conservation and their tissue-specific and developmentally specific expression. The subcellular localization of a lncRNA is highly informative about its biological function, but it is difficult to determine because few prediction methods currently exist. While protein subcellular localization prediction is a well-established research field, lncRNA localization prediction is a novel research problem. We developed DeepLncRNA, a deep learning algorithm that predicts lncRNA subcellular localization directly from lncRNA transcript sequences. We analyzed 93 strand-specific RNA-seq samples of nuclear and cytosolic fractions from multiple cell types to identify differentially localized lncRNAs. We then extracted sequence-based features from the lncRNAs to construct our DeepLncRNA model, which achieved an accuracy of 72.4%, a sensitivity of 83%, a specificity of 62.4% and an area under the receiver operating characteristic curve of 0.787. Our results suggest that primary sequence motifs are a major driving force in the subcellular localization of lncRNAs.
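Note: The abstract does not say which sequence-based features were extracted. As a minimal sketch, assuming normalized k-mer frequencies (a common choice for sequence models, not confirmed by this record), a transcript could be turned into a fixed-length feature vector as follows. The function name `kmer_frequencies` and the default `k=3` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=3):
    """Return normalized k-mer frequencies over the A/C/G/T alphabet.

    Assumption: k-mer frequencies stand in for the "sequence-based
    features" mentioned in the abstract; the record does not specify them.
    """
    # Treat RNA (U) and DNA (T) input the same way.
    seq = sequence.upper().replace("U", "T")
    # Enumerate all 4**k possible k-mers so every vector has the same keys.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Normalize only over valid k-mers (windows with N or other symbols are ignored).
    total = sum(counts[km] for km in kmers)
    return {km: (counts[km] / total if total else 0.0) for km in kmers}
```

Calling `kmer_frequencies("ACGUACGU", k=2)` yields a dictionary with 16 entries whose values sum to 1; such vectors could then be fed to any classifier that separates nuclear from cytosolic transcripts.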
Database: MEDLINE