2lpiRNApred: a two-layered integrated algorithm for identifying piRNAs and their functions based on LFE-GM feature selection
Author: | Jianyuan Lin, Xiangrong Liu, Quan Zou, Yun Zuo, Min Jiang |
---|---|
Year of publication: | 2020 |
Subject: | Chemical Phenomena; Feature extraction; Piwi-interacting RNA; Feature selection; Biology; 03 medical and health sciences; 0302 clinical medicine; Humans; Radial basis function; RNA Small Interfering; Molecular Biology; 030304 developmental biology; 0303 health sciences; Mahalanobis distance; Computational Biology; Reproducibility of Results; Cell Biology; Function (mathematics); Support vector machine; 030220 oncology & carcinogenesis; Databases Nucleic Acid; Classifier (UML); Algorithm; Algorithms; Software; Research Paper |
Source: | RNA Biol |
ISSN: | 1555-8584 1547-6286 |
DOI: | 10.1080/15476286.2020.1734382 |
Description: | Piwi-interacting RNAs (piRNAs) are indispensable for transposon silencing, including during germ cell formation, germline stem cell maintenance, spermatogenesis, and oogenesis. piRNA pathways are among the major genome defence mechanisms that maintain genome integrity. They also have important functions in tumorigenesis, as aberrantly expressed piRNAs have recently been shown to play roles in cancer development. A number of computational methods for identifying piRNAs have recently been proposed, but they have not yet yielded satisfactory predictive performance. Moreover, only one computational method that identifies whether piRNAs function in inducing target mRNA deadenylation has been reported in the literature. In this study, we developed a two-layered integrated classifier algorithm, 2lpiRNApred. It identifies piRNAs in the first layer and determines whether they function in inducing target mRNA deadenylation in the second layer. A new feature selection algorithm, based on Luca fuzzy entropy and a Gaussian membership function (LFE-GM), was proposed to reduce the dimensionality of the features. Five feature extraction strategies (Kmer, General parallel correlation pseudo-dinucleotide composition, General series correlation pseudo-dinucleotide composition, Normalized Moreau-Broto autocorrelation, and Geary autocorrelation) and two types of classifier (Sparse Representation Classifier (SRC) and support vector machine with Mahalanobis distance-based radial basis function (SVMMDRBF)) were used to construct 2lpiRNApred. The results indicate that 2lpiRNApred performs significantly better than six other existing prediction tools. |
Database: | OpenAIRE |
External link: |
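
Of the five feature extraction strategies named in the description, Kmer is the simplest to illustrate. The following is a minimal Python sketch, not the authors' implementation: the function name `kmer_features`, the default `k=2`, the RNA alphabet `ACGU`, the T-to-U conversion, and the choice to skip windows containing ambiguous bases are all assumptions made for illustration.

```python
from itertools import product


def kmer_features(seq, k=2, alphabet="ACGU"):
    """Return normalized k-mer frequencies of seq.

    The output vector has len(alphabet) ** k entries, ordered as
    itertools.product enumerates the k-mers (A.., C.., G.., U..).
    Hypothetical helper: name, defaults, and conventions are assumptions.
    """
    # Assumption: DNA-style input is mapped to RNA letters.
    seq = seq.upper().replace("T", "U")

    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0

    # Slide a window of width k over the sequence.
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
            total += 1

    if total == 0:  # sequence shorter than k, or no valid window
        return [0.0] * len(kmers)
    return [counts[km] / total for km in kmers]
```

A call such as `kmer_features("UGAGGUAGUAGGUUGUAUAGUU", k=3)` yields a 64-dimensional vector. Vectors like this, concatenated with the other four descriptor types, would be the input that a feature selection step such as LFE-GM reduces before classification.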