Gene Expression Analysis through Parallel Non-Negative Matrix Factorization

Autor: Angelica Alejandra Serrano-Rubio, Guillermo B. Morales-Luna, Amilcar Meneses-Viveros
Jazyk: angličtina
Rok vydání: 2021
Předmět:
Zdroj: Computation, Vol 9, Iss 10, p 106 (2021)
Druh dokumentu: article
ISSN: 2079-3197
DOI: 10.3390/computation9100106
Popis: Genetic expression analysis is a principal tool to explain the behavior of genes in an organism when exposed to different experimental conditions. In the state of art, many clustering algorithms have been proposed. It is overwhelming the amount of biological data whose high-dimensional structure exceeds mostly current computational architectures. The computational time and memory consumption optimization actually become decisive factors in choosing clustering algorithms. We propose a clustering algorithm based on Non-negative Matrix Factorization and K-means to reduce data dimensionality but whilst preserving the biological context and prioritizing gene selection, and it is implemented within parallel GPU-based environments through the CUDA library. A well-known dataset is used in our tests and the quality of the results is measured through the Rand and Accuracy Index. The results show an increase in the acceleration of 6.22× compared to the sequential version. The algorithm is competitive in the biological datasets analysis and it is invariant with respect to the classes number and the size of the gene expression matrix.
Databáze: Directory of Open Access Journals