Deep Learning to Discover Cancer Glycome Genes Signifying the Origins of Cancer
Author: | Raihanul Bari Tanvir, Abdullah Al Mamun, Charles J. Dimitroff, Masrur Sobhan, Ananda Mohan Mondal |
---|---|
Year of publication: | 2020 |
Subject: |
basic medicine, Deep learning, Cancer, Feature selection, Computational biology, Biology, Genome, Glycome, medical and health sciences, developmental biology, clinical medicine, Breast cancer, Feature (computer vision), oncology & carcinogenesis, Artificial intelligence, Gene |
Source: | BIBM |
DOI: | 10.1109/bibm49941.2020.9313450 |
Description: | Background: Aberrant protein glycosylation is a common feature of cancer and contributes to malignant behavior. However, how and to what extent the cellular glycome is involved in cancer development and progression remains undefined. The primary objective of this study is the in silico identification of glycome genes that could reveal a signature of cancer, using expression profiles of cancer genomes. A list of $\sim 500$ glycome genes in several molecular categories is available. The study rests on the hypothesis that, if aberrant glycosylation is a common feature of cancer, there exists a shortlist of cancer glycome genes whose expression profiles carry a signature capable of differentiating the 33 cancer types available in The Cancer Genome Atlas (TCGA). Method: The distribution of cancer samples in TCGA is highly imbalanced, ranging from 36 samples for Cholangiocarcinoma (CHOL) to 1089 for Breast Cancer (BRCA). Supervised feature selection approaches for identifying the signature genes would therefore be biased toward the larger groups. We developed a computational framework using the concrete autoencoder (CAE), a deep learning-based unsupervised feature selection algorithm, to find the cancer-related glycome genes. The criteria for the optimal feature subset in this study are (a) the number of features should be as small as possible, and (b) the classification accuracy using the selected features should be >90%. Results: Our experiments yielded a shortlist of 132 glycome genes that can differentiate 33 different cancers with an accuracy of 92%. This study indicates that the cancer glycome genes signify the origins of cancer. |
Database: | OpenAIRE |
External link: |
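
The description above names the concrete autoencoder (CAE) as the unsupervised feature selector. The record gives no implementation details, so the following is only a minimal sketch of the general idea behind a concrete selector layer, not the authors' code. Each selector node draws a relaxed one-hot weight vector over the input genes (Gumbel-softmax sampling at a given temperature) during training, then keeps only its highest-scoring gene afterwards. The function names, the five expression values, and the logits are hypothetical and chosen purely for illustration. The decoder and the training loop are omitted.

```python
import math
import random


def sample_concrete(logits, temperature, rng):
    """Draw one relaxed one-hot vector from a Concrete (Gumbel-softmax) distribution."""
    noisy = []
    for logit in logits:
        # Keep U strictly inside (0, 1) so both logarithms are defined.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def select_features(sample, selector_logits, temperature, rng):
    """Training-time forward pass: each selector node outputs a weighted mix of inputs."""
    selected = []
    for logits in selector_logits:
        weights = sample_concrete(logits, temperature, rng)
        selected.append(sum(w * x for w, x in zip(weights, sample)))
    return selected


def chosen_indices(selector_logits):
    """After training: each selector node keeps only its highest-scoring input feature."""
    return [max(range(len(logits)), key=logits.__getitem__) for logits in selector_logits]


rng = random.Random(0)
# Hypothetical expression values for 5 genes in one sample (illustration only).
expression = [0.2, 1.5, 0.7, 3.1, 0.0]
# Hypothetical "learned" logits for 2 selector nodes (illustration only).
selector_logits = [
    [0.0, 2.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 4.0, 0.0],
]

relaxed = select_features(expression, selector_logits, temperature=10.0, rng=rng)
sharp = select_features(expression, selector_logits, temperature=0.01, rng=rng)
picked = chosen_indices(selector_logits)
print("high temperature (soft mix):", relaxed)
print("low temperature (near one-hot):", sharp)
print("selected gene indices:", picked)
```

In a full concrete autoencoder, the logits would be learned by minimizing a decoder's reconstruction error while the temperature is annealed toward zero. Because that training signal uses no class labels, the selection does not depend directly on the uneven number of samples per cancer type.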