Wide and deep learning for automatic cell type identification

Authors:	Brooke L. Fridley, Jose R. Conejo-Garcia, Xiaoqing Yu, Chris Wilson, Xuefeng Wang
Language:	English
Year of publication:	2021
Subject:	Cell type; Computer science; Biophysics; Overfitting; Machine learning; Biochemistry; Regularization (mathematics); 03 medical and health sciences; Immune system; 0302 clinical medicine; Single cell data; Structural Biology; Genetics; Dropout (neural networks); ComputingMethodologies_COMPUTERGRAPHICS; 030304 developmental biology; 0303 health sciences; Artificial neural network; Deep learning; Classification; Computer Science Applications; Identification (information); Statistical classification; 030220 oncology & carcinogenesis; Snapshot (computer storage); Artificial intelligence; TP248.13-248.65; Research Article; Biotechnology
Source:	Computational and Structural Biotechnology Journal, Vol 19, Iss, Pp 1052-1062 (2021)
ISSN:	2001-0370
Description:	Cell type classification is an important problem in cancer research, especially with the advent of single cell technologies. Correctly identifying cells within the tumor microenvironment can provide oncologists with a snapshot of how a patient’s immune system reacts to the tumor. Wide and deep learning (WDL) is an approach to constructing a cell-classification prediction model that can learn patterns within high-dimensional data (deep) while ensuring that biologically relevant features (wide) remain in the final model. In this paper, we demonstrate that regularization can prevent overfitting and that adding a wide component to a neural network can result in a model with better predictive performance. In particular, we observed that a combination of dropout and ℓ2 regularization can lead to a validation loss that does not depend on the number of training iterations, without a significant decrease in prediction accuracy compared to models with ℓ1, dropout, or no regularization. Additionally, we show that WDL can have superior classification accuracy when a model is trained and tested on data that arise from the same cancer type but from different platforms. More specifically, compared to traditional deep learning models, WDL can substantially increase the overall cell type prediction accuracy (36.5 to 86.9%) and the prediction accuracy for T cell subtypes (CD4: 2.4 to 59.1%, and CD8: 19.5 to 96.1%) when the models were trained using melanoma data obtained from the 10X platform and tested on basal cell carcinoma data obtained using SMART-seq. WDL also obtains higher accuracy than the state-of-the-art cell classification algorithms CHETAH (70.36%) and SingleR (70.59%).
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::eb982e61e6372cadae1d1866e87846c3 http://www.sciencedirect.com/science/article/pii/S2001037021000313
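
The abstract describes a model that combines a deep component over high-dimensional expression data with a wide component that keeps selected, biologically relevant features in the final model, regularized with dropout and an ℓ2 penalty. The record does not give the architecture, so the following is only a minimal sketch of one way such a model could be wired, not the authors' implementation. The layer sizes, the marker indices in `wide_idx`, the choice to add the wide and deep logits before the softmax, and every function name are assumptions made for illustration.

```python
import math
import random

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, x, b):
    # W is a list of rows; returns W @ x + b.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def dropout(v, rate, rng):
    # Inverted dropout: keep each unit with probability 1 - rate and rescale it.
    keep = 1.0 - rate
    return [x / keep if rng.random() < keep else 0.0 for x in v]

def init(rows, cols, rng):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

def make_params(n_genes, n_wide, n_hidden, n_classes, seed=0):
    # Hypothetical parameter layout: one hidden layer plus a linear wide path.
    rng = random.Random(seed)
    return {
        "W_hidden": init(n_hidden, n_genes, rng),
        "b_hidden": [0.0] * n_hidden,
        "W_deep": init(n_classes, n_hidden, rng),
        "W_wide": init(n_classes, n_wide, rng),
        "b_out": [0.0] * n_classes,
    }

def forward(params, x, wide_idx, drop_rate=0.0, rng=None):
    # Deep path: all genes -> hidden ReLU layer (optionally with dropout) -> logits.
    h = relu(matvec(params["W_hidden"], x, params["b_hidden"]))
    if drop_rate > 0.0 and rng is not None:
        h = dropout(h, drop_rate, rng)
    deep_logits = matvec(params["W_deep"], h, params["b_out"])
    # Wide path: selected marker features connect linearly to the output,
    # so they cannot be lost inside the hidden layers.
    x_wide = [x[i] for i in wide_idx]
    wide_logits = matvec(params["W_wide"], x_wide, [0.0] * len(deep_logits))
    return softmax([d + w for d, w in zip(deep_logits, wide_logits)])

def l2_penalty(params, lam):
    # lam times the sum of squared weights (biases are left unpenalized here).
    total = 0.0
    for name in ("W_hidden", "W_deep", "W_wide"):
        total += sum(w * w for row in params[name] for w in row)
    return lam * total

def loss(params, x, y, wide_idx, lam, drop_rate=0.0, rng=None):
    # Cross-entropy for true class index y plus the l2 penalty.
    p = forward(params, x, wide_idx, drop_rate, rng)
    return -math.log(p[y]) + l2_penalty(params, lam)
```

Calling `forward(params, x, wide_idx=[1, 3])` on a single expression vector returns class probabilities; passing a `drop_rate` and a `random.Random` instance turns on dropout, which would be used only during training. Training itself (gradients and an optimizer) is omitted, and a real model would typically be built with a deep learning framework.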