Adversarial domain adaptation for cross data source macromolecule in situ structural classification in cellular electron cryo-tomograms

Author:	Min Xu, Ruogu Lin, Xiangrui Zeng, Kris M. Kitani
Year of publication:	2019
Subject:	Statistics and Probability; Electron Microscope Tomography; Domain adaptation; Macromolecular Substances; Computer science; Information Storage and Retrieval; Electrons; Biochemistry; 03 medical and health sciences; 0302 clinical medicine; Discriminative model; Ismb/Eccb 2019 Conference Proceedings; Molecular Biology; 030304 developmental biology; Data source; 0303 health sciences; Training set; Molecular Structure; business.industry; Deep learning; Pattern recognition; Macromolecular Sequence Structure and Function; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; Artificial intelligence; Tomography; business; Classifier (UML); 030217 neurology & neurosurgery
Source:	Bioinformatics
ISSN:	1460-2059 1367-4803
DOI:	10.1093/bioinformatics/btz364
Description:	Motivation: Since 2017, supervised deep learning-based macromolecule in situ structural classification (i.e. subtomogram classification) in cellular electron cryo-tomography (CECT) has attracted increasing attention because of the substantially higher scalability of deep learning. However, the success of such a supervised approach relies heavily on the availability of large amounts of labeled training data. For CECT, creating valid training data from the same data source as the prediction data is usually laborious and computationally intensive. It would be beneficial to have training data from a separate data source where annotation is readily available or can be performed in a high-throughput fashion. However, cross data source prediction is often biased by differences in image intensity distributions (a.k.a. domain shift). Results: We adapt a deep learning-based adversarial domain adaptation method (3D-ADA) to address the domain shift problem in CECT data analysis. 3D-ADA first uses a source domain feature extractor to extract discriminative features from the training data as the input to a classifier. It then adversarially trains a target domain feature extractor to reduce the distribution differences between the features extracted from the training and prediction data. As a result, the same classifier can be applied directly to the prediction data. We tested 3D-ADA on both experimental and realistically simulated subtomogram datasets under different imaging conditions. 3D-ADA stably improved cross data source prediction and outperformed two popular domain adaptation methods. Furthermore, we demonstrate that 3D-ADA can improve cross data source recovery of novel macromolecular structures. Availability and implementation: https://github.com/xulabs/projects Supplementary information: Supplementary data are available at Bioinformatics online.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::76f2153a7aff26de11951ed0a34d05cb https://doi.org/10.1093/bioinformatics/btz364
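
The two-stage procedure summarized in the abstract (a source feature extractor feeding a classifier, then a target feature extractor trained adversarially so the same classifier can be reused) can be illustrated with a deliberately tiny sketch. This is not the authors' 3D-ADA implementation. Everything in it is an assumption made for illustration: the 1-D toy data, the identity source extractor, the offset-only target extractor, the threshold classifier, the logistic discriminator that is refit with a single gradient step each round, the inverted-label extractor loss, and all function and variable names.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mean(values):
    return sum(values) / len(values)


# Hypothetical 1-D "intensities" standing in for subtomograms (not paper data).
source_x = [-1.2, -1.0, -0.8, 0.8, 1.0, 1.2]
source_y = [0, 0, 0, 1, 1, 1]
# Target domain: same two classes, intensities shifted by +3 (the domain shift).
target_x = [x + 3.0 for x in source_x]
target_y = list(source_y)  # used for evaluation only, never for training


def source_extractor(x):
    # Stand-in for a trained source feature extractor (identity in this toy).
    return x


def fit_threshold(features, labels):
    # Classifier fitted on labeled source features: midpoint of the class means.
    m0 = mean([f for f, y in zip(features, labels) if y == 0])
    m1 = mean([f for f, y in zip(features, labels) if y == 1])
    return (m0 + m1) / 2.0


def accuracy(features, labels, threshold):
    hits = [1.0 if (f > threshold) == (y == 1) else 0.0
            for f, y in zip(features, labels)]
    return mean(hits)


def adapt_target_offset(src_feats, tgt_x, rounds=200, lr_d=1.0, lr_g=1.0):
    """Adversarially learn the offset c of a target extractor f_t(x) = x + c."""
    c = 0.0  # assumption: the target extractor starts as a copy of the source one
    for _ in range(rounds):
        tgt_feats = [x + c for x in tgt_x]
        # Discriminator D(f) = sigmoid(u*f + v); label 1 = source, 0 = target.
        # Simplification: refit from u = v = 0 with one gradient step per round.
        u, v = 0.0, 0.0
        d_src = [sigmoid(u * f + v) for f in src_feats]
        d_tgt = [sigmoid(u * f + v) for f in tgt_feats]
        grad_u = (-mean([(1.0 - d) * f for d, f in zip(d_src, src_feats)])
                  + mean([d * f for d, f in zip(d_tgt, tgt_feats)]))
        grad_v = -mean([1.0 - d for d in d_src]) + mean(d_tgt)
        u -= lr_d * grad_u
        v -= lr_d * grad_v
        # Extractor step with inverted labels: minimize -log D(f_t(x)) over c.
        grad_c = -u * mean([1.0 - sigmoid(u * f + v) for f in tgt_feats])
        c -= lr_g * grad_c
    return c


src_feats = [source_extractor(x) for x in source_x]
threshold = fit_threshold(src_feats, source_y)

# Without adaptation: the source extractor is applied directly to target data.
acc_before = accuracy([source_extractor(x) for x in target_x], target_y, threshold)

# With adaptation: only unlabeled target data is used to learn the offset.
offset = adapt_target_offset(src_feats, target_x)
acc_after = accuracy([x + offset for x in target_x], target_y, threshold)

print(acc_before, acc_after, offset)
```

In this toy the domain shift is a pure intensity offset, so aligning the feature means is enough for the unchanged classifier to separate the target classes. Real subtomogram data would call for 3D convolutional extractors and a more expressive discriminator, as provided in the repository linked in the record.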