Popis: |
Large-scale cellular imaging and phenotyping is a widely adopted strategy for understanding biological systems and chemical perturbations. Quantitative analysis of cellular images to identify phenotypic changes is a key challenge within this strategy, and has recently seen promising progress with approaches based on deep neural networks. However, studies so far require pre-segmented images as input, manual phenotype annotations for training, or both. To address these limitations, we have developed an unsupervised approach that exploits the inherent groupings within cellular imaging datasets to define surrogate classes, which are then used to train a multi-scale convolutional neural network. The trained network takes full-resolution microscopy images as input and, without the need for segmentation, outputs feature vectors that support phenotypic profiling. Benchmarked on two diverse datasets, the proposed approach yields accurate phenotypic predictions as well as compound potency estimates comparable to the state of the art. More importantly, we show that the approach identifies novel cellular phenotypes that were neither included in the manual annotations nor detected by previous studies.

Author summary
Cellular microscopy images provide detailed information about how cells respond to genetic or chemical treatments, and have been widely and successfully used in basic research and drug discovery. The recent breakthrough of deep learning methods on natural image recognition tasks has triggered the development and application of such methods to cellular images, with the goal of understanding how cells change upon perturbation. Although successful, deep learning studies so far either accept only images of individual cells as input or require human experts to label a large number of images.
In this paper, we present an unsupervised deep learning approach that, without any human annotation, directly analyzes full-resolution microscopy images typically displaying hundreds of cells. We apply the approach to two benchmark datasets and show that it identifies novel visual phenotypes not detected by previous studies.
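The surrogate-class idea described above can be illustrated with a minimal sketch: images that share an inherent grouping in the dataset (for example, the same treatment well) receive the same integer surrogate label, which can then supervise a standard classifier without any human annotation. All names below are illustrative assumptions, not the paper's actual code; the real pipeline trains a multi-scale convolutional neural network on full-resolution images.

```python
def surrogate_labels(image_groups):
    """Assign each image an integer surrogate class based on its group.

    image_groups: list of (image_id, group_key) pairs, where group_key is
    an inherent grouping in the dataset (e.g. a treatment well). No manual
    phenotype annotation is used; the grouping itself defines the classes.
    """
    labels = {}      # image_id -> surrogate class index
    class_ids = {}   # group_key -> surrogate class index
    for image_id, group in image_groups:
        if group not in class_ids:
            class_ids[group] = len(class_ids)  # new class for a new group
        labels[image_id] = class_ids[group]
    return labels


# Hypothetical example: three images from two treatment wells.
groups = [("img_0", "well_A"), ("img_1", "well_A"), ("img_2", "well_B")]
print(surrogate_labels(groups))  # {'img_0': 0, 'img_1': 0, 'img_2': 1}
```

These surrogate labels would then serve as training targets for the network, whose intermediate feature vectors, rather than the surrogate predictions themselves, are used for phenotypic profiling.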