LPCANet: Classification of Laryngeal Cancer Histopathological Images Using a CNN with Position Attention and Channel Attention Mechanisms

Author:	Chaowei Tang, Xiaoli Zhou, Yanqing Shao, Pan Huang, Francesco Mercaldo, Antonella Santone
Year of publication:	2021
Subject:	Channel (digital image); Computer science; Health Informatics; CAD; Convolutional neural network; General Biochemistry, Genetics and Molecular Biology; Position attention mechanism; Channel attention mechanism; 03 medical and health sciences; Grad_CAM; Histopathological images; Interpretability; Laryngeal cancer classification; Algorithms; Humans; Image Processing, Computer-Assisted; Neural Networks, Computer; Laryngeal Neoplasms; Medical diagnosis; 030304 developmental biology; 0303 health sciences; 030302 biochemistry & molecular biology; Pattern recognition; Gold standard (test); Class (biology); Computer Science Applications; Feature (computer vision); Artificial intelligence
DOI:	10.6084/m9.figshare.14913048
Description:	Laryngeal cancer is one of the most common malignant tumors in otolaryngology, and histopathological image analysis is the gold standard for its diagnosis. However, pathologists' diagnoses are highly subjective, which can lead to missed diagnoses and misdiagnoses. In addition, according to a literature search, no computer-aided diagnosis (CAD) algorithm has yet been applied to the classification of histopathological images of laryngeal cancer. Convolutional neural networks (CNNs) are widely used in other cancer classification tasks, but they may ignore the global and channel relationships within an image, which weakens their feature representation ability. Their lack of interpretability also makes the results difficult for pathologists to accept. We propose a laryngeal cancer classification network (LPCANet) based on a CNN and attention mechanisms. First, the original histopathological images are sequentially cropped into patches. The patches are then fed into a basic ResNet50 to extract local features. Next, a position attention module and a channel attention module are added in parallel to capture the spatial dependency and the channel dependency, respectively. The outputs of the two modules are combined into a fusion feature map that enhances the feature representation and improves classification performance. Finally, the fusion feature map is extracted and visualized with gradient-weighted class activation mapping (Grad_CAM) to give the final results a degree of interpretability. The three-class classification performance of LPCANet is better than that of five state-of-the-art classifiers (VGG16, ResNet50, InceptionV3, Xception and DenseNet121) at both original resolutions (534 × 400 and 1067 × 800). On the 534 × 400 data, LPCANet achieved 73.18% accuracy, 74.04% precision, 73.15% recall, 72.9% F1-score, and 0.8826 AUC. On the 1067 × 800 data, LPCANet achieved 83.15% accuracy, 83.5% precision, 83.1% recall, 83.1% F1-score, and 0.9487 AUC. The results show that LPCANet enhances the feature representation by capturing the global and channel relationships and achieves better classification performance. In addition, the Grad_CAM visual analysis makes the results interpretable, which makes them easier for pathologists to accept and allows the method to serve as a second tool for auxiliary diagnosis.
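Note:	The abstract gives no formulas for the pipeline it describes, so the following is only a minimal NumPy sketch of the cropping and parallel-attention steps, not the authors' implementation. It assumes a DANet-style formulation (softmax affinity over positions and over channels, each with a residual connection) and fusion by element-wise sum. The function names, the patch size, the projection matrices `wq`, `wk`, `wv` (standing in for 1×1 convolutions) and the scale factors `gamma` and `beta` are all hypothetical.

```python
import numpy as np

def crop_patches(image, patch_h, patch_w):
    """Sequentially crop non-overlapping patches, row by row.
    Edge remainders that do not fill a whole patch are dropped (an assumption)."""
    h, w = image.shape[:2]
    return [image[r:r + patch_h, c:c + patch_w]
            for r in range(0, h - patch_h + 1, patch_h)
            for c in range(0, w - patch_w + 1, patch_w)]

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def position_attention(feat, wq, wk, wv, gamma=1.0):
    """Spatial dependency: every position attends to every other position.
    feat: (C, H, W); wq, wk: (C_reduced, C); wv: (C, C)."""
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)           # (C, N) with N = H * W
    q, k, v = wq @ x, wk @ x, wv @ x     # query, key, value projections
    attn = softmax(q.T @ k, axis=-1)     # (N, N); row i weights all positions j
    out = v @ attn.T                     # (C, N); weighted sum of values
    return (gamma * out + x).reshape(c, h, w)

def channel_attention(feat, beta=1.0):
    """Channel dependency: every channel attends to every other channel."""
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)           # (C, N)
    attn = softmax(x @ x.T, axis=-1)     # (C, C) channel affinity
    out = attn @ x                       # (C, N)
    return (beta * out + x).reshape(c, h, w)

def fuse(feat, wq, wk, wv):
    """Run both modules in parallel on the same backbone features and sum them.
    Element-wise summation is an assumed fusion rule."""
    return position_attention(feat, wq, wk, wv) + channel_attention(feat)
```

In this sketch, `feat` stands for the feature map a backbone such as ResNet50 would produce for one patch; the fused output has the same shape as `feat`, so it could be passed on to a classifier head or to a Grad_CAM-style visualization.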
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::e5809d062320676128337f05d6bc0402