Deep-Sea Fauna Segmentation: A Comparative Analysis of Convolutional and Vision Transformer Architectures at Lucky Strike Vent Field

Author:	P. J. Soto Vega, G. X. Andrade-Miranda, G. A. O. P. da Costa, P. Papadakis, M. Matabos, T. Napoleon, A. Karine, H. Fagundes Gasparoto
Language:	English
Publication year:	2024
Subject:	Technology; Engineering (General). Civil engineering (General) (TA1-2040); Applied optics. Photonics (TA1501-1820)
Source:	ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol X-3-2024, Pp 387-395 (2024)
Document type:	article
ISSN:	2194-9042, 2194-9050
DOI:	10.5194/isprs-annals-X-3-2024-387-2024
Description:	Due to recent technological developments, the acquisition and availability of deep-sea imagery have increased exponentially in recent years, leading to a growing backlog in image annotation and processing that is attributable to limited specialized human resources. In this work, we investigate the performance of a well-established convolutional neural network and a Vision Transformer (ViT) based architecture, namely DeepLabv3+ and UNETR, respectively, for the segmentation of fauna in deep-sea images. The dataset consists of images of three edifices, named Montsegur, White Castle, and Eiffel Tower, captured at the Lucky Strike Vent Field on the Mid-Atlantic Ridge. Our experimental investigation reveals that the Vision Transformer consistently outperforms the fully convolutional deep learning architecture, by approximately 14% in terms of F1-Score, demonstrating the effectiveness of ViTs in capturing the intricate patterns and long-range dependencies present in deep-sea imagery. Our findings highlight the potential of ViTs as a promising approach for accurate semantic segmentation in challenging environmental contexts, paving the way for improved understanding and analysis of deep-sea ecosystems.
Database:	Directory of Open Access Journals
External link:	https://doaj.org/article/74bf88f73b2a4dc9b4c4fc8ccee87960