Vocal Tract Contour Tracking in rtMRI Using Deep Temporal Regression Network

Authors: Engin Erzin, Sasan Asadiabadi
Contributors: Asadiabadi, Sasan, Erzin, Engin (ORCID 0000-0002-2715-2368 & YÖK ID 34503), Graduate School of Sciences and Engineering, College of Engineering, Department of Electrical and Electronics Engineering, Department of Computer Engineering
Year of publication: 2020
Source: IEEE/ACM Transactions on Audio, Speech, and Language Processing
ISSN: 2329-9290, 2329-9304
DOI: 10.1109/taslp.2020.3036182
Description: Recent advances in real-time Magnetic Resonance Imaging (rtMRI) provide an invaluable tool for studying speech articulation. In this paper, we present an effective deep learning approach for supervised detection and tracking of vocal tract contours in a sequence of rtMRI frames. We train a single-input multiple-output deep temporal regression network (DTRN) to detect the vocal tract (VT) contour and the separation boundaries between different articulators. The DTRN learns the non-linear mapping from an overlapping fixed-length sequence of rtMRI frames to the corresponding articulatory movements, where a blend of the overlapping contour estimates defines the detected VT contour. The detected contour is refined in a post-processing stage using an appearance model to further improve the accuracy of VT contour detection. The proposed VT contour tracking model is trained and evaluated on the USC-TIMIT dataset. Performance evaluation is carried out using three objective assessment metrics covering separating-landmark detection, contour tracking, and temporal stability of the contour landmarks, in comparison with three baseline approaches from the recent literature. Results indicate significant improvements with the proposed method over the state-of-the-art baselines.
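The abstract describes a sliding-window regression scheme: a network maps a fixed-length window of rtMRI frames to per-frame contour landmarks, and overlapping window predictions are blended into one contour track. The sketch below is a minimal illustration of that idea only, not the authors' implementation; the window length, frame resolution, landmark count, network layers, and simple averaging blend are all illustrative assumptions.

```python
# Minimal sketch of a sliding-window temporal regression with overlap blending.
# All sizes and layer choices are assumptions for illustration, not values from the paper.
import torch
import torch.nn as nn

WIN = 5            # frames per input window (assumed)
H = W = 68         # rtMRI frame resolution (assumed)
N_LANDMARKS = 170  # contour landmarks per frame (assumed)

class TemporalRegressionNet(nn.Module):
    """Single-input multiple-output regressor: one window in, per-frame landmarks out."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(WIN, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Flatten(),
        )
        feat = 64 * (H // 4) * (W // 4)
        # one regression head per frame in the window (multiple outputs)
        self.heads = nn.ModuleList(
            [nn.Linear(feat, 2 * N_LANDMARKS) for _ in range(WIN)]
        )

    def forward(self, x):                      # x: (B, WIN, H, W)
        z = self.encoder(x)
        return torch.stack([h(z) for h in self.heads], dim=1)  # (B, WIN, 2*N)

def blend_overlapping(video, model):
    """Slide the window over the clip with stride 1 and average the overlapping
    per-frame predictions into a single contour estimate per frame."""
    T = video.shape[0]
    acc = torch.zeros(T, 2 * N_LANDMARKS)
    cnt = torch.zeros(T, 1)
    with torch.no_grad():
        for t in range(T - WIN + 1):
            pred = model(video[t:t + WIN].unsqueeze(0))[0]   # (WIN, 2*N)
            acc[t:t + WIN] += pred
            cnt[t:t + WIN] += 1
    return acc / cnt                                          # (T, 2*N)

model = TemporalRegressionNet()
video = torch.rand(40, H, W)           # toy 40-frame rtMRI clip
contours = blend_overlapping(video, model)
print(contours.shape)                  # torch.Size([40, 340])
```

The appearance-model refinement mentioned in the abstract would operate on these blended landmark estimates as a separate post-processing step; it is omitted here.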
Database: OpenAIRE