Bilingual Speech Recognition by Estimating Speaker Geometry from Video Data

Authors: Marios S. Pattichis, Antonio Gomez, Sylvia Celedón-Pattichis, Carlos LopezLeiva, Venkatesh Jatla, Mario Esparza, Luis Sanchez Tapia
Publication year: 2021
Subject:
Source: Computer Analysis of Images and Patterns, ISBN: 9783030891275
CAIP (1)
DOI: 10.1007/978-3-030-89128-2_8
Description: Speech recognition is very challenging in student learning environments, which are characterized by significant cross-talk and background noise. To address this problem, we present a bilingual speech recognition system that uses an interactive video analysis system to estimate the 3D speaker geometry for realistic audio simulations. We demonstrate the use of our system in generating a complex audio dataset whose significant cross-talk and background noise approximate real-life classroom recordings. We then test our proposed system with real-life recordings.
Database: OpenAIRE