Bilingual Speech Recognition by Estimating Speaker Geometry from Video Data
Authors: | Marios S. Pattichis, Antonio Gomez, Sylvia Celedón-Pattichis, Carlos LopezLeiva, Venkatesh Jatla, Mario Esparza, Luis Sanchez Tapia |
---|---|
Publication year: | 2021 |
Subject: | |
Source: | Computer Analysis of Images and Patterns ISBN: 9783030891275 CAIP (1) |
DOI: | 10.1007/978-3-030-89128-2_8 |
Description: | Speech recognition is very challenging in student learning environments that are characterized by significant cross-talk and background noise. To address this problem, we present a bilingual speech recognition system that uses an interactive video analysis system to estimate the 3D speaker geometry for realistic audio simulations. We demonstrate the use of our system in generating a complex audio dataset that contains significant cross-talk and background noise that approximate real-life classroom recordings. We then test our proposed system with real-life recordings. |
Database: | OpenAIRE |
External link: |