Showing 1 - 10 of 229 for search: '"Berger, Simon"'
The ongoing research scenario for automatic speech recognition (ASR) envisions a clear division between end-to-end approaches and classic modular systems. Even though a high-level comparison between the two approaches in terms of their requirements a
External link:
http://arxiv.org/abs/2407.11641
Author:
Vieting, Peter, Berger, Simon, von Neumann, Thilo, Boeddeker, Christoph, Schlüter, Ralf, Haeb-Umbach, Reinhold
Many real-life applications of automatic speech recognition (ASR) require processing of overlapped speech. A common method involves first separating the speech into overlap-free streams and then performing ASR on the resulting signals. Recently, the i
External link:
http://arxiv.org/abs/2309.08454
Multi-speaker automatic speech recognition (ASR) is crucial for many real-world applications, but it requires dedicated modeling techniques. Existing approaches can be divided into modular and end-to-end methods. Modular approaches separate speakers
External link:
http://arxiv.org/abs/2306.12173
Modern public ASR tools usually provide rich support for training various sequence-to-sequence (S2S) models, but rather simple support for decoding open-vocabulary scenarios only. For closed-vocabulary scenarios, public tools supporting lexical-const
External link:
http://arxiv.org/abs/2305.17782
In this work, we compare from-scratch sequence-level cross-entropy (full-sum) training of Hidden Markov Model (HMM) and Connectionist Temporal Classification (CTC) topologies for automatic speech recognition (ASR). Besides accuracy, we further analyz
External link:
http://arxiv.org/abs/2210.09951
Author:
Kohlbrenner, Tea, Berger, Simon, Laranjeira, Ana Cristina, Aegerter-Wilmsen, Tinri, Comi, Laura Filomena, deMello, Andrew, Hajnal, Alex (alex.hajnal@mls.uzh.ch)
Published in:
PLoS Biology. 8/23/2024, Vol. 22 Issue 8, p1-27. 27p.
Published in:
In Current Biology 3 June 2024 34(11):2373-2386
To join the advantages of classical and end-to-end approaches for speech recognition, we present a simple, novel and competitive approach for phoneme-based neural transducer modeling. Different alignment label topologies are compared and word-end-bas
External link:
http://arxiv.org/abs/2010.16368
Author:
Rutkauskaite, Justina, Berger, Simon, Stavrakis, Stavros, Dressler, Oliver, Heyman, John, Casadevall i Solvas, Xavier, deMello, Andrew, Mazutis, Linas
Published in:
In iScience 15 July 2022 25(7)
Author:
Berger, Simon, Hofmann, Robert
Published in:
Geomechanik und Tunnelbau; Oct2024, Vol. 17 Issue 5, p535-544, 10p