Showing 1 - 10 of 27 for search: '"Matthias Sperber"'
Published in:
Transactions of the Association for Computational Linguistics, Vol 2 (2021)
External link:
https://doaj.org/article/6e8fae6669ab410288964cfbda7e8a89
Published in:
Transactions of the Association for Computational Linguistics, Vol 8 (2021)
External link:
https://doaj.org/article/8c52ff0fb5944971ac2c3488d0e4e677
Published in:
Transactions of the Association for Computational Linguistics, Vol 7, Pp 313-325 (2019)
Speech translation has traditionally been approached through cascaded models consisting of a speech recognizer trained on a corpus of transcribed speech, and a machine translation system trained on parallel texts. Several recent works have shown the
Published in:
EACL
Using end-to-end models for speech translation (ST) has increasingly been the focus of the ST community. These models condense the previously cascaded systems by directly converting sound waves into translated text. However, cascaded models have the
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::88519cf01ed79080cfa48f6f417974cd
The conventional paradigm in speech translation starts with a speech recognition step to generate transcripts, followed by a translation step with the automatic transcripts as input. To address various shortcomings of this paradigm, recent work explo
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::c317e9a7e4928337c1914027dd6cbafe
http://arxiv.org/abs/2007.12741
Published in:
NAACL-HLT (1)
Spoken language translation applications for speech suffer due to conversational speech phenomena, particularly the presence of disfluencies. With the rise of end-to-end speech translation models, processing steps such as disfluency removal that were
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::d91162deade31b2eba602b3cf331a677
http://arxiv.org/abs/1906.00556
Published in:
ACL (1)
Lattices are an efficient and effective method to encode ambiguity of upstream systems in natural language processing tasks, for example to compactly capture multiple speech recognition hypotheses, or to represent multiple linguistic analyses. Previo
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::56e9bc65981e992f057eb606447838e3
Published in:
ACL (1)
Previous work on end-to-end translation from speech has primarily used frame-level features as speech representations, which creates longer, sparser sequences than text. We show that a naive method to create compressed phoneme-like speech representat
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::1fc922d3b6ff4bda6cf02accc36cf579
Published in:
INTERSPEECH
Self-attention is a method of encoding sequences of vectors by relating these vectors to each other based on pairwise similarities. These models have recently shown promising results for modeling discrete sequences, but they are non-trivial to apply
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::1685f7efb9f687ee470812ae89c7eeab
http://arxiv.org/abs/1803.09519
Published in:
Speech and Computer ISBN: 9783319995786
SPECOM
Estimating cepstral mean and variance normalization (CMVN) in run-on and real-time settings poses several challenges. Using a moving average for variance and mean estimation requires a comparatively long history of data from a speaker which is not ap
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::6b2938002d9c2ae85c6f05fe500ce6f1
https://doi.org/10.1007/978-3-319-99579-3_47