Showing 1 - 10 of 23 for search: '"Ondrej Klejch"'
Published in:
IEEE Open Journal of Signal Processing, Vol 2, Pp 33-66 (2021)
We present a structured overview of adaptation algorithms for neural network-based speech recognition, considering both hybrid hidden Markov model / neural network systems and end-to-end neural network systems, with a focus on speaker adaptation, dom
External link:
https://doaj.org/article/051266d6d8f14251a11a9ac33b8a5a6c
Published in:
IEEE Open Journal of Signal Processing, Vol 2, Pp 33-66 (2021)
IEEE Open Journal of Signal Processing
Bell, P, Fainberg, J, Klejch, O, Li, J, Renals, S & Swietojanski, P 2021, 'Adaptation Algorithms for Neural Network-Based Speech Recognition: An Overview', IEEE Open Journal of Signal Processing, vol. 2, pp. 33-66. https://doi.org/10.1109/OJSP.2020.3045349
Authors:
Thomas Reitmaier, Electra Wallington, Dani Kalarikalayil Raju, Ondrej Klejch, Jennifer Pearson, Matt Jones, Peter Bell, Simon Robinson
Published in:
Reitmaier, T, Wallington, E, Kalarikalayil Raju, D, Klejch, O, Pearson, J, Jones, M, Bell, P & Robinson, S 2022, Opportunities and Challenges of Automatic Speech Recognition Systems for Low-Resource Language Speakers. in S Barbosa, C Lampe, C Appert, D A Shamma, S Drucker, J Williamson & K Yatani (eds), Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems., 299, CHI '22, New York, NY, USA, The ACM CHI Conference on Human Factors in Computing Systems 2022, New Orleans, Louisiana, United States, 30/04/22. https://doi.org/10.1145/3491102.3517639
Automatic Speech Recognition (ASR) researchers are turning their attention towards supporting low-resource languages, such as isiXhosa or Marathi, with only limited training resources. We report and reflect on collaborative research across ASR & HCI
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::7f9eb88673c0c06363ca510952d1f77b
https://cronfa.swan.ac.uk/Record/cronfa59573/Download/59573__22666__12859841e0c34bc9949e1c476fc39f76.pdf
Authors:
Ondrej Klejch, Ulrich Germann, Penny Labropoulou, Rémi Calizzano, Julija Melnika, Jana Hamrlová, Athanasia Kolovou, Dimitris Gkoumas, Mickaël Rigault, Miltos Deligiannis, Cristian Berrio, Lukáš Kačena, Jose Manuel Gomez-Perez, Victoria Arranz, Kalina Bontcheva, Dimitris Galanis, Katrin Marheinecke, Katja Prinz, Khalid Choukri, Katerina Gkirtzou, Jan Hajič, Valérie Mapelli, Julián Moreno-Schneider, Steve Renals, Nils Feldhus, Florian Kintzel, Stefanie Hegele, Dusan Varis, Andres Garcia-Silva, Gerhard Backfried, Leon Voukoutis, Stelios Piperidis, Georg Rehm, Ian Roberts, Andrejs Vasiļjevs, Miro Janosik
Published in:
Rehm, G, Piperidis, S, Bontcheva, K, Hajic, J, Arranz, V, Vasiljevs, A, Backfried, G, Gomez-Perez, J M, Germann, U, Calizzano, R, Feldhus, N, Hegele, S, Kintzel, F, Marheinecke, K, Moreno-Schneider, J, Galanis, D, Labropoulou, P, Deligiannis, M, Gkirtzou, K, Kolovou, A, Gkoumas, D, Voukoutis, L, Roberts, I, Hamrlova, J, Varis, D, Kacena, L, Choukri, K, Mapelli, V, Rigault, M, Melnika, J, Janosik, M, Prinz, K, Garcia-Silva, A, Berrio, C, Klejch, O & Renals, S 2021, European Language Grid: A Joint Platform for the European Language Technology Community. in Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations. Online, pp. 221-230, 16th conference of the European Chapter of the Association for Computational Linguistics, Virtual Conference, 19/04/21. <https://www.aclweb.org/anthology/2021.eacl-demos.26>
Scopus-Elsevier
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations
EACL (System Demonstrations)
Europe is a multilingual society, in which dozens of languages are spoken. The only option to enable and to benefit from multilingualism is through Language Technologies (LT), i.e., Natural Language Processing and Speech Technologies. We describe the
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::6d414f07b3b9fb74ce0da08025276295
https://hdl.handle.net/11346/BIBLIO@id=8451545542355799050
Published in:
Klejch, O, Wallington, E & Bell, P 2022, Deciphering Speech: a Zero-Resource Approach to Cross-Lingual Transfer in ASR. in H Ko & J H L Hansen (eds), Proceedings of Interspeech 2022. pp. 2288-2292, Interspeech 2022, Incheon, Korea, Republic of, 18/09/22. https://doi.org/10.21437/Interspeech.2022-10170
We present a method for cross-lingual training of an ASR system using absolutely no transcribed training data from the target language, and with no phonetic knowledge of the language in question. Our approach uses a novel application of a decipherment a
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::d74559f89f6cf738a6ea263d89f05645
Authors:
Caroline Pantofaru, Ondrej Klejch, Cordelia Schmid, Joseph Roth, Arkadiusz Stopczynski, Sharadh Ramaswamy, Zhonghua Xi, Radhika Marvin, Andrew C. Gallagher, Sourish Chaudhuri, Liat Kaver
Published in:
ICASSP
Active speaker detection is an important component in video analysis algorithms for applications such as speaker diarization, video re-targeting for meetings, speech enhancement, and human-robot interaction. The absence of a large, carefully labeled
Published in:
Information Retrieval Technology ISBN: 9783030428341
AIRS
Cross-language speech retrieval systems face a cascade of errors due to transcription and translation ambiguity. Using 1-best speech recognition and 1-best translation in such a scenario could adversely affect recall if those 1-best system guesses ar
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::22b0dc466db06b480706ac744aad3f15
https://doi.org/10.1007/978-3-030-42835-8_13
Published in:
ASRU
Fainberg, J, Klejch, O, Loweimi, E, Bell, P & Renals, S 2020, Acoustic model adaptation from raw waveforms with SincNet. in 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). Institute of Electrical and Electronics Engineers (IEEE), pp. 897-904, IEEE Automatic Speech Recognition and Understanding Workshop 2019, Sentosa, Singapore, 14/12/19. https://doi.org/10.1109/ASRU46091.2019.9003974
Raw waveform acoustic modelling has recently gained interest due to neural networks' ability to learn feature extraction, and the potential for finding better representations for a given scenario than hand-crafted features. SincNet has been proposed
Published in:
2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
ASRU
Klejch, O, Fainberg, J, Bell, P & Renals, S 2020, Speaker adaptive training using model agnostic meta-learning. in 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). Institute of Electrical and Electronics Engineers (IEEE), pp. 881-888, IEEE Automatic Speech Recognition and Understanding Workshop 2019, Sentosa, Singapore, 14/12/19. https://doi.org/10.1109/ASRU46091.2019.9003751
Speaker adaptive training (SAT) of neural network acoustic models learns models in a way that makes them more suitable for adaptation to test conditions. Conventionally, model-based speaker adaptive training is performed by having a set of speaker de
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::57ecec7b0bd9134d433eea63f8f4a0b3
http://arxiv.org/abs/1910.10605
Published in:
INTERSPEECH
Fainberg, J, Klejch, O, Renals, S & Bell, P 2019, Lattice-based lightly-supervised acoustic model training. in Proceedings Interspeech 2019. pp. 1596-1600, Interspeech 2019, Graz, Austria, 15/09/19. https://doi.org/10.21437/Interspeech.2019-2533
Interspeech 2019
In the broadcast domain there is an abundance of related text data and partial transcriptions, such as closed captions and subtitles. This text data can be used for lightly supervised training, in which text matching the audio is selected using an ex