Showing 1 - 10 of 21 for search: '"Maxim Korenevsky"'
Author:
Yuri Y. Khokhlov, Andrei Andrusenko, Maxim Korenevsky, Ivan Medennikov, Mariya Korenevskaya, Aleksandr Laptev, Anton Mitrofanov, Aleksei Romanenko, Ivan Podluzhny, Aleksei Ilin
Neural network-based language models are commonly used in rescoring approaches to improve the quality of modern automatic speech recognition (ASR) systems. Most of the existing methods are computationally expensive since they use autoregressive langu
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::5cf51c8c93b6c2f3481391c44d0dce6a
http://arxiv.org/abs/2104.02526
Author:
Ivan Medennikov, Maxim Korenevsky, Tatiana Prisyach, Yuri Khokhlov, Mariya Korenevskaya, Ivan Sorokin, Tatiana Timofeeva, Anton Mitrofanov, Andrei Andrusenko, Ivan Podluzhny, Aleksandr Laptev, Aleksei Romanenko
Published in:
6th International Workshop on Speech Processing in Everyday Environments (CHiME 2020).
Author:
Ivan Podluzhny, Ivan Sorokin, Yuri Y. Khokhlov, Andrei Andrusenko, Tatiana Prisyach, Mariya Korenevskaya, Aleksei Romanenko, Aleksandr Laptev, Maxim Korenevsky, Ivan Medennikov, Tatiana Timofeeva, Anton Mitrofanov
Published in:
INTERSPEECH
Speaker diarization for real-life scenarios is an extremely challenging problem. Widely used clustering-based diarization approaches perform rather poorly in such conditions, mainly due to the limited ability to handle overlapping speech. We propose
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::4ae9f336565297623daccc3ef226a606
Author:
Maxim Korenevsky
Published in:
Speech Communication. 89:84-91
Highlights: Vector Taylor Series (VTS) is a popular approach in robust speech recognition. Speech distortion model taking phase term into account is more accurate. Phase term can be modeled as a Gaussian random vector. Phase term modeling improves speech
Author:
Ivan Sorokin, Ivan Podluzhny, Aleksei Romanenko, Maxim Korenevsky, Andrei Andrusenko, Tatiana Prisyach, Ivan Medennikov, Tatiana Timofeeva, Anton Mitrofanov, Yuri Y. Khokhlov, Aleksandr Laptev, Mariya Korenevskaya
Published in:
5th International Workshop on Speech Processing in Everyday Environments (CHiME 2018).
Author:
Maxim Korenevsky, Nico Axtmann, David Suendermann-Oeft, Najmeh Sadoughi, Michael Brenndoerfer, Amanda L. Robinson, Mark Miller, Erik Edwards, Greg P. Finley
Published in:
Speech and Computer ISBN: 9783319995786
SPECOM
A synthetic corpus of dialogs was constructed from the LibriSpeech corpus, and is made freely available for diarization research. It includes over 90 h of training data, and over 9 h each of development and test data. Both 2-person and 3-person dialo
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::e5f5be605c2eef975348aa12af96f39a
https://doi.org/10.1007/978-3-319-99579-3_13
Author:
David Suendermann-Oeft, Amanda L. Robinson, Mark Miller, Michael Brenndoerfer, Erik Edwards, Nico Axtmann, Greg P. Finley, Maxim Korenevsky, Najmeh Sadoughi
Published in:
Speech and Computer ISBN: 9783319995786
SPECOM
A top-down approach to speaker diarization is developed using a modified Baum-Welch algorithm. The HMM states combine phonemes according to structural positions under syllabic phonological theory. By nature of the structural phonology, there are at m
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::75523d0ff686e577ecc913844294cbf2
https://doi.org/10.1007/978-3-319-99579-3_14
Published in:
Speech and Computer ISBN: 9783319995786
SPECOM
In this work we present a simple grapheme-based system for low-resource speech recognition using Babel data for Turkish spontaneous speech (80 h). We have investigated different neural network architectures performance, including fully-convolutional, r
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::bb8b5b95c24a6768c3f594b2ab4adcec
https://doi.org/10.1007/978-3-319-99579-3_4
Author:
Nico Axtmann, Maxim Korenevsky, Michael Brenndoerfer, Najmeh Sadoughi, David Suendermann-Oeft, Erik Edwards, Greg P. Finley, Wael Salloum, Amanda L. Robinson, Mark Miller
Published in:
Speech and Computer ISBN: 9783319995786
SPECOM
Training models for speech recognition usually requires accurate word-level transcription of available speech data. For the domain of medical dictations, it is common to have “semi-literal” transcripts available: large numbers of speech files alo
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::b645588d32900fcea8dab1602591db28
https://doi.org/10.1007/978-3-319-99579-3_19
Author:
Amanda L. Robinson, Najmeh Sadoughi, Maxim Korenevsky, David Suendermann-Oeft, Mark Miller, Nico Axtmann, Greg P. Finley, Michael Brenndoerfer, Erik Edwards
Published in:
Speech and Computer ISBN: 9783319995786
SPECOM
We present a section boundary detection framework specifically for clinical dictations. Detection is cast as a semi-supervised binary tagging problem and solved using a neural network model composed of a stack of embeddings, unidirectional long-short
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::0e4723470686e61838151e539fd8d800
https://doi.org/10.1007/978-3-319-99579-3_58