Showing 1 - 10 of 12 for search: '"Shruti Palaskar"'
Published in:
ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Author:
Mark Hasegawa-Johnson, Lucas Ondel, Elin Larsen, Shruti Palaskar, Liming Wang, Sebastian Stüker, Francesco Ciannella, Markus Müller, Odette Scharenborg, Rachid Riad, Florian Metze, Pierre Godard, Laurent Besacier, Mingxing Du, Alan W. Black, Danny Merkx, Emmanuel Dupoux, Philip Arthur, Graham Neubig
Published in:
IEEE/ACM Transactions on Audio, Speech and Language Processing
IEEE/ACM Transactions on Audio, Speech and Language Processing, 2020, ⟨10.1109/TASLP.2020.2973896⟩
IEEE/ACM Transactions on Audio Speech and Language Processing, 28, 964-975
IEEE/ACM Transactions on Audio Speech and Language Processing, 28, pp. 964-975
IEEE/ACM Transactions on Audio, Speech and Language Processing, Institute of Electrical and Electronics Engineers, 2020, ⟨10.1109/TASLP.2020.2973896⟩
IEEE-ACM Transactions on Audio, Speech, and Language Processing, 28
International audience; Speech technology plays an important role in our everyday life. Speech is, among others, used for human-computer interaction, including, for instance, information retrieval and on-line shopping. In the case of an unwritten lan
Published in:
Interspeech 2021.
Published in:
ICASSP
Off-the-shelf pre-trained Automatic Speech Recognition (ASR) systems are an increasingly viable service for companies of any size building speech-based products. While these ASR systems are trained on large amounts of data, domain mismatch is still a
Published in:
Proceedings of the First Workshop on Natural Language Processing for Medical Conversations.
Domain Adaptation for Automatic Speech Recognition (ASR) error correction via machine translation is a useful technique for improving out-of-domain outputs of pre-trained ASR systems to obtain optimal results for specific in-domain tasks. We use this
Author:
Xavier Giro-i-Nieto, Shruti Palaskar, Deepti Ghadiyaram, Kenneth DeHaan, Amanda Duarte, Lucas Ventura, Jordi Torres, Florian Metze
Published in:
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
Digital.CSIC. Repositorio Institucional del CSIC
CVPR
Paper presented at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), held virtually from June 19 to 25, 2021
One of the factors that have hindered progress in the areas of sign language recognition, tr
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::8b8f11f0f1f12b91b5cc9293c6c8c458
Published in:
ICASSP
End-to-end acoustic-to-word speech recognition models have recently gained popularity because they are easy to train, scale well to large amounts of training data, and do not require a lexicon. In addition, word models may also be easier to integrate
Published in:
ACL (1)
In this paper, we study abstractive summarization for open-domain videos. Unlike the traditional text news summarization, the goal is less to "compress" text information but rather to provide a fluent textual summary of information that has been coll
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::6aa27daa5a29233765228267168bf6df
Published in:
Computer Speech & Language. 64:101093
Audio-Visual Scene-Aware Dialog (AVSD) is best understood as an extension of Visual Question Answering, the task of generating a textual answer in response to a textual question on multi-media content. In AVSD, the answer-relevant “context” is ex
Published in:
ICASSP
An increasing number of datasets contain multiple views, such as video, sound and automatic captions. A basic challenge in representation learning is how to leverage multiple views to learn better representations. This is further complicated by the e
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::63791503bec8fae677ca8433a62dce28
http://arxiv.org/abs/1811.08890