Showing 1 - 6 of 6 for search: '"Kushal Lakhotia"'
Author:
Kushal Lakhotia, Eugene Kharitonov, Wei-Ning Hsu, Yossi Adi, Adam Polyak, Benjamin Bolte, Tu-Anh Nguyen, Jade Copet, Alexei Baevski, Abdelrahman Mohamed, Emmanuel Dupoux
Published in:
Transactions of the Association for Computational Linguistics, Vol 9, Pp 1336-1354 (2021)
Abstract: We introduce Generative Spoken Language Modeling, the task of learning the acoustic and linguistic characteristics of a language from raw audio (no text, no labels), and a set of metrics to automatically evaluate the learned representations…
External link:
https://doaj.org/article/9f80ed410bfd419cbf7950582c5e58b4
Author:
Asish Ghoshal, Srinivasan Iyer, Bhargavi Paranjape, Kushal Lakhotia, Scott Wen-tau Yih, Yashar Mehdad
Published in:
Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval.
Author:
Ruslan Salakhutdinov, Wei-Ning Hsu, Yao-Hung Hubert Tsai, Benjamin Bolte, Abdelrahman Mohamed, Kushal Lakhotia
Published in:
IEEE/ACM Transactions on Audio, Speech, and Language Processing. 29:3451-3460
Self-supervised approaches for speech representation learning are challenged by three unique problems: (1) there are multiple sound units in each input utterance, (2) there is no lexicon of input sound units during the pre-training phase, and (3) sou…
Author:
Arun Babu, Changhan Wang, Andros Tjandra, Kushal Lakhotia, Qiantong Xu, Naman Goyal, Kritika Singh, Patrick von Platen, Yatharth Saraf, Juan Pino, Alexei Baevski, Alexis Conneau, Michael Auli
This paper presents XLS-R, a large-scale model for cross-lingual speech representation learning based on wav2vec 2.0. We train models with up to 2B parameters on nearly half a million hours of publicly available speech audio in 128 languages, an orde…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::d9d1740f72468c26efa5658360ae3194
http://arxiv.org/abs/2111.09296
Author:
Po-Han Chi, Hung-yi Lee, Zili Huang, Ko-tik Lee, Shang-Wen Li, Tzu-hsien Huang, Guan-Ting Lin, Wei-Cheng Tseng, Jiatong Shi, Yung-Sung Chuang, Shinji Watanabe, Yist Y. Lin, Da-Rong Liu, Andy T. Liu, Shuyan Dong, Cheng-I Jeff Lai, Xuankai Chang, Shu-wen Yang, Abdelrahman Mohamed, Kushal Lakhotia
Published in:
Interspeech 2021.
Self-supervised learning (SSL) has proven vital for advancing research in natural language processing (NLP) and computer vision (CV). The paradigm pretrains a shared model on large volumes of unlabeled data and achieves state-of-the-art (SOTA) for va…
Author:
Emmanuel Dupoux, Jade Copet, Wei-Ning Hsu, Kushal Lakhotia, Eugene Kharitonov, Yossi Adi, Abdelrahman Mohamed, Adam Polyak
Published in:
HAL
INTERSPEECH 2021 - Annual Conference of the International Speech Communication Association, Aug 2021, Brno, Czech Republic
We propose using self-supervised discrete representations for the task of speech resynthesis. To generate disentangled representation, we separately extract low-bitrate representations for speech content, prosodic information, and speaker identity. T…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::944641e134d739dd6f410d784b6d6ff4
https://hal.inria.fr/hal-03329245