Showing 1 - 9 of 9 for search: '"Hassid, Michael"'
It is a common belief that large language models (LLMs) are better than smaller-sized ones. However, larger models also require significantly more time and compute during inference. This begs the question: what happens when both models operate under …
External link:
http://arxiv.org/abs/2404.00725
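The snippet above poses the question of comparing model sizes under a shared inference budget. Below is a minimal, purely illustrative sketch of that idea: one large-model sample versus several small-model samples at the same assumed cost, with the best small-model candidate picked by a scorer. All functions, costs, and names here are hypothetical stand-ins, not the paper's implementation.

```python
# Illustrative only: trade one large-model sample for several small-model samples
# under a fixed budget, then keep the best small-model candidate by score.
import random

def generate(model_size: str, prompt: str, seed: int) -> str:
    """Hypothetical generator: returns a candidate solution string."""
    random.seed(hash((model_size, prompt, seed)) % (2**32))
    return f"{model_size}-candidate-{random.randint(0, 9)}"

def score(candidate: str) -> float:
    """Hypothetical scorer, e.g. fraction of unit tests passed."""
    return random.random()

COST = {"large": 8.0, "small": 1.0}   # assumed relative inference cost per sample
BUDGET = 8.0                          # assumed total compute budget per prompt

prompt = "write a function that reverses a string"

# Option A: one sample from the large model.
large_best = generate("large", prompt, seed=0)

# Option B: as many small-model samples as the same budget allows, keep the best.
n_small = int(BUDGET // COST["small"])
small_candidates = [generate("small", prompt, seed=i) for i in range(n_small)]
small_best = max(small_candidates, key=score)

print("large model (1 sample):", large_best)
print(f"small model ({n_small} samples), best by score:", small_best)
```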
Transformers are considered conceptually different from the previous generation of state-of-the-art NLP models - recurrent neural networks (RNNs). In this work, we demonstrate that decoder-only transformers can in fact be conceptualized as unbounded …
External link:
http://arxiv.org/abs/2401.06104
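The entry above views decoder-only transformers as recurrent models. A minimal NumPy sketch of that view: each generation step reads a key/value cache that grows by one slot per token, so autoregressive decoding looks like an RNN whose state is that unbounded, multi-slot cache. Single head, no MLP or layer norm; dimensions and weights are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                                   # model/head dimension (assumed)
W_q, W_k, W_v = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))

def decoder_step(x, state):
    """One autoregressive step: grow the multi-slot state, attend over it."""
    k, v = x @ W_k, x @ W_v
    state["K"].append(k)                # the state grows by one slot per token,
    state["V"].append(v)                # hence an "unbounded" multi-state RNN
    K, V = np.stack(state["K"]), np.stack(state["V"])
    q = x @ W_q
    scores = K @ q / np.sqrt(d)
    attn = np.exp(scores - scores.max())
    attn /= attn.sum()
    return attn @ V, state              # output representation, updated state

state = {"K": [], "V": []}
for t in range(5):                      # feed 5 toy token embeddings
    x_t = rng.standard_normal(d)
    y_t, state = decoder_step(x_t, state)
print("state slots after 5 tokens:", len(state["K"]))
```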
Author:
Nguyen, Tu Anh, Hsu, Wei-Ning, D'Avirro, Antony, Shi, Bowen, Gat, Itai, Fazel-Zarani, Maryam, Remez, Tal, Copet, Jade, Synnaeve, Gabriel, Hassid, Michael, Kreuk, Felix, Adi, Yossi, Dupoux, Emmanuel
Recent work has shown that it is possible to resynthesize high-quality speech based, not on text, but on low bitrate discrete units that have been learned in a self-supervised fashion and can therefore capture expressive aspects of speech that are hard …
External link:
http://arxiv.org/abs/2308.05725
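The snippet above rests on discretizing speech into low-bitrate units. A schematic sketch of that step: continuous self-supervised features are quantized with k-means and collapsed into a unit stream that a resynthesis model would consume. Random vectors stand in for real HuBERT-style features; this is not the pipeline of the paper above.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
frames = rng.standard_normal((200, 16))        # 200 frames of 16-dim "features" (toy)

kmeans = KMeans(n_clusters=10, n_init=10, random_state=0).fit(frames)
units = kmeans.predict(frames)                 # discrete unit id per frame

# Collapse repeated units, as is common when building low-bitrate unit streams.
deduped = [int(units[0])] + [int(u) for prev, u in zip(units, units[1:]) if u != prev]
print("unit stream (first 20):", deduped[:20])
```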
Adaptive inference is a simple method for reducing inference costs. The method works by maintaining multiple classifiers of different capacities, and allocating resources to each test instance according to its difficulty. In this work, we compare the …
External link:
http://arxiv.org/abs/2306.02307
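The entry above describes allocating instances to classifiers by difficulty. A minimal sketch of that general idea, assuming a simple confidence-based cascade: a cheap classifier keeps the instances it is confident about, and only the remainder is passed to a larger one. The models and threshold are illustrative choices, not the paper's exact setup.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = X[:1500], X[1500:], y[:1500], y[1500:]

small = LogisticRegression(max_iter=1000).fit(X_train, y_train)              # cheap model
large = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

THRESHOLD = 0.9                                  # assumed confidence cutoff
small_conf = small.predict_proba(X_test).max(axis=1)
easy = small_conf >= THRESHOLD                   # "easy" instances stay with the small model

preds = np.where(easy, small.predict(X_test), large.predict(X_test))
print(f"{easy.mean():.0%} handled by the small model, "
      f"accuracy {(preds == y_test).mean():.3f}")
```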
Author:
Hassid, Michael, Remez, Tal, Nguyen, Tu Anh, Gat, Itai, Conneau, Alexis, Kreuk, Felix, Copet, Jade, Defossez, Alexandre, Synnaeve, Gabriel, Dupoux, Emmanuel, Schwartz, Roy, Adi, Yossi
Speech language models (SpeechLMs) process and generate acoustic data only, without textual supervision. In this work, we propose TWIST, a method for training SpeechLMs using a warm-start from pretrained textual language models. We show using both …
External link:
http://arxiv.org/abs/2305.13009
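A minimal sketch of the warm-start idea behind the entry above, assuming the HuggingFace transformers library: start from a pretrained text LM and swap its vocabulary for discrete speech units before continuing training on unit sequences. The model name and unit-vocabulary size are assumptions; the actual TWIST recipe (models, tokenization, training) differs in its details.

```python
import torch
from transformers import AutoModelForCausalLM

NUM_SPEECH_UNITS = 500                       # assumed size of the discrete-unit vocabulary

model = AutoModelForCausalLM.from_pretrained("gpt2")     # pretrained textual LM (stand-in)
model.resize_token_embeddings(NUM_SPEECH_UNITS)          # replace text vocab with unit vocab

# From here on, training would proceed on sequences of speech-unit ids instead of text.
unit_ids = torch.randint(0, NUM_SPEECH_UNITS, (1, 32))   # toy batch of unit ids
out = model(input_ids=unit_ids, labels=unit_ids)
print("language-modeling loss on toy unit sequence:", float(out.loss))
```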
Author:
Hassid, Michael, Peng, Hao, Rotem, Daniel, Kasai, Jungo, Montero, Ivan, Smith, Noah A., Schwartz, Roy
The attention mechanism is considered the backbone of the widely-used Transformer architecture. It contextualizes the input by computing input-specific attention matrices. We find that this mechanism, while powerful and elegant, is not as important as …
External link:
http://arxiv.org/abs/2211.03495
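The entry above centers on input-specific attention matrices. A NumPy sketch contrasting that mechanism with a constant, input-independent mixing matrix, which is roughly the kind of ablation such a probe relies on; the constant variant here is only an illustrative stand-in, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 6, 8                                       # sequence length, head dim (toy)
X = rng.standard_normal((n, d))                   # token representations
W_q, W_k, W_v = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Input-specific attention matrix: depends on X through Q and K.
Q, K, V = X @ W_q, X @ W_k, X @ W_v
A_input_specific = softmax(Q @ K.T / np.sqrt(d))
out_attn = A_input_specific @ V

# Constant attention matrix: fixed mixing weights, identical for every input.
A_constant = softmax(rng.standard_normal((n, n)))
out_const = A_constant @ V

print("input-specific output shape:", out_attn.shape)
print("constant-matrix output shape:", out_const.shape)
```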
Author:
Treviso, Marcos, Lee, Ji-Ung, Ji, Tianchu, van Aken, Betty, Cao, Qingqing, Ciosici, Manuel R., Hassid, Michael, Heafield, Kenneth, Hooker, Sara, Raffel, Colin, Martins, Pedro H., Martins, André F. T., Forde, Jessica Zosa, Milder, Peter, Simpson, Edwin, Slonim, Noam, Dodge, Jesse, Strubell, Emma, Balasubramanian, Niranjan, Derczynski, Leon, Gurevych, Iryna, Schwartz, Roy
Recent work in natural language processing (NLP) has yielded appealing results from scaling model parameters and training data; however, using only scale to improve performance means that resource consumption also grows. Such resources include data, …
External link:
http://arxiv.org/abs/2209.00099
Author:
Hassid, Michael, Ramanovich, Michelle Tadmor, Shillingford, Brendan, Wang, Miaosen, Jia, Ye, Remez, Tal
In this paper we present VDTTS, a Visually-Driven Text-to-Speech model. Motivated by dubbing, VDTTS takes advantage of video frames as an additional input alongside text, and generates speech that matches the video signal. We demonstrate how this allows …
External link:
http://arxiv.org/abs/2111.10139
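The entry above conditions speech generation on both text and video frames. A toy PyTorch sketch of that input side: text tokens and video frames are encoded separately and concatenated along the time axis so a downstream speech decoder could condition on both. Dimensions, encoders, and the fusion strategy are assumptions for illustration, not the VDTTS architecture.

```python
import torch
import torch.nn as nn

class TextVideoEncoder(nn.Module):
    def __init__(self, vocab=100, d=64):
        super().__init__()
        self.text_emb = nn.Embedding(vocab, d)                 # text branch
        self.video_proj = nn.Linear(3 * 32 * 32, d)            # flattened toy frames
        self.mix = nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True)

    def forward(self, text_ids, frames):
        t = self.text_emb(text_ids)                            # (B, T_text, d)
        v = self.video_proj(frames.flatten(2))                 # (B, T_video, d)
        return self.mix(torch.cat([t, v], dim=1))              # joint conditioning

enc = TextVideoEncoder()
text_ids = torch.randint(0, 100, (2, 12))                      # toy text batch
frames = torch.randn(2, 20, 3, 32, 32)                         # 20 toy frames per clip
print("conditioning sequence shape:", enc(text_ids, frames).shape)
```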
Academic article
This result cannot be displayed for users who are not logged in.
You must log in to view this result.