Search results - "Tomasello, Paden"

Report

Authors: Ma, Xutai, Sun, Anna, Ouyang, Siqi, Inaguma, Hirofumi, Tomasello, Paden

We introduce the Efficient Monotonic Multihead Attention (EMMA), a state-of-the-art simultaneous translation model with numerically-stable and unbiased monotonic alignment estimation. In addition, we present improved training and inference strategies

External link: http://arxiv.org/abs/2312.04515

Report

SeamlessM4T: Massively Multilingual & Multimodal Machine Translation

Authors: Communication, Seamless, Barrault, Loïc, Chung, Yu-An, Meglioli, Mariano Cora, Dale, David, Dong, Ning, Duquenne, Paul-Ambroise, Elsahar, Hady, Gong, Hongyu, Heffernan, Kevin, Hoffman, John, Klaiber, Christopher, Li, Pengwei, Licht, Daniel, Maillard, Jean, Rakotoarison, Alice, Sadagopan, Kaushik Ram, Wenzek, Guillaume, Ye, Ethan, Akula, Bapi, Chen, Peng-Jen, Hachem, Naji El, Ellis, Brian, Gonzalez, Gabriel Mejia, Haaheim, Justin, Hansanti, Prangthip, Howes, Russ, Huang, Bernie, Hwang, Min-Jae, Inaguma, Hirofumi, Jain, Somya, Kalbassi, Elahe, Kallet, Amanda, Kulikov, Ilia, Lam, Janice, Li, Daniel, Ma, Xutai, Mavlyutov, Ruslan, Peloquin, Benjamin, Ramadan, Mohamed, Ramakrishnan, Abinesh, Sun, Anna, Tran, Kevin, Tran, Tuan, Tufanov, Igor, Vogeti, Vish, Wood, Carleigh, Yang, Yilin, Yu, Bokai, Andrews, Pierre, Balioglu, Can, Costa-jussà, Marta R., Celebi, Onur, Elbayad, Maha, Gao, Cynthia, Guzmán, Francisco, Kao, Justine, Lee, Ann, Mourachko, Alexandre, Pino, Juan, Popuri, Sravya, Ropers, Christophe, Saleem, Safiyyah, Schwenk, Holger, Tomasello, Paden, Wang, Changhan, Wang, Jeff, Wang, Skyler

What does it take to create the Babel Fish, a tool that can help individuals translate speech between any two languages? While recent breakthroughs in text-based models have pushed machine translation coverage beyond 200 languages, unified speech-to-

External link: http://arxiv.org/abs/2308.11596

Report

Scaling Speech Technology to 1,000+ Languages

Authors: Pratap, Vineel, Tjandra, Andros, Shi, Bowen, Tomasello, Paden, Babu, Arun, Kundu, Sayani, Elkahky, Ali, Ni, Zhaoheng, Vyas, Apoorv, Fazel-Zarandi, Maryam, Baevski, Alexei, Adi, Yossi, Zhang, Xiaohui, Hsu, Wei-Ning, Conneau, Alexis, Auli, Michael

Expanding the language coverage of speech technology has the potential to improve access to information for many more people. However, current speech technology is restricted to about one hundred languages which is a small fraction of the over 7,000

External link: http://arxiv.org/abs/2305.13516

Report

Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks

Authors: Tang, Yun, Sun, Anna Y., Inaguma, Hirofumi, Chen, Xinyue, Dong, Ning, Ma, Xutai, Tomasello, Paden D., Pino, Juan

Transducer and Attention based Encoder-Decoder (AED) are two widely used frameworks for speech-to-text tasks. They are designed for different purposes and each has its own benefits and drawbacks for speech-to-text tasks. In order to leverage strength

External link: http://arxiv.org/abs/2305.03101

Report

Efficient Speech Representation Learning with Low-Bit Quantization

Authors: Yeh, Ching-Feng, Hsu, Wei-Ning, Tomasello, Paden, Mohamed, Abdelrahman

With the development of hardware for machine learning, newer models often come at the cost of both increased sizes and computational complexity. In effort to improve the efficiency for these models, we apply and investigate recent quantization techni

External link: http://arxiv.org/abs/2301.00652

Report

Continual Learning for On-Device Speech Recognition using Disentangled Conformers

Authors: Diwan, Anuj, Yeh, Ching-Feng, Hsu, Wei-Ning, Tomasello, Paden, Choi, Eunsol, Harwath, David, Mohamed, Abdelrahman

Automatic speech recognition research focuses on training and evaluating on static datasets. Yet, as speech models are increasingly deployed on personal devices, such models encounter user-specific distributional shifts. To simulate this real-world s

External link: http://arxiv.org/abs/2212.01393

Report

Speech-to-Speech Translation For A Real-world Unwritten Language

Authors: Chen, Peng-Jen, Tran, Kevin, Yang, Yilin, Du, Jingfei, Kao, Justine, Chung, Yu-An, Tomasello, Paden, Duquenne, Paul-Ambroise, Schwenk, Holger, Gong, Hongyu, Inaguma, Hirofumi, Popuri, Sravya, Wang, Changhan, Pino, Juan, Hsu, Wei-Ning, Lee, Ann

We study speech-to-speech translation (S2ST) that translates speech from one language into another language and focuses on building systems to support languages without standard text writing systems. We use English-Taiwanese Hokkien as a case study,

External link: http://arxiv.org/abs/2211.06474

Report

STOP: A dataset for Spoken Task Oriented Semantic Parsing

Authors: Tomasello, Paden, Shrivastava, Akshat, Lazar, Daniel, Hsu, Po-Chun, Le, Duc, Sagar, Adithya, Elkahky, Ali, Copet, Jade, Hsu, Wei-Ning, Adi, Yossi, Algayres, Robin, Nguyen, Tu Ahn, Dupoux, Emmanuel, Zettlemoyer, Luke, Mohamed, Abdelrahman

End-to-end spoken language understanding (SLU) predicts intent directly from audio using a single model. It promises to improve the performance of assistant systems by leveraging acoustic information lost in the intermediate textual representation an

External link: http://arxiv.org/abs/2207.10643

Report

Deliberation Model for On-Device Spoken Language Understanding

Authors: Le, Duc, Shrivastava, Akshat, Tomasello, Paden, Kim, Suyoun, Livshits, Aleksandr, Kalinli, Ozlem, Seltzer, Michael L.

We propose a novel deliberation-based approach to end-to-end (E2E) spoken language understanding (SLU), where a streaming automatic speech recognition (ASR) model produces the first-pass hypothesis and a second-pass natural language understanding (NL

External link: http://arxiv.org/abs/2204.01893
