Showing 1 - 10 of 25
for the search: '"Christian Fuegen"'
Author:
Christian Fuegen, Abhinav Arora, Michael L. Seltzer, Ching-Feng Yeh, Suyoun Kim, Ozlem Kalinli, Duc Le
Published in:
Interspeech 2021.
Word Error Rate (WER) has been the predominant metric used to evaluate the performance of automatic speech recognition (ASR) systems. However, WER is sometimes not a good indicator for downstream Natural Language Understanding (NLU) tasks, such as in…
Author:
Jay Mahadeokar, Alex Xiao, Christian Fuegen, Duc Le, Michael L. Seltzer, Yuan Shangguan, Chunyang Wu, Hang Su, Ozlem Kalinli, Yangyang Shi
Published in:
Interspeech 2021.
Author:
Christian Fuegen, Ozlem Kalinli, Chunyang Wu, Zhiping Xiu, Thilo Koehler, Qing He, Yangyang Shi
Published in:
Interspeech 2021.
Published in:
ICASSP
Packet loss may affect a wide range of applications that use voice over IP (VoIP), e.g. video conferencing. In this paper, we investigate a time-domain convolutional recurrent network (CRN) for online packet loss concealment. The CRN comprises a conv…
Author:
Ganesh Venkatesh, Alagappan Valliappan, Christian Fuegen, Jay Mahadeokar, Michael L. Seltzer, Vikas Chandra, Yuan Shangguan
Published in:
ICASSP
Recurrent transducer models have emerged as a promising solution for speech recognition on the current and next generation smart devices. The transducer models provide competitive accuracy within a reasonable memory footprint alleviating the memory c…
Author:
Ching-Feng Yeh, Ozlem Kalinli, Yangyang Shi, Chunyang Wu, Rohit Prabhavalkar, Alex Xiao, Christian Fuegen, Duc Le, Michael L. Seltzer, Varun K. Nagaraja, Julian Chan, Jay Mahadeokar
We propose a dynamic encoder transducer (DET) for on-device speech recognition. One DET model scales to multiple devices with different computation capacities without retraining or fine-tuning. To trade off accuracy and latency, DET assigns differen…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::c19b870d3c9858d0288619e077dd8bb4
http://arxiv.org/abs/2104.02176
Author:
Chunyang Wu, Jiatong Zhou, Christian Fuegen, Ozlem Kalinli, Hang Su, Duc Le, Yuan Shangguan, Jay Mahadeokar, Rohit Prabhavalkar, Michael L. Seltzer, Yangyang Shi
As speech-enabled devices such as smartphones and smart speakers become increasingly ubiquitous, there is growing interest in building automatic speech recognition (ASR) systems that can run directly on-device; end-to-end (E2E) speech recognition mod…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::0a19a4e21dc2dcdbbb3904ee54bc5d6d
Author:
Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Abrham Gebreselasie, Cristina Gonzalez, James Hillis, Xuhua Huang, Yifei Huang, Wenqi Jia, Weslie Khoo, Jachym Kolar, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Ziwei Zhao, Yunyi Zhu, Pablo Arbelaez, David Crandall, Dima Damen, Giovanni Maria Farinella, Christian Fuegen, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik
We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite. It offers 3,670 hours of daily-life activity video spanning hundreds of scenarios (household, outdoor, workplace, leisure, etc.) captured by 931 unique camera wearers f…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::1501461e84d5acdefffd137ee9ba374d
Transfer learning is critical for efficient information transfer across multiple related learning problems. A simple, yet effective transfer learning approach utilizes deep neural networks trained on a large-scale task for feature extraction. Such re…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::c3eafbe18dc0e5d1695cbee1f5d820c6