Search results

Report

Focused Discriminative Training For Streaming CTC-Trained Automatic Speech Recognition Models

Authors: Haider, Adnan; Na, Xingyu; McDermott, Erik; Ng, Tim; Huang, Zhen; Zhuang, Xiaodan

This paper introduces a novel training framework called Focused Discriminative Training (FDT) to further improve streaming word-piece end-to-end (E2E) automatic speech recognition (ASR) models trained using either CTC or an interpolation of CTC and a

External link: http://arxiv.org/abs/2408.13008

Report

Conformer-Based Speech Recognition On Extreme Edge-Computing Devices

Authors: Xu, Mingbin; Jin, Alex; Wang, Sicheng; Su, Mu; Ng, Tim; Mason, Henry; Han, Shiyi; Lei, Zhihong; Deng, Yaqiao; Huang, Zhen; Krishnamoorthy, Mahesh

With increasingly more powerful compute capabilities and resources in today's devices, traditionally compute-intensive automatic speech recognition (ASR) has been moving from the cloud to devices to better protect user privacy. However, it is still c

External link: http://arxiv.org/abs/2312.10359

Report

Towards Real-World Streaming Speech Translation for Code-Switched Speech

Authors: Alastruey, Belen; Sperber, Matthias; Gollan, Christian; Telaar, Dominic; Ng, Tim; Agarwal, Aashish

Code-switching (CS), i.e. mixing different languages in a single sentence, is a common phenomenon in communication and can be challenging in many Natural Language Processing (NLP) settings. Previous studies on CS speech have shown promising results f

External link: http://arxiv.org/abs/2310.12648

Report

Personalization of CTC-based End-to-End Speech Recognition Using Pronunciation-Driven Subword Tokenization

Authors: Lei, Zhihong; Pusateri, Ernest; Han, Shiyi; Liu, Leo; Xu, Mingbin; Ng, Tim; Travadi, Ruchir; Zhang, Youyuan; Hannemann, Mirko; Siu, Man-Hung; Huang, Zhen

Recent advances in deep learning and automatic speech recognition have improved the accuracy of end-to-end speech recognition systems, but recognition of personal content such as contact names remains a challenge. In this work, we describe our person

External link: http://arxiv.org/abs/2310.09988

Report

Acoustic Model Fusion for End-to-end Speech Recognition

Authors: Lei, Zhihong; Xu, Mingbin; Han, Shiyi; Liu, Leo; Huang, Zhen; Ng, Tim; Zhang, Yuanyuan; Pusateri, Ernest; Hannemann, Mirko; Deng, Yaqiao; Siu, Man-Hung

Recent advances in deep learning and automatic speech recognition (ASR) have enabled the end-to-end (E2E) ASR system and boosted the accuracy to a new level. The E2E systems implicitly model all conventional ASR components, such as the acoustic model

External link: http://arxiv.org/abs/2310.07062

Report

A Treatise On FST Lattice Based MMI Training

Authors: Haider, Adnan; Ng, Tim; Huang, Zhen; Na, Xingyu; Rosti, Antti Veikko

Maximum mutual information (MMI) has become one of the two de facto methods for sequence-level training of speech recognition acoustic models. This paper aims to isolate, identify and bring forward the implicit modelling decisions induced by the desi

External link: http://arxiv.org/abs/2210.08918

Report

Online Automatic Speech Recognition with Listen, Attend and Spell Model

Authors: Hsiao, Roger; Can, Dogan; Ng, Tim; Travadi, Ruchir; Ghoshal, Arnab

The Listen, Attend and Spell (LAS) model and other attention-based automatic speech recognition (ASR) models have known limitations when operated in a fully online mode. In this paper, we analyze the online operation of LAS models to demonstrate that

External link: http://arxiv.org/abs/2008.05514

Report

SNDCNN: Self-normalizing deep CNNs with scaled exponential linear units for speech recognition

Authors: Huang, Zhen; Ng, Tim; Liu, Leo; Mason, Henry; Zhuang, Xiaodan; Liu, Daben

Very deep CNNs achieve state-of-the-art results in both computer vision and speech recognition, but are difficult to train. The most popular way to train very deep CNNs is to use shortcut connections (SC) together with batch normalization (BN). Inspi

External link: http://arxiv.org/abs/1910.01992

Report

New results on pseudosquare avoidance

Authors: Ng, Tim; Ochem, Pascal; Rampersad, Narad; Shallit, Jeffrey

We start by considering binary words containing the minimum possible numbers of squares and antisquares (where an antisquare is a word of the form $x \overline{x}$), and we completely classify which possibilities can occur. We consider avoiding $x p(

External link: http://arxiv.org/abs/1904.09157

E-book

Creating a Government Commitment to Well-Being

Author: Ng, Tim

Published in: Well-Being: Expanding the Definition of Progress: Insights From Practitioners, Researchers, and Innovators From Around the Globe, 2020, ill.

External link: https://doi.org/10.1093/oso/9780190080495.003.0009
