Výsledky vyhledávání - "Mouchtaris, Athanasios"

Report

AdaZeta: Adaptive Zeroth-Order Tensor-Train Adaption for Memory-Efficient Large Language Models Fine-Tuning

Autor: Yang, Yifan, Zhen, Kai, Banijamal, Ershad, Mouchtaris, Athanasios, Zhang, Zheng

Fine-tuning large language models (LLMs) has achieved remarkable performance across various natural language processing tasks, yet it demands more and more memory as model sizes keep growing. To address this issue, the recently proposed Memory-effici

Externí odkaz: http://arxiv.org/abs/2406.18060

Zobrazit plný text záznamu

Report

Accelerator-Aware Training for Transducer-Based Speech Recognition

Autor: Shakiah, Suhaila M., Swaminathan, Rupak Vignesh, Nguyen, Hieu Duy, Chinta, Raviteja, Afzal, Tariq, Susanj, Nathan, Mouchtaris, Athanasios, Strimel, Grant P., Rastrow, Ariya

Publikováno v: IEEE Spoken Language Technology Workshop (SLT), Doha, Qatar, 2023, pp. 100-107

Machine learning model weights and activations are represented in full-precision during training. This leads to performance degradation in runtime when deployed on neural network accelerator (NNA) chips, which leverage highly parallelized fixed-point

Externí odkaz: http://arxiv.org/abs/2305.07778

Zobrazit plný text záznamu

Report

Robust Acoustic and Semantic Contextual Biasing in Neural Transducers for Speech Recognition

Autor: Fu, Xuandi, Sathyendra, Kanthashree Mysore, Gandhe, Ankur, Liu, Jing, Strimel, Grant P., McGowan, Ross, Mouchtaris, Athanasios

Attention-based contextual biasing approaches have shown significant improvements in the recognition of generic and/or personal rare-words in End-to-End Automatic Speech Recognition (E2E ASR) systems like neural transducers. These approaches employ c

Externí odkaz: http://arxiv.org/abs/2305.05271

Zobrazit plný text záznamu

Report

Lookahead When It Matters: Adaptive Non-causal Transformers for Streaming Neural Transducers

Autor: Strimel, Grant P., Xie, Yi, King, Brian, Radfar, Martin, Rastrow, Ariya, Mouchtaris, Athanasios

Streaming speech recognition architectures are employed for low-latency, real-time applications. Such architectures are often characterized by their causality. Causal architectures emit tokens at each frame, relying only on current and past signal, w

Externí odkaz: http://arxiv.org/abs/2305.04159

Zobrazit plný text záznamu

Report

Dual-Attention Neural Transducers for Efficient Wake Word Spotting in Speech Recognition

Autor: Sahai, Saumya Y., Liu, Jing, Muniyappa, Thejaswi, Sathyendra, Kanthashree M., Alexandridis, Anastasios, Strimel, Grant P., McGowan, Ross, Rastrow, Ariya, Chang, Feng-Ju, Mouchtaris, Athanasios, Kunzmann, Siegfried

We present dual-attention neural biasing, an architecture designed to boost Wake Words (WW) recognition and improve inference time latency on speech recognition tasks. This architecture enables a dynamic switch for its runtime compute paths by exploi

Externí odkaz: http://arxiv.org/abs/2304.01905

Zobrazit plný text záznamu

Report

Leveraging Redundancy in Multiple Audio Signals for Far-Field Speech Recognition

Autor: Chang, Feng-Ju, Alexandridis, Anastasios, Swaminathan, Rupak Vignesh, Radfar, Martin, Mallidi, Harish, Omologo, Maurizio, Mouchtaris, Athanasios, King, Brian, Maas, Roland

To achieve robust far-field automatic speech recognition (ASR), existing techniques typically employ an acoustic front end (AFE) cascaded with a neural transducer (NT) ASR model. The AFE output, however, could be unreliable, as the beamforming output

Externí odkaz: http://arxiv.org/abs/2303.00692

Zobrazit plný text záznamu

Report

Sub-8-bit quantization for on-device speech recognition: a regularization-free approach

Autor: Zhen, Kai, Radfar, Martin, Nguyen, Hieu Duy, Strimel, Grant P., Susanj, Nathan, Mouchtaris, Athanasios

For on-device automatic speech recognition (ASR), quantization aware training (QAT) is ubiquitous to achieve the trade-off between model predictive performance and efficiency. Among existing QAT methods, one major drawback is that the quantization ce

Externí odkaz: http://arxiv.org/abs/2210.09188

Zobrazit plný text záznamu

Report

ConvRNN-T: Convolutional Augmented Recurrent Neural Network Transducers for Streaming Speech Recognition

Autor: Radfar, Martin, Barnwal, Rohit, Swaminathan, Rupak Vignesh, Chang, Feng-Ju, Strimel, Grant P., Susanj, Nathan, Mouchtaris, Athanasios

The recurrent neural network transducer (RNN-T) is a prominent streaming end-to-end (E2E) ASR technology. In RNN-T, the acoustic encoder commonly consists of stacks of LSTMs. Very recently, as an alternative to LSTM layers, the Conformer architecture

Externí odkaz: http://arxiv.org/abs/2209.14868

Zobrazit plný text záznamu

Report

Compute Cost Amortized Transformer for Streaming ASR

Autor: Xie, Yi, Macoskey, Jonathan, Radfar, Martin, Chang, Feng-Ju, King, Brian, Rastrow, Ariya, Mouchtaris, Athanasios, Strimel, Grant P.

We present a streaming, Transformer-based end-to-end automatic speech recognition (ASR) architecture which achieves efficient neural inference through compute cost amortization. Our architecture creates sparse computation pathways dynamically at infe

Externí odkaz: http://arxiv.org/abs/2207.02393

Zobrazit plný text záznamu

Report

Sub-8-Bit Quantization Aware Training for 8-Bit Neural Network Accelerator with On-Device Speech Recognition

Autor: Zhen, Kai, Nguyen, Hieu Duy, Chinta, Raviteja, Susanj, Nathan, Mouchtaris, Athanasios, Afzal, Tariq, Rastrow, Ariya

We present a novel sub-8-bit quantization-aware training (S8BQAT) scheme for 8-bit neural network accelerators. Our method is inspired from Lloyd-Max compression theory with practical adaptations for a feasible computational overhead during training.

Externí odkaz: http://arxiv.org/abs/2206.15408

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání