Search results

Report

Two-pass Endpoint Detection for Speech Recognition

Authors: Raju, Anirudh, Khare, Aparna, He, Di, Sklyar, Ilya, Chen, Long, Alptekin, Sam, Trinh, Viet Anh, Zhang, Zhe, Vaz, Colin, Ravichandran, Venkatesh, Maas, Roland, Rastrow, Ariya

Endpoint (EP) detection is a key component of far-field speech recognition systems that assist the user through voice commands. The endpoint detector has to trade-off between accuracy and latency, since waiting longer reduces the cases of users being

External link: http://arxiv.org/abs/2401.08916

Report

Adaptive Endpointing with Deep Contextual Multi-armed Bandits

Authors: Min, Do June, Stolcke, Andreas, Raju, Anirudh, Vaz, Colin, He, Di, Ravichandran, Venkatesh, Trinh, Viet Anh

Published in: Proc. IEEE ICASSP, June 2023

Current endpointing (EP) solutions learn in a supervised framework, which does not allow the model to incorporate feedback and improve in an online setting. Also, it is a common practice to utilize costly grid-search to find the best configuration fo

External link: http://arxiv.org/abs/2303.13407

Report

A multispeaker dataset of raw and reconstructed speech production real-time MRI video and 3D volumetric images

Authors: Lim, Yongwan, Toutios, Asterios, Bliesener, Yannick, Tian, Ye, Lingala, Sajan Goud, Vaz, Colin, Sorensen, Tanner, Oh, Miran, Harper, Sarah, Chen, Weiyi, Lee, Yoonjeong, Töger, Johannes, Montesserin, Mairym Lloréns, Smith, Caitlin, Godinez, Bianca, Goldstein, Louis, Byrd, Dani, Nayak, Krishna S., Narayanan, Shrikanth S.

Real-time magnetic resonance imaging (RT-MRI) of human speech production is enabling significant advances in speech science, linguistics, bio-inspired speech technology development, and clinical applications. Easy access to RT-MRI is however limited,

External link: http://arxiv.org/abs/2102.07896

Report

Towards Adapting NMF Dictionaries Using Total Variability Modeling for Noise-Robust Acoustic Features

Authors: Dhawan, Kunal, Vaz, Colin, Travadi, Ruchir, Narayanan, Shrikanth

We propose an algorithm to extract noise-robust acoustic features from noisy speech. We use Total Variability Modeling in combination with Non-negative Matrix Factorization (NMF) to learn a total variability subspace and adapt NMF dictionaries for ea

External link: http://arxiv.org/abs/1907.06859

Academic article

Extending the Beta divergence to complex values

Authors: Vaz, Colin, Narayanan, Shrikanth

Published in: Pattern Recognition Letters, April 2021, 144:105-111

Conference

CNMF-based acoustic features for noise-robust ASR.

Authors: Vaz, Colin, Dimitriadis, Dimitrios, Thomas, Samuel, Narayanan, Shrikanth

Published in: 2016 IEEE International Conference on Acoustics, Speech & Signal Processing (ICASSP); 2016, p5735-5739, 5p

Conference

Energy-constrained minimum variance response filter for robust vowel spectral estimation.

Authors: Vaz, Colin, Tsiartas, Andreas, Narayanan, Shrikanth

Published in: 2014 IEEE International Conference on Acoustics, Speech & Signal Processing (ICASSP); 2014, p6275-6279, 5p

Conference

Barista: A framework for concurrent speech processing by USC-SAIL.

Authors: Can, Dogan, Gibson, James, Vaz, Colin, Georgiou, Panayiotis G., Narayanan, Shrikanth S.

Published in: 2014 IEEE International Conference on Acoustics, Speech & Signal Processing (ICASSP); 2014, p3306-3310, 5p

Academic article

Barista: A Framework for Concurrent Speech Processing by USC-SAIL.

Authors: Can D; Signal Analysis and Interpretation Lab, University of Southern California, CA 90089., Gibson J; Signal Analysis and Interpretation Lab, University of Southern California, CA 90089., Vaz C; Signal Analysis and Interpretation Lab, University of Southern California, CA 90089., Georgiou PG; Signal Analysis and Interpretation Lab, University of Southern California, CA 90089., Narayanan SS; Signal Analysis and Interpretation Lab, University of Southern California, CA 90089.

Published in: Proceedings of the ... IEEE International Conference on Acoustics, Speech, and Signal Processing. ICASSP (Conference) [Proc IEEE Int Conf Acoust Speech Signal Process] 2014 May; Vol. 2014, pp. 3306-3310.