Showing 1 - 10 of 57 for search: '"Spyros Matsoukas"'
Author:
Park, Dookun; Yuan, Hao; Kim, Dongmin; Zhang, Yinglei; Matsoukas, Spyros; Kim, Young-Bum; Sarikaya, Ruhi; Guo, Edward; Ling, Yuan; Quinn, Kevin; Hung, Pham; Yao, Benjamin; Lee, Sungjin
Measuring user satisfaction level is a challenging task, and a critical component in developing large-scale conversational agent systems serving the needs of real users. A widely used approach to tackle this is to collect human annotation data and u
External link:
http://arxiv.org/abs/2006.07113
Author:
Meng Feng, Chieh-Chi Kao, Qingming Tang, Ming Sun, Viktor Rozgic, Spyros Matsoukas, Chao Wang
Standard acoustic event classification (AEC) solutions require large-scale collection of data from client devices for model optimization. Federated learning (FL) is a compelling framework that decouples data collection and model training to enhance c
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::86c462adb2e2b65ff6e9aa598f454ea2
Author:
Joshua Levy, Constantinos Papayiannis, Viktor Rozgic, Bo Yang, Chao Wang, Mao Li, Daniel Bone, Andreas Stolcke, Spyros Matsoukas
Published in:
ICASSP
Speech emotion recognition (SER) is a key technology to enable more natural human-machine communication. However, SER has long suffered from a lack of public large-scale labeled datasets. To circumvent this problem, we investigate how unsupervised re
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::81d373800d813e63ab19bd78913d2f43
http://arxiv.org/abs/2102.06357
Author:
Josep Valls-Vargas, Lazaros Polymenakos, Spyros Matsoukas, Aditya Tiwari, Praveen Kumar Bodigutla
Published in:
EMNLP (Findings)
Dialogue-level quality estimation is vital for optimizing data-driven dialogue management. Current automated methods to estimate turn- and dialogue-level user satisfaction employ hand-crafted features and rely on complex annotation schemes, which redu
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::7ad4530f39c0b7cc9d803d400c59905c
http://arxiv.org/abs/2010.02495
Published in:
ICASSP
Wake word (WW) spotting is challenging in far-field settings not only because of interference in signal transmission but also because of the complexity of the acoustic environment. Traditional WW model training requires a large amount of in-domain WW-specific data with su
Published in:
ICASSP
We study few-shot acoustic event detection (AED) in this paper. Few-shot learning enables detection of new events with very limited labeled data. Compared to other research areas like computer vision, few-shot learning for audio recognition has been
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::b8708deca6c56bd2c766120fc2d76246
Author:
Chris Beauchene, Yuriy Mishchenko, Oleg Rybakov, Shiv Naga Prasad Vitaladevuni, Spyros Matsoukas, Ming Sun, Yusuf Goren
Published in:
ICMLA
In this paper, we investigate novel quantization approaches to reduce the memory and computational footprint of deep neural network (DNN) based keyword spotters (KWS). We propose a new method for KWS offline and online quantization, which we call dynamic
Published in:
INTERSPEECH
Acoustic Event Detection (AED), aiming at detecting categories of events based on audio signals, has found application in many intelligent systems. Recently, deep neural networks have significantly advanced this field and reduced detection errors to a large
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::5793aa2dd70ac42424c77f88a5667282
http://arxiv.org/abs/1907.00873
Published in:
ICASSP
This paper presents our work on training acoustic event detection (AED) models using an unlabeled dataset. Recent acoustic event detectors are based on large-scale neural networks, which are typically trained with huge amounts of labeled data. Labels fo
Author:
Arindam Mandal, Sanchit Agarwal, Abhishek Sethi, Spyros Matsoukas, Tagyoung Chung, Rahul Goel
Published in:
SLT
Typical spoken language understanding systems provide narrow semantic parses using a domain-specific ontology. The parses contain intents and slots that are directly consumed by downstream domain applications. In this work we discuss expanding such s
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::562212470e8cfe5d0b62a13872066e63
http://arxiv.org/abs/1810.11497