Showing 1 - 10 of 27 for search: '"Viktor Rozgic"'
Published in:
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Published in:
ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Author:
Arman Zharmagambetov, Qingming Tang, Chieh-Chi Kao, Qin Zhang, Ming Sun, Viktor Rozgic, Jasha Droppo, Chao Wang
Published in:
ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Author:
Meng Feng, Chieh-Chi Kao, Qingming Tang, Ming Sun, Viktor Rozgic, Spyros Matsoukas, Chao Wang
Standard acoustic event classification (AEC) solutions require large-scale collection of data from client devices for model optimization. Federated learning (FL) is a compelling framework that decouples data collection and model training to enhance …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::86c462adb2e2b65ff6e9aa598f454ea2
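The FL framework mentioned in the abstract above typically relies on federated averaging (FedAvg): clients train locally on their own data and the server only aggregates model weights. A minimal sketch of the server-side aggregation step, with hypothetical weight vectors and client sizes:

```python
# Minimal FedAvg sketch: the server computes a data-size-weighted
# average of client model weights, so raw audio never leaves devices.
# Weights are represented as flat lists of floats for illustration.

def fedavg(client_weights, client_sizes):
    """Weighted average of per-client weight vectors."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    avg = [0.0] * dim
    for w, n in zip(client_weights, client_sizes):
        for i in range(dim):
            avg[i] += (n / total) * w[i]
    return avg

# Two hypothetical clients; the larger client contributes more.
global_w = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_w)  # [2.5, 3.5]
```

The weighting by client data size is the standard FedAvg choice; uniform averaging is a common variant when client sizes are unknown.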
Author:
Joshua Levy, Constantinos Papayiannis, Viktor Rozgic, Bo Yang, Chao Wang, Mao Li, Daniel Bone, Andreas Stolcke, Spyros Matsoukas
Published in:
ICASSP
Speech emotion recognition (SER) is a key technology to enable more natural human-machine communication. However, SER has long suffered from a lack of public large-scale labeled datasets. To circumvent this problem, we investigate how unsupervised …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::81d373800d813e63ab19bd78913d2f43
http://arxiv.org/abs/2102.06357
Published in:
INTERSPEECH
Acoustic Event Detection (AED), aiming at detecting categories of events based on audio signals, has found application in many intelligent systems. Recently, deep neural networks have significantly advanced this field and reduced detection errors to a large …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::5793aa2dd70ac42424c77f88a5667282
http://arxiv.org/abs/1907.00873
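An AED model of the kind described above emits per-frame event scores; a common post-processing step turns those scores into detected event segments by thresholding. A small sketch with hypothetical scores and threshold:

```python
# Hypothetical AED post-processing sketch: convert per-frame event
# scores into (start, end) frame segments where the score stays at
# or above a threshold.

def detect_events(scores, threshold=0.5):
    """Return (start, end) frame-index pairs of above-threshold runs."""
    events, start = [], None
    for i, s in enumerate(scores):
        if s >= threshold and start is None:
            start = i  # run begins
        elif s < threshold and start is not None:
            events.append((start, i))  # run ends before frame i
            start = None
    if start is not None:  # run extends to the end of the clip
        events.append((start, len(scores)))
    return events

print(detect_events([0.1, 0.7, 0.9, 0.2, 0.6]))  # [(1, 3), (4, 5)]
```

Real systems usually add smoothing and minimum-duration filtering on top of this raw thresholding.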
Published in:
ICASSP
Conventional models for emotion recognition from speech signals are trained in a supervised fashion using speech utterances with emotion labels. In this study we hypothesize that the speech signal depends on multiple latent variables, including the emotional …
Published in:
ICASSP
This paper presents our work on training acoustic event detection (AED) models using an unlabeled dataset. Recent acoustic event detectors are based on large-scale neural networks, which are typically trained with huge amounts of labeled data. Labels …
Published in:
ICASSP
We study media presence detection, that is, learning to recognize if a sound segment (typically lasting for a few seconds) of a long recorded stream contains media (TV) sound. This problem is difficult because non-media sound sources can be quite …
Published in:
ACL (1)
Studies on emotion recognition (ER) show that combining lexical and acoustic information results in more robust and accurate models. The majority of the studies focus on settings where both modalities are available in training and evaluation. However, …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::b3b94221e9baed336e247681937f4645