Showing 1 - 10 of 26 for search: '"Soheil Khorram"'
Author:
Soheil Khorram, Anshuman Tripathi, Jaeyoung Kim, Han Lu, Qian Zhang, Rohit Prabhavalkar, Hasim Sak
Published in:
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Published in:
ASRU
Despite significant efforts over the last few years to build a robust automatic speech recognition (ASR) system for different acoustic settings, the performance of the current state-of-the-art technologies significantly degrades in noisy reverberant …
Published in:
ASRU
Training acoustic models with sequentially incoming data -- while both leveraging new data and avoiding the forgetting effect -- is an essential obstacle to achieving human intelligence level in speech recognition. An obvious approach to leverage data …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::c3103a3af9fc2e6e3020b02cb8fb0257
http://arxiv.org/abs/1910.00565
Published in:
ICASSP
Emotions modulate speech acoustics as well as language. The latter influences the sequences of phonemes that are produced, which in turn further modulate the acoustics. Therefore, phonemes impact emotion recognition in two ways: (1) they introduce an …
Published in:
ICASSP
DTW calculates the similarity or alignment between two signals, subject to temporal warping. However, its computational complexity grows exponentially with the number of time-series. Although there have been algorithms developed that are linear in the …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::b97b867efea87499684a3025f4ccbb4a
http://arxiv.org/abs/1903.09245
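For context, the abstract above concerns multi-sequence DTW; as a minimal sketch (the classic pairwise case only, not the paper's method), the O(nm) dynamic-programming recurrence for two 1-D sequences is:

```python
import numpy as np

def dtw(x, y):
    """Pairwise DTW distance between two 1-D sequences via
    the standard O(len(x) * len(y)) dynamic program."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            D[i, j] = cost + min(D[i - 1, j],      # insertion
                                 D[i, j - 1],      # deletion
                                 D[i - 1, j - 1])  # match
    return D[n, m]
```

Aligning k sequences jointly generalizes this table to k dimensions, which is the exponential blow-up the abstract refers to.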
Published in:
IEEE Trans Affect Comput
Time-continuous dimensional descriptions of emotions (e.g., arousal, valence) allow researchers to characterize short-time changes and to capture long-term trends in emotion expression. However, continuous emotion labels are generally not synchronized …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::79db627137c584f22624360631032371
Published in:
INTERSPEECH
Single-microphone, speaker-independent speech separation is normally performed through two steps: (i) separating the specific speech sources, and (ii) determining the best output-label assignment to find the separation error. The second step is the …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::405491cb8d73a775770fd9680fbd3aab
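The output-label assignment step described above is the permutation problem in speaker-independent separation. A minimal sketch of a permutation-invariant error (an illustration under assumed `estimates`/`references` array inputs, not the paper's proposed method):

```python
import itertools
import numpy as np

def pit_mse(estimates, references):
    """Permutation-invariant MSE: score every output-to-speaker
    assignment and keep the lowest error. Enumeration is factorial
    in the number of sources, so this suits 2-3 speakers."""
    S = len(references)
    best = float("inf")
    for perm in itertools.permutations(range(S)):
        err = np.mean([np.mean((estimates[p] - references[s]) ** 2)
                       for s, p in enumerate(perm)])
        best = min(best, err)
    return best
```

Because the minimum is taken over assignments, the loss is unaffected by which output slot each speaker lands in.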
Published in:
INTERSPEECH
Bipolar Disorder is a chronic psychiatric illness characterized by pathological mood swings associated with severe disruptions in emotion regulation. Clinical monitoring of mood is key to the care of these dynamic and incapacitating mood states. …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::8228f3b12bccd26d593b7c570813d8c9
http://arxiv.org/abs/1806.10658
Published in:
ICMI
In this paper, we present an analysis of different multimodal fusion approaches in the context of deep learning, focusing on pooling intermediate representations learned for the acoustic and lexical modalities. Traditional approaches to multimodal …
Author:
Emily Mower Provost, Melvin G. McInnis, Soheil Khorram, Zakaria Aldeneh, Dimitrios Dimitriadis
Published in:
INTERSPEECH
The goal of continuous emotion recognition is to assign an emotion value to every frame in a sequence of acoustic features. We show that incorporating long-term temporal dependencies is critical for continuous emotion recognition tasks. To this end, …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::ef276b80e11933d3060cbf588d1b30a5
http://arxiv.org/abs/1708.07050
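One common way to widen a model's temporal context over a frame sequence is dilated causal convolution, where output frame t depends on inputs t, t-d, t-2d, and so on. The sketch below is an illustrative NumPy implementation of that idea, not necessarily the architecture this paper proposes:

```python
import numpy as np

def dilated_causal_conv(x, w, dilation):
    """1-D causal convolution with dilation: output t sees inputs
    t, t-d, t-2d, ... Stacking layers with growing dilation covers
    long temporal contexts with few parameters."""
    k, T = len(w), len(x)
    # left-pad with zeros so the output keeps length T and never
    # looks at future frames
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), np.asarray(x, dtype=float)])
    y = np.zeros(T)
    for t in range(T):
        for i in range(k):
            y[t] += w[i] * xp[pad + t - i * dilation]
    return y
```

With L layers and dilations 1, 2, 4, ..., the receptive field grows exponentially in L while each layer stays cheap, which is why this construction is popular for frame-level sequence labeling.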