Showing 1 - 10 of 31 results for search: '"Lian, Hailun"'
Author:
Li, Sunan, Lian, Hailun, Lu, Cheng, Zhao, Yan, Qi, Tianhua, Yang, Hao, Zong, Yuan, Zheng, Wenming
Emotion recognition has attracted increasing attention in recent decades. Although significant progress has been made in recognizing the seven basic emotions, existing methods still struggle with compound emotion recognition that…
External link:
http://arxiv.org/abs/2407.12973
In this paper, we propose Prosody-aware VITS (PAVITS) for emotional voice conversion (EVC), aiming to achieve two major objectives of EVC: high content naturalness and high emotional naturalness, which are crucial for meeting the demands of human per…
External link:
http://arxiv.org/abs/2403.01494
Swin-Transformer has demonstrated remarkable success in computer vision by leveraging its hierarchical feature representation based on Transformer. In speech signals, emotional information is distributed across different scales of speech features, e…
External link:
http://arxiv.org/abs/2401.10536
Cross-corpus speech emotion recognition (SER) poses a challenge due to feature distribution mismatch, potentially degrading the performance of established SER methods. In this paper, we tackle this challenge by proposing a novel transfer subspace lea…
External link:
http://arxiv.org/abs/2312.06466
In this paper, we propose a new unsupervised domain adaptation (DA) method called layer-adapted implicit distribution alignment networks (LIDAN) to address the challenge of cross-corpus speech emotion recognition (SER). LIDAN extends our previous ICA…
External link:
http://arxiv.org/abs/2310.03992
In this paper, we propose a novel time-frequency joint learning method for speech emotion recognition, called Time-Frequency Transformer. Its advantage is that the Time-Frequency Transformer can excavate global emotion patterns in the time-frequency…
External link:
http://arxiv.org/abs/2308.14568
Transformers have recently emerged in speech emotion recognition (SER). However, their equal patch division not only damages frequency information but also ignores local emotion correlations across frames, which are key cues for representing emotion. To ha…
External link:
http://arxiv.org/abs/2306.01491
In this paper, we propose a novel deep transfer learning method called deep implicit distribution alignment networks (DIDAN) to deal with the cross-corpus speech emotion recognition (SER) problem, in which the labeled training (source) and unlabeled test…
External link:
http://arxiv.org/abs/2302.08921
Spectrograms are commonly used as the input features of deep neural networks to learn the high(er)-level time-frequency patterns of speech signals for speech emotion recognition (SER). Generally, different emotions correspond to specific…
External link:
http://arxiv.org/abs/2210.12430
Published in:
Expert Systems With Applications, 15 December 2024, 258