Showing 1 - 10 of 31 results for search: '"Lian, Hailun"'
Author:
Li, Sunan, Lian, Hailun, Lu, Cheng, Zhao, Yan, Qi, Tianhua, Yang, Hao, Zong, Yuan, Zheng, Wenming
Emotion recognition has attracted increasing attention in recent decades. Although significant progress has been made in recognizing the seven basic emotions, existing methods still struggle with compound emotion recognition that…
External link:
http://arxiv.org/abs/2407.12973
In this paper, we propose Prosody-aware VITS (PAVITS) for emotional voice conversion (EVC), aiming to achieve two major objectives of EVC: high content naturalness and high emotional naturalness, which are crucial for meeting the demands of human per…
External link:
http://arxiv.org/abs/2403.01494
Swin-Transformer has demonstrated remarkable success in computer vision by leveraging its hierarchical feature representation based on Transformer. In speech signals, emotional information is distributed across different scales of speech features, e…
External link:
http://arxiv.org/abs/2401.10536
Cross-corpus speech emotion recognition (SER) poses a challenge due to feature distribution mismatch, potentially degrading the performance of established SER methods. In this paper, we tackle this challenge by proposing a novel transfer subspace lea…
External link:
http://arxiv.org/abs/2312.06466
In this paper, we propose a new unsupervised domain adaptation (DA) method called layer-adapted implicit distribution alignment networks (LIDAN) to address the challenge of cross-corpus speech emotion recognition (SER). LIDAN extends our previous ICA…
External link:
http://arxiv.org/abs/2310.03992
In this paper, we propose a novel time-frequency joint learning method for speech emotion recognition, called Time-Frequency Transformer. Its advantage is that the Time-Frequency Transformer can excavate global emotion patterns in the time-frequency…
External link:
http://arxiv.org/abs/2308.14568
Transformers have recently emerged in speech emotion recognition (SER). However, their equal patch division not only damages frequency information but also ignores local emotion correlations across frames, which are key cues for representing emotion. To ha…
External link:
http://arxiv.org/abs/2306.01491
In this paper, we propose a novel deep transfer learning method called deep implicit distribution alignment networks (DIDAN) to deal with the cross-corpus speech emotion recognition (SER) problem, in which the labeled training (source) and unlabeled test…
External link:
http://arxiv.org/abs/2302.08921
Spectrograms are commonly used as the input features of deep neural networks to learn the high(er)-level time-frequency patterns of speech signals for speech emotion recognition (SER). Generally, different emotions correspond to specific…
External link:
http://arxiv.org/abs/2210.12430
Published in:
Expert Systems With Applications, 15 December 2024, 258