Search results - "Burkhardt, Felix"

Report

Wav2Small: Distilling Wav2Vec2 to 72K parameters for Low-Resource Speech emotion recognition

Authors: Kounadis-Bastian, Dionyssos; Schrüfer, Oliver; Derington, Anna; Wierstorf, Hagen; Eyben, Florian; Burkhardt, Felix; Schuller, Björn

Speech Emotion Recognition (SER) needs high computational resources to overcome the challenge of substantial annotator disagreement. Today SER is shifting towards dimensional annotations of arousal, dominance, and valence (A/D/V). Universal metrics a

External link: http://arxiv.org/abs/2408.13920

Report

Uncertainty-Based Ensemble Learning For Speech Classification

Authors: Atmaja, Bagus Tris; Burkhardt, Felix

Speech classification has attracted increasing attention due to its wide applications, particularly in classifying physical and mental states. However, these tasks are challenging due to the high variability in speech signals. Ensemble learning has s

External link: http://arxiv.org/abs/2407.17009

Report

Are you sure? Analysing Uncertainty Quantification Approaches for Real-world Speech Emotion Recognition

Authors: Schrüfer, Oliver; Milling, Manuel; Burkhardt, Felix; Eyben, Florian; Schuller, Björn

Uncertainty Quantification (UQ) is an important building block for the reliable use of neural networks in real-world scenarios, as it can be a useful tool in identifying faulty predictions. Speech emotion recognition (SER) models can suffer from part

External link: http://arxiv.org/abs/2407.01143

Report

Testing Speech Emotion Recognition Machine Learning Models

Authors: Derington, Anna; Wierstorf, Hagen; Özkil, Ali; Eyben, Florian; Burkhardt, Felix; Schuller, Björn W.

Machine learning models for speech emotion recognition (SER) can be trained for different tasks and are usually evaluated on the basis of a few available datasets per task. Tasks could include arousal, valence, dominance, emotional categories, or ton

External link: http://arxiv.org/abs/2312.06270

Report

Going Retro: Astonishingly Simple Yet Effective Rule-based Prosody Modelling for Speech Synthesis Simulating Emotion Dimensions

Authors: Burkhardt, Felix; Reichel, Uwe; Eyben, Florian; Schuller, Björn

We introduce two rule-based models to modify the prosody of speech synthesis in order to modulate the emotion to be expressed. The prosody modulation is based on speech synthesis markup language (SSML) and can be used with any commercial speech synth

External link: http://arxiv.org/abs/2307.02132

Report

Speech-based Age and Gender Prediction with Transformers

Authors: Burkhardt, Felix; Wagner, Johannes; Wierstorf, Hagen; Eyben, Florian; Schuller, Björn

We report on the curation of several publicly available datasets for age and gender prediction. Furthermore, we present experiments to predict age and gender with models based on a pre-trained wav2vec 2.0. Depending on the dataset, we achieve an MAE

External link: http://arxiv.org/abs/2306.16962

Report

Happy or Evil Laughter? Analysing a Database of Natural Audio Samples

Authors: Düsterhöft, Aljoscha; Burkhardt, Felix; Schuller, Björn W.

We conducted a data collection on the basis of the Google AudioSet database by selecting a subset of the samples annotated with "laughter". The selection criterion was the presence of a communicative act with a clear connotation of being either po

External link: http://arxiv.org/abs/2305.14023

Report

audb -- Sharing and Versioning of Audio and Annotation Data in Python

Authors: Wierstorf, Hagen; Wagner, Johannes; Eyben, Florian; Burkhardt, Felix; Schuller, Björn W.

Driven by the need for larger and more diverse datasets to pre-train and fine-tune increasingly complex machine learning models, the number of datasets is rapidly growing. audb is an open-source Python library that supports versioning and documentati

External link: http://arxiv.org/abs/2303.00645

Report

Probing Speech Emotion Recognition Transformers for Linguistic Knowledge

Authors: Triantafyllopoulos, Andreas; Wagner, Johannes; Wierstorf, Hagen; Schmitt, Maximilian; Reichel, Uwe; Eyben, Florian; Burkhardt, Felix; Schuller, Björn W.

Published in: Proc. Interspeech 2022, 146-150

Large, pre-trained neural networks consisting of self-attention layers (transformers) have recently achieved state-of-the-art results on several speech emotion recognition (SER) datasets. These models are typically pre-trained in self-supervised mann

External link: http://arxiv.org/abs/2204.00400

Report

Dawn of the transformer era in speech emotion recognition: closing the valence gap

Authors: Wagner, Johannes; Triantafyllopoulos, Andreas; Wierstorf, Hagen; Schmitt, Maximilian; Burkhardt, Felix; Eyben, Florian; Schuller, Björn W.

Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 9, pp. 10745-10759, 1 Sept. 2023

Recent advances in transformer-based architectures which are pre-trained in a self-supervised manner have shown great promise in several machine learning tasks. In the audio domain, such architectures have also been successfully utilised in the field o

External link: http://arxiv.org/abs/2203.07378
