Search results - "AKAGI, Masato"

Report

Machine Anomalous Sound Detection Using Spectral-temporal Modulation Representations Derived from Machine-specific Filterbanks

Authors: Li, Kai; Zaman, Khalid; Li, Xingfeng; Akagi, Masato; Unoki, Masashi

Early detection of factory machinery malfunctions is crucial in industrial applications. In machine anomalous sound detection (ASD), different machines exhibit unique vibration-frequency ranges based on their physical properties. Meanwhile, the human

External link: http://arxiv.org/abs/2409.05319

Report

Two-stage dimensional emotion recognition by fusing predictions of acoustic and text networks using SVM

Authors: Atmaja, Bagus Tris; Akagi, Masato

Published in: Speech Commun., vol. 126, pp. 9-21, Feb. 2021

Automatic speech emotion recognition (SER) by a computer is a critical component for more natural human-machine interaction. As in human-human interaction, the capability to perceive emotion correctly is essential to take further steps in a particula

External link: http://arxiv.org/abs/2210.14495

Report

Speak Like a Professional: Increasing Speech Intelligibility by Mimicking Professional Announcer Voice with Voice Conversion

Authors: Ho, Tuan Vu; Kobayashi, Maori; Akagi, Masato

In most practical scenarios, the announcement system must deliver speech messages in a noisy environment, in which the background noise cannot be cancelled out. The local noise reduces speech intelligibility and increases listening effort of the l

External link: http://arxiv.org/abs/2206.13021

Report

Deep Multilayer Perceptrons for Dimensional Speech Emotion Recognition

Authors: Atmaja, Bagus Tris; Akagi, Masato

Published in: Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2020

Modern deep learning architectures are ordinarily performed on high-performance computing facilities due to the large size of the input features and complexity of its model. This paper proposes traditional multilayer perceptrons (MLP) with deep layer

External link: http://arxiv.org/abs/2004.02355

Report

On The Differences Between Song and Speech Emotion Recognition: Effect of Feature Sets, Feature Types, and Classifiers

Authors: Atmaja, Bagus Tris; Akagi, Masato

Published in: 2020 IEEE Region 10 Conference (TENCON), pp. 968-972

In this paper, we evaluate different feature sets, feature types, and classifiers on both song and speech emotion recognition. Three feature sets: GeMAPS, pyAudioAnalysis, and LibROSA; two feature types: low-level descriptors and high-level stat

External link: http://arxiv.org/abs/2004.00200

Report

Evaluation of Error and Correlation-Based Loss Functions For Multitask Learning Dimensional Speech Emotion Recognition

Authors: Atmaja, Bagus Tris; Akagi, Masato

The choice of a loss function is a critical part of machine learning. This paper evaluated two different loss functions commonly used in regression-task dimensional speech emotion recognition: an error-based and a correlation-based loss function. We

External link: http://arxiv.org/abs/2003.10724

Report

The Effect of Silence Feature in Dimensional Speech Emotion Recognition

Authors: Atmaja, Bagus Tris; Akagi, Masato

Published in: 10th International Conference on Speech Prosody 2020, pp. 26-30

Silence is a part of human-to-human communication, which can be a clue for human emotion perception. For automatic emotion recognition by a computer, it is not clear whether silence is useful to determine human emotion within a speech. This paper pre

External link: http://arxiv.org/abs/2003.01277

Report

Multitask Learning and Multistage Fusion for Dimensional Audiovisual Emotion Recognition

Authors: Atmaja, Bagus Tris; Akagi, Masato

Due to its ability to accurately predict emotional state using multimodal features, audiovisual emotion recognition has recently gained more interest from researchers. This paper proposes two methods to predict emotional attributes from audio and vis

External link: http://arxiv.org/abs/2002.11312
