Search results - "Ambikairajah A"

Report

Blind Estimation of Sub-band Acoustic Parameters from Ambisonics Recordings using Spectro-Spatial Covariance Features

Authors: Meng, Hanyu; Breebaart, Jeroen; Stoddard, Jeremy; Sethu, Vidhyasaharan; Ambikairajah, Eliathamby

Estimating frequency-varying acoustic parameters is essential for enhancing immersive perception in realistic spatial audio creation. In this paper, we propose a unified framework that blindly estimates reverberation time (T60), direct-to-reverberant

External link: http://arxiv.org/abs/2411.03172

Report

Dual-Constrained Dynamical Neural ODEs for Ambiguity-aware Continuous Emotion Prediction

Authors: Wu, Jingyao; Dang, Ting; Sethu, Vidhyasaharan; Ambikairajah, Eliathamby

There has been a significant focus on modelling emotion ambiguity in recent years, with advancements made in representing emotions as distributions to capture ambiguity. However, there has been comparatively less effort devoted to the consideration o

External link: http://arxiv.org/abs/2407.21344

Report

Binaural Selective Attention Model for Target Speaker Extraction

Authors: Meng, Hanyu; Zhang, Qiquan; Zhang, Xiangyu; Sethu, Vidhyasaharan; Ambikairajah, Eliathamby

The remarkable ability of humans to selectively focus on a target speaker in cocktail party scenarios is facilitated by binaural audio processing. In this paper, we present a binaural time-domain Target Speaker Extraction model based on the Filter-an

External link: http://arxiv.org/abs/2406.12236

Report

An Exploration of Length Generalization in Transformer-Based Speech Enhancement

Authors: Zhang, Qiquan; Zhu, Hongxu; Qian, Xinyuan; Ambikairajah, Eliathamby; Li, Haizhou

The use of Transformer architectures has facilitated remarkable progress in speech enhancement. Training Transformers using substantially long speech utterances is often infeasible as self-attention suffers from quadratic complexity. It is a critical

External link: http://arxiv.org/abs/2406.11401

Report

Mamba in Speech: Towards an Alternative to Self-Attention

Authors: Zhang, Xiangyu; Zhang, Qiquan; Liu, Hexin; Xiao, Tianyi; Qian, Xinyuan; Ahmed, Beena; Ambikairajah, Eliathamby; Li, Haizhou; Epps, Julien

Transformer and its derivatives have achieved success in diverse tasks across computer vision, natural language processing, and speech processing. To reduce the complexity of computations within the multi-head self-attention mechanism in Transformer,

External link: http://arxiv.org/abs/2405.12609

Report

What is Learnt by the LEArnable Front-end (LEAF)? Adapting Per-Channel Energy Normalisation (PCEN) to Noisy Conditions

Authors: Meng, Hanyu; Sethu, Vidhyasaharan; Ambikairajah, Eliathamby

Published in: Interspeech 2023

There is increasing interest in the use of the LEArnable Front-end (LEAF) in a variety of speech processing systems. However, there is a dearth of analyses of what is actually learnt and the relative importance of training the different components of

External link: http://arxiv.org/abs/2404.06702

Report

An Empirical Study on the Impact of Positional Encoding in Transformer-based Monaural Speech Enhancement

Authors: Zhang, Qiquan; Ge, Meng; Zhu, Hongxu; Ambikairajah, Eliathamby; Song, Qi; Ni, Zhaoheng; Li, Haizhou

Transformer architecture has enabled recent progress in speech enhancement. Since Transformers are position-agnostic, positional encoding is the de facto standard component used to enable Transformers to distinguish the order of elements in a sequence

External link: http://arxiv.org/abs/2401.09686

Report

Joint Spatio-Temporal Discretisation of Nonlinear Active Cochlear Models

Authors: Dang, T.; Sethu, V.; Ambikairajah, E.; Epps, J.; Li, H.

Biologically inspired auditory models play an important role in developing effective audio representations that can be tightly integrated into speech and audio processing systems. Current computational models of the cochlea are typically expressed in

External link: http://arxiv.org/abs/2108.05993

Report

A Novel Markovian Framework for Integrating Absolute and Relative Ordinal Emotion Information

Authors: Wu, Jingyao; Dang, Ting; Sethu, Vidhyasaharan; Ambikairajah, Eliathamby

There is growing interest in affective computing for the representation and prediction of emotions along ordinal scales. However, the term ordinal emotion label has been used to refer to both absolute notions such as low or high arousal, as well as r

External link: http://arxiv.org/abs/2108.04605
