Showing 1 - 10 of 737 for search: '"Ambikairajah A"'
Author:
Meng, Hanyu, Breebaart, Jeroen, Stoddard, Jeremy, Sethu, Vidhyasaharan, Ambikairajah, Eliathamby
Estimating frequency-varying acoustic parameters is essential for enhancing immersive perception in realistic spatial audio creation. In this paper, we propose a unified framework that blindly estimates reverberation time (T60), direct-to-reverberant
External link:
http://arxiv.org/abs/2411.03172
There has been a significant focus on modelling emotion ambiguity in recent years, with advancements made in representing emotions as distributions to capture ambiguity. However, there has been comparatively less effort devoted to the consideration o
External link:
http://arxiv.org/abs/2407.21344
The remarkable ability of humans to selectively focus on a target speaker in cocktail party scenarios is facilitated by binaural audio processing. In this paper, we present a binaural time-domain Target Speaker Extraction model based on the Filter-an
External link:
http://arxiv.org/abs/2406.12236
The use of Transformer architectures has facilitated remarkable progress in speech enhancement. Training Transformers using substantially long speech utterances is often infeasible as self-attention suffers from quadratic complexity. It is a critical
External link:
http://arxiv.org/abs/2406.11401
Author:
Zhang, Xiangyu, Zhang, Qiquan, Liu, Hexin, Xiao, Tianyi, Qian, Xinyuan, Ahmed, Beena, Ambikairajah, Eliathamby, Li, Haizhou, Epps, Julien
Transformer and its derivatives have achieved success in diverse tasks across computer vision, natural language processing, and speech processing. To reduce the complexity of computations within the multi-head self-attention mechanism in Transformer,
External link:
http://arxiv.org/abs/2405.12609
Published in:
Interspeech 2023
There is increasing interest in the use of the LEArnable Front-end (LEAF) in a variety of speech processing systems. However, there is a dearth of analyses of what is actually learnt and the relative importance of training the different components of
External link:
http://arxiv.org/abs/2404.06702
Author:
Zhang, Qiquan, Ge, Meng, Zhu, Hongxu, Ambikairajah, Eliathamby, Song, Qi, Ni, Zhaoheng, Li, Haizhou
The Transformer architecture has enabled recent progress in speech enhancement. Since Transformers are position-agnostic, positional encoding is the de facto standard component used to enable Transformers to distinguish the order of elements in a sequence
External link:
http://arxiv.org/abs/2401.09686
Biologically inspired auditory models play an important role in developing effective audio representations that can be tightly integrated into speech and audio processing systems. Current computational models of the cochlea are typically expressed in
External link:
http://arxiv.org/abs/2108.05993
There is growing interest in affective computing for the representation and prediction of emotions along ordinal scales. However, the term ordinal emotion label has been used to refer to both absolute notions such as low or high arousal, as well as r
External link:
http://arxiv.org/abs/2108.04605
Author:
Wickramasinghe, Buddhi, Ambikairajah, Eliathamby, Sethu, Vidhyasaharan, Epps, Julien, Li, Haizhou, Dang, Ting
Published in:
In Speech Communication, October 2023, 154