Showing 1 - 10 of 35 for search: '"Kitic, Srdan"'
Author:
Kitić, Srđan; Daniel, Jérôme
Recently proposed Generalized Time-domain Velocity Vector (GTVV) is a generalization of relative room impulse response in spherical harmonic (aka Ambisonic) domain that allows for blind estimation of early-echo parameters: the directions and relative
External link:
http://arxiv.org/abs/2305.03558
Author:
Daniel, Jérôme; Kitić, Srđan
Range estimation of a far field sound source in a reverberant environment is known to be a notoriously difficult problem, hence most localization methods are only capable of estimating the source's Direction-of-Arrival (DoA). In an earlier work, we h
External link:
http://arxiv.org/abs/2203.05265
Author:
Kitić, Srđan; Daniel, Jérôme
We introduce and analyze Generalized Time Domain Velocity Vector (GTVV), an extension of the previously presented acoustic multipath footprint extracted from the Ambisonic recordings. GTVV is better adapted to adverse acoustic conditions, and enables
External link:
http://arxiv.org/abs/2110.06304
This article is a survey on deep learning methods for single and multiple sound source localization. We are particularly interested in sound source localization in indoor/domestic environment, where reverberation and diffuse noise are present. We pro
External link:
http://arxiv.org/abs/2109.03465
In this work, we propose a novel self-attention based neural network for robust multi-speaker localization from Ambisonics recordings. Starting from a state-of-the-art convolutional recurrent neural network, we investigate the benefit of replacing th
External link:
http://arxiv.org/abs/2107.11066
In this work, we propose to extend a state-of-the-art multi-source localization system based on a convolutional recurrent neural network and Ambisonics signals. We significantly improve the performance of the baseline network by changing the layout b
External link:
http://arxiv.org/abs/2105.01897
Speaker counting is the task of estimating the number of people that are simultaneously speaking in an audio recording. For several audio processing tasks such as speaker diarization, separation, localization and tracking, knowing the number of speak
External link:
http://arxiv.org/abs/2101.01977
Author:
Daniel, Jérôme; Kitić, Srđan
We propose a conceptually and computationally simple form of sound velocity that offers a readable view of the interference between direct and indirect sound waves. Unlike most approaches in the literature, it jointly exploits both active and reactiv
External link:
http://arxiv.org/abs/2006.02099
We present a CNN architecture for speech enhancement from multichannel first-order Ambisonics mixtures. The data-dependent spatial filters, deduced from a mask-based approach, are used to help an automatic speech recognition engine to face adverse co
External link:
http://arxiv.org/abs/2006.01708
Recent advances in audio declipping have substantially improved the state of the art. Yet, practitioners need guidelines to choose a method, and while existing benchmarks have been instrumental in advancing the field,
External link:
http://arxiv.org/abs/2005.10228