Showing 1 - 10 of 373 for search: '"Bai, Mingsian R."'
Author:
Hsu, Yicheng, Bai, Mingsian R.
Binaural Audio Telepresence (BAT) aims to encode the acoustic scene at the far end into binaural signals for the user at the near end. BAT encompasses an immense range of applications that can vary between two extreme modes of Immersive BAT (I-BAT) a
External link:
http://arxiv.org/abs/2405.08742
A robust multichannel speaker diarization and separation system is proposed by exploiting the spatio-temporal activity of the speakers. The system is realized in a hybrid architecture that combines the array signal processing units and the deep learn
External link:
http://arxiv.org/abs/2401.16850
Author:
Hsu, Yicheng, Bai, Mingsian R.
Audio Telepresence (AT) aims to create an immersive experience of the audio scene at the far end for the user(s) at the near end. The application of AT could encompass scenarios with varying degrees of emphasis on signal enhancement and ambience pres
External link:
http://arxiv.org/abs/2311.12706
Recent research advances in deep neural network (DNN)-based beamformers have shown great promise for speech enhancement under adverse acoustic conditions. Different network architectures and input features have been explored in estimating beamforming
External link:
http://arxiv.org/abs/2310.12837
Author:
Hsu, Yicheng, Bai, Mingsian R.
Personal voice activity detection (PVAD) has received increased attention due to the growing popularity of personal mobile devices and smart speakers. PVAD is often an integral element of speech enhancement and recognition for these applications in which li
External link:
http://arxiv.org/abs/2304.08887
Array Configuration-Agnostic Personalized Speech Enhancement using Long-Short-Term Spatial Coherence
Personalized speech enhancement (PSE) has been a field of active research for suppression of speech-like interferers such as competing speakers or TV dialogues. Compared with single-channel approaches, multichannel PSE systems can be more effective in adver
External link:
http://arxiv.org/abs/2211.08748
Telepresence aims to create an immersive but virtual experience of the audio and visual scene at the far end for users at the near end. In this contribution, we propose an array-based binaural rendering system that converts the array microphone signa
External link:
http://arxiv.org/abs/2210.11123
Recently, speech enhancement technologies that are based on deep learning have received considerable research attention. If the spatial information in microphone signals is exploited, microphone arrays can be advantageous under some adverse acoustic
External link:
http://arxiv.org/abs/2207.08126
Speech enhancement and source localization have been active research topics for several decades, with a wide range of real-world applications. Recently, the Deep Complex Convolution Recurrent network (DCCRN) has yielded impressive enhancement performance for
External link:
http://arxiv.org/abs/2206.09728
Teleconferencing is becoming essential during the COVID-19 pandemic. However, in real-world applications, speech quality can deteriorate due to, for example, background interference, noise, or reverberation. To solve this problem, target speech extra
External link:
http://arxiv.org/abs/2112.05686