Search results

Report

FoVNet: Configurable Field-of-View Speech Enhancement with Low Computation and Distortion for Smart Glasses

Authors: Xu, Zhongweiyang, Aroudi, Ali, Tan, Ke, Pandey, Ashutosh, Lee, Jung-Suk, Xu, Buye, Nesta, Francesco

This paper presents a novel multi-channel speech enhancement approach, FoVNet, that enables highly efficient speech enhancement within a configurable field of view (FoV) of a smart-glasses user without needing specific target-talker(s) directions. It

External link: http://arxiv.org/abs/2408.06468

Report

AV-CrossNet: an Audiovisual Complex Spectral Mapping Network for Speech Separation By Leveraging Narrow- and Cross-Band Modeling

Authors: Kalkhorani, Vahid Ahmadi, Yu, Cheng, Kumar, Anurag, Tan, Ke, Xu, Buye, Wang, DeLiang

Adding visual cues to audio-based speech separation can improve separation performance. This paper introduces AV-CrossNet, an audiovisual (AV) system for speech enhancement, target speaker extraction, and multi-talker speaker separation. AV-CrossNet

External link: http://arxiv.org/abs/2406.11619

Report

A Closer Look at Wav2Vec2 Embeddings for On-Device Single-Channel Speech Enhancement

Authors: Shankar, Ravi, Tan, Ke, Xu, Buye, Kumar, Anurag

Self-supervised learned models have been found to be very effective for certain speech tasks such as automatic speech recognition, speaker identification, keyword spotting and others. While the features are undeniably useful in speech recognition and

External link: http://arxiv.org/abs/2403.01369

Report

Beyond spectroscopy. II. Stellar parameters for over twenty million stars in the northern sky from SAGES DR1 and Gaia DR3

We present precise photometric estimates of stellar parameters, including effective temperature, metallicity, luminosity classification, distance, and stellar age, for nearly 26 million stars using the methodology developed in the first paper of this

External link: http://arxiv.org/abs/2307.04469

Report

TorchAudio-Squim: Reference-less Speech Quality and Intelligibility measures in TorchAudio

Authors: Kumar, Anurag, Tan, Ke, Ni, Zhaoheng, Manocha, Pranay, Zhang, Xiaohui, Henderson, Ethan, Xu, Buye

Measuring quality and intelligibility of a speech signal is usually a critical step in development of speech processing systems. To enable this, a variety of metrics to measure quality and intelligibility under different assumptions have been develop

External link: http://arxiv.org/abs/2304.01448

Report

A Dwarf Galaxy Debris Stream Associated with Palomar 1 and the Anticenter Stream

Authors: Yang, Yong, Zhao, Jing-Kun, Ye, Xian-Hao, Zhao, Gang, Tan, Ke-Feng

Published in: 2023 ApJL 945 L5

We report the discovery of a new stream (dubbed as Yangtze) detected in $Gaia$ Data Release 3. The stream is at a heliocentric distance of $\sim$ 9.12 kpc and spans nearly 27$\deg$ by 1.9$\deg$ on sky. The colour-magnitude diagram of Yangtze indicate

External link: http://arxiv.org/abs/2302.05232

Report

Rethinking complex-valued deep neural networks for monaural speech enhancement

Authors: Wu, Haibin, Tan, Ke, Xu, Buye, Kumar, Anurag, Wong, Daniel

Despite multiple efforts made towards adopting complex-valued deep neural networks (DNNs), it remains an open question whether complex-valued DNNs are generally more effective than real-valued DNNs for monaural speech enhancement. This work is devote

External link: http://arxiv.org/abs/2301.04320

Report

Leveraging Heteroscedastic Uncertainty in Learning Complex Spectral Mapping for Single-channel Speech Enhancement

Authors: Chen, Kuan-Lin, Wong, Daniel D. E., Tan, Ke, Xu, Buye, Kumar, Anurag, Ithapu, Vamsi Krishna

Most speech enhancement (SE) models learn a point estimate and do not make use of uncertainty estimation in the learning process. In this paper, we show that modeling heteroscedastic uncertainty by minimizing a multivariate Gaussian negative log-like

External link: http://arxiv.org/abs/2211.08624

Dissertation/Thesis

Convolutional and recurrent neural networks for real-time speech separation in the complex domain

Author: Tan, Ke

Speech signals are usually distorted by acoustic interference in daily listening environments. Such distortions severely degrade speech intelligibility and quality for human listeners, and make many speech-related tasks, such as automatic speech reco

External link: http://rave.ohiolink.edu/etdc/view?acc_num=osu1626983471600193

Academic article

Microsurgical treatment of severe aneurysmal subarachnoid hemorrhage: an analysis of 14 cases

Authors: GUO Peng, SONG Yinglun, LI Xiong, TAN Ke, WANG Yu, LI Tao, PENG Yutao, ZHANG Haoyu, DONG Le, WU Wenqian, LI Jinping

Published in: Zhongguo linchuang yanjiu, Vol 37, Iss 4, Pp 560-563 (2024)

Objective To analyze the therapeutic effect of microsurgery on patients with severe aneurysmal subarachnoid hemorrhage (SaSAH). Methods A retrospective analysis was conducted on 14 SaSAH patients admitted to Beijing Chao-Yang Hospital, Capital Medica

External link: https://doaj.org/article/886323c700f247c6bcec84bd21d6e5dd
