FAST-RIR: Fast Neural Diffuse Room Impulse Response Generator
Author: Ratnarajah, Anton; Zhang, Shi-Xiong; Yu, Meng; Tang, Zhenyu; Manocha, Dinesh; Yu, Dong
Year of publication: 2022
Subject: FOS: Computer and information sciences; FOS: Electrical engineering, electronic engineering, information engineering; Machine Learning (cs.LG); Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
Source: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Description: We present a neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment. Our FAST-RIR takes rectangular room dimensions, listener and speaker positions, and reverberation time as inputs and generates both specular and diffuse reflections for that environment. Our FAST-RIR generates RIRs matching a given input reverberation time with an average error of 0.02 s. We evaluate our generated RIRs in automatic speech recognition (ASR) applications using the Google Speech API, the Microsoft Speech API, and the Kaldi toolkit. We show that our proposed FAST-RIR with batch size 1 is 400 times faster than a state-of-the-art diffuse acoustic simulator (DAS) on a CPU and gives similar performance to DAS in ASR experiments. Our FAST-RIR is 12 times faster than an existing GPU-based RIR generator (gpuRIR). We show that our FAST-RIR outperforms gpuRIR by 2.5% on an AMI far-field ASR benchmark. Comment: Accepted to ICASSP 2022. More results and source code are available at https://anton-jeran.github.io/FRIR/
Database: OpenAIRE
External link:
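The description notes that generated RIRs are evaluated in ASR applications; in such pipelines, a generated RIR is applied to dry speech by convolution to synthesize reverberant audio. The sketch below illustrates only that general step, assuming a hypothetical exponentially decaying RIR as a stand-in for FAST-RIR output; it is not the authors' code or model.

```python
import numpy as np

def apply_rir(speech: np.ndarray, rir: np.ndarray) -> np.ndarray:
    """Simulate reverberant speech by convolving dry speech with an RIR,
    then peak-normalizing to avoid clipping."""
    wet = np.convolve(speech, rir, mode="full")
    peak = np.abs(wet).max()
    return wet / peak if peak > 0 else wet

# Toy stand-in data (assumption: 16 kHz audio, ~0.5 s RIR tail).
fs = 16000
rng = np.random.default_rng(0)
t = np.arange(int(0.5 * fs)) / fs
rir = rng.standard_normal(t.size) * np.exp(-t / 0.1)  # decaying noise tail
speech = rng.standard_normal(fs)                      # 1 s of noise as "speech"

wet = apply_rir(speech, rir)
```

In an actual augmentation pipeline, `speech` would be a recorded utterance and `rir` the output of an RIR generator such as FAST-RIR for the target room configuration.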