SYNTHETIC DATA FOR DNN-BASED DOA ESTIMATION OF INDOOR SPEECH
Author: | Johannes Kvam, Yi Liu, Femke B. Gelderblom, Tor Andre Myrvoll |
---|---|
Language: | English |
Publication year: | 2021 |
Subject: |
Cross-correlation, Artificial neural network, Computer science, reverberation, feature extraction, Direction of arrival, Pattern recognition, Perceptron, transient response, Synthetic data, deep learning (artificial intelligence), Azimuth, direction-of-arrival estimation, speech synthesis, multilayer perceptrons, Artificial intelligence, Impulse response, speech processing |
Source: | ICASSP |
ISSN: | 4390-4394 |
Description: | This paper investigates the use of different room impulse response (RIR) simulation methods for synthesizing training data for deep neural network-based direction of arrival (DOA) estimation of speech in reverberant rooms. Different sets of synthetic RIRs are obtained using the image source method (ISM) and more advanced methods including diffuse reflections and/or source directivity. Multi-layer perceptron (MLP) deep neural network (DNN) models are trained on generalized cross correlation (GCC) features extracted for each set. Finally, models are tested on features obtained from measured RIRs. This study shows the importance of training with RIRs from directive sources, as resultant DOA models achieved up to 51% error reduction compared to the steered response power with phase transform (SRP-PHAT) baseline (significant with p<
Database: | OpenAIRE |
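
The description mentions generalized cross correlation (GCC) features and a phase transform (PHAT) baseline. As a rough illustration of what a GCC-PHAT feature for one microphone pair can look like, here is a minimal sketch using NumPy. The function name `gcc_phat`, its parameters, and the lag window are assumptions made for this example; the record does not state how the authors computed their features.

```python
import numpy as np


def gcc_phat(x1, x2, fs, max_tau=None):
    """Hypothetical helper: GCC-PHAT between two microphone signals.

    Returns (tau, cc): the estimated delay of x2 relative to x1 in seconds,
    and the correlation values for lags -max_shift..+max_shift, which could
    serve as an input feature vector for a DOA model.
    """
    # Zero-pad so the circular correlation equals the linear one.
    n = len(x1) + len(x2)
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)

    # Cross-power spectrum with PHAT weighting (keep phase, drop magnitude).
    cross = X2 * np.conj(X1)
    cross /= np.abs(cross) + 1e-12
    cc = np.fft.irfft(cross, n=n)

    # Restrict to physically plausible lags if max_tau (seconds) is given.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)

    # Rotate so index 0 corresponds to lag -max_shift.
    cc = np.roll(cc, max_shift)[: 2 * max_shift + 1]
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs, cc


if __name__ == "__main__":
    # Synthetic check: x2 is x1 delayed by 5 samples (assumed test signal).
    rng = np.random.default_rng(0)
    x1 = rng.standard_normal(1024)
    x2 = np.concatenate((np.zeros(5), x1))
    tau, cc = gcc_phat(x1, x2, fs=16000, max_tau=0.001)
    print("estimated delay in samples:", round(tau * 16000))
```

A positive delay here means the second signal lags the first. In a multi-microphone setup, one such correlation vector per microphone pair would typically be concatenated before being passed to a classifier or regressor, though the exact arrangement used in the paper is not given in this record.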