Estimating fundamental frequency and formants based on periodicity glimpses: a deep learning approach
Author: Joanna Luberadzka, Hendrik Kayser, Volker Hohmann
Year: 2020
Subject: artificial neural network; deep learning; feature vector; speech recognition; acoustic space; formant; video tracking; auditory system; artificial intelligence; otorhinolaryngology; neurology & neurosurgery
Source: ICHI
DOI: 10.1109/ichi48887.2020.9374386
Description: Despite many technological advances, hearing aids still amplify background sounds together with the signal of interest. To understand how to process acoustic information optimally for a human listener, we must understand why a healthy auditory system performs this task so efficiently. Several studies show the importance of so-called auditory glimpses in decoding an auditory scene. These are usually defined as time-frequency bins dominated by a single source, which the auditory system may use to track that source in a crowded acoustic space. Josupeit et al. [6]-[8] developed an algorithm inspired by these findings: it extracts speech glimpses, defined as the salient tonal components of a sound mixture, called sparse periodicity-based auditory features (sPAF). In this study, we investigated whether the sPAF can be used to estimate instantaneous voice parameters: the fundamental frequency F0 and the formant frequencies F1 and F2. We used a supervised machine learning technique to find the mapping between the parameter space and the feature space. Using a formant synthesizer, we created a labeled data set containing instantaneous sPAF and the corresponding parameter values. We then trained a deep neural network and evaluated the prediction performance of the learned model. The results show that the sPAF represent the parameters of a single voice very well, which opens the possibility of using the sPAF for more complex scenarios of auditory object tracking.
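The supervised pipeline described in the abstract (synthesize labeled feature/parameter pairs, then learn a regression from features to F0, F1, F2) can be sketched as follows. This is a minimal illustration only: the random nonlinear feature mapping stands in for the actual sPAF extraction, and the one-hidden-layer network stands in for the paper's deep neural network, whose architecture the abstract does not specify.

```python
import numpy as np

rng = np.random.default_rng(0)

def synthesize_dataset(n=2000, feat_dim=16):
    """Hypothetical stand-in for the formant-synthesizer data set:
    draw voice parameters (F0, F1, F2) in plausible ranges (Hz) and
    map them through a fixed random nonlinearity to fake feature vectors."""
    params = rng.uniform([80, 300, 800], [300, 900, 2500], size=(n, 3))
    proj = rng.normal(size=(3, feat_dim))
    feats = np.tanh(params / 1000 @ proj) + 0.05 * rng.normal(size=(n, feat_dim))
    return feats, params / 1000  # scale labels to ~unit range for training

X, y = synthesize_dataset()

# Tiny one-hidden-layer regressor trained with plain gradient descent.
hid = 32
W1 = rng.normal(scale=0.3, size=(X.shape[1], hid)); b1 = np.zeros(hid)
W2 = rng.normal(scale=0.3, size=(hid, 3)); b2 = np.zeros(3)

def forward(X):
    h = np.tanh(X @ W1 + b1)
    return h, h @ W2 + b2

_, pred0 = forward(X)
mse0 = ((pred0 - y) ** 2).mean()  # loss before training

lr = 0.05
for step in range(500):
    h, pred = forward(X)
    err = pred - y                          # gradient of mean squared error
    gW2 = h.T @ err / len(X); gb2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h ** 2)        # backprop through tanh
    gW1 = X.T @ dh / len(X); gb1 = dh.mean(0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

_, pred = forward(X)
mse = ((pred - y) ** 2).mean()
```

The evaluation step in the paper would compare predicted and ground-truth parameter values on held-out data; here we only check that the fit improves on the training set.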
Database: OpenAIRE
External link: