Multi-task learning DNN to improve gender identification from speech leveraging age information of the speaker

Authors:	Mousmita Sarma, Nagendra Kumar Goel, Kandarpa Kumar Sarma
Publication year:	2020
Subject:	Linguistics and Language; Artificial neural network; Computer science; Speech recognition; Multi-task learning; Filter (signal processing); Mixture model; Language and Linguistics; Human-Computer Interaction; Identification (information); Discriminative model; Feature (machine learning); Computer Vision and Pattern Recognition; Projection (set theory); Software
Source:	International Journal of Speech Technology. 23:223-240
ISSN:	1572-8110 1381-2416
DOI:	10.1007/s10772-020-09680-4
Description:	We propose a method that provides the speaker's age as additional information while training a machine learning model for gender identification. To achieve this, we design a multi-task learning Deep Neural Network (DNN) whose primary output layer has the speaker's gender as its target. In addition, we use the speaker's age group as an auxiliary target for each utterance, where the age groups are defined with the speaker's gender taken into account. We show experimentally that the multi-task learning DNN outperforms both a Gaussian Mixture Model (GMM) and a single-task learning DNN trained only for gender recognition on more realistic datasets, that is, datasets containing recordings of speakers from all age groups (children to seniors). We use the raw speech waveform as input to our DNN, which leaves the multi-task network free to learn gender- and age-discriminative features during training. The raw waveform front end uses convolutional-layer-based filter learning. Further, we use Long Short-Term Memory recurrent projection (LSTMP) layers to model the temporal dynamics of speech from the learned feature representation.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::7dd0f79ec24ce65462f2f3b550f5c715 https://doi.org/10.1007/s10772-020-09680-4
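
The description outlines a multi-task setup with gender as the primary target and a gender-dependent age group as the auxiliary target. A minimal sketch of how such targets and a combined loss might look is given below. It is not the authors' implementation: the age bands in `AGE_BANDS`, the label format, the helper names, and the `aux_weight` value are all assumptions made for illustration, since the record does not specify them.

```python
import math

# Hypothetical age bands (inclusive bounds); the record does not give
# the actual boundaries used in the paper.
AGE_BANDS = [(0, 12, "child"), (13, 19, "teen"), (20, 59, "adult"), (60, 200, "senior")]


def age_group_label(gender: str, age: int) -> str:
    """Build a gender-dependent age-group label such as 'female_adult'."""
    for low, high, name in AGE_BANDS:
        if low <= age <= high:
            return f"{gender}_{name}"
    raise ValueError(f"age out of range: {age}")


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Negative log-probability of the target class."""
    return -math.log(softmax(logits)[target_index])


def multitask_loss(gender_logits, gender_target, age_logits, age_target, aux_weight=0.5):
    """Primary gender loss plus a weighted auxiliary age-group loss.

    aux_weight is an assumed hyperparameter, not a value from the paper.
    """
    primary = cross_entropy(gender_logits, gender_target)
    auxiliary = cross_entropy(age_logits, age_target)
    return primary + aux_weight * auxiliary
```

In a full model the two sets of logits would come from two output layers sharing the same hidden layers, so gradients from the auxiliary age-group loss also shape the shared representation used for gender identification.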