Children’s Age and Gender Recognition from Raw Speech Waveform Using DNN

Authors: Kandarpa Kumar Sarma, Nagendra Kumar Goel, Mousmita Sarma
Publication year: 2020
Source: Lecture Notes in Networks and Systems ISBN: 9789811527739
DOI: 10.1007/978-981-15-2774-6_1
Description: We propose end-to-end deep neural network (DNN) architectures that operate on the raw speech waveform to estimate the age and gender of children aged 4–14 years. To this end, we design single-task and multi-task learning DNN configurations. In the multi-task learning DNN, age and gender are used as separate labels in two output layers, and the total objective loss is optimized jointly. Features are learned from the raw waveform within the DNN in a data-driven manner, which leaves the network free to learn gender- and age-discriminative features during training. Interleaved time-delay neural network and long short-term memory (TDNN-LSTM) layers with a time-restricted self-attention mechanism are used to model the temporal dynamics of speech. Experimental results provide a comparative analysis of single-task and multi-task learning for age and gender recognition from children's speech.
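The joint optimization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the two losses are combined, how many age classes are used, or whether age is treated as classification or regression. The sketch assumes both tasks are classification with a softmax cross-entropy loss and that the total objective is a weighted sum; the function names, the weights `age_weight` and `gender_weight`, and the example logits are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Negative log-probability of the target class."""
    return -math.log(softmax(logits)[target_index])


def multitask_loss(age_logits, age_target, gender_logits, gender_target,
                   age_weight=1.0, gender_weight=1.0):
    """Total objective as a weighted sum of the two per-task losses.

    The weighted-sum form and the default weights are assumptions;
    the abstract only states that the total loss is optimized jointly.
    """
    age_loss = cross_entropy(age_logits, age_target)
    gender_loss = cross_entropy(gender_logits, gender_target)
    return age_weight * age_loss + gender_weight * gender_loss


# Hypothetical outputs of the two output layers for one utterance:
# three age groups and two gender classes (both counts are assumptions).
age_logits = [0.2, 1.5, -0.3]
gender_logits = [0.8, -0.4]
total = multitask_loss(age_logits, 1, gender_logits, 0)
print(f"joint loss: {total:.4f}")
```

In a single-task configuration, only one of the two terms would be present. In the multi-task case, gradients from both terms flow back into the shared layers, which is what allows those layers to learn features useful for both age and gender.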
Database: OpenAIRE