Authors: |
Kandarpa Kumar Sarma, Nagendra Kumar Goel, Mousmita Sarma |
Year of publication: |
2020 |
Source: |
Lecture Notes in Networks and Systems, ISBN: 9789811527739 |
DOI: |
10.1007/978-981-15-2774-6_1 |
Description: |
We propose end-to-end deep neural network (DNN) architectures that operate on the raw speech waveform to estimate the age and gender of children aged 4–14 years. To this end, we design single-task and multi-task learning DNN configurations. In the multi-task learning DNN, age and gender serve as separate labels in two output layers, and the total objective loss is jointly optimized. Features are learned from the raw waveform within the DNN in a data-driven manner, which leaves the network free to learn gender- and age-discriminative features during training. Interleaved time-delay neural network and long short-term memory (TDNN-LSTM) layers with a time-restricted self-attention mechanism model the temporal dynamics of speech. Experimental results provide a comparative analysis of single-task and multi-task learning for age and gender recognition from children’s speech. |
Database: |
OpenAIRE |
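The description mentions two output layers, one for age and one for gender, whose losses are optimized jointly. The following is a minimal sketch of such a combined objective, not the authors' implementation. Several things here are assumptions that the record does not state: the use of cross-entropy for both heads, the treatment of age as 11 one-year classes (ages 4 to 14), the weighted-sum form of the total loss, and all function and parameter names (`multitask_loss`, `age_weight`, `gender_weight`).

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability of the target class index."""
    return -math.log(softmax(logits)[target])


def multitask_loss(age_logits, gender_logits, age_target, gender_target,
                   age_weight=1.0, gender_weight=1.0):
    """Weighted sum of the two per-task losses.

    The weights are hypothetical; the description only says that the
    total objective loss is jointly optimized.
    """
    age_loss = cross_entropy(age_logits, age_target)
    gender_loss = cross_entropy(gender_logits, gender_target)
    return age_weight * age_loss + gender_weight * gender_loss


# Example: an untrained network with uniform outputs on both heads.
# Assumed label layout: 11 age classes (4..14 years), 2 gender classes.
uniform_loss = multitask_loss([0.0] * 11, [0.0, 0.0], age_target=3, gender_target=1)
```

With uniform outputs, each head contributes the log of its class count, so the total is log(11) + log(2). In a single-task configuration only one of the two terms would be present.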