Gender and region detection from human voice using the three-layer feature extraction method with 1D CNN
Author: | Mohammad Amaz Uddin, Sayem Hossain, Refat Khan Pathan, Munmun Biswas |
---|---|
Language: | English |
Year of publication: | 2021 |
Subject: |
gender recognition; Computer Networks and Communications; Computer science; Speech recognition; Feature extraction; Region detection; MFCC; TK5101-6720; Information technology; region of accent detection; T58.5-58.64; Computer Science Applications; Product reviews; Computer Science (miscellaneous); Telecommunication; State (computer science); Mel-frequency cepstrum; 1D CNN; Electrical and Electronic Engineering; Layer (object-oriented design); Human voice |
Source: | Journal of Information and Telecommunication, Vol 0, Iss 0, Pp 1-16 (2021) |
ISSN: | 2475-1847 2475-1839 |
Description: | Analysing the human voice has long been a challenge for the engineering community, with applications such as product review, emotional state detection, and AI development. Two basic tasks in voice or speech analysis are detecting a speaker's gender and, from the accent, the speaker's geographical region. This study presents a three-layer feature extraction method that works on the raw human voice to classify gender as male or female and to identify the region the voice comes from. The first layer computes fundamental frequency, spectral entropy, spectral flatness, and mode frequency. The second layer extracts features with Mel Frequency Cepstral Coefficients (MFCC), and the third layer uses linear predictive coding (LPC). Ordinary voice recordings contain noise, which is removed through multiple audio filtering steps to obtain smooth, noise-free data. A multi-output 1D Convolutional Neural Network recognizes gender and region on a combined dataset consisting of the TIMIT, RAVDESS, and BGC datasets. The model predicts gender with 93.01% accuracy and region with 97.07% accuracy. The authors report that the method outperforms the usual state-of-the-art methods on the separate datasets as well as on the combined dataset for both gender and region classification. |
Database: | OpenAIRE |
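
The first-layer features named in the description (fundamental frequency, spectral entropy, spectral flatness, mode frequency) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the record does not give their definitions, framing, or filtering steps. The function names `spectral_features` and `fundamental_frequency` are hypothetical, "mode frequency" is assumed here to mean the frequency bin with the highest power, and the pitch estimate uses a plain autocorrelation peak search over an assumed 50-400 Hz range.

```python
import numpy as np


def spectral_features(signal, sample_rate):
    """Return (spectral_entropy, spectral_flatness, mode_frequency) of a mono signal.

    Assumption: the whole signal is treated as one Hann-windowed frame.
    """
    eps = 1e-12  # guards against log(0) and division by zero
    windowed = signal * np.hanning(len(signal))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)

    # Spectral entropy: Shannon entropy of the normalised power spectrum,
    # scaled to [0, 1] by the maximum possible entropy log2(number of bins).
    p = power / (power.sum() + eps)
    entropy = -np.sum(p * np.log2(p + eps)) / np.log2(len(p))

    # Spectral flatness: geometric mean over arithmetic mean of the power spectrum.
    flatness = np.exp(np.mean(np.log(power + eps))) / (np.mean(power) + eps)

    # Assumed meaning of "mode frequency": the bin carrying the most power.
    mode_frequency = freqs[np.argmax(power)]
    return entropy, flatness, mode_frequency


def fundamental_frequency(signal, sample_rate, fmin=50.0, fmax=400.0):
    """Estimate pitch in Hz from the strongest autocorrelation peak in [fmin, fmax]."""
    x = signal - np.mean(signal)
    # Keep only non-negative lags of the full autocorrelation.
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    lag_min = int(sample_rate / fmax)
    lag_max = int(sample_rate / fmin)
    lag = lag_min + np.argmax(acf[lag_min:lag_max + 1])
    return sample_rate / lag
```

On a clean tone the entropy and flatness values stay close to zero, while broadband noise pushes both up, which is why such features can help separate voiced from noisy content before the MFCC and LPC layers.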